Query By Image Content (QBIC) is a technology that allows you to query collections of images by their content. "Query by content" means that you can query a collection of images in order to locate images that are similar to the query image, where similarity can be based on color, texture, or other image properties. For example, you can search for images that have predominantly red colors or striped textures, where the color and texture information is automatically computed. Queries by content complement traditional queries that use image file names or keyword descriptions.
This chapter provides a brief overview of QBIC including:
When you install QBIC, you get:
QBIC supports the following commonly used image formats. QBIC recognizes image format based on the image file extension:
In addition to the standard image formats listed above, the QbGenericImageDataClass, described on page 125, can also read a QBIC Picker Image Description string, either from memory (ReadPickerImageDescriptionString) or from a file (ReadPickerImageDescriptionFile). The description string is a simple text encoding of an image. Currently, the encoding supports only rectangles. The string format is:
Dwidth,height:Rulx,uly,rwidth,rheight,R,G,B:...
You can specify multiple rectangles (R strings). The first rectangle is "painted" in the drawing area first, and any subsequent rectangles that overlap it are painted over it. QBIC treats any area that is not painted as "to be ignored". Color information inside the "to be ignored" region does not affect the QBIC distance evaluation.
The QBIC Picker Image Description string is useful for sending image descriptions from a client to a server so that the server can construct a query image based on the description, and then search for similar images in the database. The description string is heavily used in the QBIC demo.
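The following is a minimal sketch of how a client might assemble a description string in the format shown above. The PickerRect type and the BuildPickerDescription function are hypothetical helpers, not part of the QBIC API. The sketch assumes that entries are separated by colons with no trailing colon and that each color component is an integer from 0 to 255.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: one painted rectangle in the drawing area.
struct PickerRect {
    int ulx, uly;       // upper-left corner of the rectangle
    int width, height;  // rectangle size
    int r, g, b;        // fill color (assumed to be 0..255 per component)
};

// Builds "Dwidth,height:Rulx,uly,rwidth,rheight,R,G,B:..." from a list of
// rectangles. Rectangles are emitted in paint order: the first one is painted
// first, and later ones are painted over it where they overlap.
// Assumption: entries are colon-separated and there is no trailing colon.
std::string BuildPickerDescription(int width, int height,
                                   const std::vector<PickerRect>& rects) {
    assert(width > 0 && height > 0);
    std::string s = "D" + std::to_string(width) + "," + std::to_string(height);
    for (const PickerRect& q : rects) {
        s += ":R" + std::to_string(q.ulx) + "," + std::to_string(q.uly) + "," +
             std::to_string(q.width) + "," + std::to_string(q.height) + "," +
             std::to_string(q.r) + "," + std::to_string(q.g) + "," +
             std::to_string(q.b);
    }
    return s;
}
```

For a 100 by 50 drawing area with one red rectangle, the helper produces a string such as D100,50:R0,0,100,50,255,0,0. Any part of the drawing area that no rectangle covers stays unpainted and is therefore "to be ignored".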
The API computes, stores, and retrieves data in databases and catalogs.
For example, a user may create the MY_IMAGE_DATABASE database with two catalogs, VACATION_PICTURES and FABRIC_SAMPLES.
In the case of DB2 or Oracle, this will create the MY_IMAGE_DATABASE database containing one set of tables named VACATION_PICTURES and another set of tables named FABRIC_SAMPLES.
In the case of dbm, this will create a directory named MY_IMAGE_DATABASE and two sets of files in that directory. One set will have file names with the extension VACATION_PICTURES, and the other with the extension FABRIC_SAMPLES.
Querying by content requires two phases:
During the database creation phase, you use one or more feature classes to compute the features of input images as numeric values. Each feature class creates a feature table in the database where these computed values are stored.
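As an illustration of the creation phase, the sketch below reduces each image to a few numeric values and stores them under the image key. It is not the QBIC implementation: the average-color computation stands in for a real feature class, and the Rgb, FeatureVector, and FeatureTable names are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Rgb { unsigned char r, g, b; };

// Hypothetical feature: mean red, green, and blue values of an image.
using FeatureVector = std::array<double, 3>;

// Stand-in for a feature class: computes the feature of one input image.
FeatureVector ComputeAverageColor(const std::vector<Rgb>& pixels) {
    assert(!pixels.empty());
    double sum_r = 0.0, sum_g = 0.0, sum_b = 0.0;
    for (const Rgb& p : pixels) {
        sum_r += p.r;
        sum_g += p.g;
        sum_b += p.b;
    }
    const double n = static_cast<double>(pixels.size());
    FeatureVector f;
    f[0] = sum_r / n;
    f[1] = sum_g / n;
    f[2] = sum_b / n;
    return f;
}

// Stand-in for a feature table: image key -> computed feature values.
using FeatureTable = std::map<std::string, FeatureVector>;

// Computes the feature for one image and stores it in the table.
void AddImage(FeatureTable& table, const std::string& key,
              const std::vector<Rgb>& pixels) {
    table[key] = ComputeAverageColor(pixels);
}
```

In this model, each feature class would own one such table, so an image cataloged with two feature classes has one row in each of two tables.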
During the database query phase, QBIC compares feature data in the query with the computed data in the feature tables. A query can search on one or more features for similarity.
A simple query involves only one feature. An example of a simple query would be to find images in the database that have a color distribution similar to the query image.
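A simple query can be pictured as ranking every stored feature vector by its distance to the query's feature vector. The sketch below assumes the feature is a normalized color histogram and uses the sum of absolute differences as the distance. The distance measure that QBIC actually uses is not described here, so the measure and all names in the sketch are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Feature = std::vector<double>;                   // e.g. a color histogram
using FeatureTable = std::map<std::string, Feature>;   // image key -> feature
using Hit = std::pair<std::string, double>;            // image key, distance

// Sum of absolute differences between two histograms of equal length.
// This is a stand-in, not the QBIC distance function.
double HistogramDistance(const Feature& a, const Feature& b) {
    assert(a.size() == b.size());
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += std::fabs(a[i] - b[i]);
    return d;
}

// Compares the query feature with every row of one feature table and
// returns all images ordered from most similar to least similar.
std::vector<Hit> SimpleQuery(const FeatureTable& table, const Feature& query) {
    std::vector<Hit> hits;
    for (const auto& row : table) {
        hits.emplace_back(row.first, HistogramDistance(row.second, query));
    }
    std::stable_sort(hits.begin(), hits.end(),
                     [](const Hit& x, const Hit& y) { return x.second < y.second; });
    return hits;
}
```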
A complex query involves more than one feature and can be either a multi-feature or a multi-pass query. In a multi-feature query, QBIC searches through different types of feature data in the database in order to find images that closely resemble the query image. All feature classes are treated equally during the database search, and all involved feature tables are searched at the same time. An example of a multi-feature query would be finding images in the database that have a color distribution and texture similar to a query image.
A "pass" is a single feature or multi-feature search. In a multi-pass query, the output of an initial search is used as the input for the next search. QBIC reorders the search results from a previous pass based on the "feature distances" in the current pass. An example of a multi-pass query would be finding images in the database that have a color distribution similar to the query image, and then reordering the results based on color composition.
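The multi-pass idea can be sketched as two orderings applied in sequence. The sketch assumes that the per-feature distances to the query image are already known for every image, and that the first pass returns only a limited number of best matches. The DistanceMap type, the function names, and the keep parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Image key -> feature distance to the query image for one pass.
using DistanceMap = std::map<std::string, double>;

// Orders the given keys by ascending distance. Every key must have an
// entry in `distances`; otherwise std::map::at throws std::out_of_range.
std::vector<std::string> OrderByDistance(std::vector<std::string> keys,
                                         const DistanceMap& distances) {
    std::stable_sort(keys.begin(), keys.end(),
                     [&](const std::string& a, const std::string& b) {
                         return distances.at(a) < distances.at(b);
                     });
    return keys;
}

// Pass 1 ranks all images by the first feature and keeps the best `keep`.
// Pass 2 reorders only those survivors by the second feature.
std::vector<std::string> TwoPassQuery(const DistanceMap& first_pass,
                                      const DistanceMap& second_pass,
                                      std::size_t keep) {
    assert(keep > 0);
    std::vector<std::string> keys;
    for (const auto& entry : first_pass) keys.push_back(entry.first);
    std::vector<std::string> survivors = OrderByDistance(keys, first_pass);
    if (survivors.size() > keep) survivors.resize(keep);
    return OrderByDistance(survivors, second_pass);
}
```

An image that the first pass drops cannot reappear in the final result, however small its distance is in the second pass.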
For multi-feature and multi-pass queries, you can weight features to specify their relative importance, which provides flexibility for advanced applications where the returned results must be fine-tuned.
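One common way to apply feature weights is a weighted average of the per-feature distances. The sketch below shows that approach as an assumption. The rule that QBIC uses to combine weighted features is not described here, and the CombinedDistance function is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combines per-feature distances into a single score using a weighted
// average. distances[i] is the distance for feature i, and weights[i] is
// the relative importance of that feature (non-negative, not all zero).
// Assumption: a weighted average, which may differ from the QBIC rule.
double CombinedDistance(const std::vector<double>& distances,
                        const std::vector<double>& weights) {
    assert(distances.size() == weights.size());
    double total = 0.0;
    double weight_sum = 0.0;
    for (std::size_t i = 0; i < distances.size(); ++i) {
        assert(weights[i] >= 0.0);
        total += weights[i] * distances[i];
        weight_sum += weights[i];
    }
    assert(weight_sum > 0.0);
    return total / weight_sum;
}
```

With color and texture distances of 0.25 and 0.75, equal weights give 0.5, while weighting color three times as heavily as texture gives 0.375, which pulls the score toward the color distance.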
QBIC queries can be classified into the following types:
QBIC has predefined methods to compute image properties. Each method is implemented as a C++ class.
The QBIC API implements extensive error checking routines that keep track of internal errors. A nonzero return code from any QBIC method call indicates that an error occurred. If an error occurs in a class constructor (or in any other method that does not return a value), you can query the internal state of the class instance by using the IsOk() method. If this method returns False, the instance has failed, most likely due to a memory allocation failure.
QBIC includes error handling routines, which you can call at the beginning of your program or before QBIC calls. See "QBIC Error Handling Routines" on page 165 for information.
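The calling pattern that follows from these conventions can be sketched with a stand-in class. ExampleBuffer is hypothetical and is not a QBIC class. It only mirrors the two conventions described above: methods return zero on success and nonzero on error, and IsOk() reports whether construction succeeded.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in that follows the QBIC error conventions.
class ExampleBuffer {
public:
    // A constructor cannot return a code, so failure is recorded in the
    // object: a failed allocation leaves data_ null.
    explicit ExampleBuffer(std::size_t size)
        : data_(new (std::nothrow) unsigned char[size]), size_(size) {}
    ~ExampleBuffer() { delete[] data_; }
    ExampleBuffer(const ExampleBuffer&) = delete;
    ExampleBuffer& operator=(const ExampleBuffer&) = delete;

    // False means the instance failed during construction.
    bool IsOk() const { return data_ != nullptr; }

    // Returns 0 on success and a nonzero code on error.
    int Set(std::size_t index, unsigned char value) {
        if (!IsOk() || index >= size_) return 1;
        data_[index] = value;
        return 0;
    }

private:
    unsigned char* data_;
    std::size_t size_;
};

// Caller-side pattern: check IsOk() after construction, then check the
// return code of every call and stop at the first nonzero value.
int FillFirstBytes(std::size_t size, std::size_t count) {
    ExampleBuffer buffer(size);
    if (!buffer.IsOk()) return -1;
    for (std::size_t i = 0; i < count; ++i) {
        const int rc = buffer.Set(i, 0xFF);
        if (rc != 0) return rc;
    }
    return 0;
}
```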
The QBIC API contains advanced features that can speed up a QBIC search and improve QBIC feature extraction. These advanced features are not supported in this QBIC release, although some methods that support these advanced features are included in this API. If you are interested in implementing any of the following features in your applications, contact IBM. See "Licensing Information" on page xiii for how to contact IBM.
Many of the QBIC feature classes support the concept of SubPart or Object features. SubPart and Object features represent feature data that is computed for only part of an image, the "part" being selected by a mask image. Queries are computed for these subimages using the SubPartFeatureClass.
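To illustrate how a mask restricts a feature to part of an image, the sketch below averages color only over the pixels that the mask selects. It does not use the SubPartFeatureClass. The MaskedAverageColor function and the convention that a nonzero mask value selects a pixel are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Rgb { unsigned char r, g, b; };

// Computes the mean color over the pixels selected by the mask.
// Assumption: mask[i] != 0 means that pixel i belongs to the subpart.
// Returns false and leaves `out` unchanged if the mask selects no pixels.
bool MaskedAverageColor(const std::vector<Rgb>& pixels,
                        const std::vector<unsigned char>& mask,
                        std::array<double, 3>& out) {
    assert(pixels.size() == mask.size());
    double sum_r = 0.0, sum_g = 0.0, sum_b = 0.0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        if (mask[i] == 0) continue;  // pixel lies outside the subpart
        sum_r += pixels[i].r;
        sum_g += pixels[i].g;
        sum_b += pixels[i].b;
        ++count;
    }
    if (count == 0) return false;
    const double n = static_cast<double>(count);
    out[0] = sum_r / n;
    out[1] = sum_g / n;
    out[2] = sum_b / n;
    return true;
}
```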
To speed up queries for very large databases, QBIC has a method that organizes feature vectors into clusters for a given feature class. This feature is called cluster indexing.
If cluster indexing is enabled, QBIC will match against a subset of the feature vectors during a query, thereby reducing the query time. The cluster tables are stored in special database files. Modifications, such as the addition or deletion of records, are handled by the QbCatalogClass.
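The general idea behind cluster indexing can be sketched as follows. Each feature vector is filed under its nearest cluster centroid, and a query is compared only with the members of the cluster nearest to the query. This is a generic illustration under assumed names and a fixed set of centroids, not the QBIC implementation. Note that searching a single cluster trades some accuracy for speed, because the best match can lie in a neighboring cluster.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <utility>
#include <vector>

using Feature = std::vector<double>;

double SquaredDistance(const Feature& a, const Feature& b) {
    assert(a.size() == b.size());
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += (a[i] - b[i]) * (a[i] - b[i]);
    return d;
}

// Hypothetical cluster record: a centroid plus the vectors filed under it.
struct Cluster {
    Feature centroid;
    std::vector<std::pair<std::string, Feature>> members;  // image key, feature
};

// Index of the cluster whose centroid is closest to v.
std::size_t NearestCluster(const std::vector<Cluster>& clusters, const Feature& v) {
    assert(!clusters.empty());
    std::size_t best = 0;
    double best_d = SquaredDistance(clusters[0].centroid, v);
    for (std::size_t i = 1; i < clusters.size(); ++i) {
        const double d = SquaredDistance(clusters[i].centroid, v);
        if (d < best_d) { best_d = d; best = i; }
    }
    return best;
}

// Files a new feature vector under its nearest centroid.
void Insert(std::vector<Cluster>& clusters, const std::string& key, const Feature& v) {
    clusters[NearestCluster(clusters, v)].members.emplace_back(key, v);
}

// Compares the query only with members of the nearest cluster. Returns the
// best key (empty if that cluster has no members) and reports, through
// `compared`, how many vectors were examined.
std::string QueryNearest(const std::vector<Cluster>& clusters, const Feature& query,
                         std::size_t* compared) {
    const Cluster& c = clusters[NearestCluster(clusters, query)];
    std::string best_key;
    double best_d = std::numeric_limits<double>::infinity();
    for (const auto& m : c.members) {
        const double d = SquaredDistance(m.second, query);
        if (d < best_d) { best_d = d; best_key = m.first; }
    }
    if (compared != nullptr) *compared = c.members.size();
    return best_key;
}
```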
This QBIC advanced feature stores, for each image key in the database, the top results of a query that uses that key. When a user submits a query using a key that is already in the database, which is often the case, QBIC only needs to read a record from a special database file and return the stored result.
As with cluster indexing, modifications to the database are handled by the QbCatalogClass. Extra data is stored in the global information.
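A stored-results table of this kind behaves like a lookup keyed by image key. The sketch below is an assumption-based illustration rather than the QBIC design. The PrecomputedResults class and its methods are hypothetical, and the Remove method shows one plausible form of the bookkeeping needed when an image is deleted.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

using ResultList = std::vector<std::string>;  // image keys, best match first

// Hypothetical store of precomputed top results, keyed by query image key.
class PrecomputedResults {
public:
    // Records the top results for a query that uses `query_key`.
    void Store(const std::string& query_key, const ResultList& top_results) {
        table_[query_key] = top_results;
    }

    // Returns true and fills `out` if results are stored for the key.
    // Returns false otherwise, in which case a full search is needed.
    bool Lookup(const std::string& query_key, ResultList& out) const {
        const auto it = table_.find(query_key);
        if (it == table_.end()) return false;
        out = it->second;
        return true;
    }

    // Keeps the table consistent when an image is deleted: drops the
    // image's own record and removes it from every other stored list.
    void Remove(const std::string& key) {
        table_.erase(key);
        for (auto& entry : table_) {
            ResultList& r = entry.second;
            r.erase(std::remove(r.begin(), r.end(), key), r.end());
        }
    }

private:
    std::map<std::string, ResultList> table_;
};
```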