XML Indexing

XTC provides several index structures for all kinds of XML documents and queries. Besides the common element and content indexes, XTC also provides path indexes and CAS (content-and-structure) indexes. Fine-grained index specifications allow cluster properties to be set and documents to be indexed selectively, and they support efficient query evaluation.

Element Index

The element index covers all XML elements and stores, for each one, its vocabulary ID (the translated tag name), the corresponding PCR (path class reference) if the document is stored in elementless mode, and its unique DeweyID. The index structure has two levels: the first is the name directory (XML tag names), and the second holds the reference lists.

Supports: element-based access, //<tag> access, structural joins
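The two-level layout can be illustrated with a minimal sketch. It is not XTC's implementation: the class `ElementIndex`, its methods, and the helper `structural_join` are hypothetical names, DeweyIDs are simplified to tuples of integers (so that tuple comparison yields document order), and PCRs are left out.

```python
from bisect import insort


class ElementIndex:
    """Hypothetical two-level element index: name directory plus reference lists."""

    def __init__(self):
        self.vocabulary = {}  # level 1: tag name -> vocabulary ID
        self.references = {}  # level 2: vocabulary ID -> sorted list of DeweyIDs

    def add(self, tag, dewey_id):
        # Assign the next free vocabulary ID the first time a tag name is seen.
        voc_id = self.vocabulary.setdefault(tag, len(self.vocabulary))
        # Keep each reference list sorted, i.e. in document order.
        insort(self.references.setdefault(voc_id, []), dewey_id)

    def scan(self, tag):
        """Return the DeweyIDs of all elements named `tag` (a //tag access)."""
        voc_id = self.vocabulary.get(tag)
        if voc_id is None:
            return []
        return list(self.references[voc_id])


def structural_join(ancestors, descendants):
    """Nested-loop join: an ancestor's DeweyID is a proper prefix of a descendant's."""
    return [(a, d) for a in ancestors for d in descendants
            if len(d) > len(a) and d[:len(a)] == a]


index = ElementIndex()
index.add("book", (1, 3))
index.add("title", (1, 3, 3))
index.add("book", (1, 5))
index.add("title", (1, 5, 3))
pairs = structural_join(index.scan("book"), index.scan("title"))
```

The nested loop keeps the join short for illustration; a real system would exploit the sort order of both lists instead.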

Content Index

The content index covers the text values of text nodes and attribute nodes. It is available for both storage mappings (full and elementless). Entries are ordered primarily by their text value and secondarily by their DeweyID.

Supports: value-based predicate evaluation or keyword-based search and access
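The ordering by text value and then by DeweyID can be sketched as a sorted list of pairs. This is only an illustration under simplifying assumptions: `ContentIndex` and its methods are hypothetical names, DeweyIDs are modeled as tuples of integers, and a plain Python list stands in for whatever on-disk structure XTC uses.

```python
from bisect import bisect_left, insort


class ContentIndex:
    """Hypothetical content index: entries sorted by (text value, DeweyID)."""

    def __init__(self):
        self.entries = []  # sorted list of (text value, DeweyID) pairs

    def add(self, text, dewey_id):
        insort(self.entries, (text, dewey_id))

    def lookup(self, text):
        """Return the DeweyIDs of all nodes whose value equals `text`."""
        # The 1-tuple (text,) sorts before every (text, dewey_id) pair,
        # so bisect_left lands on the first entry with this text value.
        start = bisect_left(self.entries, (text,))
        result = []
        for value, dewey_id in self.entries[start:]:
            if value != text:
                break
            result.append(dewey_id)
        return result


content = ContentIndex()
content.add("XML", (1, 5, 3))
content.add("Index", (1, 3, 3))
content.add("XML", (1, 3, 5))
matches = content.lookup("XML")
```

Because equal text values are adjacent and already sorted by DeweyID, an equality predicate returns its matches in document order without an extra sort.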

Path Index

The path index is only available for the elementless storage mapping. For each PCR, it captures a reference list of the corresponding document nodes. An index entry consists of the PCR and the DeweyID. Because the two together form the key, the index does not contain any separate values.

A cluster property is available to switch the order of the key components (PCR|DeweyID or DeweyID|PCR).

Supports: path queries or path expressions
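The effect of the cluster property can be sketched with a key-only index whose composite key order is chosen at creation time. Everything here is an assumption made for illustration: the class `PathIndex`, the `cluster_by` parameter and its values, integer PCRs, and DeweyIDs modeled as tuples of integers.

```python
from bisect import bisect_left, insort


class PathIndex:
    """Hypothetical key-only path index with a switchable key order."""

    def __init__(self, cluster_by="pcr"):
        if cluster_by not in ("pcr", "dewey"):
            raise ValueError("cluster_by must be 'pcr' or 'dewey'")
        self.cluster_by = cluster_by
        self.keys = []  # sorted composite keys; there are no values

    def add(self, pcr, dewey_id):
        key = (pcr, dewey_id) if self.cluster_by == "pcr" else (dewey_id, pcr)
        insort(self.keys, key)

    def nodes_for_path(self, pcr):
        """Return the DeweyIDs of all nodes on the path identified by `pcr`."""
        if self.cluster_by == "pcr":
            # PCR|DeweyID: all nodes of one path are adjacent, so a range scan suffices.
            start = bisect_left(self.keys, (pcr,))
            result = []
            for key_pcr, dewey_id in self.keys[start:]:
                if key_pcr != pcr:
                    break
                result.append(dewey_id)
            return result
        # DeweyID|PCR: keys follow document order, so the whole index is filtered.
        return [dewey_id for dewey_id, key_pcr in self.keys if key_pcr == pcr]


by_path = PathIndex(cluster_by="pcr")
by_document = PathIndex(cluster_by="dewey")
for pcr, dewey_id in [(4, (1, 3, 3)), (5, (1, 3, 5)), (4, (1, 5, 3))]:
    by_path.add(pcr, dewey_id)
    by_document.add(pcr, dewey_id)
```

Both orders return the same nodes. In this sketch, PCR|DeweyID favors the evaluation of a single path, whereas DeweyID|PCR keeps the entries of neighboring document nodes together.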

Content And Structure Index (CAS)

The CAS index is only available for the elementless storage mapping. Based on text content and PCRs, it captures reference lists of the corresponding document nodes. An index entry uses the text value as key and the PCR and the DeweyID as values, which also serve as the secondary sort criterion. Optional clustering by PCR or by DeweyID is therefore possible.

Supports: complex queries, content and path evaluation
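A combined content-and-path lookup might be sketched as follows. The names `CASIndex`, `cluster_by`, and `lookup` are hypothetical, PCRs are plain integers, DeweyIDs are tuples of integers, and a sorted Python list again stands in for the real index structure.

```python
from bisect import bisect_left, insort


class CASIndex:
    """Hypothetical CAS index: text value as key, (PCR, DeweyID) as value."""

    def __init__(self, cluster_by="pcr"):
        if cluster_by not in ("pcr", "dewey"):
            raise ValueError("cluster_by must be 'pcr' or 'dewey'")
        self.cluster_by = cluster_by
        self.entries = []  # sorted (text, first, second) triples

    def add(self, text, pcr, dewey_id):
        # The value components double as the secondary sort criterion.
        secondary = (pcr, dewey_id) if self.cluster_by == "pcr" else (dewey_id, pcr)
        insort(self.entries, (text,) + secondary)

    def lookup(self, text, pcr):
        """Return DeweyIDs of nodes with value `text` on the path `pcr`."""
        start = bisect_left(self.entries, (text,))
        result = []
        for entry in self.entries[start:]:
            if entry[0] != text:
                break
            if self.cluster_by == "pcr":
                entry_pcr, dewey_id = entry[1], entry[2]
            else:
                dewey_id, entry_pcr = entry[1], entry[2]
            if entry_pcr == pcr:
                result.append(dewey_id)
        return result


cas = CASIndex(cluster_by="pcr")
cas.add("XML", 7, (1, 3, 3))
cas.add("XML", 9, (1, 5, 3, 3))
cas.add("SQL", 7, (1, 7, 3))
hits = cas.lookup("XML", 7)
```

A query that combines a path with a value predicate can thus be answered from one index, without intersecting the results of a separate content lookup and path lookup.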

Lehrgebiet Informationssysteme
