What happens when an XML document is indexed

The following table shows what is put into the index.

Table 8. Entries in the text index

Field boundary information        Indexed text

start of "addresses" field        Alice Smith [1] and [2]
start of "customerName" field

end of "customerName" field       123 Maple Street
                                  Mill Hill
                                  CA 90999
                                  [2]

end of "addresses" field          123  1
                                  S&B Lawnmower Type ABC-x
                                  239.90  2001-01-25
                                  987Z  1
                                  Multifunction Rake ZYX
                                  69.90  2001-01-24

Attribute name                    Attribute value(s)

Part number                       123, 987

Note that nested fields are possible, as this example shows. The field addresses selects a node in the XML document that dominates (is an ancestor of) the node selected by the field customerName. The content of the embedded node therefore logically belongs to both fields. Although text fields can overlap, the text inside them is indexed only once. In this example, a search with a field restriction finds Alice Smith in addresses as well as in customerName.
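The behavior of overlapping fields can be illustrated with a small model. The following Python sketch is not Net Search Extender code and does not reflect its internal index format; the names build_index, search, IndexEntry, and EVENTS are hypothetical and exist only for this illustration. It replays the field boundaries and text from Table 8, stores each piece of text once, and records which fields were open at that point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    text: str
    fields: frozenset  # names of every field open when the text was seen


def build_index(events):
    """Turn a stream of ("start"|"end"|"text", value) events into entries.

    Each text value is stored exactly once, tagged with all enclosing fields.
    """
    open_fields = []
    entries = []
    for kind, value in events:
        if kind == "start":
            open_fields.append(value)
        elif kind == "end":
            open_fields.remove(value)
        else:  # kind == "text"
            entries.append(IndexEntry(value, frozenset(open_fields)))
    return entries


def search(entries, term, field=None):
    """Return matching texts, optionally restricted to one field."""
    return [
        e.text
        for e in entries
        if term in e.text and (field is None or field in e.fields)
    ]


# Event stream modeled on Table 8 (illustrative only).
EVENTS = [
    ("start", "addresses"),
    ("start", "customerName"),
    ("text", "Alice Smith"),
    ("end", "customerName"),
    ("text", "123 Maple Street"),
    ("text", "Mill Hill"),
    ("text", "CA 90999"),
    ("end", "addresses"),
    ("text", "S&B Lawnmower Type ABC-x"),
    ("text", "Multifunction Rake ZYX"),
]

index = build_index(EVENTS)
```

In this model, search(index, "Alice Smith", "addresses") and search(index, "Alice Smith", "customerName") both return the single stored entry, while a search for "Maple" restricted to customerName returns nothing, because that text lies only inside addresses.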

The content of fields is determined by the following rules:

The document must contain well-formed XML, but it is not necessary for a DTD to be specified in the XML document. No DTD validation or external entity resolution is carried out; Net Search Extender only matches the XML document against the document model. Internal entities are substituted as required by XML.
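To see what "well-formed but not validated" means in practice, the following Python sketch uses the standard-library parser xml.etree.ElementTree as a stand-in for a non-validating XML parser. It is only an analogy and is not how Net Search Extender is implemented; the sample document and the helper names extract_text and is_well_formed are assumptions made for this illustration. The document declares an internal entity but no element declarations, so it would not pass DTD validation, yet it parses and the entity is substituted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: an internal subset that declares one
# internal entity and nothing else.
DOC = """<?xml version="1.0"?>
<!DOCTYPE purchaseOrder [
  <!ENTITY model "Type ABC-x">
]>
<purchaseOrder>
  <item>S&amp;B Lawnmower &model;</item>
</purchaseOrder>"""


def extract_text(xml_source):
    """Parse without validation and return the whitespace-normalized text."""
    root = ET.fromstring(xml_source)
    return " ".join("".join(root.itertext()).split())


def is_well_formed(xml_source):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_source)
    except ET.ParseError:
        return False
    return True
```

Here extract_text(DOC) yields the text with &model; and &amp; already replaced, whereas a document with a missing end tag, such as "<purchaseOrder><item>Rake</purchaseOrder>", is rejected by is_well_formed.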

For information on Document Type Definitions, see DTD for document models.

For restrictions, see Limitations for text fields and document attributes.