For HTML and XML documents, Net Search Extender provides default document models that are used if you do not define a document model. For structured plain text documents, you must provide and specify a document model.
If you use one of the default document models, the following behavior applies:
Table 6. Behavior of the default document models for the supported document formats
Document type | Behavior of the default document model |
---|---|
HTML | Accepts these as text fields: <a> <address> <au> <author> <h1> <h2> <h3> <h4> <h5> <h6> <title>. Field name is the tag name, for example "address". |
XML | Accepts all tags as text fields. Field name is the tag path name in XPath notation, for example "/play/title". |
Structured plain text (GPP) | No default document model. |
Outside-In (INSO) | Accepts as text fields the document properties shown in Element parameters, as returned by the Outside-In filters. Field name is the name of the document property used by Outside-In, for example "SCCCA_TITLE". Attributes are not supported. |
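
The XML row of the table can be illustrated with a short Python sketch. It is not Net Search Extender code and does not reproduce how the product parses documents; it only shows how field names in tag path notation, such as "/play/title", can be derived from a document's structure. The function name `tag_path_fields` and the sample document are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def tag_path_fields(xml_text):
    """Map each tag path (XPath-style) to the text found directly inside it.

    Hypothetical helper for illustration only; it ignores attributes
    and any text that follows a child element (tail text).
    """
    root = ET.fromstring(xml_text)
    fields = defaultdict(list)

    def walk(element, parent_path):
        # Build the path from the root down, for example "/play/title".
        path = f"{parent_path}/{element.tag}"
        text = (element.text or "").strip()
        if text:
            fields[path].append(text)
        for child in element:
            walk(child, path)

    walk(root, "")
    return dict(fields)


# Made-up sample document.
sample = "<play><title>Hamlet</title><act><title>Act I</title></act></play>"
fields = tag_path_fields(sample)
for name, values in fields.items():
    print(name, values)
```

In this sketch the two `title` elements end up under different field names, "/play/title" and "/play/act/title", because the name is the full tag path rather than the tag alone.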
A document model is defined for each type of document. Because the models all differ, an example and an explanation are provided for each one.
Note |
---|
Although the default document models do correctly process documents, for better indexing and search you should define your own document models. With the default document model, the text of a document is fully indexed regardless of whether it is part of a text field. This means that unrestricted text searches also search that text. |
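
The effect described in the note can be sketched in Python for the HTML case. This is an illustration under stated assumptions, not the product's indexing logic: the class `FieldCollector` and the sample page are hypothetical, and the tag set is copied from the HTML row of Table 6. The sketch collects text per field and, separately, all text in the document, which is what an unrestricted search would cover.

```python
from html.parser import HTMLParser

# Tags that the default HTML document model accepts as text fields (Table 6).
DEFAULT_FIELD_TAGS = {
    "a", "address", "au", "author",
    "h1", "h2", "h3", "h4", "h5", "h6", "title",
}


class FieldCollector(HTMLParser):
    """Hypothetical collector: gathers text per field and all text overall."""

    def __init__(self):
        super().__init__()
        self.open_fields = []  # stack of field tags currently open
        self.fields = {}       # field name -> list of text fragments
        self.all_text = []     # every text fragment, fielded or not

    def handle_starttag(self, tag, attrs):
        if tag in DEFAULT_FIELD_TAGS:
            self.open_fields.append(tag)

    def handle_endtag(self, tag):
        # Assumes well-formed nesting of field tags.
        if self.open_fields and self.open_fields[-1] == tag:
            self.open_fields.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        self.all_text.append(text)
        for tag in self.open_fields:
            self.fields.setdefault(tag, []).append(text)


# Made-up sample page.
sample = (
    "<html><head><title>Report</title></head>"
    "<body><h1>Summary</h1><p>Unfielded paragraph.</p></body></html>"
)
collector = FieldCollector()
collector.feed(sample)
collector.close()

print(collector.fields)
print(collector.all_text)
```

The paragraph text belongs to no field, yet it still appears in `all_text`. In the same way, an unrestricted search against an index built with the default model would match text that lies outside every text field.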