Select the Create command and from the extended menu highlight Text Index. A Create Text Index Wizard pops up. Use the wizard to specify the configuration options for the text index in a number of panels.
To move between the panels, enter all the mandatory information and click on the Next button until the Finish button is enabled. Click on the Finish button to create the text index.
To create a text index on views, use the CREATE INDEX command described in CREATE INDEX.
This panel allows you to specify the schema and name for the text index. You can also specify a work and index directory for the text index files. Create the administrative tables for the index on the administration tablespace.
Here is a description of the fields in the panel:
Field Name | Mandatory/ Optional | Default | Description |
Index schema | Mandatory | user ID | Select a schema name of the text index. This is the DB2 schema name for the index-specific administration tables. |
Index name | Mandatory | N/A | Enter a valid DB2 index name for the text index. With the index schema, this uniquely identifies a full-text index in the database. |
Index directory | Optional | See the path name | Specify the directory path where you will store the text index. The directory must exist with read, write, and run permissions for the DB2 instance owner user ID. |
Work directory | Optional | See the path name | Specify the work directory where you will store temporary files during search and administration operations. The directory must exist with read, write, and run permissions for the DB2 instance owner user ID. |
Administration tablespace | Optional | Use the default tablespace | Select a tablespace name for the text index administration tables. You must define the tablespace on the same node group as the tablespace for the user table. |
This panel allows you to specify the schema and name of the table or nickname table, and the name of the text column containing the data you want to index. You can use a transformation function to modify the content of the text column. In addition to the text column, you can also specify numeric attributes if you want to index content of a table column expression.
Here is a description of the fields in the panel:
Field Name | Mandatory/ Optional | Default | Description |
Table schema (1) | Mandatory | user ID | Select the schema of the table or nickname table on which you are creating a text index. |
Table name (2) | Mandatory | N/A | Select the name of the table or nickname table on which you are creating an index. The table must have a primary key. |
Text column (3) | Mandatory | N/A | Select the name of the column used for creating the text index. The column must be transformed to, or be one of the following types: CHAR (for bit data), VARCHAR (for bit data), LONG VARCHAR (for bit data), CLOB, DBCLOB, BLOB, GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC, and DATALINK. |
Transformation function | Optional | Disabled | Select to use a transformation function. |
Transformation function: Schema | Mandatory (if function selected) | user ID | Select the schema of the UDF used to access the text documents. |
Transformation function: Name | As above | N/A | Select the name of a UDF used to access the text documents. |
Note that you can only specify the table schema (1), table name (2) and text column (3) in this order.
To view or add attributes, click on the Numeric Attributes button. A window displays. To add numeric attributes to the index, click on the Add button and a further window displays. Specify the SQL column expression and name for the attribute.
Alternatively, select an attribute and press the appropriate buttons to change, move, or remove an entry.
Use Numeric Attributes to index column expressions in addition to the text column. For example, if you want to index the column date of type TIMESTAMP in addition to the text column, specify a numeric attribute "cast(julian_day(date) as double)" and specify a name for the attribute.
Specify a numeric attribute if you want to use a numeric expression inside a search query. If you are searching with SQL queries, you can use a combined search instead of using numeric attributes, for example: WHERE numattrib = 123 AND contains('...').
This panel allows you to specify the language and format of the text documents. If the documents are not the same CCSID as the database and the text column is of binary type, specify the CCSID. Note that the database CCSID is initially selected. If your documents are of a GPP, HTML, Outside In, or XML structured format, you can specify a document model.
In the format list box, the Outside In filtering format is known as INSO.
Here is a description of the fields in the panel:
Field Name | Mandatory/ Optional | Default | Description |
Language | Optional | EN_US | Select a language to determine end-of-sentence and end-of-paragraph delimiters when indexing documents. |
CCSID | Optional | CCSID of database | Select the CCSID for indexing text documents. |
Format | Optional | TEXT | Select the text document format: HTML, XML, TEXT, INSO or GPP. |
Default Document Model | Optional | Enabled | Use the default document model. |
User Document Model | Optional | Disabled | Use your document model. |
Model name | Mandatory (if User Document Model selected) | N/A | Enter the name of the document model. For HTML, XML, Outside In, and GPP formats, you can specify a document model. Note that the name is only found in the model file. |
Model file | As above | N/A | Specify the document model file. The file must be readable by DB2 instance owners. |
Model CCSID | As above | Database CCSID | Select the CCSID to interpret the contents of the document model file. |
Treat numbers as words | Optional | Disabled | Select to interpret sequences of digits as separate words, even if they are adjacent to characters. |
Index stopwords | Optional | Enabled | Select to enable language-specific stopword processing. The <language>.tsw in the directory sqllib/db2ext/resources contains the stopword list. |
This panel allows you to specify whether the index updates incrementally or is recreated from scratch. You can specify update settings so that the index automatically updates at the specified time.
Here is a description of the fields in the panel:
Field Name | Mandatory/ Optional | Default | Description |
Incremental update | Optional | Enabled | Select for incremental index updates. If you do not enable the check box, you recreate the index when an update operation is performed. |
Commitcount | Optional | 0 | Number of changes processed during an update in
one transaction.
Commitcount has implications on performance. For information, see Performance considerations. |
Capture table characteristics | Optional | N/A | Select to use a replication capture table for capturing changes on the source table. The replication capture table must either be a Capture Data (CD) table , or a Capture Change Data (CCD) table and replaces the DB2 Net Search Extender generated log table. |
Replication capture schema name | Optional | User ID | The schema name of the replication capture table. Note that the table must have been previously created using DB2 Replication. |
Replication capture table name | Mandatory, if Capture table characteristics enabled | N/A | The table name of the replication capture table. Note that the table must have been previously created using DB2 Replication. |
Control table schema name | Mandatory, if Capture table characteristics enabled | N/A | The control table schema name. Note that the tables must have been previously created using DB2 Replication. |
Reorg automatic or manual radio button | Optional or Mandatory | Enabled/ disabled | Completes index reorganization automatically or manually. |
Minimum number of changes for Update | Optional | 1 | Specify the minimum number of changes to the text documents before the index incrementally updates at the specified time. |
Update schedule | Optional | Disabled | Select to add automatic update settings. |
To add index update settings, click on the Settings button. Note that this button is only enabled if you select Update Schedule. In the dialog, select the days, hours, and minutes for the update time. Note that if you select multiple days, the update occurs at the same time on all the selected days.
This panel allows you to specify a cached table in addition to the index. You can specify the result columns to be cached and you can search the cache using a stored procedure. You can also specify other cache parameters, such as type, maximum size, and the order in which you retrieve the contents of the user table during initial indexing.
Here is a description of the fields in the panel:
Field Name | Mandatory/ Optional | Default | Description |
Enable table cache | Optional | Disabled | Select to enable the building of a cached table. |
Result column table | Mandatory (if Enable table cache selected) | N/A | Displays a list of SQL column expressions specifying the search result columns. |
Maximum number of documents | Mandatory | Row count of table | See the following section: Determining cache utilization and cache size. |
Average cache row length | Mandatory | N/A | See the following section: Determining cache utilization and cache size. |
Percentage of initial cache utilization | Optional | 50% | Select the percentage of the cache held free for additional documents. |
Maximum cache size | Optional | N/A | Specify a maximum size for the cached table built during index activate. If the number is too small, the activation will fail. |
Initial search result order | Optional | Disabled | Select to define the search result order. Documents are returned in the same indexing order as in the cached table. This order can not be ensured after incremental update. |
Use persistent cache | Optional | Enabled | This option enables a fast activate execution after a deactivation or system reboot. Note that you must specify a directory path for the persistent cache. Leave disabled if the cache should be temporary. |
The Percentage of initial cache utilization specifies the percentage of the cache to be held free for additional documents. The Maximum cache size specifies the maximum size of the cached table to be built during activate cache. These options depend on the following factors:
You can enter the recommended values for the Percentage of initial cache utilization and Maximum cache size. Alternatively, you can let them be calculated each time you enter values in the Maximum number of documents or Average cache row length fields.
The Maximum number of documents value is initially set to the row count of the table. Modify this according to the number of documents and expected number of changes. Include all document updates, additions, and deletions.
When you add an SQL expression to the cache table list, the Average cache row length is calculated according to the length of the result. As this is based on the number of rows in your table, the calculation can take a considerable amount of time. If you know that on average this value is smaller, modify the value.
For example, if your table has 10 entries and the sum of your column expressions is 100, then these values are initially set. If you expect that the maximum number of documents (including deleted ones) is 10 000, enter this figure. If you know that column expressions on average are smaller than the calculated value, such as a VARCHAR(100) and a filled-in text size of 10, use this figure for the average row size.
To define the initial search result order, click on the Define button. Note that this button is only enabled if you select the Initial Search Result Order check box. A dialog displays all the specified SQL column expressions. To add a result order, click on the Add button and, in the dialog, specify the SQL result order.
To change, move, or remove an entry, select the expression and click on the appropriate buttons.
To add SQL column expressions, click on the Add button, next to the Result Column table. In the dialog, specify the result column expression and name.
To change or remove an entry, click on the column expression which enables the appropriate buttons.
This panel provides an overview of the previously selected parameters.
Click on the Show Command button to view the commands that are run when you click on the Finish button. This action creates the text index.