IBM OmniFind Analytics Edition Dictionary Editor Guide
Edition Notice
First Edition (February 2007)

This edition applies to version 8, release 4 of IBM® OmniFind™ Analytics Edition and to all subsequent releases and modifications until otherwise indicated in new editions.

This document contains proprietary information of IBM. This proprietary information is provided in accordance with the license conditions and is protected by copyright. Information contained in this document provides no warranties whatsoever for any products. Also, no descriptions provided in this document should be interpreted as product warranties. Depending on the system environment, the yen symbol may be displayed as the backslash symbol, or the backslash symbol may be displayed as the yen symbol.

© Copyright International Business Machines Corporation 2007. All rights reserved.

US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

1 Introduction
This document describes how to use the IBM OmniFind Analytics Edition Dictionary Editor application.
1.1 Functional Overview
The Dictionary Editor is a Web application that you can use to edit the following items. See the Overview document for definitions of terms such as category, keyword, and synonyms.
  • Edit the category tree:
    Add or delete categories.

  • Edit keywords:
    Add or delete keywords, or register them in a category.

  • Edit synonyms:
    Add synonyms to keywords or delete synonyms from keywords; also, select synonyms to be used as keywords.

The following figure shows the relationship between editing a dictionary with Dictionary Editor and analysis by Text Miner:

1.2 Dictionary Resource Files
The Dictionary Editor supports editing operations by multiple users. To avoid editing conflicts, Dictionary Editor locks the files that are being edited so that other users cannot edit the same files. Editing goes more smoothly when users know which files can conflict with one another. Each file type is described below:
  • Category tree file:
    The entire category tree is saved as one file, and you use the Edit category tree screen to edit it. Because editing of the category tree and editing of keywords conflict with each other, other users cannot edit keywords while you edit the category tree. At the same time, you cannot edit the category tree while another user is editing keywords.

  • Keyword file:
    A keyword file stores a list of keywords and synonyms together with their category information. Only one user can edit a keyword file at a time, so creating a separate keyword file for each operator is recommended to avoid conflicts. Usually, keyword files are created per category (for example, a product name dictionary and a service name dictionary), and operators are assigned by category in the same way. Note that you cannot edit the category tree and keyword files at the same time because they conflict with each other.

  • Candidate word file:
    A candidate word file is used to load frequently used character strings that have been extracted, or lists of product names or service names retrieved from internal databases, into the IBM OmniFind Analytics Edition dictionary. Multiple users can use this file simultaneously because it is read-only.
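
The conflict rules above can be summarized in a short sketch. The Python below is illustrative only: the function name, the CATEGORY_TREE constant, and the file names are hypothetical and are not part of Dictionary Editor.

```python
from typing import Optional, Set

# Hypothetical model of the conflict rules described above; not the product's code.
CATEGORY_TREE = "category_tree"


def blocking_file(requested: str, locked: Set[str]) -> Optional[str]:
    """Return a locked file that blocks editing `requested`, or None if editing can start.

    `locked` holds the files that other users are editing: CATEGORY_TREE and/or
    keyword file names. Candidate word files are read-only and never appear in it.
    """
    if requested == CATEGORY_TREE:
        # The category tree conflicts with itself and with every keyword file.
        return min(locked) if locked else None
    if CATEGORY_TREE in locked:
        # Keywords cannot be edited while the category tree is being edited.
        return CATEGORY_TREE
    # Only one user can edit a given keyword file at a time.
    return requested if requested in locked else None


print(blocking_file("products", {"services"}))     # None: different keyword files
print(blocking_file("products", {"products"}))     # products: same keyword file
print(blocking_file(CATEGORY_TREE, {"services"}))  # services: keyword edit in progress
```

In this model, two operators who each work in their own keyword file never block each other, which is the reason for the per-operator recommendation above.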
1.3 Page Transition
Dictionary Editor provides the following screens for editing the category tree, keywords, and synonyms.

  • In the Select Database screen, select a database to be edited.

  • Menu items are always displayed in the left frame, but edit menu items are not available until a database is selected.

  • In the Configuration screen, set parameters for keyword edit.

  • Add or delete categories in the Edit category tree screen.

  • In the Select keyword file dialog, specify a keyword file to be edited. You can also specify a candidate word file, if it exists.

  • In the Edit keyword candidate mode screen, add or delete keywords. In this mode, you can add new keywords by entering the words. You can also add candidate words in the candidate word file as keywords.

  • In the Edit keyword category tree mode screen, in addition to adding or deleting keywords, you can register keywords in categories or delete keywords from categories. The category tree is displayed in this mode, and keywords can be searched for each currently registered category.

  • In the Edit synonyms screen, add synonyms to keywords or delete synonyms from keywords. When synonyms are already set, you can make a particular synonym the keyword and make the current keyword its synonym.

1.4 Browser Settings
Set the browser in accordance with the security policy of your environment.

To enable pop-up windows while using Dictionary Editor, the following settings are required.

  1. Open Internet Explorer.
  2. Click Tools -> Internet Options -> Security.
  3. Select Trusted sites and click the Sites button.
  4. Type the base URL of Dictionary Editor in the Add this site to the zone field. For example, if the URL of Dictionary Editor is https://dic.ibm.com:9443/OAE_DIC/, type:
    https://dic.ibm.com
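
As a small illustration of step 4, the base URL in the example consists of the scheme and host name only, without the port and path. The sketch below derives it with Python's standard urllib.parse module; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def trusted_site_entry(editor_url: str) -> str:
    """Return the scheme plus host name, dropping the port and path, as in the example above."""
    parts = urlsplit(editor_url)
    return f"{parts.scheme}://{parts.hostname}"


print(trusted_site_entry("https://dic.ibm.com:9443/OAE_DIC/"))  # https://dic.ibm.com
```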
Depending on its settings, the Google Toolbar can also block pop-up windows. If it does, disable its pop-up blocking function so that you can use Dictionary Editor.
2 Before Editing the Dictionary
2.1 Initial Screen and Database Selection
In the initial state, no database has been selected. Among the menu items listed on the left side of the screen, only "Select Database" and "Help" are active.

Select a database in the initial screen:
  1. Click Select Database under Menu.
  2. Select the database that is to be used in the dictionary edit operation from the list.
  3. Click OK.
2.2 After Selecting a Database
The following screen is shown after you select a database.

After selecting a database:
  1. A message saying that the selected database has been loaded is displayed.
  2. The selected database is shown in the Current Database area.
  3. Configuration, Edit Category Tree, and Edit Keywords become active.
2.3 Editing the Settings
To change the keyword edit settings, click Configuration under Menu.

Configuration screen:
  1. Click Configuration under Menu to open the Configuration screen.
  2. Use the select box to specify the number of keywords to be displayed in a page.
  3. Click the Save button to save the change.
Note: The value set here is the maximum number of lines shown in the candidate word list and the registered keyword list of the Edit keyword screen (see 4.4 Candidate Words Display Mode).
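
To illustrate the effect of this setting, the following sketch splits a list into pages with at most the configured number of lines. It is only a model of the behavior described in the note. The function name and sample data are hypothetical, and how Dictionary Editor actually pages its lists is not specified here.

```python
from math import ceil
from typing import List, Sequence


def page_of(items: Sequence[str], page: int, per_page: int) -> List[str]:
    """Return the 1-based `page` of `items`, with at most `per_page` lines."""
    start = (page - 1) * per_page
    return list(items[start:start + per_page])


keywords = [f"keyword{i}" for i in range(1, 8)]  # seven sample keywords
per_page = 3                                     # value chosen in the Configuration screen
print(ceil(len(keywords) / per_page))            # 3 (pages needed)
print(page_of(keywords, 3, per_page))            # ['keyword7']
```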
2.4 After Editing the Settings
The following screen is shown after you edit the settings.

After editing the settings:


A message to confirm that the new settings have been saved is displayed.

3 Editing the Category Tree
3.1 How to Start
To edit the category tree, click Edit Category Tree from the Menu.

How to start editing the category tree:
  1. Click Edit Category Tree under Menu.
  2. Currently registered categories are displayed.
3.2 Warning when Editing the Category Tree
When multiple users are editing the dictionary, you must ensure that no other users are using Dictionary Editor when you edit the category tree. The following warning message appears if you try to edit the category tree while another user is editing the category tree or keywords.

Category tree edit interrupt warning:


Click the OK button to interrupt the other user's edit operation. Unsaved data that the other user is editing will be discarded. The same warning message is also displayed if you, or the user who edited the category tree or keywords immediately before you, closed the window without properly completing the edit operation. In that case, clicking OK to start editing does not affect that user, who has already finished editing.

3.3 Adding a Category
Adding a category:
  1. To add a category, click Add Subcategory at the next hierarchy level.
  2. Type a category name in the dialog box.
  3. Click OK.
Notes:
  • To create a category without a parent (root category), click Add Root Category at the top hierarchy level.
  • The category will not be added if you click Cancel in the dialog box.
3.4 Renaming a Category
Renaming a category:
  1. Click the Rename link to the right of the category name that you want to change.
  2. Type a category name in the dialog box.
  3. Click OK.
Note: The category name will not be changed if you click Cancel in the dialog box.
3.5 Deleting a Category
Deleting a category:
  1. Click the Delete link to the right of the category name that you want to delete.
  2. Click OK in the confirmation dialog box.
Notes:
  • The category will not be deleted if you click Cancel in the dialog box.
  • When a category is deleted, the registration of keywords in that category is also deleted (the keywords themselves are not deleted). The registration information is not restored even if the same category is re-created, so be careful when you delete a category.
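
The last note can be pictured with a minimal sketch. The data model below (a mapping from each keyword to the categories it is registered in) is an assumption made for illustration, and the keyword names are sample data.

```python
from typing import Dict, Set

# Hypothetical data model: keyword -> names of the categories it is registered in.
registrations: Dict[str, Set[str]] = {
    "printer": {"Hardware"},
    "editor": {"Software"},
}


def delete_category(category: str, regs: Dict[str, Set[str]]) -> None:
    """Remove `category` from every keyword's registrations; the keywords themselves stay."""
    for categories in regs.values():
        categories.discard(category)


delete_category("Software", registrations)
print(sorted(registrations))    # ['editor', 'printer']: both keywords still exist
print(registrations["editor"])  # set(): the Software registration is gone
```

Re-creating a category named Software afterward would not put "editor" back into it, because the removed registration is not kept anywhere in this model.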
3.6 Saving and Exiting Edit Mode
After editing the category tree, you must run the termination processing, whether or not you added or deleted categories and whether or not the changes need to be saved. If you close the screen without ending the edit operation, the category tree file (see 1.2 Dictionary Resource Files) remains locked, and other users must interrupt your edit when they want to edit the category tree or keywords.

Category save/quit menu:


(1) Save the current changes and continue the operation. The file stays locked; therefore, termination processing (2) or (3) is necessary.

(2) Save the changes and exit the category tree edit mode. The file will be unlocked.

(3) Exit the category tree edit mode without saving changes. The file will be unlocked.

The following screen is shown after you save and exit the edit mode:


(1) The termination message appears.

(2) The file is unlocked, and Edit Category Tree and Edit Keywords become active again.

3.7 Automatically Generated Dependency Categories
When the category tree is edited, categories for dependency keywords are automatically created. These categories cannot be seen while using Dictionary Editor, but they can be used with Text Miner.

Dictionary Editor category tree
Product
     Hardware
     Software

    ↓

Text Miner category tree
Product
     Hardware
         Dependency
             Hardware .. bad reputation
             Hardware .. verb
             Hardware .. problem
             Hardware .. good reputation
             Hardware .. senses
             Hardware .. requests
             Hardware .. questions
     Software
         Dependency
             Software .. bad reputation
             Software .. verb
             Software .. problem
             Software .. good reputation
             Software .. senses
             Software .. requests
             Software .. questions
     Dependency
         Product .. bad reputation
         Product .. verb
         Product .. problem
         Product .. good reputation
         Product .. senses
         Product .. requests
         Product .. questions

In this example, a "Dependency" category is added immediately below each of the "Product," "Hardware," and "Software" categories, and below it, categories that show phrases using various types of declinable words are added. Dependency expressions belonging to these dependency categories are phrases consisting of keywords registered in individual categories and indeclinable words, in the same manner as the basic dependency categories described in 3 System-defined Categories. Note, however, that the "Dependency" category immediately below the "Product" category is only for phrases consisting of keywords that belong to the "Product" category and various indeclinable words; dependency involving the "Hardware" and "Software" categories is not included.
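
The expansion shown in the example can be reproduced with a short sketch. The Python below is a model of the result only, not of how the product generates the categories. The function names are hypothetical, and the list of dependency types is copied from the example tree.

```python
from typing import Dict, List

# Dependency subcategory labels, in the order shown in the example tree above.
DEPENDENCY_TYPES = ["bad reputation", "verb", "problem", "good reputation",
                    "senses", "requests", "questions"]

Tree = Dict[str, dict]  # category name -> subtree


def with_dependencies(tree: Tree) -> Tree:
    """Return a copy of `tree` with a Dependency node added under every category."""
    result: Tree = {}
    for name, children in tree.items():
        node = with_dependencies(children)
        # Each Dependency node is built from the category's own name only.
        node["Dependency"] = {f"{name} .. {kind}": {} for kind in DEPENDENCY_TYPES}
        result[name] = node
    return result


def render(tree: Tree, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, four spaces per level."""
    lines: List[str] = []
    for name, children in tree.items():
        lines.append("    " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


editor_tree: Tree = {"Product": {"Hardware": {}, "Software": {}}}
print("\n".join(render(with_dependencies(editor_tree))))
```

Running the sketch prints a tree with the same shape as the Text Miner category tree above: one Dependency node under Hardware, one under Software, and one directly under Product.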
4 Editing Keywords
4.1 How to Start
To edit keywords, click Edit Keywords from the Menu.

How to start editing keywords:


Click Edit Keywords to open the keyword file selection dialog.

  1. Select the check box if you want to use candidate word files (see 1.2 Dictionary Resource Files).
  2. Specify the keyword file to be edited (see 1.2 Dictionary Resource Files).
  3. Click OK. The dialog closes and the candidate words display mode of the edit keyword screen starts.
Hint: To create a new keyword file, select New File, type a file name without an extension in the text field, and then click OK.
4.2 Warning when Editing Keywords
If another user is editing the category tree, a pop-up warning message is displayed.

Keyword edit interrupt warning:


Click the OK button to interrupt the other user's edit operation. Unsaved category tree data that the other user is editing will be discarded. The same warning message is also displayed if you, or the user who edited the category tree immediately before you, closed the window without properly completing the edit operation. In that case, clicking OK to start editing does not affect that user, who has already finished editing.

4.3 Keyword Files Currently Being Edited
When selecting a keyword file in the keyword file selection dialog box, if another user is editing the keyword file, the message "Used by another user" appears on the right side of the file currently being edited.

Keyword file selection dialog while the keyword file is being edited:


To edit a keyword file that another user is editing, select the file and click OK. A dialog asks you to confirm that you want to interrupt the edit; click OK again. Unsaved edit data created by the other user will be discarded.

4.4 Candidate Words Display Mode
The structure of the Edit keyword screen in the candidate words display mode is as follows.

Edit keyword screen in the candidate words display mode:


(1) Search/Sort menu: use this area to narrow down or sort candidate words or keywords that are displayed in (3) and (4). Operations in this area will be reflected in both lists at the same time.

(2) Display mode select box: use this select box to switch the display modes between the candidate words display mode and category tree display mode.

(3) Candidate word list: a list of candidate words is displayed when two or more candidate word files are selected in the keyword file selection dialog (see 1.2 Dictionary Resource Files).

(4) Registered keyword list: this is a list of keywords that are already registered in the keyword file (see 1.2 Dictionary Resource Files). You can register keywords while comparing the candidate word list with the registered keyword list.

(5) Add and Delete: use these arrows to add or delete keywords. Use the right arrow to add keywords and use the left arrow to delete keywords.

(6) Save buttons: use these buttons to save keyword edit information and exit the edit mode.

4.5 Search and Sort
The candidate word list and the registered keyword list can be further narrowed or sorted by using the word type filter (search), string match (search), and sort functions.

Search/Sort menu:


  • Word type filter: Narrow the set of candidate words and keywords by type.

    • Cancel: Disable the word type filter.

    • Hiragana: List candidate words and keywords consisting only of Japanese hiragana.

    • Katakana: List candidate words and keywords consisting only of Japanese katakana.

    • Alphanumeric characters: List candidate words and keywords consisting only of alphanumeric characters.

  • String match: Type a character string in the text field and click the Search button to search for candidate words and keywords that use the specified character string.

    • Cancel: Cancel search.

    • Partial: Search for candidate words and keywords that contain the specified character string.

    • Exact: Search for candidate words and keywords that are exactly the same as the input character string. Use this option to see if a particular keyword is already registered.

    • Prefix: Search for candidate words and keywords that start with the specified character string.

    • Suffix: Search for candidate words and keywords that end with the specified character string.

  • Sort: Specify the order in which candidate words and keywords are displayed.

    • Frequency: If the frequency with which each candidate word appears in documents was included as data when the candidate word file was created, the candidate words are listed in order of frequency. Character strings that appear frequently are likely to be effective as keywords.

    • Confidence score: If a confidence score (indicating how likely each candidate word is to be useful as a keyword) was included as data when the candidate word file was created, the candidate words are listed in descending order of score.

    • Alphabetical: Candidate words and keywords are listed in the order of their Unicode character codes. In this order, characters are arranged as follows: Japanese hiragana, Japanese katakana, Chinese characters, numbers, uppercase letters, and lowercase letters. Hiragana and katakana are listed in the order of the Japanese syllabary; however, small kana (such as the small "a") come before the corresponding regular character, and characters with the voiced sound mark come after it. Chinese characters are sorted by radical, basically in the same manner as in a kanji (Chinese character) dictionary.

    • Modification time: Candidate words and keywords are listed in order of modification date, where a modification is an addition to or deletion from the keywords, an edit of categories, or an edit of synonyms. The most recently modified keywords are shown at the top of the list for easier operation.
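
The search and sort options above map naturally onto simple string operations. The following Python sketch shows one possible reading of them. The function names and sample words are hypothetical, and the character ranges used for the word type filter are an assumption, not the ranges that Dictionary Editor is known to use.

```python
import re
from typing import Callable, Dict, List

# String match modes from the Search/Sort menu, written as plain string predicates.
MATCH: Dict[str, Callable[[str, str], bool]] = {
    "partial": lambda word, text: text in word,
    "exact": lambda word, text: word == text,
    "prefix": lambda word, text: word.startswith(text),
    "suffix": lambda word, text: word.endswith(text),
}

# Word type filters. The exact character ranges are an assumption for this sketch.
WORD_TYPE = {
    "hiragana": re.compile(r"[\u3041-\u309F]+"),
    "katakana": re.compile(r"[\u30A0-\u30FF]+"),
    "alphanumeric": re.compile(r"[0-9A-Za-z]+"),
}


def search(words: List[str], mode: str, text: str) -> List[str]:
    """Keep the words that match `text` under the given string match mode."""
    return [w for w in words if MATCH[mode](w, text)]


def filter_type(words: List[str], word_type: str) -> List[str]:
    """Keep the words that consist only of characters of the given type."""
    return [w for w in words if WORD_TYPE[word_type].fullmatch(w)]


words = ["server", "web server", "サーバー", "さくら", "ぁ", "あ", "が", "か"]
print(search(words, "suffix", "server"))  # ['server', 'web server']
print(filter_type(words, "katakana"))     # ['サーバー']
print(sorted(["が", "あ", "か", "ぁ"]))     # ['ぁ', 'あ', 'か', 'が']
```

Python's sorted compares strings by Unicode code point, so the last line reproduces the kana ordering described under Alphabetical: the small kana comes before the regular character, and the voiced character comes after it. In plain code point order, half-width ASCII digits and letters sort before hiragana, whereas their full-width forms sort after the Chinese characters, in the order numbers, uppercase, lowercase.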

4.6 Adding Keywords from the List of Candidate Words
To use particular candidate words as keywords, follow the procedures below.

Adding candidate words as keywords:
  1. Select the check box of the candidate word to be added. You can select multiple candidate words at the same time.
  2. Click the arrow button to add the candidate words.
The newly added keywords are shown and highlighted at the top of the Registered keyword list.

After adding candidate words as keywords:


Hint: Clicking the Select All button above the Candidate word list selects all of the check boxes. This is a useful function when you want to add many keywords at the same time. After you click the Select All button, this button changes into the Cancel All button, and clicking this button will clear all of the selected check boxes.

4.7 Adding New Keywords by Entering Character Strings
To add keywords by directly entering character strings instead of selecting them from the candidate word list, follow the procedure below.

Adding a new keyword:
  1. Click the New Keyword button of the Registered keyword list.
  2. When the dialog box appears, type the keyword.
  3. Click OK.
Notes:
  • The newly added keyword is shown and highlighted at the top of the Registered keyword list.
  • If the keyword that you typed in the dialog box has already been registered, the keyword will not be added but the already registered keyword will be displayed and highlighted at the top of the Registered keyword list.
  • If you type a synonym for an already registered keyword in the dialog box, it is added as a new keyword.
  • The keyword that you type will not be added if you click the Cancel button in the dialog box.
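
The rules in the notes above can be sketched as follows. This is a simplified model: the function name is hypothetical, and the registered keyword list is represented as a plain Python list in display order (top first).

```python
from typing import List


def add_keyword(registered: List[str], new_word: str) -> str:
    """Apply the rules in the notes above and report what happened."""
    if new_word in registered:
        # Already registered: nothing is added; the existing entry moves to the top.
        registered.remove(new_word)
        registered.insert(0, new_word)
        return "already registered"
    # A word that is a synonym of another keyword is still added as a new keyword,
    # so no synonym check is needed here.
    registered.insert(0, new_word)
    return "added"


keywords = ["server", "printer"]
print(add_keyword(keywords, "printer"))  # already registered
print(add_keyword(keywords, "laptop"))   # added
print(keywords)                          # ['laptop', 'printer', 'server']
```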
4.8 Deleting Keywords
To delete keywords, follow the procedure below.

Deleting keywords:
  1. Select the check box of the keyword in the Registered keyword list that you want to delete. You can select multiple keywords simultaneously.
  2. Click the arrow button for deleting keywords.
Notes:
  • If the deleted keyword is not a synonym of a different keyword, the deleted keyword is shown and highlighted at the top of the Candidate word list.
  • If the deleted keyword exists as a synonym of a different keyword, the deleted keyword will not be shown on the Candidate word list.
  • Clicking the Select All button above the Registered keyword list selects all of the check boxes. This is a useful function when you want to delete many keywords at the same time. After you click the Select All button, this button changes into the Cancel All button, and clicking this button will clear all of the selected check boxes.
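
The first two notes describe where a deleted keyword goes. A minimal sketch of that rule follows. The function name and the data model (lists in display order, and a mapping from each keyword to its synonyms) are assumptions made for illustration.

```python
from typing import Dict, List


def delete_keyword(word: str, registered: List[str],
                   synonyms: Dict[str, List[str]], candidates: List[str]) -> None:
    """Remove `word` from the registered list, following the notes above.

    `synonyms` maps each keyword to its synonyms. Lists are in display order, top first.
    """
    registered.remove(word)
    used_elsewhere = any(word in syns for kw, syns in synonyms.items() if kw != word)
    if not used_elsewhere:
        # Not a synonym of any other keyword: show it at the top of the candidate list.
        candidates.insert(0, word)


registered = ["PC", "laptop", "printer"]
synonyms = {"laptop": ["notebook", "PC"]}  # "PC" is also a synonym of "laptop"
candidates = ["scanner"]
delete_keyword("printer", registered, synonyms, candidates)
delete_keyword("PC", registered, synonyms, candidates)
print(registered)  # ['laptop']
print(candidates)  # ['printer', 'scanner']
```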
4.9 Editing Synonyms
To edit synonyms of a particular keyword, click the Edit button to the right of the keyword that you want to edit in the Registered keyword list.

Edit synonyms:


Click the Edit button to open the Edit synonym screen.

Edit synonym screen:


(1) Use these radio buttons to select a synonym to be regarded as a keyword (standard form). If the identical character string has been registered as a synonym of a different keyword, that character string (candidate for a synonym) will appear on a different screen, and the radio buttons operate accordingly. The system operates this way in order to let users know that a word that is registered as a synonym of a different keyword can be separately registered as a keyword.

(2) Use these check boxes to select which words are to be used as synonyms. The check box will be automatically checked for the one with the radio button checked in the Keyword column.

(3) This area shows choices (candidates) for the keyword and synonyms.

(4) This area shows types of synonym candidates. The meaning of each type is as follows:

Type                      Meaning
Current keyword           A keyword for which the Edit button is clicked in the Edit keyword screen.
Current synonym           A synonym that is currently registered as a synonym of the current keyword.
Unused synonym candidate  Among the synonym candidates for the current keyword and the current synonym, a word that is currently registered as a separate keyword or a keyword to which a word registered as a separate synonym belongs.
Aforementioned synonym    A word that is registered in the candidate word file as a synonym candidate, or a newly added synonym which is currently registered as a synonym of a different keyword.
Unregistered synonym      A word that is registered in the candidate word file as a synonym candidate, or a newly added synonym which is not yet registered as a keyword or synonym.

(5) Click OK to apply the synonym settings to the Edit keyword screen. The keyword file is not yet saved when you click OK. To save the keyword file, you must save it in the Edit keyword screen.
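
Selecting a synonym as the keyword with the radio buttons in (1) amounts to swapping the two roles. The sketch below models that swap. The function name and sample words are hypothetical, and it assumes that the former keyword stays checked as a synonym.

```python
from typing import List, Tuple


def promote_synonym(keyword: str, synonyms: List[str], chosen: str) -> Tuple[str, List[str]]:
    """Make `chosen` the keyword (standard form) and demote the old keyword to a synonym."""
    if chosen == keyword:
        return keyword, list(synonyms)
    if chosen not in synonyms:
        raise ValueError(f"{chosen!r} is not a synonym of {keyword!r}")
    # The old keyword takes the place of the chosen synonym in the synonym list.
    new_synonyms = [keyword if s == chosen else s for s in synonyms]
    return chosen, new_synonyms


print(promote_synonym("personal computer", ["PC", "computer"], "PC"))
# ('PC', ['personal computer', 'computer'])
```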

Edit keyword screen after editing synonyms:


In the Edit keyword screen, synonyms are listed to the right of the equal sign.
4.10 Registering New Synonyms
To register new synonyms by entering character strings in the edit synonym screen, follow the procedures below.

Adding a new synonym:
  1. Click the New Synonym button to open the dialog box for entering a synonym.
  2. Type a synonym in the dialog box.
  3. Click OK.
After adding a new synonym:


The specified character string is added as a synonym candidate with the Synonym check box checked. Click the OK button at the bottom of the screen to add it as a new synonym. The keyword file is not saved at this point; therefore, it is necessary to save it in the Edit keyword screen.
4.11 Category Tree Display Mode
The structure of the Edit keyword screen in the category tree display mode is as follows.

Edit keyword screen in the category tree display mode:


The difference between this mode and 4.4 Candidate Words Display Mode is that in this mode, the category tree is displayed instead of the candidate word list.
4.12 Category Search
In the category tree display mode of the Edit keyword screen, you can specify a category to search registered keywords.

Category search:


(1) When the category name is clicked, the message "Selected" appears for that category.

(2) Keywords listed in the Registered keyword list are narrowed to the keywords registered in the specified category. This function can be used with other search or sorting functions.

(3) Click Reset Category Search to cancel the search and restore the original list.
4.13 Registering Keywords in a Category
To register a keyword in a category, follow the procedures below.

Registering a keyword in a category:
  1. In the Registered keyword list, select the check box of a keyword that you want to register in a particular category. You can select multiple keywords simultaneously.
  2. Click the Add button to the right of the category name to register the selected keyword in that category.
After the keyword is registered, the category name appears in the Category area in the Registered keyword table.

After registering a keyword in a category:


Click the Remove button on the lower right side of the category name to cancel the category registration.
4.14 Saving and Exiting Edit Mode
After editing the keywords, you must run the termination processing, whether or not you added or deleted keywords and whether or not the changes need to be saved. If you close the screen without ending the edit operation, the keyword file (see 1.2 Dictionary Resource Files) remains locked, and other users must interrupt your edit when they want to edit the category tree or keywords.

Keyword file save/quit menu:


(1) Save the current changes and continue the operation. The file stays locked; therefore, the termination processing (2) or (3) is necessary.

(2) Save the changes and exit the keyword edit mode. The file will be unlocked.

(3) Exit the keyword edit mode without saving changes. The file will be unlocked.

Screen shown after saving and exiting the edit mode:


(1) The termination message appears.

(2) The file is unlocked, and Edit Category Tree and Edit Keywords become active again.

Terms of Use

Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A. 
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan 
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
Silicon Valley Lab
Building 090/H-410
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

Copyright License
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

Trademarks
This topic lists IBM trademarks and certain non-IBM trademarks.

See http://www.ibm.com/legal/copytrade.shtml for information about IBM trademarks.

The following terms are trademarks or registered trademarks of other companies:

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel Inside (logos), MMX and Pentium are trademarks of Intel Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product or service names might be trademarks or service marks of others.