
Regular Expression Parser

Overview

The Regular Expression Parser validates and parses a Connector's input/output against a regular expression. It uses the free Regular Expressions for Java library "gnu.regexp", available at http://www.cacas.org/java/gnu/regexp/. Please consult the "gnu.regexp" documentation for the supported regular expression notation and for the library's specification.

The Regular Expression Parser is designed as a useful example that shows how to implement your own Parser in Java and integrate it in the Metamerge Integrator.

Functional Specification

Configuration

The Parser provides the following parameters:

Parameter: Description

class: com.architech.parser.rspRegExpParser

regularExpression: Specifies the regular expression the Parser will use. Subexpressions are enclosed in parentheses (for example: "ab(c*)d(e*)f"). When the Parser is used in read mode, those subexpressions correspond to the Entry's Attributes (in the example above, "c*" corresponds to the Entry's first Attribute and "e*" to its second).

attributeNames: Specifies the names of the Attributes, delimited with semicolons (for example: "Name;Value"). The interpretation of this parameter depends on the Parser mode:
o read mode: The names are used for the Attributes corresponding to the subexpressions of the regular expression. Mapping is done in order of appearance, i.e. the first subexpression corresponds to an Attribute named with the first name from the "attributeNames" parameter, and so on.
o write mode: The names are used to define the output text, which is formed by concatenating the values of the Attributes enumerated in the "attributeNames" parameter.

Input

A single line from the input will correspond to a single Entry.
o If the line doesn't match the regularExpression then an Entry with no Attributes is returned.
o If the line matches the regularExpression then an Entry is populated with Attributes and returned. The number of Attributes assigned is equal to the number of subexpressions in the regularExpression and each Attribute's value is the substring of the input line that matches the corresponding subexpression.
If the attributeNames parameter contains fewer names than there are subexpressions in the regularExpression parameter, Attribute names are added, as many as needed to make the two numbers equal. Each added name consists of the prefix "ATTR_NAME_" followed by the number of the Attribute name added (starting from 0), i.e. ATTR_NAME_0, ATTR_NAME_1, ATTR_NAME_2, etc.
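
The read-mode behaviour described above can be sketched in a few lines of Java. This is only an illustration, not the Parser's actual source: it uses the standard java.util.regex package rather than "gnu.regexp" so that it runs without extra jars, the class and method names (ReadModeSketch, parseLine) are hypothetical, and a plain map stands in for an Entry. It also assumes that the whole line must match the expression and that generated names are numbered by how many names have been added so far.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadModeSketch {

    // Turns one input line into a name -> value map (a stand-in for an Entry).
    // An empty map plays the role of "an Entry with no Attributes".
    static Map<String, String> parseLine(String regularExpression,
                                         String attributeNames,
                                         String line) {
        Map<String, String> entry = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regularExpression).matcher(line);
        if (!m.matches()) {          // assumption: the whole line must match
            return entry;
        }

        // Split the semicolon-delimited names, ignoring empty items.
        List<String> names = new ArrayList<>();
        for (String name : attributeNames.split(";")) {
            if (!name.isEmpty()) {
                names.add(name);
            }
        }

        // Pad with generated names until there is one per subexpression.
        // Assumption: numbering counts the added names, starting from 0.
        int added = 0;
        while (names.size() < m.groupCount()) {
            names.add("ATTR_NAME_" + added++);
        }

        // Subexpression i (1-based in java.util.regex) maps to the i-th name.
        for (int i = 1; i <= m.groupCount(); i++) {
            entry.put(names.get(i - 1), m.group(i));
        }
        return entry;
    }

    public static void main(String[] args) {
        // Two subexpressions, one configured name: the second name is generated.
        System.out.println(parseLine("ab(c*)d(e*)f", "Name", "abccdeeef"));
        // A line that does not match yields an empty map.
        System.out.println(parseLine("ab(c*)d(e*)f", "Name", "xyz"));
    }
}
```

With the example expression from the Configuration section and only one configured name, the first subexpression is stored under "Name" and the second under a generated name.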

Output

All Attributes enumerated in the attributeNames parameter that exist in the Entry are concatenated to form a single string (in the order they appear in the attributeNames parameter).
o If this string matches the regularExpression, it is printed on a single line in the output.
o If this string does not match the regularExpression, nothing is printed in the output and the "no-match event" is logged.
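
The write-mode rule can be sketched in the same way. Again this is a hedged illustration rather than the Parser's real code: java.util.regex replaces "gnu.regexp", the names WriteModeSketch and formatEntry are hypothetical, a map stands in for the Entry, the whole concatenated string is assumed to have to match, and a message on standard error stands in for logging the "no-match event".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

public class WriteModeSketch {

    // Builds the output line for one Entry, or returns an empty Optional
    // when the concatenated text does not match the regular expression.
    static Optional<String> formatEntry(String regularExpression,
                                        String attributeNames,
                                        Map<String, String> entry) {
        StringBuilder sb = new StringBuilder();
        // Concatenate values in the order given by attributeNames;
        // names that are not present in the Entry are skipped.
        for (String name : attributeNames.split(";")) {
            String value = entry.get(name);
            if (value != null) {
                sb.append(value);
            }
        }
        String line = sb.toString();

        // Assumption: the whole concatenated string must match.
        if (Pattern.matches(regularExpression, line)) {
            return Optional.of(line);
        }
        // Stand-in for logging the "no-match event".
        System.err.println("no-match event: " + line);
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("Name", "port=");
        entry.put("Value", "8080");
        // The values are joined without a separator, giving "port=8080".
        System.out.println(formatEntry("[a-z]+=[0-9]+", "Name;Value", entry));

        entry.put("Value", "http");
        // "port=http" does not match, so nothing would be written.
        System.out.println(formatEntry("[a-z]+=[0-9]+", "Name;Value", entry));
    }
}
```

Because the values are joined without any separator, any delimiter the regular expression expects has to be part of an Attribute value, as with "port=" in this sketch.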

 

Source Code

You can view the source code of the Regular Expression Parser here.

The Regular Expression Parser source file (with JavaDocs) is included here.

 

Installation

1. From the Regular Expressions for Java website, download the package gnu.regexp-1.1.3a.tar.gz. (If this link has changed, go to the library's page at http://www.cacas.org/java/gnu/regexp/ and download the library's latest version.) Extract the archive's contents, keeping path information. Copy the file "gnu-regexp-1.1.3.jar" (located in the "lib" folder) to the "jars" subfolder of the MI root directory.
2. Download the Regular Expression Parser jar archive regExpParser.jar. Add the file "regExpParser.jar" to the "jars" subfolder of the MI root directory.
3. In "miadmin.lax" (located in the MI root folder), edit the "lax.class.path" property so that it contains the parser jar path and the regular expression library path (i.e. add ";jars/regExpParser.jar;jars/gnu-regexp-1.1.3.jar" at the end of the property value).
4. In "miserver.lax" (located in the MI root folder), edit the "lax.class.path" property in the same way (i.e. add ";jars/regExpParser.jar;jars/gnu-regexp-1.1.3.jar" at the end of the property value).
5. Finally, open MI Admin and add the Regular Expression Parser as a Base Template in each MI configuration in which you intend to use the Parser. To do this: (1) press the "Add" button in the "Parser Templates" tab of the "Base Templates" section in MI Admin; (2) in the "Define Parser" dialog box, enter the name you want to give the Parser in the "Name" edit-box and enter "com.architech.parser.rspRegExpParser" in the "Java Class" edit-box; (3) open the newly added Parser Template and add the following parameters: "regularExpression" and "attributeNames".
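
For steps 3 and 4, the edited property might end up looking like the fragment below. The leading "..." stands for whatever value the property already has in your installation; it is a placeholder, not text to type. The same suffix is appended in both "miadmin.lax" and "miserver.lax".

```
lax.class.path=...;jars/regExpParser.jar;jars/gnu-regexp-1.1.3.jar
```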

 

Downloads

An example configuration that demonstrates the Regular Expression Parser is included here.

Metamerge Integrator version 4.5 ©Copyright Metamerge AS 2000-2002 Last edited 2002-04-30