This topic contains sections marked as revised for this release

WebSphere Message Brokers
File: ac26040_
Writer: Bill Oppenheimer

Task topic

This build: July 31, 2007 21:19:49

Manipulating messages using the XMLNSC parser

The XMLNSC domain is an extension of the XMLNS domain, which is an extension of the XML domain.

The XMLNS domain adds namespace support. The new XMLNSC domain builds a more compact tree, so it uses less memory when handling large messages. Your existing applications can continue to use the XMLNS domain, but you can take advantage of performance and memory gains with the XMLNSC domain for new applications.

Message tree structure

The XMLNSC parser obtains its more compact tree by using a single name-value element to represent tagged text, rather than the separate name and value elements that are used by the XML and XMLNS parsers. Consider the following message:
    <Folder1>
        <Folder2 Attribute1='AttributeValue1'>
            <Field1>Value1</Field1>
            <Field2 Attribute2='AttributeValue2'>Value2</Field2>  
        </Folder2> 
    </Folder1>

In the XMLNSC domain, tagged text is represented by two name elements (Folder1 and Folder2) and four name-value elements, which are Attribute1, Field1, Field2, and Attribute2.

The XML and XMLNS domains differ in that the two fields (Field1 and Field2) are each represented by a name element with a child value element. This difference might seem small, but messages often have many such leaf fields. For example:
    <Folder1>
        <Folder2>
            <Field1>Value1</Field1>
            <Field2>Value2</Field2>
            ....
            <Field100>Value100</Field100>
        </Folder2> 
    </Folder1>

In this example, the XMLNSC parser represents the message by using two name elements and 100 name-value elements, whereas the XML and XMLNS parsers use 102 name elements and 100 value elements, plus a further 103 value elements to represent the white space implicit in formatted messages.
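
The counts quoted above can be checked with a small calculation. The following Java sketch is only an illustration of that arithmetic; the class and method names are hypothetical and are not part of any broker API. It assumes the same shape as the example: two enclosing folders and a formatted message with white space between every pair of adjacent tags.

```java
final class ElementCountSketch {

    // Compact (XMLNSC) tree: two folder name elements plus one
    // name-value element for each leaf field.
    static int compactElements(int leafFields) {
        return 2 + leafFields;
    }

    // XML and XMLNS trees: a name element for each folder and field,
    // a child value element for each field, and one white space value
    // element for every gap between adjacent tags.
    static int verboseElements(int leafFields) {
        int nameElements = 2 + leafFields;
        int valueElements = leafFields;
        int whiteSpaceElements = leafFields + 3;
        return nameElements + valueElements + whiteSpaceElements;
    }

    public static void main(String[] args) {
        System.out.println(compactElements(100)); // prints 102
        System.out.println(verboseElements(100)); // prints 305
    }
}
```

For 100 leaf fields, the compact tree holds roughly a third of the elements.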

The XML and XMLNS domains create name-value elements for the white space formatting characters between the close and open of each folder or field. These white space elements have an empty name and a value for the space, tab, line feed, or other characters, that are used in the formatting of the XML document. These elements have no useful value and can therefore be discarded to improve the compaction. For the same reason, the default behavior is to discard any XML processing instructions and comments in the input stream, and to create no elements in the compact domain tree.

Attributes and tagged text

Attributes and tagged text are both represented by name-value elements, so they are distinguished by the use of the element types. If you do not specify a type, tagged text is assumed. Therefore, the first example message above might be produced by the following ESQL statements:
    SET Origin.Folder1.Folder2.(XMLNSC.Attribute)Attribute1 = 
       'AttributeValue1';
    SET Origin.Folder1.Folder2.Field1 = 'Value1';
    SET Origin.Folder1.Folder2.Field2 = 'Value2';
    SET Origin.Folder1.Folder2.Field2.(XMLNSC.Attribute)Attribute2 = 
       'AttributeValue2';

Although the preceding ESQL looks almost identical to that used with the XML parser, the type constants belong to the XMLNSC parser. The use of constants that belong to other parsers (for example, XML) leads to unexpected results because similarly named constants (for example, XML.Attribute) have different values.

The following constants are defined in the XMLNSC domain for creating attributes:
XMLNSC.Attribute
XMLNSC.SingleAttribute
XMLNSC.DoubleAttribute
Consider the following XML input message:
    <Folder1 Item='ValueA'>
        <Item>Value1</Item>
    </Folder1>
To remove the ambiguity of the name Item, which is used both as an attribute name and as a field name in Folder1, you can use the following ESQL:
    SET ItemAttributeValueHolder = InputRoot.XMLNSC.Folder1.(XMLNSC.Attribute)Item;
    SET ItemFieldValueHolder = InputRoot.XMLNSC.Folder1.(XMLNSC.Field)Item;

This method has an advantage over using an array index selection with Folder1 because it does not matter whether the attribute is present in the input stream.
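
The difference between selecting by type and selecting by index can be modelled outside the broker. The following Java sketch is a simplified stand-in for the message tree, not the broker's own API; every name in it (TypedLookupSketch, Child, byType, byIndex) is hypothetical. It assumes that an index such as Item[2] counts all children named Item regardless of type.

```java
import java.util.ArrayList;
import java.util.List;

final class TypedLookupSketch {

    enum Kind { ATTRIBUTE, FIELD }

    // One name-value child of Folder1.
    static final class Child {
        final Kind kind;
        final String name;
        final String value;

        Child(Kind kind, String name, String value) {
            this.kind = kind;
            this.name = name;
            this.value = value;
        }
    }

    // Builds the children of Folder1, with or without the Item attribute.
    static List<Child> folder1(boolean withAttribute) {
        List<Child> children = new ArrayList<>();
        if (withAttribute) {
            children.add(new Child(Kind.ATTRIBUTE, "Item", "ValueA"));
        }
        children.add(new Child(Kind.FIELD, "Item", "Value1"));
        return children;
    }

    // Selection by type and name, like (XMLNSC.Field)Item.
    static String byType(List<Child> children, Kind kind, String name) {
        for (Child c : children) {
            if (c.kind == kind && c.name.equals(name)) {
                return c.value;
            }
        }
        return null;
    }

    // Selection by name and 1-based index, like Item[n]; ignores the type.
    static String byIndex(List<Child> children, String name, int n) {
        int seen = 0;
        for (Child c : children) {
            if (c.name.equals(name) && ++seen == n) {
                return c.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(byType(folder1(false), Kind.FIELD, "Item")); // prints Value1
        System.out.println(byIndex(folder1(false), "Item", 2));         // prints null
    }
}
```

In this model, the typed lookup returns the field value whether or not the attribute is present, whereas the index-based lookup finds it only when the attribute occupies the first position.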

Handling mixed text

By default, mixed text is discarded because it is typically just formatting white space and has no business meaning. However, a mode is provided in which, when parsing, any text that occurs other than between an opening tag and a closing tag (that is, open->open, close->close, and close->open) is represented by a single unnamed Value element. The value element types support PCDATA, CDATA, and hybrid, which is a mixture of the preceding two.

No special syntax element behavior exists for the getting and setting of values. You can access value elements from ESQL only by addressing them explicitly. The following extra constants are provided for this purpose:
   XMLNSC.Value
   XMLNSC.PCDataValue
   XMLNSC.CDataValue
   XMLNSC.HybridValue
The mode is controlled by the Retain Mixed Content property on the Parser Options tab in the Properties view on all message parsing nodes; for example, the MQInput node. For programmatic control using message options, the following constants are provided:
   XMLNSC.MixedContentRetainNone = 0x0000000000000000
   XMLNSC.MixedContentRetainAll  = 0x0001000000000000
These constants can be used in the Option clauses of both the ESQL CREATE statement (PARSE section) and the ASBITSTREAM function. For example:
   DECLARE X BLOB ASBITSTREAM(InputRoot.XMLNSC.Data OPTIONS 
   XMLNSC.MixedContentRetainAll);
   ...
   CREATE LASTCHILD OF OutputRoot PARSE(X OPTIONS 
   XMLNSC.MixedContentRetainNone);
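
To make the open->open, close->close, and close->open rule concrete, the following Java sketch scans a small XML string and reports the text runs that would count as mixed text under that rule. It is a rough illustration only and does not reflect how the XMLNSC parser is implemented. The class and method names are hypothetical, and the sketch assumes input with no self-closing tags, comments, processing instructions, or CDATA sections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MixedTextSketch {

    // Returns every text run that is not directly enclosed by an
    // opening tag followed by a closing tag.
    static List<String> mixedText(String xml) {
        List<String> mixed = new ArrayList<>();
        // Group 1 matches a tag; group 2 matches a run of text between tags.
        Matcher m = Pattern.compile("(<[^>]+>)|([^<]+)").matcher(xml);
        String previousTag = null;  // tag that precedes the pending text
        String pendingText = null;  // text waiting for the tag that follows it
        while (m.find()) {
            if (m.group(1) != null) {
                String tag = m.group(1);
                if (pendingText != null && previousTag != null) {
                    boolean openThenClose =
                        !previousTag.startsWith("</") && tag.startsWith("</");
                    if (!openThenClose) {
                        mixed.add(pendingText);
                    }
                }
                previousTag = tag;
                pendingText = null;
            } else {
                pendingText = m.group(2);
            }
        }
        return mixed;
    }

    public static void main(String[] args) {
        // The line feed after <a> (open->open) and " tail" (close->close)
        // are mixed text; "v" is ordinary tagged text (open->close).
        System.out.println(mixedText("<a>\n<b>v</b> tail</a>").size()); // prints 2
    }
}
```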

Handling comments

By default, comments are discarded because they are auxiliary information with no business meaning. However, a mode is provided in which, when parsing, any comments that occur in the document (other than in the document description itself) are represented by a name-value element with the name Comment. The following extra constant is provided for this purpose:
   XMLNSC.Comment
The mode is controlled by setting the Retain Comments property on the Parser Options tab in the Properties view on all message parsing nodes; for example, the MQInput node. For programmatic control using message options, the following constants are provided:
   XMLNSC.CommentsRetainNone = 0x0000000000000000
   XMLNSC.CommentsRetainAll  = 0x0002000000000000
For example:
   DECLARE X BLOB ASBITSTREAM(InputRoot.XMLNSC.Data OPTIONS 
   XMLNSC.CommentsRetainAll);
   ...
   CREATE LASTCHILD OF OutputRoot PARSE(X OPTIONS XMLNSC.CommentsRetainNone);

Handling processing instructions

By default, processing instructions are discarded because they are auxiliary information with no business meaning. However, a mode is provided in which, when parsing, any processing instructions that occur in the document (other than in the document description itself) are represented by a name-value element with the appropriate name and value. The following extra constant is provided for this purpose:
    XMLNSC.ProcessingInstruction
The mode is controlled by the Retain Processing Instructions property on the Parser Options tab of the Properties view on all message parsing nodes; for example, the MQInput node. For programmatic control using message options, the following constants are provided:
    XMLNSC.ProcessingInstructionsRetainNone = 0x0000000000000000
    XMLNSC.ProcessingInstructionsRetainAll  = 0x0004000000000000
For example:
    DECLARE X BLOB ASBITSTREAM(InputRoot.XMLNSC.Data 
    OPTIONS XMLNSC.ProcessingInstructionsRetainAll);
    ...
    CREATE LASTCHILD OF OutputRoot PARSE(X OPTIONS 
    XMLNSC.ProcessingInstructionsRetainNone);

Migrating an existing flow to the XMLNSC domain

To use the XMLNSC domain and parser, re-code your ESQL to use XMLNSC in your paths. Consider the following ESQL statements:
SET OutputRoot.XML.Person.Salary    = 
               CAST(InputRoot.XML.Person.Salary AS INTEGER) * 3;
SET OutputRoot.XMLNS.Person.Salary  = 
               CAST(InputRoot.XMLNS.Person.Salary AS INTEGER) * 3;
SET OutputRoot.XMLNSC.Person.Salary = 
               CAST(InputRoot.XMLNSC.Person.Salary AS INTEGER) * 3;
In each case, the XML bit stream that is expected on the input queue and written to the output queue is of the following form:
    <Person><Salary>42</Salary></Person>

The three ESQL examples differ because they use different parsers to own these elements. The owning parser can be set either by the incoming message, with an MQRFH2 header with an <mcd> folder that specifies the message set domain, or by the message set domain that is defined in the Input Message Parsing properties of the flow input node. If both of these domain definitions are present, the value for the message set domain in the MQRFH2 header <mcd> folder takes precedence.

To migrate to the XMLNSC domain, when using MQRFH2 headers, add the new domain name to the <Msd> field of the <mcd> folder. The new domain name appears in the MQRFH2 header of the outgoing message. To protect external applications from these changes, specify the Use XMLNSC Compact Parser for XMLNS Domain property on the flow's input node, and on the Compute or Mapping node. With these properties set, the input and output messages are unchanged, allowing the <Msd> field value to remain as XMLNS. The flow now uses the compact parser and the ESQL paths are coded using XMLNSC.

If the incoming messages do not contain MQRFH2 headers and the input node's message domain property is used to specify the domain, you can migrate to the XMLNSC domain by setting the flow's input node domain property directly to XMLNSC. Alternatively, you can leave it as XMLNS and set the Use XMLNSC Compact Parser for XMLNS Domain property. The compact parser is used in the flow and you must code the ESQL paths using XMLNSC with either of these settings.

If outgoing messages do not contain MQRFH2 headers, the domain does not appear anywhere in the output messages and setting the Compute node's Use XMLNSC Compact Parser for XMLNS Domain property has no effect.
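
The rules in this section can be summarized as two small decisions: which domain owns the message, and which parser is used for that domain. The following Java sketch restates them under the assumptions described above. It is not broker code, and the names OwningParserSketch, owningDomain, and parserUsed are hypothetical.

```java
final class OwningParserSketch {

    // The domain in the MQRFH2 <mcd> folder, when present, takes
    // precedence over the domain set on the flow's input node.
    static String owningDomain(String mcdDomain, String nodeDomain) {
        boolean headerPresent = mcdDomain != null && !mcdDomain.isEmpty();
        return headerPresent ? mcdDomain : nodeDomain;
    }

    // With the Use XMLNSC Compact Parser for XMLNS Domain property set,
    // messages in the XMLNS domain are handled by the compact parser.
    static String parserUsed(String domain, boolean useCompactForXmlns) {
        if ("XMLNS".equals(domain) && useCompactForXmlns) {
            return "XMLNSC";
        }
        return domain;
    }

    public static void main(String[] args) {
        String domain = owningDomain("XMLNS", "XML");
        System.out.println(domain);                   // prints XMLNS
        System.out.println(parserUsed(domain, true)); // prints XMLNSC
    }
}
```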

Constructing XML headers

The following ESQL is valid in the XML domain:
SET OutputRoot.XML.(XML.XmlDecl)*.(XML.Version)* = '1.0';
To migrate to XMLNS, change the root:
SET OutputRoot.XMLNS.(XML.XmlDecl)*.(XML.Version)* = '1.0';

Although the XMLNS parser is used, the element type constants are those that belong to the XML parser. This code works because the type values that are used by the XML and XMLNS parsers are the same. For the XMLNSC parser, however, the type values are different and, therefore, you must always use its own type constants.

In the XMLNSC domain no special type exists for the XML version; it is treated as an attribute of the XML declaration. The equivalent syntax for the above example is:
SET OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.(XMLNSC.Attribute)Version = '1.0';
In a similar way, in the XMLNSC domain, the XML encoding type and XML standalone mode are also processed as attributes of the XML declaration and can be set using the following ESQL:
SET OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.(XMLNSC.Attribute)Encoding = 'UTF-8';
SET OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.(XMLNSC.Attribute)StandAlone = 'yes';
To provide the same output from a JavaCompute node, use the following example code:
//Create the XML domain root node
MbElement xmlRoot = root.createElementAsLastChild(MbXMLNSC.PARSER_NAME);

//Create the XML declaration parent node
MbElement xmlDecl = xmlRoot.createElementAsFirstChild(MbXMLNSC.XML_DECLARATION);

xmlDecl.setName("XmlDeclaration");

//Add the attributes in the order in which they appear in the declaration
MbElement version = xmlDecl.createElementAsFirstChild(MbXMLNSC.ATTRIBUTE, "Version", "1.0");
MbElement encoding = xmlDecl.createElementAsLastChild(MbXMLNSC.ATTRIBUTE, "Encoding", "utf-8");

//Create the message body

This code results in the following line appearing in the output message:
<?xml version="1.0" encoding="utf-8"?>

Copying message trees

When copying trees, the broker regards XML and XMLNSC as unlike parsers, which means that all attributes in the source tree get mapped to elements in the target tree. This situation arises only if you are using both parsers in the same flow - one for input and one for output; in this situation, use the compact parser for both flows.

If different parsers must be used for the input flow and output flow, you might need to explicitly specify the types of elements in the paths, or use the FIELDVALUE function to ensure a copy of scalar values rather than of sub-trees.

Follow the guidance that is provided for XML messages in Manipulating messages in the XML domain, in conjunction with the information in the topic Manipulating message body content.

Accessing syntax elements in the XMLNSC domain using correlation names

The following table provides the correlation names for each XML syntax element. When you work in the XMLNSC domain, use these names to refer to the elements in input messages, and to set elements, attributes, and values in output messages.
Syntax element               Correlation name                 Constant value
Folder                       XMLNSC.Folder                    0x01000000
Document type (1)            XMLNSC.DocumentType              0x01000300
XML declaration (2)          XMLNSC.XmlDeclaration            0x01000400

Field or Attr Value (3)      XMLNSC.Value                     0x02000000
  PCData value               XMLNSC.PCDataValue               0x02000000
  CData value                XMLNSC.CDataValue                0x02000001
  Hybrid value               XMLNSC.HybridValue               0x02000002

Entity Reference             XMLNSC.EntityReference           0x02000100

Field (3)                    XMLNSC.Field                     0x03000000
  PCData                     XMLNSC.PCDataField               0x03000000
  CData                      XMLNSC.CDataField                0x03000001
  Hybrid                     XMLNSC.HybridField               0x03000002

Attribute (3)                XMLNSC.Attribute                 0x03000100
  Single quote               XMLNSC.SingleAttribute           0x03000101
  Double quote               XMLNSC.DoubleAttribute           0x03000100

Namespace declaration        XMLNSC.NamespaceDecl             0x03000102
  Single quote               XMLNSC.SingleNamespaceDecl       0x03000103
  Double quote               XMLNSC.DoubleNamespaceDecl       0x03000102

Bitstream data               XMLNSC.BitStream                 0x03000200

Entity definition (1) (3)    XMLNSC.EntityDefinition          0x03000300
  Single quote               XMLNSC.SingleEntityDefinition    0x03000301
  Double quote               XMLNSC.DoubleEntityDefinition    0x03000300

Comment                      XMLNSC.Comment                   0x03000400

Processing instruction       XMLNSC.ProcessingInstruction     0x03000401
Notes:
  1. Document Type is used only for entity definitions; for example:
    SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)BodyDocument
                  .(XMLNSC.EntityDefinition)TestDef =
                  'Compact Tree Parser XML Test Module Version 1.0';
  2. The XML declaration is a special folder type that contains child elements for version, and so on; for example:
    -- Create the XML declaration
    SET OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Version = '1.0';
    SET OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Encoding = 'UTF-8';
    SET OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Standalone = 'yes';
  3. These correlation values represent multiple possible entity types and should not be used for specific checks in a FIELDTYPE(...) statement. For example, IF FIELDTYPE(...) = XMLNSC.Attribute THEN ... would never match anything because the element would be either an XMLNSC.SingleAttribute or an XMLNSC.DoubleAttribute.
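
A check for a specific kind of element therefore has to test each specific type. The following Java sketch shows that pattern for attributes, using the single-quote and double-quote constant values listed in the table in this topic. It models the comparison only; the class and method names are hypothetical and the code does not call any broker API.

```java
final class FieldTypeCheckSketch {

    // Specific attribute type values, copied from the correlation name table.
    static final int SINGLE_ATTRIBUTE = 0x03000101;
    static final int DOUBLE_ATTRIBUTE = 0x03000100;

    // An element is an attribute if its type matches either specific
    // attribute type.
    static boolean isAttribute(int fieldType) {
        return fieldType == SINGLE_ATTRIBUTE || fieldType == DOUBLE_ATTRIBUTE;
    }

    public static void main(String[] args) {
        System.out.println(isAttribute(0x03000101)); // prints true
        System.out.println(isAttribute(0x03000000)); // prints false (PCData field)
    }
}
```

In ESQL, the equivalent check would compare the result of FIELDTYPE(...) with XMLNSC.SingleAttribute and with XMLNSC.DoubleAttribute.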

Using EntityDefinition and EntityReference with the XMLNSC parser

Two examples are provided to demonstrate how to use EntityDefinition and EntityReference with the XMLNSC parser using ESQL. Both examples use the same input message:
<BookInfo dtn="BookInfo" edn="author" edv="A.N.Other">
<Identifier>ES39B103T6</Identifier>
</BookInfo>
The first example shows how to use EntityDefinition and EntityReference with the XMLNSC parser. The following output message is generated by the example:
<!DOCTYPE BookInfo [<!ENTITY author "A.N.Other">]>
<BookInfo><Identifier>ES39B103T6</Identifier><entref>&author;</entref></BookInfo>
In the following ESQL, XMLNSC.EntityDefinition is used to define the hard-coded entity author with a value of A.N.Other, which is derived from edv from the input message. XMLNSC.EntityReference is used to create a reference to the entity author in the XML message body.
SET OutputRoot.MQMD = InputRoot.MQMD;
DECLARE cursor REFERENCE TO InputRoot.XMLNSC.BookInfo;
SET OutputRoot.XMLNSC.BookInfo.Identifier = cursor.Identifier;
SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)* NAME = cursor.dtn;
SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)*.(XMLNSC.EntityDefinition)* NAME = 'author';
SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)*.(XMLNSC.EntityDefinition)author VALUE = cursor.edv;
SET OutputRoot.XMLNSC.BookInfo.entref.(XMLNSC.EntityReference)* = 'author';
The variable cursor is used to point to the following variables:
  • The document type name dtn
  • The entity definition value edv
  • The value for Identifier
The values for these variables are derived from the input message.

This second example demonstrates how to create an output message that contains an entity definition and a reference to that entity, based on the content of the same input message.

The following output message is generated by the example, showing an entity definition called author and a reference to the entity in the XML message body:
<!DOCTYPE BookInfo [<!ENTITY author "Book 1">]>
<BookInfo Identifier="ES39B103T6">&author;</BookInfo>
The following ESQL uses EntityDefinition and EntityReference with the XMLNSC parser to generate the output message shown above:
SET OutputRoot.MQMD = InputRoot.MQMD;
DECLARE cursor REFERENCE TO InputRoot.XMLNSC.BookInfo;
CREATE FIELD OutputRoot.XMLNSC.BookInfo;
SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)* NAME = cursor.dtn;
SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)*.(XMLNSC.EntityDefinition)* NAME = cursor.edn;
SET OutputRoot.XMLNSC.(XMLNSC.DocumentType)*.(XMLNSC.EntityDefinition)* VALUE = 'Book 1';
SET OutputRoot.XMLNSC.(XMLNSC.Folder)*[<].(XMLNSC.EntityReference)* = cursor.edn;
SET OutputRoot.XMLNSC.BookInfo.(XMLNSC.DoubleAttribute)Identifier = cursor.Identifier;
XMLNSC.EntityDefinition is used to define the entity author with a value of Book 1. A reference to the author entity is then created in the message using XMLNSC.EntityReference. The variable cursor is used to point to the following variables:
  • The document type name dtn
  • The entity definition name edn
  • The value for Identifier
These variables are all derived from the input message. The XMLNSC.DoubleAttribute type writes the Identifier attribute value in double quotes; to write it in single quotes, use XMLNSC.SingleAttribute instead.

XMLNSC parser modes

By default, the XMLNSC parser discards document elements that typically carry no business meaning. However, parser modes are available to force retention of these elements. You can configure these modes on the properties of the node that specifies that the message is to be parsed in the XMLNSC domain.

The valid parser modes for the XMLNSC parser are:
XMLNSC.MixedContentRetainNone
XMLNSC.MixedContentRetainAll
XMLNSC.CommentsRetainNone
XMLNSC.CommentsRetainAll
XMLNSC.ProcessingInstructionsRetainNone
XMLNSC.ProcessingInstructionsRetainAll
The following example uses the XMLNSC.ProcessingInstructionsRetainAll and XMLNSC.ProcessingInstructionsRetainNone modes to retain document processing instructions while parsing:
DECLARE X BLOB ASBITSTREAM(InputRoot.XMLNSC.Data
                          OPTIONS XMLNSC.ProcessingInstructionsRetainAll);
...
CREATE LASTCHILD OF OutputRoot PARSE(X
                          OPTIONS XMLNSC.ProcessingInstructionsRetainNone);
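
The constant values given earlier in this topic for the RetainAll modes (0x0001000000000000, 0x0002000000000000, and 0x0004000000000000) each occupy a separate bit, which suggests that they behave as independent flags. The following Java sketch shows how such flags combine and how a single flag can be tested. This topic does not state that the broker accepts combined options, so treat that as an assumption; the class and method names are hypothetical. In ESQL, the BITOR function would presumably serve the same purpose as the | operator used here.

```java
final class ParserOptionsSketch {

    // Values copied from the mode constants listed in this topic.
    static final long MIXED_CONTENT_RETAIN_ALL = 0x0001000000000000L;
    static final long COMMENTS_RETAIN_ALL = 0x0002000000000000L;
    static final long PROCESSING_INSTRUCTIONS_RETAIN_ALL = 0x0004000000000000L;

    // Combines any number of option flags with a bitwise OR.
    static long combine(long... options) {
        long result = 0L;
        for (long option : options) {
            result |= option;
        }
        return result;
    }

    // Tests whether one flag is set in a combined options value.
    static boolean isSet(long options, long flag) {
        return (options & flag) != 0L;
    }

    public static void main(String[] args) {
        long options = combine(MIXED_CONTENT_RETAIN_ALL, COMMENTS_RETAIN_ALL);
        System.out.println(isSet(options, COMMENTS_RETAIN_ALL));                // prints true
        System.out.println(isSet(options, PROCESSING_INSTRUCTIONS_RETAIN_ALL)); // prints false
    }
}
```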
Related concepts
Message flows overview
XML parsers and domains
ESQL overview
Related tasks
Designing a message flow
Defining message flow content
Manipulating messages in the XML domain
Related reference
ESQL reference
SET statement
FIELDVALUE function
ASBITSTREAM function
CREATE statement

Copyright IBM Corporation 1999, 2007. All Rights Reserved.