Importing from COBOL: supported features

The following table shows how COBOL definitions influence the XML schema settings in the message model.

Some XML Schema types in the table are followed by a hyphen (-). The hyphen indicates that the element uses an anonymous simple type that is derived from the listed type. For strings, the anonymous type adds a length restriction; for numeric types, it adds a minimum or maximum value restriction.

| COBOL clause | XML Schema datatype | Notes |
|---|---|---|
| PIC A | xsd:string - | |
| PIC G | xsd:string - | Set the compile-time locale name to ja_JP in Windows > Preferences > Importer > COBOL to process this. |
| PIC N | xsd:string - | Set the compile-time locale name to ja_JP in Windows > Preferences > Importer > COBOL to process this. |
| PIC X | xsd:string - | |
| PIC 9(n), n = 1-4 | xsd:short - | DISPLAY, COMP, or COMP-3 |
| PIC 9(n), n = 5-9 | xsd:int - | DISPLAY, COMP, or COMP-3 |
| PIC 9(n), n = 10-18 | xsd:long - | DISPLAY, COMP, or COMP-3 |
| PIC 9(n), n = 19-31 | xsd:integer - | DISPLAY, COMP, or COMP-3 |
| PIC 9(n)V9(m) | xsd:decimal - | DISPLAY, COMP, or COMP-3; any virtual decimal point value |
| COMP-1 | xsd:float - | |
| COMP-2 | xsd:double - | |
| Any edited string | xsd:string - | |
| Any edited number | xsd:string - | For example, PIC Z |
| VALUE | | All non-88 level VALUE clauses can be imported as schema default values (option on import wizard). |
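
The digit ranges in the table above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `xsd_type_for_pic9` and its parameters are hypothetical and are not part of the importer. It mirrors the table rows for unedited numeric PIC 9 items.

```python
def xsd_type_for_pic9(integer_digits: int, fraction_digits: int = 0) -> str:
    """Return the XML Schema base type that the table lists for PIC 9(n) or PIC 9(n)V9(m).

    Hypothetical helper: it only restates the table rows and does not call the importer.
    """
    if integer_digits < 1 or fraction_digits < 0:
        raise ValueError("digit counts must be positive")
    if fraction_digits > 0:
        # Any virtual decimal point value maps to xsd:decimal.
        return "xsd:decimal"
    if integer_digits <= 4:
        return "xsd:short"
    if integer_digits <= 9:
        return "xsd:int"
    if integer_digits <= 18:
        return "xsd:long"
    if integer_digits <= 31:
        return "xsd:integer"
    raise ValueError("the table covers at most 31 digits")


print(xsd_type_for_pic9(7))      # PIC 9(7)
print(xsd_type_for_pic9(5, 2))   # PIC 9(5)V9(2)
```

In each case the generated element would use an anonymous simple type derived from the returned base type, as described before the table.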

The following table shows how COBOL definitions influence the physical CWF characteristics of the elements that are generated in the message model.

| COBOL keyword | CWF physical type | CWF length characteristics | Other CWF characteristics |
|---|---|---|---|
| PIC X(n), PIC A(n) | Fixed Length String | Length = n; Length Units = Bytes | Justification = Left Justify; Padding Character = SPACE |
| PIC G(n), PIC N(n) | Fixed Length String | Length = n; Length Units = Characters | Justification = Left Justify; Padding Character = SPACE |
| PIC 9(n) DISPLAY, n=1-31 | External Decimal | Length = n; Length Units = Bytes | Justification = Right Justify; Padding Character = '0'; Signed = Unticked; Sign Orientation = Trailing |
| PIC 9(n) COMP, COMP-4, COMP-5 or BINARY | Integer | Length = 2, 4 or 8 based on n; Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank |
| PIC 9(n) COMP-3, n=1-18 | Packed Decimal | Length = CEILING((n+1)/2); Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank |
| PIC S9(n) DISPLAY, n=1-31 | External Decimal | Length = n; Length Units = Bytes | Signed = Ticked; Sign Orientation = Trailing (see Note 1) |
| PIC S9(n) COMP or COMP-3, n=1-18 | Integer or Packed Decimal | Length = see COMP and COMP-3 definitions above; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank |
| PIC 9(m)V9(n) DISPLAY, n=1-31 | External Decimal | Length = n+m; Length Units = Bytes | Signed = Unticked; Sign Orientation = Trailing; Virtual Decimal Point = n |
| PIC 9(m)V9(n) COMP or COMP-3 | Integer or Packed Decimal | Length = CEILING((n+m+1)/2) for COMP-3; Length = 2, 4 or 8 for COMP; Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank; Virtual Decimal Point = n |
| COMP-1 | Float | Length = 4; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank |
| COMP-2 | Float | Length = 8; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank |
| SYNC | Float, Integer or Packed Decimal | | Leading Skip Count as appropriate; Trailing Skip Count as appropriate; Byte alignment as appropriate (see Note 2) |
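
The length rules in the table above can be checked with a short calculation. The following Python sketch is illustrative: the function names are hypothetical, and the digit thresholds used for COMP (1-4, 5-9, 10-18) are an assumption based on the conventional COBOL binary sizes, because the table states only that the length is 2, 4 or 8 based on n.

```python
import math


def comp3_length(total_digits: int) -> int:
    """Packed Decimal length in bytes: CEILING((digits + 1) / 2), as given in the table."""
    if not 1 <= total_digits <= 18:
        raise ValueError("the table covers 1 to 18 digits for COMP-3")
    return math.ceil((total_digits + 1) / 2)


def comp_length(total_digits: int) -> int:
    """Integer (binary) length in bytes.

    Assumption: 1-4 digits use 2 bytes, 5-9 use 4 bytes, 10-18 use 8 bytes.
    """
    if 1 <= total_digits <= 4:
        return 2
    if 5 <= total_digits <= 9:
        return 4
    if 10 <= total_digits <= 18:
        return 8
    raise ValueError("binary items are assumed to hold 1 to 18 digits")


# PIC 9(5) COMP-3 and PIC 9(5)V9(2) COMP-3 (n + m = 7 digits in total)
print(comp3_length(5), comp3_length(7))
# PIC 9(4) COMP and PIC 9(9) COMP
print(comp_length(4), comp_length(9))
```

For a PIC 9(m)V9(n) item, pass the total digit count n+m, which matches the CEILING((n+m+1)/2) row of the table.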

Notes:
  1. Sign Orientation can take one of the following values, based on the SEPARATE, LEADING, or TRAILING keywords in the COBOL definition:
    • Leading
    • Leading Separate
    • Trailing
    • Trailing Separate
  2. The SYNC keyword causes the field to be aligned on a 1, 2, 4, or 8-byte boundary. This alignment might cause 'slack bytes' to be added before or after a field. Leading Skip Count is the number of such bytes that are added before a field; Trailing Skip Count is the number of such bytes that are added after a field.

    Leading Skip Count and Trailing Skip Count are calculated for each of the imported elements by the importer, irrespective of SYNC clause. They have non-zero values when the SYNC clause is present.

    Where there is a repeating element, Leading Skip Count and Trailing Skip Count are used for the first occurrence of the repeating element; for subsequent occurrences, only the Trailing Skip Count is used.

    Refer to COBOL reference material for details of fields that require Byte Alignment.

  3. The COBOL importer requires all files that you are importing to be syntactically correct. Results are unpredictable if this is not the case.
  4. COBOL data types including POINTER, COMP-X, INDEX, and PROCEDURE-POINTER are not supported.
  5. COBOL containing the keyword NATIVE causes an error and will not import.
  6. COBOL level 66 and level 77 data items are not imported.
  7. Hexadecimal values cannot be assigned to non-numeric literals in the LINKAGE SECTIONs that are imported by the COBOL importer, although they can appear elsewhere in the COBOL file. Alternatively, you can convert the hexadecimal value to a character string for PIC X, or to a decimal number for PIC 9.
  8. Where element names clash with Java language keywords, they are modified by prefixing the element name with a single underscore character.
  9. Object-oriented extensions to COBOL 85 are not supported. For example, OBJECT-REFERENCE is not supported.
  10. When a structure contains a COBOL OCCURS DEPENDING ON clause, the Byte Alignment, Leading Skip Count, and Trailing Skip Count CWF properties of the elements within that structure are not set up correctly. You must correct these properties using the message editor.
  11. When the imported COBOL source file contains QUOTE or QUOTES in the value clause of a picture string, the default behavior is to fill in the data with double quotation marks unless you set the COBOL QUOTE compile option to SINGLE on the Import Options page of the COBOL importer wizard.
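
As an illustration of the slack bytes described in Note 2, the following Python sketch computes how many bytes must precede a field so that it starts on its alignment boundary. The function `leading_skip_count` is hypothetical and is not the importer's algorithm; it shows only the arithmetic of aligning an offset to a 1, 2, 4, or 8-byte boundary.

```python
def leading_skip_count(offset: int, alignment: int) -> int:
    """Slack bytes needed before a field that would otherwise start at `offset`.

    Hypothetical helper: `offset` is the byte position within the structure and
    `alignment` is the boundary (1, 2, 4, or 8) that SYNC requires for the field.
    """
    if alignment not in (1, 2, 4, 8):
        raise ValueError("alignment must be 1, 2, 4, or 8")
    if offset < 0:
        raise ValueError("offset must not be negative")
    # Distance from `offset` up to the next multiple of `alignment`.
    return (-offset) % alignment


# A 4-byte binary field that follows a 3-byte PIC X(3) field needs 1 slack byte.
print(leading_skip_count(3, 4))
# A field that already starts on its boundary needs none.
print(leading_skip_count(8, 4))
```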

Signed external decimal numbers

The Custom Wire Format (CWF) component of the WebSphere Message Broker provides support for the modelling of numeric data using the External Decimal (also known as Zoned Decimal) data format. In this format, a number is stored internally as decimal character data. For example, in a system using the EBCDIC code, the number 1234 stored in a 4-byte external decimal field would be stored as the character string "1234" and its actual internal hexadecimal representation would be F1F2F3F4.
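
The EBCDIC example above can be reproduced with Python's built-in codecs. This is a minimal sketch that assumes EBCDIC code page 500 (the `cp500` codec); the helper name `hex_dump` is hypothetical.

```python
def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hexadecimal pairs."""
    return " ".join(f"{byte:02X}" for byte in data)


digits = "1234"
print(hex_dump(digits.encode("cp500")))  # EBCDIC (code page 500) representation
print(hex_dump(digits.encode("ascii")))  # ASCII representation of the same digits
```

The first line of output shows the F1 F2 F3 F4 bytes described in the text; the second shows the corresponding ASCII bytes, 31 32 33 34.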

With signed external decimal numbers, the sign can be incorporated into the data by modifying the high-order half (the first four bits) of the first or last byte, depending on whether you are using a sign-leading or sign-trailing representation. Typically, '0xC' represents a positive number, '0xD' represents a negative number, and '0xF' represents an unsigned number.

Note: In general, any of '0xA', '0xC', '0xE' or '0xF' can be used to indicate a positive value, and '0xB' or '0xD' can be used to indicate a negative value. The actual preferred representation is dependent upon the actual hardware architecture.

On ASCII machines there are a number of mechanisms for the internal representation of external decimal data. One representation ('Sign ASCII') employed by IBM's pSeries machines uses the normal ASCII codes ("0" [hex 30] to "9" [hex 39]) for the first/last digit of both unsigned and positive numbers, and the characters "p" [hex 70] to "y" [hex 79] for negative numbers.

An alternative method (Sign EBCDIC Custom) is used on some other ASCII-based machines. This method uses the same characters as an EBCDIC-based machine, even though their internal hexadecimal representations are different. With this technique, the character string is identical on EBCDIC and ASCII platforms. You could receive a message from an EBCDIC platform (created from a COBOL copybook that contains entries such as PIC XXX and PIC S999) and convert the whole message to ASCII, or the other way around. After the ASCII/EBCDIC conversion, the character string that represents the external decimal field in the message maps to the code point that represents the correct sign for the decimal.

This method has a limitation. Curly brace characters are variant; that is, they have different code points in different EBCDIC code pages. The mechanism works only for those EBCDIC code pages where the curly brace characters '{' and '}' (which are used to represent signed 0) have exactly the code points X'C0' and X'D0'. For example, it works for code page 500 but not for code page 871, where the curly braces have code points X'8E' and X'9C'.
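
The curly brace limitation can be checked for any EBCDIC code page that has a Python codec. The following sketch is illustrative: the function name `braces_match_sign_ebcdic_custom` is hypothetical, and the check covers only the two code points named in the text. Python does not ship a codec for code page 871, so that case is not shown.

```python
def braces_match_sign_ebcdic_custom(codec_name: str) -> bool:
    """Return True if '{' and '}' encode to X'C0' and X'D0' in the given code page."""
    return (
        "{".encode(codec_name) == b"\xC0"
        and "}".encode(codec_name) == b"\xD0"
    )


for codec in ("cp500", "cp037"):
    print(codec, braces_match_sign_ebcdic_custom(codec))
```

Other EBCDIC codecs that your Python installation provides can be passed to the same function to see whether their curly braces sit at the required code points.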

In an ASCII environment (determined by the CCSID property at runtime), the default for both input and output is the 'Sign ASCII' representation. It is possible to specify the applicable representation in the CWF physical layer for local attributes and local elements of types decimal, float, and integer.

Note: This is only appropriate for those elements or attributes that have an external decimal physical representation and that have an embedded ('Leading' or 'Trailing') sign (determined by the 'Sign Orientation' property).

The table below shows the internal representation (both the character and its hexadecimal value) of the first or last digit for external decimal numbers with an embedded leading or trailing sign, respectively. The table does not show the representation for unsigned values, which are 0x30-0x39 for ASCII and 0xF0-0xF9 for EBCDIC.

| Digit | Positive, ASCII environment: Sign ASCII | Positive, ASCII environment: Sign EBCDIC Custom | Positive, EBCDIC environment | Negative, ASCII environment: Sign ASCII | Negative, ASCII environment: Sign EBCDIC Custom | Negative, EBCDIC environment |
|---|---|---|---|---|---|---|
| 0 | 0(30) | {(7B) | {(C0) | p(70) | }(7D) | }(D0) |
| 1 | 1(31) | A(41) | A(C1) | q(71) | J(4A) | J(D1) |
| 2 | 2(32) | B(42) | B(C2) | r(72) | K(4B) | K(D2) |
| 3 | 3(33) | C(43) | C(C3) | s(73) | L(4C) | L(D3) |
| 4 | 4(34) | D(44) | D(C4) | t(74) | M(4D) | M(D4) |
| 5 | 5(35) | E(45) | E(C5) | u(75) | N(4E) | N(D5) |
| 6 | 6(36) | F(46) | F(C6) | v(76) | O(4F) | O(D6) |
| 7 | 7(37) | G(47) | G(C7) | w(77) | P(50) | P(D7) |
| 8 | 8(38) | H(48) | H(C8) | x(78) | Q(51) | Q(D8) |
| 9 | 9(39) | I(49) | I(C9) | y(79) | R(52) | R(D9) |
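
The character columns of the table above can be turned into a lookup for reading a sign-carrying digit. The following Python sketch is illustrative: the name `decode_sign_character` is hypothetical, and the function works on characters (after any code page conversion), not on raw bytes. Because Sign ASCII uses plain digits for both unsigned and positive values, a plain digit is reported as not negative.

```python
from typing import Dict, Tuple

# Characters for digits 0-9 in each column of the table.
PLAIN_DIGITS = "0123456789"          # unsigned, or positive in Sign ASCII
POSITIVE_EBCDIC_STYLE = "{ABCDEFGHI"  # positive in Sign EBCDIC Custom and EBCDIC
NEGATIVE_EBCDIC_STYLE = "}JKLMNOPQR"  # negative in Sign EBCDIC Custom and EBCDIC
NEGATIVE_SIGN_ASCII = "pqrstuvwxy"    # negative in Sign ASCII

SIGN_TABLE: Dict[str, Tuple[int, bool]] = {}
for characters, is_negative in (
    (PLAIN_DIGITS, False),
    (POSITIVE_EBCDIC_STYLE, False),
    (NEGATIVE_EBCDIC_STYLE, True),
    (NEGATIVE_SIGN_ASCII, True),
):
    for digit, character in enumerate(characters):
        SIGN_TABLE[character] = (digit, is_negative)


def decode_sign_character(character: str) -> Tuple[int, bool]:
    """Return (digit, is_negative) for the first or last character of a field."""
    try:
        return SIGN_TABLE[character]
    except KeyError:
        raise ValueError(f"not a sign-carrying digit: {character!r}") from None


print(decode_sign_character("t"))  # Sign ASCII, negative 4
print(decode_sign_character("{"))  # Sign EBCDIC Custom or EBCDIC, positive 0
```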

This next table gives some examples for a range of simple numbers that would be representative of what could be transmitted or received using these approaches.

| Decimal value | Sign leading, ASCII environment: Sign ASCII | Sign leading, ASCII environment: Sign EBCDIC Custom | Sign leading, EBCDIC environment | Sign trailing, ASCII environment: Sign ASCII | Sign trailing, ASCII environment: Sign EBCDIC Custom | Sign trailing, EBCDIC environment |
|---|---|---|---|---|---|---|
| 1234 | 31 32 33 34 "1234" | 31 32 33 34 "1234" | F1 F2 F3 F4 "1234" | 31 32 33 34 "1234" | 31 32 33 34 "1234" | F1 F2 F3 F4 "1234" |
| +1234 | 31 32 33 34 "1234" | 41 32 33 34 "A234" | C1 F2 F3 F4 "A234" | 31 32 33 34 "1234" | 31 32 33 44 "123D" | F1 F2 F3 C4 "123D" |
| -1234 | 71 32 33 34 "q234" | 4A 32 33 34 "J234" | D1 F2 F3 F4 "J234" | 31 32 33 74 "123t" | 31 32 33 4D "123M" | F1 F2 F3 D4 "123M" |
| 7890 | 37 38 39 30 "7890" | 37 38 39 30 "7890" | F7 F8 F9 F0 "7890" | 37 38 39 30 "7890" | 37 38 39 30 "7890" | F7 F8 F9 F0 "7890" |
| +7890 | 37 38 39 30 "7890" | 47 38 39 30 "G890" | C7 F8 F9 F0 "G890" | 37 38 39 30 "7890" | 37 38 39 7B "789{" | F7 F8 F9 C0 "789{" |
| -7890 | 77 38 39 30 "w890" | 50 38 39 30 "P890" | D7 F8 F9 F0 "P890" | 37 38 39 70 "789p" | 37 38 39 7D "789}" | F7 F8 F9 D0 "789}" |

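The examples above can be reproduced with a short encoder. The following Python sketch is illustrative and is not the broker's implementation: the function name `encode_external_decimal`, its parameters, and the representation labels are hypothetical, and the EBCDIC environment is assumed to be code page 500 (the `cp500` codec).

```python
from typing import Optional

POSITIVE_EBCDIC_STYLE = "{ABCDEFGHI"  # digits 0-9, positive sign embedded
NEGATIVE_EBCDIC_STYLE = "}JKLMNOPQR"  # digits 0-9, negative sign embedded
NEGATIVE_SIGN_ASCII = "pqrstuvwxy"    # digits 0-9, negative sign (Sign ASCII)


def encode_external_decimal(
    digits: str,
    sign: Optional[str],   # None for unsigned, "+" or "-" for signed values
    orientation: str,      # "leading" or "trailing"
    representation: str,   # "sign_ascii", "sign_ebcdic_custom", or "ebcdic"
) -> bytes:
    """Encode an external decimal field with an embedded sign."""
    if not digits.isdigit() or not digits.isascii():
        raise ValueError("digits must contain only the characters 0-9")
    if sign not in (None, "+", "-"):
        raise ValueError("sign must be None, '+', or '-'")
    if orientation not in ("leading", "trailing"):
        raise ValueError("orientation must be 'leading' or 'trailing'")
    if representation not in ("sign_ascii", "sign_ebcdic_custom", "ebcdic"):
        raise ValueError("unknown representation")

    index = 0 if orientation == "leading" else len(digits) - 1
    digit = int(digits[index])

    if sign is None:
        character = digits[index]            # unsigned: plain digit
    elif representation == "sign_ascii":
        # Sign ASCII keeps the plain digit for positive values.
        character = digits[index] if sign == "+" else NEGATIVE_SIGN_ASCII[digit]
    else:
        table = POSITIVE_EBCDIC_STYLE if sign == "+" else NEGATIVE_EBCDIC_STYLE
        character = table[digit]

    text = digits[:index] + character + digits[index + 1:]
    codec = "cp500" if representation == "ebcdic" else "ascii"
    return text.encode(codec)


print(encode_external_decimal("1234", "-", "trailing", "sign_ascii"))
print(encode_external_decimal("7890", "-", "leading", "ebcdic").hex().upper())
```

The two calls correspond to the "123t" and D7 F8 F9 F0 cells of the examples table.
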