
ucnv_err.h File Reference

C UConverter predefined error callbacks. More...

#include "unicode/ucnv.h"
#include "unicode/utypes.h"


Compounds

struct  UConverterFromUnicodeArgs
 The structure for the fromUnicode callback function parameter. More...

struct  UConverterToUnicodeArgs
 The structure for the toUnicode callback function parameter. More...


Defines

#define UCNV_SUB_STOP_ON_ILLEGAL   "i"
 FROM_U, TO_U options for sub callback. More...

#define UCNV_SKIP_STOP_ON_ILLEGAL   "i"
 FROM_U, TO_U options for skip callback. More...

#define UCNV_ESCAPE_ICU   NULL
 FROM_U_CALLBACK_ESCAPE option to escape the code unit according to ICU (%UXXXX). More...

#define UCNV_ESCAPE_JAVA   "J"
 FROM_U_CALLBACK_ESCAPE option to escape the code unit according to JAVA (\uXXXX). More...

#define UCNV_ESCAPE_C   "C"
 FROM_U_CALLBACK_ESCAPE option to escape the code unit according to C (\uXXXX \UXXXXXXXX); TO_U_CALLBACK_ESCAPE option to escape the character value according to C (\xXXXX). More...

#define UCNV_ESCAPE_XML_DEC   "D"
 FROM_U_CALLBACK_ESCAPE option to escape the code unit according to XML Decimal escape (&#DDDD;); TO_U_CALLBACK_ESCAPE option to escape the character value according to XML Decimal escape (&#DDDD;). More...

#define UCNV_ESCAPE_XML_HEX   "X"
 FROM_U_CALLBACK_ESCAPE option to escape the code unit according to XML Hex escape (&#xXXXX;); TO_U_CALLBACK_ESCAPE option to escape the character value according to XML Hex escape (&#xXXXX;). More...

#define UCNV_ESCAPE_UNICODE   "U"
 FROM_U_CALLBACK_ESCAPE option to escape the code unit according to Unicode (U+XXXXX). More...


Enumerations

enum  UConverterCallbackReason {
  UCNV_UNASSIGNED = 0, UCNV_ILLEGAL = 1, UCNV_IRREGULAR = 2, UCNV_RESET = 3,
  UCNV_CLOSE = 4, UCNV_CLONE = 5
}
 The process condition code to be used with the callbacks. More...


Functions

void UCNV_FROM_U_CALLBACK_STOP (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately. More...

void UCNV_TO_U_CALLBACK_STOP (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately. More...

void UCNV_FROM_U_CALLBACK_SKIP (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback skips any ILLEGAL_SEQUENCE, or skips only UNASSIGNED_SEQUENCE, depending on the context parameter, simply ignoring those characters. More...

void UCNV_FROM_U_CALLBACK_SUBSTITUTE (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback will Substitute the ILLEGAL SEQUENCE, or UNASSIGNED_SEQUENCE depending on context parameter, with the current substitution string for the converter. More...

void UCNV_FROM_U_CALLBACK_ESCAPE (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback will Substitute the ILLEGAL SEQUENCE with the hexadecimal representation of the illegal codepoints. More...

void UCNV_TO_U_CALLBACK_SKIP (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback skips any ILLEGAL_SEQUENCE, or skips only UNASSIGNED_SEQUENCE, depending on the context parameter, simply ignoring those characters. More...

void UCNV_TO_U_CALLBACK_SUBSTITUTE (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback will Substitute the ILLEGAL SEQUENCE, or UNASSIGNED_SEQUENCE depending on context parameter, with the Unicode substitution character, U+FFFD. More...

void UCNV_TO_U_CALLBACK_ESCAPE (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback will Substitute the ILLEGAL SEQUENCE with the hexadecimal representation of the illegal bytes (in the format %XNN). More...


Detailed Description

C UConverter predefined error callbacks.

Error Behaviour Functions

Defines some error behaviour functions called by ucnv_{from,to}Unicode. These are provided as part of ICU and many are stable, but they can also be considered only as an example of what can be done with callbacks. You may of course write your own.

These functions, although public, should NEVER be called directly. Pass them as parameters to ucnv_setFromUCallBack and ucnv_setToUCallBack to set the behaviour of a converter when it encounters ILLEGAL/UNMAPPED/INVALID sequences.

Usage example: 'STOP' doesn't need any context, but newContext could be set to something other than 'NULL' if needed.

    UErrorCode err = U_ZERO_ERROR;
    UConverter *myConverter = ucnv_open("ibm-949", &err);
    const void *newContext = NULL;      /* STOP takes no options */
    const void *oldContext;             /* receives the previous context */
    UConverterFromUCallback oldAction;  /* receives the previous callback */

    if (U_SUCCESS(err))
    {
        ucnv_setFromUCallBack(myConverter,
                              UCNV_FROM_U_CALLBACK_STOP,
                              newContext,
                              &oldAction,
                              &oldContext,
                              &err);
    }

The code above tells "myConverter" to stop when it encounters an ILLEGAL/TRUNCATED/INVALID sequence while converting from Unicode to the codepage. The behavior from codepage to Unicode is not changed.

Definition in file ucnv_err.h.


Define Documentation

#define UCNV_ESCAPE_C   "C"
 

FROM_U_CALLBACK_ESCAPE option to escape the code unit according to C (\uXXXX \UXXXXXXXX); TO_U_CALLBACK_ESCAPE option to escape the character value according to C (\xXXXX).

Stable:
ICU 2.0

Definition at line 91 of file ucnv_err.h.

#define UCNV_ESCAPE_ICU   NULL
 

FROM_U_CALLBACK_ESCAPE option to escape the code unit according to ICU (%UXXXX).

Stable:
ICU 2.0

Definition at line 80 of file ucnv_err.h.

#define UCNV_ESCAPE_JAVA   "J"
 

FROM_U_CALLBACK_ESCAPE option to escape the code unit according to JAVA (\uXXXX).

Stable:
ICU 2.0

Definition at line 85 of file ucnv_err.h.

#define UCNV_ESCAPE_UNICODE   "U"
 

FROM_U_CALLBACK_ESCAPE option to escape the code unit according to Unicode (U+XXXXX).

Stable:
ICU 2.0

Definition at line 108 of file ucnv_err.h.

#define UCNV_ESCAPE_XML_DEC   "D"
 

FROM_U_CALLBACK_ESCAPE option to escape the code unit according to XML Decimal escape (&#DDDD;); TO_U_CALLBACK_ESCAPE option to escape the character value according to XML Decimal escape (&#DDDD;).

Stable:
ICU 2.0

Definition at line 97 of file ucnv_err.h.

#define UCNV_ESCAPE_XML_HEX   "X"
 

FROM_U_CALLBACK_ESCAPE option to escape the code unit according to XML Hex escape (&#xXXXX;); TO_U_CALLBACK_ESCAPE option to escape the character value according to XML Hex escape (&#xXXXX;).

Stable:
ICU 2.0

Definition at line 103 of file ucnv_err.h.

#define UCNV_SKIP_STOP_ON_ILLEGAL   "i"
 

FROM_U, TO_U options for skip callback.

Stable:
ICU 2.0

Definition at line 74 of file ucnv_err.h.

#define UCNV_SUB_STOP_ON_ILLEGAL   "i"
 

FROM_U, TO_U options for sub callback.

Stable:
ICU 2.0

Definition at line 68 of file ucnv_err.h.


Enumeration Type Documentation

enum UConverterCallbackReason
 

The process condition code to be used with the callbacks.

Codes which are greater than UCNV_IRREGULAR should be passed on to any chained callbacks.

Stable:
ICU 2.0
Enumeration values:
UCNV_UNASSIGNED  The code point is unassigned.

The error code U_INVALID_CHAR_FOUND will be set.

UCNV_ILLEGAL  The code point is illegal.

For example, \x81\x2E is illegal in SJIS because \x2E is not a valid trail byte for the \x81 lead byte. Also, starting with Unicode 3.0.1, non-shortest byte sequences in UTF-8 (like \xC1\xA1 instead of \x61 for U+0061) are also illegal, not just irregular. The error code U_ILLEGAL_CHAR_FOUND will be set.

UCNV_IRREGULAR  The codepoint is not a regular sequence in the encoding.

For example, \xED\xA0\x80..\xED\xBF\xBF are irregular UTF-8 byte sequences for single surrogate code points. The error code U_INVALID_CHAR_FOUND will be set.

UCNV_RESET  The callback is called with this reason when a 'reset' has occurred.

Callback should reset all state.

UCNV_CLOSE  Called when the converter is closed.

The callback should release any allocated memory.

UCNV_CLONE  Called when ucnv_safeClone() is called on the converter.

The pointer available as the 'context' is an alias to the original converter's context pointer. If the context must be owned by the new converter, the callback must clone the data and call ucnv_setFromUCallBack (or ucnv_setToUCallBack) with the correct pointer.

Draft:
This API has been introduced in ICU 2.2. It is still in draft state and may be modified in a future release.

Definition at line 116 of file ucnv_err.h.
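
The UTF-8 examples given above for UCNV_ILLEGAL and UCNV_IRREGULAR can be checked by hand. The following is a minimal sketch in plain C, not ICU code: the helper names (decode2, decode3, is_non_shortest2, is_surrogate) are hypothetical and exist only for this illustration. It decodes the raw bit payload of a 2-byte or 3-byte sequence and shows why \xC1\xA1 is a non-shortest form of U+0061 and why \xED\xA0\x80..\xED\xBF\xBF cover the surrogate range.

```c
#include <stdint.h>

/* Payload of a 2-byte UTF-8 sequence (110xxxxx 10xxxxxx), no validation. */
static uint32_t decode2(unsigned char b0, unsigned char b1)
{
    return ((uint32_t)(b0 & 0x1F) << 6) | (uint32_t)(b1 & 0x3F);
}

/* Payload of a 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx), no validation. */
static uint32_t decode3(unsigned char b0, unsigned char b1, unsigned char b2)
{
    return ((uint32_t)(b0 & 0x0F) << 12) |
           ((uint32_t)(b1 & 0x3F) << 6) |
            (uint32_t)(b2 & 0x3F);
}

/* A 2-byte sequence is non-shortest if its value would fit in one byte. */
static int is_non_shortest2(uint32_t cp)
{
    return cp < 0x80;
}

/* Single surrogate code points occupy U+D800..U+DFFF. */
static int is_surrogate(uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}
```

For instance, decode2(0xC1, 0xA1) yields 0x61, which is_non_shortest2 flags because U+0061 already has the one-byte form \x61. decode3(0xED, 0xA0, 0x80) yields 0xD800 and decode3(0xED, 0xBF, 0xBF) yields 0xDFFF, the two ends of the surrogate range.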


Function Documentation

void UCNV_FROM_U_CALLBACK_ESCAPE (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback will Substitute the ILLEGAL SEQUENCE with the hexadecimal representation of the illegal codepoints.

Parameters:
context:  the function currently recognizes the callback options:
UCNV_ESCAPE_ICU: Substitutes the ILLEGAL SEQUENCE with the hexadecimal representation in the format %UXXXX, e.g. "%UFFFE%U00AC%UC8FE". If the converter doesn't support the characters {%,U}[A-F][0-9], it will substitute the illegal sequence with the substitution characters. Note that a supplementary code point (a 32-bit value that needs a surrogate pair) is represented as %UD84D%UDC56.
UCNV_ESCAPE_JAVA: Substitutes the ILLEGAL SEQUENCE with the hexadecimal representation in the format \uXXXX, e.g. "\uFFFE\u00AC\uC8FE". If the converter doesn't support the characters {\,u}[A-F][0-9], it will substitute the illegal sequence with the substitution characters. Note that a supplementary code point is represented as \uD84D\uDC56.
UCNV_ESCAPE_C: Substitutes the ILLEGAL SEQUENCE with the hexadecimal representation in the format \uXXXX, e.g. "\uFFFE\u00AC\uC8FE". If the converter doesn't support the characters {\,u,U}[A-F][0-9], it will substitute the illegal sequence with the substitution characters. Note that a supplementary code point is represented as \U00023456.
UCNV_ESCAPE_XML_DEC: Substitutes the ILLEGAL SEQUENCE with the decimal representation in the format &#DDDDDDDD;, e.g. "&#65534;&#172;&#51454;". If the converter doesn't support the characters {&,#}[0-9], it will substitute the illegal sequence with the substitution characters. Note that a supplementary code point is represented as &#144470; and zero padding is ignored.
UCNV_ESCAPE_XML_HEX: Substitutes the ILLEGAL SEQUENCE with the hexadecimal representation in the format &#xXXXX;, e.g. "&#xFFFE;&#x00AC;&#xC8FE;". If the converter doesn't support the characters {&,#,x}[A-F][0-9], it will substitute the illegal sequence with the substitution characters. Note that a supplementary code point is represented as &#x23456;.
Stable:
ICU 2.0
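
To make the escape formats concrete, here is a minimal sketch in plain C that renders one code point in each style. It is not ICU's implementation: the EscapeStyle enum and format_escape function are hypothetical names for this illustration, whereas ICU selects the style through the context string (NULL, "J", "C", "D", "X"). The `%` prefix in the ICU style and the `#` and trailing `;` in the XML styles are assumptions about the exact delimiters.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical style tags for this sketch only. */
typedef enum { ESC_ICU, ESC_JAVA, ESC_C, ESC_XML_DEC, ESC_XML_HEX } EscapeStyle;

/* Write the escape for one code point into out; returns snprintf's result. */
static int format_escape(EscapeStyle style, unsigned long cp,
                         char *out, size_t size)
{
    /* Supplementary code points are split into a UTF-16 surrogate pair
       for the styles that escape per 16-bit code unit. */
    unsigned hi = 0, lo = 0;
    int is_pair = cp > 0xFFFFUL;
    if (is_pair) {
        unsigned long v = cp - 0x10000UL;
        hi = 0xD800u + (unsigned)(v >> 10);
        lo = 0xDC00u + (unsigned)(v & 0x3FFu);
    }

    switch (style) {
    case ESC_ICU:      /* %UXXXX per code unit (the '%' is an assumption) */
        return is_pair ? snprintf(out, size, "%%U%04X%%U%04X", hi, lo)
                       : snprintf(out, size, "%%U%04lX", cp);
    case ESC_JAVA:     /* \uXXXX per code unit */
        return is_pair ? snprintf(out, size, "\\u%04X\\u%04X", hi, lo)
                       : snprintf(out, size, "\\u%04lX", cp);
    case ESC_C:        /* \uXXXX, or \UXXXXXXXX above the BMP */
        return is_pair ? snprintf(out, size, "\\U%08lX", cp)
                       : snprintf(out, size, "\\u%04lX", cp);
    case ESC_XML_DEC:  /* decimal, no zero padding */
        return snprintf(out, size, "&#%lu;", cp);
    case ESC_XML_HEX:  /* hexadecimal, at least four digits */
        return snprintf(out, size, "&#x%04lX;", cp);
    }
    return -1;
}
```

Under these assumptions, U+23456 comes out as %UD84D%UDC56, \uD84D\uDC56, \U00023456, &#144470; and &#x23456; respectively, in line with the surrogate-pair notes in the parameter description.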

void UCNV_FROM_U_CALLBACK_SKIP (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback skips any ILLEGAL_SEQUENCE, or skips only UNASSIGNED_SEQUENCE, depending on the context parameter, simply ignoring those characters.

Parameters:
context:  the function currently recognizes the callback options:
UCNV_SKIP_STOP_ON_ILLEGAL: STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately.
NULL: Skips any ILLEGAL_SEQUENCE.
Stable:
ICU 2.0
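
The effect of the context option on the skip callback can be summarized as a small decision function. This is a sketch in plain C rather than ICU's code: the Reason enum is a local mirror of the UConverterCallbackReason values listed in this file, and Action and skip_policy are hypothetical names. Treating UCNV_IRREGULAR like UCNV_ILLEGAL under the "i" option, and stopping on an unrecognized option, are assumptions.

```c
#include <stddef.h>

/* Local mirror of the reason codes, so the sketch builds without ICU headers. */
typedef enum {
    REASON_UNASSIGNED = 0, REASON_ILLEGAL = 1, REASON_IRREGULAR = 2,
    REASON_RESET = 3, REASON_CLOSE = 4, REASON_CLONE = 5
} Reason;

typedef enum {
    ACTION_SKIP,    /* clear the error and drop the offending input */
    ACTION_STOP,    /* leave the error set so conversion returns to the caller */
    ACTION_IGNORE   /* reset/close/clone notification, not a conversion error */
} Action;

/* Decide what a skip-style callback does for a given context and reason. */
static Action skip_policy(const char *context, Reason reason)
{
    if (reason > REASON_IRREGULAR) {
        return ACTION_IGNORE;
    }
    if (context == NULL) {
        return ACTION_SKIP;                 /* NULL: skip any bad sequence */
    }
    if (context[0] == 'i') {                /* UCNV_SKIP_STOP_ON_ILLEGAL */
        return reason == REASON_UNASSIGNED ? ACTION_SKIP : ACTION_STOP;
    }
    return ACTION_STOP;                     /* unknown option (assumption) */
}
```

With a NULL context every unassigned, illegal or irregular sequence is skipped. With "i" only unassigned sequences are skipped and the others stop the conversion.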

void UCNV_FROM_U_CALLBACK_STOP (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately.

Stable:
ICU 2.0

void UCNV_FROM_U_CALLBACK_SUBSTITUTE (const void *context, UConverterFromUnicodeArgs *fromUArgs, const UChar *codeUnits, int32_t length, UChar32 codePoint, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This From Unicode callback will Substitute the ILLEGAL SEQUENCE, or UNASSIGNED_SEQUENCE depending on context parameter, with the current substitution string for the converter.

This is the default callback.

Parameters:
context:  the function currently recognizes the callback options:
UCNV_SUB_STOP_ON_ILLEGAL: STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately.
NULL: Substitutes any ILLEGAL_SEQUENCE.
See also:
ucnv_setSubstChars
Stable:
ICU 2.0

void UCNV_TO_U_CALLBACK_ESCAPE (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback will Substitute the ILLEGAL SEQUENCE with the hexadecimal representation of the illegal bytes (in the format %XNN, e.g. "%XFF%X0A%XC8%X03").

Stable:
ICU 2.0
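
The byte-escape format is easy to reproduce outside ICU. The following is a minimal sketch in plain C, not ICU's implementation: escape_bytes is a hypothetical helper, and the `%` prefix before each XNN group is an assumption about the exact output format.

```c
#include <stddef.h>
#include <stdio.h>

/* Render each input byte as %XNN (two uppercase hex digits).
   Returns the number of characters written, excluding the terminator,
   or -1 if the output buffer is too small. */
static int escape_bytes(const unsigned char *bytes, size_t length,
                        char *out, size_t size)
{
    size_t used = 0;
    size_t i;

    if (size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (i = 0; i < length; ++i) {
        if (used + 4 >= size) {     /* need 4 characters plus the terminator */
            return -1;
        }
        snprintf(out + used, size - used, "%%X%02X", (unsigned)bytes[i]);
        used += 4;
    }
    return (int)used;
}
```

Passing the four bytes FF 0A C8 03 produces the string "%XFF%X0A%XC8%X03", 16 characters long.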

void UCNV_TO_U_CALLBACK_SKIP (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback skips any ILLEGAL_SEQUENCE, or skips only UNASSIGNED_SEQUENCE, depending on the context parameter, simply ignoring those characters.

Parameters:
context:  the function currently recognizes the callback options:
UCNV_SKIP_STOP_ON_ILLEGAL: STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately.
NULL: Skips any ILLEGAL_SEQUENCE.
Stable:
ICU 2.0

void UCNV_TO_U_CALLBACK_STOP (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately.

Stable:
ICU 2.0

void UCNV_TO_U_CALLBACK_SUBSTITUTE (const void *context, UConverterToUnicodeArgs *fromUArgs, const char *codeUnits, int32_t length, UConverterCallbackReason reason, UErrorCode *err)
 

DO NOT CALL THIS FUNCTION DIRECTLY! This To Unicode callback will Substitute the ILLEGAL SEQUENCE, or UNASSIGNED_SEQUENCE depending on context parameter, with the Unicode substitution character, U+FFFD.

Parameters:
context:  the function currently recognizes the callback options:
UCNV_SUB_STOP_ON_ILLEGAL: STOPS at the ILLEGAL_SEQUENCE, returning the error code back to the caller immediately.
NULL: Substitutes any ILLEGAL_SEQUENCE.
Stable:
ICU 2.0


Generated on Wed Dec 18 16:50:28 2002 for ICU 2.4 by doxygen1.2.11.1 written by Dimitri van Heesch, © 1997-2001