Languages Around The World

LayoutEngine

Overview

The Latin script, which is the script most commonly used among software developers, is also the least complex script to display, especially when it is used to write English. In the Latin script, characters can be displayed from left to right in the order in which they are stored in memory. Some scripts require rendering behavior that is more complicated than that of the Latin script. We refer to these scripts as "complex scripts" and to text written in these scripts as "complex text." Examples of complex scripts are the Indic scripts (for example, Devanagari, Tamil, Telugu, and Gujarati), Thai, and Arabic.

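These complex scripts exhibit complications that are not found in the Latin script. The main complications in complex text include right-to-left text direction, contextual forms, ligatures, reordering of characters for display, and split characters.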

The ICU LayoutEngine is designed to handle these complications through a simple, uniform client interface. Clients supply Unicode code points in reading or "logical" order, and the LayoutEngine provides a list of what to display, indicates the correct order, and supplies the positioning information.

Because the ICU LayoutEngine is platform independent and text rendering is inherently platform dependent, the LayoutEngine cannot directly display text. Instead, it uses an abstract base class to access font files. This base class models a TrueType font at a particular point size and device resolution. The TrueType fonts have the following characteristics:

Since many of the contextual forms, ligatures, and split characters needed to display complex text do not have Unicode code points, they can only be referred to by their glyph indices. Because of this, the LayoutEngine's output is a list of glyph indices. This means that the output must be displayed using an interface where the characters are specified by glyph indices rather than code points.

A concrete instance of this base class must be written for each target platform. For a simple example that uses the standard C library to access a TrueType font, look at the PortableFontInstance class in icu/source/test/letest.
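To make the role of the font base class more concrete, the following is a minimal sketch of the idea. It is not the actual ICU interface: the class names FontModel and ToyFont, their method names, and the toy glyph data are all hypothetical and exist only for illustration. The real base class and its exact method signatures are defined in the LayoutEngine headers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-in for the font base class described above.
// It models one font at one point size and device resolution.
// It is NOT the real ICU interface.
class FontModel {
public:
    virtual ~FontModel() {}

    // Map a Unicode code point to a glyph index; 0 means "no glyph".
    virtual std::uint32_t mapCharToGlyph(std::uint32_t codePoint) const = 0;

    // Horizontal advance of a glyph, in pixels at this size and resolution.
    virtual float glyphAdvance(std::uint32_t glyph) const = 0;
};

// Toy concrete instance backed by in-memory tables instead of a font file.
// A real instance would read this data through platform font services.
class ToyFont : public FontModel {
public:
    ToyFont(float pointSize, float dpi)
        : pixelsPerEm_(pointSize * dpi / 72.0f) {
        assert(pointSize > 0.0f && dpi > 0.0f);
        cmap_[0x0041] = 1;        // 'A' (made-up glyph index)
        cmap_[0x0042] = 2;        // 'B' (made-up glyph index)
        advanceInEm_[1] = 0.5f;   // made-up advance, in em units
        advanceInEm_[2] = 0.75f;  // made-up advance, in em units
    }

    std::uint32_t mapCharToGlyph(std::uint32_t codePoint) const override {
        std::map<std::uint32_t, std::uint32_t>::const_iterator it = cmap_.find(codePoint);
        if (it == cmap_.end()) {
            return 0;  // no mapping for this code point
        }
        return it->second;
    }

    float glyphAdvance(std::uint32_t glyph) const override {
        std::map<std::uint32_t, float>::const_iterator it = advanceInEm_.find(glyph);
        if (it == advanceInEm_.end()) {
            return 0.0f;  // unknown glyph
        }
        return it->second * pixelsPerEm_;  // scale em units to pixels
    }

private:
    float pixelsPerEm_;
    std::map<std::uint32_t, std::uint32_t> cmap_;
    std::map<std::uint32_t, float> advanceInEm_;
};
```

The point of the design is that layout code only ever sees the abstract class, so the same layout logic can run on top of any platform-specific font implementation.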

The ICU LayoutEngine supports complex text in the following ways:

OpenType processing requires script-specific processing to be done before the tables are used. The ICU LayoutEngine performs this processing for Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, and Malayalam text.

The AAT processing in the LayoutEngine is relatively basic as it only applies the default features in left-to-right text. This processing has been tested for Devanagari text. Since AAT processing is not script-specific, it might not work for other scripts.

Programming with the LayoutEngine

The ICU LayoutEngine is designed to process a run of text that is in a single font, is written in a single direction (left-to-right or right-to-left), and is written in a single script. Clients can use ICU's Bidi processing to determine the direction of the text and use the ScriptRun class in icu/source/extra/scrptrun to find a run of text in the same script. Since the representation of font information is application specific, ICU cannot help clients find runs of text in the same font.
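The splitting step can be sketched as follows. This is a minimal illustration that assumes the font, script, and direction of every character have already been determined elsewhere (for example, by application code, by ScriptRun, and by Bidi processing, respectively). The CharAttr and Run structures and the splitIntoRuns function are hypothetical names introduced for this sketch; they are not part of ICU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-character attributes, computed before layout.
struct CharAttr {
    int  fontId;       // application-defined font identifier
    int  scriptCode;   // script of the character
    bool rightToLeft;  // resolved direction of the character
};

// A half-open range [start, limit) of characters that share all attributes.
struct Run {
    std::size_t start;
    std::size_t limit;
    CharAttr    attr;
};

static bool sameAttr(const CharAttr &a, const CharAttr &b) {
    return a.fontId == b.fontId &&
           a.scriptCode == b.scriptCode &&
           a.rightToLeft == b.rightToLeft;
}

// Split the text into maximal runs with one font, one script, one direction.
std::vector<Run> splitIntoRuns(const std::vector<CharAttr> &attrs) {
    std::vector<Run> runs;
    for (std::size_t i = 0; i < attrs.size(); ++i) {
        if (runs.empty() || !sameAttr(runs.back().attr, attrs[i])) {
            // Any attribute change starts a new run.
            runs.push_back(Run{i, i + 1, attrs[i]});
        } else {
            // Same attributes: extend the current run by one character.
            runs.back().limit = i + 1;
        }
    }
    // Empty input yields no runs; non-empty input yields at least one.
    assert(runs.empty() == attrs.empty());
    return runs;
}
```

Each resulting run is then a candidate for a single layout call, since it satisfies all three single-font, single-direction, and single-script requirements.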

Once the text has been broken into pieces that the LayoutEngine can handle, call the LayoutEngine::layoutEngineFactory method to create an instance of the LayoutEngine class that is specific to the text. The following demonstrates a call to the LayoutEngineFactory:

The following example shows how to use the LayoutEngine to process the text:             

The previous example computes three arrays: an array of glyph indices in display order, an array of x, y position pairs for each glyph, and an array that maps each output glyph back to the input text array. Use the following get methods to copy these arrays:

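// Allocate the output arrays. The positions array needs one x, y pair
// per glyph plus one extra pair for the end of the text run.
LEGlyphID *glyphs    = new LEGlyphID[glyphCount];
le_int32  *indices   = new le_int32[glyphCount];
float     *positions = new float[(glyphCount * 2) + 2];

// Copy the results of the layout out of the engine.
engine->getGlyphs(glyphs, error);
engine->getCharIndices(indices, error);
engine->getGlyphPositions(positions, error);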

Note
The positions array contains (glyphCount * 2) + 2 entries. This is because there is an x and a y position for each glyph. The extra two positions hold the x, y position of the end of the text run.
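The layout of the positions array can be illustrated with plain arrays, independent of the LayoutEngine itself. In this sketch, the Point structure and the glyphOrigin and runEnd helpers are hypothetical names introduced for illustration, and a std::vector stands in for the raw float array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point {
    float x;
    float y;
};

// Position of glyph i: entries 2*i (x) and 2*i + 1 (y).
Point glyphOrigin(const std::vector<float> &positions, std::size_t glyphCount, std::size_t i) {
    assert(positions.size() == glyphCount * 2 + 2);
    assert(i < glyphCount);
    return Point{positions[2 * i], positions[2 * i + 1]};
}

// End of the text run: the extra x, y pair after the last glyph.
Point runEnd(const std::vector<float> &positions, std::size_t glyphCount) {
    assert(positions.size() == glyphCount * 2 + 2);
    return Point{positions[2 * glyphCount], positions[2 * glyphCount + 1]};
}
```

One plausible use of the final pair is as the starting x, y position for whatever text follows the run on the same line.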

Once users have the glyph indices and positions, they can use platform-specific code to draw the glyphs. For example, on Windows 2000, users can call ExtTextOut with the ETO_GLYPH_INDEX option to draw the glyphs. On Linux, users can call TT_Load_Glyph to get the bitmap for each glyph, but they must then draw the bitmaps themselves.
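The drawing call itself is platform specific, but the loop around it is not. The following sketch assumes a hypothetical DrawGlyphFn callback that wraps whatever the platform provides. The drawRun function, the callback type, and the GlyphId typedef (a stand-in for the glyph index type) are names introduced here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

typedef std::uint32_t GlyphId;  // stand-in for the glyph index type

// Hypothetical callback that draws one glyph at a given position.
typedef std::function<void(GlyphId glyph, float x, float y)> DrawGlyphFn;

// Calls draw once per glyph, in display order, and returns the number of
// glyphs drawn. positions holds an x, y pair per glyph plus one final pair
// for the end of the run, which is not drawn.
std::size_t drawRun(const std::vector<GlyphId> &glyphs,
                    const std::vector<float> &positions,
                    const DrawGlyphFn &draw) {
    assert(positions.size() == glyphs.size() * 2 + 2);
    std::size_t drawn = 0;
    for (std::size_t i = 0; i < glyphs.size(); ++i) {
        draw(glyphs[i], positions[2 * i], positions[2 * i + 1]);
        ++drawn;
    }
    return drawn;
}
```

Keeping the platform call behind a callback means the same loop can drive a glyph-index text API on one platform and a bitmap blitter on another.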

Note
The ICU LayoutEngine was developed separately from the rest of ICU and uses different coding conventions and basic types. To use the LayoutEngine with ICU coding conventions, users can use the ICULayoutEngine class, which is a thin wrapper around the LayoutEngine class that incorporates ICU conventions and basic types.

For a more detailed example of how to call the LayoutEngine, look at icu/source/test/letest/letest.cpp. This is a simple test used to verify that the LayoutEngine is working properly. It does not do any complex text rendering.

For more information, see ICU, the OpenType Specification, and the TrueType Font File Specification.



Copyright (c) 2000 - 2005 IBM and Others - Feedback: http://icu.sourceforge.net/contacts.html

User Guide for ICU v3.4 Generated 2005-07-27.