Character encoding

Character encoding is the mapping of each character in a character set to a numeric value. Brew MP supports various types of character encoding, including:

  • ISO/IEC 8859-1 ("LATIN1")
  • UTF-8

    Note: UTF is an abbreviation for Unicode Transformation Format.

  • UTF-16/UCS-2
  • EUC-KR (mostly used for the KS X 1001 character set, formerly known as KSC5601)
  • EUC-JP/Shift-JIS (mostly used for the JIS X 0201 and JIS X 0208 character sets)
  • GB 18030/GB 2312
  • Big5

For wide characters, Brew MP provides the AECHAR data type (unsigned 16-bit).

Brew MP encodings are defined in the AEEEncodingTypes.h header file in the \platform\system\inc directory in the Brew MP SDK.
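
To make the idea of an encoding as a character-to-number mapping concrete, the following portable C sketch encodes a single Unicode code point as UTF-8 bytes and as UTF-16 code units. It does not use any Brew MP API: the function names utf8_encode() and utf16_encode() are hypothetical, and uint16_t stands in for an unsigned 16-bit wide character such as AECHAR.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 if cp cannot be encoded. */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;                       /* surrogate values are not characters */
    }
    if (cp < 0x80) {                    /* 1 byte: 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* 4 bytes: 11110xxx followed by 3 trail bytes */
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* beyond the Unicode range */
}

/* Encode one Unicode code point as UTF-16 code units.
   Returns the number of 16-bit units written (1 or 2), or 0 on error. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;
    }
    if (cp < 0x10000) {                 /* one unit holds the code point directly */
        out[0] = (uint16_t)cp;
        return 1;
    }
    if (cp <= 0x10FFFF) {               /* surrogate pair for supplementary characters */
        cp -= 0x10000;
        out[0] = (uint16_t)(0xD800 | (cp >> 10));
        out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        return 2;
    }
    return 0;
}
```

For example, the character U+00E9 is the single byte 0xE9 in ISO/IEC 8859-1, the two bytes 0xC3 0xA9 in UTF-8, and the single 16-bit unit 0x00E9 in UTF-16. The same character therefore has a different byte representation in each encoding, which is why conversion is needed when interfaces expect different encodings.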

Some Brew MP interfaces and classes depend on a specific encoding type, so applications that support multiple languages with different character encodings must convert between encodings.

The ICharsetConv interface can be used to convert strings from one character encoding to another. Brew MP provides the AEECLSID_SJISConv class, an implementation of ICharsetConv, which provides conversion functionality for the Shift_JIS character set. Manufacturers may provide additional implementations for other character sets.

To obtain the ClassID for a given encoding type, call ISHELL_GetHandler() with the interface ID AEEIID_ICharsetConv. For example, the following lines of code obtain the ClassID for the UTF-8 to UTF-16 conversion and, if a handler is registered, create an instance of the converter:

AEECLSID cls = 0;
ICharsetConv * piConv = NULL;
int nErr = AEE_EFAILED;   /* remains a failure if no handler is found */

/* ps is the application's IShell pointer */
cls = ISHELL_GetHandler(ps, AEEIID_ICharsetConv, "UTF-8>UTF-16");

if (cls) {
    nErr = ISHELL_CreateInstance(ps, cls, (void **) &piConv);
}
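
An application that handles several encodings may want to build the handler string at run time instead of hard-coding it. The following portable C sketch assumes that the string always has the form "FROM>TO", as in the "UTF-8>UTF-16" example; this section shows only that one string, so the general pattern is an assumption. The helper name build_conv_key() is hypothetical and is not part of Brew MP.

```c
#include <stddef.h>
#include <stdio.h>

/* Build a "FROM>TO" handler string in buf.
   Returns 0 on success, or -1 if an argument is invalid or buf is too small. */
int build_conv_key(char *buf, size_t size, const char *from, const char *to)
{
    int n;

    if (buf == NULL || size == 0 || from == NULL || to == NULL) {
        return -1;
    }
    n = snprintf(buf, size, "%s>%s", from, to);
    if (n < 0 || (size_t)n >= size) {
        buf[0] = '\0';                  /* never hand back a truncated key */
        return -1;
    }
    return 0;
}
```

The resulting buffer could then be passed as the third argument of ISHELL_GetHandler(). Rejecting a truncated string matters here, because a shortened encoding name might match a different handler or none at all.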

AEECLSID_SJISConv supports converting to and from the following encodings:

  • Shift_JIS
  • Shift_JIS-8: alias for Shift_JIS
  • Shift_JIS-16LE: Shift_JIS encoded in single 16-bit words
  • Shift_JIS-16BE
  • Shift_JIS-16HOST
  • UTF-8
  • UTF-16
  • UTF-16LE
  • UTF-16BE
  • UTF-16HOST
  • UCS-2
  • UCS-2LE
  • UCS-2BE
  • UTF-32
  • UTF-32LE
  • UTF-32BE
  • UTF-32HOST
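
Several of the names above differ only in a byte-order suffix. The following portable C sketch illustrates what that difference means for a single 16-bit code unit, assuming that LE, BE, and HOST denote little-endian, big-endian, and the native byte order of the device; this section does not define the suffixes, so that reading is an assumption. The names ByteOrder, host_byte_order(), and put_u16() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { ORDER_LE, ORDER_BE } ByteOrder;

/* Detect the byte order of the machine the code is running on. */
ByteOrder host_byte_order(void)
{
    const uint16_t probe = 0x0102;
    const unsigned char *p = (const unsigned char *)&probe;

    /* On a little-endian host the low byte (0x02) is stored first. */
    return (p[0] == 0x02) ? ORDER_LE : ORDER_BE;
}

/* Store one 16-bit code unit into out[] using the requested byte order. */
void put_u16(uint8_t out[2], uint16_t unit, ByteOrder order)
{
    if (order == ORDER_LE) {
        out[0] = (uint8_t)(unit & 0xFF);
        out[1] = (uint8_t)(unit >> 8);
    } else {
        out[0] = (uint8_t)(unit >> 8);
        out[1] = (uint8_t)(unit & 0xFF);
    }
}
```

With this sketch, the UTF-16 code unit 0x3042 is written as the bytes 0x42 0x30 in little-endian order and as 0x30 0x42 in big-endian order. Passing host_byte_order() as the order yields the same bytes as the in-memory representation of the 16-bit value on the current machine.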

If a required character set conversion is not provided by Brew MP, manufacturers may need to implement the conversion. Manufacturers can use the implementation of AEECLSID_SJISConv as an example. The source is available in the system\brewcore\src\aee\src\sjisconv directory in the Brew MP software.

When implementing a conversion:

  • All ICharsetConv implementations must support conversion to and from the Unicode charset.
  • All ICharsetConv implementations must support wide and narrow encodings of their charsets.

When selecting fonts in mixed language environments (such as Chinese and English), it is recommended that applications select a font that supports an encoding type that works for all required character sets, if possible. This practice minimizes toggling between fonts (by calling IDISPLAY_SetFont()), and helps maintain a uniform look and feel in the UI.

Note: For accurate simulation of multi-language applications in the SDK environment, make sure that the same IFont implementations (classes) are present on the device and the PC. If the target contains custom text controllers based on proprietary engines, such as a predictive Chinese text input engine, additional testing must be done on the target to account for the difference.

In addition to ICharsetConv, the Brew MP Standard Library (AEEStdLib.h) provides helper functions to handle the EUC-KR and Shift-JIS variable-width encoding types:

  • STREXPAND(): this function takes an EUC-KR or Shift-JIS encoded string and returns an AECHAR string.
  • WSTRCOMPRESS(): this function takes an AECHAR string obtained from STREXPAND() and converts it back to the original EUC-KR or Shift-JIS encoded string.
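
The round trip between a variable-width string and a 16-bit string can be sketched in portable C as follows. This is an illustration of the general idea only, not the implementation or documented behavior of STREXPAND() and WSTRCOMPRESS(): the packing rule (lead byte in the high 8 bits, trail byte in the low 8 bits) is an assumption, the function names are hypothetical, uint16_t stands in for AECHAR, and only Shift-JIS is handled. The lead-byte ranges 0x81 to 0x9F and 0xE0 to 0xFC are those of Shift-JIS.

```c
#include <stddef.h>
#include <stdint.h>

/* A Shift-JIS lead byte starts a two-byte character. */
static int sjis_is_lead(uint8_t b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Expand a NUL-terminated Shift-JIS string into one 16-bit word per character.
   Returns the number of words written (excluding the terminator),
   or (size_t)-1 if dst cannot hold the result. */
size_t sjis_expand(const uint8_t *src, uint16_t *dst, size_t dst_cap)
{
    size_t n = 0;

    while (*src != 0) {
        uint16_t w = *src++;
        if (sjis_is_lead((uint8_t)w) && *src != 0) {
            w = (uint16_t)((w << 8) | *src++);   /* pack lead and trail byte */
        }
        if (n + 1 >= dst_cap) {
            return (size_t)-1;                   /* no room for word + terminator */
        }
        dst[n++] = w;
    }
    if (dst_cap == 0) {
        return (size_t)-1;
    }
    dst[n] = 0;
    return n;
}

/* Reverse of sjis_expand(): unpack each word into one or two bytes.
   Returns the number of bytes written (excluding the terminator),
   or (size_t)-1 if dst cannot hold the result. */
size_t sjis_compress(const uint16_t *src, uint8_t *dst, size_t dst_cap)
{
    size_t n = 0;

    for (; *src != 0; ++src) {
        size_t need = (*src > 0xFF) ? 2 : 1;
        if (n + need >= dst_cap) {
            return (size_t)-1;                   /* no room for bytes + terminator */
        }
        if (need == 2) {
            dst[n++] = (uint8_t)(*src >> 8);
        }
        dst[n++] = (uint8_t)(*src & 0xFF);
    }
    if (n >= dst_cap) {
        return (size_t)-1;
    }
    dst[n] = 0;
    return n;
}
```

Under these assumptions, the Shift-JIS bytes 0x41 0x82 0xA0 0xB1 (a Latin letter, a two-byte hiragana character, and a single-byte half-width katakana character) expand to the three words 0x0041, 0x82A0, and 0x00B1, and compressing those words reproduces the original four bytes. Because each character occupies exactly one word after expansion, character counting and indexing become simpler than in the variable-width form.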