Unicode versions

Although Unicode provides a consistent way of representing text across multiple languages, it comes in different versions, each of which uses a different data size for each character. The following table describes the versions that are supported within an IBM Informix ODBC application.

UCS-2: ISO encoding standard that maps Unicode characters to 2 bytes each. UCS-2 is the common encoding standard on Windows.
IBM Informix ODBC Driver for IBM AIX platforms supports UCS-2 encoding. IBM Informix ODBC Driver for Windows supports only UCS-2.
UCS-4: ISO encoding standard that maps Unicode characters into 4 bytes each.
The IBM Informix ODBC Driver supports UCS-4 on UNIX platforms.
UTF-8: Encoding standard that is based on a single 8-bit byte. UTF-8 defines a mechanism to transform every Unicode character into a variable-length sequence of 1 to 4 bytes.
The IBM Informix ODBC Driver uses UTF-8 encoding for all UNIX applications that connect to the Data Direct (formerly Merant) driver manager.
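The size differences between the three versions can be illustrated with a small C sketch. It is not part of the IBM Informix ODBC Driver; the function names (utf8_length, utf8_encode, ucs2_length, ucs4_length) are hypothetical and chosen only for this illustration. The sketch assumes that input values are Unicode code points and treats UCS-2 as limited to code points up to U+FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes UTF-8 needs for one code point
 * (1 to 4), or 0 if the value is not a valid Unicode scalar value. */
size_t utf8_length(uint32_t cp)
{
    if (cp <= 0x7F)                   return 1;
    if (cp <= 0x7FF)                  return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogates are not characters */
    if (cp <= 0xFFFF)                 return 3;
    if (cp <= 0x10FFFF)               return 4;
    return 0;
}

/* Hypothetical helper: encode one code point as UTF-8 into out, which
 * must have room for at least 4 bytes. Returns the number of bytes
 * written, or 0 if cp is not encodable. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    size_t n = utf8_length(cp);
    switch (n) {
    case 1:
        out[0] = (unsigned char)cp;
        break;
    case 2:
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    case 3:
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    case 4:
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    default:
        break; /* nothing written for invalid input */
    }
    return n;
}

/* UCS-2 always uses 2 bytes, but cannot represent code points above U+FFFF. */
size_t ucs2_length(uint32_t cp) { return cp <= 0xFFFF ? 2 : 0; }

/* UCS-4 always uses 4 bytes per character. */
size_t ucs4_length(uint32_t cp) { return cp <= 0x10FFFF ? 4 : 0; }
```

For example, the euro sign (U+20AC) occupies 2 bytes in UCS-2, 4 bytes in UCS-4, and 3 bytes (0xE2 0x82 0xAC) in UTF-8.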

7-bit ASCII characters have the same encoding under both ASCII and UTF-8. As a result, UTF-8 can be used with much existing software without extensive revision.
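The ASCII compatibility of UTF-8 can be checked byte by byte: a buffer in which every byte has its high bit clear is the same data whether it is read as ASCII or as UTF-8. The following C sketch shows one way to test for that; the function name is_7bit_ascii is hypothetical and is not part of any IBM Informix API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in buf is 7-bit ASCII
 * (high bit clear), 0 otherwise. A buffer that passes this check is
 * byte-for-byte identical in ASCII and in UTF-8. */
int is_7bit_ascii(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] & 0x80) {
            return 0; /* lead or continuation byte of a multibyte sequence */
        }
    }
    return 1;
}
```

Text such as "SELECT 1" passes the check, while a string that contains an accented letter encoded in UTF-8 does not, because its multibyte sequence uses bytes with the high bit set.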

Important:

In applications that use Unicode, the driver performs the codeset conversion from Unicode to the database locale and vice versa. UTF-8 is the only type of Unicode codeset that can be set as the client locale.
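As a minimal sketch of selecting a UTF-8 client locale from a UNIX application, the following C helper sets the CLIENT_LOCALE environment variable before the application connects. Several points here are assumptions and not statements from this topic: that the driver reads CLIENT_LOCALE from the process environment, that a locale name such as en_us.utf8 is installed, and the helper name use_client_locale, which is hypothetical. The sketch uses the POSIX setenv function, so it does not apply to Windows as written.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: set CLIENT_LOCALE for the current process.
 * Assumption: this is called before the application connects.
 * Returns 0 on success, -1 on failure (the return value of setenv). */
int use_client_locale(const char *locale_name)
{
    assert(locale_name != NULL);
    return setenv("CLIENT_LOCALE", locale_name, 1); /* 1 = overwrite any existing value */
}
```

An application might call use_client_locale("en_us.utf8") once at startup. Check the locale names available in your installation before relying on this value.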