In a client/server environment, character data might need to be converted from one code set to another if the client and server computers use different code sets to represent the same characters. The conversion of character data from one code set (the source code set) to another (the target code set) is called code-set conversion. Without code-set conversion, when the two computers use different code sets, neither computer can correctly process or display character data that originates on the other.
IBM Informix products use GLS locales to perform code-set conversion. Both an IBM Informix client application and a database server might perform code-set conversion. For details, see Database Server Code-Set Conversion and Client Application Code-Set Conversion.
You specify a code set as part of the GLS locale. At runtime, IBM Informix products adhere to the following rules to determine which code sets to use:
Code-set conversion does not provide either of the following capabilities:
It does not convert between words in different languages. For example, it does not convert from the English word yes to the French word oui. It only ensures that each character retains its meaning when it is processed or written, regardless of how it is encoded.
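It does not create a character in the target code set if the character exists only in the source code set. For example, if the character â is passed to a target computer whose code set does not contain that character, the target computer cannot process or print the character exactly.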
For each character in the source code set, a corresponding character in the target code set should exist. However, if the source code set contains characters that are not in the target code set, the conversion must then define how to map these mismatched characters to the target code set. (Absence of a mapping between a character in the source and target code sets is often called a lossy error.) If all characters in the source code set exist in the target code set, mismatch handling does not apply.
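The mismatch check described above can be sketched in a few lines. This is only an illustration that uses Python's built-in codecs as stand-ins for code sets; it is not how Informix GLS locales are implemented, and the function name mismatched_characters and the sample string are hypothetical.

```python
def mismatched_characters(text, target_code_set):
    """Return the distinct characters in text that target_code_set cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(target_code_set)
        except UnicodeEncodeError:
            # No code exists for this character in the target code set.
            if ch not in missing:
                missing.append(ch)
    return missing


sample = "prix: 5 €, âge"
print(mismatched_characters(sample, "cp1252"))   # every character has a code
print(mismatched_characters(sample, "latin-1"))  # the euro sign is missing
print(mismatched_characters(sample, "ascii"))    # the euro sign and â are missing
```

When the returned list is empty, mismatch handling does not apply to that text; otherwise the conversion needs a rule for each character in the list.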
A code-set conversion uses one of the following four methods to handle mismatched characters:
This method maps each mismatched character to a unique character in the target code set so that the return mapping maps the original character back to itself. This method guarantees that a two-way conversion results in no loss of information; however, data that is converted just one way might prevent correct processing or printing on the target computer.
This method maps all mismatched characters to one character in the target code set that highlights mismatched characters. This method guarantees that a one-way conversion clearly shows the mismatched characters; however, a two-way conversion results in loss of information if mismatched characters are present.
This method maps each mismatched character to a character in the target code set that looks similar to the source character.
This method includes the mapping of one-character ligatures to their two-character equivalents and vice versa, to make printing of mismatched data more accurate on the target computer, but it most likely confuses the processing of this data on the target computer.
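The trade-offs among these methods can be seen in a small sketch. It assumes ASCII as the target code set and uses two tiny, hypothetical mapping tables (ROUND_TRIP and LOOKALIKE); real conversion tables are supplied with the GLS locale files, and the function names here are descriptive labels chosen for the illustration.

```python
# Hypothetical tables for illustration only.
ROUND_TRIP = {"â": 0x01, "œ": 0x02}   # unique, otherwise unused target codes
LOOKALIKE = {"â": "a", "œ": "oe"}      # visually similar replacements


def substitution(text):
    # Every mismatched character collapses to the single marker '?'.
    return text.encode("ascii", errors="replace")


def graphical(text):
    # Replace each mismatched character with a look-alike; the ligature
    # becomes two characters. Raises if a character has no look-alike.
    return "".join(LOOKALIKE.get(ch, ch) for ch in text).encode("ascii")


def round_trip_encode(text):
    # Give each mismatched character its own reserved code.
    return bytes(ROUND_TRIP[ch] if ch in ROUND_TRIP else ord(ch) for ch in text)


def round_trip_decode(data):
    # The return mapping restores the original characters.
    reverse = {code: ch for ch, code in ROUND_TRIP.items()}
    return "".join(reverse.get(b, chr(b)) for b in data)


word = "cœur âpre"
print(substitution(word))        # mismatches are visible but not recoverable
print(graphical(word))           # prints well, but the text has changed length
print(round_trip_decode(round_trip_encode(word)) == word)
```

Only the round-trip pair recovers the original string, but its one-way output contains reserved codes that the target computer cannot display meaningfully.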
An application must use code-set conversion only if the two code sets (client and server-processing locale, or server-processing locale and server) are different. The following situations are possible causes of code sets that differ:
For example, the code for the character â (a-circumflex) in Windows Code Page 1252 is hexadecimal 0xE2. In IBM Coded Character Set Identifier (CCSID) 437 (a common IBM UNIX code set), the code is hexadecimal 0x83. If the code for â on the client is sent unchanged to the IBM UNIX computer, it prints as the Greek character Γ (gamma). This action occurs because the code for Γ is hexadecimal 0xE2 on the IBM UNIX computer.
For example, the code sets ccdc and big5 are both internal representations of a subset of the Chinese language. These subsets, however, include different numbers of Chinese characters.
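The a-circumflex example above can be reproduced with Python's built-in cp1252 and cp437 codecs, used here as stand-ins for the client and server code sets. This is a sketch for illustration, not Informix code; the variable names are arbitrary.

```python
# Code for â on the client (Windows Code Page 1252).
client_bytes = "â".encode("cp1252")
print(client_bytes.hex())             # e2

# Sent unchanged and interpreted as code page 437, the same byte is a
# different character.
misread = client_bytes.decode("cp437")
print(misread)                        # Γ

# With code-set conversion, the byte is translated to the target's code for â.
converted = client_bytes.decode("cp1252").encode("cp437")
print(converted.hex())                # 83
```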
If a code-set conversion is required for data transfer from computer A to computer B, then it is also required for data transfer from computer B to computer A. In the client/server environment, the following situations might require code-set conversion:
In Figure 4, the black dots indicate the two points in a client/server environment at which code-set conversion might occur.
In the example connection that Figure 4 shows, the ESQL/C client application performs code-set conversion on the data that it sends to and receives from the database server if the client and database code sets are convertible. The Informix database server also performs code-set conversion when it writes to a message-log file if the code sets of the server locale and server-processing locale are convertible.