INFORMIX
Informix Guide to GLS Functionality
Glossary


7-bit character
A character that is composed of seven bits, such as all the characters of the ASCII code set.

8-bit character
The 8-bit characters are single-byte characters with code values between 128 and 255. Examples from the ISO8859-1 code set include the non-English é, ñ, and ö characters. They can be interpreted correctly only if the software that interprets them is 8-bit clean.
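As a small illustration (a Python sketch, not part of any Informix product), the built-in `latin-1` codec, which implements ISO8859-1, can be used to confirm that each of the example characters occupies a single byte whose value is above 127. The variable names are chosen for this example only.

```python
# Encode the example characters with ISO8859-1 (Python names this codec "latin-1").
samples = ["é", "ñ", "ö"]
code_values = {ch: ch.encode("latin-1")[0] for ch in samples}

for ch, value in code_values.items():
    # Each character is a single byte with a code value between 128 and 255.
    print(ch, value, hex(value))
```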

8-bit clean
This term describes software or a file system that can process character data that contains 8-bit characters. A system that is not 8-bit clean can represent and process correctly only the characters of the ASCII code set.
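The following Python sketch simulates what can happen to 8-bit characters in a system that is not 8-bit clean. It assumes one common failure mode, in which the high bit of every byte is cleared; the function name `strip_high_bit` is hypothetical and is used only for this illustration.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a channel that is not 8-bit clean by clearing bit 7 of each byte."""
    return bytes(b & 0x7F for b in data)

original = "señor".encode("latin-1")   # ñ is stored as the single byte 0xF1
damaged = strip_high_bit(original)     # 0xF1 & 0x7F == 0x71, which is ASCII "q"

print(original.decode("latin-1"))
print(damaged.decode("latin-1"))
```

Pure ASCII data passes through such a channel unchanged, which is why the problem appears only when 8-bit characters are present.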

16-bit code set
In a 16-bit code set (such as JIS X0208), up to 65,536 (2^16) distinct characters can be encoded.

alpha class
The alpha class of a code set consists of all characters that are classified as alphabetic. For example, the alpha class of the ASCII code set is the letters a through z and A through Z.
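The ASCII example can be checked with a short Python sketch. Note the assumption: Python's `str.isalpha` classifies characters by Unicode properties, not by a GLS locale, so this only approximates how a locale would define the alpha class.

```python
import string

# Collect every ASCII code value (0 through 127) whose character is alphabetic.
ascii_alpha = [chr(code) for code in range(128) if chr(code).isalpha()]

print("".join(ascii_alpha))
print(len(ascii_alpha))
```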

ALS
An acronym for Asian Language System. ALS refers to a class of products that have been developed to operate with multibyte code sets. ALS products support various multibyte code sets whose characters are composed of 8, 16, 24, and 32 bits. ALS servers and tools are available for the 6.x and earlier family of products. These products might have been developed with the GLS Library or other software written specifically to handle Asian language processing.

ASCII
An acronym for the American Standard Code for Information Interchange. This acronym is often used to describe an ordered set of printable and nonprintable characters used in computers and telecommunication. This code set is traditionally used in computer systems found in the United States. It contains 128 characters (each of which can be represented with 7 bits of information) and is a proper subset of every GLS character map and logical unit.

byte
The smallest addressable unit of computer memory, typically eight bits (often called an octet).

character
A logical unit of storage for a code value in a code set. It can be represented by one or more bytes and can be a numeric, alphabetic, or nonprintable character (control character).

client
An application program that requests services from a server program, typically a file server or a database server.

client locale
The environment that defines the behavior of the client application by specifying a language, a code set, and the conventions used for a particular language. These conventions can include date, time, and monetary formats. For example, to read and write files, the client refers to the client locale, which is typically specified by the CLIENT_LOCALE environment variable.

code point
An entry in a code set. For example, in the ASCII code set, code point 65 (hexadecimal 41) is A.
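In Python, the built-in functions `ord` and `chr` convert between a character and its code value, which gives a quick way to inspect the ASCII entry for A (code value 65, hexadecimal 41). This is only an illustration in a general-purpose language, not a GLS interface.

```python
code_value = ord("A")        # character -> code value
character = chr(code_value)  # code value -> character

print(code_value, hex(code_value), character)
```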

coded character set
See code set.

code set
A given language has a character set of one or more natural-language alphabets plus additional symbols for digits, punctuation, and diacritical marks. Each character set has at least one code set, which maps its characters to unique bit patterns. ASCII, ISO8859-1, Microsoft 1252, and EBCDIC are examples of code sets for the English language.

code-set conversion
A translation of code points from one code set to another, such as ASCII to EBCDIC.
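A minimal sketch of an ASCII-to-EBCDIC conversion, using Python's standard codecs. The choice of `cp037` is an assumption made for this example: it is one of several EBCDIC code pages, and the entry above does not say which one a given system uses.

```python
text = "GLS"

ascii_bytes = text.encode("ascii")
# Convert by decoding from the source code set and encoding into the target one.
ebcdic_bytes = ascii_bytes.decode("ascii").encode("cp037")

print(ascii_bytes.hex())   # code values in ASCII
print(ebcdic_bytes.hex())  # code values for the same characters in EBCDIC (cp037)
```

The characters are the same on both sides; only the code values that represent them change.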

code-set order
The sequence of characters when sorted in terms of their numerical representation and code-point value in a code set. For example, in the ASCII code set, uppercase characters (A through Z) are ordered before lowercase characters (a through z).
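Python compares strings by code value, so its default sort is a convenient way to see code-set order for ASCII data. The word list is made up for this example.

```python
words = ["apple", "Banana", "cherry", "Apple"]

# sorted() compares strings code value by code value, so every uppercase
# ASCII letter (65 through 90) sorts before every lowercase letter (97 through 122).
code_set_order = sorted(words)
print(code_set_order)
```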

collation
The process of ordering or comparing characters. The order or compare process is based on the code-set value of the character (code-set order) or on the localized order (collated order). Full internationalization of sorting and collating functions and support libraries is required in order to correctly and efficiently perform case conversions, comparisons (such as column1 = column2), and regular-expression matching.

collation order
The sequence of values that specifies some logical ordering in which the character fields in a database are sorted and indexed. Collation order is also known as collating sequence.
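The sketch below contrasts code-set order with a dictionary-style ordering. The sort key is only a stand-in: a real collation order is defined by the locale, and the rule used here (compare without regard to case first, then break ties by code value) is an assumption chosen to keep the example self-contained.

```python
words = ["apple", "Banana", "cherry"]

# Code-set order: uppercase "B" (66) sorts before lowercase "a" (97).
by_code_set = sorted(words)

# Stand-in for a localized order: ignore case first, then break ties by code value.
by_stand_in_collation = sorted(words, key=lambda w: (w.lower(), w))

print(by_code_set)
print(by_stand_in_collation)
```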

DB_LOCALE
On the client, the value of DB_LOCALE specifies how databases are created on the database server (that is, which code set and collation order). In addition, DB_LOCALE specifies the code-set conversion (if any) between the client application and database server. On the database server, DB_LOCALE specifies how databases are created if this information is not specified by the client application.

encoding
The process of mapping a national character to its numeric representation in a coded-character set.

fixed-width character
Fixed-width character implementations (for example, UNICODE) assign a fixed number of bytes for each character encoded. Fixed-width character implementations are also referred to as wide-character encoding, which typically implies the use of specially defined data types.
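As a rough illustration of a fixed-width encoding, the Python sketch below uses UTF-32, where every character occupies four bytes. UTF-32 is chosen here only because Python ships a codec for it; the entry above does not name a specific encoding form.

```python
text = "Aé日"

encoded = text.encode("utf-32-le")  # little-endian form, no byte-order mark
widths = [len(ch.encode("utf-32-le")) for ch in text]

print(widths)        # the same width for every character
print(len(encoded))  # so total bytes == number of characters * 4
```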

flexible-width character
Flexible-width character implementations support encoding schemes that allow for single-byte characters and multibyte characters to be used simultaneously. The simultaneous use of single-byte and multibyte characters is achieved by establishing a protocol for recognizing the respective code sets supported in the encoded scheme. An example of flexible-width character implementation is Extended UNIX Code (EUC). EUC allows for support of up to four code sets simultaneously. In EUC, the 7-bit ASCII code set is supported with up to three other code sets for interoperability and compatibility. The other three code sets can be encoded as two-, three-, or four-byte code sets.
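The recognition protocol can be observed with Python's `euc_jp` codec (the Japanese variant of EUC, used here as an assumed example). In this sketch the first byte of each encoded character indicates which code set the character comes from, and therefore how many bytes follow.

```python
# One character from each of three code sets that EUC-JP combines:
# ASCII, JIS X0208 kanji, and half-width katakana.
samples = ["A", "日", "ｱ"]

encoded = {ch: ch.encode("euc_jp") for ch in samples}
for ch, data in encoded.items():
    print(ch, len(data), data.hex())
```

The ASCII character is a single byte below 0x80, the kanji is two bytes that both have the high bit set, and the half-width katakana is introduced by the single-shift byte 0x8E.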

Global Language Support
Global Language Support (GLS) is an application environment that allows Informix application-programming interfaces (APIs) and database servers to handle different languages, cultural conventions, and code sets. Products labeled GLS use the GLS Library and GLS locales to facilitate locale-based processing. The GLS library supports text processing and multibyte characters.

GLS
See Global Language Support.

Internationalization (I18n)
Internationalization (I18n) is the process of making Informix products easily adaptable to any culture and language. Among other features, internationalized software provides support for culturally specific sorting and for adaptable date, time, and money formats. An internationalized product does not require recompilation of source code for each cultural environment. The user simply exchanges resource files and sets up the proper operating environment.

ISO8859-1
A code set that contains 256 single-byte characters. Characters 0 through 127 are the ASCII characters. Characters 128 through 255 are mostly characters from European languages, for example, é, ñ and ö.

language supplement
The result of the product localization process. A language supplement for a specific Western European language can be installed with an Informix product to allow the user to see error and warning messages in a language other than English. If installed with DB-Access, the menu names, options and on-line help for that product also appear in the specific language.

local character
A character in the native-language character set.

locale
The environment that defines the behavior of the program at runtime. The rules are usually based on the linguistic customs of the region or the territory. GLS library-based products use GLS locales that Informix defines and preprocesses for customers. Customers are not able to modify Informix locales. The locale can be set through an environment variable that dictates output formats for numbers, currency symbols, dates, and time as well as collation order for character strings and regular expressions.

localization (L10n)
Localization (abbreviated L10n) is the process of adapting an internationalized product to a specific cultural environment. This process usually involves the creation of culturally specific resource files; the selection of message catalogs; the setting of date, time, and money formats; and the translation of the product user interface. It might also include the translation and production of end-user documentation, packaging, and collateral materials.

localization kit
The files and documentation that Informix supplies to a localization center to enable it to localize the following products: C-ISAM, ESQL/C, ESQL/COBOL, OnLine, SE, and Universal Server.

multibyte character
If a language contains more than 256 characters, the code set must contain multibyte characters. A multibyte character might require from two to four bytes of storage. Many Asian languages contain 3,000 to 8,000 ideographic characters. Such languages have code sets made up of both single-byte and multibyte characters (a multibyte code set). Some characters in the Japanese SJIS code set are multibyte characters of two or three bytes. Applications that handle data in a multibyte code set cannot assume that one character requires only one byte of storage.

native language
The computer user's spoken or written language (for example, English, Chinese, or German).

NLS
An acronym for Native Language System (or Native Language Support). Informix limits its use of this term to the support of English and European languages for Informix servers 6.x and higher.

NLS Open, Implicit, Explicit
The Informix 6.0 server and connectivity products introduced the NLS feature. The NLS feature was introduced in such a way that all types of client applications could access NLS data from the database server regardless of whether the client itself was able to process the NLS data according to the rules of a locale or whether the client supported the new NCHAR and NVARCHAR data types introduced by the database server.

To process these new data types, the Informix 6.0 server and connectivity products created three modes by which a client could connect to a database: Open NLS, Implicit NLS and Explicit NLS. These modes are determined by the values of two environment variables (DBNLS and COLLCHAR) sent to the server.

OPEN NLS
This mode is for tools that have not been modified to process NLS data according to the rules of a locale and have not been modified to support the NCHAR and NVARCHAR data types.

In this mode, the database server allows the client to connect to any database, regardless of the locale of the database. When NCHAR and NVARCHAR data is sent from the database server to the client application, the database server converts it to CHAR and VARCHAR data, respectively. When CHAR and VARCHAR data is sent from the client to the database server, the database server converts it to NCHAR and NVARCHAR data, respectively. This mode is enabled when the client sets DBNLS=2 and COLLCHAR=1.

IMPLICIT NLS
This mode is for tools that have been modified to process NLS data according to the rules of a locale but have not been modified to support the NCHAR and NVARCHAR data types.

In this mode, the locale used by the client must equal the locale used by the database to which the client is trying to connect. When NCHAR and NVARCHAR data is sent from the database server to the client application, the server converts it to CHAR and VARCHAR data, respectively. When CHAR and VARCHAR data is sent from the client to the database server, the database server converts it to NCHAR and NVARCHAR data, respectively. This mode is enabled when the client sets DBNLS=1 and COLLCHAR=1.

EXPLICIT NLS
This mode is for tools that have been modified to process NLS data according to the rules of a locale and have been modified to support the NCHAR and NVARCHAR data types.

In this mode, the locale that the client uses must equal the locale used by the database to which the client is trying to connect. This mode is enabled when the client sets DBNLS=1 and COLLCHAR=0.

The following environment variable settings are correct, but they do not allow access to any NLS features: DBNLS=0 and COLLCHAR=0. If you do not set these variables, the default is 0.

The following combinations of environment variable settings are not valid: DBNLS=0 with COLLCHAR=1, and DBNLS=2 with COLLCHAR=0.
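The mode rules in this entry can be summarized as a small lookup table. The Python sketch below is only a restatement of those rules for reference; the names `NLS_MODES` and `nls_mode` are hypothetical and are not part of any Informix interface.

```python
from typing import Optional

# (DBNLS, COLLCHAR) -> connection mode, as described in this entry.
NLS_MODES = {
    (2, 1): "OPEN NLS",
    (1, 1): "IMPLICIT NLS",
    (1, 0): "EXPLICIT NLS",
    (0, 0): None,  # valid settings, but no NLS features are available
}

def nls_mode(dbnls: int = 0, collchar: int = 0) -> Optional[str]:
    """Return the NLS mode for a pair of settings; both default to 0 when unset."""
    key = (dbnls, collchar)
    if key not in NLS_MODES:
        raise ValueError(f"DBNLS={dbnls} with COLLCHAR={collchar} is not valid")
    return NLS_MODES[key]

print(nls_mode(2, 1))
print(nls_mode(1, 0))
print(nls_mode())
```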

partial character
A multibyte character that has lost one or more bytes so that the intended meaning of the character is lost. GLS software provides context-specific solutions that prevent partial characters from being generated during string-processing operations.
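The Python sketch below shows how a partial character can arise when a byte string is truncated at a fixed length, and one simple way to avoid it. This is not how GLS software handles the problem; the helper `truncate_on_character_boundary` is a hypothetical function written for this illustration, and the `shift_jis` codec is an assumed example code set.

```python
def truncate_on_character_boundary(text: str, max_bytes: int,
                                   codec: str = "shift_jis") -> bytes:
    """Encode text, keeping only whole characters that fit in max_bytes."""
    out = bytearray()
    for ch in text:
        encoded = ch.encode(codec)
        if len(out) + len(encoded) > max_bytes:
            break  # the next character would not fit completely
        out += encoded
    return bytes(out)

text = "日本"  # two characters, two bytes each in shift_jis

naive = text.encode("shift_jis")[:3]            # cuts the second character in half
safe = truncate_on_character_boundary(text, 3)  # keeps only the first character

try:
    naive.decode("shift_jis")
    naive_is_valid = True
except UnicodeDecodeError:
    naive_is_valid = False  # the trailing byte is a partial character

print(naive_is_valid)
print(safe.decode("shift_jis"))
```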

server locale
The locale with which the server performs input and output. For example, files are written and read with respect to the locale that the SERVER_LOCALE environment variable specifies.

server processing locale
The environment that the database server dynamically determines based on the client locales and information that is stored in the database being accessed.

single-byte character
The number of unique characters in the language determines the amount of storage that each code-set character requires. Because a single byte can store values in the range of 0 to 255, it can uniquely identify 256 characters. Most Western languages have fewer than 256 characters, so their code sets consist of single-byte characters. When an application handles data in such code sets, it can assume that one character is always stored in one byte.

UNICODE
A consortium of computer vendors defined UNICODE to create a unified worldwide code set. UNICODE was integrated as part of the ISO international standard 10646-1 and is emerging as the code-set choice for Microsoft operating systems. Informix supports UNICODE and continues to support the diverse prevailing code sets for compatibility reasons and for the advanced processing requirements for three-byte and four-byte encoding schemes.

white space
White space is a series of one or more space characters. The GLS locale defines the characters that are considered to be space characters. For example, both the TAB and blank might be defined as space characters in one locale, but certain combinations of the CTRL key and another character might be defined as space characters in a different locale file.

wide character
A wide-character form of a code set involves normalizing the size of each multibyte character so that each character is the same size. This size must be equal to or greater than the largest character that an operating system can support, and it must match the size of an integer data type that the C compiler can scale. Some examples of an integer data type that the C compiler can scale are short integer (short int), integer (int), or long integer (long int).
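The C wide-character type can be inspected from Python through the standard `ctypes` module, which is used here only as a convenient way to look at the platform's `wchar_t`. The size reported depends on the platform and compiler, so the sketch prints it instead of assuming a value.

```python
import ctypes

# Size in bytes of the platform's C wchar_t type.
wchar_size = ctypes.sizeof(ctypes.c_wchar)

# A buffer of wide characters: two characters plus the terminating null.
buf = ctypes.create_unicode_buffer("A日")

print(wchar_size)
print(len(buf), ctypes.sizeof(buf))
```

Every element of the buffer has the same size, so the total size is the element count multiplied by the size of one wide character.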




Informix Guide to GLS Functionality, version 9.1
Copyright © 1998, Informix Software, Inc. All rights reserved.