Informix Guide to GLS Functionality
Database Server Features

Locale Support For C User-Defined Routines

Dynamic Server allows you to create user-defined routines (UDRs) that are written in the C programming language. These C UDRs use the DataBlade API to communicate with the database server. For a complete description of the DataBlade API, see the DataBlade API Programmer's Manual. This section describes how to internationalize a C UDR.

Internationalization is the process of creating a user-defined routine (UDR) that can support different languages, territories, and code sets without changing or recompiling its code. For a complete discussion of internationalization, see the Informix GLS Programmer's Manual. An internationalized C UDR must handle the GLS considerations that the following topics describe.

Current Processing Locale for UDRs

To access a database, a client application first requests a connection to the database server. The database server must verify that it can access the specified database and establish the connection between the client and this database. In the process, the database server establishes the server-processing locale to use for the duration of the connection. When the client application executes a UDR, this UDR executes on the server computer in the context of the server-processing locale. This locale is often called the current processing locale.

Many user-defined routines handle non-ASCII data correctly even if they were originally written for ASCII data. However, some routines might perform abnormally. To internationalize your C UDR, you must ensure that your UDR handles the server-processing locale in any GLS-related operations. If the UDR does not properly support the server-processing locale, the routine might return an error message.

Non-ASCII Characters in Source Code

Non-ASCII characters might appear in the following places within a C-language UDR source file:

In C-Language Statements

The C compiler must recognize the code set that you use in your C-language statements. The capabilities of your C compiler might limit your ability to use non-ASCII characters within the C-language statements in a UDR source file. For example, some C-language compilers support multibyte characters in literals or comments only.

If the C compiler does not fully support non-ASCII characters, it might not successfully compile a UDR that contains these characters. In particular, the following situations might affect compilation of your UDR:

In SQL Statements

In C UDRs, SQL statements occur as literal strings to the mi_exec() and mi_prepare() functions. The C compiler does not parse these literal strings. Therefore, it does not need to recognize the code set of the characters in these SQL statements.

Within a C source file, you can use non-ASCII characters in SQL statements for the following objects:

Copying Character Data

When you copy data, you must ensure that the buffers are an adequate size to hold the data. If the destination buffer is not large enough for the multibyte data in the source buffer, the data might be truncated during the copy. For example, the following C code fragment copies the multibyte data A1A2A3B1B2B3 from buf1 to buf2:

Because buf2 is not large enough to hold the multibyte string, the copy truncates the string to A1A2A3B1B2. To prevent this situation, ensure that the multibyte string fits into a buffer before the DataBlade API module performs the copy.
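The check described above can be sketched in standard C. This is a minimal illustration, not DataBlade API code: the helper name copy_if_fits() is hypothetical, and it assumes that the source data is a NUL-terminated byte string whose length is measured in bytes rather than characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the DataBlade API): copy the
 * NUL-terminated string src into dst only when the whole string,
 * including its terminator, fits. Returns 1 on success and 0 when
 * dst is too small, in which case dst is left unchanged.
 */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t needed;

    assert(dst != NULL && src != NULL);

    needed = strlen(src) + 1;   /* size in bytes, not in characters */
    if (needed > dst_size)
        return 0;               /* refuse rather than split a character */

    memcpy(dst, src, needed);
    return 1;
}
```

Refusing the copy, rather than truncating, avoids leaving a partial multibyte character such as B1B2 at the end of the destination buffer. The caller can then allocate a larger buffer and try again.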

The Informix GLS Library

The Informix GLS library is an application programming interface (API) that lets developers of user-defined routines and DataBlade modules create internationalized applications.

Character Processing with Informix GLS

The macros and functions of Informix GLS provide access within a DataBlade API module to GLS locales, which contain culture-specific information. The Informix GLS library contains functions that provide the following capabilities:

For more information on the Informix GLS library and how to use it in a DataBlade API module, see the Informix GLS Programmer's Manual.

Compatibility of Wide-Character Data Types

Wide-character data types are an alternative form for the processing of multibyte characters. The wide-character form of a code set normalizes the size of each multibyte character so that every character is the same size. A legacy DataBlade API module might use any of the following data types to hold wide characters.

Wide-Character Data Type | Description | Drawback
mi_wchar | A legacy DataBlade API data type currently defined as unsigned short on all systems | The DataBlade API does not provide wide-character functions that operate on mi_wchar values.
wchar_t | An operating-system data type that is platform-specific | The operating system provides wide-character functions that operate on wchar_t values. Use of these functions is platform specific.

The Informix GLS library provides the gl_wchar_t data type for support of wide characters. Informix GLS also provides its own set of wide-character functions that operate on gl_wchar_t. Use of the Informix GLS wide-character functions removes platform dependency from your application and provides access within your DataBlade API module to Informix GLS locales.

The Informix GLS library does not provide any functions for conversion between gl_wchar_t and mi_wchar or between gl_wchar_t and wchar_t. If a DataBlade API module continues to use either mi_wchar or wchar_t and also needs to use the Informix GLS wide-character processing, you must write code to perform any necessary conversions.

Code-Set Conversion and the DataBlade API

Within a UDR, the DataBlade API does not perform any code-set conversion automatically. Your C UDR might need to perform code-set conversion in the following situations:

Character Strings in UDRs

When your C UDR contains character strings that are sent to the database server, it must perform any required code-set conversion on these strings. This code-set conversion must handle any differences between the code set of this character string and the code set of the server-processing locale in which the UDR executes.

For example, the DataBlade API does not perform code-set conversion on the multibyte table name, A1A2A3B1B2, in the following SELECT statement:

If your UDR might execute in a server-processing locale that does not include a code set that supports characters in your SQL statements, the UDR can explicitly perform code-set conversion between the code sets of the server-processing locale and a specified locale. The DataBlade API provides the following functions to assist in this code-set conversion.

Code-Set Conversion on a String | DataBlade API Function
Perform code-set conversion on a specified string from a specified locale to the server-processing locale | mi_convert_from_codeset()
Perform code-set conversion on a specified string from the server-processing locale to a specified locale | mi_convert_to_codeset()

For more information on the syntax of these DataBlade API functions, see the function reference of the DataBlade API Programmer's Manual.
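To make concrete what a code-set conversion does to the bytes of a string, the following standard C sketch converts ISO 8859-1 data to UTF-8. It is an illustration only: latin1_to_utf8() is a hypothetical name, the choice of these two code sets is an assumption made for the example, and a real C UDR would call the DataBlade API functions listed above instead of hand-written code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical illustration, not a DataBlade API function: convert a
 * NUL-terminated ISO 8859-1 string to UTF-8. Returns the number of bytes
 * written (excluding the terminator), or (size_t)-1 if dst is too small.
 */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
    const unsigned char *in = (const unsigned char *)src;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return (size_t)-1;

    for (; *in != '\0'; in++) {
        size_t need = (*in < 0x80) ? 1 : 2;

        if (out + need + 1 > dst_size)
            return (size_t)-1;      /* no room for this character plus NUL */

        if (*in < 0x80) {
            dst[out++] = (char)*in;                     /* ASCII: unchanged */
        } else {
            dst[out++] = (char)(0xC0 | (*in >> 6));     /* lead byte */
            dst[out++] = (char)(0x80 | (*in & 0x3F));   /* trail byte */
        }
    }
    dst[out] = '\0';
    return out;
}
```

Each non-ASCII character grows from one byte to two in this conversion, so the converted string can need a larger buffer than the original. The same concern applies to any conversion between code sets with different character widths.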

Character Strings in Opaque-Type Support Functions

The client application performs code-set conversion of non-opaque-type data that is transferred to and from the client. However, the database server does not know about the internal format of an opaque data type. Therefore, for opaque data types, the support functions are responsible for explicitly converting any string that is not in the code set of the server-processing locale.

You might need to perform code-set conversion in the following opaque-type support functions:

The DataBlade API provides the following functions for code-set conversion in the support functions of an opaque data type.

Code-Set Conversion on an Opaque Type | DataBlade API Function
Perform code-set conversion on a string argument from the code set of the server-processing locale to that of the client locale | mi_put_string()
Perform code-set conversion on a string from the code set of the client locale to that of the server-processing locale | mi_get_string()

For more information on the syntax of these DataBlade API functions, see the function reference in the DataBlade API Programmer's Manual.

Locale-Specific Data Formatting

When a C UDR handles strings that contain end-user formats for date, time, numeric, or monetary data, you must write the UDR so that it handles any locale-specific variants of these formats. The DataBlade API provides functions that convert between the internal representation of several data types and their end-user formats.

The following DataBlade API functions convert an internal database value to a string that uses the locale-specific end-user format.

DataBlade API Function | Description
mi_date_to_string() | Uses the locale-specific end-user date format to convert an internal DATE value to its string equivalent.
mi_money_to_string() | Uses the locale-specific end-user monetary format to convert an internal MONEY value to its string equivalent.
mi_decimal_to_string() | Uses the locale-specific end-user numeric format to convert an internal DECIMAL value to its string equivalent.

Important: The mi_datetime_to_string() and mi_interval_to_string() functions do not format the string in the date and time formats of the current processing locale. Instead, they create a date/time or interval string in a fixed ANSI SQL format.

The following DataBlade API functions interpret a string in its locale-specific end-user format and convert it to its internal database value.

DataBlade API Function | Description
mi_string_to_date() | Converts a string in its locale-specific date end-user format to its internal DATE format.
mi_string_to_money() | Converts a string in its locale-specific currency end-user format to its internal MONEY format.
mi_string_to_decimal() | Converts a string in its locale-specific numeric end-user format to its internal DECIMAL format.

Important: The mi_string_to_datetime() and mi_string_to_interval() functions do not interpret the string in the date and time formats of the current processing locale. Instead, they interpret the date/time or interval string in a fixed ANSI SQL format.
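The difference between a locale-specific format and a fixed format can be illustrated with the standard C strftime() function. This sketch does not use the DataBlade API: both helper names are hypothetical, and the fixed layout shown (year-month-day hour:minute:second) is an assumption chosen for the example rather than a statement of the exact string the DataBlade API functions produce.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper, standard C only: format a broken-down time in a
 * fixed "YYYY-MM-DD HH:MM:SS" layout. The numeric conversions used here
 * give the same result whatever the current C locale is.
 * Returns the number of characters written, or 0 if buf is too small.
 */
size_t format_fixed_datetime(char *buf, size_t size, const struct tm *t)
{
    assert(buf != NULL && t != NULL);
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", t);
}

/*
 * Hypothetical helper: format only the date in the layout preferred by
 * the current C locale ("%x"), so the result changes after setlocale().
 * Returns the number of characters written, or 0 if buf is too small.
 */
size_t format_locale_date(char *buf, size_t size, const struct tm *t)
{
    assert(buf != NULL && t != NULL);
    return strftime(buf, size, "%x", t);
}
```

A routine that parses the output of the first helper can rely on one layout in every locale. A routine that parses the output of the second helper must know which locale produced the string, which is the situation a C UDR faces with end-user date, numeric, and monetary formats.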

Internationalized Exception Messages

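The DataBlade API function mi_db_error_raise() sends an exception message to an exception callback. This message can be either a literal message that you specify in the call or a custom message that is stored in the syserrors system catalog table.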

For general information on how to specify a literal message in mi_db_error_raise() and how to specify a custom message for mi_db_error_raise(), see the chapter on how to handle exceptions and events in the DataBlade API Programmer's Manual.

This section discusses the following tasks about how to raise locale-specific exception messages:

Inserting Custom Exception Messages

You can store custom status codes and their associated messages in the syserrors system catalog table. To create a custom exception message, insert a row directly in the syserrors table. The syserrors table provides the following columns for an internationalized exception message.

Column Name | Description
sqlstate | The SQLSTATE value that is associated with the exception. You can query the syserrors table to determine the current list of SQLSTATE message strings. For more information on how to determine SQLSTATE values, see the DataBlade API Programmer's Manual.
message | The text of the exception message, with characters in the code set of the target locale. By convention, do not include any newline characters in the message.
locale | The locale with which the exception message is to be used. The locale column identifies the language and code set used for the internationalization of error and warning messages. This name is the name of the target locale of the message text.

Tip: For more information on the columns of the syserrors system catalog table, see the chapter on the system catalog tables in the "Informix Guide to SQL: Reference."

Do not allow any code-set conversion to take place when you insert the message text in syserrors. If the code sets of the client and database locales differ, temporarily set both the CLIENT_LOCALE and DB_LOCALE environment variables in the client environment to the name of the database locale. This workaround prevents the client application from performing code-set conversion.

If you specify any parameters in the message text, include only ASCII characters in the parameter names. Following this convention means that the parameter name can be the same for all locales. Most code sets include the ASCII characters.

For example, the following INSERT statements insert new messages in syserrors whose SQLSTATE value is "03I01":

The '03I01' SQLSTATE value now has two locale-specific messages. The database server chooses the appropriate message based on the server-processing locale of the UDR when it executes. For more information on how mi_db_error_raise() locates an exception message, see Searching for Custom Messages.

For a complete description of how to add custom messages to the syserrors system catalog table, see the DataBlade API Programmer's Manual.

Searching for Custom Messages

When the mi_db_error_raise() function initiates a search of the syserrors system catalog table, it requests the message in which all components of the locale (language, territory, code set, and optional modifier) are the same in the current processing locale and the locale column of syserrors.

For C UDRs that use the default locale, the current processing locale is U.S. English (en_us). When the current processing locale is U.S. English, mi_db_error_raise() looks only for messages that use the U.S. English locale. However, for C UDRs that use nondefault locales, the current processing locale is the server-processing locale.

For a description of how mi_db_error_raise() searches for messages in the syserrors system catalog table, see the chapter on exceptions in the DataBlade API Programmer's Manual.

Specifying Parameter Markers

The custom message in the syserrors system catalog table can contain parameter markers. These parameter markers are sequences of characters enclosed by a single percent sign on each end (for example, %TOKEN%). A parameter marker is treated as a variable for which the mi_db_error_raise() function can supply a value. The mi_db_error_raise() function assumes that any message text or message parameter strings that you supply are in the server-processing locale.
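The substitution of a value for a parameter marker can be modeled in standard C. The following sketch is not the database server's implementation: the function name expand_marker() and the sample message text are hypothetical, and the sketch assumes that the marker name contains only ASCII characters while the value is copied byte for byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of parameter-marker substitution: copy msg into out,
 * replacing every occurrence of %name% with value. Returns 1 on success,
 * or 0 if out is too small to hold the expanded text and its terminator.
 */
int expand_marker(char *out, size_t out_size, const char *msg,
                  const char *name, const char *value)
{
    size_t name_len, value_len, used = 0;

    assert(out != NULL && msg != NULL && name != NULL && value != NULL);
    assert(name[0] != '\0');

    name_len = strlen(name);
    value_len = strlen(value);

    while (*msg != '\0') {
        /* A marker is '%', then the name, then a closing '%'. */
        if (msg[0] == '%' && strncmp(msg + 1, name, name_len) == 0 &&
            msg[1 + name_len] == '%') {
            if (used + value_len >= out_size)
                return 0;
            memcpy(out + used, value, value_len);   /* bytes copied as-is */
            used += value_len;
            msg += name_len + 2;
        } else {
            if (used + 1 >= out_size)
                return 0;
            out[used++] = *msg++;
        }
    }
    if (used >= out_size)
        return 0;
    out[used] = '\0';
    return 1;
}
```

Because the value is copied without conversion, the sketch mirrors the requirement that parameter strings already be in the code set of the message text. Keeping the marker name in ASCII lets the same name match in every locale's version of the message.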

For a complete description of how to specify parameter markers for a custom message, see the DataBlade API Programmer's Manual.

Internationalized Tracing Messages

The DataBlade API supports trace messages that correspond to a particular locale. The current database locale determines which code set the trace message uses. Based on the current database locale, a given tracepoint can produce an internationalized trace message. Internationalized tracing enables you to develop and test the same code in many different locales.

To provide internationalized tracing support, the DataBlade API provides the following capabilities:

Inserting Messages in the systracemsgs System Catalog Table

The systracemsgs system catalog table stores internationalized trace messages that you can use to debug your C UDRs. To create an internationalized trace message, insert a row directly into the systracemsgs table. The systracemsgs table provides the following information about an internationalized trace message.

Column Name | Description
name | The name of the trace message
locale | The locale with which the trace message is to be used
message | The text of the trace message

The combination of message name and locale must be unique within the table. Once you insert a new trace message into systracemsgs, the database server assigns it a unique identifier, called a trace-message identifier. It stores the trace-message identifier in the msgid column of systracemsgs. Once a trace message exists in the systracemsgs table, you can specify the message either by name or by trace-message identifier to DataBlade API tracing functions.

The trace-message text can be a string of text in the appropriate language and code set for the locale, and it can contain tokens to indicate where to substitute a piece of text. Token names are set off by a single percent (%) symbol on each end.

The following INSERT statement puts a new message called qp1_exit in the systracemsgs table:

This message text is in English and therefore the systracemsgs row specifies the default locale of U.S. English.

This second message is the French version of the qp1_exit message and therefore the systracemsgs row specifies the French locale on a UNIX system (fr_fr.8859-1):

Enter message text in the language of the server locale, with any characters available in the server code set. To insert a variable, enclose the variable name with a single percent sign on each end (for example, %a%). When the database server prepares the trace message for output, it replaces each variable with its actual value.

Putting Internationalized Trace Messages into Code

The DataBlade API provides the following tracing functions to insert internationalized tracepoints into UDR code: the GL_DPRINTF macro and the gl_tprintf() function.

Syntax elements for both GL_DPRINTF and gl_tprintf() have the following values:

trace_class is either a trace-class name or the trace-class identifier integer value expressed as a character string.
threshold is a nonnegative integer that sets the tracepoint threshold for execution.
message_name is the identifier for an internationalized message stored in the systracemsgs system catalog table of the database.
toktype is a string made up of a token name followed by a single percent (%) symbol followed by a single character output specifier as used in printf formats.
val is a value expression to be output that must match the type of the output specifier in the preceding token.
MI_LIST_END is a macro constant that ends the variable-length list.

Important: The MI_LIST_END constant marks the end of the variable-length list. If you do not include MI_LIST_END, the user-defined routine might fail.
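The role of the list terminator can be shown with a standard C variadic function. This is a sketch under stated assumptions, not the DataBlade API tracing code: LIST_END and list_tokens() are hypothetical stand-ins, the actual value of MI_LIST_END is not assumed, and only the d and s output specifiers are handled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sentinel that plays the role MI_LIST_END plays in the text. */
#define LIST_END ((char *)0)

/*
 * Hypothetical stand-in for a trace call: walks (toktype, val) pairs until
 * LIST_END and appends "name=value " to out for each pair. Handles only
 * the 'd' (int) and 's' (char *) output specifiers. Returns the number of
 * pairs consumed, or -1 on a malformed toktype, an unsupported specifier,
 * or a full buffer.
 */
int list_tokens(char *out, size_t out_size, ...)
{
    va_list ap;
    const char *toktype;
    size_t used = 0;
    int pairs = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    va_start(ap, out_size);
    while ((toktype = va_arg(ap, char *)) != LIST_END) {
        const char *pct = strchr(toktype, '%');
        int name_len, n;

        /* Expect: token name, one '%', one specifier character. */
        if (pct == NULL || pct == toktype || pct[1] == '\0' || pct[2] != '\0') {
            pairs = -1;
            break;
        }
        name_len = (int)(pct - toktype);

        /* The specifier tells the callee the C type of the next argument. */
        if (pct[1] == 'd') {
            n = snprintf(out + used, out_size - used, "%.*s=%d ",
                         name_len, toktype, va_arg(ap, int));
        } else if (pct[1] == 's') {
            n = snprintf(out + used, out_size - used, "%.*s=%s ",
                         name_len, toktype, va_arg(ap, char *));
        } else {
            pairs = -1;
            break;
        }
        if (n < 0 || (size_t)n >= out_size - used) {
            pairs = -1;
            break;
        }
        used += (size_t)n;
        pairs++;
    }
    va_end(ap);
    return pairs;
}
```

A call such as list_tokens(buf, sizeof buf, "i%d", 7, "name%s", "abc", LIST_END) stops cleanly at the sentinel. Without the sentinel, the loop would keep reading arguments that were never passed, which is undefined behavior in C and is consistent with the warning that a routine might fail when the terminator is omitted.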

The following example shows an internationalized trace statement that uses the GL_DPRINTF macro:

If the current locale is the default locale of U.S. English and the current trace level of the funcEntry class is greater than or equal to 20, this tracepoint generates the following trace message:

The following example shows an internationalized trace block that uses the gl_tprintf() function:

If the current locale is French and the current trace level of the funcEntry class is greater than or equal to 25, this tracepoint generates the following trace message:

The database server writes the trace messages in the trace-output file in the code set of the locale associated with the message. If the trace message originated from the systracemsgs system catalog table, its characters are in the code set of the locale specified in the locale column of its systracemsgs entry. The database server might have performed code-set conversion on these trace messages if the code set in the UDR source is different from (but compatible with) the code set of the server-processing locale.

Searching for Trace Messages

To write an internationalized trace message to your trace-output file, the database server must locate a row in the systracemsgs system catalog table whose locale column matches (or is compatible with) the server-processing locale for your UDR. Therefore, to see a particular trace message in the trace-output file, your locale environment variables (CLIENT_LOCALE, DB_LOCALE, and SERVER_LOCALE) must be set so that the database server generates a server-processing locale that matches an entry in the systracemsgs table.

The database server searches the systracemsgs table for an entry with the same name as the tracepoint and a locale in which all components of the locale (language, territory, and code set) are the same in the current processing locale and the locale column of systracemsgs. If only the language and territory match, the database server converts the code set. If no message has matching language and territory, it uses the first available message with the correct language. If there is no message in the appropriate language, it uses the message for the default language, en_us.
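The search order described above can be modeled as a ranking over an in-memory table. The following standard C sketch is not the database server's implementation: the structure and function names are hypothetical, locale names are assumed to have the form language_territory.codeset (as in fr_fr.8859-1), and any message text supplied by a caller is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of one systracemsgs row. */
struct trace_msg {
    const char *name;
    const char *locale;     /* assumed form: language_territory.codeset */
    const char *message;
};

/* Length of the prefix of s that precedes stop (or all of s if absent). */
static size_t span_to(const char *s, char stop)
{
    const char *p = strchr(s, stop);
    return p ? (size_t)(p - s) : strlen(s);
}

/* 3 = all components match, 2 = language and territory match,
 * 1 = language matches, 0 = no match. */
static int match_rank(const char *stored, const char *current)
{
    size_t lang_s = span_to(stored, '_'), lang_c = span_to(current, '_');
    size_t lt_s = span_to(stored, '.'), lt_c = span_to(current, '.');

    if (strcmp(stored, current) == 0)
        return 3;
    if (lt_s == lt_c && strncmp(stored, current, lt_s) == 0)
        return 2;           /* the code set would need conversion */
    if (lang_s == lang_c && strncmp(stored, current, lang_s) == 0)
        return 1;
    return 0;
}

/* Return the best row for name in the current locale, falling back to the
 * first en_us row, or NULL if no row with that name qualifies. */
const struct trace_msg *find_trace_msg(const struct trace_msg *tab, size_t n,
                                       const char *name, const char *current)
{
    const struct trace_msg *best = NULL, *fallback = NULL;
    int best_rank = 0;
    size_t i;

    assert(tab != NULL && name != NULL && current != NULL);

    for (i = 0; i < n; i++) {
        int rank;

        if (strcmp(tab[i].name, name) != 0)
            continue;
        rank = match_rank(tab[i].locale, current);
        if (rank > best_rank) {     /* the first row wins among equal ranks */
            best = &tab[i];
            best_rank = rank;
        }
        if (fallback == NULL && match_rank(tab[i].locale, "en_us") >= 2)
            fallback = &tab[i];
    }
    return best != NULL ? best : fallback;
}
```

Ranking every candidate in one pass keeps the four rules in a single place: an exact match outranks a language and territory match, which outranks a language-only match, and the en_us row is used only when nothing else qualifies.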

Locale-Sensitive Data in an Opaque Data Type

When you create an opaque data type, you must write the support functions and SQL functions of the opaque type so that they handle locale-sensitive data. An opaque data type is fully encapsulated; its internal structure is not known to the database server. Therefore, the database server cannot automatically perform the locale-specific tasks such as code-set conversion on character data or locale-specific formatting of date, numeric, or monetary data.

When you create an opaque data type, you must write the support functions of the opaque type so that they handle any locale-sensitive data. In particular, consider how to handle any locale-sensitive data when you write the following support functions:

The DataBlade API and Informix GLS provide GLS support for opaque-type support functions written in C. The following sections summarize GLS considerations for these support functions. For general information on the support functions of an opaque data type, see Extending Informix Dynamic Server 2000.

Internationalized Input and Output Support Functions

The internal representation of an opaque data type is the C structure that stores the opaque-type information. Each opaque type also has a character-based format, known as its external representation. This external representation is received by the database server as an LVARCHAR value. The LVARCHAR data type can hold single-byte (ASCII and non-ASCII) and multibyte character data, depending on the locale of the client application.

Client applications perform code-set conversion on LVARCHAR data. However, the ability to transfer the data between a client application and database server is not sufficient to support locale-sensitive data in opaque data types. It does not ensure that the data is correctly manipulated at its destination. The input and output support functions convert the opaque data type from its internal to an external representation, and vice versa, as follows:

When you write these opaque-type support functions as C UDRs, you must ensure that these functions correctly handle any locale-sensitive data, including the following tasks.

Locale-Sensitive Task | For More Information
Any code-set conversion on character data | Code-Set Conversion and the DataBlade API
Any handling of multibyte or wide characters in character data | The Informix GLS Library
Any formatting of locale-specific date, numeric, or monetary data | Locale-Specific Data Formatting
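The conversion between an internal structure and its external text representation can be sketched in standard C. This example is illustrative only: the point_t type, its "(x,y)" external format, and both function names are hypothetical, and a real input or output support function would exchange the text with the database server as an LVARCHAR value through the DataBlade API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical internal representation of an opaque type. */
typedef struct {
    double x;
    double y;
} point_t;

/*
 * Input direction: external text "(x,y)" to internal structure.
 * Returns 1 on success, 0 if the text is not in the expected format.
 */
int point_input(const char *text, point_t *out)
{
    char close = '\0';

    assert(text != NULL && out != NULL);
    return sscanf(text, " ( %lf , %lf %c", &out->x, &out->y, &close) == 3
           && close == ')';
}

/*
 * Output direction: internal structure to external text.
 * Returns 1 on success, 0 if buf is too small.
 */
int point_output(const point_t *p, char *buf, size_t size)
{
    int n;

    assert(p != NULL && buf != NULL);
    n = snprintf(buf, size, "(%g,%g)", p->x, p->y);
    return n >= 0 && (size_t)n < size;
}
```

Both sscanf() and snprintf() use the decimal-point character of the current C locale. If that character were a comma, the comma-separated external format shown here would become ambiguous, so a support function must decide explicitly which numeric format its external representation uses.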

Internationalized Send and Receive Support Functions

The send and receive functions support binary transfer of opaque data types. That is, they convert the opaque data type from its internal representation on the client computer to its internal representation on the server computer (where it is stored), as follows:

If the internal representation of an opaque type contains character data, the client application cannot perform any locale-specific translations, including the following ones.

Locale-Sensitive Task | For More Information
Any code-set conversion on character data | Character Strings in Opaque-Type Support Functions
Any handling of multibyte or wide characters in character data | The Informix GLS Library

Therefore, when you write the receive and send support functions as C UDRs, you must ensure that these functions handle these locale-sensitive tasks correctly.


Informix Guide to GLS Functionality, Version 9.2
Copyright © 1999, Informix Software, Inc. All rights reserved