Home | Previous Page | Next Page   SQL Features > Using Character Data Types >

Localized Collation of Character Data

The choice of locale can affect the collating order of NCHAR and NVARCHAR character data types, as the following sections describe.

The NCHAR Data Type

The NCHAR data type stores character data in a fixed-length field as a string of single-byte or multibyte letters, numbers, and other characters that are supported by the code set of your database locale. The syntax of the NCHAR data type is as follows.

Read syntax diagramSkip visual syntax diagram                         (1)          .-(--1--)----.
>>-| Data Type segment |-------NCHAR--+-(--size--)-+-----------><
 

Notes:
  1. See IBM Informix: Guide to SQL Syntax.
Element
Purpose
size
Specifies the number of bytes in the column. The total length of an NCHAR column cannot exceed 32,767 bytes. If you do not specify size, the default is NCHAR(1).

Because the length of this column is fixed, when the database server retrieves or sends an NCHAR value, it transfers exactly size bytes of data. If the length of a character string is shorter than size, the database server extends the string with spaces to make up the size bytes. If the string is longer than size bytes, the database server truncates the string.

Collating NCHAR Data

NCHAR is a locale-sensitive data type. The only difference between NCHAR and CHAR data types is the collation order. The database server sorts data in NCHAR columns in localized order, if the locale defines a localized order. (For most operations, the database server collates data in CHAR columns in code-set order, even if the locale defines a localized collation.)

Tip:
The default locale (U.S. English) does not specify a localized order. Therefore, the database server sorts NCHAR data in code-set order for this locale. When you use the default locale, there is no difference between CHAR and NCHAR data.
Handling NCHAR Data

A client application manipulate NCHAR data using the CLIENT_LOCALE setting of the client system. The client application performs code-set conversion of NCHAR data automatically if CLIENT_LOCALE differs from DB_LOCALE, and DBNLS is set to 1.

Multibyte Characters with NCHAR

To store multibyte character data in an NCHAR column, your database locale must support a code set that includes the same multibyte characters. When you store multibyte characters, make sure to calculate the number of bytes that are needed. The size parameter of the NCHAR data type refers to the number of bytes of storage that is reserved for the data, rather than to the number of logical characters.

Because one multibyte character requires several bytes for storage, the value of size bytes does not indicate the number of characters that the column can hold. The total number of multibyte characters that you can store in the column is less than the total number of bytes that you can store in the column. Make sure to declare the size value of the NCHAR column in such a way that it can hold enough characters for your purposes.

Treating NCHAR Values as Numeric Values

If you plan to perform calculations on numbers that are stored in a column, assign a numeric data type (such as INTEGER or FLOAT) to that column. The description of the CHAR data type in the IBM Informix: Guide to SQL Reference provides detailed reasons why you should not store certain numeric values in CHAR values. The same reasons apply for certain numeric values as NCHAR values. Treat only numbers that have leading zeros (such as postal codes) as NCHAR data types. Use NCHAR only if you need to sort the numeric values in localized order.

Nonprintable Characters with NCHAR

An NCHAR value can include tabs, spaces, and other whitespace and nonprintable characters. Nonprintable NCHAR and CHAR values are entered, displayed, and treated similarly.

The NVARCHAR Data Type

The NVARCHAR data type stores character data in a variable-length field. Data can be a string of single-byte or multibyte letters, numbers, and other characters that are supported by the code set of your database locale.

The syntax of the NVARCHAR data type is as follows:

Read syntax diagramSkip visual syntax diagram                         (1)
>>-| Data Type segment |---------------------------------------->
 
                     .-,--0-------.
>--NVARCHAR--(--max--+-,--reserve-+--)-------------------------><
 

Notes:
  1. See IBM Informix: Guide to SQL Syntax.
Element
Purpose
max
Specifies the maximum number of bytes that can be stored in the column.
reserve
Specifies the minimum number of bytes that can be stored in the column.

You must specify max of the NVARCHAR column. The size of this parameter cannot exceed 255 bytes.

When you place an index on an NVARCHAR column, the maximum size is 254 bytes. You can store shorter, but not longer, character strings than the value that you specify.

Specify the reserve parameter when you initially intend to insert rows with data values having few or no characters in this column but later expect the data to be updated with longer values. This value can range from 0 to 255 bytes but must be less than the max size of the NVARCHAR column. If you do not specify a minimum space value, the default value of reserve is 0.

Although use of NVARCHAR economizes on space that is used in a table, it has no effect on the size of an index. In an index that is based on an NVARCHAR column, each index key has a length equal to max bytes, the maximum size of the column.

The database server does not strip an NVARCHAR object of any user-entered trailing whitespace, nor does it pad the NVARCHAR object to the full length of the column. However, if you specify a minimum reserved space (reserve), and some of the data values are shorter than that amount, some of the space that is reserved for rows goes unused.

Collating NVARCHAR Data

The NVARCHAR data type is a locale-sensitive data type. The only difference between NVARCHAR and VARCHAR data types is the collation order. The database server collates data in NVARCHAR columns in localized order, if the locale defines a localized order. For most operations, the database server collates data in CHAR columns in code-set order.

Tip:
The default locale (U.S. English) does not specify a localized order. Therefore, the database server sorts NVARCHAR data in code-set order. When you use the default locale, there is no difference between VARCHAR and NVARCHAR data.
Handling NVARCHAR Data

Within a client application, always manipulate NVARCHAR data in the CLIENT_LOCALE of the client application. The client application performs code-set conversion of NVARCHAR data automatically if CLIENT_LOCALE differs from DB_LOCALE. (For information about code-set conversion, see Performing Code-Set Conversion.)

Multibyte Characters with NVARCHAR

To store multibyte character data in an NVARCHAR column, your database locale must support a code set with these same multibyte characters. When you store multibyte characters, make sure to calculate the number of bytes that are needed. The max parameter of the NVARCHAR data type refers to the maximum number of bytes that the column can store.

Because one multibyte character uses several bytes for storage, the value of max bytes does not indicate the number of logical characters that the column can hold. The total number of multibyte characters that you can store in the column is less than the total number of bytes that the column can store. Make sure to declare the max value of the NVARCHAR column so that it can hold enough multibyte characters for your purposes.

Nonprintable Characters with NVARCHAR

An NVARCHAR value can include tabs, spaces, and nonprintable characters. Nonprintable NVARCHAR characters are entered, displayed, and treated in the same way as nonprintable VARCHAR characters.

Tip:
The database server interprets the null character (ASCII 0) as a C null terminator. In NVARCHAR data, the null terminator acts as a string-terminator character.
Storing Numeric Values in an NVARCHAR Column

The database server does not pad a numeric value in a NVARCHAR column with trailing blanks up to the maximum length of the column. The number of digits in a numeric NVARCHAR value is the number of characters that you need to store that value. For example, the database server stores a value of 1 in the mytab table when it executes the following SQL statements:

CREATE TABLE mytab (col1 NVARCHAR(10));
INSERT INTO mytab VALUES (1);

Performance Considerations

The NCHAR data type is similar to the CHAR data type, and NVARCHAR is similar to the VARCHAR data type. These data types differ in two ways:

Localized collation depends on the sorting rules that the locale defines, not simply on the computer representation of the character (the code points). This difference means that the database server might perform complex processing to compare and collate NCHAR and NVARCHAR data. Therefore, access to NCHAR data might be slower with respect to comparison and collation than to access CHAR data. Similarly, access to data in an NVARCHAR column might be slower with respect to comparison and collation than access to the same data in a VARCHAR column.

Assess whether your character data needs to take advantage of localized order for collation and comparison. If code-set order is adequate, use the CHAR, LVARCHAR, and VARCHAR data types.

Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]