Home | Previous Page | Next Page   SQL Features > Handling Character Data >

Using SQL Length Functions

You can use SQL length functions in the SELECT statement and other data manipulation statements. Length functions return the length of a column, string, or variable in bytes or characters.

The choice of locale affects the following three SQL length functions:

For the syntax of these functions, see the Expression segment in the IBM Informix: Guide to SQL Syntax.

The LENGTH Function

The LENGTH function returns the number of bytes of data in character data. However, the behavior of the LENGTH function varies with the type of argument that the user specifies. The argument can be a quoted string, a character-type column other than the TEXT data type, a TEXT column, a host variable, or an SPL routine variable.

The following table shows how the LENGTH function operates on each of these argument types. The Example column in this table uses the symbol s to represent a single-byte trailing whitespace character.

This table assumes that all arguments consist of single-byte characters.

LENGTH Argument Behavior Example
Quoted string Returns number of bytes in string, minus any trailing whitespace (as defined in the locale). If the string is 'Ludwig', the result is 6. If the string is 'Ludwigssss', the result is still 6.
CHAR, VARCHAR, LVARCHAR, NCHAR, or NVARCHAR column Returns number of bytes in a column, minus any trailing white- space characters, regardless of defined length of the column. If the fname column of the customer table is a CHAR(15) column, and this column contains the string 'Ludwig', the result is 6. If the fname column contains the string 'Ludwigssss', the result is still 6.
TEXT column Returns number of bytes in a column, including trailing white-space characters. If the cat_descr column in the catalog table is a TEXT column, and this column contains the string 'Ludwig', the result is 6. If the cat_descr column contains the string 'Ludwigssss', the result is 10.
Host or procedure variable Returns number of bytes that the variable contains, minus any trailing white pace, regardless of defined length of the variable. If the procedure variable f_name is defined as CHAR(15), and this variable contains the string 'Ludwig', the result is 6. If the f_name variable contains the string 'Ludwigssss', the result is still 6.

When you use the default locale or any locale with a single-byte code set, the LENGTH function seems to return the number of characters in the column. In the following example, the stores_demo database, which contains the customer table, uses the default code set for the U.S. English locale. Suppose a user enters a SELECT statement with the LENGTH function to display the last name, length of the last name, and customer number for rows where the customer number is less than 106.

SELECT lname AS cust_name, 
   length (fname) AS length, customer_num AS cust_num
   FROM customer WHERE customer_num < 106

The following example of output shows the result of the query. For each row that is retrieved, the length column seems to show the number of characters in the lname (cust_name) column. However, the length column actually displays the number of bytes in the lname column.

In the default code set, one byte stores one character. For more information about the default code set, see The Default Locale.

cust_name length cust_num
Ludwig 6 101
Carole 6 102
Philip 6 103
Anthony 7 104
Raymond 7 105

When you use the LENGTH function in a locale that supports a multibyte code set, such as the Japanese SJIS code set, the distinction between characters and bytes is meaningful. LENGTH returns the number of bytes in its argument. This result might be different from the number of characters.

The next example assumes that the database that contains the customer_multi table has locale with a multibyte code set. Suppose that the user enters a SELECT statement with the LENGTH function to display lname, its length, and customer_num for the customer whose number is 199.

SELECT lname AS cust_name, 
   length (fname) AS length, customer_num AS cust_num
   FROM customer_multi WHERE customer_num = 199

Suppose that lname for customer 199 consists of four characters:

aA1A2bB1B2

In this representation, the first character (the symbol a) is a single-byte character. The second character (the symbol A1A2) is a 2-byte character. The third character (the symbol b) is a single-byte character. The fourth character (the symbol B1B2) is a 2-byte character.

The following example of output shows the result of the query. Although the customer first name consists of 4 characters, the length column shows that the total number of bytes in this name is 6.

cust_name length cust_num
aA1A2bB1B2 6 199

The OCTET_LENGTH Function

The OCTET_LENGTH function returns the number of bytes and generally includes trailing whitespace characters in the byte count. This SQL length function uses the definition of whitespace that the locale defines. OCTET_LENGTH returns the number of bytes in a character column, quoted string, host variable, or SPL variable. The actual behavior of OCTET_LENGTH varies with the type of argument that the user specifies.

The following table shows how the OCTET_LENGTH function operates on each of the argument types. The Example column in this table uses the symbol s to represent a single-byte trailing whitespace character. For simplicity, the Example column also assumes that the example strings consist of single-byte characters.

OCTET_LENGTH Argument Behavior Example
Quoted string Returns number of bytes in string, including any trailing white- space characters. If the string is 'Ludwig', the result is 6. If the string is 'Ludwigssss', the result is 10.
CHAR or
NCHAR column
Returns number of bytes in string, including trailing whitespace characters. This value is the defined length, in bytes, of the column. If the fname column of the customer table is a CHAR(15) column, and this column contains the string 'Ludwig', the result is 15. If the fname column contains the string 'Ludwigsss', the result is still 15.
VARCHAR or
NVARCHAR column
Returns number of bytes in string, including trailing whitespace. Value is the actual length, in bytes, of the character string, not the declared maximum column size. If the cat_advert column of the catalog table is a VARCHAR(255, 65) column, and this column contains the string "Ludwig", the result is 6. If the column contains the string 'Ludwigssss', the result is 10.
TEXT column Returns number of bytes in column, including trailing white- space characters. If the cat_descr column in the catalog table is a TEXT column, and this column contains the string 'Ludwig', the result is 6. If the cat_descr column contains the string 'Ludwigssss', the result is 10.
Host or procedure variable Returns number of bytes that the variable contains, including any trailing whitespace, regardless of defined length of variable. If the procedure variable f_name is defined as CHAR(15), and this variable contains the string 'Ludwig', the result is 6. If the f_name variable contains the string 'Ludwigssss', the result is 10.

The difference between the LENGTH and OCTET_LENGTH functions is that OCTET_LENGTH generally includes trailing whitespace in the byte count, whereas LENGTH generally excludes trailing whitespace from the byte count.

The advantage of the OCTET_LENGTH function over the LENGTH function is that the OCTET_LENGTH function provides the actual column size whereas the LENGTH function trims the column values and returns the length of the trimmed string. This advantage of the OCTET_LENGTH function applies both to single-byte code sets such as ISO8859-1 and multibyte code sets such as the Japanese SJIS code set.

The following table shows some results that the OCTET_LENGTH function might generate.

OCTET_LENGTH Input String Description Result
'abc ' A quoted string with four single-byte characters (the characters abc and one trailing space) 4
'A1A2B1B2' A quoted string with two multibyte characters 4
'aA1A2bB1B2' A quoted string with two single-byte and two multibyte characters 6

The CHAR_LENGTH Function

The CHAR_LENGTH function (also known as the CHARACTER_LENGTH function) returns the number of characters in a quoted string, column with a character data type, host variable, or procedure variable. However, the actual behavior of this function varies with the type of argument that the user specifies.

The following table shows how the CHAR_LENGTH function operates on each of the argument types. The Example column in this table uses the symbol s to represent a single-byte trailing whitespace. For simplicity, the Example column assumes that the strings consist of single-byte characters.

CHAR_LENGTH Argument Behavior Example
Quoted string Returns number of characters in string, including any trailing white- space (as defined in the locale). If the string is 'Ludwig', the result is 6. If the string is 'Ludwigssss', the result is 10.
CHAR or
NCHAR column
Returns number of characters in string, including trailing white space characters. This value is the defined length, in bytes, of the column. If the fname column of the customer table is a CHAR(15) column, and this column contains the string 'Ludwig', the result is 15. If the fname column contains the string 'Ludwigssss', the result is 15.
VARCHAR or
NVARCHAR column
Returns number of characters in string, including whitespace characters. Value is the actual length, in bytes, of the string, not the declared maximum column size. If the cat_advert column of the catalog table is a VARCHAR(255, 65), and this column contains the string "Ludwig", the result is 6. If the column contains the string 'Ludwigssss', the result is 10.
TEXT column Returns number of characters in column, including trailing white space characters. If the cat_descr column in the catalog table is a TEXT column, and this column contains the string 'Ludwig', the result is 6. If the cat_descr column contains the string 'Ludwigssss', the result is 10.
Host or procedure variable Returns number of characters that the variable contains, including any trailing white space, regardless of declared length of the variable. If the procedure variable f_name is defined as CHAR(15), and this variable contains the string 'Ludwig', the result is 6. If the f_name variable contains the string 'Ludwigssss', the result is 10.

The CHAR_LENGTH function is especially useful with multibyte code sets. If a quoted string of characters contains any multibyte characters, the number of characters in the string differs from the number of bytes in the string. You can use the CHAR_LENGTH function to determine the number of characters in the quoted string.

However, the CHAR_LENGTH function can also be useful in single-byte code sets. In these code sets, the number of bytes in a column is equal to the number of characters in the column. If you use the LENGTH function to determine the number of bytes in a column (which is equal to the number of characters in this case), LENGTH trims the column values and returns the length of the trimmed string. In contrast, CHAR_LENGTH does not trim the column values but returns the declared size of the column.

The following table shows some results that the CHAR_LENGTH function might generate for quoted strings.

CHAR_LENGTH Input String Description Result
'abc ' A quoted string with 4 single-byte characters (the characters abc and 1 trailing space) 4
'A1A2B1B2' A quoted string with 2 multibyte characters 2
'aA1A2B1B2' A quoted string with 2 single-byte and 2 multibyte characters 4
Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]