Loading and Unloading Data

The LOAD and UNLOAD statements allow you transfer data to and from your database with operating-system text files. The following sections describe the GLS aspects of the LOAD and UNLOAD statements. For a complete description of the use and syntax of these statements, see the IBM Informix: Guide to SQL Syntax.

Loading Data into a Database

The LOAD statement inserts data from an operating-system file into an existing table or view. This operating-system file is called a LOAD FROM file. The data in this file can contain any character that the client code set defines. If the client locale supports a multibyte code set, the data can contain multibyte characters. If the database locale supports a code set that is different from but convertible to the client code set, the client performs code-set conversion on the data before sending the data to the database server. For more information, see Performing Code-Set Conversion.

The locale also defines the formats for date, time, numeric, and monetary data. You can apply any format that the client locale supports to column values in the LOAD FROM file. For example, a French locale might define monetary values that have a blank space as the thousands separator and a comma as the decimal separator. When you use this locale, the following literal value for a MONEY column is valid in a LOAD FROM file:

You can specify alternative formats for date and monetary data. If you set appropriate environment variables, the LOAD FROM files can use the alternative end-user formats for DATE, DATETIME, and MONEY column values. For more information, see Customizing Date and Time End-User Formats and Customizing Monetary Values.

Unloading Data from a Database

The UNLOAD statement writes the rows that a SELECT statement retrieves to an operating-system file. This operating-system file is called an UNLOAD TO file. The data values in this file contains characters that the client code set defines. If the client locale supports a multibyte code set, the data can include multibyte characters from the code set.

If the database locale supports a code set that is different from but convertible to the client code set, the client performs code-set conversion on the data before it writes the data to the UNLOAD TO file. (For more information, see Performing Code-Set Conversion.)

The client locale and certain environment variables determine the output format of certain data types in the UNLOAD TO file. These data types include DATE values, MONEY values, values of numeric data types, and DATETIME values. For further information, see End-User Formats and Customizing End-User Formats.

You can use an UNLOAD TO file, which the UNLOAD statement generates, as the input file (the LOAD FROM file) to a LOAD statement that loads another table or database. When you use an UNLOAD TO file in this manner, make sure that all environment variables and the client locale have the same values when you perform the LOAD as they did when you performed the UNLOAD.

Loading with External Tables (XPS)

High-performance parallel loading and unloading for Extended Parallel Server uses external tables. It uses a series of enhanced SQL statements that you can issue with DB–Access or embed in ESQL/C.

The high-performance loader (HPL) provides extensive support for loading tables from many different sources, and performs a variety of data-format conversions. It also supports non-ASCII characters in field and record delimiters. High-performance loading performs operations like the following that might involve support for non-ASCII characters:

In nondefault locales, enhanced SQL statements for the loader, such as CREATE EXTERNAL TABLE...USING, INSERT INTO...SELECT, and SELECT...INTO EXTERNAL table-name USING, support identifiers that can include non-ASCII characters. For information about identifiers for Extended Parallel Server, see Non-ASCII Characters in Identifiers.

Loading Simple Large Objects with External Tables (XPS)

Extended Parallel Server supports external tables to load and unload simple large objects. Simple large objects (TEXT or BYTE data type columns) are supported only by delimited and INFORMIX format external tables. In delimited format, a simple-large-object column can be represented in either text or hex encoding. In text encoding, a simple large object is written to data file as is. Backslashes and delimiters are escaped. In hex encoding, each data byte in a simple large object is represented by two hexadecimal digits (0 through 9, and A through F). Nonprintable characters in simple large objects are included unchanged in data files.

For information about how to define simple-large-object columns in an external table, see the CREATE EXTERNAL TABLE statement in the IBM Informix: Guide to SQL Syntax. For information on file formats and performance considerations, as well as a step-by-step procedure for loading with external tables, see the IBM Informix: Administrator's Reference.

Specifying an Escape Character

You can specify an escape character to direct the database server to recognize incomplete or invalid multibyte character data in the simple large object. If you do not specify an escape character, the database server does not check the character fields in text-based data files for embedded special characters during loading.

When you specify an escape character, the backslash (\) character precedes any single character to indicate the occurrence of the actual character, regardless of whether it would otherwise have a special significance to the loading and unloading process. For example, '\|' is interpreted as the literal '|' character instead of as a column separator.

During unloading, the database server escapes delimiters and backslash (\) symbols. During loading, any character that immediately follows a backslash is taken literally. Nonprintable characters are directly embedded in the data file for TEXT column values.

Defining a Delimiter

Simple-large-object data values are inserted directly into the record at the point where the TEXT or Byte column is defined, between field delimiters.

User-defined delimiters are limited to one byte each. Therefore, in multibyte locales, only characters with a length of exactly one byte can be defined as delimiters. In both single byte and multibyte locales, a simple large object is always traversed byte by byte. If a byte matches one of the delimiters or a backslash, it is escaped during unloading. During loading, only the byte immediately following a backslash is escaped, not the (possibly multibyte) character following the backslash.

Transversal of delimited simple-large-object data is performed byte by byte in all locales. A simple large object is not traversed character by (possibly multibyte) character because it does not always contain valid text, and might contain incomplete or invalid multibyte characters. Unlike character columns, blank filling or truncating for simple large objects is not an option for invalid multibyte characters. You cannot have random access to the data in simple large objects, and you cannot alter simple large objects in any way.

The database server does not detect incomplete or invalid multibyte characters in simple-large-object data in the loading or unloading process. You must ensure that multibyte data is consistent and accurate before you load it into a character column.