informix
Informix Guide to SQL: Syntax
SQL Statements

CREATE EXTERNAL TABLE

Use the CREATE EXTERNAL TABLE statement to define an external source that is not part of your database so you can use that external source to load and unload data for your database.

Syntax

Element Purpose Restrictions Syntax
table Name of the external table that describes the external data The name must be different from any existing table, view, or synonym name in the current database. Database Object Name, p. 4-50

Usage

After you create a table with the CREATE EXTERNAL TABLE statement, you can move data to and from the external source with an INSERT INTO...SELECT statement.

Column Definition

Element Purpose Restrictions Syntax
column One column name for each column of the external table For each column, you must specify an Informix data type. Identifier, p. 4-205
n Number of 8-bit bytes to represent the integer For FIXED format binary integers; big-endian byte order. n=2 for 16-bit integers; n=4 for 32-bit integers
p Precision (total number of digits) For FIXED-format files only. Literal Number,
p.
4-237
s Scale (number of digits after the decimal point) For FIXED-format files only. Literal Number,
p.
4-237
null_string Value that is to be interpreted as a null For restriction information, refer to "Defining Null Values." Quoted String,
p.
4-260
template Name of a table with the exact column names and definitions for your external table You cannot skip columns or put data of a different type into any column. Database Object Name, p. 4-50

Using the SAME AS Clause

When you create a table with the SAMEAS keyword, the column names from the original table are used in the new table. You cannot use indexes in the external table definition.

You cannot use the SAMEAS keyword for FIXED-format files.

Using the EXTERNAL Keyword

Use the EXTERNAL keyword to specify a data type for each column of your external table that has a data type different from the internal table. For example, you might have a VARCHAR column in the internal table that you want to map to a CHAR column in the external table.

You must specify an external type for every column that is in fixed format. You cannot specify an external type for delimited format columns except for BYTE and TEXT columns where your specification is optional. For more information, see TEXT and HEX External Types.

Integer Data Types

Besides valid Informix integer data types, you can specify packed decimal, zoned decimal, and IBM-format binary representation of integers.

For packed or zoned decimal, you specify precision (total number of digits in the number) and scale (number of digits that are to the right of the decimal point). Packed decimal representation can store two digits, or a digit and a sign, in each byte. Zoned decimal requires (p + 1) bytes to store p digits and the sign.

Big-Endian Format

The database server also supports two IBM-format binary representations of integers: BINARY(2) for 16-bit integer storage and BINARY(4) for 32-bit integer storage. The most significant byte of each number has the lowest address; that is, binary-format integers are stored big-end first (big-endian format) in the manner of IBM and Motorola processors. Intel processors and some others store binary-format integers little-end first, a storage method that the database server does not support for external data.

Defining Null Values

The packed decimal, zoned decimal, and binary data types do not have a natural null value, so you must define a value that can be interpreted as a null when the database server loads or unloads data from an external file. You can define the null_string as a number that will not be used in the set of numbers stored in the data file (for example, -9999.99). You can also define a bit pattern in the field as a hexadecimal pattern, such as 0xffff, that is to be interpreted as a null.

The database server uses the null representation for a fixed-format external table to both interpret values as the data is loaded into the database and to format null values into the appropriate data type when data is unloaded to an external table.

The following examples are of column definitions with null values for a fixed-format external table:

If the packed decimal or zoned decimal is stored with all bits cleared to represent a null value, the null_string can be defined as "0x0". The following rules apply to the value assigned to a null_string:

TEXT and HEX External Types

An Informix BYTE or TEXT column can be encoded in either the TEXT or HEX external type. You can use only delimited BYTE and TEXT formats with these external types. Fixed formats are not allowed. In addition, you cannot use these external types with any other type of delimited-format columns (such as character columns).

You do not need to specify these external types. If you do not define an external column specifically, Informix TEXT columns default to TEXT and Informix BYTE columns default to HEX.

The database server interprets two adjacent field delimiters as a null value.

During unloading, the database server escapes delimiters and backslashes (\). During loading, any character that follows a backslash is taken literally. Nonprintable characters are directly embedded in the data file if you choose TEXT format.

User-defined delimiters are limited to one byte each. For information about delimiters if you are using a multibyte locale, see the Informix Guide to GLS Functionality.

For more information on BYTE and TEXT data, see your Administrator's Guide.

Manipulating Data in Fixed Format Files

For files in FIXED format, you must declare the column name and the EXTERNAL item for each column to set the name and number of characters. For FIXED-format files, the only data type allowed is CHAR. You can use the keyword NULL to specify what string to interpret as a null value.

Column-Level Constraints

Use column-level constraints to limit the type of data that is allowed in a column. Constraints at the column level are limited to a single column.

Using the Not-Null Constraint

If you do not indicate a default value for a column, the default is null unless you place a not-null constraint on the column. In that case, no default value exists for the column.

If you place a not-null constraint on a column (and no default value is specified), the data in the external table must have a value set for the column when loading through the external table. When no reject file exists and no value is encountered, the database server returns an error and the loading stops. When a reject file exists and no value is encountered, the error is reported in the reject file and the load continues.

Using the CHECK Constraint

Check constraints allow you to designate conditions that must be met before data can be assigned to a column during an INSERT or UPDATE statement. When a reject file does not exist and a row evaluates to false for any check constraint defined on a table during an insert or update, the database server returns an error. When there is a reject file and a row evaluates to false for a check constraint defined on the table, the error is reported in the reject file and the statement continues to execute.

Check constraints are defined with search conditions. The search condition cannot contain subqueries, aggregates, host variables, or SPL routines. In addition, it cannot include the following built-in functions: CURRENT, USER, SITENAME, DBSERVERNAME, or TODAY.

When you define a check constraint at the column level, the only column that the check constraint can check against is the column itself. In other words, the check constraint cannot depend upon values in other columns of the table.

DATAFILES Clause

The DATAFILES clause names the external files that are opened when you use external tables.

Element Purpose Restrictions Syntax
coserver_group Name of the coserver group that contains the external data The coserver group must exist. Identifier, p. 4-205
coserver_num Numeric ID of the coserver that contains the external data The coserver must exist. Literal Number,
p.
4-237
fixed_path Pathname for describing the input or output files in the external table definition The specified path must exist. The pathname must conform to the conventions of your operating system.
formatted_path Formatted pathname that uses pattern-matching characters The specified path must exist. The pathname must conform to the conventions of your operating system.

You can use cogroup names and coserver numbers when you describe the input or output files for the external table definition. You can identify the DATAFILES either by coserver number or by cogroup name. A coserver number contains only digits. A cogroup name is a valid identifier that begins with a letter but otherwise contains any combination of letters, digits, and underscores.

If you use only some of the available coservers for reading or writing files, you can designate these coservers as a cogroup using onutil and then use the cogroup name rather than explicitly naming each coserver and file separately. Whenever you use all coservers to manage external files, you can use the predefined coserver_group.

For examples of the DATAFILES clause, see the section, Examples.

Using Formatting Characters

You can use a formatted pathname to designate a filename. If you use a formatted pathname, you can take advantage of the substitution characters %c, %n, and %r(first...last).

Formatting String Purpose
%c Replaced with the number of the coserver that manages the file
%n Replaced with the name of the node on which the coserver that manages the file resides
%r(first...last) Names multiple files on a single coserver

Important: The formatted pathname option does not support the %o formatting string.

Table Options

The optional table parameters include additional characteristics that define the table.

Element Purpose Restrictions Syntax
filename Full directory path and filename that you want all coservers to use as a destination when they write conversion error messages If you do not specify REJECTFILE, no reject files are created, and if errors occur, the load task will fail. Filename must conform to the conventions of your operating system.
field_delimiter Character to separate fields The default value is the pipe (|) character. If you use a non-printing character as a delimiter, you must encode it as the octal representation of the ASCII character. For example, '\006' can represent CTRL-F. Quoted String,
p.
4-260
num_errors Number of errors allowed per coserver before the database server stops the load If you do not set the MAXERRORS environment variable, the database server processes all data regardless of the number of errors. This parameter is ignored during an unload task. Literal Number,
p.
4-237
num_rows Approximate number of rows contained in the external table None. Literal Number,
p.
4-237
quoted_string ASCII character that represents the escape Only one character is valid. Quoted String,
p.
4-260
record_delimiter Character to separate records If you do not set the RECORDEND environment variable, the default value is the newline character (\n). If you use a non-printing character as a delimiter, you must encode it as the octal representation of the ASCII character. For example, '\006' can represent CTRL-F. Quoted String,
p.
4-260

Use the table options keywords as the following table describes. You can use each keyword whenever you plan to load or unload data unless only one of the two modes is specified.

Keyword Purpose
CODESET Specifies the type of code set of the data
DEFAULT (load only) Specifies that the database server should replace missing values in delimited input files with column defaults (if they are defined) instead of inserting nulls This option allows input files to be sparsely populated. The input files do not need to have an entry for every column in the file where a default is the value to be loaded.
DELIMITER Specifies the character that separates fields in a delimited text file
DELUXE (load only) Sets a flag that causes the database server to load data in deluxe mode Deluxe mode is required for loading into STANDARD tables.
ESCAPE Directs the database server to recognize ASCII special characters embedded in ASCII-text-based data files If you do not specify ESCAPE when you load data, the database server does not check the character fields in text data files for embedded special characters. If you do not specify ESCAPE when you unload data, the database server does not create embedded hexadecimal characters in text fields.
EXPRESS Sets a flag that causes the database server to attempt to load data in express mode If you request express mode but indexes or unique constraints exist on the table or the table contains BYTE or TEXT data, or the target table is not RAW or OPERATIONAL, the load stops with an error message reporting the problem.
FORMAT Specifies the format of the data in the data files
MAXERRORS Sets the number of errors that are allowed per coserver before the database server stops the load
RECORDEND Specifies the character that separates records in a delimited text file
REJECTFILE Sets the full pathname for all coservers to the area where reject files are written for data-conversion errors If conversion errors occur and you have not specified REJECTFILE or the reject files cannot be opened, the load job ends abnormally. For information on reject-file naming and use of formatting characters, see Reject Files.
SIZE Specifies the approximate number of rows that are contained in the external table This option can improve performance if you use the external table in a join query.

Important: Check constraints on external tables are designed to be evaluated only when loading data. The database server cannot enforce check constraints on external tables because the data can be freely altered outside the control of the server. If you want to restrict rows that are written to an external table during unload, use a WHERE clause to filter the rows.

Reject Files

Rows that have conversion errors during a load or rows that violate check constraints defined on the external table are written to a reject file on the coserver that performs the conversion. Each coserver manages its own reject file. The REJECTFILE keyword determines the name given to the reject file on each coserver.

You can use the formatting characters %c and %n (but not %r) in the filename format. Use the %c formatting characters to make the filenames unique. For more information on formatting characters, see the section Using Formatting Characters.

If you perform another load to the same table during the same session, any earlier reject file of the same name is overwritten.

Reject file entries have the following format:

The reject file writes the coserver-number, filename, record, field-name and reason-code in ASCII. The bad line information varies with the type of input file. For delimited files or fixed-position character files, up to 80 characters of the bad line are copied directly into the reject file. For Informix internal data files, the bad line is not placed in the reject file, because you cannot edit the binary representation in a file. However, coserver-number, filename, record, reason-code, and field-name are still reported in the reject file so you can isolate the problem.

The types of errors that cause a row to be rejected are as follows.

Error Text Explanation
CONSTRAINT constraint name This constraint was violated.
CONVERT_ERR Any field encounters a conversion error.
MISSING_DELIMITER No delimiter was found.
MISSING_RECORDEND No recordend was found.
NOT NULL A null was found in field-name.
ROW_TOO_LONG The input record is longer than 32 kilobytes.

Examples

The examples in this section show how you can name files to use in the DATAFILES field.

Assume that the database server is running on four nodes, and one file is to be read from each node. All files have the same name. The DATAFILES item can then be as follows:

Now, consider a system with 16 coservers where only three coservers have tape drives attached (for example, coservers 2, 5, and 9). If you define a cogroup for these coservers before you run load and unload commands, you can use the cogroup name rather than a list of individual coservers when you execute the commands. To set up the cogroup, run onutil.

Then define the file locations for named pipes:

The filenames expand as follows:

If, instead, you want to process three files on each of two coservers, define the files as follows:

The expanded list is as follows:

Related Information

Related statements: INSERT, SELECT, and SET PLOAD FILE

For more information on external tables, refer to your Administrator's Reference.


Informix Guide to SQL: Syntax, Version 9.2
Copyright © 1999, Informix Software, Inc. All rights reserved