|
Use the CREATE EXTERNAL TABLE statement to define an external source that is not part of your database so you can use that external source to load and unload data for your database.
After you create a table with the CREATE EXTERNAL TABLE statement, you can move data to and from the external source with an INSERT INTO...SELECT statement.
When you create a table with the SAMEAS keyword, the column names from the original table are used in the new table. You cannot use indexes in the external table definition.
You cannot use the SAMEAS keyword for FIXED-format files.
Use the EXTERNAL keyword to specify a data type for each column of your external table that has a data type different from the internal table. For example, you might have a VARCHAR column in the internal table that you want to map to a CHAR column in the external table.
You must specify an external type for every column that is in fixed format. You cannot specify an external type for delimited format columns except for BYTE and TEXT columns where your specification is optional. For more information, see TEXT and HEX External Types.
Integer Data TypesBesides valid Informix integer data types, you can specify packed decimal, zoned decimal, and IBM-format binary representation of integers.
For packed or zoned decimal, you specify precision (total number of digits in the number) and scale (number of digits that are to the right of the decimal point). Packed decimal representation can store two digits, or a digit and a sign, in each byte. Zoned decimal requires (p + 1) bytes to store p digits and the sign.
Big-Endian FormatThe database server also supports two IBM-format binary representations of integers: BINARY(2) for 16-bit integer storage and BINARY(4) for 32-bit integer storage. The most significant byte of each number has the lowest address; that is, binary-format integers are stored big-end first (big-endian format) in the manner of IBM and Motorola processors. Intel processors and some others store binary-format integers little-end first, a storage method that the database server does not support for external data.
Defining Null ValuesThe packed decimal, zoned decimal, and binary data types do not have a natural null value, so you must define a value that can be interpreted as a null when the database server loads or unloads data from an external file. You can define the null_string as a number that will not be used in the set of numbers stored in the data file (for example, -9999.99). You can also define a bit pattern in the field as a hexadecimal pattern, such as 0xffff, that is to be interpreted as a null.
The database server uses the null representation for a fixed-format external table to both interpret values as the data is loaded into the database and to format null values into the appropriate data type when data is unloaded to an external table.
The following examples are of column definitions with null values for a fixed-format external table:
If the packed decimal or zoned decimal is stored with all bits cleared to represent a null value, the null_string can be defined as "0x0". The following rules apply to the value assigned to a null_string:
Warning: If a row that contains a null value is unloaded into an external table and the column that receives the null value has no null value defined, the database server inserts a zero into the column.
An Informix BYTE or TEXT column can be encoded in either the TEXT or HEX external type. You can use only delimited BYTE and TEXT formats with these external types. Fixed formats are not allowed. In addition, you cannot use these external types with any other type of delimited-format columns (such as character columns).
You do not need to specify these external types. If you do not define an external column specifically, Informix TEXT columns default to TEXT and Informix BYTE columns default to HEX.
The database server interprets two adjacent field delimiters as a null value.
During unloading, the database server escapes delimiters and backslashes (\). During loading, any character that follows a backslash is taken literally. Nonprintable characters are directly embedded in the data file if you choose TEXT format.
User-defined delimiters are limited to one byte each. For information about delimiters if you are using a multibyte locale, see the Informix Guide to GLS Functionality.
For more information on BYTE and TEXT data, see your Administrator's Guide.
For files in FIXED format, you must declare the column name and the EXTERNAL item for each column to set the name and number of characters. For FIXED-format files, the only data type allowed is CHAR. You can use the keyword NULL to specify what string to interpret as a null value.
Use column-level constraints to limit the type of data that is allowed in a column. Constraints at the column level are limited to a single column.
If you do not indicate a default value for a column, the default is null unless you place a not-null constraint on the column. In that case, no default value exists for the column.
If you place a not-null constraint on a column (and no default value is specified), the data in the external table must have a value set for the column when loading through the external table. When no reject file exists and no value is encountered, the database server returns an error and the loading stops. When a reject file exists and no value is encountered, the error is reported in the reject file and the load continues.
Check constraints allow you to designate conditions that must be met before data can be assigned to a column during an INSERT or UPDATE statement. When a reject file does not exist and a row evaluates to false for any check constraint defined on a table during an insert or update, the database server returns an error. When there is a reject file and a row evaluates to false for a check constraint defined on the table, the error is reported in the reject file and the statement continues to execute.
Check constraints are defined with search conditions. The search condition cannot contain subqueries, aggregates, host variables, or SPL routines. In addition, it cannot include the following built-in functions: CURRENT, USER, SITENAME, DBSERVERNAME, or TODAY.
When you define a check constraint at the column level, the only column that the check constraint can check against is the column itself. In other words, the check constraint cannot depend upon values in other columns of the table.
The DATAFILES clause names the external files that are opened when you use external tables.
You can use cogroup names and coserver numbers when you describe the input or output files for the external table definition. You can identify the DATAFILES either by coserver number or by cogroup name. A coserver number contains only digits. A cogroup name is a valid identifier that begins with a letter but otherwise contains any combination of letters, digits, and underscores.
If you use only some of the available coservers for reading or writing files, you can designate these coservers as a cogroup using onutil and then use the cogroup name rather than explicitly naming each coserver and file separately. Whenever you use all coservers to manage external files, you can use the predefined coserver_group.
For examples of the DATAFILES clause, see the section, Examples.
You can use a formatted pathname to designate a filename. If you use a formatted pathname, you can take advantage of the substitution characters %c, %n, and %r(first...last).
Important: The formatted pathname option does not support the %o formatting string.
Table Options
The optional table parameters include additional characteristics that define the table.
Use the table options keywords as the following table describes. You can use each keyword whenever you plan to load or unload data unless only one of the two modes is specified.
Important: Check constraints on external tables are designed to be evaluated only when loading data. The database server cannot enforce check constraints on external tables because the data can be freely altered outside the control of the server. If you want to restrict rows that are written to an external table during unload, use a WHERE clause to filter the rows.
Reject Files
Rows that have conversion errors during a load or rows that violate check constraints defined on the external table are written to a reject file on the coserver that performs the conversion. Each coserver manages its own reject file. The REJECTFILE keyword determines the name given to the reject file on each coserver.
You can use the formatting characters %c and %n (but not %r) in the filename format. Use the %c formatting characters to make the filenames unique. For more information on formatting characters, see the section Using Formatting Characters.
If you perform another load to the same table during the same session, any earlier reject file of the same name is overwritten.
Reject file entries have the following format:
The reject file writes the coserver-number, filename, record, field-name and reason-code in ASCII. The bad line information varies with the type of input file. For delimited files or fixed-position character files, up to 80 characters of the bad line are copied directly into the reject file. For Informix internal data files, the bad line is not placed in the reject file, because you cannot edit the binary representation in a file. However, coserver-number, filename, record, reason-code, and field-name are still reported in the reject file so you can isolate the problem.
The types of errors that cause a row to be rejected are as follows.
The examples in this section show how you can name files to use in the DATAFILES field.
Assume that the database server is running on four nodes, and one file is to be read from each node. All files have the same name. The DATAFILES item can then be as follows:
Now, consider a system with 16 coservers where only three coservers have tape drives attached (for example, coservers 2, 5, and 9). If you define a cogroup for these coservers before you run load and unload commands, you can use the cogroup name rather than a list of individual coservers when you execute the commands. To set up the cogroup, run onutil.
Then define the file locations for named pipes:
The filenames expand as follows:
If, instead, you want to process three files on each of two coservers, define the files as follows:
The expanded list is as follows:
Related statements: INSERT, SELECT, and SET PLOAD FILE
For more information on external tables, refer to your Administrator's Reference.