![]() |
|
The INSERT, UPDATE, and DELETE statements modify data in an existing database. Whenever you modify existing data, the integrity of the data can be affected. For example, an order for a nonexistent product could be entered into the orders table, a customer with outstanding orders could be deleted from the customer table, or the order number could be updated in the orders table and not in the items table. In each of these cases, the integrity of the stored data is lost.
Data integrity is actually made up of the following parts:
Well-designed databases incorporate these principles so that when you modify data, the database itself prevents you from doing anything that might harm the integrity of the data.
An entity is any person, place, or thing to be recorded in a database. Each table represents an entity, and each row of a table represents an instance of that entity. For example, if order is an entity, the orders table represents the idea of an order and each row in the table represents a specific order.
To identify each row in a table, the table must have a primary key. The primary key is a unique value that identifies each row. This requirement is called the entity integrity constraint.
For example, the orders table primary key is order_num. The order_num column holds a unique system-generated order number for each row in the table. To access a row of data in the orders table, use the following SELECT statement:
Using the order number in the WHERE clause of this statement enables you to access a row easily because the order number uniquely identifies that row. If the table allowed duplicate order numbers, it would be almost impossible to access one single row because all other columns of this table allow duplicate values.
For more information on primary keys and entity integrity, see the Informix Guide to Database Design and Implementation.
Semantic integrity ensures that data entered into a row reflects an allowable value for that row. The value must be within the domain, or allowable set of values, for that column. For example, the quantity column of the items table permits only numbers. If a value outside the domain can be entered into a column, the semantic integrity of the data is violated.
The following constraints enforce semantic integrity:
Referential integrity refers to the relationship between tables. Because each table in a database must have a primary key, this primary key can appear in other tables because of its relationship to data within those tables. When a primary key from one table appears in another table, it is called a foreign key.
Foreign keys join tables and establish dependencies between tables. Tables can form a hierarchy of dependencies in such a way that if you change or delete a row in one table, you destroy the meaning of rows in other tables. For example, Figure 6-4 shows that the customer_num column of the customer table is a primary key for that table and a foreign key in the orders and cust_call tables. Customer number 106, George Watson, is referenced in both the orders and cust_calls tables. If customer 106 is deleted from the customer table, the link between the three tables and this particular customer is destroyed.
When you delete a row that contains a primary key or update it with a different primary key, you destroy the meaning of any rows that contain that value as a foreign key. Referential integrity is the logical dependency of a foreign key on a primary key. The integrity of a row that contains a foreign key depends on the integrity of the row that it references-the row that contains the matching primary key.
By default, the database server does not allow you to violate referential integrity and gives you an error message if you attempt to delete rows from the parent table before you delete rows from the child table. You can, however, use the ON DELETE CASCADE option to cause deletes from a parent table to trip deletes on child tables. See Using the ON DELETE CASCADE Option.
To define primary and foreign keys, and the relationship between them, use the CREATE TABLE and ALTER TABLE statements. For more information on these statements, see the Informix Guide to SQL: Syntax. For information about how to build a data model with primary and foreign keys, see the Informix Guide to Database Design and Implementation.
To maintain referential integrity when you delete rows from a primary key for a table, use the ON DELETE CASCADE option in the REFERENCES clause of the CREATE TABLE and ALTER TABLE statements. This option allows you to delete a row from a parent table and its corresponding rows in matching child tables with a single delete command.
Locking During Cascading DeletesDuring deletes, locks are held on all qualifying rows of the parent and child tables. When you specify a delete, the delete that is requested from the parent table occurs before any referential actions are performed.
What Happens to Multiple Children TablesIf you have a parent table with two child constraints, one child with cascading deletes specified and one child without cascading deletes, and you attempt to delete a row from the parent table that applies to both child tables, the DELETE statement fails, and no rows are deleted from either the parent or child tables.
Logging Must Be Turned OnYou must turn on logging in your current database for cascading deletes to work. Logging and cascading deletes are discussed in Transaction Logging on page 6-53.
Suppose you have two tables with referential integrity rules applied, a parent table, accounts, and a child table, sub_accounts. The following CREATE TABLE statements define the referential constraints:
The primary key of the accounts table, the acc_num column, uses a SERIAL data type, and the foreign key of the sub_accounts table, the ref_num column, uses an INTEGER data type. Combining the SERIAL data type on the primary key and the INTEGER data type on the foreign key is allowed. Only in this condition can you mix and match data types. The SERIAL data type is an INTEGER, and the database automatically generates the values for the column. All other primary and foreign key combinations must match explicitly. For example, a primary key that is defined as CHAR must match a foreign key that is defined as CHAR.
The definition of the foreign key of the sub_accounts table, the ref_num column, includes the ON DELETE CASCADE option. This option specifies that a delete of any row in the parent table accounts will automatically cause the corresponding rows of the child table sub_accounts to be deleted.
To delete a row from the accounts table that will cascade a delete to the sub_accounts table, you must turn on logging. After logging is turned on, you can delete the account number 2 from both tables, as the following example shows:
You can use cascading deletes for most deletes, including deletes on self-referencing and cyclic queries. The only exception is correlated subqueries. In correlated subqueries, the subquery (or inner SELECT) is correlated when the value it produces depends on a value produced by the outer SELECT statement that contains it. If you have implemented cascading deletes, you cannot write deletes that use a child table in the correlated subquery. You receive an error when you attempt to delete from a correlated subquery.
Important: You cannot define a DELETE trigger event on a table if the table defines a referential constraint with ON DELETE CASCADE.
Object Modes and Violation Detection
The object modes and violation detection features of the database can help you monitor data integrity. These features are particularly powerful when they are combined during schema changes or when insert, delete, and update operations are performed on large volumes of data over short periods.
Database objects, within the context of a discussion of the object modes feature, are constraints, indexes, and triggers, and each of them have different modes. Do not confuse database objects that are relevant to the object modes feature with generic database objects. Generic database objects are things like tables and synonyms.
You can set disabled, enabled, or filtering modes for a constraint or a unique index. You can set disabled or enabled modes for a trigger or a duplicate index. You can use database object modes to control the effects of INSERT, DELETE, and UPDATE statements.
Constraints, indexes, and triggers are enabled by default.
When a database object is enabled, the database server recognizes the existence of the database object and takes the database object into consideration while it executes an INSERT, DELETE, or UPDATE statement. Thus, an enabled constraint is enforced, an enabled index updated, and an enabled trigger is executed when the trigger event takes place.
When you enable constraints and unique indexes, if a violating row exists, the data manipulation statement fails (that is no rows change) and the database server returns an error message.
You can identify the reason for the failure when you analyze the information in the violations and diagnostic tables. You can then take corrective action or roll back the operation.
When a database object is disabled, the database server does not take it into consideration during the execution of an INSERT, DELETE, or UPDATE statement. A disabled constraint is not enforced, a disabled index is not updated, and a disabled trigger is not executed when the trigger event takes place. When you disable constraints and unique indexes, any data manipulation statement that violates the restriction of the constraint or unique index succeeds, (that is the target row is changed) and the database server does not return an error message.
When a constraint or unique index is in filtering mode, the statement succeeds and the database server enforces the constraint or the unique index requirement during an INSERT, DELETE, or UPDATE statement by writing the failed rows to the violations table associated with the target table. Diagnostic information about the constraint violation is written to the diagnostics table associated with the target table.
An example with the INSERT statement can illustrate the differences between the enabled, disabled, and filtering modes. Consider an INSERT statement in which a user tries to add a row that does not satisfy an integrity constraint on a table. For example, assume that user joe created a table named cust_subset, and this table consists of the following columns: ssn (customer's social security number), fname (customer's first name), lname (customer's last name), and city (city in which the customer lives). The ssn column has the INT data type. The other three columns have the CHAR data type.
Assume that user joe defined the lname column as not null but has not assigned a name to the not null constraint, so the database server has implicitly assigned the name n104_7 to this constraint. Finally, assume that user joe created a unique index named unq_ssn on the ssn column.
Now user linda who has the Insert privilege on the cust_subset table enters the following INSERT statement on this table:
To better understand the distinctions among enabled, disabled, and filtering modes, you can view the results of the preceding INSERT statement in the following three sections.
Results of the Insert Operation When the Constraint Is EnabledIf the not null constraint on the cust_subset table is enabled, the INSERT statement fails to insert the new row in this table. Instead user linda receives the following error message when she enters the INSERT statement:
If the not null constraint on the cust_subset table is disabled, the INSERT statement that user linda issues successfully inserts the new row in this table. The new row of the cust_subset table has the following column values.
Results of the Insert When Constraint Is in Filtering Mode
If the not null constraint on the cust_subset table is set to the filtering mode, the INSERT statement that user linda issues fails to insert the new row in this table. Instead the new row is inserted into the violations table, and a diagnostic row that describes the integrity violation is added to the diagnostics table.
Assume that user joe has started a violations and diagnostics table for the cust_subset table. The violations table is named cust_subset_vio, and the diagnostics table is named cust_subset_dia. The new row added to the cust_subset_vio violations table when user linda issues the INSERT statement on the cust_subset target table has the following column values.
ssn | fname | lname | city | informix_tupleid | informix_optype | informix_recowner |
---|---|---|---|---|---|---|
973824499 | jane | NULL | los altos | 1 | I | linda |
This new row in the cust_subset_vio violations table has the following characteristics:
The INSERT statement that user linda issued on the cust_subset target table also causes a diagnostic row to be added to the cust_subset_dia diagnostics table. The new diagnostic row added to the diagnostics table has the following column values.
This new diagnostic row in the cust_subset_dia diagnostics table has the following characteristics:
By joining the violations and diagnostics tables, user joe (who owns the cust_subset target table and its associated special tables) or the DBA can find out that the row in the violations table whose informix_tupleid value is 1 was created after an INSERT statement and that this row is violating a constraint. The table owner or DBA can query the sysconstraints system catalog table to determine that this constraint is a not null constraint. Now that the reason for the failure of the INSERT statement is known, user joe or the DBA can take corrective action.
Multiple Diagnostic Rows for One Violations RowIn the preceding example, only one row in the diagnostics table corresponds to the new row in the violations table. However, more than one diagnostic row can be added to the diagnostics table when a single new row is added to the violations table. For example, if the ssn value (973824499) that user linda entered in the INSERT statement had been the same as an existing value in the ssn column of the cust_subset target table, only one new row would appear in the violations table, but the following two diagnostic rows would be present in the cust_subset_dia diagnostics table.
Both rows in the diagnostics table correspond to the same row of the violations table because both of these rows have the value 1 in the informix_tupleid column. However, the first diagnostic row identifies the constraint violation caused by the INSERT statement that user linda issued, while the second diagnostic row identifies the unique-index violation caused by the same INSERT statement. In this second diagnostic row, the value I in the objtype column stands for a unique-index violation, and the value unq_ssn in the objname column gives the name of the index for which the integrity violation was detected.
For more information about how to set database object modes, see the SET Database object mode statement in the Informix Guide to SQL: Syntax.
After you create a violations table for a target table, you cannot alter the columns or the fragmentation of the base table or the violations table. If you alter the constraints on a target table after you have started the violations table, nonconforming rows will be filtered to the violations table.
When you create a violations table for a target table on Enterprise Decision Server, all constraints are in filtering mode. The violations table contains fields that record the diagnostic information, thus no separate diagnostics table exists.
For information about how to start and stop the violations tables, see the START VIOLATIONS TABLE and STOP VIOLATIONS TABLE statements in the Informix Guide to SQL: Syntax.
Relationship of Violations Tables and Database Object ModesIf you set the constraints or unique indexes defined on a table to the filtering mode, but you do not create the violations and diagnostics tables for this target table, any rows that violate a constraint or unique-index requirement during an insert, update, or delete operation are not filtered to a violations table. Instead, you receive an error message that indicates that you must start a violations table for the target table.
Similarly, if you set a disabled constraint or disabled unique index to the enabled or filtering mode and you want the ability to identify existing rows that do not satisfy the constraint or unique-index requirement, you must create the violations tables before you issue the SET DATABASE OBJECT MODE statement.Examples of START VIOLATIONS TABLE Statements
The following examples show different ways to execute the START VIOLATIONS TABLE statement.
Starting Violations and Diagnostics Tables Without Specifying Their NamesTo start a violations and diagnostics table for the target table named customer in the demonstration database, enter the following statement:
Because your START VIOLATIONS TABLE statement does not include a USING clause, the violations table is named customer_vio by default, and the diagnostics table is named customer_dia by default. The customer_vio table includes the following columns:
The customer_vio table has the same table definition as the customer table except that the customer_vio table has three additional columns that contain information about the operation that caused the bad row.
The customer_dia table includes the following columns:
This list of columns shows an important difference between the diagnostics table and violations table for a target table. Whereas the violations table has a matching column for every column in the target table, the columns of the diagnostics table do not match any columns in the target table. The diagnostics table created by any START VIOLATIONS TABLE statement always has the same columns with the same column names and data types.
Starting Violations and Diagnostics Tables and Specifying Their NamesThe following statement starts a violations and diagnostics table for the target table named items. The USING clause assigns explicit names to the violations and diagnostics tables. The violations table is to be named exceptions, and the diagnostics table is to be named reasons.
Specifying the Maximum Number of Rows in the Diagnostics TableThe following statement starts violations and diagnostics tables for the target table named orders. The MAX ROWS clause specifies the maximum number of rows that can be inserted into the diagnostics table when a single statement, such as an INSERT or SET DATABASE OBJECT MODE statement, is executed on the target table.
If you do not specify a value for MAX ROWS when you create a violations table, no maximum (other than disk space) will be imposed.
Example of Privileges on the Violations TableThe following example illustrates how the initial set of privileges on a violations table is derived from the current set of privileges on the target table.
For example, assume that we created a table named cust_subset and that this table consists of the following columns: ssn (customer's social security number), fname (customer's first name), lname (customer's last name), and city (city in which the customer lives).
The following set of privileges exists on the cust_subset table:
Now user alvin starts a violations table named cust_subset_viols and a diagnostics table named cust_subset_diags for the cust_subset table, as follows:
The database server grants the following set of initial privileges on the cust_subset_viols violations table:
The following example illustrates how the initial set of privileges on a diagnostics table is derived from the current set of privileges on the target table.
For example, assume that a table called cust_subset consists of the following columns: ssn (customer's social security number), fname (customer's first name), lname (customer's last name), and city (city in which the customer lives).
The following set of privileges exists on the cust_subset table:
Now user alvin starts a violations table named cust_subset_viols and a diagnostics table named cust_subset_diags for the cust_subset table, as follows:
The database server grants the following set of initial privileges on the cust_subset_diags diagnostics table: