INFORMIX
Informix Guide to SQL: Tutorial
Chapter 4: Modifying Data
Home Contents Index Master Index New Book

Data Integrity

The INSERT, UPDATE, and DELETE statements modify data in an existing database. Whenever you modify existing data, the integrity of the data can be affected. For example, an order for a nonexistent product could be entered into the orders table, a customer with outstanding orders could be deleted from the customer table, or the order number could be updated in the orders table and not in the items table. In each of these cases, the integrity of the stored data is lost.

Data integrity is actually made up of the following parts:

Well-designed databases incorporate these principles so that when you modify data, the database itself prevents you from doing anything that might harm the data integrity.

Entity Integrity

An entity is any person, place, or thing to be recorded in a database. Each entity represents a table, and each row of a table represents an instance of that entity. For example, if order is an entity, the orders table represents the idea of order and each row in the table represents a specific order.

To identify each row in a table, the table must have a primary key. The primary key is a unique value that identifies each row. This requirement is called the entity integrity constraint.

For example, the orders table primary key is order_num. The order_num column holds a unique system-generated order number for each row in the table. To access a row of data in the orders table, you can use the following SELECT statement:

Using the order number in the WHERE clause of this statement enables you to access a row easily because the order number uniquely identifies that row. If the table allowed duplicate order numbers, it would be almost impossible to access one single row, because all other columns of this table allow duplicate values.

For more information on primary keys and entity integrity, refer to Chapter 8, "Building Your Data Model."

Semantic Integrity

Semantic integrity ensures that data entered into a row reflects an allowable value for that row. The value must be within the column-specific properties, or allowable set of values, for that column. For example, the quantity column of the items table permits only numbers. If a value outside the column-specific properties can be entered into a column, the semantic integrity of the data is violated.

Semantic integrity is enforced using the following constraints:

For more information on using semantic integrity constraints in database design, refer to "Defining Column-Specific Properties".

Referential Integrity

Referential integrity refers to the relationship between tables. Because each table in a database must have a primary key, this primary key can appear in other tables because of its relationship to data within those tables. When a primary key from one table appears in another table, it is called a foreign key.

Foreign keys join tables and establish dependencies between tables. Tables can form a hierarchy of dependencies in such a way that if you change or delete a row in one table, you destroy the meaning of rows in other tables. For example, Figure 4-1 shows that the customer_num column of the customer table is a primary key for that table and a foreign key in the orders and cust_call tables. Customer number 106, George Watson, is referenced in both the orders and cust_calls tables. If customer 106 is deleted from the customer table, the link between the three tables and this particular customer is destroyed.

When you delete a row that contains a primary key or update it with a different primary key, you destroy the meaning of any rows that contain that value as a foreign key. Referential integrity is the logical dependency of a foreign key on a primary key. The integrity of a row that contains a foreign key depends on the integrity of the row that it references-the row that contains the matching primary key.

By default, the database server does not allow you to violate referential integrity and gives you an error message if you attempt to delete rows from the parent table before you delete rows from the child table. You can, however, use the ON DELETE CASCADE option to cause deletes from a parent table to trip deletes on child tables. See "Using the ON DELETE CASCADE Option".

Figure 4-1
Referential Integrity in the stores7 Database

To define primary and foreign keys, and the relationship between them, use the CREATE TABLE and ALTER TABLE statements. For more information on these statements, see Chapter 1 of the Informix Guide to SQL: Syntax. For information on building data models using primary and foreign keys, refer to Chapter 8, "Building Your Data Model."

Using the ON DELETE CASCADE Option

To maintain referential integrity when you delete rows from a primary key for a table, use the ON DELETE CASCADE option in the REFERENCES clause of the CREATE TABLE and ALTER TABLE statements. This option allows you to delete a row from a parent table and its corresponding rows in matching child tables with a single delete command.

Locking During Cascading Deletes
During deletes, locks are held on all qualifying rows of the parent and child tables. When you specify a delete, the delete that is requested from the parent table occurs before any referential actions are performed.

What Happens to Multiple Children Tables
If you have a parent table with two child constraints, one child with cascading deletes specified and one child without cascading deletes, and you attempt to delete a row from the parent table that applies to both child tables, the DELETE statement fails, and no rows are deleted from either the parent or child tables.

Logging Must Be Turned On
You must turn logging on in your current database for cascading deletes to work. Logging and cascading deletes are discussed in "Transaction Logging" on page 4-29.

Example
Suppose you have two tables with referential integrity rules applied, a parent table, accounts, and a child table, sub_accounts. The following CREATE TABLE statements define the referential constraints:

The primary key of the accounts table, the acc_num column, uses a SERIAL data type, and the foreign key of the sub_accounts table, the ref_num column, uses an INTEGER data type. Combining the SERIAL data type on the primary key and the INTEGER data type on the foreign key is allowed. Only in this condition can you mix and match data types. The SERIAL data type is an INTEGER, and the database automatically generates the values for the column. All other primary and foreign key combinations must match explicitly. For example, a primary key that is defined as CHAR must match a foreign key that is defined as CHAR.

To delete a row from the accounts table that will cascade a delete to the sub_accounts table, you must turn on logging. After logging is turned on, you can delete the account number 2 from both tables, as the following example shows:

Restrictions on Cascading Deletes
You can use cascading deletes for most deletes, including deletes on self- referencing and cyclic queries. The only exception is correlated subqueries. In correlated subqueries, the subquery (or inner SELECT) is correlated when the value it produces depends on a value produced by the outer SELECT statement that contains it. If you have implemented cascading deletes, you cannot write deletes that use a child table in the correlated subquery. You receive an error when you attempt to delete from a correlated subquery.

Object Modes and Violation Detection

The object modes and violation detection features of the database can help you monitor data integrity. These features are particularly powerful when they are combined during schema changes or when insert, delete, and update operations are performed on large volumes of data over short periods.

You can use the object modes feature to change the modes of database objects. Database objects, within the context of a discussion of the object modes feature, are constraints, indexes, and triggers. Do not confuse database objects that are relevant to the object modes feature with generic database objects. Generic database objects are things like tables and synonyms. The database objects that relate specifically to object modes are constraints, indexes, and triggers, and all of them have different modes.

Constraints can be enabled, disabled, or filtering. The database manager does not enforce disabled constraints even though their definitions are still in the system catalogs. Only constraints in the enabled and filtering mode are enforced. However, when a constraint is in filter mode, the database manager ensures the integrity of the base table for that particular constraint. The difference between enabled mode and filtering mode is apparent in the way the database manager handles a query that poses a violation of the constraint. The database manager uses the violation-detection feature when it deals with a constraint violation.

Consider an insert statement that violates a constraint. Depending on the mode of the constraint, the database manager handles the insert statement as follows:

    An insert operation that violates an enabled constraint is not inserted into the target table. A constraint violation error is returned to the user, and effects of the statement are rolled back.

    An insert operation that violates a disabled constraint is inserted in the target table, and no error is returned to the user.

    An insert operation that violates a filtering constraint is not inserted into the target table; instead it is inserted into the violations table. The information about the integrity violation is created and stored in a third table called the diagnostics table. The effects of the insert operation are not rolled back. When you switch the mode of the constraint to filtering, you can determine whether or not an error is returned after a constraint is violated.

A unique index also has enabled, disabled, and filter modes. A unique index in filter mode operates the same way as a constraint in filter mode. An index that does not avoid duplicate entries, however, only has enabled and disabled modes. When an index is disabled, its contents are not updated following insert, delete, or update modifications to the base table of the index. The optimizer cannot use a disabled index during a query because the index contents are not current.

Unlike constraints and unique indexes, triggers have two modes. Formerly, a trigger either existed and was fired at the appropriate time by the database manager, or nothing happened because the trigger did not exist. Now you can use object modes to disable an existing trigger. The database manager ignores a trigger in disabled mode even though the catalog information of the disabled trigger is kept up to date. The database manager does not ignore a trigger in enabled mode. Triggers do not have a filtering mode since they do not impose any kind of integrity specification on the database.

SQL Statements and Examples

For more detailed information, see the SET, START VIOLATIONS TABLE, and STOP VIOLATIONS TABLE statements in the Informix Guide to SQL: Syntax.




Informix Guide to SQL: Tutorial, version 9.1
Copyright © 1998, Informix Software, Inc. All rights reserved.