INFORMIX
Informix Guide to SQL: Tutorial
Chapter 3: Composing Advanced SELECT Statements
Home Contents Index Master Index New Book

Creating Advanced Joins

Chapter 2, "Composing Simple SELECT Statements," shows how to include a WHERE clause in a SELECT statement to join two or more tables on one or more columns. It illustrates natural joins and equi-joins.

This chapter discusses the uses of two more complex kinds of joins, self-joins and outer joins. As described for simple joins, you can define aliases for tables and assign display labels to expressions to shorten your multiple-table queries. You can also issue a SELECT statement with an ORDER BY clause that sorts data into a temporary table.

Self-Joins

A join does not always have to involve two different tables. You can join a table to itself, creating a self-join. Joining a table to itself can be useful when you want to compare values in a column to other values in the same column.

To create a self-join, list a table twice in the FROM clause and assign it a different alias each time. Use the aliases to refer to the table in the SELECT and WHERE clauses as if it were two separate tables. (Aliases in SELECT statements are shown in Chapter 2, "Composing Simple SELECT Statements," in this manual and discussed in Chapter 1 of the Informix Guide to SQL: Syntax.)

Just as in joins between tables, you can use arithmetic expressions in self-joins. You can test for null values, and you can use an ORDER BY clause to sort the values in a specified column in ascending or descending order.

Query 3-9 finds pairs of orders where the ship_weight differs by a factor of five or more and the ship_date is not null. The query then orders the data by ship_date.

Query 3-9

Query Result 3-9

If you want to store the results of a self-join into a temporary table, append an INTO TEMP clause to the SELECT statement and rename at least one set of columns by assigning them display labels. Otherwise, the duplicate column names cause an error and the temporary table is not created.

Query 3-10, which is similar to Query 3-9, labels all columns selected from the orders table and puts them in a temporary table called shipping.

Query 3-10

If you query with SELECT * from that table, you see the rows that Query Result 3-10 shows.

Query Result 3-10

You can join a table to itself more than once. The maximum number of self-joins depends on the resources available to you.

The self-join in Query 3-11 creates a list of those items in the stock table that are supplied by three manufacturers. By including the last two conditions in the WHERE clause, it eliminates duplicate manufacturer codes in rows retrieved.

Query 3-11

Query Result 3-11

If you want to select rows from a payroll table to determine which employees earn more than their manager, you can construct the self-join that Query 3-12a shows.

Query 3-12a

Query 3-12b uses a correlated subquery to retrieve and list the 10 highest-priced items ordered.

Query 3-12b

Query 3-12b returns the 10 rows that Query Result 3-12 shows.

Query Result 3-12

You can create a similar query to find and list the 10 employees in the company who have the most seniority.

Correlated and uncorrelated subqueries are described in "Subqueries in SELECT Statements".

Using Rowid Values
Universal Server assigns a unique rowid to rows in nonfragmented tables. Rows in fragmented tables do not contain the rowid column. Informix recommends that you use primary keys as a method of access in your applications rather than rowids. Because primary keys are defined in the ANSI specification of SQL, using them to access data makes your applications more portable. In addition, Universal Server requires less time to access data in a fragmented table using a primary key than it requires to access the same data using rowid. For information about rowids and tables, see "Accessing Data Stored in Fragmented Tables".

You can use the hidden rowid column in a self-join to locate duplicate values in a table. In the following example, the condition x.rowid != y.rowid is equivalent to saying "row x is not the same row as row y."

Query 3-13 selects data twice from the cust_calls table, assigning it the table aliases x and y.

Query 3-13

Query 3-13 searches for duplicate values in the customer_num column, and for their rowids, finding the pair Query Result 3-13 shows.

Query Result 3-13

You can write the last condition as Query 3-13 shows.

Another way to locate duplicate values is with a correlated subquery, as Query 3-14 shows.

Query 3-14

Query 3-14 locates the same two duplicate customer_num values as Query 3-13 and returns the rows Query Result 3-14 shows.

Query Result 3-14

You can use the rowid, shown earlier in a self-join, to locate the internal record number that is associated with a row in a database table. The rowid is, in effect, a hidden column in every table. The sequential values of rowid have no special significance and can vary depending on the location of the physical data in the chunk. Your rowid might vary from the example shown. The use of rowid is discussed in detail in the INFORMIX-Universal Server Administrator's Guide.

Query 3-15 uses the rowid and the wildcard asterisk symbol (*) in the SELECT clause to retrieve every row in the manufact table and their corresponding rowids.

Query 3-15

You also can use the rowid when you select a specific column, as Query 3-16 shows.

Query 3-16

Query Result 3-16

You can use the rowid in the WHERE clause to retrieve rows based on their internal record number. This method is handy when no other unique column exists in a table. Query 3-17 uses a rowid from Query 3-16.

Query 3-17

Query 3-17 returns the row that Query Result 3-17 shows.

Query Result 3-17

Using the USER Function
To obtain additional information about a table, you can combine the rowid with the USER function.

Query 3-18 assigns the label username to the USER expression column and returns this information about the cust_calls table.

Query 3-18

For example, if the user zenda used Query 3-18, the output appears as in Query Result 3-18.

Query Result 3-18

You can also use the USER function in a WHERE clause when you select the rowid.

Query 3-19 returns the rowid for only those rows that are inserted or updated by the user who performs the query.

Query 3-19

For example, if the user richc used Query 3-19, the output appears as in Query Result 3-19..

Query Result 3-19

Using the DBSERVERNAME Function
With Universal Server, you can add the DBSERVERNAME keyword (or its synonym, SITENAME) to a query to find out where the current database resides.

Query 3-20 finds the database server name and the user name as well as the rowid and the tabid, which is the serial-interval table identifier for system catalog tables.

Query 3-20

Query 3-20 assigns display labels to the DBSERVERNAME and USER expressions and returns the 10 rows from the systables system catalog table. For example, if user zenda is connected to a server called manatee, the output appears as in Query Result 3-20.

Query Result 3-20

Never store a rowid in a permanent table or attempt to use it as a foreign key because the rowid can change. For example, if a table is dropped and then reloaded from external data, all the rowids are different.

USER and DBSERVERNAME are discussed in Chapter 2, "Composing Simple SELECT Statements."

Outer Joins

Chapter 2, "Composing Simple SELECT Statements," shows how to create and use some simple joins. Whereas a simple join treats two or more joined tables equally, an outer join treats two or more joined tables unsymmetrically. A simple join makes one of the tables dominant (also called preserved) over the other subservient tables.

Outer joins occur in four basic types:

This section discusses these types of outer joins. For full information on their syntax, use, and logic, see the discussion of outer joins in Chapter 1 of the Informix Guide to SQL: Syntax.

In a simple join, the result contains only the combinations of rows from the tables that satisfy the join conditions. Rows that do not satisfy the join conditions are discarded.

In an outer join, the result contains the combinations of rows from the tables that satisfy the join conditions. Rows from the dominant table that would otherwise be discarded are preserved, even though no matching row was found in the subservient table. The dominant-table rows that do not have a matching subservient-table row receive a row of nulls before the selected columns are projected.

An outer join applies conditions to the subservient table while it sequentially applies the join conditions to the rows of the dominant table. The conditions are expressed in a WHERE clause.

An outer join must have a SELECT clause, a FROM clause, and a WHERE clause. To transform a simple join into an outer join, insert the keyword OUTER directly before the name of the subservient tables in the FROM clause. As shown later in this section, you can include the OUTER keyword more than once in your query.

Before you use outer joins heavily, determine whether one or more simple joins can work. You often can get by with a simple join when you do not need supplemental information from other tables.

The examples in this section use table aliases for brevity. Table aliases are discussed in Chapter 2, "Composing Simple SELECT Statements."

Simple Join

Query 3-21 is an example of the type of simple join on the customer and cust_calls tables that is shown in Chapter 2, "Composing Simple SELECT Statements."

Query 3-21

Query 3-21 returns only those rows in which the customer has made a call to customer service, as Query Result 3-21 shows.

Query Result 3-21

Simple Outer Join on Two Tables

Query 3-22 uses the same select list, tables, and comparison condition as the preceding example, but this time it creates a simple outer join.

Query 3-22

The addition of the keyword OUTER in front of the cust_calls table makes it the subservient table. An outer join causes the query to return information on all customers, whether or not they have made calls to customer service. All rows from the dominant customer table are retrieved, and null values are assigned to corresponding rows from the subservient cust_calls table, as Query Result 3-22 shows.

Query Result 3-22

Outer Join for a Simple Join to a Third Table

Query 3-23 shows an outer join that is the result of a simple join to a third table. This second type of outer join is known as a nested simple join.

Query 3-23

Query 3-23 first performs a simple join on the orders and items tables, retrieving information on all orders for items with a manu_code of KAR or SHM. It then performs an outer join to combine this information with data from the dominant customer table. An optional ORDER BY clause reorganizes the data into the form Query Result 3-23 shows.

Query Result 3-23

Outer Join for an Outer Join to a Third Table

Query 3-24 creates an outer join that is the result of an outer join to a third table. This third type is known as a nested outer join.

Query 3-24

Query 3-24 first performs an outer join on the orders and items tables, retrieving information on all orders for items with a manu_code of KAR or SHM. It then performs an outer join, which combines this information with data from the dominant customer table. Query 3-24 preserves order numbers that the previous example eliminated, returning rows for orders that do not contain items with either manufacturer code. An optional ORDER BY clause reorganizes the data, as Query Result 3-24 shows.

Query Result 3-24

You can state the join conditions in two ways when you apply an outer join to the result of an outer join to a third table. The two subservient tables are joined, but you can join the dominant table to either subservient table without affecting the results if the dominant table and the subservient table share a common column.

Outer Join of Two Tables to a Third Table

Query 3-25 shows an outer join that is the result of an outer join of each of two tables to a third table. In this fourth type of outer join, join relationships are possible only between the dominant table and the subservient tables.

Query 3-25

Query 3-25 individually joins the subservient tables orders and cust_calls to the dominant customer table; it does not join the two subservient tables. An INTO TEMP clause selects the results into a temporary table for further manipulation or queries, as Query Result 3-25 shows.

Query Result 3-25

If Query 3-25 had tried to create a join condition between the two subservient tables o and x, as Query 3-26 shows, an error message would have indicated the creation of a two-sided outer join.

Query 3-26




Informix Guide to SQL: Tutorial, version 9.1
Copyright © 1998, Informix Software, Inc. All rights reserved.