Planning the Routine

Extending Informix Dynamic Server 2000
Developing a User-Defined Routine

Planning the Routine

When you write a user-defined routine, consider the following:

Defining a return value (user-defined functions only)

The routine name and routine parameters make up the routine signature for the routine. The routine signature uniquely identifies the user-defined routine in the database. For more information, see The Routine Signature.

Consider the following questions about routine naming and design:

Are any of my routines modal? That is, does the behavior of the routine depend on one of its arguments?

Can I describe what each type and routine does in two sentences?

Do any of my routines take more than three arguments?

Have I used polymorphism effectively?

Naming the Routine

Choose sensible names for your routines. Make the routine name easy to remember and have it succinctly describe what the routine does. The database server supports polymorphism, which allows multiple routines to have the same name. This ability to assign one name to multiple routines is called routine overloading. For more information on routine overloading, refer to Overloading Routines.

Routine overloading is contrary to programming practice in some high-level languages. For example, a C programmer might be tempted to create functions with the following names that return the larger of their arguments:

	bigger_int(integer, integer)
	bigger_real(real, real)

In SQL, these routines are better defined in the following way:

	bigger(integer, integer)
	bigger(real, real)

The naming scheme in the second example allows users to ignore the types of the arguments when they call the routine. They simply remember what the routine does and let the database server choose which routine to call based on the argument types. This feature makes the user-defined routine simpler to use.

Defining Routine Parameters

When you invoke a user-defined routine, you can pass it optional argument values. Each argument value corresponds to a parameter of the routine.

Number of Arguments

Limit the number of arguments in your user-defined routines and make sure that these arguments do not make the routine modal. A modal routine uses a special argument as a sort of flag to determine which of several behaviors it should take. For example, the following statement shows a routine call to compute containment of spatial values:

Containment(polygon, polygon, integer);

This routine determines whether the first polygon contains the second polygon or whether the second contains the first. The caller supplies an integer argument (for example, 1 or 0) to identify which value to compute. This is modal behavior; the mode of the routine changes depending on the value of the third argument.

In the following example, the routine names clearly explain what computation is performed:

Contains(polygon, polygon) 
ContainedBy(polygon, polygon)

Always construct your routines to be nonmodal, as in the second example.

Declaring Routine Parameters

You define routine parameters in a parameter list when you declare the routine. In the parameter list, each parameter provides the name and data type of a value that the routine expects to handle. Routine parameters are optional; you can write a UDR that has no input parameters.

When you invoke the routine, the argument value must have a data type that is compatible with the parameter data type. If the data types are not the same, the database server tries to resolve the differences. For more information, see The Routine-Resolution Process.

The way that you declare a user-defined routine depends on the language in which you write that routine.

The parameters in an SPL routine must be declared with SQL data types. For more information, see Executing an SPL Routine.

For routines written in an external language (C or Java), you use the syntax of the external language to declare the routine. The routine parameters indicate the argument data types that the routine expects to handle.

You declare the routine parameters with data types that the external language supports. However, when you register the routine with CREATE FUNCTION or CREATE PROCEDURE, you use SQL data types for the parameters. (For more information, see Registering Parameters and a Return Value.) Therefore, you must ensure that these external data types are compatible with the SQL data types that the routine registration specifies.

For UDRs written in C, the DataBlade API provides special data types for use with SQL data types. For most of these special data types, you must use the pass-by-reference mechanism. However, for a few data types, you can use the pass-by-value mechanism. For more information, see the chapter in DataBlade API data types in the DataBlade API Programmer's Manual.

Every UDR written in Java maps to an external Java static method whose class resides in a JAR file that has been installed in a database. The SQL-to-Java data type mapping is done according to the JDBC specification. For more information, refer to Creating UDRs in Java.

Returning a Value

A common use of a user-defined routine is to return a value to the calling SQL statement. A user-defined routine that returns a value is called a user-defined function.

For information on how to specify the data type of the return value of a user-defined function, see Registering a User-Defined Routine.

Returning a Variant or Nonvariant Value

By default, a user-defined function is a variant function. A variant function has any of the following characteristics:

It returns different results when it is invoked with the same arguments.

For example, a function whose return value is computed based on the current date or time is a variant function.

It has variant side effects, such as:

modifying some database table, variable state, or external file.

failing to locate an external file, or a table or row in a database, and returning an error.

You can explicitly specify a variant function with the VARIANT keyword. However, because a function is variant by default, this keyword is not required.

A nonvariant function always returns the same value when it receives the same argument, and it has none of the preceding variant side effects. Therefore, nonvariant functions cannot contain SQL statements or access external files. For nonvariant functions, the database server might decide at execution time to cache the return values for expensive functions or to parallelize the query. You can create a functional index only on a nonvariant function. You specify a nonvariant function with the NOT VARIANT keywords.

Within the CREATE FUNCTION statement, you can specify VARIANT or NOT VARIANT in either of these places:

As a routine modifier in the WITH clause

For external routines, as part of the EXTERNAL NAME clause

If you specify these keywords in both of these locations, the two specifications must be the same.

Returning Multiple Values with an OUT Parameter

By definition, an external function returns only one value. However, the database server provides the following methods for external routines to return multiple values:

An OUT parameter and a statement-local variable (SLV) to return multiple values with a single invocation of the function

An ITERATOR routine modifier to return a value multiple times through the automatic repeated execution of the function by the database server

OUT Parameters and Statement-Local Variables

To create an external function that returns more than one value, perform the following actions:

Define the UDR to use an OUT parameter.

Access the returned value within an SQL statement with a statement-local variable (SLV).

Tip: Although this section discusses the use of SLVs in the context of an OUT parameter in an external function, you can use SLVs and OUT parameters in SPL functions.

Defining User-Defined Functions That Have OUT Parameters

An OUT parameter is a function parameter whose argument value the database server passes by reference to the routine. The external function can change the value of this parameter and return a new value through the parameter. Each external function can have at most one OUT parameter.

You can define OUT parameters for the following external routines:

User-defined routines written in C

For information about writing database server functions with OUT parameters, see the chapter on writing database server routines in the DataBlade API Programmer's Manual.

User-defined routines written in Java

For information about writing database server functions in Java, refer to Creating UDRs in Java.

To be called in a WHERE clause, a routine must return a BOOLEAN value. Therefore, only user-defined functions can be called in a WHERE clause.

The OUT parameter does not pass any data into the user-defined routine. At the time the function is called, the SLV does not reference any actual value. The database server passes only a pointer to some space where the function can store a value. The external function cannot retrieve values from this reserved space. By contrast, when a pure C-language program passes a pointer to a function, this pointer can point to a valid value, and the function can examine the current value before it modifies and returns the pointer.

Important: In the user-defined function, do not try to read from the address of the OUT parameter pointer because the value is meaningless.

To return a value from an external routine with an OUT parameter

Write an external routine that includes the OUT parameter in its parameter list.

The OUT parameter must be passed by reference. That is, it must be a pointer to the argument value, not the value itself.
For example, the following declaration of a C-language function allows you to return extra information through the y parameter: mi_integer my_func(int x, int *y);

Register the user-defined routine with the OUT keyword before the appropriate parameter to indicate that the routine handles an OUT parameter.

In the CREATE FUNCTION or CREATE PROCEDURE statement, the OUT keyword indicates that the last parameter is passed as a pointer. The OUT parameter must be the last parameter in the parameter list.
For example, the following statement shows how you might register the my_func() function, which uses the y parameter of the function argument to return extra values:

CREATE FUNCTION my_func(x INT, OUT y INT)
RETURNING INT
EXTERNAL NAME "/usr/lib/local_site.so"
LANGUAGE C

Referencing OUT Parameters of User-Defined Functions

A statement-local variable (SLV) transmits an OUT parameter from a user-defined function to other parts of an SQL statement. An SLV is local to the SQL statement; that is, it is only valid for the life of the SQL statement. It provides a temporary name by which to access the OUT parameter value.

In the SQL statement that calls the user-defined function, declare the SLV with the following syntax:

SLV_name # SLV_type
 



SLV_name

is the name of the SLV variable.


SLV_type

is the data type of the SLV variable

SLV_name	is the name of the `SLV` variable.
SLV_type	is the data type of the `SLV` variable

To obtain OUT parameters from an external routine

Declare an SLV to obtain the OUT parameter value when you invoke the external routine in a WHERE clause of an SQL expression.

Use the SLV in other parts of the SQL statement to access the OUT parameter value.

For example, the following SELECT statement declares an SLV y that is typed as an INTEGER in its WHERE clause and then accesses the y SLV in the WHERE clause:

SELECT ...WHERE my_func(x, y # INT) < 100 AND (y = 3)

For more information on the syntax and use of an SLV, see the description of function expressions within the Expression segment in the Informix Guide to SQL: Syntax.

Returning Values with an Iterator Function

By default, a user-defined function returns only one value; that is, it calculates its return value and returns only once to its calling SQL statement. Although the section OUT Parameters and Statement-Local Variables describes how to return more than one value from a user-defined function, the function still returns all these values at the same time to the calling SQL statement. User-defined functions that return their result in a single return to the calling SQL statement are called noncursor functions because they do not require a database cursor to be executed. For information on how to invoke noncursor functions, see Invoking a User-Defined Routine.

However, you can write a user-defined function that returns to its calling SQL statement several times, each time returning a value. Such a user-defined function is called an iterator function. An iterator function is a cursor function because it must be associated with a cursor when it is executed. The cursor holds the values that the cursor function repeatedly returns to the SQL statement. The calling program can then access the cursor to obtain each returned value, one at a time. The contents of the cursor are called an active set. Each time the iterator function returns a value to the calling SQL statement, it adds one item to the active set.

Tip: An iterator function is similar to an SPL function that contains the RETURN WITH RESUME statement. Creating an Iterator Function

You can write iterator functions in C or in Java:

A C-language iterator function uses DataBlade API functions, such as mi_fp_setisdone() and mi_fp_request(), to handle each return item of the active set.

A Java iterator function uses the UDREnv interface to provide access to the user state.

Registering an Iterator Function

By default, an external function is not an iterator. To define an iterator function, you must register the function with the ITERATOR routine modifier. The following sample CREATE FUNCTION statement shows how to register the function TopK() as an iterator function in C:

CREATE FUNCTION TopK(INTEGER, INTEGER)
	RETURNS INTEGER
	WITH (ITERATOR)
	EXTERNAL NAME
		'/usr/lib/extend/misc/topkterms.so(topk_integers)'
	LANGUAGE C NOT VARIANT

Invoking an Iterator Function

You can invoke an iterator function with the EXECUTE FUNCTION statement in one of the following methods:

Directly with the EXECUTE FUNCTION statement:

From DB-Access

In a prepared cursor in an external routine

In an external routine

In an SPL FOREACH loop

With an EXECUTE FUNCTION statement as part of an INSERT statement:

From DB-Access

In a prepared cursor in ESQL/C or an external routine

In a DataBlade API database server routine

In an SPL FOREACH loop

Adhering to Coding Standards

The SQL/PSM standard is available for UDR development. In addition, Informix publishes a collection of standards for DataBlade module development. These standards are available from the DataBlade Developers Program. The most important rules govern the naming of data types and routines. DataBlade modules share these name spaces, so you must follow the naming guidelines to guarantee that no problems occur when you register multiple DataBlade modules in a single database.

Tip: Informix recommends that you use the DataBlade Developers Kit (DBDK), Version 4.0 or later, to help write UDRs. Although it is possible to create a DataBlade without DBDK, it is not desirable. DBDK enforces standards on DataBlade modules that facilitate migration between different versions of the database server.

In addition, the standards for 64-bit clean implementation, safe function-calling practices, thread-safe development, and platform portability are important. Adherence to these standards ensures that UDR modules are portable across platforms.

Ask yourself the following questions when coding your user-defined routine:

Do I obey all naming standards?

Is my design 64-bit clean and portable across platforms?

Is my design thread-safe?

Extending Informix Dynamic Server 2000Developing a User-Defined Routine