Coding the User-Defined Routine

Informix DataBlade API Programmer's Manual
Writing a User-Defined Routine

Coding the User-Defined Routine

When you write your C UDR, you write a C function. This section provides information about how to:

define a return value (for a user-defined function).

Defining Routine Parameters

When the routine manager invokes a C UDR, it passes to it any argument values that the SQL statement provided. When you write a C UDR, you can define routine parameters that indicate the data types of the arguments that the UDR expects to handle.

This section provides information about how to define routine parameters for the following UDR arguments:

Parameters for MI_DATUM arguments

Parameter for the MI_FPARAM argument, which the routine manager passes in to every UDR

Tip: Routine arguments are optional. However, if your UDR does not require arguments, you must still declare an MI_FPARAM parameter in the C-function declaration. For more information, see For the MI_FPARAM Argument.

The routine manager uses the parameter data types in routine resolution. Therefore, you can have multiple UDRs with the same name as long as their parameter lists uniquely identify them. For more information, see Routine Resolution.

For MI_DATUM Arguments

When the SQL statement invokes a UDR, it can specify column values or expressions to pass to the UDR. The routine manager passes these argument values to a user-defined routine as MI_DATUM values. The data type of the argument determines the passing mechanism that the routine manager uses for the argument value, as follows:

Most data types cannot fit into an MI_DATUM and are passed by reference.

A few data types can fit into an MI_DATUM and are by value.

Tip: For more information about how the routine manager passes argument values to a C UDR, see Pushing Arguments Onto the Stack.

The passing mechanism that the routine manager uses for a particular argument determines how you must declare the corresponding parameter of the UDR.

Pass-By-Reference Arguments

When an argument has a data type that cannot fit into an MI_DATUM, the routine manager passes the argument by reference. For these pass-by-reference arguments, you declare a parameter that is a pointer to the parameter data type in the C-function declaration.

Figure 12-1 shows the bigger_double() user-defined function, which compares two mi_double_precision values. Because the routine manager passes mi_double_precision values by reference, bigger_double() declares the two parameters as pointers to the mi_double_precision data type.


	Figure 12-1 Passing Arguments By Reference

Important: Memory that the routine manager allocates to pass an argument by reference has a PER_ROUTINE memory duration. Therefore, it is guaranteed to be valid only for the duration of the UDR. The database server automatically frees this memory when the UDR completes.

Any C-language code that calls bigger_double() must pass the mi_double_precision values by reference, as in the following sample call:

mi_double_precision double1, double2, *result;

result = mi_alloc(size_of(mi_double_precision));
double1 = 13497.931669;
double2 = 235521832.00484;
result = bigger_double(&double1, &double2);

Values passed into a user-defined routine are often also used in other places in the SQL statement. If your UDR modifies a pass-by-reference value, this modified value is what any successive routines in the SQL statement that operate on this value use. Therefore, it might make a difference when your UDR is run within the context of an SQL statement because any routine that runs before it can see (and possibly modify) any pass-by-reference values.

Tip: Informix recommends avoiding the modification of a pass-by-reference argument within a C UDR. For more information, see Modifying Argument Values. Pass-By-Value Parameters

When an argument has a data type that can fit into an MI_DATUM, the routine manager passes the argument by value. For these pass-by-value arguments, you declare a parameter as the actual parameter data type in the C-function declaration.

Figure 12-2 shows the bigger_int() UDR, which compares two mi_integer values. Because the routine manager passes mi_integer values by value, the UDR declares the two parameters with the mi_integer data type, not as pointers to mi_integer.


	Figure 12-2 Passing Arguments By Value

Any C-language code that calls bigger_int() must also pass the mi_integer values by value, as in the following sample call:

mi_integer int1, int2, result;
...
int1 = 6;
in2 = 8;
result = bigger_int(int1, int2);

For the MI_FPARAM Argument

The routine manager passes an MI_FPARAM structure into every UDR that it executes. This structure contains routine-state information about the UDR, such as information about arguments and return values. Because the routine manager automatically passes an MI_FPARAM structure to a UDR, you do not need to explicitly declare this structure in most C-function declarations.

However, you must include an MI_FPARAM declaration in the C-function declaration in the following cases:

You need to access routine-information within the UDR.

When you declare an MI_FPARAM parameter, this declaration must be the last parameter in the C declaration of your UDR. For more information about the DataBlade API functions that access the routine information from MI_FPARAM see Accessing the Routine State with MI_FPARAM.

You declare a UDR that does not take any arguments.

A C UDR always takes at least one argument: a pointer to the MI_FPARAM structure. When the parameter list of your SQL UDR is empty, you must still include a declaration for the MI_FPARAM structure, even if the UDR does not access routine-state information.

Tip: When you register the user-defined routine, omit the MI_FPARAM parameter from the parameter list in the CREATE FUNCTION or CREATE PROCEDURE statement.

For example, the bigger_double() user-defined function in Figure 12-1 on page 12-5 does not need to include a declaration for the MI_FPARAM structure because it does not need to access routine-state information and it has other parameters. However, suppose you register the func_noargs() user-defined function, which does not require arguments:

CREATE FUNCTION func_noargs() RETURNS INTEGER
	EXTERNAL NAME '/usr/lib/udrs/udrs.so' LANGUAGE C;

In the C UDR, you must declare the func_noargs() function with a single parameter, a pointer to the MI_FPARAM structure:

mi_integer func_noargs(MI_FPARAM *fparam)
{
...
}

The declaration of the MI_FPARAM structure allows the routine manager to pass this structure into the UDR.

Obtaining Argument Values

To obtain the argument value with a C UDR, access the parameter that you have specified in the C declaration of the function. The parameter declaration indicates the appropriate passing mechanism for the UDR parameters. You can access the argument values through these declarations, as you would any other C-function parameter, as follows:

For pass-by-reference parameters, access the argument value through its pointer. Do not modify this pointer within the body of the UDR.

Most data types are passed by reference. The sample UDR bigger_double(), in Figure 12-1 on page 12-5, shows how to access pass-by-reference arguments within a UDR.

For pass-by-value parameters, you can access the argument value directly through its parameter variable.

For a list of data types that can be returned by value, see Figure 2-18 on page 2-48. The sample UDR bigger_int(), in Figure 12-2 on page 12-6, shows how to access pass-by-value arguments within a UDR.

Tip: You can obtain information about an argument, such as its type and length, from the MI_FPARAM structure. For more information, see Checking Routine Arguments.

This section provides the following information about how to obtain routine arguments:

How to handle character arguments

How to handle arguments that contain the SQL NULL value

When you can modify routine arguments

Handling Character Arguments

When the routine manager receives text data for a C UDR, it puts this text data into an mi_lvarchar varying-length structure. It then passes a pointer to this mi_lvarchar structure as the MI_DATUM for the UDR argument. Therefore, a C UDR must have its text parameter declared as a pointer to an mi_lvarchar when the parameter accepts data from the following SQL character data types:

CHAR

VARCHAR

LVARCHAR

TEXT

You must also use mi_lvarchar if your user-defined routine handles locale-sensitive character data types (NCHAR or NVARCHAR). For more information on the NCHAR and NVARCHAR data types, see the Informix Guide to GLS Functionality.

Important: These SQL data types can be represented as null-terminated strings even though a C UDR never receives a null-terminated string as an argument. Do not code a C UDR to receive null-terminated strings as arguments. For more information on how to access mi_lvarchar, see Varying-Length Data Structures.

For example, suppose you want to define a user-defined function called initial_cap() that accepts a VARCHAR string and ensures that the string begins with an uppercase letter and is followed by lowercase letters. This UDR would be useful in the following query to check the capitalization of a customer last name:

SELECT customer_num
FROM customer
WHERE initial_cap(lname) = "Sadler"

In the preceding query, use of the initial_cap() function means that you do not have to ensure that the customer last names (in the lname column) were entered with an initial uppercase letter. The preceding query would locate customer numbers for customers whose last names where either Sadler or sadler.

The following CREATE FUNCTION statement registers the initial_cap() function in the database:

CREATE FUNCTION initial_cap(str VARCHAR(50))
RETURNS VARCHAR(50)
EXTERNAL NAME '/usr/udrs/text/checkcaps.so'
LANGUAGE C

The following declaration of initial_cap() specifies an mi_lvarchar pointer as the parameter data type even though the function is registered to accept a VARCHAR-column value:

/* Valid C UDR declaration for string parameter */
mi_lvarchar *initial_cap(str)
	mi_lvarchar *str;

The following declaration of initial_cap() is invalid because it specifies an mi_string pointer as the parameter data type:

/* INVALID declaration for string parameter */
mi_string *initial_caps(strng)
	mi_string *strng;

The initial_caps() function in the preceding declaration would not execute correctly because it interprets its argument as an mi_string when the routine manager actually sends this argument as an mi_lvarchar.

Figure 12-3 shows the implementation of the initial_cap() function.


	Figure 12-3 Handling Character Data in a User-Defined Routine

Tip: A C UDR that returns data for one of the SQL character data types must return a pointer to an mi_lvarchar. The initial_cap() function returns a varying-length structure to hold the initial-capital string. For more information, see Returning Character Values.

Handling NULL Arguments

By default, a C UDR does not handle SQL NULL values. When you call a UDR with an SQL NULL as the argument, the routine manager does not invoke the UDR. It returns a value of SQL NULL for the UDR. To have the UDR invoked when it is called with SQL NULL arguments, register the UDR with the HANDLESNULLS routine modifier and code the UDR to take special steps when it receives a NULL argument.

To determine whether an argument is SQL NULL, pass the MI_FPARAM structure as the last argument in the UDR and use the mi_fp_argisnull() function to check for NULL argument values. Do not just compare the argument with a NULL-valued pointer. For more information, see Handling NULL Arguments With MI_FPARAM.

Modifying Argument Values

Do not modify a UDR argument unless it is an OUT parameter. The routine manager does not make routine-specific copies of the arguments that it passes to UDRs because it is more efficient not to do so. Keep in mind that values passed into a UDR are often used on other places in the SQL statement. If you modify a pass-by-reference value within the UDR, you also modify it for all other parts of the SQL statement (including other UDRs) that operate on the value after the UDR executes. When you modify a pass-by-reference argument within the UDR, you might create an order-dependent result of the SQL statement. That is, it now might make a difference when, within the SQL statement, your UDR is run.

Defining a Return Value

When you declare a C user-defined routine, you specify the routine return value, as follows:

For a user-defined function, the C declaration specifies the data type that the UDR returns.

For a user-defined procedure, the C declaration specifies the void data type as a return value.

Important: A C user-defined function can only return one value.

This section provides the following information about how to handle a return value for a C user-defined function:

How to return a single value from a user-defined function

How to return multiple values from a user-defined function

Returning a Value

When a user-defined function completes, the routine manager returns its value as an MI_DATUM value. The data type of the return value determines the passing mechanism that the routine manager uses for the value, as follows:

Most data types cannot fit into an MI_DATUM and are passed by reference.

A few data types can fit into an MI_DATUM and are by value.

The passing mechanism that the routine manager uses for a particular return value determines how you must declare it in the user-defined function, as follows.


Return-Value Data Type	Tasks to Return the Value
For data types that cannot fit into the `MI_DATUM` structure	Return the value by reference: Declare a local variable that is a pointer to the actual return value. Allocate the memory for the return value with the `PER_ROUTINE` memory duration. Use a DataBlade API memory-management function. For more information, see Managing User Memory. Assign the address of this memory to a local variable. Store the return value in this memory. Return the pointer to this memory as the return value.
For data types that can fit into the `MI_DATUM` structure	Can return the value by value: Declare a local variable to hold the actual return value. Store the return value in this local variable. Return the local variable as the return value.

A user-defined function can return a pass-by-reference argument that was not initialized by the caller of the user-defined function. However, it must be clear to the caller of user-defined function that this function expects to receive an uninitialized pointer as an argument. The user-defined function can receive the uninitialized pointer as a pass-by-reference argument; initialize this pointer to memory that it allocates, and return the pointer by reference.

Important: A user-defined function cannot return an automatic or local variable if its data type cannot be returned by value. That is, any automatic or local variables with data types that cannot fit into MI_DATUM cannot be returned by value from the UDR.

To set the return value, access the automatic or local variable that you declared in the user-defined function. You can access the return value through this local variable, as you would any other C-function variable, as follows:

For pass-by-reference return values, access the return value through its pointer.

Most data types are passed by reference. The sample UDR bigger_double(), in Figure 12-1 on page 12-5, shows how to return an mi_double_precision value by reference. It allocates PER_ROUTINE memory for the return value, which the database server deallocates when the user-defined function completes.

For pass-by-value return values, you can access the return value directly through its variable.

For a list of data types that can be returned by value, see Figure 2-18 on page 2-48. The sample UDR bigger_int(), in Figure 12-2 on page 12-6, shows how to return an mi_integer value by value.

Tip: You can obtain information about a return value, such as its type and length, from the MI_FPARAM structure. For more information, see Accessing Return-Value Information.

Returning a NULL Value

To return an SQL NULL value from a user-defined function, pass the MI_FPARAM structure as the last argument in the UDR and use the mi_fp_setreturnisnull() function to set the NULL value in this MI_FPARAM structure. You must call the mi_fp_setreturnisnull() function with MI_TRUE before your UDR completes. If you do not, you might receive an incorrect result from the UDR. Do not just return a NULL-valued pointer. For more information, see Returning a NULL Value.

Returning Character Values

The routine manager handles all text return values from a C UDR as mi_lvarchar values. Therefore, a C UDR must declare its return value as a pointer to an mi_lvarchar when it returns data for any of the following SQL character data types:

CHAR

VARCHAR

LVARCHAR

TEXT

Important: Even though these SQL data types can be represented as null-terminated strings, do not code a C UDR to return a null-terminated string. For more information on how to access mi_lvarchar, see Varying-Length Data Structures.

For example, the initial_cap() function in Figure 12-3 on page 12-11 can ensure that names are entered with an initial uppercase letter followed by lowercase letters. This UDR would be useful in the following query to ensure consistent capitalization of the customer last name:

INSERT INTO customer(customer_num, lname, fname)
VALUES (0, initial_cap("ANDERSON"), initial_cap("TASHI"))

The calls to initial_cap() in this INSERT statement convert the last and first names of this customer as Anderson and Tashi, respectively.

Figure 12-3 on page 12-11 shows the following declaration for initial_cap():

/* Valid C UDR declaration for string return value */
mi_lvarchar *initial_cap(str)
	mi_lvarchar *str;

This declaration correctly specifies an mi_lvarchar pointer as the return type so that the function can return the VARCHAR value. The following declaration of initial_cap() is invalid because it specifies an mi_string pointer as the return type:

/* INVALID declaration for string return value */
mi_string *initial_cap(strng)
	mi_lvarchar *strng;

The initial_cap() function in the preceding declaration would not return the expected value because the routine manager interprets the mi_string that the UDR returns as an mi_lvarchar.

Tip: A C UDR that accepts data for one of these SQL text data types must also declare its parameters as mi_lvarchar pointers. For more information, see Handling Character Arguments.

Returning Multiple Values

Unlike an SPL routine, a C user-defined function can directly return at most one value. However, a user-defined function can return multiple values when you use the following features together:

An OUT parameter in the user-defined function

A statement local variable (SLV) in the SQL statement that calls the user-defined function

OUT parameters and SLVs allow a user-defined function to return a second value to the calling SQL statement.

Tip: Although this section discusses the use of SLVs and OUT parameters in the context of a C user-defined function, you can use SLVs and OUT parameters in SPL functions. A user-defined procedure (external or SPL) cannot use an OUT parameter because a UDR with an OUT parameter must be called in the WHERE clause of an SQL statement. To be called in a WHERE clause, a UDR must return a BOOLEAN value. Therefore, only user-defined functions can be called in a WHERE clause. For general information on how to use an OUT parameter, see the discussion of how to return multiple values from external functions in "Extending Informix Dynamic Server 2000."

An iterator function is a special-purpose user-defined function that can return multiple values, one value per iteration of the function. For more information, see Writing an Iterator Function.

Using an OUT Parameter

An OUT parameter is a routine argument that is passed by reference to the C user-defined function. A C user-defined function can use an OUT parameter to return a value indirectly. The database server allocates storage for the OUT parameter and passes a pointer to that storage to the UDR.

For a C user-defined function to receive an OUT parameter, it must:

Declare the OUT parameter as the last parameter in the parameter list (before any MI_FPARAM structure).

Declare the OUT parameter as a pointer to the appropriate data type.

The size of the OUT parameter must be the size of an MI_DATUM. The passing mechanism for the parameter must be pass by reference regardless of the data type that you pass back.

Set the OUT parameter in the return-value array of the MI_FPARAM structure to indicate a non-NULL value.

DataBlade API modules often use OUT parameters for Boolean functions to return rank or scoring information (which can indicate how closely the return result matched the query criteria). For example, Figure 12-4 shows a C UDR, called out_test(), that does not actually search a particular title for a string, but returns a 100 percent relative weight of match success as an OUT parameter.


	Figure 12-4 The out_test() User-Defined Function

In Figure 12-4, the call to the mi_fp_setargisnull() function sets the third argument, which is the OUT parameter, to MI_FALSE, which indicates that the argument does not contain an SQL NULL value. The MI_FPARAM structure stores routine arguments in zero-based arrays. Therefore, the mi_fp_setargisnull() function specifies a position of 2 to access the third argument.

Tip: For more information on how to access the MI_FPARAM structure, see Accessing the Routine State with MI_FPARAM.

When you register the user-defined function, precede the OUT parameter with the OUT keyword. For example, the following CREATE FUNCTION statement registers the out_test() function (which Figure 12-4 defines):

CREATE FUNCTION out_test(doc LVARCHAR, 
						query VARCHAR(120), 
						OUT weight INTEGER)
	RETURNING BOOLEAN
	EXTERNAL NAME '/usr/udrs/udrs.so'
	LANGUAGE C

Using the Statement-Local Variable

When you call a user-defined function that has an OUT parameter, you must declare a statement-local variable (SLV) in the WHERE clause of the SQL statement. The SLV holds the value that the OUT parameter returns. Other parts of the SQL statement can then access the OUT parameter value through the SLV.

For example, the following SELECT statement calls the out_test() function (which Figure 12-4 on page 12-18 defines) and saves the result of the OUT parameter in a statement-local variable called weight:

SELECT title, weight FROM mytab

	WHERE out_test(title, 'aaa', weight # INTEGER)

The SELECT statement specifies the statement-local variable in its select list so it can return the value of the OUT parameter.

For more information on the syntax and use of SLVs, see the description of how to return multiple values from a function in Extending Informix Dynamic Server 2000.