Creating Routines

Home | Previous Page | Next Page Creating DataBlade Objects Using BladeSmith > Creating DataBlade Module Objects >

Creating Routines

You can define public or private user-defined routines that support your DataBlade module. You can specify if the routine is called by SQL or is an internal routine.

Routines can be functions, which return values, or procedures, which do not return values. Routines can be written in the C or Java programming languages or the Informix Stored Procedure Language (SPL).

Use the New Routine wizard to:

Overload existing routines for extended data types.
Existing routines can be built-in or user-defined. Built-in routines include operator and other arithmetic functions, and support routines. For a list of built-in routines you can overload, see IBM Informix: User-Defined Routines and Data Types Developer's Guide.
Overload operators for built-in aggregates for extended data types.
Built-in aggregates include AVG, DISTINCT, MAX, MIN, RANGE, SUM, STDEV, and VARIANCE. For a list of the operators you must overload for each built-in aggregate, see IBM Informix: User-Defined Routines and Data Types Developer's Guide.
Create new user-defined routines for built-in or extended data types.

The following table lists the properties you specify when you create a routine.

Property	Default Value	Description
Routine name	prefixRoutine	The name of the routine. If you are overloading a routine, the name can be an existing routine name; otherwise, it must be a unique name. See Routine Name for more information.
Return type	No return type	The data type that is returned by the routine. Functions return a value, but procedures do not.
Statement local variable? (Available for routines that return values only)	No	Whether the last argument passed to the function is an OUT parameter for a statement local variable, allowing the function to return two values. See Statement Local Variables for more information.
Arguments	None	The name, data type, and default value of each argument passed to the routine. See Routine Arguments for more information.
Language	C	The programming language in which to write the routine: C, Java, or Stored Procedure Language (SPL). You must set server compatibility to 9.2 or later to generate code for Java projects. You need the J/Foundation upgrade to IBM Informix Dynamic Server to enable Java services.
SQL routine body (SPL routines)	None	The SPL statements that define the routine. See the IBM Informix: Guide to SQL Syntax for more information on SPL.
Does not accept null values? (C and Java routines)	Yes	Whether the routine accepts null values. If a routine that does not accept nulls is passed a null value, the database server returns a null value without calling the routine.
Is variant? (C and Java routines)	Yes	Variant routines can return different values with the same input arguments. The database server never caches results from variant routines. See Variant Functions for more information.
Is parallelizable? (C and Java routines)	No	Parallelizable routines can be split into subqueries and processed in parallel. See Parallelizable Routines for more information.
Is a DBA routine? (C and Java routines)	No	The routine can be created or executed only by a user with DBA permissions.
Never called from SQL? (C routines)	No	If a routine cannot be called from SQL, it is an internal routine that can only be called directly by the database server: for example, primary access method routines.
An iterator? (C and Java routines)	No	Iterator routines return a set of values, one value at a time. See the IBM Informix: DataBlade API Programmer's Guide for more information.
C routine name (C routines)	prefixRoutine	The name of the routine in the shared object file. Must be unique. See C Routine Name for more information.
Shared object path (C and Java routines)	$INFORMIXDIR/ extend/%SYSBLDDIR%/ project.bld (C routines) %JAVAPATH% (Java routines)	The relative or absolute path and filename of the shared object. The default path and filename is recommended.
Well behaved or poorly behaved? (C and Java routines)	Well behaved	Well-behaved routines can run in the CPU virtual processors; poorly behaved routines should run in a user-defined virtual processor. See Routine Behavior for more information.
User-defined virtual processor class (C and Java routines)	default_class	The name of the user-defined virtual processor class in which a poorly behaved routine runs. See User-Defined Virtual Processor Class Name for more information.
Special stack size requirements? (C routines)	No	Whether the routine needs an unusually large amount of virtual shared memory to execute. See Stack Size for more information.
Cost of routine (C routines)	0	The relative cost of the routine, for query optimization. See Cost of Routine for more information.
Negator routine?	No	A routine that returns the opposite Boolean result with the same arguments. Used for query optimization. See Related Routines for more information.
Commutator routine? (C routines)	No	A routine that returns the same Boolean result with the arguments in reverse order. Used for query optimization. See Related Routines for more information.
Selectivity routine? (C routines)	No	A routine that estimates the percentage of rows returned by the routine. Used for query optimization. See Related Routines for more information.

The SQL that BladeSmith generates for routines uses the ALTER FUNCTION statement to specify all but the following properties:

Routine name
Return type
Statement local variable
Arguments
Language
DBA routine

Using the ALTER FUNCTION statement allows BladeManager to re-register the routine without dropping and re-creating it.

The following sections describe properties of routines.

Routine Name

Specify the name of an existing routine to overload it for a new data type, or specify a unique routine name to create a new routine.

If you are creating a selectivity routine for a user-defined routine, name the selectivity routine RoutineSelectivity.

You can overload built-in operator and other arithmetic routine for collection, row, and distinct data types. (You can overload most arithmetic routines for opaque data types with the New Opaque Type wizard.) How arithmetic routines operate on collection and row data types is determined by the code you write for them. For example, if you overload the Plus() function for a row data type, it might either:

Add the values of the fields and return a row data type with the same number of fields as the originals.
Return a row data type with twice as many fields as the originals.

See IBM Informix: User-Defined Routines and Data Types Developer's Guide for a list of built-in routines you can overload.

Although it is not necessary, you can create new support routines for collection, row, and distinct data types.

Statement Local Variables

If you want your function to return two values, check the statement local variable check box. The last argument for your function is then defined as an OUT parameter. The OUT parameter corresponds to a value the function returns indirectly, through a pointer, to a statement local variable (SLV). The value the function returns through the pointer is an extra value, in addition to the value it returns explicitly.

The SLV provides a temporary name that a single statement can manipulate. An SQL statement uses each SLV to transmit the output from a single function to other parts of the SQL statement.

See the IBM Informix: DataBlade API Programmer's Guide for more information.

Routine Arguments

A routine can accept 0 to 20 arguments.

Arguments passed to a routine have the following properties:

Name. The name must follow SPL, C, or Java language naming conventions.
Data type. For SPL and C, any existing data type that appears on the list; for Java, any existing data type except row or collection data types. If you want to use a data type that is not on the list, you must first create it in BladeSmith. For more information on SQL data types, see IBM Informix: Guide to SQL Reference.
Default value. Optional. The value of the argument if a value for the argument is not specified when the routine is called.

Variant Functions

By default, user-defined functions are variant. Variant functions can return different values or have varying side effects, given the same arguments. For example, a function that returns the current date or time is a variant function. However, a function that appears nonvariant can also have varying side effects, such as updating a table or external file.

The cost of defining a nonvariant function as variant is low: you might experience slightly diminished performance. However, the cost of defining a function that exhibits variant behavior as nonvariant can be high, because a query might return incorrect results.

Most functions are not variant; marking them as nonvariant improves performance. If the function is nonvariant, the database server might cache the return values of expensive functions or run parallel queries. Functional indexes are only allowed on nonvariant functions.

See the IBM Informix: User-Defined Routines and Data Types Developer's Guide manual for more information.

Parallelizable Routines

Mark a routine as parallelizable if it can be executed within a parallel database query (PDQ) statement. PDQ statements allow the Informix database server to distribute the executions of one query among several processors by dividing the query into subqueries. The database server then allocates subqueries to separate threads for parallel processing and thus improves performance. See IBM Informix: User-Defined Routines and Data Types Developer's Guide for more information about using the parallelizable option.

Use routine parallelization if your routine is used as an expression in qualification clauses, in GROUP BY lists, or as an overloaded comparison operator.

A routine cannot be parallelizable if it accepts row or collection data types as arguments.

C and Java routines are parallelizable if they call only the DataBlade API routines listed in the following categories from the IBM Informix: DataBlade API Programmer's Guide:

Data handling, except for collection manipulation functions
Session, thread, and transaction management
Function execution
Memory management
Exception handling
Callbacks
Miscellaneous

See the IBM Informix: DataBlade API Programmer's Guide for more information about the routines under each category.

C Routine Name

SQL allows overloading of routine names; however, the C language does not. Therefore, if you overload a routine, you must give it a unique C name.

Routine Behavior

A routine is well-behaved within the context of Informix database server architecture if it:

Yields the virtual processor on a regular basis to other threads.
Does not use blocking operating-system calls.

If your routine violates one of these conditions, mark it as poorly behaved and type the name of a user-defined virtual processor in the user-defined virtual processor class field.

User-Defined Virtual Processor Class Name

The name of the grouping class for the user-defined virtual processor must be 128 alphanumeric characters or fewer, and it must be unique. The class name is case insensitive. It is recommended that you begin the name of your virtual processor class with your DataBlade module new object prefix.

Tip:

You can create a routine that references a virtual processor class before that class exists. However, you must create the virtual processor class and create virtual processors in it before you register your DataBlade module in the database.

Stack Size

You can specify stack size only for a user-defined routine written in C.

Stack space is allocated from a common region in shared memory that can be overrun if a routine consumes more stack space than is allocated for it. To avoid stack overrun:

allocate sufficient stack space for all the local variables in the routine. Monitor stack usage with the onstat utility. See the IBM Informix: Administrator's Guide for more information on the onstat utility.
execute recursively called C routines with the mi_call() function. Pass any arguments other than mi_integer data types by reference. See the IBM Informix: DataBlade API Programmer's Guide for more information on mi_call().

When you specify a stack size for a user-defined routine, the database server allocates the specified amount of memory for every execution iteration of the routine.

See the IBM Informix: Dynamic Server Administrator's Guide for more information on stacks.

Cost of Routine

You can specify cost only for user-defined routines written in C.

The relative cost of the routine is used by the query optimizer to determine the order in which to process WHERE clauses in a SELECT statement. Expensive routines are called after inexpensive routines. A cost of 0 indicates that the routine costs about the same as the routines in the reference list that have a cost of 0. The reference list shows all user-defined routines created in the project. The standard formula for computing routine cost is:

lines_of_code + (I/O_operations x 100)

Because the optimizer compares routine costs, the actual cost is irrelevant; only the relative cost matters. However, follow the general formula to ensure that your routines interact with other DataBlade module routines in a predictable way.

Related Routines

If your user-defined function compares or acts as a filter for two instances of the same data type and returns a Boolean result, you can specify related functions to optimize the execution of the function when it is called in the WHERE clause of a SELECT statement.

Important:

Related functions must exist before you can choose them. You can create them before you create the function they are related to, or you can update the original routine to add related routines after you create them.

Commutator and Negator Functions

You can specify a commutator function only for user-defined routines written in C.

The database server calls a commutator or a negator function instead of the original function if the query optimizer determines that it is faster. A commutator function returns the same Boolean result as the original function with the same arguments but with the arguments in reverse order. A negator returns the opposite Boolean result as the original function with the same arguments in the same order.

Selectivity Functions

You can specify a selectivity function only for user-defined routines written in C.

A selectivity function estimates the percentage of rows that might be returned by your function, given a set of arguments. Define a selectivity function if you want to determine the cost of your function so that the query optimizer can determine when it is most efficient to call your function. Selectivity functions determine the cost of a function with statistics gathered about the values of the data type on which the function operates. See User-Defined Statistics for more information on selectivity functions and how they process user-defined statistics.

You can create a selectivity function for your user-defined function if your function compares or acts as a filter for two values of the following kind of data types:

An opaque data type for which you have created user-defined statistics support routines (see Statistics Support)
A built-in data type

The B-tree functions Equal() and NotEqual() that are overloaded for an opaque data type are good candidates for selectivity functions. Because the Equal() and NotEqual() functions are created with the Opaque Type wizard, you must add selectivity support by assigning selectivity routines on their properties pages after you create them.

Built-in data types have built-in statistics support routines, and all qualifying built-in functions (such as B-tree functions) have built-in selectivity functions. You can only create selectivity functions for functions that take built-in data types if those functions are user-defined.

A selectivity function must have the following properties:

A name in the form of FunctionSelectivity(), where Function is the name of the function to which it is assigned
A double-precision return type
Two arguments of type POINTER
All other properties as their default values

For a description of user-defined statistics and selectivity, see User-Defined Statistics.