Creating a User-Defined Aggregate

Home | Previous Page | Next Page Creating User-Defined Routines > Creating Special-Purpose UDRs > Writing an Aggregate Function >

Creating a User-Defined Aggregate

The built-in aggregates provide basic aggregations. However, if your data requires some special aggregation, you can create a custom aggregate function, called a user-defined aggregate. To implement your custom aggregation, you design an aggregate algorithm, which consists of the following parts:

The aggregate state, which contains the information that needs to be passed between iterations of the aggregate
The aggregation tasks, which are implemented as special-purpose user-defined functions called aggregate support functions

To implement a user-defined aggregate function with C user-defined functions

Determine the content and data type of the aggregate state.
Write the C UDRs that implement the required aggregate support functions for the data type on which you want to implement the user-defined aggregate.
Define the user-defined aggregate in the database with the CREATE AGGREGATE and CREATE FUNCTION statements.

After you complete these steps, you can use the aggregate in an SQL statement.

The following sections describe each of these development steps in more detail and use the SQSUM1 user-defined aggregate (which Table 103 describes) as an example.

Table 103. A Sample User-Defined Aggregate
User-Defined Aggregate	Description	Definition
SQSUM1	Sums all values and calculates the square of this sum	(x₁+ x₂ + x₃+ ... )²

Determining the Aggregate State

An aggregate is a series of iterations. Each iteration processes one aggregate argument (which contains one column value) and performs the necessary computations to merge it into a partial result. The partial result is a snapshot of the aggregate arguments that the aggregate has merged so far. Once the aggregate has received all column values, it returns a value to the calling statement, based on the final partial result.

Each iteration of the aggregate is a separate invocation of a user-defined function. If user-allocated memory has the default PER_ROUTINE memory duration, the database server automatically deallocates it after only one iteration of the iteration function. (For more information, see Choosing the Memory Duration.) Therefore, while an iteration executes, it can access only the following information:

Its own local variables, which are deallocated at the end of each iteration and therefore unavailable to other iterations
The aggregate argument, which contains a new column value for each iteration
The aggregate state, which contains any nonlocal information that the iteration needs to perform its merge, including the partial result and any other external information (such as an operating-system file) that an iteration might need to access

During its invocation, each aggregate iteration merges the aggregate argument into the aggregate state and returns the updated state. The database server preserves the updated aggregate state and passes it into the next iteration of the aggregate. When you create a user-defined aggregate, you must determine what nonlocal information each iteration needs and then define an aggregate state to contain this information. Otherwise, the aggregate iterations cannot obtain the information they need to perform their computations.

Important:

Design the aggregate state so that each aggregate support function can obtain all the state information that it needs.

As a starting point, try using the data type of the aggregate argument for the aggregate state. Such a state is called a simple state. For more information, see Aggregate Support Functions for the Aggregate State.

For example, to determine the state for the SQSUM1 user-defined aggregate (which Table 103 describes), assess what tasks need to be performed in each iteration of SQSUM1. For each aggregate argument, SQSUM1 needs to add together the following values:

The aggregate argument, which is passed to each iteration of the aggregate
The partial sum of previous argument values, which must exist in the aggregate state

The data type that you choose for the aggregate state affects how the state must be managed. When the SQSUM1 aggregate receives INTEGER values as its aggregate arguments, the sum of these values is also an INTEGER value. Therefore, the SQSUM1 aggregate has an integer state, which holds the partial sum.

Writing the Aggregate Support Functions

An aggregate support function is a special-purpose user-defined function that implements some task in the aggregate algorithm. You write aggregate support functions to initialize, calculate, and return the aggregate result to the calling code. An aggregate algorithm can include the following kinds of aggregate support functions.

Aggregate Support Function	Algorithm Step
INIT	Possible initialization tasks that must be performed before the iterations can begin
ITER	An iteration step, which is performed on each aggregate argument and merges this argument into a partial result
COMBINE	Merging one partial result with another partial result, thus allowing parallel execution of the user-defined aggregate
FINAL	Post-iteration tasks that must be performed after all aggregate arguments were merged into a partial result

The following sections summarize each of the aggregate support functions.

INIT Function

The INIT aggregate support function performs the initialization for the user-defined aggregate. Table 104 summarizes possible initialization tasks.

Table 104. Initialization Tasks for the INIT Aggregate Support Function
Initialization Task	More Information
Set-up for any additional resources outside the state that the aggregation might need	Aggregate Support Functions That the Algorithm Requires
Initial calculations that the user-defined aggregate might need	Aggregate Support Functions That the Algorithm Requires
Allocation and initialization of the aggregate state that the rest of the aggregation computation might need	Aggregate Support Functions for the Aggregate State
Handling of an optional set-up argument	Implementing a Set-Up Argument

The INIT support function is an optional aggregate support function. If your aggregate algorithm does not require any of the tasks in Table 104, you do not need to define an INIT function. When you omit the INIT function from your user-defined aggregate, the database server performs the state management for the aggregate state. For more information, see Handling a Simple State.

To declare an INIT support function as a C function, use the following syntax:

agg_state init_func(dummy_arg, set_up_arg)
   agg_arg_type dummy_arg;
   set_up_type set_up_arg; /* optional */

agg_state: is the data type of the aggregate state.
dummy_arg: is a dummy parameter that has the same data type as the aggregate argument that this aggregate support function is to handle for the user-defined aggregate.
init_func: is the name of the INIT aggregate support function.
set_up_arg: is an optional parameter for the set-up argument. For more information, see Implementing a Set-Up Argument.

In the execution of a UDA, the database server calls the INIT function before it begins the actual aggregation computation. It passes in any optional set-up argument (set_up_arg) from the user-defined aggregate and copies any initialized aggregate state that INIT returns into the state buffer. For more information on the state buffer, see Aggregate Support Functions for the Aggregate State.

The first argument of the INIT function serves only to identify the data type of the aggregate argument that the user-defined aggregate handles. The routine manager uses this argument in routine resolution to determine the correct version of the overloaded INIT function. At the time of invocation, the routine manager just passes in a NULL value for the first argument of the INIT function, as the following syntax shows:

agg_state init_func(NULL, optional set-up argument)

Tip:

For more information on how the routine manager resolves the overloaded aggregate support functions, see the chapter on aggregates in the IBM Informix: User-Defined Routines and Data Types Developer's Guide.

Figure 96 shows the INIT aggregate support function that handles an INTEGER argument for the SQSUM1 user-defined aggregate (which Table 103 describes).

Figure 96. INIT Aggregate Support Function for SQSUM1 on INTEGER

/* SQSUM1 INIT support function on INTEGER */
mi_integer init_sqsum1(dummy_arg)
   mi_integer dummy_arg;
{
   return (0);
}

For other aggregate support functions of SQSUM1, see Figure 97, Figure 98, and Figure 99.

ITER Function

The ITER aggregate support function performs the sequential aggregation or iteration for the user-defined aggregation. It merges a single aggregate argument into the partial result, which the aggregate state contains. The ITER function is a required aggregate support function; that is, you must define an ITER function for every user-defined aggregate. If you do not define an ITER function for your user-defined aggregate, the database server generates an error.

Tip:

If the UDA does not have an INIT support function, the ITER support function can initialize the aggregate state. For more information, see Handling a Simple State.

If the UDA was registered with the HANDLESNULLS modifier in the CREATE AGGREGATE statement, the database server calls the ITER support function once for each aggregate argument that passed to the user-defined aggregate. Each aggregate argument is one column value. If you omit the HANDLESNULLS modifier from CREATE AGGREGATE, the database server does not call ITER for any NULL-valued aggregate arguments. Therefore, NULL-valued aggregate arguments do not contribute to the aggregate result.

For example, suppose you execute the SQSUM1 user-defined aggregate (which Table 103 describes) in the following query:

SELECT SQSUM1(col3) FROM tab1;

The tab1 table (which Figure 95 defines) contains 6 rows. Therefore, the preceding query (which contains no WHERE clause) causes 6 invocations of the SQSUM1 ITER function. Each invocation of this ITER function processes one value from the col3 column.

To declare an ITER function as a C function, use the following syntax:

agg_state iter_func(current_state, agg_argument)
   agg_state current_state;
   agg_arg_type agg_argument;

agg_state: is the data type of the current and updated aggregate state. After the ITER function merges the agg_argument into the current_state state, it returns a pointer to the updated state.
agg_arg_type: is the data type of the agg_argument, which is the data type that the aggregate support function handles for the user-defined aggregate.
agg_argument: is a single aggregate argument, usually a column value, which the ITER function merges into the partial result in the current_state aggregate state.
current_state: is the current aggregate state, which previous calls to the ITER function and to the INIT function have generated.
iter_func: is the name of the ITER aggregate support function.
Important:

Make sure that the ITER support function obtains all state information that it needs from its "current_state" argument. The INIT function cannot maintain additional state information as user data in its MI_FPARAM structure because MI_FPARAM is not shared among the other aggregate support functions. However, the ITER function can store user data in MI_FPARAM that is not part of the aggregate result.

Figure 97 shows the ITER aggregate support function that handles an INTEGER argument for the SQSUM1 user-defined aggregate (which Table 103 describes).

Figure 97. ITER Aggregate Support Function for SQSUM1 on INTEGER

/* SQSUM1 ITER support function on INTEGER */
mi_integer iter_sqsum1(state, value)
   mi_integer state;
   mi_integer value;
{
   /* add 'state' and 'value' together */
      return (state + value);
}

For other aggregate support functions of SQSUM1, see Figure 96, Figure 98, and Figure 99.

COMBINE Function

The COMBINE aggregate support function allows your user-defined aggregate to execute in a parallel query. When a query that contains a user-defined aggregate is processed in parallel, each parallel thread operates on a one subset of selected rows. The COMBINE function merges the partial results from two such subsets. This aggregate support function ensures that the result of aggregating over a group of rows sequentially is the same as aggregating over two subsets of the rows in parallel and then combining the results.

The COMBINE function is required for parallel execution. When a query includes a user-defined aggregate, the database server uses parallel execution when the query includes only aggregates. However, the COMBINE function might be used even when a query is not parallelized. For example, when a query contains both distinct and non-distinct aggregates, the database server can decompose the computation of the non-distinct aggregate into sub-aggregates based on the distinct column values. Therefore, you must provide a COMBINE function for every user-defined aggregate.

If you do not define an COMBINE function for your user-defined aggregate, the database server generates an error. However, if your user-defined aggregate uses a simple state, the COMBINE function can be the same as the ITER function. For more information, see Handling a Simple State.

To declare a COMBINE function as a C function, use the following syntax:

agg_state combine_func(agg_state1, agg_state2)
   agg_state agg_state1, agg_state2;

agg_state: is the data type of the two partial aggregate states (agg_state1 and agg_state2) as well as the updated aggregate state, which the COMBINE function returns.
agg_state1: is the aggregate state from one parallel thread.
agg_state2: is the aggregate state from the second parallel thread.
combine_func: is the name of the COMBINE aggregate support function.

In the execution of a UDA, the database server calls the COMBINE once for each pair of threads (agg_state1 and agg_state2) that execute a parallel query that contains the user-defined aggregate. When the COMBINE function combines two partial results, it might also need to release resources associated with one of the partial results.

Figure 98 shows the COMBINE aggregate support function that handles an INTEGER argument for the SQSUM user-defined aggregate (which Table 103 describes).

Figure 98. COMBINE Aggregate Support Function for SQSUM1 on INTEGER

/* SQSUM1 COMBINE support function on INTEGER */
mi_integer combine_sqsum1(state1, state2)
   mi_integer state1, state2;
{
   /* Return the new partial sum from two parallel partial
    * sums
    */
   state1 += state2;
   return (state1);
}

For other aggregate support functions of SQSUM1, see Figure 96, Figure 97, and Figure 99. For more information on parallel execution of a UDA, see Executing a User-Defined Aggregate in Parallel Queries.

FINAL Function

The FINAL aggregate support function performs the post-iteration tasks for the user-defined aggregate. Table 105 summarizes possible post-iteration tasks.

Table 105. Post-Iteration Tasks for the FINAL Aggregate Support Function
Post-Iteration Task	More Information
Type conversion of the final state into the return type of the user-defined aggregate	Returning an Aggregate Result Different from the Aggregate State
Post-iteration calculations that the user-defined aggregate might need	Aggregate Support Functions That the Algorithm Requires
Deallocation of memory that the INIT aggregate support function has allocated	Managing a Pointer-Valued State

The FINAL function is an optional aggregate support function. If your user-defined aggregate does not require one of the tasks in Table 105, you do not need to define a FINAL function. When you omit the FINAL function from your user-defined aggregate, the database server returns the final aggregate state as the return value of the user-defined aggregate. If this state does not match the expected data type of the aggregate return value, the database server generates a data type mismatch error.

Important:

In general, the FINAL support function must not deallocate the aggregate state. Only for a pointer-valued state (in which the aggregate support functions must handle all state management) does the FINAL support function need to deallocate the state. For more information, see Managing a Pointer-Valued State.

To declare a FINAL function as a C function, use the following syntax:

agg_type final_func(final_state)
   agg_state final_state;

agg_type: is the data type of the aggregate result, which is what the user-defined aggregate returns to the SQL statement in which it was invoked.
final_state: is the final aggregate state, as previous calls to the INIT and ITER functions have generated.
final_func: is the name of the FINAL aggregate support function.

In the execution of a UDA, the database server calls the FINAL function after all iterations of the ITER function are complete.

Figure 99 shows the aggregate support functions that handle an INTEGER argument for the SQSUM user-defined aggregate (which Table 103 describes).

Figure 99. FINAL Aggregate Support Function for SQSUM1 on INTEGER

/* SQSUM1 FINAL support function on INTEGER */
mi_integer final_sqsum1(state)
   mi_integer state;
{
   /* Calculate square of sum */
   state *= state;

   return (state);
}

For other aggregate support functions of SQSUM1, see Figure 96, Figure 97, and Figure 98.

Defining the User-Defined Aggregate

You can define the user-defined aggregate before you create the C implementation of the aggregate support functions. However, you must ensure that the names of the C functions match the names in the CREATE FUNCTION statements that register them.

To define a user-defined aggregate in a database with SQL

The development steps for a UDA list the definition of the UDA after the aggregate support functions are written. However, the CREATE AGGREGATE statement does not verify that the aggregate support functions it lists were registered nor does the CREATE FUNCTION statement verify that the executable C code exists.

Figure 100 shows the CREATE AGGREGATE statement that defines the SQSUM1 user-defined aggregate (which Table 103 describes) in the database.

Figure 100. Registering the SQSUM1 User-Defined Aggregate

CREATE AGGREGATE sqsum1
   WITH (INIT = init_sqsum1,
       ITER = iter_sqsum1,
       COMBINE = combine_sqsum1,
       FINAL = final_sqsum1);

Suppose that the INIT, ITER, COMBINE, and FINAL aggregate support functions for the SQSUM1 aggregate are compiled and linked into a shared-object module named sqsum.

UNIX/Linux Only

On UNIX or Linux, the executable code for the SQSUM1 aggregate support functions would be in a shared library named sqsum.so.

End of UNIX/Linux Only

Figure 101 shows the CREATE FUNCTION statements that register the aggregate support functions for the SQSUM1 user-defined aggregate to handle the INTEGER data type.

Figure 101. Registering the Aggregate Support Functions for SQSUM1 to Handle INTEGER

CREATE FUNCTION init_sqsum1(dummy_arg INTEGER)
   RETURNS INTEGER
   WITH (HANDLESNULLS)
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

CREATE FUNCTION iter_sqsum1(state INTEGER, 
      one_value INTEGER)
   RETURNS INTEGER
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

CREATE FUNCTION combine_sqsum1(state1 INTEGER, 
      state2 INTEGER)
   RETURNS INTEGER
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

CREATE FUNCTION final_sqsum1(state INTEGER)
   RETURNS INTEGER
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

The registered names of the aggregate support functions must match the names that the CREATE AGGREGATE statement lists. For example, in Figure 101, a CREATE FUNCTION statement registers the INIT support function as init_sqsum1, which is the same name that the INIT option lists in the CREATE AGGREGATE statement (see Figure 100).

In addition, the CREATE FUNCTION for the INIT support function must include the HANDLESNULLS routine modifier so that the database server can pass the INIT function the dummy NULL-valued argument.

For more information on the use of the CREATE AGGREGATE and CREATE FUNCTION statements to define user-defined aggregates, see the chapter on aggregates in the IBM Informix: User-Defined Routines and Data Types Developer's Guide. For the syntax of these statements, see the IBM Informix: Guide to SQL Syntax.

Using the User-Defined Aggregate

After you complete the aggregate-development steps, the end user can use the user-defined aggregate on the defined data type in SQL statements. However, use of the user-defined aggregate does assume the appropriate privileges.

For the tab1 table, which Figure 95 defines, the following query uses the new SQSUM1 aggregate function on the INTEGER column, col3:

SELECT SQSUM1(col3) FROM tab1;

With the rows that Figure 95 has inserted, the preceding query yields an INTEGER value of 10201.

To be able to use SQSUM1 on other data types, you need to ensure that the appropriate aggregate support functions exist for this data type. For example, SQSUM2 User-Defined Aggregate shows the definition of a version of the SQSUM aggregate on both an INTEGER and a named row type.

Determining Required Aggregate Support Functions

Figure 102 shows the execution sequence of aggregate support functions for a user-defined aggregate that is not executed in a parallel query.

Figure 102. Execution of a Nonparallel User-Defined Aggregate

begin figure description - This figure is described in the surrounding text. - end figure description

Tip:

For information about how to execute a user-defined aggregate in parallel queries, see Executing a User-Defined Aggregate in Parallel Queries.

These aggregate support functions use the aggregate state to pass shared information between themselves.

As you design your aggregate algorithm, you must determine which of the support functions the algorithm requires. As a minimum, the user-defined aggregate must have an ITER function. It is the ITER function that performs a single iteration of the aggregate on one aggregate argument. Although Figure 102 shows the execution of both the INIT and FINAL support functions, these functions are optional for a user-defined aggregate. In addition, the COMBINE function, though required, often does not require separate code; it can simply call the ITER function.

Writing aggregate support functions that your user-defined aggregate does not require means unnecessary coding and execution time. Therefore, it is important to assess your aggregate for required functions. The following table shows the design decisions in the determination of required aggregate support functions.

Design Decision	Aggregate Support Functions Involved	More Information
Does the algorithm require initialization or clean-up tasks?	INIT and FINAL	Aggregate Support Functions That the Algorithm Requires
Does the aggregate have a simple aggregate state?	INIT, COMBINE, and FINAL	Yes: Handling a Simple State No: Handling a Nonsimple State
Does the aggregate have a set-up argument?	INIT	Implementing a Set-Up Argument
Does the aggregate return a value whose data type is different from the aggregate state?	FINAL	Returning an Aggregate Result Different from the Aggregate State
Does the aggregate have special needs to run in a parallel query?	COMBINE	Executing a User-Defined Aggregate in Parallel Queries

You can overload the aggregate support functions to provide support for different data types. Any overloaded version of the UDA, however, cannot omit any of the aggregate support functions that the CREATE AGGREGATE statement has listed or use any support function that CREATE AGGREGATE has not specified.

When the database server executes a UDA (regardless of the data type of the aggregate argument), the database expects to find all the aggregate support functions that the CREATE AGGREGATE statement has registered. Therefore, if you omit a support function for one of the reasons in the preceding table, all versions of the aggregate for all data types must be able to execute using only the aggregate support functions that CREATE AGGREGATE specifies.

Aggregate Support Functions That the Algorithm Requires

To implement a user-defined aggregate, you must develop an algorithm that calculates an aggregate return value based on the aggregate-argument values. You need to break this algorithm into the following steps.

Algorithm Step	Aggregate Support Function
Calculations or initializations that must be done before the iterations can begin	INIT
Calculations that must be done on each aggregate argument to merge this argument into a partial result	ITER
Post-iteration tasks must be performed after all aggregate arguments were merged into a partial result	FINAL

All aggregate algorithms must include an ITER function to handle each aggregate argument. However, the INIT and FINAL support functions are optional. To determine whether your algorithm requires an INIT or FINAL function, make the following design assessments:

Are there calculations or initializations that must be done before the iterations can begin?
If the algorithm requires additional resources to perform its task (such as operating-system files or smart large objects), use the INIT function to set up these resources. The INIT function can also initialize the partial result.
Are there post-iteration tasks that must be performed after all aggregate arguments were merged into a partial result?
If the INIT function has set up resources to perform the aggregation, the FINAL function can deallocate or close resources so that they are free for other users. In addition, if the aggregation requires calculations that must be performed on the final partial result, use the FINAL function to perform these calculations.

For example, the following algorithm defines the SQSUM1 aggregate (which Table 103 describes):

(x₁+ x₂ + ... )²

where each x_i is one column value; that is, one aggregate argument. The ITER function for SQSUM1 takes a single aggregate argument and adds it to a partial sum (see Figure 97). The algorithm does not require initialization of additional resources. Therefore, no INIT function is required for this task. However, the INIT function can initialize the partial result (see Figure 96).

The SQSUM1 user-defined aggregate does require post-iteration calculations. When the last iteration is reached, the partial sum needs to be squared to obtain the aggregate return value. This final calculation is performed in a FINAL function and returned as the return value for the SQSUM1 aggregate (see Figure 99).

The SUMSQ user-defined aggregate (described on page SUMSQ User-Defined Aggregate) is an example of a user-defined aggregate that requires neither initialization nor post-iteration tasks. Therefore, it does not require the INIT and FINAL support functions.

Aggregate Support Functions for the Aggregate State

The aggregate support functions pass information about the aggregate among themselves in the aggregate state.

Tip:

For an explanation of the aggregate state, see Determining the Aggregate State.

The database server invokes all aggregate support functions as regular UDRs. Therefore, each support function has a default memory duration of PER_ROUTINE, which means that the database server frees any memory that the support function allocates when that function terminates. However, the aggregate state must be valid across all invocations of all aggregate support functions, including possible multiple iterations of the ITER function. Therefore, a special state buffer (with a PER_COMMAND memory duration) must exist so that the aggregate state is available to all aggregate support functions. The database server then passes this state buffer to each invocation of an aggregate support function.

The purpose of the INIT support function is to return a pointer to the initialized aggregate state. To determine whether your aggregate state requires an INIT support function for state management, assess the data type of this state. The aggregate-state data type determines whether the INIT function must handle state management, as follows:

If the data type of the aggregate state is the same as the aggregate argument, the aggregate has a simple state.
If your user-defined aggregate uses a simple state, the database server can perform the state management. It can automatically allocate the state buffer for the aggregate state. Therefore, the INIT support function does not need to handle this allocation.
If the data type of the aggregate state is different from the aggregate argument, the aggregate has a nonsimple state.
There are several types of non-simple states possible. The database server cannot perform state management for these non-simple states. Instead, the INIT support functions must perform the state-management tasks to handle non-simple states.

Once the user-defined aggregate has an aggregate state, the database server passes a pointer to this state buffer to each invocation of the ITER function and to the FINAL function.

Handling a Simple State

A simple state is an aggregate state whose data type is the same as the aggregate argument. At any point in the iteration, a simple state contains only the partial result of the aggregation. For example, the SUM built-in aggregate uses a simple state because its state contains only the partial result: the running total of the aggregate arguments. When the SUM aggregate operates on INTEGER aggregate arguments, it creates an integer partial sum for these arguments. Therefore, the data type of its aggregate argument and aggregate state is the same.

However, the AVG built-in aggregate does not use a simple state. Because it must divide the total by the number of values processed, its state requires two values: the running total and the number of arguments processed. When the AVG aggregate operates on INTEGER aggregate arguments, it creates an integer partial sum and an integer count for these arguments. Therefore, the data type of its aggregate argument (INTEGER) cannot be not the same as its aggregate state (two INTEGER values).

When a user-defined aggregate has a simple state, the following items apply:

The INIT aggregate support function does not need to allocate the aggregate state.
In this case, the database server automatically performs the state management. If a UDA with a simple state does not include any other tasks that require an INIT support function (see Table 104), you can omit the INIT function from the definition of the UDA. The only possible state-management task you might want to perform in the INIT function is state initialization. For more information, see When to Allocate and Deallocate a State.
The COMBINE aggregate support function can just call the ITER support function.
In this case, you do not have to create special code in the COMBINE function for the handling of parallel execution. Instead, the ITER function can perform the merge of two partial results. For more information, see Executing a User-Defined Aggregate in Parallel Queries.
The FINAL function is not required if the data type of the simple state is the same as the aggregate result.
In this case, the aggregate argument, aggregate state, and aggregate result have the same data type. Such a user-defined aggregate is called a simple binary operator. If a UDA is a simple binary operator and does not include any other tasks that require a FINAL support function (see Table 105), you can omit the FINAL function from the definition of the UDA. For more information, see Returning an Aggregate Result Different from the Aggregate State.

When a UDA does not include an INIT function, the database server takes the following state-management steps:

Allocates the PER_COMMAND state buffer to hold the aggregate state
The database server can determine the size of the state buffer from the data type of the aggregate argument, which is passed into the user-defined aggregate.
Initializes this aggregate state to an SQL NULL value whose data type is the same as the aggregate argument
Passes the NULL-valued aggregate state to the first invocation of the ITER support function
The ITER support function can just use the system-allocated state buffer to hold the state information. If you have some minor initialization tasks that you need to perform, the ITER function can check for a NULL-valued aggregate state on its first iteration. If the state is NULL, ITER can initialize the state to its appropriate value. In this way, you can perform minor state initialization without the overhead of a separate invocation of the INIT function.

When a UDA does not include a FINAL function, the database server passes the final state as the aggregate result of the user-defined aggregate.

The implementation on the SQSUM1 aggregate includes an INIT support function that initializes the aggregate state (see Figure 96). However, because SQSUM1 has a simple state, this INIT function is not required. Instead, an ITER function can check for a NULL-valued state and perform the state initialization. The ITER support function of the SQSUM2 aggregate (see Figure 106) shows this type of implementation.

The SUMSQ user-defined aggregate (described on page SUMSQ User-Defined Aggregate) also has a simple state and therefore does not require an INIT support function for state management.

Handling a Nonsimple State

When the data type of the aggregate argument is not adequate for the state information, you must use a nonsimple state for your UDA. A nonsimple state is an aggregate state whose data type is not the same as the aggregate argument. Possible uses for a nonsimple state include an aggregate state that contains:

An aggregate state that contains more information than the aggregate-argument data type can hold
An aggregate state that contains information of a data type different than that of the aggregate argument

When the aggregate-argument data type is not adequate for the state information, you must determine the appropriate data structure to hold the state information. This data structure depends upon the size of the aggregate state that you need to maintain, as the following table shows.

Nonsimple State	Description	More Information
Single-valued state	Consists of information that can be stored in an Informix built-in data type	Managing a Single-Valued State
Opaque-type state	Consists of several values, but these values do not exceed the maximum size of an opaque data type	Managing an Opaque-Type State.
Pointer-valued state	Consists of values whose size does exceed the maximum size of an opaque data type	Managing a Pointer-Valued State.

If your user-defined aggregate uses a single-valued or opaque-type state, the database server can still perform the state management. However, it cannot provide all necessary state management for a pointer-valued state.

Managing a Single-Valued State

A single-valued state uses a built-in SQL data type to hold the aggregate state. Use a single-valued state for an aggregate state that can fit into a built-in data type but whose data type does not match that of the aggregate argument.

Tip:

Built-in SQL data types are provided data types. For more information, see the chapter on data types in the IBM Informix: Guide to SQL Reference.

To use a single-valued state for a UDA

Write the appropriate aggregate support functions so that they handle a single-valued state.
Declare the state parameters and return values of the aggregate support functions to use the DataBlade API data type that corresponds to the built-in SQL data type that your state requires. For a list of these data type correspondences, see Table 4. Information on how to handle state management for a single-valued state is provided below.
Register the aggregate support functions with the CREATE FUNCTION statement.
Specify the built-in SQL data type as the data type of the state parameters and return values in the function signatures of the aggregate support functions.

The database server can perform memory management for a single-valued state because it can determine the size of a built-in data type. Therefore, you do not need to allocate memory for a single-valued state in the INIT support function.

In the ITER function, you can initialize or update a single-valued state in either of the following ways:

In-place state update
To modify the state, change the value (or values) of the DataBlade API variable that the database server has passed into the ITER function as the state argument.
Allocate a new state
To modify the state, declare a local variable or allocate PER_ROUTINE memory for a new variable and put the new values into this variable.

For more information, see When to Allocate and Deallocate a State.

For a single-valued state, the FINAL support function does not need to perform any state-management tasks. However, it must convert to the data type of the final aggregate state to that of the aggregate result. For more information, see Returning an Aggregate Result Different from the Aggregate State.

Managing an Opaque-Type State

An opaque-type state uses an opaque data type to hold the aggregate state. A possible use for an opaque-type state is to include an aggregate state that contains more information than the aggregate-argument data type or a built-in data type can hold. The size of an aggregate state that can be implemented as an opaque-type state is limited by the maximum size of an opaque type.

Important:

The maximum size of an opaque type is system dependent. On many systems, this limit is 32 kilobytes. Consult your machine notes for the limit on your system. If your aggregate state might contain more data than the opaque-type limit, you must use a pointer-valued state instead. For more information, see Managing a Pointer-Valued State.

To use an opaque-type state for a UDA, write the appropriate aggregate support functions so that they handle an opaque-type state.

Declare the state parameters and return values of the aggregate support functions to use the internal format of the opaque type. This internal format is usually a C struct structure. For more information, see Determining Internal Representation.

Handling State Management for an Opaque-Type State

After you register the opaque data type, the database server can obtain information about the data type of the state value when the CREATE FUNCTION statement registers the function signatures of the aggregate support functions.

You need to write the opaque-type support functions only if you need such functionality in the aggregate support functions. For example, the input and output support functions might be useful when you debug your UDA. If you do write opaque-type support functions, you must compile and link them into a shared-object module, as Compiling a C UDR describes.

Specify the registered opaque type as the data type of the state parameters and return values in the function signatures of the aggregate support functions.

The database server can perform memory management for an opaque-type state because it can determine the size of the opaque data type. The CREATE OPAQUE TYPE statement registers the opaque type, including its size, in the system catalog tables of the database. From the system catalog tables, the database server can determine the size of the aggregate state to allocate. Therefore, you do not need to allocate the opaque-type state in the INIT support function.

In the ITER function, you can initialize or update an opaque-type state in either of the following ways:

In-place state update
To modify the state, change the values in the internal opaque-type structure that the database server has passed into the ITER function as an argument.
Allocate a new state
To modify the state, allocate PER_ROUTINE memory for a new internal opaque-type structure and put the new values into this structure.

If you need to initialize the internal structure of the opaque type, the INIT or ITER function can allocate PER_ROUTINE memory for the structure perform the appropriate initializations. When the support function exits, the database server copies the contents of this PER_ROUTINE structure into the PER_COMMAND system-allocated state buffer.

For more information, see When to Allocate and Deallocate a State. For an example of a user-defined aggregate that uses an opaque-type state, see the description of the PERCENT_GTR aggregate on page PERCENT_GTR User-Defined Aggregate.

Managing a Pointer-Valued State

A pointer-valued state uses the POINTER data type as the aggregate state. The mi_pointer data type is the DataBlade API type that represents the SQL data type, POINTER. (For more information, see Pointer Data Types (Server).) Use a pointer-valued state when an aggregate state might contain more information than can fit into the maximum opaque-type size.

Important:

The maximum size of an opaque type is system dependent. On many systems, this limit is 32 kilobytes. Consult your machine notes for the limit on your system. If your aggregate state contains less data than the opaque-type limit, use an opaque-type state instead. For more information, see Managing an Opaque-Type State.

To use a pointer-valued state for a UDA

Write the appropriate aggregate support functions so that they handle a pointer-valued state.
Declare the state parameters and return values of the aggregate support functions to use the mi_pointer data type. Information on how to handle state management of a pointer-valued state follows.
Register the aggregate support functions with the CREATE FUNCTION statement.
Specify the POINTER data type for the state parameters and return values in the function signatures of the aggregate support functions.

The database server cannot perform state management for a pointer-valued state because it cannot determine the size of the state. The DataBlade API data type mi_pointer is a typedef for the following C data type:

void *

Because this data type is only a pointer, the database server cannot determine how large the aggregate state is. Therefore, it cannot allocate the PER_COMMAND system-allocated state buffer. In this case, the INIT and FINAL aggregate support functions are not optional. They must perform state management of the nonsimple aggregate state, as follows:

The INIT function can allocate and initialize the aggregate state.
The INIT function must also allocate any related resources that the aggregate state might need. Keep in mind that the database server does not interpret the contents of the pointer-valued state. It cannot manage any objects that the state type might reference. Therefore, use states with embedded pointers with caution.
The ITER function must perform an in-place update to initialize or modify a pointer-valued state.
Once you allocate the pointer-valued state, the database server passes a pointer to this state to the other aggregate support functions. Initialize or update the pointer-valued state only with an in-place update. For more information, see When to Allocate and Deallocate a State.
The FINAL function can handle deallocation of resources that the INIT function has set up.
For a pointer-valued state, the FINAL function must always deallocate the aggregate state. If your INIT support function has allocated related resources that the aggregate state uses, make sure that the FINAL function deallocates these resources.

Important:

Make sure that you use a memory duration that extends for the life of the user-defined aggregate. A PER_ROUTINE memory duration (the default) expires after one invocation of the ITER function completes. Therefore, you must use a memory duration of at least PER_COMMAND for memory associated with the state.

When to Allocate and Deallocate a State

To each invocation of the ITER support function, the database server automatically passes a pointer to the state buffer. When you need to initialize or update the state information, the ITER function can handle the modification in either of two ways, as the following table describes.

Changing the State	State Memory Duration	Results
Merge the aggregate argument into the existing state in-place and return the existing state.	The existing state has a PER_COMMAND memory duration: For single-valued and opaque-type states, this state is the system-allocated state buffer. For a pointer-valued state, this state is a user-allocated state buffer.	The new state value is at the address that the database server has passed into the ITER function. The ITER function then returns this address as the updated state. Because the state memory has a PER_COMMAND memory duration, the database server can re-use the same state for subsequent invocations of ITER.
Allocate fresh memory for a new state, merge the existing state with the new aggregate argument into this state, and return this new state.	The new state has a PER_ROUTINE memory duration: For a single-valued state, this state can be either a declared local variable or user-allocated PER_ROUTINE memory. For an opaque-type state, the new state must be user-allocated PER_ROUTINE memory. For a pointer-valued state, this state is user-allocated memory with either a PER_ROUTINE or PER_COMMAND memory duration. However, for PER_COMMAND memory, you must also handle deallocation of the old state. For more information, see Managing a Pointer-Valued State.	The new state value is at the address of the new state. The ITER function then returns the address of the new state as the updated state. Because this memory has a PER_ROUTINE memory duration, the database server must copy the returned state back into the PER_COMMAND buffer.

The new state method can be slower than the in-place method. Design your ITER support function to use the in-place method whenever possible. When the database server can skip the copy operation, you can improve performance of your UDA.

To determine which of these methods was used in the ITER support function, the database server compares the state value that ITER returns and the state value that was passed into ITER. If these two pointers identify the same memory location, the ITER function has modified the state in-place. Therefore, the database server does not need to perform the copy operation. If these two pointers identify different memory locations, the database server proceeds with the copy operation.

Aggregate support functions have the following restrictions on the deallocation of an aggregate state:

For any state other than a pointer-valued state, no aggregate support function must deallocate the state memory.
No aggregate support function can return a NULL-valued pointer as the state.

Implementing a Set-Up Argument

You can define a UDA so that the end user can supply a set-up argument to the aggregate. The set-up argument can customize the aggregate for a particular invocation. For example, the PERCENT_GTR user-defined aggregate (see page PERCENT_GTR User-Defined Aggregate) determines the percentage of numbers greater than a particular value. The UDA could have been implemented so that the value to compare against is hardcoded into the UDA. However, this would mean a separate user-defined aggregate that checks for values greater than 10, another that checks for values greater than 15, and so on.

Instead, the PERCENT_GTR aggregate accepts the value to compare against as a set-up argument. In this way, the end user can determine what values are needed, as follows:

SELECT PERCENT_GTR(col1, 10) FROM tab1; -- values > 10;
SELECT PERCENT_GTR(col1, 15) FROM tab1; -- values > 15;

The database server passes in the set-up argument as the second argument to the INIT function. Therefore, the INIT support function must handle the set-up argument. Usually, this handling involves performing any initial processing required for the value and then saving this value in the aggregate state. It might also check for a possible SQL NULL value as a set-up argument.

This set-up argument is optional, in the sense that you can define a UDA with one or without one. However, if you define your UDA to include a set-up argument, the end user must provide a value for this argument. When the UDA is invoked with two arguments (aggregate argument and set-up argument), the database server looks for an INIT function with two arguments. If you omit the set-up argument when you invoke the UDA, the database server looks for an INIT function with just one argument.

To indicate no set-up argument, the end user can provide the SQL NULL value as a set-up value. However, if you really want to make the set-up argument truly optional for the end user, you must create and register two INIT functions:

One that takes two arguments
One that takes only one argument
In this case, you could assign the set-up argument some known default value.

As the writer of the UDA, you need to decide whether this feature is useful.

Returning an Aggregate Result Different from the Aggregate State

The aggregate result is the value that the user-defined aggregate returns to the calling SQL statement. If the user-defined aggregate does not include a FINAL support function, the database server returns the final aggregate state; that is, it returns the value of the aggregate state after the last aggregate iteration. However, if your UDA needs to return a value whose data type is different from the aggregate state, use a FINAL support function to convert the final aggregate state to the data type that you want to return from the aggregate.

For example, the PERCENT_GTR user-defined aggregate (see page PERCENT_GTR User-Defined Aggregate) returns the percentage of values greater than some value as a percentage; that is, as a fixed-point number in the range 0.00 to 100.00. To handle integer values, the user-defined aggregate would require an aggregate state that holds the following values:

The total number of aggregate arguments greater than 10
The total number of aggregate arguments
The value to compare against

However, the aggregate result of PERCENT_GTR is a fixed-point number. Therefore, you would not want the aggregate to return the final state to the calling SQL statement. Instead, the FINAL support function needs to perform the following steps:

Divide the total number of arguments by the total that are greater than 10.
Return the fixed-point quotient, which is the percentage of values greater than 10.

For the complete example of the PERCENT_GTR user-defined aggregate, see page PERCENT_GTR User-Defined Aggregate.

Executing a User-Defined Aggregate in Parallel Queries

The database server can break up the aggregation computation into several pieces and compute them in parallel. Each piece is computed sequentially as follows:

The INIT support function initializes execution in the parallel thread.
For each aggregate argument in the subset, the ITER support function merges the aggregate argument into a partial result.

The database server then calls the COMBINE support function to merge the partial states, two at a time, into a final state. For example, for the AVG built-in aggregate, the COMBINE function would add the two partial sums and adds the two partial counts. Finally, the database server calls the FINAL support function on the final state to generate the aggregate result.

Figure 103 shows the execution sequence of aggregate support functions for a user-defined aggregate that is executed in two parallel threads.

Figure 103. Parallel Execution of a UDA

Figure 103 shows how the COMBINE function is used to execute a user-defined aggregate with two parallel threads. For more than two parallel threads, the database server calls COMBINE on two thread states to obtain one, combines this state with another thread state, and so on until it has processed all parallel threads. The database server makes the decision whether to parallelize execution of a user-defined aggregate and the degree of such parallelism. However, these decisions are invisible to the end user.

Parallel aggregation must give the same results as an aggregate that is not computed in parallel. Therefore, you must write the COMBINE function so that the result of aggregating over the entire group of selected rows is the same as aggregating over two partitions of the group separately and then combining the results.

For an example of COMBINE functions in user-defined aggregates, see Sample User-Defined Aggregates.

Sample User-Defined Aggregates

This section provides the sample user-defined aggregates that the following table describes.

User-Defined Aggregate	Description
SUMSQ	Squares each value and calculates the sum of these squared values
SQSUM2	Sums all values and calculates the square of this sum
PERCENT_GTR	Determines the percentage of values greater than a user-specified value
X_PERCENTILE	Determines the value in a group of values that is the x-percentile, where x is a percent that the end user specifies.

Each description includes the aggregate support functions written in C and the SQL statements to define the user-defined aggregate in the database.

SUMSQ User-Defined Aggregate

The SUMSQ user-defined aggregate squares each value and calculates the sum of these squared values. It has the following algorithm:

x₁²+ x₂² + ...

where each x_i is one column value; that is, one aggregate argument.

To determine the aggregate state for SUMSQ, examine what information needs to be available for each iteration of the aggregate. To perform one iteration of SUMSQ, the ITER function must:

Square the aggregate argument.
The ITER function has access to the aggregate argument because the database server passes it in. Therefore, ITER does not require additional information to perform this step.
Add the squared argument to the partial sum of previous squared values.
To add in the squared argument, the aggregate must keep a partial sum of the previous squared values. For the ITER function to have access to the partial sum from the previous iterations, the aggregate state must contain it.

The SUMSQ has a simple state because the data type of the partial sum is the same as that of the aggregate argument. For example, when the SUMSQ aggregate receives INTEGER values, this partial sum is also an INTEGER value. Therefore, SUMSQ can allow the database server to manage this state, which has the following effect on the design of the aggregate support functions:

The INIT support function does not need to perform state management.
An aggregate with a simple state does not need to explicitly handle the allocation and deallocation of the aggregate state. Instead, the database server automatically allocates the aggregate state and initializes it to NULL. Therefore, the INIT function does not require other INIT-function tasks (see Table 104). Therefore, this support function can safely be omitted from the aggregate definition.
The COMBINE support function can be the same as its ITER function.
No special processing is required to merge two partial states. The ITER function can adequately perform this merge.

Before the iterations begin, the partial sum needs to be initialized to zero (0). However, because the INIT function is not required for state management, this aggregate initializes the state in the first invocation of its ITER function. The ITER function then calculates the square of a single aggregate argument, and adds this value to a partial sum. When the last iteration is reached, the final partial sum is the value that the SUMSQ aggregate returns. Therefore, the SUMSQ algorithm does not require a FINAL function for post-iteration tasks.

Figure 104 shows the required aggregate support functions that handle an INTEGER argument for the SUMSQ user-defined aggregate.

Figure 104. Aggregate Support Functions for SUMSQ on INTEGER

/* SUMSQ ITER support function on INTEGER */
mi_integer iter_sumsq(state, value, fparam)
   mi_integer state;
   mi_integer value;
   MI_FPARAM *fparam;
{
   /* If 'state' is NULL, this is the first invocation. 
    * Just return square of 'value'.
    */
   if ( mi_fp_argisnull(fparam, 0) )
      return (value * value);
   else /* add 'state' and square of 'value' together */
      return (state + (value * value));
}

/* SUMSQ COMBINE support function on INTEGER */
mi_integer combine_sumsq(state1, state2)
   mi_integer state1, state2;
{
   /* Return the new partial sum from two parallel partial
    * sums
    */
   return (iter_sumsq(state1, state2));
}

The following SQL statement registers the SUMSQ user-defined aggregate in the database:

CREATE AGGREGATE sumsq
   WITH (ITER = iter_sumsq,
       COMBINE = combine_sumsq);

This CREATE AGGREGATE statement lists only the aggregate support functions that are required to implement SUMSQ: ITER and COMBINE.

Suppose that the ITER and COMBINE aggregate support functions for the SUMSQ aggregate are compiled and linked into a shared-object module named sumsq.

UNIX/Linux Only

On UNIX or Linux, the executable code for the SUMSQ aggregate support functions would be in a shared library named sumsq.so.

End of UNIX/Linux Only

Figure 105 shows the CREATE FUNCTION statements that register the aggregate support functions for SUMSQ to handle INTEGER aggregate arguments.

Figure 105. Registering the SUMSQ Aggregate Support Functions for INTEGER

CREATE FUNCTION iter_sumsq(state INTEGER, one_value INTEGER)
   RETURNS INTEGER
   WITH (HANDLESNULLS)
   EXTERNAL NAME '/usr/udrs/aggs/sums/sumsq.so'
   LANGUAGE C;

CREATE FUNCTION combine_sumsq(state1 INTEGER, state2 INTEGER)
   RETURNS INTEGER
   EXTERNAL NAME '/usr/udrs/aggs/sums/sumsq.so'
   LANGUAGE C;

For the tab1 table, which Figure 95 defines, the following query uses the new SUMSQ aggregate function on the INTEGER column, col3:

SELECT SUMSQ(col3) FROM tab1;

With the rows that Figure 95 has inserted, the preceding query yields an INTEGER value of 2173. To be able to use SUMSQ on other data types, you need to ensure that the appropriate aggregate support functions exist for this data type.

SQSUM2 User-Defined Aggregate

The SQSUM2 user-defined aggregate is another version of the SQSUM1 aggregate, which Table 103 describes. Its algorithm is the same as SQSUM1:

(x₁+ x₂ + ... )²

where each x_i is one column value; that is, one aggregate argument.

However, the SQSUM2 aggregate takes advantage of the fact that this aggregate has a simple state. Because the database server automatically handles state management, the SQSUM2 aggregate can safely omit the INIT function.

Figure 106 shows the aggregate support functions that handle an INTEGER argument for the SQSUM2 user-defined aggregate.

Figure 106. Aggregate Support Functions for SQSUM2 on INTEGER

/* SQSUM2 ITER support function on INTEGER */
mi_integer iter_sqsum2(state, value, fparam)
   mi_integer state;
   mi_integer value;
   MI_FPARAM *fparam;
{
   /* If 'state' is NULL, this is the first invocation. 
    * Just return 'value'.
    */
   if ( mi_fp_argisnull(fparam, 0) )
      return (value);
   else /* add 'state' and 'value' together */
      return (state + value);
}

/* SQSUM2 COMBINE support function on INTEGER */
mi_integer combine_sqsum2(state1, state2)
   mi_integer state1, state2;
{
   /* Return the new partial sum from two parallel partial
    * sums
    */
   return (iter_sqsum2(state1, state2));
}

/* SQSUM2 FINAL support function on INTEGER */
mi_integer final_sqsum2(state)
   mi_integer state;
{
   /* Calculate square of sum */
   state *= state;

   return (state);
}

In its first invocation, the ITER function performs the state initialization. It then takes a single aggregate argument and adds it to a partial sum. For aggregates with a simple state, the COMBINE function can be the same as the ITER function. Therefore, this COMBINE function just calls iter_sumsq2( ) to perform the merge of two partial states.

Tip:

The ITER function in Figure 106 could use the binary operator plus( ) to perform the addition. This operator is already defined on the INTEGER data type and therefore would not need to be written or registered. To use plus( ) in ITER, you would need to ensure that it is defined for the data type on which the SQSUM2 aggregate is defined.

The data type of the aggregate result is also the same as the aggregate state. Therefore, SQSUM2 is a simple binary operator and the FINAL support function is not needed to convert the data type of the final state. However, the SQSUM2 aggregate still does require a FINAL support function. The SQSUM2 algorithm involves a post-iteration calculation: it must square the final sum to obtain the aggregate return value. The FINAL function performs this final calculation and returns it as the aggregate result for the SQSUM2 aggregate.

Suppose that the ITER, COMBINE, and FINAL aggregate support functions for the SQSUM2 aggregate are compiled and linked into a shared-object module named sqsum.

UNIX/Linux Only

On UNIX or Linux, the executable code for the SQSUM2 aggregate support functions would be in a shared library named sqsum.so.

End of UNIX/Linux Only

Once you have successfully compiled and linked the aggregate support functions, you can define the SQSUM2 aggregate in the database. Figure 107 shows the CREATE AGGREGATE statement that registers the SQSUM2 user-defined aggregate. This statement specifies the registered SQL names of the required aggregate support functions.

Figure 107. Registering the SQSUM2 User-Defined Aggregate

CREATE AGGREGATE sqsum2
   WITH (ITER = iter_sqsum2,
       COMBINE = combine_sqsum2,
       FINAL = final_sqsum2);

Figure 108 shows the CREATE FUNCTION statements that register the SQSUM2 aggregate support functions for the aggregate argument of the INTEGER data type.

Figure 108. Registering the SQSUM2 Aggregate Support Functions for INTEGER

CREATE FUNCTION iter_sqsum2(state INTEGER, one_value INTEGER)
   RETURNS INTEGER
   WITH (HANDLESNULLS)
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

CREATE FUNCTION combine_sqsum2(state1 INTEGER, state2 INTEGER)
   RETURNS INTEGER
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

CREATE FUNCTION final_sqsum2(state INTEGER)
   RETURNS INTEGER
   EXTERNAL NAME '/usr/udrs/aggs/sums/sqsum.so'
   LANGUAGE C;

In Figure 108, the CREATE FUNCTION statement that registers the ITER support function requires the HANDLESNULLS routine modifier because the aggregate does not have an INIT support function.

For the tab1 table, which Figure 95 defines, the following query uses the new SQSUM2 aggregate function on the INTEGER column, col3:

SELECT SQSUM2(col3) FROM tab1;

With the rows that Figure 95 has inserted, the preceding query yields an INTEGER value of 10201, which is the same value that the SQSUM1 aggregate returned for these same rows.

Now, suppose that you want to define the SQSUM2 user-defined aggregate on the complexnum_t named row type, which Figure 93 defines. This version of SQSUM2 must have the same aggregate support functions as the version that handles INTEGER (see Figure 107).

Aggregate Support Function	SQL Function Name	C Function Name
ITER	iter_sqsum2( )	iter_sqsum2_complexnum( )
COMBINE	combine_sqsum2( )	combine_sqsum2_complexnum( )
FINAL	final_sqsum2( )	final_sqsum2_complexnum( )

The following code shows the aggregate support functions that handle a complexnum_t named row type as an argument for the SQSUM2 user-defined aggregate:

/* SQSUM2 ITER support function for complexnum_t */
MI_ROW *iter_sqsum2_complexnum(state, value, fparam)
   MI_ROW *state;
   MI_ROW *value;
   MI_FPARAM *fparam;
{
   /* Compute the new partial sum using the complex_plus( ) 
    * function. Put the sum in a new MI_ROW, which 
    * complex_plus( ) allocates (and returns a pointer to)
    */
   return (complex_plus(state, value, fparam));
}

/* SQSUM2 COMBINE support function for complexnum_t */
MI_ROW *combine_sqsum2_complexnum(state1, state2, fparam)
   MI_ROW *state1, *state2;
   MI_FPARAM *fparam;
{
   MI_ROW *ret_state;

   ret_state = 
      iter2_sqsum2_complexnum(state1, state2, fparam);

   mi_free(state1);
   mi_free(state2);

   return (ret_state);
}

/* SQSUM2 FINAL support function for complexnum_t */
MI_ROW *final_sqsum2_complexnum(state)
   MI_ROW *state;
{
   MI_CONNECTION *conn;
   MI_TYPEID *type_id;
   MI_ROW_DESC *row_desc;

   MI_ROW *ret_row;
   MI_DATUM values[2];
   mi_boolean nulls[2] = {MI_FALSE, MI_FALSE};

   mi_real *real_value, *imag_value;
   mi_integer real_len, imag_len;
   mi_real sqsum_real, sqsum_imag;

   /* Extract complex values from state row structure */
   mi_value_by_name(state, "real_part", 
      (MI_DATUM *)&real_value, &real_len);
   mi_value_by_name(state, "imaginary_part", 
      (MI_DATUM *)&imag_value, &imag_len);

   /* Calculate square of sum */
   sqsum_real = (*real_value) * (*real_value);
   sqsum_imag = (*imag_value) * (*imag_value);

   /* Put final result into 'values' array */
   values[0] = (MI_DATUM)&sqsum_real;
   values[1] = (MI_DATUM)&sqsum_imag;

   /* Generate return row type */
   conn = mi_open(NULL, NULL, NULL);
   type_id = mi_typestring_to_id(conn, "complexnum_t");
   row_desc = mi_row_desc_create(type_id);
   ret_row = mi_row_create(conn, row_desc, values, nulls);

   return (ret_row);
}

Figure 109 shows the CREATE FUNCTION statements that register the SQSUM2 aggregate support functions for an aggregate argument of the complexnum_t data type.

Figure 109. Registering the SQSUM2 Aggregate Support Functions for the complexnum_t Named Row Type

CREATE FUNCTION iter_sqsum2(state complexnum_t, 
      one_value complexnum_t)
   RETURNS complexnum_t
   WITH (HANDLESNULLS)
   EXTERNAL NAME
'/usr/udrs/aggs/sums/sqsum.so(iter_sqsum2_complexnum)'
   LANGUAGE C;

CREATE FUNCTION combine_sqsum2(state1 complexnum_t, 
      state2 complexnum_t)
   RETURNS complexnum_t
   EXTERNAL NAME
'/usr/udrs/aggs/sums/sqsum.so(combine_sqsum2_complexnum)'
   LANGUAGE C;

CREATE FUNCTION final_sqsum2(state complexnum_t)
   RETURNS complexnum_t
   EXTERNAL NAME
'/usr/udrs/aggs/sums/sqsum.so(final_sqsum2_complexnum)'
   LANGUAGE C;

The following query uses the SQSUM2 aggregate function on the complexnum_t column, col2:

SELECT SQSUM2(col2) FROM tab1;

With the rows that Figure 95 has inserted, the preceding query yields a complexnum_t value of:

ROW(817.96, 1204.09)

PERCENT_GTR User-Defined Aggregate

Suppose you want to create a user-defined aggregate that determines the percentage of values greater than some user-specified value and returns this percentage as a fixed-point number in the range 0 to 100. The implementation of this UDA uses the following aggregate features:

Uses a set-up argument to allow the end user to specify the value to compare against
Uses an opaque-type state to hold the state information and initialize the state in the INIT support function
Uses a COMBINE function that must do more than just call the ITER support function
Returns an aggregate result whose data type is different from that of the aggregate argument
Handles NULL values as aggregate arguments

The PERCENT_GTR user-defined aggregate needs the following state information:

The user-specified set-up argument
The current number of values greater than the set-up argument
The current number of values processed

Therefore, it uses the following C structure, named percent_state_t, to hold the aggregate state:

typedef struct percent_state
   {
   mi_integer gtr_than;
   mi_integer total_gtr;
   mi_integer total;
   } percent_state_t;

Because the size of the percent_state_t structure never exceeds the maximum opaque-type size, PERCENT_GTR can use an opaque-type state to hold its aggregate state. The following code shows the INIT aggregate support function that handles an INTEGER argument for the PERCENT_GTR aggregate:

/* PERCENT_GTR INIT support function for INTEGER */
percent_state_t *init_percentgtr(dummy_arg, gtr_than, fparam)
   mi_integer dummy_arg;
   mi_integer gtr_than;
   MI_FPARAM *fparam;
{
   percent_state_t *state;

   /* Allocate PER_ROUTINE memory for state and initialize it */
   state = mi_alloc(sizeof(percent_state_t));

   /* Check for a NULL-valued set-up argument */
   if ( mi_fp_argisnull(fparam, 1) )
      state->gtr_than = 0;
   else
      state->gtr_than = gtr_than;
   state->total_gtr = 0;
   state->total = 0;

   return (state);
}

This INIT function performs the following tasks:

Handles a set-up argument
This set-up argument is the value that the end user specifies so that the aggregate knows which value to compare the aggregate arguments against. If the end user provides a NULL value for the set-up argument, PERCENT_GTR checks for values greater than zero (0).
Allocates PER_ROUTINE memory for the opaque-type state
The INIT function does not need to allocate memory for an opaque-type state because the database server can perform the state management. However, because PERCENT_GTR already requires an INIT function to handle the set-up argument, INIT allocates a PER_ROUTINE percent_state_t structure so that it can initialize the opaque-type state.

The following code implements the ITER aggregate support function that handles an INTEGER argument for the PERCENT_GTR aggregate:

/* PERCENT_GTR ITER support function for INTEGER */
percent_state_t *iter_percentgtr(curr_state, agg_arg, fparam)
   percent_state_t *curr_state;
   mi_integer agg_arg;
   MI_FPARAM *fparam;
{
   if ( mi_fp_argisnull(fparam, 1) == MI_TRUE )
      agg_arg = 0;

   if ( agg_arg > curr_state->gtr_than )
      curr_state->total_gtr += 1;

   curr_state->total += 1;

   return (curr_state);
}

The PERCENT_GTR aggregate is defined to handle NULL values (see Figure 110). This ITER function must check for a possible NULL aggregate argument. The function converts any NULL value to a zero (0) so that the numeric comparison can occur.

The following COMBINE aggregate support function handles an INTEGER argument for the PERCENT_GTR aggregate:

/* PERCENT_GTR COMBINE support function for INTEGER */
percent_state_t *combine_percentgtr(state1, state2)
   percent_state_t *state1;
   percent_state_t *state2;
{
   state1->total += state2->total;
   state1->total_gtr += state2->total_gtr;

   mi_free(state2);

   return(state1);
}

Because PERCENT_GTR does not have a simple state, its COMBINE function must explicitly perform the merging of two parallel threads, as follows:

It adds the two partial sums (total and total_gtr).
It deallocates the PER_COMMAND memory for the second parallel thread (merging of the two states was done "in-place" in state1).

The following code shows the FINAL aggregate support function that handles an INTEGER argument for the PERCENT_GTR aggregate:

/* PERCENT_GTR FINAL support function for INTEGER */
mi_decimal *final_percentgtr(final_state)
   percent_state_t *final_state;
{
   mi_double_precision quotient;
   mi_decimal return_val;
   mi_integer ret;

   quotient = 
      ((mi_double_precision)(final_state->total_gtr)) / 
      ((mi_double_precision)(final_state->total)) * 100;

   if ( (ret = deccvdbl(quotient, &return_val)) < 0 )
      ret = deccvasc("0.00", 4, &return_val);
      
   return (&return_val);
}

The PERCENT_GTR aggregate returns a data type different from the aggregate state. The FINAL function must convert the final state from the aggregate-state data type (percent_state_t) to the aggregate-result data type (DECIMAL).

Once you have successfully compiled and linked the aggregate support functions, you can define the PERCENT_GTR aggregate in the database. For a user-defined aggregate that uses an opaque-type state, this definition includes the following steps:

Use CREATE OPAQUE TYPE to register the opaque type that holds the opaque-type state.
Use CREATE AGGREGATE to register the aggregate.
Use CREATE FUNCTION to register the aggregate support functions.

The PERCENT_GTR aggregate uses a fixed-length opaque type, percent_state_t, to hold its opaque-type state. The following CREATE OPAQUE TYPE statement registers this opaque type:

CREATE OPAQUE TYPE percent_state_t (INTERNALLENGTH = 12);

The INTERNALLENGTH modifier specifies the size of the fixed-length C data structure, percent_state_t, that holds the opaque-type state.

Figure 110 shows the CREATE AGGREGATE statement that defines the PERCENT_GTR user-defined aggregate. This statement specifies the registered SQL names of the required aggregate support functions. It also includes the HANDLESNULLS modifier to indicate that the PERCENT_GTR aggregate does process NULL values as aggregate arguments. By default, the database server does not pass a NULL value to an aggregate.

Figure 110. Registering the PERCENT_GTR User-Defined Aggregate

CREATE AGGREGATE percent_gtr
   WITH (INIT = init_percent_gtr,
      ITER = iter_percent_gtr,
      COMBINE = combine_percent_gtr,
      FINAL = final_percent_gtr,
      HANDLESNULLS);

Suppose that the INIT, ITER, COMBINE, and FINAL aggregate support functions for the PERCENT_GTR aggregate are compiled and linked into a shared-object module named percent.

UNIX/Linux Only

On UNIX or Linux, the executable code for the PERCENT_GTR aggregate support functions would be in a shared library named percent.so.

End of UNIX/Linux Only

The following CREATE FUNCTION statements register the PERCENT_GTR aggregate support functions for an aggregate argument of the INTEGER data type:

CREATE FUNCTION init_percent_gtr(dummy INTEGER, gtr_val INTEGER)
RETURNING percent_state_t
WITH (HANDLESNULLS)
EXTERNAL NAME '/usr/udrs/aggs/percent/percent.so(init_percentgtr)'
LANGUAGE C;

CREATE FUNCTION iter_percent_gtr(state percent_state_t, one_value INTEGER)
RETURNS percent_state_t
WITH (HANDLESNULLS)
EXTERNAL NAME '/usr/udrs/aggs/percent/percent.so(iter_percentgtr)'
LANGUAGE C;

CREATE FUNCTION combine_percent_gtr(state1 percent_state_t, 
   state2 percent_state_t)
RETURNS percent_state_t
WITH (HANDLESNULLS)
EXTERNAL NAME '/usr/udrs/aggs/percent/percent.so(combine_percentgtr)'
LANGUAGE C;

CREATE FUNCTION final_percent_gtr (state percent_state_t)
RETURNS DECIMAL(5,2)
WITH (HANDLESNULLS)
EXTERNAL NAME '/usr/udrs/aggs/percent/percent.so(final_percentgtr)'
LANGUAGE C;

These CREATE FUNCTION statements register an SQL name for each of the aggregate support functions that you have written in C. They must all include the HANDLESNULLS routine modifier because the PERCENT_GTR aggregate handles NULL values.

The following query uses the PERCENT_GTR aggregate function on the INTEGER column, col3, to determine the percentage of values greater than 25:

SELECT PERCENT_GTR(col3, 20) FROM tab1;

With the rows that Figure 95 has inserted, the preceding query yields a DECIMAL(5,2) value of 33.33 percent: 2 of the 6 values are greater than 20 (24 and 31).

X_PERCENTILE User-Defined Aggregate

Suppose you want to create a user-defined aggregate that calculates the x-percentile for a group of values. The x-percentile is the number within the group of values that separates x percent of the values below and (100-x) percent above. The median is a special case of the x-percentile. It represents the 50th-percentile:

X_PERCENTILE(y, 50)

That is, the above aggregate returns the value within a sample of y values that has an equal number of values (50 percent) above and below it in the sample.

The implementation of this UDA uses the following aggregate features:

Uses a set-up argument to enable the end user to specify the x-percentile to obtain
Uses a pointer-valued state to hold the state information, and allocates and initializes the state in the INIT support function
Uses a COMBINE function that must do more than just call the ITER support function
Handles NULL values as aggregate arguments, including returning an SQL NULL value if the aggregate argument to return was NULL

The X_PERCENTILe user-defined aggregate needs the following state information:

The user-specified set-up argument
The current number of values processed
The current list of values processed
The current list of whether the values processed are NULL.

Therefore, X_PERCENTILE uses a C structure named percentile_state_t to hold the aggregate state:

#define MAX_N 1000

typedef struct percentile_state
   {
   mi_integer percentile;
   mi_integer count;
   mi_integer value_array[MAX_N];
   mi_integer value_is_null[MAX_N];
   } percentile_state_t;

Important:

The percentile_state_t structure stores the values processes in an in-memory array within the state. You could also choose to store these values elsewhere, such as in an operating-system file or in a separate location in memory. Each of these locations has advantages and disadvantages. Choose the structure that best fits your application needs.

The size of the percentile_state_t structure depends on the number of aggregate arguments stored in the value_array array; that is, values less then or equal to the MAX_N constant. On a system with four-byte mi_integer values, the size of this structure is:

8 + 4(MAX_N)

If X_PERCENTILE used an opaque-type state, this structure must be less than the maximum opaque-type size. For systems that have a 32 kilobyte maximum opaque-type size, the X_PERCENTILE aggregate could use an opaque-type state as long as it is called in a query that finds 8190 or fewer rows. If the query finds more than 8190 rows, the state would not fit into an opaque type. To avoid this restriction, X_PERCENTILE implements the aggregate state as a pointer-valued state.

The following code shows the INIT aggregate support function that handles an INTEGER argument for the X_PERCENTILE aggregate:

/* X_PERCENTILE INIT support function on INTEGER */
mi_pointer init_xprcnt(dummy, prcntile, fparam)
   mi_integer dummy;
   mi_integer prcntile;
   MI_FPARAM *fparam;
{
   percentile_state_t *state;

   /* Allocate memory for the state from the PER_COMMAND
    * pool
    */
   state = (percentile_state_t *)
      mi_dalloc(sizeof(percentile_state_t), PER_COMMAND);

   /* Initialize the aggregate state */
   if ( mi_fp_argisnull(fparam, 1) )
      state->percentile = 50; /* median */
   else
      state->percentile = prcntile;
   state->count = 0;

   return ((mi_pointer)state);
}

This INIT support function performs the following tasks:

Handles a set-up argument
This set-up argument is the value that the end user specifies so that the aggregate can determine the value that has x percent values below and (100-x) percent above. If the end user provides an SQL NULL for the set-up argument, X_PERCENTILE assumes a value of 50 and therefore calculates the median.
Allocates PER_COMMAND memory for the pointer-valued state
The database server does not perform state management for pointer-valued states. Therefore, the INIT function must allocate the memory for the state. It also assigns the appropriate values to the percentile_state_t structure to initialize the state.

The following code implements the ITER aggregate support function that handles an INTEGER argument for the X_PERCENTILE aggregate:

/* X_PERCENTILE ITER support function on INTEGER */
mi_pointer iter_xprcnt(state_ptr, value, fparam)
   mi_pointer state_ptr;
   mi_integer value;
   MI_FPARAM *fparam;
{   
   mi_integer i, j;
   mi_integer is_null = 0;
   percentile_state_t *state = 
      (percentile_state_t *)state_ptr;

   /* Check for NULL-valued 'value' */
   if ( mi_fp_argisnull(fparam, 1) )
      {
      value = 0;
      is_null = 1;
      }

   /* Find position of 'value' in ordered 'value_array' */
   for ( i=0; i < state->count; i++ )
      {
      if ( state->value_array[i] > value )
         break;
      }

   /* Increment number of values (count) */
   ++state->count;

   /* Put value into ordered list of existing values */
   for (j=state->count - 1; j > i; j--)
      {
      state->value_array[j] = state->value_array[j-1];
      state->value_is_null[j] = state->value_is_null[j-1];
      }
   state->value_array[i] = value;
   state->value_is_null[i] = is_null;

return ((mi_pointer)state);
}

The ITER support function updates the aggregate state in-place with the following information:

Increments the number of aggregate arguments processed (count)
Stores the new aggregate argument in increasing sorted order in the value_array array
Stores the is-NULL flag that corresponds to each aggregate argument in its corresponding position in the value_is_null array

The ITER function also handles a possible NULL-valued aggregate argument. Because the X_PERCENTILE aggregate is defined to handle NULL values (see Figure 111), the database server calls ITER for NULL-valued aggregate arguments.

The following COMBINE aggregate support function handles an INTEGER argument for the X_PERCENTILE aggregate:

/* X_PERCENTILE COMBINE support function on INTEGER */
mi_pointer combine_xprcnt(state1_ptr, state2_ptr)
   mi_pointer state1_ptr, state2_ptr;
{
   mi_integer i;
   percentile_state_t *state1 = 
      (percentile_state_t *)state1_ptr;
   percentile_state_t *state2 = 
      (percentile_state_t *)state2_ptr;

   /* Merge the two ordered value arrays */
   for ( i=0; i < state2->count; i++ )
      (void) iter_xprcnt(state1_ptr, 
         state2->value_array[i]);

   /* Free the PER_COMMAND memory allocated to the state of
    * the 2nd parallel thread (state2). The memory
    * allocated to the state of the 1st parallel thread
    * (state1) holds the updated state. It is in the FINAL
    * support function.
    */
   mi_free(state2);

   return (state1_ptr);
}

This COMBINE support function merges two aggregate states, as follows:

Two ordered lists are merged into a single ordered list.
Two counts are added together.
Memory for one of the partial states is freed.
A pointer to the merged aggregate state is returned.

The following FINAL aggregate support function handles an INTEGER argument for the X_PERCENTILE aggregate:

/* X_PERCENTILE FINAL support function on INTEGER */
mi_integer final_xprcnt(state_ptr, fparam)
   mi_pointer state_ptr;
   MI_FPARAM *fparam;
{
   mi_integer index, trunc_int;
   mi_integer x_prcntile;
   percentile_state_t *state = 
      (percentile_state_t *)state_ptr;

/* Obtain index position of x-percentile value */
   trunc_int = (state->count) * (state->percentile);
   index = trunc_int/100;
   if ( (trunc_int % 100) >= 50 )
      index++;

/* Obtain x-percentile value from sorted 'value_array' */ 
   x_prcntile = state->value_array[index];

/* Check for NULL value so it can be returned as such */
   if ( state->value_is_null[index] )
      mi_fp_setreturnisnull(fparam, 0, MI_TRUE);

/* Free the PER_COMMAND memory allocated to the state */
   mi_free(state);

/* Return retrieved x-percentile value */
   return (x_prcntile);
}

This FINAL support function performs the following tasks:

Calculates the x-percentile for the values in the sorted array
The FINAL function must obtain the index position for the value that represents the x-percentile, where x is the percentage that the end user has passed in as a set-up argument.
Deallocates PER_COMMAND memory for the pointer-valued state
The database server does not perform any state management for pointer-valued states. Therefore, the FINAL function must deallocate the PER_COMMAND state memory that the INIT function has allocated.

After you successfully compile and link the aggregate support functions, you can define the PERCENT_GTR aggregate in the database. Figure 111 shows the CREATE AGGREGATE statement that defines the X_PERCENTILE user-defined aggregate. This statement specifies the registered SQL names of the required aggregate support functions. It also includes the HANDLESNULLS modifier to indicate that the PERCENT_GTR aggregate does process NULL values as aggregate arguments. By default, the database server does not pass a NULL value to an aggregate.

Figure 111. Registering the X_PERCENTILE User-Defined Aggregate

CREATE AGGREGATE x_percentile
   WITH (INIT = init_x_prcntile,
        ITER = iter_x_prcntile,
        COMBINE = combine_x_prcntile,
        FINAL = final_x_prcntile,
        HANDLESNULLS);

Suppose that the INIT, ITER, COMBINE, and FINAL aggregate support functions for the X_PERCENTILE aggregate are compiled and linked into a shared-object module named percent.

UNIX/Linux Only

On UNIX or Linux, the executable code for the X_PERCENTILE aggregate support functions would be in a shared library named percent.so.

End of UNIX/Linux Only

The following CREATE FUNCTION statements register the X_PERCENTILE aggregate support functions for an aggregate argument of the INTEGER data type:

CREATE FUNCTION init_x_prcntile(dummy INTEGER, x_percent INTEGER)
   RETURNING POINTER
   WITH (HANDLESNULLS)
   EXTERNAL NAME 
      '/usr/udrs/aggs/percent/percent.so(init_xprcnt)'
   LANGUAGE C;

CREATE FUNCTION iter_x_prcntile(agg_state POINTER, 
      one_value INTEGER)
   RETURNS POINTER
   WITH (HANDLESNULLS)
   EXTERNAL NAME 
      '/usr/udrs/aggs/percent/percent.so(iter_xprcnt)'
   LANGUAGE C;

CREATE FUNCTION combine_x_prcntile(agg_state1 POINTER, 
      agg_state2 POINTER)
   RETURNS POINTER
   WITH (HANDLESNULLS)
   EXTERNAL NAME 
   '/usr/udrs/aggs/percent/percent.so(combine_xprcnt)'
   LANGUAGE C;

CREATE FUNCTION final_x_prcntile(agg_state POINTER)
   RETURNS INTEGER
   WITH (HANDLESNULLS)
   EXTERNAL NAME 
      '/usr/udrs/aggs/percent/percent.so(final_xprcnt)'
   LANGUAGE C;

These CREATE FUNCTION statements use the SQL data type, POINTER, to indicate that the aggregate support functions accept a generic C pointer and perform their own memory management. They must all include the HANDLESNULLS routine modifier because the X_PERCENTILE aggregate handles NULL values.

The following query uses the X_PERCENTILE aggregate function on the INTEGER column, col3, to determine the quartile (the 25th percentile) for the values in col3:

SELECT X_PERCENTILE(col3, 25) FROM tab1;

For the tab1 rows that Figure 95 has inserted, X_PERCENTILE creates the following sorted list for the col3 values:

5, 9, 13, 19, 24, 31

Because 25 percent of 6 values is 1.5, X_PERCENTILE obtains the list item that has 2 values (1.5 rounded up to the nearest integer) below it. The preceding query returns 13 as the quartile for col3.

Suppose you add the following row to the tab1 table:

INSERT INTO tab1 (7, NULL:complexnum_t, NULL);

This INSERT statement adds a NULL value to the col3 column. Because X_PERCENTILE handles NULLs, the database server calls the X_PERCENTILE aggregate on this new row as well. After this seventh row is inserted, X_PERCENTILE would generate the following sorted list for col3:

(NULL), 5, 9, 13, 19, 24, 31

Twenty-five percent of 7 values is 1.75, so X_PERCENTILE obtains the list item that has 2 (1.75 truncated to the nearest integer) values below it. Now the quartile for col3 would be 9. If X_PERCENTILE was not registered with the HANDLESNULLS modifier, however, the database server would not call X_PERCENTILE for this newest row and the quartile for col3 would have been 13 (the quartile for 6 rows, even though col3 actually has 7 rows).

If you called the X_PERCENTILE aggregate with an x-percentile that would return the first value in the list (the NULL value), the FINAL support function uses the DataBlade API function mi_fp_setreturnisnull( ) to set the aggregate result to NULL. For example, suppose you execute the following query on the col3:

SELECT X_PERCENTILE(col3, 5) FROM tab1;

This query asks for the 5th percentile for the seven values in col3. Because 5 percent of 7 values is 0.35, X_PERCENTILE obtains the list item that has zero values (0.35 truncated to the nearest integer) below it. The preceding query returns NULL as the quartile for col3. The ITER function has stored NULL values as zeros in the sorted value_array. For the FINAL support function to determine when a value of zero indicates a NULL and when it indicates zero, it checks the value_is_null array. If the zero indicates a NULL value, FINAL calls the DataBlade API function mi_fp_setreturnisnull( ) to set the aggregate result to NULL.