The query optimizer uses the selectivity and cost of a query to help select the best query plan. To help the optimizer select the best query plan, you can provide the query optimizer with information about the selectivity and cost of your UDR.
This information is extremely useful for an expensive UDR, a UDR that requires a lot of execution time or resources to execute. When the query optimizer can obtain the selectivity and cost for an expensive UDR, it can better determine when to execute the UDR in the query plan.
When you write an expensive UDR, you can indicate the following performance-related characteristics of the UDR to assist the query optimizer in developing a query plan:
This section describes how to create selectivity and cost functions for an expensive UDR. For a general description of how the query optimizer uses cost and selectivity for UDRs, see the IBM Informix: User-Defined Routines and Data Types Developer's Guide.
If your UDR is a Boolean function, it can be used as a filter in a query. (For a list of Boolean functions that are useful as query filters, see Table 106.) The query optimizer uses the selectivity of a query to estimate the number of rows that the query will return. The selectivity is a floating-point value between zero (0) and one (1) that represents the percentage of rows for which the query (and each filter in the query) is expected to return a true value.
For a Boolean function likely to be used as a query filter, you can use the following routine modifiers to specify a selectivity for the function.
When the query optimizer needs to determine the selectivity of the Boolean function, it either uses the constant selectivity value or calls the selectivity function, depending whether the Boolean function was registered with the SELCONST or SELFUNC routine modifier.
If you need to calculate the selectivity for a Boolean function at runtime, create a selectivity function.
The selectivity function has the following coding requirements:
The SQL selectivity function has the following registration requirements:
If your UDR requires a lot of system resources (such as a large number of disk accesses or network accesses), you can define a cost-of-execution for the UDR. The query optimizer uses the cost value to determine the total cost of executing a query.
When you register a UDR, you can specify its cost with one of the following routine modifiers.
When the query optimizer needs to determine the cost of the UDR, it either uses the constant cost value or calls the cost function, depending whether the UDR was registered with the PERCALL_COST or COSTFUNC routine modifier.
If you need to calculate the cost for a UDR at runtime, create a cost function.
The cost function has the following coding requirements:
For more information, see MI_FUNCARG Data Type.
The SQL cost function has the following registration requirements:
The MI_FUNCARG data type is an Informix-defined opaque type that contains information about the companion UDR of a selectivity or cost function. Selectivity and cost functions both have the same number of arguments as their companion UDRs. To calculate selectivity or cost effectively, however, your user-defined function might need to know additional information about the context in which the UDR was called. The DataBlade API provides this contextual information in the MI_FUNCARG structure.
Each argument of a cost or selectivity function is of type MI_FUNCARG. The DataBlade API provides accessor functions for the MI_FUNCARG structure. You can use any of these functions to extract information about the companion-UDR arguments from the selectivity or cost function. Table 107 lists the DataBlade API accessor functions that obtain information from the MI_FUNCARG structure.
Table 108 lists the DataBlade API accessor functions that obtain general information about a companion UDR from the MI_FUNCARG structure.
The MI_FUNCARG structure categorizes each argument of the companion UDR arguments. The MI_FUNCARG data type identifies the following kinds of arguments in the companion UDR.
Companion-UDR Argument Type | Argument-Type Constant |
---|---|
Argument is a constant value | MI_FUNCARG_CONSTANT |
Argument is a column value | MI_FUNCARG_COLUMN |
Argument is a parameter | MI_FUNCARG_PARAM |
In addition to the general companion-UDR information that the functions in Table 108 obtain, you can also obtain information about the arguments themselves. The information that you can obtain depends on the particular category of the companion-UDR argument. Table 109 lists the DataBlade API accessor functions that obtain argument information from the MI_FUNCARG structure.
For example, you can write the following query:
SELECT * FROM tab1 WHERE meets_cost(tab1.int_col, 20) ...;
Suppose you register the meets_cost( ) function with a selectivity function named meets_cost_selfunc( ), as follows:
CREATE FUNCTION meets_cost(col INTEGER, value INTEGER) RETURNS BOOLEAN WITH (....SELFUNC=meets_cost_selfunc....) EXTERNAL NAME '......' LANGUAGE C;
Because the meets_cost( ) function returns a BOOLEAN value, you can write a selectivity function for the function. You write meets_cost_selfunc( ) so that it expects two arguments of the data type MI_FUNCARG. The following table shows what different MI_FUNCARG accessor functions return when you invoke them for each of the arguments of the meets_cost( ) function.
DataBlade API Function | Argument 1 | Argument 2 |
---|---|---|
mi_funcarg_get_argtype( ) | MI_FUNCARG_COLUMN | MI_FUNCARG_CONSTANT |
mi_funcarg_get_datatype( ) | Type identifier for data type of tab1.int_col | Type identifier for INTEGER data type |
mi_funcarg_get_datalen( ) | Length of tab1.int_col | Length of INTEGER |
mi_funcarg_get_tabid( ) | Table identifier of tab1 | Undefined |
mi_funcarg_get_colno( ) | Column number of int_col | Undefined |
mi_funcarg_isnull( ) | FALSE | FALSE |
mi_funcarg_get_constant( ) | Undefined | An MI_DATUM structure that holds the value of 20 |
When the companion UDR receives an argument that is a constant, you can obtain the following information about this constant from within the cost or selectivity function.
When the companion UDR receives an argument that is a column, you can obtain the following information about this column from the associated MI_FUNCARG argument of the cost or selectivity function.
The column number and table identifier are useful in a selectivity or cost function to obtain additional information about the column argument from the syscolumns or systables system catalog tables. The data distribution is useful if the determination of selectivity or cost depends on how the column values are distributed; that is, how many values in each range of values. Data distributions only make sense for data types that can be ordered.
The mi_funcarg_get_distrib( ) function obtains the contents of the encdat column of the sysdistrib system catalog table. The encdat column stores the data distribution for the column associated with the companion-UDR argument, as follows:
The mi_funcarg_get_distrib( ) function returns the data distribution in an mi_bitvarying structure as an mi_statret structure. The mi_statret structure can store the data distribution either directly in the structure (in the statdata.buffer field) or in a smart large object (in the statdata.mr field).
For more information about user-defined statistics, see Providing Statistics Data for a Column.
Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]