The Parallel Database Query (PDQ) feature allows the database server to run a single SQL statement in parallel. When you send a query to the database server, it breaks your request into a set of discrete subqueries, each of which can be assigned to a different CPU virtual processor. A parallelizable query is a query that can be executed in parallel. PDQ is especially effective when your tables are fragmented and your server computer has more than one CPU.
A parallelizable UDR is a C UDR that can be executed in parallel when it is invoked within a parallelizable query. If you write your C UDR to be parallelizable, it can be executed in parallel when the query that invokes it is executed in parallel. That is, the C UDR can execute on subsets of table data just as the query itself can. A query that invokes a nonparallelizable UDR can still run in parallel. However, the calls to the UDR do not run in parallel. Similarly, prepared queries that invoke a parallelizable query do not run the UDR in parallel.
The following subsections describe these steps in detail.
To write a parallelizable C UDR, you must ensure that the UDR does not include any calls to the non-PDQ-threadsafe DataBlade API functions that Table 110 lists.
Category of Non-PDQ-Threadsafe Function | DataBlade API Function | |
---|---|---|
Statement processing: | ||
Statement execution
A parallelizable UDR cannot parse an SQL statement. |
mi_exec( ), mi_prepare( ) | |
Current-statement
processing
No current statement exists in a parallelizable UDR. Therefore, these functions are not useful. |
mi_binary_query( ), mi_command_is_finished( ), mi_get_result( ), mi_get_row_desc_without_row( ), mi_next_row( ), mi_query_finish( ), mi_query_interrupt( ), mi_result_command_name( ), mi_result_row_count( ), mi_value( ), mi_value_by_name( ) | |
Prepared
statements
No prepared statement exists because you cannot prepare one in a parallelizable UDR. Therefore, these functions are not useful. |
mi_close_statement( ), mi_drop_prepared_statement( ), mi_exec_prepared_statement( ), mi_fetch_statement( ), mi_get_statement_row_desc( ), mi_open_prepared_statement( ), mi_statement_command_name( )
All
input-parameter accessor functions: mi_parameter_*
|
|
Transfer of data
Even though these type-transfer functions are PDQ-threadsafe, they are usually called within the send and receive support functions of an opaque type and are likely to be called during statement processing. |
All
type-transfer functions: mi_get_*, mi_put_*
(see Table 114) |
|
Other | mi_current_command_name( ) | |
Save-set handling | All
save-set functions: mi_save_set_*
(see Table 60) |
|
Complex-type (collections and row types) handling: | ||
Collection processing | All
collection functions: mi_collection_*
(see Collections) |
|
Row-type processing | mi_get_row_desc( ), mi_get_row_desc_from_type_desc( ), mi_get_row_desc_without_row( ), mi_get_statement_row_desc( )
mi_row_create( ), mi_row_free( ), mi_row_desc_create( ), mi_row_desc_free( ) |
|
Complex-type processing | Type-descriptor
accessor functions if they access a complex type: mi_type_*
(see Table 9) Column
functions if they access a complex type: mi_column_*
|
|
Operating-system file access | All
file-access functions: mi_file_*
(see Table 91) |
|
Tracing: | ||
Even though the files listed here are not PDQ-threadsafe, you can include most statements that generate trace output in a parallelizable UDR. | mi_tracefile_set( ), mi_tracelevel_set( )GL_DPRINTF | |
Miscellaneous | mi_get_connection_option( ), mi_get_database_info( ), mi_get_session_connection( ), mi_get_type_source_type( ) |
A parallelizable C UDR cannot call (either explicitly or implicitly) any of the DataBlade API functions in Table 110. If you attempt to run a UDR that contains a non-PDQ-threadsafe function in parallel, the database server generates an error. If your UDR must call one of the functions in Table 110, it cannot be parallelizable.
Keep in mind the following considerations when you write a UDR to be parallelizable:
When you register a UDR with the PARALLELIZABLE routine modifier, you tell the database server that the UDR was written according to the guidelines in Writing the Parallelizable UDR. That is, the UDR does not call any DataBlade API functions that are non-PDQ-threadsafe. However, registering the UDR with the PARALLELIZABLE modifier does not guarantee that every invocation of the UDR executes in parallel. The decision whether to parallelize a query and any accompanying UDRs is made when the query is parsed and optimized.
When a query with a parallelizable UDR executes in parallel, each routine instance might have more than one routine sequence. For a parallelized UDR, the routine manager creates a routine sequence for each scan thread of the query.
For example, suppose you have the following query:
SELECT a_func(x) FROM table1 WHERE a_func(y) > 7;
Suppose also that the table1 table in the preceding query is fragmented into four fragments and the a_func( ) user-defined function was registered with the PARALLELIZABLE routine modifier. When this query executes in serial, it contains two routine instances (one in the select list and one in the WHERE clause) and two routine sequences. However, when this query executes in parallel over table1, it still contains two routine instances but it now has six routine sequences:
The MI_FPARAM structure holds the information of the routine sequence. Therefore, the routine manager allocates an MI_FPARAM structure for each scan thread of the parallelized query. All invocations of the UDR that execute in the scan thread can share the information in an MI_FPARAM structure. However, UDRs in different scan threads cannot share MI_FPARAM information across scan threads.
For more information about how the routine manager creates a routine sequence, see Creating the Routine Sequence.
You can use the SQL statement SET EXPLAIN to determine whether a parallelizable query is actually being executed in parallel. The SET EXPLAIN statement executes when the database server optimizes a statement. It creates a file that contains:
For more information on SET EXPLAIN, see its description in the IBM Informix: Guide to SQL Syntax and your IBM Informix: Performance Guide.
The following onstat options are useful to track execution of parallel activities:
For more information on the onstat utility, see the IBM Informix: Administrator's Reference.
Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]