Creating a Well-Behaved Routine

Because the CPU VP is used to execute all client requests, it is important that the code it executes be well-behaved; that is, all code should have the following attributes:

Preserve availability of the CPU VP
The CPU VP performs system services and related tasks and executes code for UDRs. If a UDR issues a standard blocking I/O call in a CPU VP, then the VP must wait for the I/O to complete and cannot attend to other threads and administrative tasks. The time spent waiting adversely affects the overall performance of the system. DataBlade API I/O functions enable the CPU VP to process the I/O asynchronously and do not block the CPU VP.

The benefit of releasing the CPU VP so that it can execute other threads outweighs the overhead involved in saving the current thread state and switching to another thread. Each thread should explicitly yield the CPU VP in a timely manner (at least every 1/10 of a second).
Be process safe
Well-behaved code must be able to migrate among processes without loss of essential information or changing the global VP state. C UDR code is process safe when all state information is entirely encapsulated within the arguments to each C function and within the scope of the function itself. UDRs should not use global variables or system calls that change the process state.

Code that is provided to execute within SQL statements (such as built-in SQL functions) is well-behaved. However, IBM does not have control over the code you write in your C UDR. A C UDR must be well-behaved to execute in the CPU VP. As a UDR developer, you must ensure that your C UDR adheres to the safe-code requirements in Table 85.

Table 85. Safe-Code Requirements for a Well-Behaved UDR
Safe-Code Requirement	Coding Rule	Possible Workarounds
Preserve availability.	Yield the CPU VP in a timely manner (at least every 1/10 of a second).	To execute in the CPU VP, use mi_yield( ) to explicitly yield the CPU VP during resource-intensive processing. Otherwise, execute in a user-defined VP class.
	Do not use blocking I/O calls.	Execute in a yielding user-defined VP class.
	Never change the working directory.	None
Be process safe.	No heap-memory allocation	To execute in the CPU VP, use the DataBlade API memory-management functions.
	No modification of global or static data	To execute in the CPU VP, use the MI_FPARAM structure if you need to preserve state information. If necessary, global or static data can be read, as long as it is not updated. Otherwise, execute in a nonyielding user-defined VP class or a single-instance user-defined VP.
	No modification of the global state of the virtual processor	A C UDR that modifies the global VP state cannot execute safely in any VP. If modification of this data is essential to the application, execute the C UDR in a nonyielding user-defined VP class or a user-defined VP class that has only one VP defined.
Avoid unsafe operating-system calls.	Do not use any system calls that might impair availability or allocate local resources.	If use of such system calls is essential to the application, execute the C UDR in a nonyielding user-defined VP class and a single-instance VP and then change back.

If a UDR does not follow the safe-code requirements in Table 85, it is called an ill-behaved routine. An ill-behaved routine cannot safely execute in the CPU VP.

Warning:

Execution of an ill-behaved routine in the CPU VP can cause serious interference with the operation of the database server. In addition, the UDR itself might not produce correct results.

If your C UDR has one of the ill-behaved traits in Table 85, follow the suggestions in the Possible Workarounds column. The following sections describe more fully the safe-code requirements for a well-behaved C UDR.

Preserving Availability of the CPU VP

A well-behaved C UDR must preserve the availability of the CPU virtual processor (CPU VP). The CPU virtual processor appears to execute multiple threads simultaneously because it switches between threads. The database server tries to keep a thread running on the same CPU VP that begins the thread execution. However, if the current thread is waiting for some other type of resource to be accessed or some other task to be performed, the CPU virtual processor is needlessly held up. To avoid this situation, the database server can migrate the current thread to another VP.

For example, a query request starts as a session thread in the CPU VP. Suppose this query contains a C UDR that accesses a smart large object. While the thread waits for the smart-large-object data to be fetched from disk, the database server migrates the thread to an AIO VP, releasing control of the CPU VP so that other threads can execute.

At a given time, a VP can run only one thread. To maintain availability for session threads, the CPU VP swaps out one thread to allow another to execute. This process of swapping threads is sometimes called thread yielding. This continual thread yielding keeps the CPU VP available to process many threads. The speed at which CPU-VP processing occurs produces the appearance that the database server processes multiple tasks simultaneously.

Unlike an operating system, which assigns time slices to processes for their CPU access, the database server does not preempt a running thread when a fixed amount of time expires. Instead, it runs a thread until the thread yields the CPU VP. Thread yielding can occur at either of the following events:

When the thread explicitly calls mi_yield( )
When the thread requires some external resource to continue execution (such as file or data I/O)

When a thread yields, the VP switches to the next thread that is ready to run. The VP continues execution and migration of threads until it eventually returns to the original thread.

For a C UDR to preserve availability of the CPU VP, the UDR must ensure that it does not monopolize the CPU VP. When a C UDR keeps exclusive control of the CPU VP, the UDR blocks other threads from accessing this VP. A C UDR can impair concurrency of client requests if it behaves in either of the following ways:

It does not regularly yield the CPU.
You must ensure that the C UDR yields the CPU VP at appropriate intervals.
It calls a blocking-I/O function.
You must ensure that the C UDR does not call any blocking I/O functions because they can monopolize the CPU VP and possibly hang the database server.

Denying other threads access to the CPU VP can affect every user on the system, not just the users whose queries contain the same C UDR. If you cannot code a C UDR to explicitly yield during resource-intensive processing and to avoid blocking-I/O functions, the UDR is an ill-behaved routine and must execute in a user-defined VP class.

Yielding the CPU VP

To preserve the availability of the CPU VP, a well-behaved C UDR must ensure that it regularly yields the CPU VP to other threads. A C UDR might yield when it calls a DataBlade API function because DataBlade API functions automatically yield the VP when appropriate. For example, the UDR thread might migrate to the AIO VP to perform any of the following kinds of I/O:

Smart-large-object I/O with a DataBlade API function such as mi_lo_open( ), mi_lo_read( ), or mi_lo_write( )
External-file I/O with a DataBlade API file-access function such as mi_file_open( ), mi_file_read( ), or mi_file_write( )

Therefore, you can assume that thread migration might occur during execution of any DataBlade API function.

However, if your C UDR performs any of the following types of resource-intensive tasks (which do not involve calls to DataBlade API functions), your UDR does not automatically yield the VP:

A task that is CPU- or I/O-bound
A task that causes other threads to wait for an undue length of time (usually longer than 0.1 seconds)

For such a C UDR to be well-behaved, it must explicitly yield the CPU VP with the DataBlade API function mi_yield( ). The mi_yield( ) function causes the thread that is executing the UDR to voluntarily yield the CPU VP so that other threads get a chance to execute in the VP. When the original thread is ready to continue execution, execution resumes at the point immediately after the call to the mi_yield( ) function.

Write your C UDR so that it yields the VP at strategic points in its processing. Possible points include the beginning or end of lengthy loops and before and/or after expensive computations. Judicious use of mi_yield( ) generally leads to an improved response time overall.
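
The placement advice above can be sketched as follows. This fragment is only an illustration under stated assumptions: the sum_squares( ) function, the YIELD_INTERVAL constant, and the HAVE_MI_H macro are hypothetical names, and the interval of 10000 iterations is an arbitrary placeholder that you would tune by measurement so that the routine yields well within 0.1 seconds. The stand-in mi_yield( ) exists only so that the fragment compiles outside the database server.

```c
/* Use the real DataBlade API header when the compiler can find it.
 * HAVE_MI_H is a hypothetical macro name used only by this sketch. */
#if defined(__has_include)
#  if __has_include(<mi.h>)
#    include <mi.h>
#    define HAVE_MI_H 1
#  endif
#endif

#ifndef HAVE_MI_H
/* Stand-ins, NOT the real DataBlade API. They only let the sketch
 * compile and run outside the database server. */
typedef int mi_integer;
static long yield_calls = 0;          /* counts calls, for demonstration */
static mi_integer mi_yield(void) { yield_calls++; return 0; }
#endif

#define YIELD_INTERVAL 10000          /* assumed value; tune by measurement */

/* Hypothetical CPU-bound helper: sums i*i for 0 <= i < n and
 * yields the VP once every YIELD_INTERVAL iterations. */
long long sum_squares(long n)
{
    long long total = 0;
    long i;

    for (i = 0; i < n; i++)
    {
        total += (long long) i * i;

        /* Strategic yield point inside the lengthy loop */
        if ((i + 1) % YIELD_INTERVAL == 0)
            (void) mi_yield();
    }
    return total;
}
```

Yielding on every iteration would also be correct but would add thread-switch overhead to each pass through the loop; yielding every N iterations is one possible compromise.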

If you cannot code the C UDR to explicitly yield during resource-intensive passages of code, the UDR is considered an ill-behaved routine and must not execute in the CPU VP. To isolate a resource-intensive UDR from the CPU VP, you can assign the routine to a user-defined VP class. To determine which kind of user-defined VP to define, you must also consider whether you need to preserve availability of the user-defined VP. Keep in mind that all VPs of a class share a thread queue. If there are multiple users of your UDR, multiple threads can accumulate in the same thread queue. If your UDR does not yield, it blocks other UDRs that execute in the same VP class. Therefore, the VP might not effectively share between users. One user might have to wait while the UDR in the query of some other user completes.

You can use a user-defined VP to execute a resource-intensive routine:

To preserve availability of a user-defined VP, execute the routine in a yielding user-defined VP.
Within your UDR, you can use the mi_yield( ) function to yield the user-defined VP to other threads that execute in the same VP class. To increase availability, you can define multiple instances of the yielding user-defined VP.
If you cannot rewrite the routine to yield, add more user-defined VPs.
A nonyielding user-defined VP is used for code that must maintain ownership of the process until it completes. A nonyielding VP might modify a global variable or use a command resource that cannot be shared.

Avoiding Blocking I/O Calls

To preserve concurrency, a well-behaved C UDR must avoid system calls that perform blocking input and output operations (I/O). Some of these operating-system calls follow:

accept( )
bind( )
fopen( )
getmsg( )
msgget( )
open( )
pause( )
poll( )
putmsg( )
read( )
select( )
semop( )
wait( )
write( )

When a C UDR executes any of these system calls, the CPU VP must wait for the I/O to complete. In the meantime, the CPU VP cannot process any other requests. The database server can appear to stall because the concurrency of the CPU VP is impaired.

If your C UDR needs to perform file I/O, do not use operating-system calls to perform this task. Instead, use the DataBlade API file-access functions. These file-access functions allow the CPU VP to process the I/O asynchronously. Therefore, they do not block the CPU VP. For more information, see Access to Operating-System Files.

If your UDR must issue blocking I/O calls, assign the routine to execute in a user-defined VP class. When a UDR blocks a user-defined VP, only those UDRs that are assigned to that VP are affected. You might need to use a single instance of a user-defined VP, which would affect client response. Your UDR must also handle any problems that could occur if the thread yielded; for example, operating-system file descriptors do not migrate with a thread if it moves to a different VP.

Writing Threadsafe Code

A well-behaved C UDR must be threadsafe. During execution, an SQL request might travel around the different VP classes. For example, a query starts in the CPU VP, but it might migrate to a user-defined VP to execute a UDR that was registered for that VP class. In turn, the UDR might fetch a smart large object, which would cause the thread to migrate to the AIO VP.

Migrating a thread to a different VP means that the database server must preserve the state of the thread before it migrates the thread. When a client application connects to the database server, the database server creates a thread-control block (TCB) to store thread-state information needed when a thread switches VPs. The TCB includes the following thread-state information:

Contents of the VP system registers
Program counter, which contains the address of the next instruction to execute.
Stack pointer, which points to private memory, called a thread stack
For more information on use of the thread stack by a UDR, see Managing Stack Space.

Tip:

For more information on the structure and use of the thread-control block, see your IBM Informix: Administrator's Guide.

When a thread migrates from one VP to another, it releases its original VP so this VP can execute other threads. The benefit of releasing the CPU VP outweighs the overhead involved in saving the thread state. Therefore, a C UDR must be able to continue execution without loss of information when it migrates to a different VP.

For a C UDR to successfully migrate among VPs, its code must be threadsafe; that is, it must have the following attributes:

Does not perform any dynamic memory allocation with operating-system calls
Does not modify global or static data
Does not modify other global process-state information

Tip:

A parallelizable UDR has additional coding restrictions. For more information, see Creating Parallelizable UDRs.

Restricting Memory Allocation

To be threadsafe, a well-behaved C UDR must not use system memory-management routines to allocate memory dynamically, including the following operating-system calls:

calloc( )
free( )
malloc( )
mmap( )
realloc( )
shmat( )
valloc( )

Many other system calls allocate memory as well.

These operating-system calls allocate memory from the program heap space. The location of this heap space on only one VP creates the following problems:

Heap memory available to one VP is not visible after a thread migrates to another VP.
Once the thread migrates, the UDR can no longer access any data that was stored in heap memory. Even if the UDR allocates heap memory at the beginning of execution and frees this memory before it completes, the thread might still migrate to a different VP during execution of the UDR.
Other VPs are not prevented from using the same address space for the shared-memory pool.
When a VP needs to extend the virtual memory pool, it negotiates the addition of new shared-memory segments to the existing pool. The VP then updates the resident portion of shared memory and sends a signal to other VPs so that they can become aware of changes to shared memory.

A VP that extends the memory pool is not aware of any portion of memory that malloc( ) (or any other system memory-management routine) is using. Therefore, the VP might try to use the same address space that a system memory-management call has reserved.
Heap memory that system memory-management calls allocate is not automatically freed.
If a C UDR does not explicitly free this heap memory, memory leaks can occur.

For a C UDR to be well-behaved, it must handle dynamic memory allocation with the DataBlade API memory-management functions. These DataBlade API functions provide the following benefits:

They allocate user memory from the database server shared memory.
All VPs can access database server shared memory. Figure 84 shows the areas of memory from which DataBlade API and operating-system memory-management functions allocate. For more information, see Managing User Memory.
They allocate user memory with a specified lifetime called a memory duration.
If a C UDR does not explicitly free memory that these DataBlade API functions allocate, the database server automatically deallocates it when its memory duration has expired. This automatic reclamation reduces memory leaks. For more information, see Choosing the Memory Duration.

Important:

Do not call operating-system memory-management functions from within a C UDR. Use these DataBlade API memory-management functions instead. The DataBlade API memory-management functions are safer in a C UDR than their operating-system equivalents.
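
As a minimal sketch of this rule, the following fragment copies a string into user memory with mi_alloc( ) instead of malloc( ). The dup_in_user_memory( ) function and the HAVE_MI_H macro are hypothetical names used for illustration, and the stand-in definitions of mi_alloc( ) and mi_free( ) exist only so that the fragment compiles outside the database server. They do not reproduce shared-memory allocation or memory durations.

```c
#include <string.h>

/* Use the real DataBlade API header when the compiler can find it.
 * HAVE_MI_H is a hypothetical macro name used only by this sketch. */
#if defined(__has_include)
#  if __has_include(<mi.h>)
#    include <mi.h>
#    define HAVE_MI_H 1
#  endif
#endif

#ifndef HAVE_MI_H
/* Stand-ins, NOT the real DataBlade API: plain heap allocation
 * so that the sketch can run outside the database server. */
#include <stdlib.h>
typedef int mi_integer;
static void *mi_alloc(mi_integer size) { return malloc((size_t) size); }
static void  mi_free(void *ptr)        { free(ptr); }
#endif

/* Hypothetical helper: returns a copy of src in user memory,
 * or NULL if the allocation fails. */
char *dup_in_user_memory(const char *src)
{
    size_t len  = strlen(src) + 1;    /* include the terminating null byte */
    char  *copy = (char *) mi_alloc((mi_integer) len);

    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}
```

A caller can release the copy with mi_free( ) or, in a real UDR, leave it for the database server to reclaim when the memory duration expires.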

If you are porting legacy code to a C UDR, you might want to write simple C programs to implement system memory-management calls and link these functions into your code before you make the UDR shared-object module. The following code fragment shows a simple implementation of malloc( ) and free( ) functions:

/* mallocfix.c: This file contains "fixed" versions of the
 *              malloc( ) and free( ) system memory-management
 *              calls for use in legacy code that currently
 *              uses malloc( ) and free( ).
 * Use mi_alloc( ) and mi_free( ) in new code.
 */
#include <mi.h>
void *malloc(size_t size)
{
   return (mi_alloc((mi_integer)size));
}

void free(void *ptr)
{
   mi_free(ptr);
}

This code fragment uses mi_alloc( ), which allocates user memory in the current memory duration. Therefore, the fragment allocates the memory with the default memory duration of PER_ROUTINE. For more information, see Managing the Memory Duration.

If you cannot avoid using system memory-management functions, your C UDR is ill-behaved. You can use system memory-management functions in your UDR only if you can guarantee that the thread will not migrate. A thread could migrate during any DataBlade API call. To guarantee that the thread never migrates, you can either allocate and free the memory inside a code block that does not execute any DataBlade API functions or use a single-instance VP.

This restriction means that if you must use a system memory-management function, you must segment the UDR into sections that use DataBlade API functions and sections that are not safe in the CPU VP. All files must be closed and memory deallocated before you leave the sections that are not safe in the CPU VP. For more information, see External-Library Routines.

Avoiding Modification of Global and Static Variables

To be threadsafe, a well-behaved C UDR must avoid use of global and static variables. Global and static variables are stored in the address space of a virtual processor, in the data segment of a shared-object file. These variables belong to the address space of the VP, not of the thread itself. Modification of or taking pointers to global or static variables is not safe across VP migration boundaries.

When an SQL statement contains a C UDR, the routine manager loads the shared-object file that contains the UDR object code into each VP. Therefore, each VP receives its own copy of the data and text segments of a shared-object file and all VPs have the same initial data in their shared-object data segments. Figure 73 shows a schematic representation of a virtual processor and indicates the location of global and static variables.

Figure 73. Location of Global and Static Variables in a VP

As Figure 73 shows, global and static variables are not stored in database server shared memory, but in the data and text segments of a VP. These segments in one VP are not visible after a thread migrates to another VP. Therefore, if a C UDR modifies global or static data in the data segment of one VP, the same data is not available if the thread migrates.

Figure 74 shows an implementation of a C UDR named bad_rowcount( ) that creates an incremented row count for the results of a query.

Figure 74. Incorrect Use of Static Variable in a C UDR

/* bad_rowcount( )
 *     Increments a counter for each row in a query result.
 *     This is the WRONG WAY to implement the function
 *     because it updates a static variable.
 */
mi_integer
bad_rowcount(MI_FPARAM *Gen_fparam)
{
   static mi_integer bad_count = 0;
   bad_count++;
   return bad_count;
}

Suppose the following SELECT statement executes:

SELECT bad_rowcount( ), customer_id FROM customer;

The CPU VP that is processing this query (for example, CPU-VP 1) executes the bad_rowcount( ) function. The bad_rowcount( ) function is not well-behaved because it uses a static variable to hold the row count. Use of this static bad_count variable creates the following problems:

The updated bad_count value is not visible when the thread migrates to another VP.
When bad_rowcount( ) increments the bad_count variable to 1, it updates the static variable in the shared-object data segment of CPU-VP 1. If the thread now migrates to a different CPU VP (for example, CPU-VP 2), this incremented value of bad_count is not available to the bad_rowcount( ) function. This next invocation of bad_rowcount( ) gets an initialized value of zero (0), instead of 1.
Concurrent activity of the bad_rowcount( ) function is not interleaved.
For example, suppose CPU-VP 1 and CPU-VP 2 are processing session threads for three client applications, each of which executes the bad_rowcount( ) function. Now two copies of the bad_count static variable are being incremented among the three client applications, so a client can receive counts that include rows from another client's query.
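
The first problem can be illustrated with a toy model that runs as an ordinary standalone C program, not as a UDR. The array bad_count_in_vp and the function simulated_bad_rowcount( ) are hypothetical: each array element stands for the private copy of bad_count in the data segment of one VP, and the vp argument stands for the VP on which the thread happens to be running.

```c
#define NUM_VPS 2   /* assumed number of CPU VPs in this toy model */

/* One counter per simulated VP, mirroring the fact that each VP has
 * its own copy of the shared-object data segment. */
static int bad_count_in_vp[NUM_VPS] = { 0, 0 };

/* Simulates one invocation of bad_rowcount( ) on the VP with index vp. */
int simulated_bad_rowcount(int vp)
{
    bad_count_in_vp[vp]++;
    return bad_count_in_vp[vp];
}
```

In this model, calling simulated_bad_rowcount(0) for the first row and simulated_bad_rowcount(1) for the second row returns 1 both times, because the increment made on the first VP is not visible on the second.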

A well-behaved C UDR can avoid use of global and static data with the following workarounds.

Workaround	Description
Use only local (stack) variables and user memory (which the DataBlade API memory-management functions allocate).	Both of these types of memory remain accessible when a thread migrates to another VP: Because the stack is maintained as part of the thread, reads and writes of local variables are maintained when the thread migrates among VPs. Write reentrant code that keeps variables on the stack. User memory resides in database server shared memory and therefore is accessible by all VPs. For more information, see Managing User Memory.
Use a function-parameter structure, named MI_FPARAM, to track private state information for a C UDR.	The MI_FPARAM structure is available to all invocations of a UDR within a routine sequence. Figure 48 shows the implementation of the rowcount( ) function, which uses the MI_FPARAM structure to correctly implement the row counter that bad_rowcount( ) attempts to implement. For more information, see Saving a User State.
If necessary, you can use read-only static or global variables because the values of these variables remain the same in each CPU VP.	Keep in mind, however, that addresses of global and static variables as well as addresses of functions are not stable when the UDR migrates across VPs.
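
Figure 48 is not reproduced here. As a rough sketch of the MI_FPARAM workaround, the following fragment keeps the counter in user memory and stores a pointer to it in the MI_FPARAM structure. It assumes that the accessor functions mi_fp_funcstate( ) and mi_fp_setfuncstate( ), the mi_dalloc( ) function, and the PER_COMMAND memory duration are used in the way that Saving a User State and Choosing the Memory Duration describe. The names rowcount_sketch( ) and HAVE_MI_H are hypothetical, and the stand-in definitions exist only so that the fragment compiles outside the database server.

```c
#include <stddef.h>

/* Use the real DataBlade API header when the compiler can find it.
 * HAVE_MI_H is a hypothetical macro name used only by this sketch. */
#if defined(__has_include)
#  if __has_include(<mi.h>)
#    include <mi.h>
#    define HAVE_MI_H 1
#  endif
#endif

#ifndef HAVE_MI_H
/* Stand-ins, NOT the real DataBlade API. */
#include <stdlib.h>
typedef int mi_integer;
typedef enum { PER_COMMAND } MI_MEMORY_DURATION;
typedef struct { void *funcstate; } MI_FPARAM;
static void *mi_dalloc(mi_integer size, MI_MEMORY_DURATION duration)
{
    (void) duration;                  /* durations are not modeled here */
    return malloc((size_t) size);
}
static void *mi_fp_funcstate(MI_FPARAM *fp) { return fp->funcstate; }
static void  mi_fp_setfuncstate(MI_FPARAM *fp, void *state)
{
    fp->funcstate = state;
}
#endif

/* Hypothetical row counter that keeps its state in MI_FPARAM
 * instead of in a static variable. */
mi_integer rowcount_sketch(MI_FPARAM *fparam)
{
    mi_integer *count = (mi_integer *) mi_fp_funcstate(fparam);

    if (count == NULL)
    {
        /* First invocation: allocate the counter in user memory. */
        count = (mi_integer *) mi_dalloc((mi_integer) sizeof(mi_integer),
                                         PER_COMMAND);
        if (count == NULL)
            return -1;   /* simplistic; a real UDR would report an error */
        *count = 0;
        mi_fp_setfuncstate(fparam, (void *) count);
    }
    return ++(*count);
}
```

Because the counter is reached only through the MI_FPARAM structure, each routine sequence has its own count and the value remains reachable if the thread migrates.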

If your C UDR cannot avoid using global or static variables, it is an ill-behaved routine. You can execute the ill-behaved routine in a nonyielding user-defined VP class but not in the CPU VP. A nonyielding user-defined VP prevents the UDR from yielding and thus from migrating to another VP. Because the nonyielding VP executes the UDR to completion, any global (or static) value is valid for the duration of a single invocation of the UDR. The nonyielding VP prevents other invocations of the same UDR from migrating into the VP and updating the global or static variables. However, it does not guarantee that the UDR will return to the same VP for the next invocation.

For the global (or static) value to be valid across a single UDR instance (all invocations of the UDR), define a single-instance user-defined VP. This VP class contains one nonyielding VP. It ensures that all instances of the same UDR execute on the same VP and update the same global variables. A single-instance user-defined VP is useful if your UDR must access a global or static variable by its address.

For more information, see Choosing the User-Defined VP Class.

Modifying the Global Process State

To be VP safe, a well-behaved C UDR must avoid modification of the global process state. All virtual processors that belong to the same VP class share access to both data and processing queues in memory. However, the global process state is not shared. The database server assumes that the global process state of each VP is the same. This consistency ensures that VPs can exchange work on threads.

For a C UDR to be well-behaved, it must avoid any programming tasks that modify the global process state of the virtual processor. Updating global and static data (see Avoiding Modification of Global and Static Variables) is one way of modifying the global process state. A well-behaved UDR must also not use operating-system calls that can alter the process state, such as chdir( ), fork( ), signal( ), or umask( ). Such operating-system calls can interfere with thread migration because the global process state does not migrate with the thread. In addition, you need to be careful with tasks such as opening file descriptors and using operating-system threads.
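
One way to avoid chdir( ) is to leave the working directory alone and build a full path name for each file that the UDR needs. The following sketch shows such a helper. The build_path( ) function is a hypothetical name, the '/' separator assumes UNIX or Linux path syntax, and the helper uses only strlen( ) and memcpy( ) on caller-supplied buffers.

```c
#include <string.h>

/* Hypothetical helper: writes "dir/name" into buf.
 * Returns 0 on success or -1 if the result does not fit in bufsize bytes. */
int build_path(char *buf, size_t bufsize, const char *dir, const char *name)
{
    size_t dlen = strlen(dir);
    size_t nlen = strlen(name);

    /* Room is needed for dir, '/', name, and the terminating null byte. */
    if (dlen + nlen + 2 > bufsize)
        return -1;

    memcpy(buf, dir, dlen);
    buf[dlen] = '/';
    memcpy(buf + dlen + 1, name, nlen + 1);   /* copies the null byte too */
    return 0;
}
```

The resulting path can then be passed to a file-access function without any change to the process state of the VP.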

Avoiding Restricted System Calls

A well-behaved C UDR must avoid the use of restricted system calls, which can have the following adverse effects:

They might block I/O, which causes the operating system to suspend the process that calls them.
This suspension slows down both the C UDR that contains the calls and any other threads that share the same CPU virtual processor.
Many system calls allocate resources local to the process and are not re-entrant.

IBM cannot provide a definitive list of unsafe system calls because system calls that are unsafe vary among versions of operating systems and different types of operating systems. Additionally, the implementation of the VPs is different between UNIX or Linux and Windows:

On UNIX or Linux, the VPs are implemented as separate processes.
On Windows, each VP is a thread of a common process.

The difference in VP implementation means that some system calls are acceptable when the C UDR runs on Windows but not when this same UDR runs on UNIX or Linux. There are also differences in how UNIX or Linux handles shared libraries and how Windows handles dynamic link libraries (DLLs) that can affect the platform on which operating-system calls are valid. Therefore, UDRs might not be portable from one operating system to another.

Unsafe Operating-System Calls

An unsafe system call is one that blocks, causing the virtual processor to stall the CPU until the call returns, or one that allocates resources local to the virtual processor instead of in shared memory. A system call within a transaction is not terminated by a rollback, so a suspended transaction can wait indefinitely for the call to return. For instructions on recovery from a deadlock during a long transaction rollback, see the IBM Informix: Dynamic Server Administrator's Guide.

A well-behaved C UDR must not include any of the categories of system calls in Table 86. The system calls listed in the Sample Operating-System Calls column are listed only as possible examples. The operating-system calls that are unsafe in your C UDR can depend on your operating system. Consult your operating-system documentation for information on system calls that perform the categories of tasks in Table 86.

Table 86. Unsafe Operating-System Calls
Type of Operating-System Call	Sample Operating-System Calls
Calls that manipulate signals to processes	signal( ), alarm( ), sleep( )
Calls that modify the system security	setuid( ), seteuid( ), setruid( ), setgid( ), setegid( ), setrgid( )
Calls that initiate or halt system processes	fork( ), exec( ), exit( ), system( ), popen( )
Calls that modify the shared-memory segments	shmat( )
Calls that modify the runtime environment of the dynamic linker	dlopen( ), dlsym( ), dlerror( ), dlclose( ); Windows: LoadLibrary( )

Warning:

The database server reserves all operating-system signals for its own use. The virtual processors use signals to communicate with one another. If a UDR were to use signals, these signals would conflict with those that the virtual processors use. Therefore, do not raise, handle, or mask signals within a C UDR.

You can use system utilities to check if undesired system calls were included in your shared-object file:

On UNIX or Linux, you can use the nm and ldd commands to obtain this information. The ldd command lists the dynamic dependencies from a shared object.
On Windows, you can use the DUMPBIN command with its /IMPORTS option to obtain this information.

Tip:

Given a DataBlade build (.bld) file, check for unresolved references in the file and all its dependencies. You can then check this list for system calls that violate the rules of the VP class that you have chosen to execute your C UDR.

For a list of operating-system calls that are generally safe in a C UDR, see Safe Operating-System Calls.

External-Library Routines

A C UDR should avoid the use of routines from existing external libraries. Some of these external routines might contain system calls that are restricted in your VP. If your C UDR must use an external routine, it might be ill-behaved. Avoid calling the following kinds of external library routines, which are not safe in the CPU VP:

Routines that do blocking I/O, such as routines that open files
Routines that dynamically allocate memory, such as malloc( )
Routines that allocate static memory

To execute one of these routines safely in a UDR, the following steps are possible:

Divide the UDR into critical-code sections and DataBlade-API-code sections.
Execute the UDR in a user-defined VP.

The following text explains these steps.

Important:

Any external-library routine that uses signals cannot be used in a C UDR. Do not use this suggested workaround for any external library call that uses signals.

For an external routine to execute safely, the thread that executes the UDR must not migrate out of the VP as long as the UDR uses the unsafe resources (open files, memory allocated with malloc( ), or static-memory data). However, DataBlade API functions might automatically yield the VP when they execute. This yielding causes the thread to migrate to another VP.

Therefore, you cannot interleave DataBlade API calls and external routines in your UDR. Instead, you must segment your C UDR into the following distinct sections:

Critical-code sections
These sections contain only the external-library calls that are not safe in the CPU VP. Before execution leaves the critical-code section, any unsafe resources must be released: open files must be closed and memory allocated with malloc( ) must be deallocated.
DataBlade-API code sections
These sections contain only DataBlade API functions. No external-library functions that are not safe in the CPU VP exist in these sections because any DataBlade API function might cause the thread to migrate.
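
The segmentation can be sketched as follows. This fragment is an illustration only: legacy_upper( ) is a hypothetical stand-in for an external-library routine that returns memory allocated with malloc( ), and upper_in_user_memory( ), MAX_WORD, and HAVE_MI_H are likewise hypothetical names. The stand-in mi_alloc( ) and mi_free( ) exist only so that the fragment compiles outside the database server. As the steps above indicate, a routine of this kind might still need to execute in a user-defined VP.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Use the real DataBlade API header when the compiler can find it.
 * HAVE_MI_H is a hypothetical macro name used only by this sketch. */
#if defined(__has_include)
#  if __has_include(<mi.h>)
#    include <mi.h>
#    define HAVE_MI_H 1
#  endif
#endif

#ifndef HAVE_MI_H
/* Stand-ins, NOT the real DataBlade API. */
typedef int mi_integer;
static void *mi_alloc(mi_integer size) { return malloc((size_t) size); }
static void  mi_free(void *ptr)        { free(ptr); }
#endif

#define MAX_WORD 64   /* assumed upper bound, including the null byte */

/* Hypothetical external-library routine that returns malloc( ) memory. */
static char *legacy_upper(const char *s)
{
    size_t len = strlen(s), i;
    char  *out = (char *) malloc(len + 1);

    if (out == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        out[i] = (char) toupper((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

char *upper_in_user_memory(const char *word)
{
    char   local[MAX_WORD];   /* stack memory migrates with the thread */
    char  *heap;
    char  *result;
    size_t len = strlen(word) + 1;

    if (len > MAX_WORD)
        return NULL;

    /* Critical-code section: no DataBlade API calls in here. */
    heap = legacy_upper(word);
    if (heap == NULL)
        return NULL;
    memcpy(local, heap, len);
    free(heap);               /* release heap memory before leaving */

    /* DataBlade-API code section: the thread might migrate from here on. */
    result = (char *) mi_alloc((mi_integer) len);
    if (result != NULL)
        memcpy(result, local, len);
    return result;
}
```

The result of the external routine is carried across the section boundary in a local (stack) buffer, so no pointer to heap memory is live when the first DataBlade API function is called.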

Safe Operating-System Calls

The following table lists operating-system calls that are considered safe within a well-behaved C UDR on all supported platforms. Be sure to use threadsafe (_r) versions where applicable.

Category	System Calls	Notes
Character classification	isalnum( ), isalpha( ), isascii( ), isastream( ), isatty( ), iscntrl( ), isdigit( ), isgraph( ), islower( ), isspace( ), isprint( ), ispunct( ), isupper( ), isxdigit( )	None
String manipulation	tolower( ), toupper( ), toascii( )	None
String parsing	getopt( ), getsubopt( )	None
Multibyte strings	mbtowc( ), wctomb( ), mblen( ), mbstowcs( ), wcstombs( )	None
String processing	strcasecmp( ), strcat( ), strchr( ), strcmp( ), strcoll( ), strcpy( ), strcspn( ), strdup( ), strerror( ), strlen( ), strncasecmp( ), strncat( ), strncmp( ), strncpy( ), strpbrk( ), strrchr( ), strsignal( ), strspn( ), strstr( ), strtod( ), strtok( ), strtok_r( ), strtol( ), strtoll( ), strtoul( ), strtoull( ), strxfrm( )	None
String formatting	sprintf( ), sscanf( )	None
Numeric processing	a64l( ), l64a( ), abs( ), labs( ), llabs( ), atof( ), atoi( ), atol( ), atoll( ), div( ), ldiv( ), lldiv( ), lltostr( ), strtoll( )	None
Random-number generation	srand( ), rand( ), srandom( ), random( ), srand48( ), drand48( ), erand48( ), lrand48( ), nrand48( ), mrand48( )	The random-number generator must be reseeded whenever a thread switch might have occurred.
Numeric conversion	econvert( ), fconvert( ), gconvert( ), seconvert( ), sfconvert( ), sgconvert( ), qeconvert( ), qfconvert( ), ecvt( ), fcvt( ), gcvt( )	ifx_dececvt( ), ifx_decfcvt( )
Time functions	ascftime( ), strftime( ), cftime( ), ctime( ), ctime_r( ), asctime( ), asctime_r( ), gmtime( ), gmtime_r( ), difftime( ), localtime( ), localtime_r( ), clock( ), gettimeofday( ), mktime( )	No time-zone changes are permitted.
Date functions	getdate( )	None
Sorting and searching	bsearch( ), qsort( ), lfind( ), lsearch( )	None
Encryption	crypt( ), setkey( ), encrypt( )	None
Memory management	memccpy( ), memchr( ), memcmp( ), memcpy( ), memmove( ), memset( )	Use memmove( ) and memset( ) only for memory that was allocated with mi_alloc( ).
Environment information	getenv( )	None
Bit manipulation	ffs( )	None
Byte manipulation	swab( )	None
Structure-member manipulation	offsetof( )	None
Trigonometric functions	acos( ), acosh( ), asin( ), asinh( ), atan( ), atan2( ), atanh( ), cos( ), cosh( ), sin( ), sinh( ), tan( ), tanh( )	None
Bessel functions	j0( ), j1( ), jn( ), y0( ), y1( ), yn( )	None
Root extraction	cbrt( ), sqrt( )	None
Rounding	ceil( ), floor( ), rint( )	None
IEEE functions	copysign( ), isnan( ), fabs( ), fmod( ), nextafter( ), remainder( )	None
Error functions	erf( ), erfc( )	None
Exponentials and logarithms	exp( ), expm1( ), log( ), log10( ), log1p( ), pow( )	None
Gamma functions	lgamma( ), lgamma_r( )	The contents of signgam are unreliable after a thread switch.
Euclidean distance	hypot( )	None
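
The advice to prefer threadsafe (_r) versions can be illustrated with strtok_r( ), which appears in the preceding table. The following sketch is only an example, and the helper name count_tokens( ) is hypothetical. Unlike strtok( ), strtok_r( ) keeps its scan position in a pointer that the caller supplies, so all state stays within the scope of the function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose strtok_r() on POSIX systems */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Counts the delimiter-separated tokens in buf. The scan position lives
 * in the local variable save, not in hidden static storage.
 * Note: like strtok(), strtok_r() overwrites delimiters in buf.
 */
size_t count_tokens(char *buf, const char *delims)
{
    char *save = NULL;
    char *tok;
    size_t count = 0;

    assert(buf != NULL && delims != NULL);

    for (tok = strtok_r(buf, delims, &save);
         tok != NULL;
         tok = strtok_r(NULL, delims, &save))
        count++;

    return count;
}
```

Because adjacent delimiters are treated as one separator, empty fields are not counted as tokens.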

Tip:

The system calls in the preceding table follow the Portable Operating System Interface for Computing Environments (POSIX) specification.

For a list of categories of operating-system calls that are generally unsafe in a UDR, see Unsafe Operating-System Calls.

Windows Only

The following actions are valid only in C UDRs that run on Windows and only if they do not interfere with the shared-memory model that the database server uses:

C UDRs can create additional threads or processes.
C UDRs can use shared memory for interprocess communication.

End of Windows Only

Important:

Use of user-defined VPs can result in slightly lower performance because the thread must migrate from the CPU VP to the user-defined VP on which the C UDR executes. Use a user-defined VP only when necessary.

Choosing the User-Defined VP Class

When you run your C UDR in a user-defined VP, you can relax some, but not all, of the CPU VP safe-code requirements (Table 85). You must choose a user-defined VP that is appropriate for the ill-behaved traits of your UDR. Each of the following types of user-defined VP accommodates a different set of ill-behaved traits.

Type of User-Defined VP	Purpose
Yielding user-defined VP	Prevents a UDR from blocking the CPU VP, because blocking calls block only the user-defined VP
Nonyielding user-defined VP	Preserves global state of the VP across one UDR invocation
Single-instance user-defined VP	Preserves global state of the VP across all UDR invocations and instances

Warning:

The user-defined VP class frees the CPU VPs from effects of some ill-behaved traits of a UDR. However, this VP class provides little protection from process failures. Even when the UDR runs in a user-defined VP class, programming errors that cause process failures can severely affect the database server.

The Yielding User-Defined VP

By default, a user-defined virtual processor is a yielding VP. That is, it expects the thread to yield execution whenever the thread waits for other resources. Once a thread yields a user-defined VP, the VP can run other threads that execute UDRs assigned to this VP class. The most common use of a yielding user-defined VP class is for execution of code that cannot be rewritten to use the DataBlade API file-access functions to perform file-system activity.

The following table summarizes the programming requirements for C UDRs that apply to execution in a yielding user-defined VP.

CPU VP Safe-Code Requirement	Required for Yielding User-Defined VP?
Yields the VP on a regular basis	Recommended
Does not use blocking operating-system calls	Not required
Does not allocate local resources, including heap memory	Yes
Does not modify global or static data	Yes
Does not modify other global process-state information	Yes
Does not use restricted operating-system calls	Yes

The main advantages of a yielding user-defined VP class are as follows:

You can use the mi_yield( ) function in your UDR to explicitly yield the user-defined VP.
Failure to use mi_yield( ) in a UDR creates the same loss of concurrency that it would in a CPU VP. However, loss of concurrency is not as critical in user-defined VPs because these VPs do not handle all query processing, as the CPU VPs do. For more information, see Yielding the CPU VP.
You are no longer restricted from use of blocking I/O calls in the UDR.
The C UDR can issue direct file-system calls that block further VP processing until the I/O is complete. Because user-defined VPs are not in the same VP class as CPU VPs, this blocking does not affect concurrency of the CPU VP or threads on other VPs. The most common use of a yielding user-defined VP is to run a UDR in which it is not practical to rewrite file-system activity with the DataBlade API file-access functions. For more information, see Avoiding Blocking I/O Calls.
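
One way to structure resource-intensive processing around explicit yields is sketched below. The callback type yield_fn, the stand-in counting_yield( ), and the chunk size are assumptions made so that the example runs outside the database server. In a real UDR, the callback would be a thin wrapper around mi_yield( ).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; in a UDR the callback would wrap mi_yield(). */
typedef void (*yield_fn)(void *ctx);

/* Stand-in for the real yield: counts how often it is called. */
void counting_yield(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
}

/*
 * Sums n values and invokes the yield callback after every `chunk`
 * elements, so that no single stretch of work holds the VP for long.
 */
long sum_with_yields(const int *values, size_t n, size_t chunk,
                     yield_fn yield, void *ctx)
{
    long total = 0;
    size_t i;

    assert(chunk > 0 && yield != NULL);

    for (i = 0; i < n; i++) {
        total += values[i];
        if ((i + 1) % chunk == 0)
            yield(ctx);            /* give other threads a turn on the VP */
    }
    return total;
}
```

The chunk size would need to be tuned so that the work between two yields stays well within the recommended interval of 1/10 of a second.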

Important:

A yielding user-defined VP relaxes the restriction on use of blocking I/O calls. However, it does not remove the restrictions on other types of unsafe system calls. For more information, see Avoiding Restricted System Calls.

The main disadvantage of a yielding user-defined VP is that it can reduce performance of UDR execution. Execution in the CPU VP maximizes performance of a well-behaved UDR.

For more information, see Defining a Yielding User-Defined VP Class.

The Nonyielding User-Defined VP

A nonyielding user-defined virtual-processor class runs a C UDR in a way that gives the routine exclusive use of the VP. It executes the UDR serially. That is, each UDR runs to completion before the next UDR begins. The C UDR does not yield. The most common use of a nonyielding user-defined VP class is for porting of legacy code that is not designed to handle concurrency issues (non-reentrant code) or that uses global memory.

The following table summarizes the programming rules that apply to execution in a nonyielding user-defined VP.

CPU VP Safe-Code Requirement	Required for Nonyielding User-Defined VP?
Yields the CPU on a regular basis	Not required
Does not use blocking operating-system calls	Not required
Does not allocate local resources, including heap memory	Yes
Does not modify global or static data	Not required (for global changes accessed by a single invocation of the UDR)
Does not modify other global data	Not required (for global changes accessed by a single invocation of the UDR)
Does not use unsafe operating-system calls	Yes

The main advantage of a nonyielding user-defined VP class is that a single invocation of the UDR is guaranteed to run on the same VP. This guarantee creates the following benefits for an ill-behaved routine.

Feature of a Nonyielding User-Defined VP	Benefit to an Ill-Behaved UDR
Provides the same support for blocking I/O as a yielding user-defined VP	A UDR can perform blocking I/O functions. For a list of some sample blocking I/O functions, see Avoiding Blocking I/O Calls.
Can execute a C UDR that was not designed or coded to handle the concurrency issues of multiprocessing	A UDR executes to completion. A nonyielding user-defined VP ignores requests for a yield within DataBlade API functions as well as explicit calls to mi_yield( ).
Allows your UDR to modify global information	A UDR can modify global information (such as global or static variables, or global process information) as long as the changes to this global information are only needed within a single invocation of the UDR. For more information, see Avoiding Modification of Global and Static Variables and Modifying the Global Process State.

However, a nonyielding user-defined VP has the following disadvantages:

It reduces concurrency of the UDR execution.
If you have multiple VPs in the nonyielding VP class, multiple instances of the UDR can run concurrently, one per VP. However, each UDR invocation runs to completion. No migration occurs while one UDR invocation executes (or if the UDR performs blocking I/O).
It does not guarantee that the state remains across multiple instances of the UDR.
Two invocations of the UDR cannot overlap on the same VP. Therefore, the global VP state remains stable while an invocation runs. However, another instance of the UDR might migrate into the VP between invocations and change the global VP state.

Important:

If your UDR needs to make changes to global information that must be available across all instances of the UDR, you must use a single-instance user-defined VP to execute the UDR.

For more information, see Defining a Nonyielding User-Defined VP Class.

The Single-Instance User-Defined VP

A single-instance user-defined VP class is a VP class that has only one VP. Therefore, it runs a C UDR in a way that gives the routine exclusive use of the entire VP class. As with a nonyielding user-defined VP, a single-instance VP executes a C UDR serially. Therefore, the UDR does not need to yield. Because a single-instance VP class has only one VP, the thread that executes the UDR does not migrate to another VP.

Depending on your requirements for yielding, a single-instance user-defined VP can be regular or nonyielding. A regular single-instance user-defined VP can handle the use of malloc( ) and other local memory access. If it is nonyielding, the VP can deal with problems like modification of global variables.

CPU VP Safe-Code Requirement	Required for Single-Instance User-Defined VP?
Yields the CPU on a regular basis	Not required
Does not use blocking operating-system calls	Not required
Does not allocate local resources, including heap memory	Yes
Does not modify global or static data	Not required (for global changes accessed by a single instance of the UDR)
Does not modify other global process-state information	Not required (for global changes accessed by a single instance of the UDR)
Does not use restricted operating-system calls	Required for some calls

The main advantage of a single-instance user-defined VP class is that all instances of the UDR are guaranteed to run on the same VP (that is, on the same system process). Therefore, changes the UDR makes to the global information (global or static variables, or the global process state) are accessible across all instances of the UDR. A UDR might execute many times for a query, once for each row processed. With multiple VPs in a class, you cannot guarantee that all instances of a UDR execute on the same VP. Though execution for the first invocation might be on one VP, the execution for the next invocation might be on some other VP.

The only way to guarantee that all instances execute on one VP is to define a single-instance user-defined VP class. Therefore, a single-instance user-defined VP class is useful for a UDR that shares special information across multiple instances. Examples might be a special iterator function or a user-defined aggregate.

Tip:

The DataBlade API supports the mi_udr_lock( ) function to explicitly lock a UDR to a VP. For more information, see Locking a Routine Instance to a VP.

For example, suppose you have a UDR that contains the following code fragment:

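{
   static mi_integer stat_var;    /* static data: shared by all threads on the VP */
   static mi_integer file_desc;   /* static descriptor: another thread can overwrite it */
   mi_integer num_bytes_read;
   ...
   file_desc = mi_file_open(....);
   num_bytes_read = mi_file_read(file_desc ....);   /* thread might yield here */
   ...
}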

If this UDR ran on a yielding user-defined VP, the thread might yield at the mi_file_read( ) call. Another thread might then execute this same code and change the value of file_desc. When the original thread resumed, it would no longer be reading from the file that it had opened. If you instead assign this UDR to a nonyielding user-defined VP, the thread never yields, so other threads cannot change the value of file_desc while the UDR executes.
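
When the state is needed only within one invocation and the code can be rewritten, another option is to keep it out of static storage altogether, in line with the rule that a process-safe UDR encapsulates its state in function arguments and local scope. The following is a minimal sketch of that idea. The type read_state and both function names are hypothetical, and plain int and long stand in for the DataBlade API types so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-invocation state; replaces the static variables. */
typedef struct {
    int  file_desc;     /* descriptor owned by this invocation only */
    long bytes_read;    /* running total for this invocation */
} read_state;

/* Initializes the state for one invocation. */
void read_state_init(read_state *st, int file_desc)
{
    assert(st != NULL);
    st->file_desc = file_desc;
    st->bytes_read = 0;
}

/* Records one completed read and returns the descriptor it belongs to. */
int read_state_record(read_state *st, long nbytes)
{
    assert(st != NULL && nbytes >= 0);
    st->bytes_read += nbytes;
    return st->file_desc;
}
```

Each invocation declares its own read_state as a local variable and passes its address to the helpers, so interleaved invocations cannot overwrite one another's descriptor.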

The main disadvantage of a single-instance user-defined VP is that it removes concurrency of UDR execution. This loss of concurrency brings the following restrictions:

A single-instance user-defined VP is probably not a scalable solution.
All instances of the UDR that execute on a single-instance VP must compete for the same VP. You cannot increase the number of VPs in the single-instance class to improve performance.
A single-instance user-defined VP does not support execution of parallel UDRs.

Important:

If your UDR needs to make changes to global information that is available across only a single invocation of the UDR, use a nonyielding user-defined VP to execute the UDR. For more information, see The Nonyielding User-Defined VP.

You must weigh these advantages and disadvantages carefully when choosing whether to use a single-instance user-defined VP class to execute your ill-behaved UDR. For more information, see Defining a Single-Instance User-Defined VP Class.

Defining a User-Defined VP

You define a new virtual-processor class in the ONCONFIG file with the VPCLASS configuration parameter. The num option specifies the number of virtual processors in a user-defined VP class that the database server starts during its initialization. The class name is not case sensitive, but it must have fewer than 128 characters. If your DataBlade uses a prefix, such as USR, begin the names of any user-defined VPs with this prefix.

Dynamic Server supports the following types of user-defined VP classes for execution of an ill-behaved C UDR.

Type of User-Defined VP Class	VPCLASS Option
Yielding user-defined VP	None (default type of user-defined VP class)
Nonyielding user-defined VP	noyield
Single-instance user-defined VP (yielding or nonyielding)	num=1

Important:

When you edit the ONCONFIG file to create a new virtual-processor class, you must add a VPCLASS parameter and remove the SINGLE_CPU_VP parameter. For more information on the ONCONFIG file, see the IBM Informix: Administrator's Reference.

After you add or modify the VPCLASS configuration parameter, restart the database server with the oninit utility (or its equivalent). For more information about how to restart the database server, see your IBM Informix: Administrator's Guide. You can add or drop user-defined virtual processors while the database server is online. For more information, see Adding and Dropping VPs.

When you use a class of user-defined virtual processors to run a C UDR, you must ensure that the name of the VP is the same in both of the following locations:

In the VPCLASS parameter in the ONCONFIG file, which defines the VP class
In the CLASS routine modifier of the CREATE FUNCTION or CREATE PROCEDURE statement, which registers the C UDR in the database

For more information, see Assigning a C UDR to a User-Defined VP Class.

Defining a Yielding User-Defined VP Class

The VPCLASS configuration parameter creates a yielding user-defined VP by default. You can also use the num option to specify the number of VPs in the yielding user-defined VP class.

Figure 75 defines a yielding user-defined VP class named newvp with three virtual processors.

Figure 75. Defining a Yielding User-Defined VP Class

VPCLASS newvp,num=3               # Yielding VP class with 3 instances

The C user-defined function, GreaterThanEqual( ), in Figure 78, executes in the newvp VP class.

Defining a Nonyielding User-Defined VP Class

To create a nonyielding user-defined VP, include the noyield option of the VPCLASS configuration parameter. You can also use the num option to specify the number of VPs in the nonyielding user-defined VP class.

Tip:

The noyield option is ignored for predefined virtual-processor classes such as CPU and AIO. For more information on the VPCLASS configuration parameter, see the IBM Informix: Administrator's Reference.

Figure 76 defines the nonyielding user-defined VP class named nonyield_vp with two VPs in the class.

Figure 76. Defining a Nonyielding User-Defined VP Class

VPCLASS nonyield_vp,num=2,noyield               # Nonyielding VP class

At runtime you can determine whether the VP on which a UDR is running is part of a nonyielding user-defined VP class with the mi_vpinfo_isnoyield( ) function. For more information, see Obtaining VP-Environment Information.

Defining a Single-Instance User-Defined VP Class

To define a single-instance user-defined VP, specify a value of one (1) for the num option of the VPCLASS configuration parameter. Figure 77 creates a yielding single-instance user-defined VP class, single_vp.

Figure 77. Defining a Single-Instance User-Defined VP Class

VPCLASS single_vp,num=1                # Single-instance VP class

At runtime you can determine whether the VP on which a UDR is running is part of a single-instance user-defined VP class with the mi_vpinfo_vpid( ) and mi_class_numvp( ) functions. For more information, see Obtaining VP-Environment Information.

Assigning a C UDR to a User-Defined VP Class

When you register an ill-behaved C UDR, you assign it to a class of user-defined virtual processors with the CLASS routine modifier of the CREATE FUNCTION or CREATE PROCEDURE statement.

Tip:

By default, all C UDRs execute in any VP. To have your C UDR run only in the CPU VP, you can specify the string "cpu vp" with the CLASS modifier. If your C UDR can run anywhere, you should omit the CLASS modifier.

For example, Figure 78 shows a CREATE FUNCTION statement that registers the C user-defined function, GreaterThanEqual( ), and specifies that the user-defined VP class named newvp executes this function.

Figure 78. Specifying a User-Defined VP Class for a C UDR

CREATE FUNCTION GreaterThanEqual(ScottishName, ScottishName)
   RETURNS BOOLEAN 
   WITH (CLASS = 'newvp')
   EXTERNAL NAME '/usr/lib/objects/udrs.so(grtrthan_equal)'
   LANGUAGE C;

Figure 75 shows the definition of the newvp user-defined VP class. All UDRs that specify the newvp VP class with the CLASS routine modifier share the three VPs in the newvp VP class.

When you register user-defined functions or user-defined procedures with the CREATE FUNCTION or CREATE PROCEDURE statement, you can reference any user-defined VP class that you like. The CREATE FUNCTION and CREATE PROCEDURE statements do not verify that the VP class you specify exists when they register the UDR.

Important:

When you try to run a UDR that was registered to execute in a user-defined VP class, that VP class must exist and it must have virtual processors assigned to it. If the class does not have any virtual processors, you receive an SQL error. For information on how to define a user-defined VP, see Defining a User-Defined VP.

For more information on the syntax of CREATE FUNCTION or CREATE PROCEDURE to assign a C UDR to a VP class, see the description of the CLASS routine modifier in the Routine Modifier segment of the IBM Informix: Guide to SQL Syntax.