The basic rule is: shared memory (SHM) connections to CPU VPs, network connections to NET VPs.
If a poll thread runs on a CPU VP then there are NO busy waits on the
VP (a busy wait is when the process spins attempting to avoid a
semop). Instead of spinning, the CPU VP runs the poll
thread. So rather than spinning and needlessly wasting time, the CPU VP
polls waiting on I/O, and only if it still finds no work to do does
it semop. When the NET VP notifies the CPU VP that a request has
arrived, the CPU VP forces a context switch. If both are running on
the same CPU VP this is just a thread switch.
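The idle path described above can be sketched roughly as follows. This is a minimal Python sketch, not IDS internals: the queue, the semaphore (standing in for the semop), and the function name are all illustrative assumptions.

```python
import queue
import threading

def cpu_vp_idle(ready_q, wakeup_sem, poll_attempts=3):
    """Sketch of the CPU VP idle path: poll for work a few times
    instead of spinning, then block (the semop) only as a last resort."""
    for _ in range(poll_attempts):
        try:
            return ready_q.get_nowait()   # found work while polling
        except queue.Empty:
            pass                          # nothing yet; poll again
    wakeup_sem.acquire()                  # no work found: block, like semop()
    return ready_q.get_nowait()           # a NET VP queued work and woke us
```

Here the NET VP side would enqueue a request and then release the semaphore; if the wakeup arrives while the CPU VP is still polling, no blocking call is made at all.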
By default the poll thread runs every 10 thread switches. So if you are doing 1,000 thread switches a second then you will be doing 100 poll calls a second.
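As a trivial worked example of that arithmetic:

```python
thread_switches_per_sec = 1000
switches_per_poll = 10   # the default interval described above
polls_per_sec = thread_switches_per_sec // switches_per_poll
print(polls_per_sec)     # 100
```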
The following paragraphs are a copy of an article posted by Art Kagel to the comp.databases.informix newsgroup.
There have been several posts lately that question how best to allocate shared memory and TCP listeners, and the docs do not help, as the authors took a narrow view that just does not work in the real world. Here's why I recommend shared memory in EVERY CPU VP and TCP in a few NET VPs, as posted to the IDS Forum this AM:
I've researched and tested this one to death long ago and it still holds. To explain: we originally went with the recommendations in the admin guide and had only one CPU listener for shared memory connections and a couple for TCP connections. Since most connections here are shared memory (all user accesses are through middleware servers running local to the engine), response was OK most of the time, though we did notice that CPU VP #1 was using several times the CPU time of the other 23 CPU VPs, and during some peak periods simple requests seemed to hang.
However, the SAs complained that once the Informix engine was started up, the system call overhead on the machines went from about 2,000/sec on similarly loaded machines not running IDS to 20-25,000/sec on the IDS server machines. This was on DG/UX on early Motorola 88000 based Aviion platforms, which could not easily handle system call overhead beyond about 20,000/sec without dogging out, so we had a problem. My normal approach to problem solving is to figure out how the problem could be caused and prove or disprove the various proposals.
I spoke to senior Informix techs about how and when the engine might be using excessive system calls, and eventually the discussion got around to polling/listening. It turns out, quite naturally, that if the TCP listener threads are running in NET VPs which have nothing else to do, they set up a select() system call covering all of the connection listening ports and all of the ports assigned to existing connections, and block on that call until a request for a new connection or for service of an existing one comes in. One system call, all is cool. On the other hand, if the thread is running in the CPU VP it cannot block, because the CPU VP has other work to do; remember, IDS threading is done in the IDS library, not the OS scheduler, so a blocking system call would block the entire VP, not just the one thread. So instead of sitting on the select() call, the thread polls all active ports in a short timer loop when it is not otherwise busy and at certain break points during query processing. The polling involves a non-blocking read() system call to each active port. MASSIVE system call overhead results. Plus, if the CPU VP is very busy it will be polling relatively seldom from a new request's point of view, causing response delays when the load goes up.
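The two listening strategies can be illustrated with a small sketch. Python stands in for the C system calls and a pipe stands in for a TCP port; the function names are illustrative, not IDS source.

```python
import os
import select

def net_vp_listen(fds, timeout=None):
    """NET VP style: a single blocking select() covers every port at once."""
    ready, _, _ = select.select(fds, [], [], timeout)
    return ready

def cpu_vp_poll(fds):
    """CPU VP style: one non-blocking read() probe per port per pass,
    which is where the massive system call overhead comes from."""
    arrived = []
    for fd in fds:
        os.set_blocking(fd, False)
        try:
            data = os.read(fd, 1)         # non-blocking probe
            if data:
                arrived.append((fd, data))
        except BlockingIOError:
            pass                          # nothing pending on this port
    return arrived
```

The NET VP pays one system call no matter how many ports it watches; the CPU VP pays at least one per port on every pass, busy or not.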
Now conversely, for ipcshm connections there is no blocking call possible. There is only reading the shared memory locations for each allocated connection (active or not) in a loop. Each read requires a system call to latch the location and another to reset the mutex. If the poll thread is assigned to a NET VP which has nothing else to do, the poll loop will run almost continuously until the VP's timeslice expires. Again, massive system call overhead and a CPU that is burning cycles for nothing. (However, as the admin guide suggests, shared memory responsiveness is great!) Now place that ipcshm poll thread into a CPU VP, and the poll loop can only run when the VP is quiet or at those designated break points, and then only for a few iterations. Which is why my shared memory responsiveness was down at peak. It also explained why the first CPU VP was getting to do most of the work. Remember that the point in time when the CPU VP has the most time to poll is when it is not processing requests or is waiting on I/O. So when it picks up a new request, it is likely to assign the new user thread to itself instead of putting it into the queue for another CPU VP to pick up. This causes the one listening VP to become even busier and less responsive.
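One pass of the ipcshm poll loop described above amounts to something like this. In this Python sketch a Lock stands in for the latch and a dict for a shared memory connection slot; all names are illustrative assumptions, not IDS internals.

```python
import threading

def shm_poll_pass(slots):
    """One pass of the ipcshm poll loop: latch, inspect, and unlatch
    every allocated slot, active or not. That is two mutex operations
    per slot per pass, whether or not any request is waiting."""
    requests = []
    for slot in slots:
        with slot["latch"]:               # latch ... unlatch: 2 ops per slot
            if slot["request"] is not None:
                requests.append(slot["request"])
                slot["request"] = None    # consume the request
    return requests
```

Run unthrottled in an otherwise idle NET VP, a loop over this pass burns those two operations per slot continuously; run at a CPU VP's natural break points, the same pass happens only a few times between real work.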
So based on this research and thought-experiment I began to test the obvious and less obvious solutions. The best course was to run shared memory poll threads in EVERY CPU VP, to spread the load and improve responsiveness, and to run TCP listeners in NET VPs so they could block when possible. The results? Our CPU VPs now run a smooth curve, with the first third taking on about 50% of the work, the second third taking about 30%, and the remaining 20% falling to the remaining CPU VPs; shared memory response is instantaneous under all load conditions, since some CPU VP is always polling or about to poll. Meanwhile, system call overhead on the server host was cut to under 10,000/sec. As a bonus, CPU usage on the CPU to which the first CPU VP was affined dropped from 80% to 40%.
After that I began to advocate these rules of thumb for connections:
TCP connections ONLY in NET VPs, and as few as needed to maintain responsiveness. Shared memory connections ONLY in CPU VPs, and running in EVERY CPU VP, always.
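In onconfig terms, these rules of thumb come out as NETTYPE entries along the following lines. The counts and connections-per-thread values here are illustrative, assuming an instance with eight CPU VPs; the field order is connection type, number of poll threads, connections per thread, VP class.

```
# one shared memory poll thread in every CPU VP (here: 8 CPU VPs)
NETTYPE ipcshm,8,50,CPU
# a few TCP listeners, kept in NET VPs so they can block on select()
NETTYPE soctcp,2,100,NET
```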