Enterprise Replication Event Alarms

Starting with Version 10.0 of Informix Dynamic Server, you can use event alarms specific to Enterprise Replication to automate many administrative tasks. You can set your ALARMPROGRAM script to capture Enterprise Replication Class IDs and messages and initiate corrective action or notification for each event. For example, you can add a new chunk to the queue data sbspace or dbspace if you detect (using Class ID 31) that the storage space is full.

For information on setting ALARMPROGRAM scripts to capture events, see the appendix on event alarms in the IBM Informix Administrator's Reference.

If you were already using an ALARMPROGRAM script prior to Version 10.0 to manage Enterprise Replication administrative work, you need to modify the script to detect and take action on the Enterprise Replication events documented in this section.
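
As a rough illustration of the kind of dispatch an ALARMPROGRAM script might perform, the following Python sketch maps an incoming alarm to an action. It assumes the database server passes the event severity, class ID, class message, and specific message as the first four positional arguments (confirm the argument list in the Administrator's Reference for your version). The function names and action labels are hypothetical, not part of the product.

```python
#!/usr/bin/env python3
"""Sketch of an ALARMPROGRAM handler for Enterprise Replication alarms."""
import sys

# Enterprise Replication class IDs documented in this section (30 through 39).
ER_CLASS_IDS = range(30, 40)


def parse_alarm(argv):
    """Map positional arguments to named fields (argument order is an assumption)."""
    if len(argv) < 4:
        raise ValueError("expected severity, class ID, class message, specific message")
    return {
        "severity": argv[0],        # kept as passed; not interpreted here
        "class_id": int(argv[1]),
        "class_msg": argv[2],
        "specific_msg": argv[3],
    }


def choose_action(alarm):
    """Return a hypothetical action label for the alarm."""
    class_id = alarm["class_id"]
    if class_id not in ER_CLASS_IDS:
        return "ignore"             # not an Enterprise Replication alarm
    if class_id == 31:
        return "add-chunk"          # stable storage is full
    if class_id == 39:
        return "contact-support"    # internal error
    return "notify-dba"


if __name__ == "__main__" and len(sys.argv) > 1:
    print(choose_action(parse_alarm(sys.argv[1:])))
```

An existing script could call something like `choose_action` first and fall through to its pre-10.0 logic whenever the result is `ignore`.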

The following table lists the Class IDs and Class Messages for the alarms that are raised by Enterprise Replication.

Class ID Class Message
30 DDR subsystem [failure | notification]
31 ER stable storage [queue sbspace | queue dbspace | pager sbspace] is full
32 ER: error detected in grouper sub component
33 ER: error detected in data sync sub component
34 ER: error detected in queue management sub component
35 ER: error detected in global catalog sub component
36 ER: network interface sub component notification
37 ER: error detected while recovering Enterprise Replication
38 ER: resource allocation problem detected
39 Please contact IBM Informix Technical Support
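
A script that logs or forwards alarms may find it convenient to hold this table as a lookup. The sketch below copies the class messages from the table above into a Python dictionary; the names `ER_ALARM_CLASSES` and `describe_class` are hypothetical.

```python
# Class IDs and class messages as listed in this section.
ER_ALARM_CLASSES = {
    30: "DDR subsystem [failure | notification]",
    31: "ER stable storage [queue sbspace | queue dbspace | pager sbspace] is full",
    32: "ER: error detected in grouper sub component",
    33: "ER: error detected in data sync sub component",
    34: "ER: error detected in queue management sub component",
    35: "ER: error detected in global catalog sub component",
    36: "ER: network interface sub component notification",
    37: "ER: error detected while recovering Enterprise Replication",
    38: "ER: resource allocation problem detected",
    39: "Please contact IBM Informix Technical Support",
}


def describe_class(class_id):
    """Return the class message for an ER class ID, or None for any other alarm."""
    return ER_ALARM_CLASSES.get(class_id)
```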

The following tables show for each Class ID the error strings that can be returned, their severity, and the situations that trigger them. In the Situation column, snoopy refers to ddr_snoopy, an internal component of Enterprise Replication that reads the log buffers and passes information to the grouper.

Table 19. Events for Class ID 30
Error String | Severity | Situation
Log corruption detected or read error occurred while snooping logs. | ALRM_EMERGENCY | Snoopy receives a bad buffer during a log read.
WARNING: The replay position was overrun, data may not be replicated. | ALRM_EMERGENCY | Snoopy detects that the replay position has been overwritten (see Preventing DDRBLOCK Mode).
CDR: Unexpected log record type record_type for subsystem subsystem passed to DDR. | ALRM_EMERGENCY | A log record of unexpected type was passed to snoopy.
DDR Log Snooping - Catchup phase started, userthreads blocked | ALRM_ATTENTION | Snoopy sets DDRBLOCK (see Preventing DDRBLOCK Mode).
DDR Log Snooping - Catchup phase completed, userthreads unblocked | ALRM_ATTENTION | Snoopy unsets DDRBLOCK (see Preventing DDRBLOCK Mode).
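
The two catchup messages for Class ID 30 arrive as a pair, so a monitoring script can use them to decide whether user threads are currently blocked. The following Python sketch replays a sequence of specific messages and reports the resulting state. The function name is hypothetical, and it assumes the messages are supplied in the order in which they were raised.

```python
CATCHUP_STARTED = "DDR Log Snooping - Catchup phase started, userthreads blocked"
CATCHUP_COMPLETED = "DDR Log Snooping - Catchup phase completed, userthreads unblocked"


def ddrblock_state(messages, blocked=False):
    """Replay Class ID 30 messages in order; return True if DDRBLOCK is still set."""
    for msg in messages:
        if CATCHUP_STARTED in msg:
            blocked = True       # snoopy set DDRBLOCK
        elif CATCHUP_COMPLETED in msg:
            blocked = False      # snoopy unset DDRBLOCK
    return blocked
```

A started message without a later completed message suggests that the server is still in DDRBLOCK mode and might warrant an escalated notification.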
Table 20. Events for Class ID 31
Error String | Severity | Situation
CDR QUEUER: Send Queue space is FULL - waiting for space in sbspace_name. | ALRM_EMERGENCY | An RQM queue runs out of room to spool (see Recovering when Storage Spaces Fill).
CDR Pager: Paging File full: Waiting for additional space in sbspace_name | ALRM_EMERGENCY | Grouper paging sbspace has run out of space (see Increasing the Sizes of Storage Spaces).
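
Both Class ID 31 error strings end with the name of the space that is full, which is the piece of information a script needs before it can add a chunk. The Python sketch below pulls that name out of the specific message. The regular expressions are derived only from the two strings in the table above, and the function name and the space names in the usage note are hypothetical.

```python
import re

# Patterns built from the two Class ID 31 error strings in this section.
_SPACE_PATTERNS = [
    re.compile(r"Send Queue space is FULL - waiting for space in (\S+?)\.?$"),
    re.compile(r"Paging File full: Waiting for additional space in (\S+?)\.?$"),
]


def full_space_name(specific_msg):
    """Return the name of the full storage space, or None if no pattern matches."""
    text = specific_msg.strip()
    for pattern in _SPACE_PATTERNS:
        match = pattern.search(text)
        if match:
            return match.group(1)   # space name without the trailing period
    return None
```

For a message ending in `waiting for space in er_sbspace.` the function returns `er_sbspace`, which a script could then pass to whatever procedure the site uses to add a chunk.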
Table 21. Events for Class ID 32
Error String | Severity | Situation
CDR Grouper Fanout/Evaluator thread is aborting. | ALRM_EMERGENCY | Grouper fanout or evaluator is aborting.
CDR: Could not copy transaction at log id log_unique_id position log_position. Skipped. | ALRM_EMERGENCY | Grouper is unable to copy the transaction into the send queue.
CDR: Paging error detected. | ALRM_EMERGENCY | Grouper detected a paging error.
CDR Grouper: Local participant (%s) stopped for the replicate %s (or exclusive replicate set), table (%s:%s.%s). Data may be out of sync. If replicated column definition was modified then please perform the alter operation at all the replicate participants, remaster the replicate definition then restart the replicate (or exclusive replicate set) definition for the local participant with the data sync option (-S). | ALRM_EMERGENCY | If the grouper sub-component is not able to convert the replicated row data from the local dictionary format to the master dictionary format, the grouper stops the local participant from the corresponding replicate (or exclusive replicate set) definition and invokes the alarm event handler.
CDR CDR_subcomponent_name: Could not apply undo properly. SKIPPING TRANSACTION. TX Begin Time: datetime TX Restart Log Id: log_id TX Restart Log Position: log_position TX Commit Time: datetime TX End Log Id: log_id TX End Log Position: log_position | ALRM_ATTENTION | Grouper was unable to apply an undo (rollback to savepoint) to a transaction.
Table 22. Events for Class ID 33
Error String | Severity | Situation
CDR DS thread_name thread is aborting. | ALRM_EMERGENCY | Data sync is aborting.
Received aborted transaction, no data to spool. | ALRM_INFO | Data sync received a transaction that was aborted in the first buffer, so there is nothing to spool to ATS/RIS.
Table 23. Events for Class ID 34
Error String | Severity | Situation
CDR CDR_subcomponent_name: bad replicate ID replicate_id | ALRM_ATTENTION | RQM cannot find the replicate in the global catalog for which it has a transaction.
Table 24. Events for Class ID 35
Error String | Severity | Situation
CDR: Could not drop delete table. SQL code sql_error_code, ISAM code isam_error_code. Table 'database_name:table_name'. Please drop the table manually. | ALRM_ATTENTION | Could not drop the delete table while deleting the replicate from the local participant.
CDR GC peer request failed: command: command_string, error error_code, CDR server CDR_server_ID | ALRM_ATTENTION | Execution of the control command requested by the peer server failed at the local server.
CDR GC peer processing failed: command: command_string, error error_code, CDR server CDR_server_ID | ALRM_ATTENTION | Control command execution at the peer server failed.
Table 25. Events for Class ID 36
Error String | Severity | Situation
CDR NIF connection terminated to servergroupname; connection request received from an unknown server. | ALRM_ATTENTION | Enterprise Replication received a re-connect connection request from an unknown server.
Table 26. Events for Class ID 37
Error String | Severity | Situation
CDR CDR_subcomponent_name: bad replicate ID replicate_id | ALRM_ATTENTION |
Table 27. Events for Class ID 38
Error String | Severity | Situation
CDR CDR_subcomponent_name memory allocation failed (reason). | ALRM_INFO | The specified Enterprise Replication component could not allocate memory.
Table 28. Events for Class ID 39
Error String | Severity | Situation
(blank) | ALRM_EMERGENCY | An internal error has occurred that requires assistance from Technical Support.