Oninit® Log Ripper — Configuration Reference
Every YAML key the Ripper accepts, grouped by section. Defaults match
what config.yml.example ships. Any key not listed in the
tables below is silently ignored.
Precedence
For settings that exist in more than one place, the Ripper resolves
them in this order (highest wins):
| Source | Notes |
| CLI flag |
-l, -w, -i, -R, and the
-v / -T / -S / -C / -D / -X debug flags. Each
overrides the matching YAML key for the run. |
| YAML file (-c) |
The configuration file is required. Anything set there
beats the built-in defaults. |
| Environment |
INFORMIXSERVER is the only env var the Ripper
reads directly — used as a fallback for
source.server when the YAML omits it. ESQL/C also
consults INFORMIXDIR, INFORMIXSQLHOSTS,
DBPATH, DB_LOCALE, CLIENT_LOCALE
via the CSDK. |
| Built-in default |
The values shown in the Default column of each
table below. |
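The resolution order above can be sketched as a small lookup. This is a minimal illustration, not the Ripper's actual code: the flat dotted key names, the `resolve` helper, and the `ENV_FALLBACKS` / `DEFAULTS` dictionaries are all hypothetical, and only two defaults are filled in.

```python
# Hypothetical flat view of the settings; dotted names mirror section.key.
ENV_FALLBACKS = {"source.server": "INFORMIXSERVER"}  # the only env var read directly
DEFAULTS = {"threading.max_threads": 4, "logging.status_interval": 0}


def resolve(key, cli, yaml_cfg, environ):
    """Return (value, source) using CLI > YAML > environment > built-in default."""
    if key in cli:
        return cli[key], "cli"
    if key in yaml_cfg:
        return yaml_cfg[key], "yaml"
    env_name = ENV_FALLBACKS.get(key)
    if env_name and environ.get(env_name):
        return environ[env_name], "env"
    return DEFAULTS.get(key), "default"


if __name__ == "__main__":
    # -w 8 on the command line beats max_threads: 2 in the YAML file.
    print(resolve("threading.max_threads",
                  {"threading.max_threads": 8},
                  {"threading.max_threads": 2},
                  {}))
```

Returning the source alongside the value makes it easy to log where each effective setting came from.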
source
| Key | Type | Default | Effect |
| database | string | (required) |
Source Informix database to capture from. |
| server | string | (required) |
Source INFORMIXSERVER name. Can also be set
via the env var. |
| user | string | (empty) |
If set, used as CONNECT TO … USER …
USING …. If empty, falls back to OS auth. |
| password | string | (empty) |
Paired with user. |
| start_lsn | integer | 0 |
0 = start at the current end of the logical log. A
non-zero value calls cdc_activatesess at that
position. The CLI -l <lsn> overrides. |
| schema_archive_dir | path | (empty) |
Directory holding archived
<db>_<tab>.sql DDL files used for
drift recovery and the startup
[ARCHIVE-CMP] safety net. Time-versioned
variants are recognized via the suffix form
<db>_<tab>.<YYYYMMDDTHHMMSSZ>.sql
(UTC, ISO 8601); the un-suffixed file means
"current / effective at +∞". The Ripper picks the
file with the largest timestamp ≤ the in-flight
transaction's BEGINTX time at acquire-time. |
| control_dir | path | (empty) |
Directory polled by workers for
release_<tab> /
acquire_<tab> sentinel files. Empty
disables live release. |
| tables | list | all user tables |
Glob patterns or extended map entries (see
Per-table extended entry below).
!pattern excludes. |
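The time-versioned lookup described for schema_archive_dir can be sketched as follows. This is an illustration under stated assumptions: `pick_archive` is a hypothetical helper, and falling back to the un-suffixed file when no timestamped variant qualifies is a guess at the intended behavior, not something the reference confirms.

```python
import re
from datetime import datetime, timezone

# Matches <db>_<tab>.<YYYYMMDDTHHMMSSZ>.sql
_STAMPED = re.compile(r"^(?P<stem>.+)\.(?P<ts>\d{8}T\d{6}Z)\.sql$")


def pick_archive(filenames, db, tab, begin_tx):
    """Pick the archive with the largest timestamp <= begin_tx (aware UTC datetime)."""
    stem = f"{db}_{tab}"
    best = None          # (timestamp, filename) of the best suffixed candidate
    unsuffixed = None
    for name in filenames:
        if name == f"{stem}.sql":
            unsuffixed = name
            continue
        m = _STAMPED.match(name)
        if not m or m.group("stem") != stem:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y%m%dT%H%M%SZ")
        ts = ts.replace(tzinfo=timezone.utc)
        if ts <= begin_tx and (best is None or ts > best[0]):
            best = (ts, name)
    # ASSUMPTION: use the un-suffixed file when no timestamped variant qualifies.
    return best[1] if best else unsuffixed


if __name__ == "__main__":
    files = ["stores_customer.sql",
             "stores_customer.20240101T000000Z.sql",
             "stores_customer.20240601T120000Z.sql"]
    print(pick_archive(files, "stores", "customer",
                       datetime(2024, 3, 15, tzinfo=timezone.utc)))
```

The database and table names in the usage lines are made up for the example.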
Per-table extended entry
| Key | Type | Default | Effect |
| name | string | (required) |
Table name or glob. |
| columns | list | all columns |
Per-column include / exclude patterns. An all-exclude
list (["!body"]) defaults to "include everything
else"; any include entry switches to "include only
listed". |
| skip_unsupported_columns | bool | false |
When the column DESCRIBE finds a type CDC rejects
(TEXT / BYTE / BLOB / CLOB / SET / MULTISET / LIST / ROW /
UDT), false aborts the whole table; true
silently marks those columns included = 0 and
captures the rest. |
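The include / exclude rule for columns can be sketched like this. It assumes shell-style globs (Python's `fnmatch`) and assumes that in a mixed list excludes are applied after includes; neither detail is spelled out above, and `select_columns` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def select_columns(all_columns, patterns):
    """Apply per-column patterns; a leading '!' marks an exclude pattern."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    selected = []
    for col in all_columns:
        # Any include entry switches the list to "include only listed".
        if includes and not any(fnmatchcase(col, p) for p in includes):
            continue
        # ASSUMPTION: excludes are applied after includes.
        if any(fnmatchcase(col, p) for p in excludes):
            continue
        selected.append(col)
    return selected


if __name__ == "__main__":
    cols = ["id", "title", "body"]
    print(select_columns(cols, ["!body"]))       # all-exclude list
    print(select_columns(cols, ["id", "title"]))  # include-only list
```

An empty pattern list returns every column, matching the "all columns" default.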
target
| Key | Type | Default | Effect |
| mode | string | file |
One of file, json, csv,
kafka, informix, odbc. |
| file.directory | path | "." |
Where transaction SQL files are written. |
| json.directory | path | "." |
Where transaction JSON files are written. RFC 8259
compliant: UTF-8 only, every byte
≥ 0x80 escaped as \u00XX so the document
parses regardless of source DB locale; LSNs emitted as hex
strings (not numbers) to dodge IEEE 754 precision loss
above 2^53. |
| csv.directory | path | "." |
Where transaction CSV files are written. RFC 4180
compliant; long-form (one row per CDC record, every row
carries the full transaction context: txid, worker_id,
user, started, committed, begin_lsn, commit_lsn, lsn, op,
owner, table, status, sql). Bytes pass through verbatim —
set the consumer's decoder to match the source DB locale. |
| csv.delimiter | char | "," |
Single-character field delimiter. Anything containing
the delimiter, double quote, CR, or LF is wrapped in
double quotes per RFC 4180; internal double quotes are
doubled. |
| csv.include_header | bool | true |
Write the column-name row as the first line of every
per-transaction file. |
| kafka.brokers | string | (required) |
Comma-separated list of host:port bootstrap
servers for the producer. |
| kafka.topic | string | (required) |
Single destination topic. Per-record messages are
keyed by "<owner>.<table>" so all
changes for one table go to the same partition (preserves
per-table ordering). |
| kafka.acks | string | "all" |
Producer ack mode. all = wait for all in-sync
replicas (safest, default); 1 = leader only;
0 = fire-and-forget. |
| kafka.compression | string | "none" |
One of none, gzip, snappy,
lz4, zstd. |
| kafka.client_id | string | "oni_ripper" |
Producer client.id; identifies the Ripper in
broker logs and metrics. |
| kafka.flush_timeout_ms | integer | 30000 |
Per-transaction rd_kafka_flush() wait. The
worker's last_committed_lsn only advances after
the broker acks all of the transaction's messages. |
| kafka.security_protocol | string | "plaintext" |
One of plaintext, ssl,
sasl_plaintext, sasl_ssl. |
| kafka.sasl_mechanism | string | (empty) |
PLAIN, SCRAM-SHA-256, or
SCRAM-SHA-512. Required when
security_protocol is sasl_*. |
| kafka.sasl_username,
kafka.sasl_password | string | (empty) |
SASL credentials. |
| informix.database | string | — |
Target Informix DB. Cross-server allowed:
db@server. |
| informix.server | string | (empty) |
If set, alternative to the db@server form
above. |
| informix.connection | string | targetconn |
Named ESQL/C connection identifier. |
| informix.user, informix.password |
string | (empty) |
OS auth if both empty. |
| odbc.dsn | string | — |
DSN name from /etc/odbc.ini. |
| odbc.user, odbc.password |
string | — |
Passed to SQLConnect. |
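Two of the JSON rules above, byte escaping and hex-string LSNs, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the `0x` prefix and lowercase hex digits are assumptions about the output format.

```python
import json


def json_string_from_bytes(raw: bytes) -> str:
    """Encode raw column bytes as an ASCII-only JSON string literal."""
    parts = ['"']
    for b in raw:
        if b in (0x22, 0x5C):            # double quote and backslash
            parts.append("\\" + chr(b))
        elif b < 0x20 or b >= 0x80:      # control bytes and every byte >= 0x80
            parts.append(f"\\u{b:04x}")
        else:
            parts.append(chr(b))
    parts.append('"')
    return "".join(parts)


def lsn_to_json(lsn: int) -> str:
    """Emit an LSN as a hex string so no JSON parser rounds it to a double."""
    return json.dumps(f"0x{lsn:x}")      # ASSUMPTION: 0x prefix, lowercase digits


if __name__ == "__main__":
    print(json_string_from_bytes(b"caf\xe9"))
    # Above 2^53 adjacent integers collapse to the same double.
    print(float(2**53 + 1) == float(2**53))
    print(lsn_to_json(2**53 + 1))
```

Because every non-ASCII byte is escaped, the output parses the same way whatever single-byte locale the source database uses; the consumer still has to map the resulting code points back to that locale.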
threading
| Key | Type | Default | Effect |
| sessions_per_thread | integer | 10 |
Tables per worker thread. Total workers needed =
ceil(ntables / sessions_per_thread),
capped at max_threads. |
| max_threads | integer | 4 |
Hard cap on worker threads. Tables that don't fit within
max_threads × sessions_per_thread are dropped from capture; raise
this if a startup log line warns about table coverage. |
| cdc_timeout | integer (seconds) | 5 |
cdc_opensess timeout argument. 0 means
block-forever; finite values let the worker poll the
running flag so SIGTERM is honored. |
| max_recs_per_read | integer | 1 |
Upper bound on records the server packs into a single
ifx_lo_read response. Higher amortizes
per-read overhead; lower delivers
CDC_REC_TIMEOUT sooner. |
| cleanup_orphans | bool | true |
If true, the main thread sweeps syscdcsess
at startup and closes any leftover sessions before workers
spin up. |
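The worker-count formula and the coverage limit from sessions_per_thread and max_threads can be checked with a few lines. `plan_workers` is a hypothetical helper, and the "uncovered" count is my reading of "tables that don't fit".

```python
import math


def plan_workers(ntables, sessions_per_thread=10, max_threads=4):
    """Return (worker_count, tables_left_without_a_worker_slot)."""
    needed = math.ceil(ntables / sessions_per_thread)
    workers = min(needed, max_threads)
    uncovered = max(0, ntables - workers * sessions_per_thread)
    return workers, uncovered


if __name__ == "__main__":
    for n in (25, 40, 55):
        print(n, plan_workers(n))
```

With the defaults, 4 workers × 10 sessions cover at most 40 tables, so a 55-table selection would leave 15 uncovered until max_threads or sessions_per_thread is raised.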
logging
| Key | Type | Default | Effect |
| verbose | bool | false |
High-level info messages. Same as the -v
CLI flag. Reloadable on SIGHUP. |
| debug_trace | bool | false |
[TRACE] function-entry lines. -T.
Reloadable on SIGHUP. |
| debug_sql | bool | false |
[SQL] generated statements. -S.
Reloadable. |
| debug_cdc | bool | false |
[CDC] API trace. -C. Reloadable. |
| debug_detail | bool | false |
[DET] column-level dump. -D.
Reloadable. |
| debug_hex | bool | false |
[HEX] raw byte dump. -X.
Reloadable. |
| status_interval | integer (seconds) | 0 |
[STATUS] + [LSN] cadence. 0 = off.
Reloadable on SIGHUP. |
| logfile | path | (empty) |
Daemon-mode stdout/stderr destination. Empty →
/dev/null. |
| pidfile | path | /tmp/oni_ripper.pid |
PID file written in daemon mode. |
| log_maxsize | integer (bytes) | 0 |
Rotate / truncate trigger. 0 = unlimited. Reloadable
on SIGHUP. |
| log_keep | integer | 5 |
Rotated copies retained. 0 = legacy in-place
truncate. Reloadable on SIGHUP. |
| state_file | path | (empty) |
LSN checkpoint file. Empty disables persistence. |
| clean_restart | bool | false |
Ignore the saved state_file on read. The CLI
-R overrides this for one run. |
alerting
| Key | Type | Default | Effect |
| discard_command | shell command | (empty) |
Run via /bin/sh -c on each
CDC_REC_DISCARD event. Env vars passed:
RIPPER_EVENT, RIPPER_WORKER_ID,
RIPPER_DISCARDS, RIPPER_LSN,
RIPPER_TIMESTAMP. |
| discard_alert_min_sec | integer (seconds) | 60 |
Minimum gap between alert fires per worker, so a
discard storm doesn't fork-bomb the alerter. |
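The per-worker throttle implied by discard_alert_min_sec can be sketched as below. This is not the Ripper's implementation: the `DiscardAlerter` class, the `"discard"` value for RIPPER_EVENT, and the epoch-seconds format for RIPPER_TIMESTAMP are all assumptions made for the example.

```python
import os
import subprocess
import time


class DiscardAlerter:
    """Run discard_command at most once per min_gap_sec for each worker."""

    def __init__(self, command, min_gap_sec=60, clock=time.monotonic, runner=None):
        self.command = command
        self.min_gap_sec = min_gap_sec
        self.clock = clock
        self.runner = runner or self._spawn
        self._last_fire = {}  # worker_id -> clock reading at the last alert

    @staticmethod
    def _spawn(command, env):
        # Fire and forget; the reference notes SIGCHLD is ignored so children are reaped.
        subprocess.Popen(["/bin/sh", "-c", command], env=env)

    def on_discard(self, worker_id, discards, lsn):
        """Return True if the alert command was started, False if suppressed."""
        if not self.command:
            return False
        now = self.clock()
        last = self._last_fire.get(worker_id)
        if last is not None and now - last < self.min_gap_sec:
            return False
        self._last_fire[worker_id] = now
        env = dict(os.environ)
        env.update({
            "RIPPER_EVENT": "discard",                  # ASSUMPTION: value format
            "RIPPER_WORKER_ID": str(worker_id),
            "RIPPER_DISCARDS": str(discards),
            "RIPPER_LSN": str(lsn),
            "RIPPER_TIMESTAMP": str(int(time.time())),  # ASSUMPTION: epoch seconds
        })
        self.runner(self.command, env)
        return True
```

The clock and runner are injectable so the throttle can be exercised without forking a shell.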
buffering
Target-offline buffering. When the configured target returns a
connection-class failure, transactions are appended to an in-memory
ring; when memory fills, the oldest batch spills to a memory-mapped
file under disk_overflow_path. A probe thread retries the
target and the buffer drains chronologically when it’s back.
| Key | Type | Default | Effect |
| memory_max_bytes | integer (bytes) | 0 |
Soft cap on the in-memory ring. 0 disables the
subsystem entirely — output failures stay
fatal. 268435456 (256 MB) is a reasonable starting point
for one-target deployments. |
| disk_overflow_path | directory path | (empty) |
Where spill files (<unix_ts>_<seq>.buf,
mmap'd on drain) are written. The ripper creates the
directory at startup if it doesn’t exist. Empty =
memory-only (no disk overflow). |
| disk_max_bytes | integer (bytes) | 0 |
Cap on total spill bytes on disk. 0 = memory-only.
Once both memory_max_bytes and disk_max_bytes
are exhausted, the buffer transitions to SUSPENDED:
a [CRITICAL] line fires and capture pauses until the
operator clears space or the target comes back. |
| target_check_seconds | integer (seconds) | 30 |
Probe cadence. The probe thread runs a zero-record
transaction through the live backend; success transitions
the state machine to DRAINING and the
backlog replays. |
| on_shutdown | enum | flush |
SIGTERM behavior while memory holds records.
flush spills the in-memory tail to
disk_overflow_path so the next startup picks it up;
drain blocks until the target accepts
(falls through to flush if it doesn’t);
abandon drops the in-memory tail with a
[CRITICAL] audit line. |
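The memory ring, disk spill, and SUSPENDED / DRAINING transitions can be modelled in a few lines. This is a toy model under assumptions: the "disk" is a second in-memory queue standing in for spill files, whole transactions are spilled one at a time, and the BUFFERING and ONLINE state names are invented for the sketch (only SUSPENDED and DRAINING appear in the reference).

```python
from collections import deque


class OfflineBuffer:
    """Toy model of the target-offline buffer with a memory tier and a disk tier."""

    def __init__(self, memory_max_bytes, disk_max_bytes):
        self.memory_max_bytes = memory_max_bytes
        self.disk_max_bytes = disk_max_bytes
        self.memory = deque()   # newest transactions
        self.disk = deque()     # older transactions (stand-in for spill files)
        self.memory_bytes = 0
        self.disk_bytes = 0
        self.state = "BUFFERING"  # hypothetical state name

    def append(self, tx: bytes) -> bool:
        """Buffer one transaction; return False once capture must pause."""
        if self.state == "SUSPENDED":
            return False
        # Spill the oldest in-memory transactions until the new one fits.
        while self.memory and self.memory_bytes + len(tx) > self.memory_max_bytes:
            oldest = self.memory[0]
            if self.disk_bytes + len(oldest) > self.disk_max_bytes:
                self.state = "SUSPENDED"  # both tiers exhausted
                return False
            self.memory.popleft()
            self.memory_bytes -= len(oldest)
            self.disk.append(oldest)
            self.disk_bytes += len(oldest)
        self.memory.append(tx)            # soft cap: an oversized tx is still kept
        self.memory_bytes += len(tx)
        return True

    def drain(self):
        """Yield buffered transactions oldest first: disk tier, then memory."""
        self.state = "DRAINING"
        while self.disk:
            tx = self.disk.popleft()
            self.disk_bytes -= len(tx)
            yield tx
        while self.memory:
            tx = self.memory.popleft()
            self.memory_bytes -= len(tx)
            yield tx
        self.state = "ONLINE"             # hypothetical state name
```

Draining the disk tier first keeps replay chronological, because spilled transactions are always older than anything still in memory.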
Connection-class triggers per backend (anything else stays a
data-class failure and takes the legacy ROLLBACK + log path):
| Mode | Offline trigger |
| informix |
SQLCODE -908 / -931 / -25555 / -956 / -25588 |
| odbc |
SQLSTATE class 08* (ODBC spec
"Connection exception") |
| postgres |
PQstatus(conn) == CONNECTION_BAD after BEGIN / exec / COMMIT |
| mysql / mariadb |
mysql_errno 2002 / 2003 / 2006 / 2012 / 2013 |
| db2 |
SQLSTATE class 08* from SQLGetDiagRec |
| kafka |
librdkafka __TRANSPORT / __ALL_BROKERS_DOWN /
__TIMED_OUT / __RESOLVE / __MSG_TIMED_OUT /
BROKER_NOT_AVAILABLE |
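The code-based rows of the table above translate directly into a classifier. The sketch below covers only the backends whose trigger is a numeric code or SQLSTATE string; the postgres and kafka rows depend on library status values and are left out. `is_connection_class` is a hypothetical name.

```python
INFORMIX_OFFLINE_SQLCODES = {-908, -931, -25555, -956, -25588}
MYSQL_OFFLINE_ERRNOS = {2002, 2003, 2006, 2012, 2013}


def is_connection_class(mode, code):
    """Return True if the error should start buffering instead of a ROLLBACK."""
    if mode == "informix":
        return code in INFORMIX_OFFLINE_SQLCODES
    if mode in ("odbc", "db2"):
        # SQLSTATE class 08 is "connection exception".
        return isinstance(code, str) and code.startswith("08")
    if mode in ("mysql", "mariadb"):
        return code in MYSQL_OFFLINE_ERRNOS
    raise ValueError(f"no code-based offline trigger sketched for mode {mode!r}")


if __name__ == "__main__":
    print(is_connection_class("informix", -908))   # connection-class
    print(is_connection_class("odbc", "23000"))    # data-class
```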
monitoring
Optional embedded HTTP server exposing Prometheus metrics +
liveness probe. Default-disabled; the ripper opens no listening
sockets unless this section is present and port is
non-zero.
| Key | Type | Default | Effect |
| prometheus.port | integer (TCP port) | 0 |
0 disables the endpoint. Any non-zero
value starts a single-threaded daemon HTTP server that
serves /metrics + /healthz. Bind failures
log a WARN line and continue capture —
metrics are non-critical. |
| prometheus.bind | IPv4 address | 127.0.0.1 |
Address the embedded server binds. Loopback by default
so a fresh enable carries no external attack surface. Set
0.0.0.0 to allow a Prometheus server on a different
host to scrape; wrap with a reverse-proxy if the link
crosses an untrusted network — the embedded server
is plain HTTP, no TLS, no authentication. |
Metric families exposed (v1):
| Metric | Type | Labels | Description |
| oni_logripper_build_info | gauge |
version |
Constant 1, advertises ripper version. |
| oni_logripper_records_total | counter |
worker, op |
Records emitted per worker per op ∈
insert/update/delete/truncate/discard. |
| oni_logripper_lag_seconds | gauge |
worker |
Source wall-clock minus last seen transaction
timestamp; 0 if no tx has been processed yet. |
| oni_logripper_recovery_count | counter |
worker |
Worker session recovery attempts since process
start. |
| oni_logripper_worker_running | gauge |
worker |
1 if the worker thread is actively running, 0 if
stopped. |
/healthz returns HTTP 200 when every
worker reports error == 0, otherwise
HTTP 503. Curated Grafana provisioning files +
dashboard JSON ship under share/grafana/; see
grafana.html for the
drop-in install steps, panel breakdown, and operator playbook.
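A consumer-side check on the lag gauge might look like the sketch below. It parses the standard Prometheus text exposition format with a simplified regex (label values with embedded quotes or braces are not handled). The sample text is hand-written for illustration and is not captured Ripper output; the worker label values are assumed.

```python
import re

_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip HELP / TYPE comments
            continue
        m = _SAMPLE.match(line)
        if not m:
            continue
        labels = dict(_LABEL.findall(m.group("labels") or ""))
        samples.append((m.group("name"), labels, float(m.group("value"))))
    return samples


def lagging_workers(text, max_lag_seconds):
    """Worker labels whose oni_logripper_lag_seconds exceeds the threshold."""
    return sorted(
        labels.get("worker", "?")
        for name, labels, value in parse_metrics(text)
        if name == "oni_logripper_lag_seconds" and value > max_lag_seconds
    )


# Hand-written sample, not real output.
EXAMPLE = """\
# TYPE oni_logripper_lag_seconds gauge
oni_logripper_lag_seconds{worker="0"} 1.5
oni_logripper_lag_seconds{worker="1"} 42
oni_logripper_worker_running{worker="0"} 1
"""

if __name__ == "__main__":
    print(lagging_workers(EXAMPLE, 10))
```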
CLI flags
| Flag | Effect |
| -c <file> |
Configuration file (required). |
| -l <lsn> |
Override starting LSN. |
| -w <num> |
Cap worker threads (overrides
threading.max_threads). |
| -i <secs> |
Status report interval (overrides
logging.status_interval). |
| -d |
Daemonize. |
| -v / -T / -S / -C /
-D / -X |
Verbose / Trace / SQL / CDC / Detail / heX dump.
Each enables the corresponding logging.verbose or
logging.debug_* key for the run. |
| -R |
Clean restart — ignore the saved state file on
read. The state file is still rewritten on shutdown. |
| -t <mode> |
Connectivity test. Runs every startup phase except
worker spawn (source connect, table resolution, LSN
monitor, log-fallback pre-flight, output init) then exits
cleanly. <mode> = auto uses
target.mode from the YAML; any other mode name
overrides for the test. Implies -v. Skips orphan
cleanup so no server-side state changes. |
| -V |
Print version and exit. |
| -h |
Show help. |
Signals
| Signal | Effect |
| SIGTERM, SIGINT |
Graceful shutdown. Drain in-flight transactions to
_NO_COMMIT files, persist the LSN checkpoint,
close CDC sessions. |
| SIGHUP |
Reload the YAML config and apply the runtime-tunable
subset (verbose, debug_*,
status_interval, log_maxsize,
log_keep, cdc_timeout). Anything else
captured at startup is logged as not-reloaded. |
| SIGCHLD |
Ignored (SIG_IGN) so the kernel auto-reaps
discard-alert subprocesses. |
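The SIGHUP behavior, applying only the runtime-tunable subset and reporting everything else as not-reloaded, can be sketched as a pure function. The flat key names without section prefixes and the `apply_reload` helper are assumptions made for the example.

```python
RELOADABLE = {
    "verbose", "debug_trace", "debug_sql", "debug_cdc", "debug_detail",
    "debug_hex", "status_interval", "log_maxsize", "log_keep", "cdc_timeout",
}


def apply_reload(running, reloaded):
    """Update `running` in place; return (applied_changes, not_reloaded_keys)."""
    applied = {}
    not_reloaded = []
    for key, new_value in reloaded.items():
        if running.get(key) == new_value:
            continue                      # unchanged, nothing to report
        if key in RELOADABLE:
            running[key] = new_value
            applied[key] = new_value
        else:
            not_reloaded.append(key)      # changed, but fixed at startup
    return applied, sorted(not_reloaded)


if __name__ == "__main__":
    live = {"verbose": False, "max_threads": 4, "cdc_timeout": 5}
    print(apply_reload(live, {"verbose": True, "max_threads": 8, "cdc_timeout": 5}))
```

In a real daemon the signal handler would typically only set a flag, with the reload itself performed on the main loop.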