Oninit® Log Ripper — Configuration Reference

Every YAML key the Ripper accepts, grouped by section. Defaults match those shipped in config.yml.example. Any key not listed in these tables is silently ignored.

Precedence

For settings that exist in more than one place, the Ripper resolves them in the order below; the source listed first wins:

| Source | Notes |
| --- | --- |
| CLI flag | -l, -w, -i, -R, and the -v / -T / -S / -C / -D / -X debug flags. Each overrides the matching YAML key for the run. |
| YAML file (-c) | The configuration file is required. Anything set there beats the built-in defaults. |
| Environment | INFORMIXSERVER is the only env var the Ripper reads directly; it is used as a fallback for source.server when the YAML omits it. ESQL/C also consults INFORMIXDIR, INFORMIXSQLHOSTS, DBPATH, DB_LOCALE, CLIENT_LOCALE via the CSDK. |
| Built-in default | The values shown in the Default column of each table below. |
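
As an illustration of the precedence rules, the fragment below marks which keys a CLI flag or the environment can supply for a run. It is a sketch only: the nesting of keys under section names and the database name are assumptions, not taken from config.yml.example.

```yaml
# Hypothetical config.yml fragment; values are illustrative only.
source:
  database: stores_demo    # hypothetical database name
  # server is omitted here, so the Ripper falls back to $INFORMIXSERVER
  start_lsn: 0             # replaced for one run by -l <lsn>
threading:
  max_threads: 4           # replaced for one run by -w <num>
logging:
  status_interval: 0       # replaced for one run by -i <secs>
  clean_restart: false     # replaced for one run by -R
```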

source

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| database | string | (required) | Source Informix database to capture from. |
| server | string | (required) | Source INFORMIXSERVER name. Can also be set via the env var. |
| user | string | (empty) | If set, used as CONNECT TO … USER … USING …. If empty, falls back to OS auth. |
| password | string | (empty) | Paired with user. |
| start_lsn | integer | 0 | 0 = start at the current end of the logical log. A non-zero value calls cdc_activatesess at that position. The CLI -l <lsn> overrides. |
| schema_archive_dir | path | (empty) | Directory holding archived <db>_<tab>.sql DDL files used for drift recovery and the startup [ARCHIVE-CMP] safety net. Time-versioned variants are recognized via the suffix form <db>_<tab>.<YYYYMMDDTHHMMSSZ>.sql (UTC, ISO 8601); the un-suffixed file means "current / effective at +∞". The Ripper picks the file with the largest timestamp ≤ the in-flight transaction's BEGINTX time at acquire-time. |
| control_dir | path | (empty) | Directory polled by workers for release_<tab> / acquire_<tab> sentinel files. Empty disables live release. |
| tables | list | all user tables | Glob patterns or extended map entries (see Per-table extended entry below). !pattern excludes. |
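
A minimal sketch of a source section using the keys above. The server name, credentials, directory paths, and table patterns are hypothetical, and placing the keys under a top-level source: map is an assumption about the file layout.

```yaml
# Hypothetical source section; every value here is illustrative.
source:
  database: stores_demo
  server: ol_informix1            # may instead come from $INFORMIXSERVER
  user: cdc_reader                # leave user/password empty for OS auth
  password: "change-me"
  start_lsn: 0                    # 0 = current end of the logical log
  # Archived DDL files would be named like:
  #   stores_demo_orders.sql                    (un-suffixed, current)
  #   stores_demo_orders.20240101T000000Z.sql   (time-versioned, UTC)
  schema_archive_dir: /var/lib/oni_ripper/schema
  control_dir: /var/lib/oni_ripper/control
  tables:
    - "orders*"                   # glob include
    - "!orders_tmp*"              # leading ! excludes
```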

Per-table extended entry

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| name | string | (required) | Table name or glob. |
| columns | list | all columns | Per-column include / exclude patterns. An all-exclude list (["!body"]) defaults to "include everything else"; any include entry switches to "include only listed". |
| skip_unsupported_columns | bool | false | When the column DESCRIBE finds a type CDC rejects (TEXT / BYTE / BLOB / CLOB / SET / MULTISET / LIST / ROW / UDT), false aborts the whole table; true silently marks those columns as included = 0 and captures the rest. |
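
The two column-list behaviors can be sketched as follows. Table and column names are hypothetical, and mixing plain glob strings with map entries in one list is inferred from the tables description rather than confirmed by a shipped example.

```yaml
# Hypothetical tables list mixing plain globs and extended entries.
source:
  tables:
    - "customer*"                     # plain glob entry
    - name: messages                  # extended map entry
      columns: ["!body"]              # all-exclude list: everything except body
      skip_unsupported_columns: true  # drop rejected column types, keep the rest
    - name: orders
      columns: [order_num, order_date, customer_num]  # include only these
```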

target

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| mode | string | file | One of file, json, csv, kafka, informix, odbc. |
| file.directory | path | "." | Where transaction SQL files are written. |
| json.directory | path | "." | Where transaction JSON files are written. RFC 8259 compliant: UTF-8 only, every byte ≥ 0x80 escaped as \u00XX so the document parses regardless of source DB locale; LSNs emitted as hex strings (not numbers) to dodge IEEE 754 precision loss above 2^53. |
| csv.directory | path | "." | Where transaction CSV files are written. RFC 4180 compliant; long-form (one row per CDC record, every row carries the full transaction context: txid, worker_id, user, started, committed, begin_lsn, commit_lsn, lsn, op, owner, table, status, sql). Bytes pass through verbatim, so set the consumer's decoder to match the source DB locale. |
| csv.delimiter | char | "," | Single-character field delimiter. Anything containing the delimiter, double quote, CR, or LF is wrapped in double quotes per RFC 4180; internal double quotes are doubled. |
| csv.include_header | bool | true | Write the column-name row as the first line of every per-transaction file. |
| kafka.brokers | string | (required) | Comma-separated list of host:port bootstrap servers for the producer. |
| kafka.topic | string | (required) | Single destination topic. Per-record messages are keyed by "<owner>.<table>" so all changes for one table go to the same partition (preserves per-table ordering). |
| kafka.acks | string | "all" | Producer ack mode. all = wait for all in-sync replicas (safest, default); 1 = leader only; 0 = fire-and-forget. |
| kafka.compression | string | "none" | One of none, gzip, snappy, lz4, zstd. |
| kafka.client_id | string | "oni_ripper" | Producer client.id; identifies the Ripper in broker logs and metrics. |
| kafka.flush_timeout_ms | integer | 30000 | Per-transaction rd_kafka_flush() wait. The worker's last_committed_lsn only advances after the broker acks all of the transaction's messages. |
| kafka.security_protocol | string | "plaintext" | One of plaintext, ssl, sasl_plaintext, sasl_ssl. |
| kafka.sasl_mechanism | string | (empty) | PLAIN, SCRAM-SHA-256, or SCRAM-SHA-512. Required when security_protocol is sasl_*. |
| kafka.sasl_username, kafka.sasl_password | string | (empty) | SASL credentials. |
| informix.database | string | | Target Informix DB. Cross-server allowed: db@server. |
| informix.server | string | (empty) | If set, alternative to the db@server form above. |
| informix.connection | string | targetconn | Named ESQL/C connection identifier. |
| informix.user, informix.password | string | (empty) | OS auth if both empty. |
| odbc.dsn | string | | DSN name from /etc/odbc.ini. |
| odbc.user, odbc.password | string | | Passed to SQLConnect. |
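
A sketch of a Kafka target with SASL over TLS, assuming the dotted key names above map to nested YAML maps (kafka.brokers becoming kafka: then brokers:). Broker hosts, topic name, and credentials are hypothetical.

```yaml
# Hypothetical target section for mode kafka; values are illustrative.
target:
  mode: kafka
  kafka:
    brokers: "kafka1.example.com:9092,kafka2.example.com:9092"
    topic: informix.cdc              # single destination topic
    acks: "all"                      # wait for all in-sync replicas
    compression: lz4
    client_id: oni_ripper
    flush_timeout_ms: 30000
    security_protocol: sasl_ssl      # sasl_* requires sasl_mechanism
    sasl_mechanism: SCRAM-SHA-512
    sasl_username: ripper
    sasl_password: "change-me"
```

Only the sub-map matching target.mode should matter for a given run; whether the other sub-maps are validated when present is not stated in this reference.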

threading

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| sessions_per_thread | integer | 10 | Tables per worker thread. Total workers needed = ceil(ntables / sessions_per_thread), capped at max_threads. |
| max_threads | integer | 4 | Hard cap on worker threads. Tables that don't fit under the cap would otherwise be silently dropped, so raise this if a startup log line warns about table coverage. |
| cdc_timeout | integer (seconds) | 5 | cdc_opensess timeout argument. 0 means block-forever; finite values let the worker poll the running flag so SIGTERM is honored. |
| max_recs_per_read | integer | 1 | Upper bound on records the server packs into a single ifx_lo_read response. Higher amortizes per-read overhead; lower delivers CDC_REC_TIMEOUT sooner. |
| cleanup_orphans | bool | true | If true, the main thread sweeps syscdcsess at startup and closes any leftover sessions before workers spin up. |
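
The worker-count formula can be worked through in a threading section like the one below. The table counts in the comments are hypothetical, and the top-level threading: nesting is an assumption.

```yaml
# Hypothetical threading section with the sizing arithmetic in comments.
threading:
  sessions_per_thread: 10   # 25 tables -> ceil(25 / 10) = 3 workers needed
  max_threads: 4            # 3 <= 4, so all 25 tables fit
                            # 45 tables -> ceil(45 / 10) = 5 > 4: raise max_threads
  cdc_timeout: 5            # finite, so workers can notice SIGTERM
  max_recs_per_read: 1
  cleanup_orphans: true
```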

logging

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| verbose | bool | false | High-level info messages. Same as the -v CLI flag. Reloadable on SIGHUP. |
| debug_trace | bool | false | [TRACE] function-entry lines. -T. Reloadable on SIGHUP. |
| debug_sql | bool | false | [SQL] generated statements. -S. Reloadable. |
| debug_cdc | bool | false | [CDC] API trace. -C. Reloadable. |
| debug_detail | bool | false | [DET] column-level dump. -D. Reloadable. |
| debug_hex | bool | false | [HEX] raw byte dump. -X. Reloadable. |
| status_interval | integer (seconds) | 0 | [STATUS] + [LSN] cadence. 0 = off. Reloadable on SIGHUP. |
| logfile | path | (empty) | Daemon-mode stdout/stderr destination. Empty → /dev/null. |
| pidfile | path | /tmp/oni_ripper.pid | PID file written in daemon mode. |
| log_maxsize | integer (bytes) | 0 | Rotate / truncate trigger. 0 = unlimited. Reloadable on SIGHUP. |
| log_keep | integer | 5 | Rotated copies retained. 0 = legacy in-place truncate. Reloadable on SIGHUP. |
| state_file | path | (empty) | LSN checkpoint file. Empty disables persistence. |
| clean_restart | bool | false | Ignore the saved state_file on read. The CLI -R overrides this for one run. |
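
A sketch of a logging section for a daemonized deployment with rotation and LSN checkpointing. The file paths are hypothetical, and the top-level logging: nesting is an assumption.

```yaml
# Hypothetical logging section; paths and sizes are illustrative.
logging:
  verbose: true
  status_interval: 60               # [STATUS] + [LSN] line every 60 s
  logfile: /var/log/oni_ripper/ripper.log
  pidfile: /tmp/oni_ripper.pid
  log_maxsize: 104857600            # rotate at 100 MiB (100 * 1024 * 1024)
  log_keep: 5                       # keep five rotated copies
  state_file: /var/lib/oni_ripper/ripper.state
  clean_restart: false
```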

alerting

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| discard_command | shell command | (empty) | Run via /bin/sh -c on each CDC_REC_DISCARD event. Env vars passed: RIPPER_EVENT, RIPPER_WORKER_ID, RIPPER_DISCARDS, RIPPER_LSN, RIPPER_TIMESTAMP. |
| discard_alert_min_sec | integer (seconds) | 60 | Minimum gap between alert fires per worker, so a discard storm doesn't fork-bomb the alerter. |
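
One way to wire up the alert hook is to forward each discard event to syslog with the standard logger utility. This is a sketch: the message format is made up, and the top-level alerting: nesting is an assumption.

```yaml
# Hypothetical alerting section; the command runs via /bin/sh -c, so the
# RIPPER_* environment variables expand inside the double-quoted message.
alerting:
  discard_command: 'logger -t oni_ripper "event=$RIPPER_EVENT worker=$RIPPER_WORKER_ID discards=$RIPPER_DISCARDS lsn=$RIPPER_LSN ts=$RIPPER_TIMESTAMP"'
  discard_alert_min_sec: 60   # at most one alert per worker per minute
```

The outer single quotes keep YAML from interpreting the inner double quotes, so the shell receives the command string unchanged.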

buffering

Target-offline buffering. When the configured target returns a connection-class failure, transactions are appended to an in-memory ring; when memory fills, the oldest batch spills to a memory-mapped file under disk_overflow_path. A probe thread retries the target and the buffer drains chronologically when it’s back.

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| memory_max_bytes | integer (bytes) | 0 | Soft cap on the in-memory ring. 0 disables the subsystem entirely, so output failures stay fatal. 268435456 (256 MB) is a reasonable starting point for one-target deployments. |
| disk_overflow_path | directory path | (empty) | Where spill files (<unix_ts>_<seq>.buf, mmap'd on drain) are written. The ripper creates the directory at startup if it doesn’t exist. Empty = memory-only (no disk overflow). |
| disk_max_bytes | integer (bytes) | 0 | Cap on total spill bytes on disk. 0 = memory-only. Once both memory_max_bytes and disk_max_bytes are exhausted, the buffer transitions to SUSPENDED: a [CRITICAL] line fires and capture pauses until the operator clears space or the target comes back. |
| target_check_seconds | integer (seconds) | 30 | Probe cadence. The probe thread runs a zero-record transaction through the live backend; success transitions the state machine to DRAINING and the backlog replays. |
| on_shutdown | enum | flush | SIGTERM behavior while memory holds records. flush spills the in-memory tail to disk_overflow_path so the next startup picks it up; drain blocks until the target accepts (falls through to flush if it doesn’t); abandon drops the in-memory tail with a [CRITICAL] audit line. |
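
A sketch of a buffering section with both memory and disk tiers enabled. The spill directory and the disk cap are hypothetical choices, and the top-level buffering: nesting is an assumption.

```yaml
# Hypothetical buffering section; sizes and path are illustrative.
buffering:
  memory_max_bytes: 268435456     # 256 MB in-memory ring (0 would disable buffering)
  disk_overflow_path: /var/spool/oni_ripper/overflow
  disk_max_bytes: 10737418240     # 10 GiB of spill files (10 * 1024^3)
  target_check_seconds: 30        # probe the target every 30 s
  on_shutdown: flush              # spill the in-memory tail on SIGTERM
```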

Connection-class triggers per backend. Anything else stays a data-class failure and takes the legacy ROLLBACK + log path:

| Mode | Offline trigger |
| --- | --- |
| informix | SQLCODE -908 / -931 / -25555 / -956 / -25588 |
| odbc | SQLSTATE class 08* (ODBC spec "Connection exception") |
| postgres | PQstatus(conn) == CONNECTION_BAD after BEGIN / exec / COMMIT |
| mysql / mariadb | mysql_errno 2002 / 2003 / 2006 / 2012 / 2013 |
| db2 | SQLSTATE class 08* from SQLGetDiagRec |
| kafka | librdkafka __TRANSPORT / __ALL_BROKERS_DOWN / __TIMED_OUT / __RESOLVE / __MSG_TIMED_OUT / BROKER_NOT_AVAILABLE |

monitoring

Optional embedded HTTP server exposing Prometheus metrics + liveness probe. Default-disabled; the ripper opens no listening sockets unless this section is present and port is non-zero.

| Key | Type | Default | Effect |
| --- | --- | --- | --- |
| prometheus.port | integer (TCP port) | 0 | 0 disables the endpoint. Any non-zero value starts a single-threaded daemon HTTP server that serves /metrics + /healthz. Bind failures log a WARN line and capture continues, since metrics are non-critical. |
| prometheus.bind | IPv4 address | 127.0.0.1 | Address the embedded server binds. Loopback by default so a fresh enable carries no external attack surface. Set 0.0.0.0 to allow a Prometheus server on a different host to scrape; wrap with a reverse-proxy if the link crosses an untrusted network, because the embedded server is plain HTTP with no TLS and no authentication. |
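
A sketch of enabling the endpoint on loopback only. The port number is an arbitrary choice, and nesting prometheus: under a top-level monitoring: map is an assumption about the file layout.

```yaml
# Hypothetical monitoring section; the port is illustrative.
monitoring:
  prometheus:
    port: 9464          # any non-zero value enables /metrics and /healthz
    bind: 127.0.0.1     # loopback only; 0.0.0.0 exposes plain HTTP to the network
```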

Metric families exposed (v1):

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| oni_logripper_build_info | gauge | version | Constant 1, advertises ripper version. |
| oni_logripper_records_total | counter | worker, op | Records emitted per worker per op ∈ insert/update/delete/truncate/discard. |
| oni_logripper_lag_seconds | gauge | worker | Source wall-clock minus last seen transaction timestamp; 0 if no tx has been processed yet. |
| oni_logripper_recovery_count | counter | worker | Worker session recovery attempts since process start. |
| oni_logripper_worker_running | gauge | worker | 1 if the worker thread is actively running, 0 if stopped. |

/healthz returns HTTP 200 when every worker reports error == 0, otherwise HTTP 503. Curated Grafana provisioning files + dashboard JSON ship under share/grafana/; see grafana.html for the drop-in install steps, panel breakdown, and operator playbook.
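
On the Prometheus side, a scrape job for the embedded endpoint might look like the sketch below. This belongs in the Prometheus server's own prometheus.yml, not in the Ripper config; the job name, interval, and target port are hypothetical and must match whatever prometheus.port is set to.

```yaml
# Hypothetical prometheus.yml fragment for scraping a local Ripper.
scrape_configs:
  - job_name: oni_logripper
    scrape_interval: 15s
    metrics_path: /metrics          # Prometheus default, shown for clarity
    static_configs:
      - targets: ["127.0.0.1:9464"] # must match prometheus.bind and prometheus.port
```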

CLI flags

| Flag | Effect |
| --- | --- |
| -c <file> | Configuration file (required). |
| -l <lsn> | Override starting LSN. |
| -w <num> | Cap worker threads (overrides threading.max_threads). |
| -i <secs> | Status report interval (overrides logging.status_interval). |
| -d | Daemonize. |
| -v / -T / -S / -C / -D / -X | Verbose / Trace / SQL / CDC / Detail / heX dump. Each enables the corresponding logging.verbose or logging.debug_* flag for the run. |
| -R | Clean restart: ignore the saved state file on read. The state file is still rewritten on shutdown. |
| -t <mode> | Connectivity test. Runs every startup phase except worker spawn (source connect, table resolution, LSN monitor, log-fallback pre-flight, output init) then exits cleanly. <mode> = auto uses target.mode from the YAML; any other mode name overrides for the test. Implies -v. Skips orphan cleanup so no server-side state changes. |
| -V | Print version and exit. |
| -h | Show help. |

Signals

| Signal | Effect |
| --- | --- |
| SIGTERM, SIGINT | Graceful shutdown. Drain in-flight transactions to _NO_COMMIT files, persist the LSN checkpoint, close CDC sessions. |
| SIGHUP | Reload the YAML config and apply the runtime-tunable subset (verbose, debug_*, status_interval, log_maxsize, log_keep, cdc_timeout). Anything else captured at startup is logged as not-reloaded. |
| SIGCHLD | Ignored (SIG_IGN) so the kernel auto-reaps discard-alert subprocesses. |