Oninit® Log Ripper — Configuration

The Ripper takes a single YAML config file (-c <config.yml>). The example shipped with the package (config.yml.example) is fully annotated. The sections below are the live ones — sections not listed are ignored.

source

source:
  database: "stores_demo"
  server:   "ol_informix1410"   # or set INFORMIXSERVER in the env
  user:     "informix"          # optional, OS auth if omitted
  password: "secret"
  start_lsn: 0                  # 0 = current end of log
  schema_archive_dir: "/var/lib/oni_ripper/schema_archive"
  control_dir:        "/var/lib/oni_ripper/control"
  tables:
    - customer
    - order*
    - "!tmp_*"
    - name: t_with_blob
      columns:
        - "!body"
    - name: t_with_row
      skip_unsupported_columns: true

tables takes plain glob patterns (with a leading ! to exclude) or extended map entries. A map entry can carry per-column filters (for example !password) and the skip_unsupported_columns flag, which is meant for tables that carry a column type the Informix CDC API rejects (TEXT / BYTE / BLOB / CLOB / SET / MULTISET / LIST / ROW / UDT).
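The include / exclude behaviour of plain patterns can be sketched in a few lines of Python. This is an illustration only, not the ripper's matcher: it assumes shell-style globbing (fnmatch) and assumes an exclude always wins over an include regardless of order. The function name select_tables, the table list and the !order_archive pattern are hypothetical.

```python
from fnmatch import fnmatchcase

def select_tables(all_tables, patterns):
    """Return tables matching any include glob and no '!'-prefixed exclude glob.

    Assumption: excludes take priority over includes, independent of order.
    """
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    selected = []
    for name in all_tables:
        included = any(fnmatchcase(name, p) for p in includes)
        excluded = any(fnmatchcase(name, p) for p in excludes)
        if included and not excluded:
            selected.append(name)
    return selected

# Hypothetical catalogue and pattern list, for illustration only.
tables = ["customer", "orders", "order_items", "order_archive", "tmp_load"]
patterns = ["customer", "order*", "!tmp_*", "!order_archive"]
print(select_tables(tables, patterns))  # ['customer', 'orders', 'order_items']
```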

target

# File mode — one SQL file per committed transaction
target:
  mode: file
  file:
    directory: "/tmp/cdc_output"

# JSON mode — one RFC 8259 JSON file per transaction
target:
  mode: json
  json:
    directory: "/tmp/cdc_output"

# CSV mode — one RFC 4180 CSV file per transaction
target:
  mode: csv
  csv:
    directory:      "/tmp/cdc_output"
    delimiter:      ","       # single char; default ","
    include_header: true      # column-name row at top of each file

# Kafka mode — one message per CDC record via librdkafka
target:
  mode: kafka
  kafka:
    brokers: "kafka1:9092,kafka2:9092"
    topic:   "oni_cdc"
    acks:              "all"        # all (default) | 0 | 1 | -1
    compression:       "none"       # none | gzip | snappy | lz4 | zstd
    client_id:         "oni_ripper"
    flush_timeout_ms:  30000
    # security_protocol: "sasl_ssl"
    # sasl_mechanism:    "SCRAM-SHA-512"
    # sasl_username:     "ripper"
    # sasl_password:     "..."

# Informix mode — direct ESQL/C execution
target:
  mode: informix
  informix:
    database: "target_db@target_server"
    user: "informix"
    password: "secret"

# ODBC mode — any ODBC-connected DB
target:
  mode: odbc
  odbc:
    dsn: "my_postgres_dsn"
    user: "dbuser"
    password: "dbpass"

# Ingres target — via ODBC + the Ingres ODBC driver.
# Native libiiapi driver pending; use the ODBC path until then.
# sql_dialect: ingres switches the rewriter to Ingres conventions
# (INTERVAL DAY TO SECOND / YEAR TO MONTH per-keyword forms, etc.).
target:
  mode: odbc
  sql_dialect: ingres
  odbc:
    dsn: "my_ingres_dsn"
    user: "ingres_user"
    password: "secret"

Ingres target note. The native mode: ingres entry is reserved for a future libiiapi-based driver; using it today returns a CRITICAL log line at startup. Until the native driver lands, configure Ingres targets through mode: odbc with the Ingres ODBC driver: install the Ingres client + ODBC driver from Actian, register a DSN in /etc/odbc.ini, then point the ripper at it. The dialect rewriter (sql_dialect: ingres) handles Ingres-specific SQL on the way through — INTERVAL conversions, BOOLEAN/INTEGER mapping, identifier quoting differences, etc.

locale & charset

The captured stream is byte-oriented — column data lands in the captured SQL as the raw bytes the source's logical log holds, and the target accepts or rejects them per its own charset declaration. Most of the locale & charset wiring is automatic; one operator-configurable knob is documented below. See the Locale page for the verified source-locale matrix and the cross-locale compatibility table.

Automatic (no operator config)

  • CLIENT_LOCALE override. The ripper sets CLIENT_LOCALE=en_us.819 at process startup, before any ESQL/C connection is established. This forces Informix's dectoasc / dttoasc formatters to emit ASCII-dot decimal separators and ISO DATETIME literals regardless of the operator's shell environment or the source database's DB_LOCALE. Without this override, a source DB on a comma-decimal locale (de, fr, es, it, ru, pt, …) would emit 1234,56 instead of 1234.56, and the captured SQL would parse as two values for one column on every target. The override only affects formatter output; it does not transcode column data.
  • MySQL / MariaDB connection. The connector issues SET NAMES binary at session init so high-bit bytes (Latin-1, UTF-8 multi-byte) pass through to the target column verbatim regardless of the connection's default character set. The column's declared CHARACTER SET determines how the bytes render to applications reading the target.
  • PostgreSQL connection. The connector issues SET client_encoding TO LATIN1 at session init so Latin-1 source bytes don't trip PG's UTF-8 validator. UTF-8 source content travels via the same byte path; the operator declares the target column's encoding to match.
  • NO_AUTO_VALUE_ON_ZERO on MySQL / MariaDB. The connector appends NO_AUTO_VALUE_ON_ZERO to the session's sql_mode at init. Without this, source rows whose SERIAL value is 0 (rare but possible) would be silently renumbered by the next AUTO_INCREMENT value on the target — captured row IDs would diverge from the source. With it, the explicit 0 lands as 0.
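To see why the decimal-separator override matters, the following Python sketch splits a VALUES list on top-level commas, roughly the way any SQL parser would. It is a toy splitter written for this illustration (split_values and the sample rows are hypothetical), not code from the ripper.

```python
def split_values(values_clause):
    """Split a VALUES list on commas that sit outside single-quoted strings."""
    fields, current, in_quote = [], [], False
    for ch in values_clause:
        if ch == "'":
            in_quote = not in_quote
            current.append(ch)
        elif ch == "," and not in_quote:
            fields.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    fields.append("".join(current).strip())
    return fields

dot_form = "101, 'Ludwig', 1234.56"    # what the ASCII-dot formatter emits
comma_form = "101, 'Ludwig', 1234,56"  # what a comma-decimal locale would emit

print(len(split_values(dot_form)))    # 3
print(len(split_values(comma_form)))  # 4
```

The comma form yields four values for a three-column row, which is the failure the override prevents.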

Operator-configurable

target:
  mode: mysql
  sql_dialect: mysql
  mysql:
    host:     "…"
    port:     3306
    database: "…"
    user:     "…"
    password: "…"
    preserve_zero_autoincrement: true   # default true; see note below

target.mysql.preserve_zero_autoincrement defaults to true (the auto-applied NO_AUTO_VALUE_ON_ZERO described above). Set it to false only if the legacy auto-renumber behaviour is desirable on the target, which is usually not what an operator wants for replay. The same knob applies to target.mariadb.preserve_zero_autoincrement via the shared connector.

Operator responsibility: source-vs-target charset

The ripper does not transcode column data. When the source's DB_LOCALE encoding differs from the target column's declared character set, the operator must align them — either by declaring the target column with a charset matching the source (e.g. CHARACTER SET latin1 on a MySQL VARCHAR capturing from a Latin-1 Informix source), or by widening the target column to a multi-byte charset before pointing the ripper at it. The Locale page documents the verified compatibility matrix.
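A quick way to see the mismatch is to take a Latin-1 encoded string and test it against a UTF-8 decoder, which is in effect what a UTF-8 target column does. The helper name fits_utf8 and the sample value are illustrative only.

```python
def fits_utf8(raw):
    """Return True if the byte string is valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

latin1_bytes = "Müller".encode("latin-1")  # b'M\xfcller', as a Latin-1 source holds it
utf8_bytes = "Müller".encode("utf-8")      # b'M\xc3\xbcller'

print(fits_utf8(latin1_bytes))          # False: 0xFC is not a valid UTF-8 sequence
print(fits_utf8(utf8_bytes))            # True
print(latin1_bytes.decode("latin-1"))   # Müller
```

The same six characters are stored as different bytes in each encoding, so the target column has to be declared to match whatever the source actually holds.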

threading

threading:
  sessions_per_thread: 10   # tables per worker thread
  max_threads: 4            # maximum worker threads
  cdc_timeout: 5            # ifx_lo_read timeout (sec); 0 = block forever
  max_recs_per_read: 1      # records per ifx_lo_read; 1 = legacy default
  cleanup_orphans: true     # close stale CDC sessions at startup

Raising max_recs_per_read amortizes the per-read overhead across more records, at the cost of a slightly longer wait between a record's arrival and its delivery: a partial response stalls until either max_recs_per_read records have accumulated or the timeout fires (CDC_REC_TIMEOUT).

logging

logging:
  verbose: true
  debug_sql: false
  debug_cdc: false
  status_interval: 60
  logfile: "/tmp/oni_ripper.log"
  pidfile: "/tmp/oni_ripper.pid"
  log_maxsize: 10485760   # 10 MB; 0=unlimited
  log_keep: 5             # rotated copies retained; 0=truncate in place

  # LSN checkpoint
  state_file: "/tmp/oni_ripper.ckpt"
  clean_restart: false

On graceful shutdown the ripper writes min(last_committed_lsn) across all workers to state_file. The next startup reads it and resumes via cdc_activatesess at that position — at-least-once semantics around the boundary, never data loss. -R on the CLI forces a clean restart for one run while still rewriting the file at exit.
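Why the minimum gives at-least-once behaviour is easiest to see in code. The sketch below is not the ripper's checkpoint writer: the file format (one decimal integer), the function names and the worker LSN values are all hypothetical. It only shows the min-across-workers rule plus a conventional atomic write.

```python
import os
import tempfile

def checkpoint_lsn(last_committed_by_worker):
    """Pick the resume point: the lowest last-committed LSN of any worker."""
    return min(last_committed_by_worker.values())

def write_checkpoint(path, lsn):
    """Write atomically: temp file in the same directory, then rename over."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(f"{lsn}\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def read_checkpoint(path):
    """Read back the single integer written by write_checkpoint."""
    with open(path) as f:
        return int(f.read().strip())

# Hypothetical per-worker positions at shutdown.
workers = {0: 4200, 1: 4150, 2: 4310}
print(checkpoint_lsn(workers))  # 4150
```

Resuming at 4150 means workers 0 and 2 may see again records they had already committed (between 4150 and their own positions), but no worker can miss one.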

When the live log file passes log_maxsize, the ripper renames it through the rotation chain (logfile → logfile.1 → ... → logfile.<keep>), discards what would have become .<keep+1>, and opens a fresh logfile. log_keep: 0 keeps the legacy in-place truncate behavior.
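The rotation chain can be sketched as follows. This is a minimal Python rendering of the renames described above, assuming nothing about how the ripper implements it internally; the function name rotate is hypothetical.

```python
import os

def rotate(logfile, keep):
    """Shift logfile -> logfile.1 -> ... -> logfile.<keep>, dropping the oldest."""
    if keep == 0:
        open(logfile, "w").close()      # legacy behaviour: truncate in place
        return
    oldest = f"{logfile}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)               # this copy would have become .<keep+1>
    # Rename from the oldest surviving copy down, so nothing is overwritten.
    for n in range(keep - 1, 0, -1):
        src = f"{logfile}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{logfile}.{n + 1}")
    os.replace(logfile, f"{logfile}.1")
    open(logfile, "w").close()          # fresh, empty live log
```

With log_keep: 5 at most six files exist at any time: the live log plus .1 through .5.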

alerting

alerting:
  discard_command: "/usr/local/bin/notify-slack"
  discard_alert_min_sec: 60

discard_command runs in a fork+exec subprocess on each CDC_REC_DISCARD event the ripper sees, rate-limited by discard_alert_min_sec. Context is passed via env vars: RIPPER_EVENT, RIPPER_WORKER_ID, RIPPER_DISCARDS, RIPPER_LSN, RIPPER_TIMESTAMP. Empty command = disabled.
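A discard hook only needs to read those environment variables. The script below is a hypothetical stand-in for something like /usr/local/bin/notify-slack: it formats the five variables into one line and prints it, leaving the actual delivery (Slack, e-mail, pager) out. The value formats are not specified here, so the sketch treats every value as an opaque string.

```python
import os

# Variable names as documented above; their value formats are assumed opaque.
FIELDS = ("RIPPER_EVENT", "RIPPER_WORKER_ID", "RIPPER_DISCARDS",
          "RIPPER_LSN", "RIPPER_TIMESTAMP")

def format_alert(env):
    """Build a one-line alert from the ripper's environment variables."""
    parts = [f"{name}={env.get(name, '?')}" for name in FIELDS]
    return "oni_ripper discard: " + " ".join(parts)

if __name__ == "__main__":
    # A real hook would post this line somewhere; printing keeps the sketch inert.
    print(format_alert(os.environ))
```

Because the command runs once per rate-limited event, a hook like this should return quickly and not block on slow network calls.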

buffering

buffering:
  memory_max_bytes:     268435456     # 256 MB; 0 disables buffering
  disk_overflow_path:   "/var/lib/oni_ripper/buffer"
  disk_max_bytes:       10737418240   # 10 GB; 0 = memory-only
  target_check_seconds: 30
  on_shutdown:          flush         # flush | drain | abandon

Target-offline buffering. When a configured database / streaming target becomes unreachable, the dispatcher routes incoming transactions into a memory ring instead of dropping them. When memory fills, the oldest batch spills to a memory-mapped file under disk_overflow_path. A probe thread retries the target every target_check_seconds; on success the buffer drains in chronological order before resuming pass-through.

on_shutdown values: flush (default) spills in-memory tx to disk so the next start picks them up; drain blocks shutdown until the target accepts (falls through to flush if it doesn’t); abandon drops the buffer with a [CRITICAL] audit line.

memory_max_bytes: 0 disables the subsystem entirely — output failures stay fatal. Leave it 0 for file/json/csv/embed targets where "target offline" has no meaning. See reference.html for the full per-key table.
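The spill-and-drain ordering can be sketched with two queues. This is a simplified model and not the ripper's implementation: a plain deque stands in for the memory-mapped overflow file, spilling happens per transaction rather than per batch, and disk_max_bytes is not enforced. The class name OfflineBuffer is hypothetical.

```python
from collections import deque

class OfflineBuffer:
    """Toy model: bounded memory queue that spills its oldest entries."""

    def __init__(self, memory_max_bytes):
        self.memory_max_bytes = memory_max_bytes
        self.memory = deque()   # most recent transactions
        self.disk = deque()     # stand-in for the overflow file (older ones)
        self.memory_bytes = 0

    def push(self, tx):
        """Buffer one transaction (bytes); spill oldest while over budget."""
        self.memory.append(tx)
        self.memory_bytes += len(tx)
        while self.memory_bytes > self.memory_max_bytes and self.memory:
            oldest = self.memory.popleft()
            self.memory_bytes -= len(oldest)
            self.disk.append(oldest)

    def drain(self):
        """Yield everything oldest-first: spilled entries, then memory."""
        while self.disk:
            yield self.disk.popleft()
        while self.memory:
            tx = self.memory.popleft()
            self.memory_bytes -= len(tx)
            yield tx

buf = OfflineBuffer(memory_max_bytes=10)
for tx in (b"aaaa", b"bbbb", b"cccc"):
    buf.push(tx)
print(len(buf.disk))       # 1
print(list(buf.drain()))   # [b'aaaa', b'bbbb', b'cccc']
```

Since only the oldest entry ever spills, everything on disk predates everything in memory, which is what lets the drain replay in chronological order.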

monitoring

monitoring:
  prometheus:
    port: 9091              # 0 = disabled (default)
    bind: "127.0.0.1"       # 0.0.0.0 to expose externally

Optional embedded HTTP server for Prometheus scraping + orchestrator liveness probes. Default-disabled (port 0); the ripper opens no listening sockets unless this section is present and port is non-zero. When enabled it serves two routes:

  • /metrics: Prometheus text-exposition format v0.0.4. Five metric families ship in v1: oni_logripper_build_info{version}, oni_logripper_records_total{worker,op} with op one of insert/update/delete/truncate/discard, oni_logripper_lag_seconds{worker}, oni_logripper_recovery_count{worker}, oni_logripper_worker_running{worker}.
  • /healthz: 200 OK when every worker reports error == 0, otherwise 503. Standard liveness-probe shape for Kubernetes / systemd / Datadog Agent http_check.

Default bind is 127.0.0.1 so a fresh enable carries no external attack surface. Set 0.0.0.0 to allow a Prometheus server on a different host to scrape; wrap with a reverse-proxy (nginx / haproxy / Caddy) if the link crosses an untrusted network — the embedded server is plain HTTP. Pre-canned Grafana provisioning files + dashboard JSON ship under share/grafana/; see grafana.html for the drop-in install steps and the operator playbook.
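For ad-hoc checks without a Prometheus server, the text exposition format is simple enough to parse by hand. The sketch below extracts per-worker lag from a scrape body. The metric names come from the list above, but the SAMPLE text, its values and the worker label values are invented for illustration, and the regex only handles the single-label lines shown.

```python
import re

# Hypothetical scrape body; real output will differ in values and extra lines.
SAMPLE = '''\
# TYPE oni_logripper_lag_seconds gauge
oni_logripper_lag_seconds{worker="0"} 1.5
oni_logripper_lag_seconds{worker="1"} 12
oni_logripper_worker_running{worker="0"} 1
'''

# Matches: metric_name{worker="<id>"} <value>
LINE = re.compile(r'^(\w+)\{worker="([^"]*)"\}\s+(\S+)$')

def lag_by_worker(text):
    """Return {worker_label: lag_seconds} from a /metrics response body."""
    lags = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m and m.group(1) == "oni_logripper_lag_seconds":
            lags[m.group(2)] = float(m.group(3))
    return lags

print(lag_by_worker(SAMPLE))  # {'0': 1.5, '1': 12.0}
```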

SIGHUP runtime reload

kill -HUP <pid> re-reads the YAML and applies the runtime-tunable subset to the live process: verbose, the debug_* flags, status_interval, log_maxsize, log_keep, cdc_timeout. Each reload logs an applied / not-reloaded summary so the operator can see whether the edit took effect. Source / target / threading / tables / state-file / logfile changes are captured at startup and require a restart.
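The same reload can be triggered from a script by reading the PID from the configured pidfile. This sketch assumes the pidfile holds the process ID as plain decimal text, which is the usual convention but is not stated above; the function name reload_ripper is hypothetical.

```python
import os
import signal

def reload_ripper(pidfile):
    """Send SIGHUP to the process whose PID is stored in pidfile.

    Assumption: the file contains the PID as decimal text.
    """
    with open(pidfile) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGHUP)
    return pid
```

With the logging example above this would be called as reload_ripper("/tmp/oni_ripper.pid"); the applied / not-reloaded summary then appears in the ripper's log.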
