Configuration Reference

All configuration is handled via environment variables. You can set these in your docker-compose.yml, Kubernetes Deployment, or local shell.
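As an illustration of the pattern, the following Go sketch reads two of the variables named in this reference (PUBLISHER_TYPE and POLL_INTERVAL) and falls back to their documented defaults. The helper names `envOr` and `durationOr` are hypothetical, and parsing durations with Go's `time.ParseDuration` is an assumption based on the `500ms`, `3m`, and `24h` style of the defaults; the Relay's actual loader may differ.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// envOr returns the value of the environment variable key,
// or fallback when the variable is unset or empty.
// (Hypothetical helper, not part of the Relay.)
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// durationOr parses the variable as a Go duration string such as "500ms".
// It returns fallback when the variable is unset or empty.
func durationOr(key string, fallback time.Duration) (time.Duration, error) {
	raw := envOr(key, "")
	if raw == "" {
		return fallback, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return d, nil
}

func main() {
	publisher := envOr("PUBLISHER_TYPE", "stdout")
	poll, err := durationOr("POLL_INTERVAL", 500*time.Millisecond)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(publisher, poll)
}
```

Treating an empty value the same as an unset one is a design choice of this sketch, so that a blank entry in a Compose file still yields the default.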

Basic connectivity and engine settings. All of these are required to run the Relay.


Required String

Default: postgres

The database engine type used for the outbox table. Currently, PostgreSQL is the primary supported engine.


Required String

Default: stdout

The target message broker where events will be delivered. Supported options: nats, kafka, redis, stdout, null.


Required DSN

Default:

The full connection string for your source database. Example: postgres://user:pass@host:5432/db


Required URL

Default:

The connection address for your message broker. Example: localhost:9092 (Kafka) or nats://localhost:4222 (NATS).


Required String

Default: openoutbox_events

The name of the database table where outbox events are stored. This allows the Relay to operate within shared databases while avoiding naming conflicts.

Pattern: ^[a-z0-9_]+$ (Lowercase letters, numbers, and underscores only).
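To check a candidate table name before deploying, the pattern above can be applied directly. This is a minimal sketch; `validTableName` is a hypothetical helper and not part of the Relay.

```go
package main

import (
	"fmt"
	"regexp"
)

// tableNamePattern is the pattern documented for the outbox table name.
var tableNamePattern = regexp.MustCompile(`^[a-z0-9_]+$`)

// validTableName reports whether name contains only lowercase letters,
// digits, and underscores, and is not empty.
func validTableName(name string) bool {
	return tableNamePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"openoutbox_events", "Orders-Outbox", ""} {
		fmt.Printf("%q valid=%t\n", name, validTableName(name))
	}
}
```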


Required String

Default: os.Hostname() + "-" + uuid[:4]

A unique identifier for this specific Relay instance. This ID is used for database lease management, metric partitioning, and distributed locking.

Behavior: If left empty, the engine automatically generates an ID using the system hostname and a short unique suffix to prevent collisions in high-density environments.

Orchestration: When running in Kubernetes or Docker Compose, it is recommended to leave this empty to allow the engine to generate a unique identity per container, or map it to the Pod/Service name via the downward API.
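The default shown above can be approximated as follows. This is a sketch only: it uses four random hex characters as a stand-in for the first four characters of a UUID, and `defaultInstanceID` is a hypothetical name.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
)

// defaultInstanceID builds "<hostname>-<4 hex chars>".
// Assumption: 2 random bytes stand in for uuid[:4].
func defaultInstanceID() (string, error) {
	host, err := os.Hostname()
	if err != nil {
		return "", err
	}
	buf := make([]byte, 2) // 2 random bytes encode to 4 hex characters
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return host + "-" + hex.EncodeToString(buf), nil
}

func main() {
	id, err := defaultInstanceID()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(id)
}
```

A four-character hex suffix has 65,536 possible values. That is enough to separate a handful of containers sharing a hostname, but it is not a guarantee of uniqueness.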


Required String

Default: production

The execution mode for the Relay. Use development for more verbose local logging and debugging.


Controls how the Relay polls the database and manages event ownership.


Optional Duration

Default: 500ms

The idle-state delay. The Relay operates in Draining Mode (processing batches back-to-back without sleeping) until one of the following occurs:

  1. Empty Batch: No more PENDING events are found in the database.
  2. Process Error: An error occurs during the polling or delivery cycle.

In these cases, the Relay sleeps for the POLL_INTERVAL duration before retrying.
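The rule described above can be sketched as a loop that only sleeps when a batch comes back empty or fails. The names `shouldSleep`, `run`, and `demo`, and the `process` callback, are hypothetical; the Relay's internal loop is not shown in this reference.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errDelivery is a hypothetical error used only to exercise shouldSleep.
var errDelivery = errors.New("delivery failed")

// shouldSleep encodes the two documented conditions for leaving Draining Mode:
// an empty batch or an error during the cycle.
func shouldSleep(claimed int, err error) bool {
	return err != nil || claimed == 0
}

// run processes batches back-to-back and waits pollInterval only when
// shouldSleep says so. It returns how many times it went idle.
func run(ctx context.Context, pollInterval time.Duration,
	process func(context.Context) (int, error)) (idles int) {
	for ctx.Err() == nil {
		claimed, err := process(ctx)
		if !shouldSleep(claimed, err) {
			continue // Draining Mode: poll again immediately
		}
		idles++
		select {
		case <-ctx.Done():
		case <-time.After(pollInterval):
		}
	}
	return idles
}

// demo feeds three non-empty batches and one empty batch, then stops.
func demo() (calls, idles int) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	batches := []int{100, 100, 37, 0}
	process := func(context.Context) (int, error) {
		n := batches[calls]
		calls++
		if calls == len(batches) {
			cancel() // stop the demo after the first empty batch
		}
		return n, nil
	}
	idles = run(ctx, 10*time.Millisecond, process)
	return calls, idles
}

func main() {
	calls, idles := demo()
	fmt.Println("batches polled:", calls, "idle waits:", idles)
}
```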


Optional Integer

Default: 100

The maximum number of events the Relay will claim and attempt to process in a single polling iteration.


Optional Duration

Default: 3m

The time allowed for an event to stay in the DELIVERING state before it is considered “stuck” (due to a worker crash) and becomes eligible for a retry.


Optional Integer

Default: 100

The number of expired leases the cleanup engine will reset back to PENDING in a single maintenance cycle.
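The interaction between the lease timeout and the cleanup batch size can be illustrated with an in-memory sketch. The `event` struct and `resetExpired` function are hypothetical, and the strict "older than the timeout" comparison is an assumption; the Relay performs this work in the database.

```go
package main

import (
	"fmt"
	"time"
)

// event is a simplified, hypothetical view of an outbox row.
type event struct {
	ID        int
	Status    string
	ClaimedAt time.Time
}

// resetExpired moves at most limit events whose lease has expired
// from DELIVERING back to PENDING and returns how many were reset.
func resetExpired(events []event, now time.Time, leaseTimeout time.Duration, limit int) int {
	reset := 0
	for i := range events {
		if reset == limit {
			break // remaining expired leases wait for the next cycle
		}
		e := &events[i]
		if e.Status == "DELIVERING" && now.Sub(e.ClaimedAt) > leaseTimeout {
			e.Status = "PENDING"
			reset++
		}
	}
	return reset
}

// demo runs one maintenance cycle with a 3m lease timeout and a limit of 2.
func demo() (int, []event) {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	events := []event{
		{1, "DELIVERING", now.Add(-5 * time.Minute)},  // expired
		{2, "DELIVERING", now.Add(-1 * time.Minute)},  // lease still valid
		{3, "DELIVERING", now.Add(-10 * time.Minute)}, // expired
		{4, "PENDING", time.Time{}},
		{5, "DELIVERING", now.Add(-4 * time.Minute)}, // expired, but over the limit
	}
	n := resetExpired(events, now, 3*time.Minute, 2)
	return n, events
}

func main() {
	n, events := demo()
	fmt.Println("reset:", n)
	for _, e := range events {
		fmt.Println(e.ID, e.Status)
	}
}
```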

Optional Duration

Default: 5s

The frequency at which the engine performs background health probes on its dependencies (Storage and Publisher). This interval determines the “heartbeat” of the relay and controls how quickly the instance detects an outage and updates its operational state.


Optional Duration

Default: 5s

The amount of time the engine waits between failed attempts to establish the initial connection to the message broker (NATS, Kafka, or Redis) during the startup phase.
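A fixed-interval startup retry of this kind might look like the sketch below. `connectWithRetry` and the simulated `connect` callback are hypothetical, and the demo uses a 10ms interval instead of the 5s default so that it finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// connectWithRetry calls connect until it succeeds or ctx is cancelled,
// waiting retryInterval between failed attempts.
func connectWithRetry(ctx context.Context, retryInterval time.Duration,
	connect func(context.Context) error) (attempts int, err error) {
	for {
		attempts++
		if err = connect(ctx); err == nil {
			return attempts, nil
		}
		select {
		case <-ctx.Done():
			return attempts, fmt.Errorf("%w (last error: %v)", ctx.Err(), err)
		case <-time.After(retryInterval):
		}
	}
}

// demo simulates a broker that becomes reachable on the third attempt.
func demo() (int, error) {
	calls := 0
	connect := func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("broker unavailable")
		}
		return nil
	}
	return connectWithRetry(context.Background(), 10*time.Millisecond, connect)
}

func main() {
	attempts, err := demo()
	fmt.Println("attempts:", attempts, "err:", err)
}
```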


Optional Boolean

Default: true

Determines whether the engine performs background database scans to calculate backlog metrics, such as the count of pending events and the age of the oldest record.


Optional String

Default: :8080

The address and port where the Relay’s HTTP server listens. Currently, this server only exposes statistics at /stats. Supports standard Go address formats like :8080 or 127.0.0.1:8080.


Settings for handling failures and retries.


Optional Integer

Default: 25

The maximum number of delivery attempts before an event is considered permanently failed. Once this limit is reached, the status is updated to DEAD and requires manual intervention or a status reset to be retried.


Optional Duration

Default: 1s

The starting wait time for the exponential backoff algorithm. This is the duration the Relay waits after the very first failure.


Optional Duration

Default: 24h

The ceiling for exponential backoff. If the calculated delay for the next retry exceeds this value, the Relay caps the wait at this limit. With the default, a failing event is retried at least once per day.


Optional Float

Default: 0.15

A randomness factor (0.15 = 15%) applied to each backoff calculation. This prevents “thundering herd” issues by ensuring that multiple failed events don’t all retry at the exact same millisecond.
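Taken together, the initial delay, ceiling, and jitter settings describe a capped exponential backoff. The sketch below assumes the delay doubles after each failure and that jitter is applied symmetrically (plus or minus the factor). Neither detail is stated in this reference, and `backoff` is a hypothetical function.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoff returns the wait before the next retry.
// attempt is 1 for the wait after the first failure.
// rnd must return a value in [0, 1); it is injected so tests can fix it.
// Assumptions: doubling growth and symmetric jitter.
func backoff(attempt int, base, max time.Duration, jitter float64, rnd func() float64) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	delay := base
	for i := 1; i < attempt && delay < max; i++ {
		delay *= 2
	}
	if delay > max {
		delay = max
	}
	// Map rnd from [0, 1) to a spread in [-jitter, +jitter).
	spread := (rnd()*2 - 1) * jitter
	delay = time.Duration(float64(delay) * (1 + spread))
	if delay > max {
		delay = max // jitter never pushes the wait past the ceiling
	}
	return delay
}

func main() {
	for _, attempt := range []int{1, 2, 5, 10, 25} {
		d := backoff(attempt, time.Second, 24*time.Hour, 0.15, rand.Float64)
		fmt.Printf("attempt %d: wait about %s\n", attempt, d)
	}
}
```

Under these assumptions and the default values, the un-jittered delay reaches the 24h ceiling after 17 doublings, so the later attempts of the default 25 all wait roughly one day.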


These settings only apply when PUBLISHER_TYPE="nats".


Optional Duration

Default: 5s

The maximum amount of time the Relay will wait for an acknowledgment from the NATS server (or JetStream) before timing out the request.


Optional Duration

Default: 10s

The deadline for the initial TCP handshake and protocol negotiation with the NATS server.


These settings only apply when PUBLISHER_TYPE="redis".


Optional Duration

Default: 5s

The maximum amount of time allowed for the initial connectivity check (Ping) when the Relay service starts. If the Relay cannot reach Redis within this window, the service will fail to start.


Optional Duration

Default: 1s

The maximum time allowed for each write operation (XADD) to the Redis stream. This prevents a slow or lagging Redis instance from hanging the entire engine processing loop.
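A per-operation deadline of this kind is commonly implemented with `context.WithTimeout`. The sketch below shows the idea with a simulated write. `writeWithTimeout`, `slowWrite`, and `timedOut` are hypothetical names, no Redis client is involved, and the durations are shortened for the demo.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// writeWithTimeout runs write under a deadline of timeout.
func writeWithTimeout(parent context.Context, timeout time.Duration,
	write func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return write(ctx)
}

// slowWrite is a stand-in for a stream write that takes latency to finish
// and gives up as soon as its context is done.
func slowWrite(latency time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(latency):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// timedOut reports whether a write with the given latency hits the deadline.
func timedOut(latency, timeout time.Duration) bool {
	err := writeWithTimeout(context.Background(), timeout, slowWrite(latency))
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("fast write timed out:", timedOut(5*time.Millisecond, 200*time.Millisecond))
	fmt.Println("slow write timed out:", timedOut(500*time.Millisecond, 20*time.Millisecond))
}
```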


These settings only apply when PUBLISHER_TYPE="kafka".


Optional Integer

Default: 1


Optional Bytes

Default: 10485760 (10MB)

The maximum total size of a batch in bytes. The producer will trigger a request when either KAFKA_BATCH_SIZE or KAFKA_BATCH_BYTES is reached.


Optional Duration

Default: 10ms

The maximum time to wait before sending a partial batch if the size requirements haven’t been met.
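The batching triggers described above (message count, total bytes, and elapsed time) can be summarized as a single predicate. This is a sketch with hypothetical names (`batchLimits`, `shouldFlush`) and illustrative limits. It does not reproduce the Kafka client's internals, and the environment variable for the wait limit is not named in this reference.

```go
package main

import (
	"fmt"
	"time"
)

// batchLimits groups the three thresholds that can trigger a send.
type batchLimits struct {
	MaxMessages int           // corresponds to KAFKA_BATCH_SIZE
	MaxBytes    int64         // corresponds to KAFKA_BATCH_BYTES
	MaxWait     time.Duration // the partial-batch wait described above
}

// shouldFlush reports whether a buffered batch should be sent now.
// age is the time since the first message entered the batch.
func shouldFlush(messages int, bytes int64, age time.Duration, l batchLimits) bool {
	if messages == 0 {
		return false // nothing to send
	}
	return messages >= l.MaxMessages || bytes >= l.MaxBytes || age >= l.MaxWait
}

func main() {
	// Illustrative limits; MaxBytes and MaxWait match the documented defaults.
	limits := batchLimits{MaxMessages: 100, MaxBytes: 10 << 20, MaxWait: 10 * time.Millisecond}

	fmt.Println(shouldFlush(100, 4096, time.Millisecond, limits))  // message count reached
	fmt.Println(shouldFlush(3, 10<<20, time.Millisecond, limits))  // byte limit reached
	fmt.Println(shouldFlush(3, 4096, 10*time.Millisecond, limits)) // wait elapsed
	fmt.Println(shouldFlush(3, 4096, time.Millisecond, limits))    // keep buffering
}
```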


Optional Boolean

Default: false

When set to false, the Relay waits for an acknowledgment from Kafka before proceeding. Keep it set to false for guaranteed “At-Least-Once” delivery. Setting it to true increases throughput but risks message loss if the Relay crashes.


Optional String

Default: snappy

The compression codec used for messages sent to Kafka. Supported options: none, gzip, snappy, lz4, zstd. snappy provides a good balance between CPU usage and compression ratio.


Optional String

Default: all

The number of acknowledgments the producer requires the leader to have received before considering a request complete. Options: all (maximum durability), one (faster), or none (fire and forget).


Optional Integer

Default: 5

The number of times the Kafka producer will attempt to resend a message if the initial write fails.


Optional Duration

Default: 10s

The maximum time to wait for a successful write acknowledgment from the Kafka broker.


Optional Duration

Default: 10s

The maximum time to wait for a response when reading from the Kafka broker (e.g., during metadata updates).


Optional Duration

Default: 10s

The maximum time allowed to establish the initial TCP connection and perform the metadata exchange with the Kafka brokers before the engine reports a connection failure.


These settings control how the Relay exports traces and metrics to your observability stack.


Optional String

Default: none

Defines the destination for trace data. Supported options: otlp, console, or none.


Optional Duration

Default: 60000 (60s)

The interval (in milliseconds) between metric flushes.


Optional String

Default: none

Defines the destination for metric data. Supported options: otlp, prometheus, console, or none.


Optional URL

Default: http://localhost:4317

The address of your OpenTelemetry Collector or backend. Use the 4317 port for gRPC and 4318 for HTTP.


Optional String

Default: grpc

The transport protocol used for the OTLP exporter. Supported options: grpc or http/protobuf.
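Because the port and the protocol must agree, a small pre-flight check can catch a mismatched pair. The sketch below is a lint only, since a collector can be configured to listen on other ports. `conventionalPort` and `checkEndpoint` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
)

// conventionalPort maps each supported protocol to the port named above.
var conventionalPort = map[string]string{
	"grpc":          "4317",
	"http/protobuf": "4318",
}

// checkEndpoint returns an error when the endpoint's port does not match
// the conventional port for the protocol.
func checkEndpoint(protocol, endpoint string) error {
	want, ok := conventionalPort[protocol]
	if !ok {
		return fmt.Errorf("unsupported protocol %q", protocol)
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return fmt.Errorf("invalid endpoint %q: %w", endpoint, err)
	}
	if got := u.Port(); got != want {
		return fmt.Errorf("protocol %s conventionally uses port %s, endpoint has %q",
			protocol, want, got)
	}
	return nil
}

func main() {
	fmt.Println(checkEndpoint("grpc", "http://localhost:4317"))
	fmt.Println(checkEndpoint("http/protobuf", "http://localhost:4317"))
}
```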


Optional Duration

Default: 30000 (30s)

The maximum time the OpenTelemetry SDK will wait for a batch of spans to be exported before timing out the request.


Optional Duration

Default: 5000 (5s)

The interval between two consecutive batch exports. Reducing this value decreases memory usage by clearing spans more frequently, but increases CPU and network overhead.
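Unlike the Relay's own duration settings (for example 500ms or 3m), the OpenTelemetry values above are plain integers in milliseconds. The following sketch converts such a value to a Go duration. `parseMillis` is a hypothetical helper, and the OpenTelemetry SDK performs its own parsing.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseMillis converts an integer millisecond string such as "30000"
// into a time.Duration. Values with unit suffixes are rejected.
func parseMillis(raw string) (time.Duration, error) {
	ms, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("not an integer millisecond value: %q", raw)
	}
	if ms < 0 {
		return 0, fmt.Errorf("negative duration: %d", ms)
	}
	return time.Duration(ms) * time.Millisecond, nil
}

func main() {
	for _, raw := range []string{"60000", "30000", "5000", "5s"} {
		d, err := parseMillis(raw)
		fmt.Println(raw, d, err)
	}
}
```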