Configuration Reference
All configuration is handled via environment variables. You can set these in your docker-compose.yml, Kubernetes Deployment, or local shell.
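As a rough illustration of how settings like these can be read with fallbacks, the sketch below looks up a variable and falls back to the documented default. The helper names `envOr` and `envDuration` and the `lookup` type are hypothetical and are not taken from the Relay's source; duration values such as 500ms or 3m use Go's standard duration syntax.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// lookup abstracts os.Getenv so the helpers can be exercised without
// touching the real process environment. (Hypothetical type.)
type lookup func(key string) string

// envOr returns the value of key, or fallback when it is unset or empty.
func envOr(get lookup, key, fallback string) string {
	if v := get(key); v != "" {
		return v
	}
	return fallback
}

// envDuration parses key as a Go duration string such as "500ms" or "3m".
// It returns fallback when the variable is unset, empty, or malformed.
func envDuration(get lookup, key string, fallback time.Duration) time.Duration {
	d, err := time.ParseDuration(get(key))
	if err != nil {
		return fallback
	}
	return d
}

func main() {
	fmt.Println(envOr(os.Getenv, "PUBLISHER_TYPE", "stdout"))
	fmt.Println(envDuration(os.Getenv, "POLL_INTERVAL", 500*time.Millisecond))
}
```

Silently falling back on a malformed duration is a choice made for brevity here; a real service may prefer to fail at startup instead.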
Core Infrastructure Configuration
Basic connectivity and engine settings. These are the minimum settings required to run the Relay.
STORAGE_TYPE
Default: postgres
The database engine type used for the outbox table. Currently, PostgreSQL is the primary supported engine.
PUBLISHER_TYPE
Default: stdout
The target message broker where events will be delivered. Supported options: nats, kafka, redis, stdout, null.
STORAGE_URL
Default: —
The full connection string for your source database.
Example: postgres://user:pass@host:5432/db
PUBLISHER_URL
Default: —
The connection address for your message broker.
Example: localhost:9092 (Kafka) or nats://localhost:4222 (NATS).
STORAGE_TABLE_NAME
Default: openoutbox_events
The name of the database table where outbox events are stored. This allows the Relay to operate within shared databases while avoiding naming conflicts.
Pattern: ^[a-z0-9_]+$ (Lowercase letters, numbers, and underscores only).
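A table name can be checked against the documented pattern before deployment. This is a minimal sketch; the function name `validTableName` is hypothetical, and only the pattern itself comes from this reference.

```go
package main

import (
	"fmt"
	"regexp"
)

// tableNamePattern mirrors the documented pattern for STORAGE_TABLE_NAME.
var tableNamePattern = regexp.MustCompile(`^[a-z0-9_]+$`)

// validTableName reports whether name contains only lowercase letters,
// digits, and underscores, and is not empty.
func validTableName(name string) bool {
	return tableNamePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"openoutbox_events", "Orders-Outbox", ""} {
		fmt.Printf("%q valid: %v\n", name, validTableName(name))
	}
}
```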
RELAY_ID
Default: os.Hostname() + "-" + uuid[:4]
A unique identifier for this specific Relay instance. This ID is used for database lease management, metric partitioning, and distributed locking.
Behavior: If left empty, the engine automatically generates an ID using the system hostname and a short unique suffix to prevent collisions in high-density environments.
Orchestration: When running in Kubernetes or Docker Compose, it is recommended to leave this empty to allow the engine to generate a unique identity per container, or map it to the Pod/Service name via the downward API.
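The fallback described above can be approximated as follows. This sketch is an assumption-laden stand-in: it uses two random bytes rendered as four hex characters instead of a real UUID prefix so that it needs only the standard library, and the function name `relayID` is hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
)

// relayID returns configured when it is non-empty. Otherwise it derives an
// ID from the hostname plus a four-character random suffix.
func relayID(configured string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	host, err := os.Hostname()
	if err != nil {
		return "", err
	}
	suffix := make([]byte, 2) // 2 random bytes become 4 hex characters
	if _, err := rand.Read(suffix); err != nil {
		return "", err
	}
	return host + "-" + hex.EncodeToString(suffix), nil
}

func main() {
	id, err := relayID(os.Getenv("RELAY_ID"))
	if err != nil {
		fmt.Println("could not derive relay ID:", err)
		return
	}
	fmt.Println(id)
}
```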
ENVIRONMENT
Default: production
The execution mode for the Relay. Use development for more verbose local logging and debugging.
Engine Tuning
Controls how the Relay polls the database and manages event ownership.
POLL_INTERVAL
Default: 500ms
The idle-state delay. The Relay operates in Draining Mode (processing batches back-to-back without sleeping) until one of the following occurs:
- Empty Batch: No more PENDING events are found in the database.
- Process Error: An error occurs during the polling or delivery cycle.
In these cases, the Relay sleeps for the POLL_INTERVAL duration before retrying.
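The draining behavior can be pictured as an inner loop that runs back-to-back and an outer loop that sleeps. The sketch below is illustrative only: `drain`, `errUnavailable`, and the simulated batch sizes are hypothetical, and the real engine's control flow may differ.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable is a hypothetical error used to simulate a failed cycle.
var errUnavailable = errors.New("storage unavailable")

// drain calls process back-to-back, without sleeping, for as long as it
// returns a non-empty batch. It stops on an empty batch or on an error and
// returns the number of batches handled.
func drain(process func() (int, error)) (int, error) {
	batches := 0
	for {
		n, err := process()
		if err != nil {
			return batches, err // process error: the caller sleeps, then retries
		}
		if n == 0 {
			return batches, nil // empty batch: the caller sleeps, then polls again
		}
		batches++
	}
}

func main() {
	sizes := []int{100, 100, 37, 0} // simulated sizes of successive batches
	i := 0
	process := func() (int, error) {
		if i >= len(sizes) {
			return 0, errUnavailable
		}
		n := sizes[i]
		i++
		return n, nil
	}

	pollInterval := 10 * time.Millisecond // stand-in for POLL_INTERVAL
	for cycle := 0; cycle < 2; cycle++ {
		batches, err := drain(process)
		fmt.Printf("cycle %d: %d batches, err=%v\n", cycle, batches, err)
		time.Sleep(pollInterval) // idle-state delay before the next cycle
	}
}
```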
BATCH_SIZE
Default: 100
The maximum number of events the Relay will claim and attempt to process in a single polling iteration.
LEASE_TIMEOUT
Default: 3m
The maximum time an event may stay in the DELIVERING state before it is considered “stuck” (for example, because a worker crashed) and becomes eligible for a retry.
REAP_BATCH_SIZE
Default: 100
The number of expired leases the cleanup engine will reset back to PENDING in a single maintenance cycle.
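To show how LEASE_TIMEOUT and REAP_BATCH_SIZE interact, here is an in-memory sketch of one maintenance cycle. The `event` struct, the `reap` function, and the sample data are hypothetical; the Relay performs this work against the outbox table, not a slice.

```go
package main

import (
	"fmt"
	"time"
)

const leaseTimeout = 3 * time.Minute // LEASE_TIMEOUT default

// demoNow is a fixed clock value so the example is deterministic.
var demoNow = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// event is a simplified, hypothetical view of an outbox row.
type event struct {
	ID        int
	Status    string    // "PENDING" or "DELIVERING"
	ClaimedAt time.Time // when the current lease started
}

func sampleEvents(now time.Time) []event {
	return []event{
		{ID: 1, Status: "DELIVERING", ClaimedAt: now.Add(-10 * time.Minute)},
		{ID: 2, Status: "DELIVERING", ClaimedAt: now.Add(-1 * time.Minute)},
		{ID: 3, Status: "PENDING"},
		{ID: 4, Status: "DELIVERING", ClaimedAt: now.Add(-5 * time.Minute)},
	}
}

// reap resets at most limit events whose lease is older than timeout back
// to PENDING and returns how many it reset.
func reap(events []event, now time.Time, timeout time.Duration, limit int) int {
	reset := 0
	for i := range events {
		if reset >= limit {
			break
		}
		e := &events[i]
		if e.Status == "DELIVERING" && now.Sub(e.ClaimedAt) > timeout {
			e.Status = "PENDING"
			reset++
		}
	}
	return reset
}

func main() {
	events := sampleEvents(demoNow)
	n := reap(events, demoNow, leaseTimeout, 100)
	fmt.Println("leases reset:", n)
}
```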
HEALTH_CHECK_INTERVAL
Default: 5s
The frequency at which the engine performs background health probes on its dependencies (Storage and Publisher). This interval determines the “heartbeat” of the relay and controls how quickly the instance detects an outage and updates its operational state.
PUBLISHER_CONNECT_RETRY_INTERVAL
Default: 5s
The amount of time the engine waits between failed attempts to establish the initial connection to the message broker (NATS, Kafka, or Redis) during the startup phase.
ENABLE_STATS
Default: true
Determines whether the engine performs background database scans to calculate backlog metrics, such as the count of pending events and the age of the oldest record.
SERVER_PORT
Default: :8080
The address and port where the Relay’s HTTP server listens. Currently, this server only exposes statistics at /stats.
Supports standard Go address formats like :8080 or 127.0.0.1:8080.
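Both address forms can be split with the standard library's `net.SplitHostPort`. The wrapper below, `parseListenAddr`, is a hypothetical helper that additionally insists on a numeric port, which is stricter than what Go's listener itself accepts.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// parseListenAddr splits a value such as ":8080" or "127.0.0.1:8080" into
// host and port. An empty host means "listen on all interfaces".
func parseListenAddr(addr string) (string, int, error) {
	host, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return "", 0, err
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 0 || port > 65535 {
		return "", 0, fmt.Errorf("invalid port %q in SERVER_PORT", portStr)
	}
	return host, port, nil
}

func main() {
	for _, addr := range []string{":8080", "127.0.0.1:8080", "8080"} {
		host, port, err := parseListenAddr(addr)
		fmt.Printf("%q -> host=%q port=%d err=%v\n", addr, host, port, err)
	}
}
```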
Reliability & Backoff
Settings for handling failures and retries.
RETRY_MAX_ATTEMPTS
Default: 25
The maximum number of delivery attempts before an event is considered permanently failed. Once this limit is reached, the event’s status is set to DEAD, and it is retried only after manual intervention or a status reset.
RETRY_BASE_DELAY
Default: 1s
The starting wait time for the exponential backoff algorithm. This is the duration the Relay waits after the very first failure.
RETRY_MAX_DELAY
Default: 24h
The ceiling for exponential backoff. If the calculated delay for the next retry exceeds this value, the Relay caps the wait time at this limit. With the default of 24h, a failing event is therefore retried at least once per day.
RETRY_JITTER
Default: 0.15
A randomness factor (0.15 = 15%) applied to each backoff calculation. This prevents “thundering herd” issues by ensuring that multiple failed events don’t all retry at the exact same millisecond.
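The three backoff settings combine roughly as sketched below. Several details are assumptions, since this reference does not specify them: the delay is doubled on each attempt, jitter is applied symmetrically (up to 15% shorter or longer at the default), and the cap is applied last. The function name `backoff` is hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoff returns the wait before the next retry after `attempt` failed
// deliveries (attempt >= 1). rnd must return a value in [0, 1).
func backoff(attempt int, base, maxDelay time.Duration, jitter float64, rnd func() float64) time.Duration {
	delay := base
	// Double the delay per attempt, stopping once the ceiling is reached
	// so the value cannot overflow. (Doubling is an assumption.)
	for i := 1; i < attempt && delay < maxDelay; i++ {
		delay *= 2
	}
	// Map rnd from [0, 1) to a factor in [1-jitter, 1+jitter).
	factor := 1 + jitter*(2*rnd()-1)
	delay = time.Duration(float64(delay) * factor)
	if delay > maxDelay {
		delay = maxDelay
	}
	return delay
}

func main() {
	for attempt := 1; attempt <= 5; attempt++ {
		d := backoff(attempt, time.Second, 24*time.Hour, 0.15, rand.Float64)
		fmt.Printf("attempt %d: wait about %v\n", attempt, d)
	}
}
```

Under these assumptions and the default settings, the unjittered delay reaches the 24h ceiling after 18 attempts (1s doubled 17 times is about 36 hours), so the last few of the 25 default attempts would be spaced a day apart.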
NATS Specific Configurations
These settings only apply when PUBLISHER_TYPE="nats".
NATS_PUBLISH_TIMEOUT
Default: 5s
The maximum amount of time the Relay will wait for an acknowledgment from the NATS server (or JetStream) before timing out the request.
NATS_CONNECTION_TIMEOUT
Default: 10s
The deadline for the initial TCP handshake and protocol negotiation with the NATS server.
Redis Specific Configurations
These settings only apply when PUBLISHER_TYPE="redis".
REDIS_CONNECTION_TIMEOUT
Default: 5s
The maximum amount of time allowed for the initial connectivity check (Ping) when the Relay service starts. If the Relay cannot reach Redis within this window, the service will fail to start.
REDIS_WRITE_TIMEOUT
Default: 1s
The maximum time allowed for each write operation (XADD) to the Redis stream. This prevents a slow or lagging Redis instance from hanging the entire engine processing loop.
Kafka Specific Configurations
These settings only apply when PUBLISHER_TYPE="kafka".
KAFKA_MAX_ATTEMPTS
Default: 5
The number of times the Kafka producer will attempt to resend a message if the initial write fails.
KAFKA_WRITE_TIMEOUT
Default: 10s
The maximum time to wait for a successful write acknowledgment from the Kafka broker.
KAFKA_READ_TIMEOUT
Default: 10s
The maximum time to wait for a response when reading from the Kafka broker (e.g., during metadata updates).
KAFKA_CONNECTION_TIMEOUT
Default: 10s
The maximum time allowed to establish the initial TCP connection and perform the metadata exchange with the Kafka brokers before the engine reports a connection failure.
KAFKA_BATCH_SIZE
Default: 1
The maximum number of messages collected into a batch before the producer sends a request.
KAFKA_BATCH_BYTES
Default: 10485760 (10 MiB)
The maximum total size of a batch in bytes. The producer sends a request when either KAFKA_BATCH_SIZE or KAFKA_BATCH_BYTES is reached.
KAFKA_BATCH_TIMEOUT
Default: 10ms
The maximum time to wait before sending a partial batch if the size requirements haven’t been met.
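The three batch settings act as independent triggers: whichever threshold is reached first causes a send. The sketch below models that decision as a pure function. `batchConfig` and `shouldFlush` are hypothetical names, and the actual Kafka client's internal logic may differ.

```go
package main

import (
	"fmt"
	"time"
)

// batchConfig groups the three batching settings. (Hypothetical type.)
type batchConfig struct {
	Size    int           // KAFKA_BATCH_SIZE
	Bytes   int64         // KAFKA_BATCH_BYTES
	Timeout time.Duration // KAFKA_BATCH_TIMEOUT
}

// defaults holds the documented default values.
var defaults = batchConfig{Size: 1, Bytes: 10485760, Timeout: 10 * time.Millisecond}

// shouldFlush reports whether a batch holding count messages and bytes
// bytes, whose oldest message has waited for age, should be sent now.
func shouldFlush(cfg batchConfig, count int, bytes int64, age time.Duration) bool {
	if count == 0 {
		return false // nothing to send
	}
	return count >= cfg.Size || bytes >= cfg.Bytes || age >= cfg.Timeout
}

func main() {
	fmt.Println(shouldFlush(defaults, 1, 512, 0))

	tuned := batchConfig{Size: 500, Bytes: 1 << 20, Timeout: 50 * time.Millisecond}
	fmt.Println(shouldFlush(tuned, 20, 4096, 5*time.Millisecond))
	fmt.Println(shouldFlush(tuned, 20, 4096, 60*time.Millisecond))
}
```

With the default KAFKA_BATCH_SIZE of 1, the count threshold is met by every single message, so the byte and timeout limits only come into play once the batch size is raised.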
KAFKA_ASYNC
Default: false
When set to false, the Relay waits for an acknowledgment from Kafka before proceeding, which is required for guaranteed “At-Least-Once” delivery. Setting it to true increases throughput but risks message loss if the Relay crashes.
KAFKA_COMPRESSION
Default: snappy
The compression codec used for messages sent to Kafka. Supported options: none, gzip, snappy, lz4, zstd. snappy provides a good balance between CPU usage and compression ratio.
KAFKA_REQUIRED_ACKS
Default: all
The number of acknowledgments the producer requires the leader to have received before considering a request complete. Options: all (maximum durability), one (faster), or none (fire and forget).
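For readers used to the numeric acks values of the Kafka protocol, the named options correspond to -1 (all in-sync replicas), 1 (leader only), and 0 (no acknowledgment). The mapping function below is a hypothetical sketch, not the Relay's own parser.

```go
package main

import "fmt"

// requiredAcks translates a KAFKA_REQUIRED_ACKS value into the numeric
// acks value used by the Kafka protocol.
func requiredAcks(value string) (int, error) {
	switch value {
	case "all":
		return -1, nil // wait for all in-sync replicas
	case "one":
		return 1, nil // wait for the partition leader only
	case "none":
		return 0, nil // do not wait for any acknowledgment
	default:
		return 0, fmt.Errorf("unsupported KAFKA_REQUIRED_ACKS value %q", value)
	}
}

func main() {
	for _, v := range []string{"all", "one", "none", "two"} {
		n, err := requiredAcks(v)
		fmt.Printf("%s -> %d (err: %v)\n", v, n, err)
	}
}
```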
KAFKA_TLS_CA
Default: ""
The Certificate Authority (CA) certificate used to verify the identity of the Kafka brokers. Can be an inline raw PEM text block, a file path prefixed with file://, or a Base64-encoded string prefixed with base64://.
KAFKA_TLS_CERT
Default: ""
The client TLS certificate used for mutual authentication (mTLS). Must be provided alongside KAFKA_TLS_KEY. Can be an inline raw PEM text block, a file path prefixed with file://, or a Base64-encoded string prefixed with base64://.
KAFKA_TLS_KEY
Default: ""
The client private key used for mutual authentication (mTLS). Must be provided alongside KAFKA_TLS_CERT. Can be an inline raw PEM text block, a file path prefixed with file://, or a Base64-encoded string prefixed with base64://.
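The three accepted forms for KAFKA_TLS_CA, KAFKA_TLS_CERT, and KAFKA_TLS_KEY can be resolved by prefix. The sketch below shows one way to do that; `loadPEM` is a hypothetical helper, and standard (padded) Base64 is assumed for the base64:// form.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"os"
	"strings"
)

// loadPEM resolves a TLS setting value into raw PEM bytes.
// An empty value yields nil, meaning the setting is not configured.
func loadPEM(value string) ([]byte, error) {
	switch {
	case value == "":
		return nil, nil
	case strings.HasPrefix(value, "file://"):
		// Read the PEM data from the path after the prefix.
		return os.ReadFile(strings.TrimPrefix(value, "file://"))
	case strings.HasPrefix(value, "base64://"):
		// Decode the Base64 payload after the prefix.
		return base64.StdEncoding.DecodeString(strings.TrimPrefix(value, "base64://"))
	default:
		// Anything else is treated as inline PEM text.
		return []byte(value), nil
	}
}

func main() {
	pem, err := loadPEM(os.Getenv("KAFKA_TLS_CA"))
	if err != nil {
		fmt.Println("could not load KAFKA_TLS_CA:", err)
		return
	}
	fmt.Printf("loaded %d bytes of CA material\n", len(pem))
}
```

The base64:// form is convenient when a multi-line PEM block is awkward to pass through an environment variable.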
KAFKA_TLS_VERSION
Default: "1.2"
The minimum TLS version required for secure communication with the Kafka brokers. Supported values are "1.0", "1.1", "1.2", and "1.3".
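In Go, the version strings map onto the constants of the standard `crypto/tls` package. The following sketch shows that mapping; the function name `minTLSVersion` is hypothetical.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// minTLSVersion converts a KAFKA_TLS_VERSION value into the matching
// crypto/tls version constant.
func minTLSVersion(value string) (uint16, error) {
	switch value {
	case "1.0":
		return tls.VersionTLS10, nil
	case "1.1":
		return tls.VersionTLS11, nil
	case "1.2":
		return tls.VersionTLS12, nil
	case "1.3":
		return tls.VersionTLS13, nil
	default:
		return 0, fmt.Errorf("unsupported KAFKA_TLS_VERSION %q", value)
	}
}

func main() {
	v, err := minTLSVersion("1.2")
	if err != nil {
		fmt.Println(err)
		return
	}
	cfg := &tls.Config{MinVersion: v}
	fmt.Printf("minimum TLS version: %#04x\n", cfg.MinVersion)
}
```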
KAFKA_SERVER_NAME
Default: ""
The TLS Server Name Indication (SNI) used to verify the hostname on the broker’s certificate. Useful if your connection endpoint routing differs from the certificate’s Common Name (CN).
KAFKA_INSECURE
Default: false
Controls whether the client skips verifying the broker’s certificate chain and hostname. Set to true only for local testing or controlled environments.
KAFKA_SASL_MECHANISM
Default: ""
The SASL authentication mechanism used to connect to the brokers. Supported values are "plain", "scram-sha-256", and "scram-sha-512". Leave empty for unauthenticated connections.
KAFKA_USERNAME
Default: ""
The username credential used for SASL authentication. Required if KAFKA_SASL_MECHANISM is set.
KAFKA_PASSWORD
Default: ""
The password credential used for SASL authentication. Required if KAFKA_SASL_MECHANISM is set.
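The dependency between the three SASL settings can be checked before startup. This is a sketch under the assumption that mechanism names are matched exactly as written above; `validateSASL` and `errMissingCredentials` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissingCredentials is returned when a mechanism is set without both
// a username and a password.
var errMissingCredentials = errors.New(
	"KAFKA_USERNAME and KAFKA_PASSWORD are required when KAFKA_SASL_MECHANISM is set")

// validateSASL checks that the mechanism is supported and that credentials
// accompany it. An empty mechanism means an unauthenticated connection.
func validateSASL(mechanism, username, password string) error {
	switch mechanism {
	case "":
		return nil
	case "plain", "scram-sha-256", "scram-sha-512":
		if username == "" || password == "" {
			return errMissingCredentials
		}
		return nil
	default:
		return fmt.Errorf("unsupported KAFKA_SASL_MECHANISM %q", mechanism)
	}
}

func main() {
	fmt.Println(validateSASL("", "", ""))
	fmt.Println(validateSASL("scram-sha-512", "relay", ""))
	fmt.Println(validateSASL("gssapi", "relay", "secret"))
}
```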
KAFKA_IDLE_TIMEOUT
Default: 30s
The maximum amount of time an idle connection to a Kafka broker will remain open before the transport closes it. A value of zero keeps the idle connections active indefinitely.
KAFKA_KEEP_ALIVE
Default: 30s
The keep-alive period for an active network connection to a Kafka broker. A value of zero uses the default operating system keep-alive interval.
Observability (OpenTelemetry)
These settings control how the Relay exports traces and metrics to your observability stack.
OTEL_TRACES_EXPORTER
Default: none
Defines the destination for trace data. Supported options: otlp, console, or none.
OTEL_METRIC_EXPORT_INTERVAL
Default: 60000 (60s)
The interval (in milliseconds) between metric flushes.
OTEL_METRICS_EXPORTER
Default: none
Defines the destination for metric data. Supported options: otlp, prometheus, console, or none.
OTEL_EXPORTER_OTLP_ENDPOINT
Default: http://localhost:4317
The address of your OpenTelemetry Collector or backend. Use port 4317 for gRPC and port 4318 for HTTP.
OTEL_EXPORTER_OTLP_PROTOCOL
Default: grpc
The transport protocol used for the OTLP exporter. Supported options: grpc or http/protobuf.
OTEL_BSP_EXPORT_TIMEOUT
Default: 30000 (30s)
The maximum time the OpenTelemetry SDK will wait for a batch of spans to be exported before timing out the request.
OTEL_BSP_SCHEDULE_DELAY
Default: 5000 (5s)
The interval between two consecutive batch exports. Reducing this value decreases memory usage by clearing spans more frequently, but increases CPU and network overhead.
Remote Configuration
These settings allow the Relay to bootstrap its configuration from external key-value stores, enabling centralized management across multiple instances.
REMOTE_CONFIG_PROVIDER
Default: —
Determines the external source for centralized configuration. When set, the Relay will attempt to bootstrap its settings from the specified provider. Supported values are consul, etcd, etcd3, and firestore.
REMOTE_CONFIG_ENDPOINT
Default: —
The network address of the configuration provider. For example, localhost:8500 for a local Consul agent or etcd-cluster:2379 for an etcd deployment.
REMOTE_CONFIG_PATH
Default: —
The specific key or file path within the provider where the configuration blob is stored (e.g., config/relay.yaml).
REMOTE_CONFIG_TYPE
Default: yaml
Specifies the encoding format of the remote configuration blob. This ensures the Relay uses the correct parser after fetching the data. Supported values are yaml and json.