
Core Concepts

[Figure: OpenOutbox architecture diagram]

Open Outbox Relay is a stateless bridge between your database and your infrastructure. It is built on three core pillars:

The Storage is your source of truth. It is the database table where your application writes events. The Relay treats the database as a task queue, using a Lease Model to ensure that multiple Relay instances can work in parallel without stepping on each other.

The Publisher is the destination for your events (e.g., Kafka, NATS). The Relay is designed to be “Broker Agnostic”—it maps your event_type to topics or subjects and ensures that headers and partition keys are preserved.
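As an illustration of this mapping, here is a minimal sketch. Only `event_type` comes from the text above; the `OutboxEvent` and `BrokerMessage` shapes, the `ROUTES` table, and the `DEFAULT_TOPIC` fallback are hypothetical names chosen for the example, not the Relay's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class OutboxEvent:
    # Hypothetical shape of an outbox row; only event_type is named in the docs.
    event_type: str
    payload: bytes
    partition_key: Optional[str] = None
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class BrokerMessage:
    # Hypothetical broker-neutral message: topic (Kafka) or subject (NATS).
    topic: str
    key: Optional[str]
    value: bytes
    headers: Dict[str, str]


# Hypothetical routing table: event_type -> topic or subject.
ROUTES = {
    "order.created": "orders",
    "order.cancelled": "orders",
    "user.registered": "users",
}
DEFAULT_TOPIC = "outbox.unrouted"  # assumed fallback for unmapped event types


def to_broker_message(event: OutboxEvent) -> BrokerMessage:
    """Map an event to a broker message, preserving partition key and headers."""
    return BrokerMessage(
        topic=ROUTES.get(event.event_type, DEFAULT_TOPIC),
        key=event.partition_key,
        value=event.payload,
        headers=dict(event.headers),  # copy so the caller's dict is not shared
    )
```

Keeping the routing in a plain lookup table means the same event can be sent to a different broker by swapping only the code that consumes `BrokerMessage`.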

The Relay is the stateless worker. It handles three main loops:

  • The Poller: Finds and claims batches of PENDING events.
  • The Dispatcher: Sends events to the Publisher and handles retries/backoff.
  • The Reaper: A self-healing background task that finds “zombie” events (stuck in DELIVERING state) and resets them for a new attempt.

Every event moves through a lifecycle managed by the Relay's three loops: the Poller, the Dispatcher, and the Reaper.

| State      | Role       | Description |
|------------|------------|-------------|
| PENDING    | Waiting    | The initial state or a state returned to after a failure or a “stalled” lease recovery. |
| DELIVERING | Processing | The event is currently locked by a Relay and being sent to the broker. |
| DELIVERED  | Final      | Success. The broker acknowledged the message. |
| DEAD       | Final      | Failure. The event exceeded RETRY_MAX_ATTEMPTS or faced a non-retriable error and requires manual review. |
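The four states can be read as a small state machine. The sketch below encodes the transitions implied by the descriptions above; the `State` enum and the `TRANSITIONS` map are illustrative names, not part of the Relay itself.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    DELIVERING = "DELIVERING"
    DELIVERED = "DELIVERED"
    DEAD = "DEAD"


# Allowed transitions, as read from the state descriptions (an interpretation).
TRANSITIONS = {
    # A Relay claims the event.
    State.PENDING: {State.DELIVERING},
    # Broker ack, retry or lease recovery, or permanent failure.
    State.DELIVERING: {State.DELIVERED, State.PENDING, State.DEAD},
    # Final states have no outgoing transitions.
    State.DELIVERED: set(),
    State.DEAD: set(),
}


def can_transition(src: State, dst: State) -> bool:
    """Return True if an event may move from src to dst."""
    return dst in TRANSITIONS[src]


def is_final(state: State) -> bool:
    """A state is final when nothing can follow it."""
    return not TRANSITIONS[state]
```

For example, `can_transition(State.PENDING, State.DELIVERED)` is false: an event has to be claimed (DELIVERING) before it can be marked as delivered.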

The movement between these states is governed by three specific actions:

The Relay finds events that are PENDING (and ready based on available_at). It updates them to DELIVERING to start the work.
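A minimal sketch of such a claim step, using an in-process SQLite table so it runs anywhere. The table name `outbox` and the columns `status`, `locked_by`, and `locked_at` are assumptions for the example; only `available_at` and the state names come from the text. On a server database such as PostgreSQL, this kind of claim is commonly written with `SELECT ... FOR UPDATE SKIP LOCKED`, but the source does not say which technique the Relay uses.

```python
import sqlite3
import time

# Hypothetical schema; only available_at and the state names come from the docs.
SCHEMA = """
CREATE TABLE outbox (
    id           INTEGER PRIMARY KEY,
    event_type   TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'PENDING',
    available_at REAL NOT NULL DEFAULT 0,
    locked_by    TEXT,
    locked_at    REAL
)
"""


def claim_batch(conn, relay_id, batch_size, now=None):
    """Move up to batch_size ready PENDING rows to DELIVERING; return their ids."""
    now = time.time() if now is None else now
    claimed = []
    with conn:  # commits on success, rolls back if an exception is raised
        rows = conn.execute(
            "SELECT id FROM outbox "
            "WHERE status = 'PENDING' AND available_at <= ? "
            "ORDER BY id LIMIT ?",
            (now, batch_size),
        ).fetchall()
        for (event_id,) in rows:
            # The status guard makes the claim conditional: if another worker
            # already took the row, rowcount is 0 and we skip it.
            cur = conn.execute(
                "UPDATE outbox "
                "SET status = 'DELIVERING', locked_by = ?, locked_at = ? "
                "WHERE id = ? AND status = 'PENDING'",
                (relay_id, now, event_id),
            )
            if cur.rowcount == 1:
                claimed.append(event_id)
    return claimed
```

Recording `locked_at` at claim time is what later lets a recovery task decide whether a lease has gone stale.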

Once the Relay attempts to publish the event:

  • Success: The event is moved to the final DELIVERED state.
  • Failure: The attempts count increases. If the count is still below the limit, the event is moved back to PENDING with a backoff delay. If the count has reached the limit, or the error is not retryable, the event is moved to the final DEAD state.
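The decision after a publish attempt can be sketched as a pure function. The exponential backoff formula, its `base` and `cap` values, and the names `Outcome`, `backoff_delay`, and `settle` are assumptions for illustration; the source only says that a backoff delay is applied and that a limit exists (RETRY_MAX_ATTEMPTS).

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    status: str          # "PENDING", "DELIVERED" or "DEAD"
    attempts: int        # attempts count after this publish attempt
    available_at: float  # earliest time the event may be claimed again


def backoff_delay(attempts, base=1.0, cap=60.0):
    """Exponential backoff, base * 2^(attempts - 1), capped (illustrative values)."""
    return min(cap, base * 2 ** (attempts - 1))


def settle(attempts, max_attempts, now, success, retryable=True):
    """Decide the next state of an event after one publish attempt."""
    if success:
        # Assumption: a successful publish leaves the attempts count unchanged.
        return Outcome("DELIVERED", attempts, now)
    attempts += 1  # every failure increases the attempts count
    if not retryable or attempts >= max_attempts:
        return Outcome("DEAD", attempts, now)
    # Still below the limit: back to PENDING, but not claimable until later.
    return Outcome("PENDING", attempts, now + backoff_delay(attempts))
```

With `max_attempts=3`, the first two failures return the event to PENDING with delays of 1 and 2 seconds, and the third failure moves it to DEAD.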

If a Relay crashes while holding a batch, or the publishing takes longer than LEASE_TIMEOUT, those events stay stuck in DELIVERING. The Reaper identifies these “stale” leases and resets them to PENDING so another worker can try again.
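A minimal sketch of this recovery step, again on SQLite so it is runnable as-is. The table and column names (`outbox`, `status`, `locked_by`, `locked_at`) are assumptions for the example, and `lease_timeout` stands in for LEASE_TIMEOUT. Whether the real Reaper also increments the attempts count is not stated in the source, so this sketch leaves it out.

```python
import sqlite3
import time

# Hypothetical minimal schema for the example.
SCHEMA = """
CREATE TABLE outbox (
    id        INTEGER PRIMARY KEY,
    status    TEXT NOT NULL DEFAULT 'PENDING',
    locked_by TEXT,
    locked_at REAL
)
"""


def reap_stale_leases(conn, lease_timeout, now=None):
    """Reset DELIVERING rows whose lease is older than lease_timeout seconds.

    Returns the number of rows moved back to PENDING.
    """
    now = time.time() if now is None else now
    cutoff = now - lease_timeout
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE outbox "
            "SET status = 'PENDING', locked_by = NULL, locked_at = NULL "
            "WHERE status = 'DELIVERING' AND locked_at < ?",
            (cutoff,),
        )
    return cur.rowcount
```

Because the reset is a single conditional UPDATE, running it from several Relay instances at once is harmless: a row that one instance has already reset no longer matches the `DELIVERING` filter for the others.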