Job Maintenance

Oban stores all jobs in the database, which offers several advantages:

Durability: Jobs survive application restarts and crashes
Visibility: Administrators can inspect job status and history
Accountability: Complete audit trail of job execution

Pruning Jobs

Oban automatically prunes completed, cancelled, and discarded jobs after a configurable period (1 day, by default) to prevent the table from growing indefinitely. A pruner periodically runs in the background to delete jobs that are in a final state and have exceeded the retention period.
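The selection rule described above can be sketched in plain Python. This is only an illustration of the idea, not Oban's implementation: the `Job` class, the `finished_at` field, and the `select_prunable` helper are hypothetical names, and the real pruner works against the database rather than an in-memory list. The `max_age` and `limit` parameters mirror the configuration options of the same names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# States the text describes as final and therefore eligible for pruning.
FINAL_STATES = {"completed", "cancelled", "discarded"}


@dataclass
class Job:
    id: int
    state: str
    finished_at: Optional[datetime]  # hypothetical field: when the job reached a final state


def select_prunable(jobs: List[Job], now: datetime, limit: int, max_age: int = 86_400) -> List[Job]:
    """Return up to `limit` jobs in a final state that are older than `max_age` seconds."""
    cutoff = now - timedelta(seconds=max_age)
    expired = [
        job
        for job in jobs
        if job.state in FINAL_STATES
        and job.finished_at is not None
        and job.finished_at < cutoff
    ]
    # Oldest first, so a capped run removes the stalest jobs (an assumption of this sketch).
    expired.sort(key=lambda job: job.finished_at)
    return expired[:limit]


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
jobs = [
    Job(1, "completed", now - timedelta(days=2)),
    Job(2, "completed", now - timedelta(hours=1)),
    Job(3, "executing", None),
    Job(4, "discarded", now - timedelta(days=3)),
]
prunable_ids = [job.id for job in select_prunable(jobs, now, limit=10)]
```

With the default one-day retention, only jobs 1 and 4 qualify: job 2 finished too recently and job 3 is not in a final state.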

Configuring Pruning

You can customize pruning behavior in oban.toml:

[pruner]
max_age = 3_600 # Keep jobs for 1 hour
limit = 50_000  # Delete up to 50k jobs per run

Or programmatically when running embedded:

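from oban import Oban

# `pool` is an existing database connection pool created elsewhere
oban = Oban(
    pool=pool,
    queues={"default": 10},
    pruner={"max_age": 3_600, "limit": 50_000},  # keep 1 hour, delete up to 50k per run
)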

Rescuing Jobs

During deployments or unexpected restarts, jobs may be left in the executing state indefinitely. We call these jobs “orphans”, but orphaning isn’t a bad thing: it means the job wasn’t lost and can be retried when the system comes back online.

The “lifeline” process automatically rescues orphaned jobs by periodically checking for jobs stuck in the executing state for too long and moving them back to available so they can run again.

Lifeline is enabled by default and runs every 60 seconds, rescuing jobs that have been executing for more than 5 minutes.

How Rescue Works

Oban uses a timeout-based rescue strategy: jobs are rescued if their attempted_at timestamp is older than the configured rescue_after threshold (default: 300 seconds). This approach works reliably across nodes and requires no coordination between them.
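A minimal sketch of that timeout check, assuming an in-memory list of jobs in place of the database table. The `Job` class and the `rescue_orphans` helper are hypothetical and exist only for illustration; `attempted_at` and `rescue_after` follow the names used in the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Job:
    id: int
    state: str
    attempted_at: Optional[datetime]


def rescue_orphans(jobs: List[Job], now: datetime, rescue_after: int = 300) -> List[int]:
    """Move jobs stuck in `executing` for longer than `rescue_after` seconds back to `available`."""
    cutoff = now - timedelta(seconds=rescue_after)
    rescued = []
    for job in jobs:
        stuck = (
            job.state == "executing"
            and job.attempted_at is not None
            and job.attempted_at < cutoff
        )
        if stuck:
            job.state = "available"  # the job can be picked up and run again
            rescued.append(job.id)
    return rescued


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
jobs = [
    Job(1, "executing", now - timedelta(seconds=600)),  # past the threshold
    Job(2, "executing", now - timedelta(seconds=60)),   # still within the threshold
    Job(3, "completed", now - timedelta(seconds=900)),  # not executing, ignored
]
rescued_ids = rescue_orphans(jobs, now)
```

The check relies only on the timestamp and the current time, which is why no coordination between nodes is needed. The trade-off is that a job legitimately running longer than `rescue_after` looks identical to an orphan.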

For more accurate rescues that detect crashed producers immediately and won’t rescue jobs that are still legitimately running, see Oban Pro’s Accurate Rescue.

Configuring Lifeline

You can customize lifeline behavior in oban.toml:

[lifeline]
interval = 30       # Check for orphaned jobs every 30 seconds
rescue_after = 600  # Rescue jobs executing for more than 10 minutes
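When choosing these two values, it can help to work out how long an orphaned job may wait before it is rescued. The arithmetic below is a rough sketch that assumes checks fire exactly every `interval` seconds and that a job is rescued on the first check after it crosses the `rescue_after` threshold. The `rescue_window` helper is hypothetical and not part of Oban.

```python
from typing import Tuple


def rescue_window(interval: int, rescue_after: int) -> Tuple[int, int]:
    """Return the (earliest, latest) seconds after attempted_at at which an orphan is rescued."""
    earliest = rescue_after           # a check lands right as the threshold is crossed
    latest = rescue_after + interval  # the threshold is crossed just after a check ran
    return earliest, latest


default_window = rescue_window(interval=60, rescue_after=300)
custom_window = rescue_window(interval=30, rescue_after=600)
```

Under these assumptions the defaults rescue an orphan between 300 and 360 seconds after `attempted_at`, and the configuration above between 600 and 630 seconds. The window is measured from when the job started executing, not from when its node went away.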