Job Maintenance#
Oban stores all jobs in the database, which offers several advantages:
Durability: Jobs survive application restarts and crashes
Visibility: Administrators can inspect job status and history
Accountability: Complete audit trail of job execution
Pruning Jobs#
Oban automatically prunes completed, cancelled, and discarded jobs after a configurable
period (1 day, by default) to prevent the table from growing indefinitely. A pruner periodically
runs in the background to delete jobs that are in a final state and have exceeded the retention
period.
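The selection rule described above can be sketched in plain Python. This is an illustrative model only, not Oban's implementation: the `Job` dataclass, its `finished_at` field, the `select_prunable` helper, and the oldest-first ordering are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# States the text describes as final; only these are eligible for pruning.
FINAL_STATES = {"completed", "cancelled", "discarded"}


@dataclass
class Job:
    id: int
    state: str
    # Hypothetical field: when the job reached its final state.
    finished_at: Optional[datetime]


def select_prunable(jobs, now, limit, max_age=86_400):
    """Return up to `limit` final-state jobs older than `max_age` seconds."""
    cutoff = now - timedelta(seconds=max_age)
    expired = [
        job
        for job in jobs
        if job.state in FINAL_STATES
        and job.finished_at is not None
        and job.finished_at < cutoff
    ]
    # Assumption: the oldest jobs are removed first.
    expired.sort(key=lambda job: job.finished_at)
    return expired[:limit]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
jobs = [
    Job(1, "completed", now - timedelta(days=2)),   # final and old: prunable
    Job(2, "completed", now - timedelta(hours=1)),  # final but recent: kept
    Job(3, "executing", None),                      # not final: never pruned
    Job(4, "discarded", now - timedelta(days=3)),   # final and old: prunable
]

prunable_ids = [job.id for job in select_prunable(jobs, now, limit=50_000)]
print(prunable_ids)  # [4, 1]
```

With `limit=1`, only the oldest eligible job (id 4) would be selected in a single run, and the remainder would wait for the next run.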
Configuring Pruning#
You can customize pruning behavior in oban.toml:
[pruner]
max_age = 3_600 # Keep jobs for 1 hour
limit = 50_000 # Delete up to 50k jobs per run
Or programmatically when running embedded:
oban = Oban(
    pool=pool,
    queues={"default": 10},
    pruner={"max_age": 3_600, "limit": 50_000}
)
Rescuing Jobs#
During deployments or unexpected restarts, jobs may be left in an executing state indefinitely. We call these jobs “orphans”, but orphaning isn’t a bad thing: it means the job wasn’t lost and can be retried when the system comes back online.
The “lifeline” process automatically rescues orphaned jobs by periodically checking for jobs stuck
in the executing state for too long and moving them back to available so they can run again.
Lifeline is enabled by default and runs every 60 seconds, rescuing jobs that have been executing for more than 5 minutes.
How Rescue Works#
Oban uses a timeout-based rescue strategy: jobs are rescued if their attempted_at timestamp
is older than the configured rescue_after threshold (default: 300 seconds). This approach works
reliably across nodes and doesn’t require coordination between them.
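The timeout check can be modelled in a few lines of Python. This is a simplified sketch rather than Oban's actual query: jobs are represented as plain dictionaries, and `rescue_orphans` is a hypothetical helper name. Only the `executing` and `available` states, the `attempted_at` timestamp, and the 300-second default come from the text.

```python
from datetime import datetime, timedelta, timezone


def rescue_orphans(jobs, now, rescue_after=300):
    """Move jobs stuck in `executing` past the threshold back to `available`."""
    cutoff = now - timedelta(seconds=rescue_after)
    rescued = []
    for job in jobs:
        # A job counts as orphaned purely by age; no node coordination is needed.
        if job["state"] == "executing" and job["attempted_at"] < cutoff:
            job["state"] = "available"
            rescued.append(job["id"])
    return rescued


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
jobs = [
    {"id": 1, "state": "executing", "attempted_at": now - timedelta(minutes=10)},
    {"id": 2, "state": "executing", "attempted_at": now - timedelta(minutes=1)},
    {"id": 3, "state": "completed", "attempted_at": now - timedelta(hours=1)},
]

rescued_ids = rescue_orphans(jobs, now)
print(rescued_ids)  # [1]
```

Because the check relies only on elapsed time, job 1 would be rescued even if a worker were still legitimately running it.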
For more accurate rescue that detects crashed producers immediately, and won’t rescue jobs that are still legitimately running, see Oban Pro’s Accurate Rescue.
Configuring Lifeline#
You can customize lifeline behavior in oban.toml:
[lifeline]
interval = 30 # Check for orphaned jobs every 30 seconds
rescue_after = 600 # Rescue jobs executing for more than 10 minutes
Or programmatically when running embedded:
oban = Oban(
    pool=pool,
    queues={"default": 10},
    lifeline={"interval": 30, "rescue_after": 600}
)
Choosing rescue_after#
The rescue_after value should be longer than your longest-running job. If you have jobs that
legitimately run for 10 minutes, set rescue_after to at least 15 minutes (900 seconds) to avoid
premature rescue.
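As a rough aid, the threshold can be derived from the longest expected job duration plus a safety margin. The `suggest_rescue_after` helper below is hypothetical, and the 1.5 margin is an assumption taken from the example above (10 minutes becomes 15 minutes), not a rule defined by Oban.

```python
import math


def suggest_rescue_after(longest_job_seconds, margin=1.5):
    """Return a rescue_after value (in seconds) with headroom over the longest job."""
    return math.ceil(longest_job_seconds * margin)


# A job that legitimately runs for 10 minutes (600 seconds).
rescue_after = suggest_rescue_after(600)
print(rescue_after)  # 900
```

A larger margin reduces the chance of premature rescue, but orphaned jobs then wait longer before they run again.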
Maintenance Guidelines#
All limits are soft; jobs beyond the specified age may not be pruned immediately. In other words, pruning is best-effort and performed out-of-band.
Pruning is only applied to jobs that are completed, cancelled, or discarded. It’ll never delete jobs in an incomplete state.
For high-volume systems, consider reducing max_age to keep the jobs table smaller, or increasing limit to prune more jobs per run.
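To reason about these settings on a high-volume system, a back-of-the-envelope estimate can help. The sketch below uses hypothetical helper names and assumes a steady job rate. The `prune_interval` parameter (seconds between pruner runs) is an assumption for illustration; the text does not state how often the pruner runs.

```python
def estimate_retained_jobs(jobs_per_second, max_age):
    """Approximate number of finished jobs kept in the table at steady state."""
    return jobs_per_second * max_age


def pruner_keeps_up(jobs_per_second, limit, prune_interval):
    """True if one run can delete at least as many jobs as finish between runs."""
    return limit >= jobs_per_second * prune_interval


# 50 jobs per second with the default retention of 1 day vs. 1 hour.
retained_default = estimate_retained_jobs(50, max_age=86_400)
retained_short = estimate_retained_jobs(50, max_age=3_600)
print(retained_default)  # 4320000
print(retained_short)    # 180000

# Assumed 60-second prune interval (not taken from the documentation).
ok_at_50 = pruner_keeps_up(50, limit=50_000, prune_interval=60)
ok_at_1000 = pruner_keeps_up(1_000, limit=50_000, prune_interval=60)
print(ok_at_50, ok_at_1000)  # True False
```

Under these assumptions, lowering max_age shrinks the table, while raising limit matters once the job rate outpaces what a single run can delete.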