Changelog for Oban Pro v1.7
This release enhances workflows with sub-workflows and context sharing, overhauls queue partitioning for better performance, improves dynamic plugins, and adds various usability improvements.
See the v1.7 Upgrade Guide for complete upgrade steps and migration caveats.
🗂️ Workflow Tracking
Workflows now use a dedicated oban_workflows table to track workflow metadata in real-time.
Database triggers maintain accurate counts as jobs transition between states, replacing expensive
aggregation queries with simple lookups.
This enables workflow tracking for uniqueness, accurate stuck workflow rescuing, and Oban Web to display workflows using highly efficient queries.
Suspended State
Jobs waiting on workflow or chain dependencies now use a proper suspended state instead of the
previous on_hold pseudo-state. This provides cleaner semantics, better query performance through
simplified indexes, and enables the database triggers to track workflow state counts accurately.
Note that any in-flight workflows will continue to run normally, without any backfilling or data modification.
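Because suspended is now a real job state, dependency-held jobs can be found with an ordinary state query. A minimal sketch, assuming an Ecto repo named MyApp.Repo:

```elixir
import Ecto.Query

# Count jobs held by workflow or chain dependencies — formerly the
# on_hold pseudo-state, now the first-class "suspended" state.
suspended_count =
  MyApp.Repo.aggregate(
    from(j in Oban.Job, where: j.state == "suspended"),
    :count
  )
```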
Unique Workflows
Workflows can now be created with unique: true to prevent multiple workflows with the same name
from running concurrently:
Workflow.new(name: "daily-report", unique: true)
|> Workflow.add(:fetch, FetchWorker.new(%{}))
|> Workflow.add(:process, ProcessWorker.new(%{}), deps: [:fetch])
|> Oban.insert_all()

When a duplicate unique workflow is inserted, its jobs are marked with conflict?: true instead
of being inserted, similar to how unique jobs work.
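Duplicates can be detected after insertion by checking the conflict? flag on the returned jobs. A sketch, assuming the workers from the example above; the log message is illustrative:

```elixir
require Logger

jobs =
  Workflow.new(name: "daily-report", unique: true)
  |> Workflow.add(:fetch, FetchWorker.new(%{}))
  |> Workflow.add(:process, ProcessWorker.new(%{}), deps: [:fetch])
  |> Oban.insert_all()

# Jobs from a duplicate unique workflow carry conflict?: true rather
# than being inserted as new rows.
if Enum.any?(jobs, & &1.conflict?) do
  Logger.info("daily-report workflow is already running, skipped insert")
end
```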
⚖️ Rate Limiting Overhaul
Rate limiting gains multiple algorithms, variable job weights, and a dedicated module for interacting with rate limits outside of job execution.
Multiple Algorithms
Three algorithms are now available, each with different trade-offs:
Sliding Window—Uses weighted averaging across two time buckets for smooth rate limiting without bursts at window boundaries.
Fixed Window—Resets the count when each period expires. Simple and predictable, but allows bursting at boundaries (e.g., allowed jobs at 11:59, then allowed more at 12:00).
Token Bucket—Tokens refill continuously at allowed / period per second. Allows controlled bursting up to allowed while maintaining the overall rate. Ideal for APIs that permit short bursts but enforce sustained limits.
queues: [
sliding: [rate_limit: [allowed: 100, period: {1, :minute}, algorithm: :sliding_window]],
fixed: [rate_limit: [allowed: 100, period: {1, :minute}, algorithm: :fixed_window]],
bucket: [rate_limit: [allowed: 100, period: {1, :minute}, algorithm: :token_bucket]]
]

Weighted Jobs
Jobs can now consume variable amounts of rate limit capacity, with three ways to assign weights.
The simplest is a worker default, where all jobs from a worker consume 10 units of quota:
defmodule MyApp.HeavyWorker do
use Oban.Pro.Worker, rate: [weight: 10]
end

Slightly more dynamic is a job option, where you can override the weight at insert time:
MyApp.HeavyWorker.new(args, rate: [weight: 5])

Finally, the most flexible option is to calculate the weight dynamically at runtime with the new
c:weight/1 callback:
defmodule MyApp.BatchWorker do
use Oban.Pro.Worker
@impl Oban.Pro.Worker
def weight(%{args: %{"records" => records}}), do: length(records)
end

Rate Limit API
The new Oban.Pro.RateLimit module provides functions for interacting with rate limits outside of
job execution. There are functions to check availability, manually consume quota, or reset the
rate limit entirely. For example, to conditionally make a batch of API calls based on capacity:
case Oban.Pro.RateLimit.available(:my_queue) do
{:ok, capacity} when capacity >= count ->
:ok = Oban.Pro.RateLimit.consume(:my_queue, count)
make_api_calls()
{:ok, _capacity} ->
{:error, :insufficient_capacity}
end

Even simpler, there is a with_quota/4 helper that can execute a function after atomically
reserving capacity, with an optional timeout:
case Oban.Pro.RateLimit.with_quota(:my_queue, 5, &make_api_calls/0, timeout: 10_000) do
{:ok, result} -> handle_result(result)
{:error, :timeout} -> handle_timeout()
end

All rate limit operations are globally distributed and operate at the queue or partition level, sharing quota with a running queue.
📦 Chunk Overhaul
Chunks now use a pre-computed chunk_id for grouping, enabling a lightweight partial index for much faster
chunk lookups. This replaces expensive dynamic query construction based on partitioning fields
with a simple index-backed query.
Additionally, chunks use a single operation for acking all jobs in a chunk, reducing database round-trips when completing, cancelling, retrying, etc. The new acking operation also improves compatibility with non-Postgres databases like CockroachDB.
Legacy Chunk Jobs
Jobs created before v1.7 won't have a chunk_id in their metadata. The DynamicLifeline
plugin automatically computes and sets the chunk_id for these jobs, so no manual backfilling
is required.
Snooze Support
Chunks can now selectively snooze jobs to retry them after a delay. This is useful when some items in a chunk need to wait before retrying while others complete normally:
@impl Oban.Pro.Workers.Chunk
def process(jobs) do
{ready, not_ready} = Enum.split_with(jobs, &ready_to_process?/1)
process_jobs(ready)
if Enum.any?(not_ready) do
# Snooze jobs that aren't ready, complete the rest
{:snooze, {30, :seconds}, not_ready}
else
:ok
end
end

For mixed outcomes, snooze combines with other result types:

[cancel: {"invalid", invalid_jobs}, snooze: {{1, :minute}, retry_later}]

🪝 Global Cancel/Discard Hooks
Two new worker callbacks fire when jobs are cancelled or discarded outside of execution, regardless of how the state transition happens:
on_cancelled/2 — called when a job is cancelled due to :dependency (workflow dependency failed), :manual (via Oban.cancel_job/1), or :deadline (force-cancelled by deadline)
on_discarded/2 — called when a job is discarded after exhausting all retries (:exhausted), typically triggered by DynamicLifeline
defmodule MyApp.OrderWorker do
use Oban.Pro.Worker
@impl Oban.Pro.Worker
def on_cancelled(reason, job) do
MyApp.Notifications.order_cancelled(job.args["order_id"], reason)
:ok
end
@impl Oban.Pro.Worker
def on_discarded(:exhausted, job) do
MyApp.Notifications.order_failed(job.args["order_id"])
:ok
end
end

For broad concerns like logging or metrics, it's possible to attach a hook module globally so it applies to
all Oban.Pro.Worker modules, just like all other hooks.
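For instance, a metrics hook could be attached at application start via the attach_hook/1 function mentioned in the rc notes below. A sketch, assuming a hypothetical MyApp.CancelMetrics module and telemetry event name:

```elixir
defmodule MyApp.CancelMetrics do
  # Fires for every Oban.Pro.Worker job, regardless of which worker
  # defined it or how the cancellation was triggered.
  def on_cancelled(reason, job) do
    :telemetry.execute(
      [:my_app, :jobs, :cancelled],
      %{count: 1},
      %{reason: reason, worker: job.worker}
    )

    :ok
  end
end

# Typically called once from Application.start/2
:ok = Oban.Pro.Worker.attach_hook(MyApp.CancelMetrics)
```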
📇 Improved Indexes
The v1.7 migration includes numerous new and rebuilt indexes that aid performance for chains, chunks, workflows, and general operation while also reducing overall index sizes.
Partial Indexes
New partial indexes reduce index size and improve query performance by only indexing rows that match specific conditions. In addition to the new chunk index, it adds or rebuilds partial indexes for:
Staging index—indexes jobs ready to transition to available, enabling 2-10x faster staging queries depending on job volume and state distribution
Pruning indexes—separate partial indexes for completed_at, cancelled_at, and discarded_at on terminal job states, making cleanup queries faster with smaller indexes
Unique/partition indexes—recreated as partial indexes without reliance on generated columns to save space and avoid table locking migrations
No More Generated Columns
The uniq_key and partition_key generated columns introduced in v1.5/v1.6 are replaced with
expression indexes directly on the meta field. This eliminates table locking during migrations
from OSS Oban or older Pro versions, a significant improvement for applications with high
throughput oban_jobs tables.
The Upgrade Guide includes instructions for optional post-migration cleanup for unused transitional indexes and legacy generated columns.
v1.7.0-rc.0 — 2026-03-23
Enhancements
[Pro] Use suspended state for workflow and chain tracking
Jobs waiting on workflow or chain dependencies now use a proper suspended job state instead of the previous on_hold pseudo-state.
This provides cleaner state semantics, better query performance through simplified indexes, and enables the database triggers to track workflow state counts accurately. The scheduled_at timestamp is preserved directly on suspended jobs, eliminating the need for orig_scheduled_at in meta.
[Pro] Add usage rules for agentic coding assistants
Ship reference documents that help coding agents understand Pro's idioms and best practices. Rules cover workers, queues, composition primitives (workflows, batches, chains, chunks), plugins, and testing.
[Chunk] Optimize queries with centralized index and better job acking
Use a pre-computed chunk_id for chunk tracking, enabling a partial index for much faster chunk lookups. This eliminates dynamic query construction based on partitioning fields in favor of direct chunk_id matching.
Chunks now use a single SQL operation for acking, reducing database round-trips when completing, cancelling, or retrying jobs within a chunk. The new acking operation has better compatibility with non-Postgres databases such as CockroachDB.
[Chunk] Add :snooze support to chunk workers
Chunks can now selectively snooze jobs using {:snooze, period, jobs} or selectively snooze some jobs with snooze: {period, jobs} in the keyword list result. Snoozed jobs are rescheduled after the specified period, while unlisted jobs complete normally.
[DynamicCron] Add get/2 function for fetching entries by name
Provides a convenient way to retrieve a single cron entry without fetching all entries. Accepts either a worker module or custom string name and returns {:ok, entry} or {:error, message}.
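A sketch of how get/2 might be called, assuming the first argument is the Oban instance name (as with other plugin helpers); MyApp.ReportWorker and the "nightly-report" name are illustrative:

```elixir
alias Oban.Pro.Plugins.DynamicCron

# Fetch a single cron entry by worker module…
{:ok, entry} = DynamicCron.get(Oban, MyApp.ReportWorker)

# …or by its custom string name, handling the not-found case
case DynamicCron.get(Oban, "nightly-report") do
  {:ok, entry} -> entry
  {:error, message} -> raise message
end
```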
[DynamicLifeline] Improve workflow rescue accuracy and remove limit
The DynamicLifeline's workflow rescue mechanism now queries the aggregate table for workflows with suspended jobs, providing more accurate detection than scanning the jobs table for suspended jobs alone.
This catches edge cases where workflows are stuck but their suspended jobs may have been lost or deleted. Legacy on_hold workflows that predate the aggregate table are still found via a direct jobs table query.
Only workflows that have been executing for more than a minute are candidates for rescuing by default.
[DynamicLifeline] Automatically repair chunk jobs missing chunk_id
Chunk workers now use a pre-computed chunk_id for grouping. Jobs created before this change won't have a chunk_id in their metadata, which would prevent them from being grouped correctly.
The DynamicLifeline plugin now automatically computes and sets the chunk_id for any chunk jobs that are missing it, similar to how it repairs missing partition_key values for partitioned queues.
[DynamicPruner] Add configurable preserve_workflows option
Allow disabling workflow job preservation during pruning via the new preserve_workflows option, which defaults to true for backwards compatibility. When disabled, jobs are pruned regardless of whether their workflow is still active. This is useful for large workflows that naturally run longer than a pruning cycle.
[Migration] Replace generated columns with expression indexes
Generated columns for uniq_key and partition_key were originally added for CockroachDB compatibility but introduced unnecessary complexity and excessive table locking for large tables. This change replaces them with expression indexes directly on the meta JSONB fields.
A generated_columns migration option is still available for apps that are running CockroachDB and need the old functionality. The Smart engine detects which mode is being used and handles conflicts accordingly.
[Migration] Add partial indexes for pruning and staging
Partial indexes reduce index size and improve query performance by only indexing rows that match the filter condition. This adds partial indexes for terminal job states and a staging index for jobs ready to transition to available.
Also fixes completed job pruning to use completed_at instead of scheduled_at, which is the semantically correct timestamp for determining job age.
The new staging query may perform 2-10x faster depending on the number of jobs and overall state distribution.
[Rate Limit] Add centralized module for using rate limits outside of job execution
The module's functions allow checking, resetting, and consuming rate limits from running queues.
The consume/3 function is fully distributed and spreads consumption across multiple nodes when a single producer cannot satisfy the request.
[Smart] Add auto_space option to spread out bulk inserts
When inserting large batches of jobs, auto_space schedules each batch at increasing intervals. This prevents overwhelming queues when jobs can't all execute immediately.
[Smart] Add on_conflict: :skip option for bulk inserts
Support skipping unique conflicts without row locking during insert_all. When enabled, conflicting jobs are silently skipped and only newly inserted jobs are returned. This improves performance for high-throughput scenarios where tracking conflicts isn't needed.
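A sketch of how the option might be used with the Smart engine; MyApp.MailWorker and the args shape are illustrative:

```elixir
jobs = Enum.map(user_ids, &MyApp.MailWorker.new(%{"user_id" => &1}))

# With on_conflict: :skip, unique conflicts are skipped without row
# locking and only the newly inserted jobs are returned.
inserted = Oban.insert_all(jobs, on_conflict: :skip)
```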
[Smart] Add transaction option for bulk insert atomicity
Support transaction: :per_batch to commit each batch independently during insert_all/2. Previously inserted batches persist even if a later batch fails. The default transaction: :all preserves the existing all-or-nothing behavior.
[Smart] Add telemetry sub-spans for engine fetch_jobs
Instrument the fetch_jobs transaction with nested telemetry spans for granular observability into acking, flushing, demand calculation, and job fetching. Each sub-span emits standard span events under [:oban, :engine, :fetch_jobs, :ack | :flush | :demand | :fetch].
[Smart] Support selecting between multiple rate limiting algorithms
Rate limited queues can select from :sliding_window, :fixed_window, and :token_bucket algorithms to control how rate quotas are consumed. The :sliding_window algorithm remains the default.
[Smart] Add :fixed_window rate limiting algorithm
Introduce a fixed window algorithm, which resets the count when the period expires rather than using weighted averaging.
[Smart] Add token bucket rate limiting algorithm
Introduce a token bucket algorithm that refills tokens continuously at a fixed rate rather than resetting at period boundaries. Tokens refill at allowed / period per second, providing smoother rate limiting with natural burst handling.
[Worker] Add @impl declaration for worker __opts__/0
Mark Oban.Pro.Worker.__opts__/0 as implementing the new __opts__/0 public callback from Oban.Worker.
[Worker] Add on_cancelled/2 and on_discarded/2 worker hooks
Introduce two new worker callbacks that fire when jobs are cancelled or discarded, regardless of how the state transition happens:
on_cancelled/2 receives :dependency or :manual reason
on_discarded/2 receives :exhausted reason
Both callbacks work with global hooks via attach_hook/1 and module-level hooks via the :hooks option.
[Worker] Apply structured args to timeout/1 and backoff/1
Automatically apply encryption and structuring before calling user-defined timeout/1 or backoff/1 implementations. This allows pattern matching on structured args without any code changes.
[Worker] Variable weight rate limit tracking via options and callback
Add support for job weights in rate limiting, allowing jobs to consume variable amounts of rate limit capacity:
- Worker option: use Oban.Pro.Worker, rate: [weight: 5]
- Job option: Worker.new(args, rate: [weight: 3])
- Callback: weight/1 for dynamic weight calculation at dispatch time
Jobs with higher weights consume more rate limit capacity, enabling fine-grained control over resource-intensive operations.
[Workflow] Optimize workflow flushing with de-duplication
Restructure workflow flushing to compute dependency states once per unique dependency rather than once per job-dep combination. This eliminates the M*N scaling problem when M jobs share common dependencies.
Benchmarks show ~2x faster execution, 7x fewer buffer hits, and 15x fewer index scans for workflows with shared dependencies.
[Workflow] Flushing is optimized to load minimal data up front
Only the exact data needed for workflow flush operations is loaded from the database, rather than the entire job structure. This saves data over the wire, serialization overhead, and memory usage for active workflows or jobs with large args, errors, or meta.
The full job structure is loaded asynchronously when cancellation callbacks are needed.
[Workflow] Add table for centralized workflow tracking
[Workflow] Add table for centralized workflow tracking
Introduces a dedicated table to track workflow metadata and job state counts, replacing expensive aggregation queries with precomputed values. This improves performance for large workflows and enables efficient filtering/sorting in Oban Web.
[Workflow] Add unique workflow support to prevent duplicates
Workflows can now be created with unique: true to prevent multiple workflows with the same name from running concurrently. When a duplicate unique workflow is inserted, its jobs are marked with conflict?: true instead of being inserted.
Changes
[Pro] Packages are distributed with encrypted source code
Pro packages are encrypted, with licenses that stay fresh for 30 days. Development remains seamless, so documentation, type signatures, and LSP integration all work normally.
Enterprise license holders receive unencrypted source code.
See the Upgrade Guide for details on checking license status and refreshing.
Deprecations
[DynamicPartitioner] Deprecate the DynamicPartitioner plugin
The complexity and edge cases introduced by partitioned tables far outweigh the benefits for most applications.
[Workflow] Deprecate after_cancelled/2 in favor of universal on_cancelled/2
The after_cancelled/2 callback is deprecated in favor of the universal on_cancelled/2 hook. Currently, both hooks will be called if defined, and users should switch to on_cancelled/2.
Bug Fixes
[Migration] Fix migration check crash with dynamic repo config
The migration telemetry handler crashed on startup when users configured a placeholder repo module with get_dynamic_repo providing the actual repo at runtime. The handler attempted to call get_dynamic_repo/0 on the static repo before evaluating the dynamic repo callback.
[Refresher] Add error handling to producer record refreshing
Previously, if refresh_producers or cleanup_producers raised (e.g., due to a connection checkout timeout), the GenServer would crash and restart, causing missed heartbeats and timer resets. Now errors are caught and logged, allowing the refresh cycle to continue uninterrupted.
[Testing] Ensure ordered run_workflow/2 output
Always order workflow jobs by execution completion order to preserve sequential execution results.
[Worker] Trigger on_cancelled/2 when deadline force-cancels
When a job with deadline: [force: true] exceeds its deadline during execution, the on_cancelled/2 hook is now called with :deadline as the reason. This allows workers to perform cleanup or notifications when jobs are terminated due to deadline expiration.