Changelog for Oban Pro v1.5
This release includes the new job decorator, unified migrations, an index-backed unique mode, support for distributed PostgreSQL, improved batches, streamlined chains, worker aliases, hybrid job composition, and other performance improvements.
Elixir Support
This release requires a minimum of Elixir v1.14. We officially support the three most recent Elixir versions, and the use of some newer Elixir and Erlang/OTP features bumped the minimum up to v1.14.
Job Decorator
The new `Oban.Pro.Decorator` module converts functions into Oban jobs with a teeny-tiny `@job true` annotation. Decorated functions, such as those in contexts or other non-worker modules, can be executed as fully fledged background jobs with retries, priority, scheduling, uniqueness, and all the other guarantees you've come to expect from Oban jobs.
```elixir
defmodule Business do
  use Oban.Pro.Decorator

  @job max_attempts: 3, queue: :notifications
  def activate(account_id) when is_integer(account_id) do
    case Account.fetch(account_id) do
      {:ok, account} ->
        account
        |> notify_admin()
        |> notify_users()

      :error ->
        {:cancel, :not_found}
    end
  end
end

# Insert a Business.activate/1 job
Business.insert_activate(123)
```
The `@job` decorator also supports most standard Job options, validated at compile time. As expected, the options can be overridden at runtime through an additional generated clause. Along with the generated insert functions, there's also a `new_` variant that can be used to build up job changesets for bulk insert, and a `relay_` variant that operates like a distributed async/await.

Finally, the generated functions also respect patterns and guards, so you can write assertive clauses that defend against bad inputs or break logic into multiple clauses.
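To sketch the generated API (the exact clause arities follow the `insert_`/`new_`/`relay_` naming pattern described above, so treat the specific signatures here as illustrative, not authoritative):

```elixir
# Override compile-time options at runtime via the extra generated clause
Business.insert_activate(123, queue: :default, max_attempts: 5)

# Build changesets with the new_ variant for bulk insertion
1..3
|> Enum.map(&Business.new_activate/1)
|> Oban.insert_all()

# The relay_ variant operates like a distributed async/await
Business.relay_activate(123)
```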
Unified Migrations
Oban has had centralized, versioned migrations from the beginning: when a new release includes database changes, you run the migrations and they figure out what to change on their own. Pro behaved differently, for reasons that made sense when there was a single producers table but don't hold up with multiple tables and custom indexes.

Now Pro has unified migrations that keep all the necessary tables and indexes up to date, and you'll be warned at runtime if the migrations aren't current.

See the `Oban.Pro.Migration` module for more details, or check the v1.5 Upgrade Guide for instructions on putting it to use.
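In practice, that looks like a single Ecto migration that delegates to Pro; a minimal sketch (the migration module name is hypothetical):

```elixir
defmodule MyApp.Repo.Migrations.UpgradeObanPro do
  use Ecto.Migration

  # Brings all Pro tables and indexes up to the current version
  def up, do: Oban.Pro.Migration.up()
  def down, do: Oban.Pro.Migration.down()
end
```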
Enhanced Unique
Oban's standard unique options are robust, but they require multiple queries and centralized locks to function. Now Pro supports an enhanced unique mode designed for speed, correctness, and scalability.
The enhanced mode boosts insert performance 1.5x-3.5x, reduces database load with fewer queries, improves memory usage, and remains correct across multiple processes/nodes.
Here's a comparison of inserting batches of various sizes with the legacy and enhanced modes:

| jobs  | legacy    | enhanced | boost |
|-------|-----------|----------|-------|
| 100   | 45.08     | 33.93    | 1.36x |
| 1000  | 140.64    | 81.452   | 1.72x |
| 10000 | 3149.71   | 979.47   | 3.22x |
| 20000 | OOM error | 1741.67  | n/a   |
See more in the Enhanced Unique section.
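For reference, uniqueness is still declared through Oban's standard unique options; a minimal sketch (the worker module, queue, and delivery function are hypothetical):

```elixir
defmodule MyApp.MailWorker do
  use Oban.Pro.Worker,
    queue: :mailers,
    # At most one job per account_id within a 60 second window
    unique: [period: 60, keys: [:account_id]]

  @impl true
  def process(%Oban.Job{args: %{"account_id" => account_id}}) do
    MyApp.Mailer.welcome(account_id)
  end
end
```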
Distributed PostgreSQL
A handful of PostgreSQL features used in Oban and Pro prevented them from running on distributed PostgreSQL-compatible databases such as CockroachDB or Yugabyte.
A few table creation options prevented even running the migrations due to unsupported database features. Then there were advisory locks, which are part of how Oban normally handles unique jobs, and how Pro coordinates queues globally.
We've worked around all of these limitations, and it's now possible to run Oban and Pro on CockroachDB and Yugabyte with all of the same functionality as regular PostgreSQL (unique jobs, global limits, rate limits, queue partitioning).
Improved Batches
One of Pro's original three features, batches link the execution of many jobs as a group and run optional callback jobs after jobs are processed.
Composing batches used to rely on a dedicated worker, one that couldn't be composed with other worker types. Now there's a standalone `Oban.Pro.Batch` module that's used to dynamically build, append, and manipulate batches from any type of job, with much more functionality.

Batches gain support for streams (both creating and appending with them), clearer callbacks, and the ability to set any `Oban.Job` option on callback jobs.
```elixir
alias Oban.Pro.Batch

mail_jobs = Enum.map(mail_args, &MyApp.MailWorker.new/1)
push_jobs = Enum.map(push_args, &MyApp.PushWorker.new/1)

[callback_opts: [priority: 9], callback_worker: CallbackWorker]
|> Batch.new()
|> Batch.add(mail_jobs)
|> Batch.add(push_jobs)
```
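Once composed, the batch is inserted like any other set of jobs. Whether insertion goes through `Oban.insert_all/1` as sketched here is an assumption, based on how workflows are inserted elsewhere in this changelog:

```elixir
batch =
  [callback_opts: [priority: 9], callback_worker: CallbackWorker]
  |> Batch.new()
  |> Batch.add(mail_jobs)
  |> Batch.add(push_jobs)

# Insert all mail and push jobs as one batch
Oban.insert_all(batch)
```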
Streamlined Chains
Chains now operate like workflows, where jobs are scheduled until they're ready to run and then descheduled after the previous link in the chain completes. Preemptive chaining doesn't clog queues with waiting jobs, and it chews through a backlog without any polling.
Chains are also a standard `Oban.Pro.Worker` option now. There's no need to define a chain-specific worker; in fact, doing so is deprecated. Just add the `chain` option and you're guaranteed a FIFO chain of jobs:

```diff
- use Oban.Pro.Workers.Chain, by: :worker
+ use Oban.Pro.Worker, chain: [by: :worker]
```
See more in the Chained Jobs section.
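Put together, a chained worker is just a regular Pro worker with the `chain` option; a minimal sketch with a hypothetical module and delivery function:

```elixir
defmodule MyApp.WebhookWorker do
  # Jobs for this worker run strictly FIFO, one link at a time
  use Oban.Pro.Worker, chain: [by: :worker]

  @impl true
  def process(%Oban.Job{args: %{"event" => event}}) do
    MyApp.Webhooks.deliver(event)
  end
end
```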
Improved Workflows
Workflows began the transition from a dedicated worker to a stand-alone module several versions ago. Now that transition is complete, and workflows can be composed from any type of job.
All workflow management functions have moved to a centralized `Oban.Pro.Workflow` module. An expanded set of functions, including the ability to cancel an entire workflow, conveniently works with either a workflow job or id, so it's possible to maneuver workflows from anywhere.

Perhaps the most exciting change, because it's visual and we like shiny things, is Mermaid output for visualization. Mermaid has become the graphing standard, and it's an excellent way to visualize workflows in tools like Livebook.
```elixir
alias Oban.Pro.Workflow

workflow =
  Workflow.new()
  |> Workflow.add(:a, EchoWorker.new(%{id: 1}))
  |> Workflow.add(:b, EchoWorker.new(%{id: 2}), deps: [:a])
  |> Workflow.add(:c, EchoWorker.new(%{id: 3}), deps: [:b])
  |> Workflow.add(:d, EchoWorker.new(%{id: 4}), deps: [:c])

Oban.insert_all(workflow)

Workflow.cancel_jobs(workflow.id)
```
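The `to_mermaid/1` function added in this release turns a workflow into Mermaid source. Assuming a workflow like the one built above, rendering it is a one-liner (the exact output format isn't shown here, so this is a sketch):

```elixir
workflow
|> Workflow.to_mermaid()
|> IO.puts()
```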
Worker Aliases
Worker aliases solve a perennial production issue: how to rename workers without breaking existing jobs. Aliasing allows jobs enqueued with the original worker name to continue executing with the new worker code, without raising exceptions.
```diff
-defmodule MyApp.UserPurge do
+defmodule MyApp.DataPurge do
-  use Oban.Pro.Worker
+  use Oban.Pro.Worker, aliases: [MyApp.UserPurge]
```
See more in the Worker Aliases section.
v1.5.0-rc.7 – 2024-12-04

Bug Fixes

- [DynamicLifeline] Prevent rescuing chains that aren't actually stuck.

  The query for rescuing chains in the DynamicLifeline was too broad and would make chain jobs available that shouldn't run yet.

- [Chain] Reverse the order of the chain dependency check.

  The prepare-chains query looked at the first job that wasn't complete to get the status for a given `chain_id`. That could cause an issue if the oldest job was cancelled while the chain was continuable. Now the query sorts in the opposite order to get the state of the most recent job in the chain instead.

- [Chain] Cast `orig_scheduled_at` to a float for CockroachDB compatibility.

  The previous change didn't solve the timestamp casting issue completely. Now the output of `to_timestamp` is cast to a float for compatibility with the `timezone` function.

- [Smart] Remove the `on_conflict` statement from the standard insert flow.

  With the new index-backed unique mechanism, there's no reason to have an `on_conflict` clause for regular job insertion. Frequent inserts using `on_conflict` can cause a `LWLock:WALWrite` issue within Aurora Postgres due to write contention on the disk.

- [Smart] Guard against `nil` checks after partition changes.

  After partition changes, records from producers that ran before the changes may have a `nil` worker value. That trips an Ecto guard that prevents `nil` checks in queries and subsequently crashes the queue. Now `nil` values are ignored until the old records are safely cleared out.

- [Testing] Allow all repo options in the `use` macro.

  The use of `Keyword.validate!` only allowed a `repo` option and blocked other valid options like `prefix`, `log`, etc.

- [Workflow] Correct multi-arity specs for workflow functions.

  This fixes, and verifies, the typespecs for all `*_jobs` functions in `Workflow`.

- [Unique] Use the `inserted_at` or `scheduled_at` time in the unique hash.

  The unique time was always derived from `utc_now`, not the time passed through the job changeset.

- [Worker] Gracefully handle the `:deadline` option added to existing jobs.

  Adding a `:deadline` to jobs that were created without the correct meta would cause an error at runtime. Now that clause is handled gracefully by using the scheduled time.

- [Migration] Define an empty `Batch` migration for backward compatibility.

- [Migration] Prevent duplicates in the migrated-version comment query.

  This fixes the query used to check migrated versions by selecting the `pg_class` description specifically.
v1.5.0-rc.6 – 2024-11-11

Enhancements

- [Smart] Record a timestamp when a queue starts shutting down.

  Producer meta now includes a `shutdown_started_at` field to indicate that a queue isn't just paused, but is actually shutting down.

- [DynamicScaler] Pass scaling information from the leader to all peers.

  To preserve historic scaling information, the leader now broadcasts scaling information to all other peers. That way, if the leader shuts down, other instances have the historical scaling details (which matters most for longer cooldown periods).

Bug Fixes

- [Chain] Prevent a memory leak from a malformed chain query.

  The chain query's memory use increased with the number of jobs because of a malformed union query. Inserting chain jobs now uses a much faster query and a fraction of the memory.

- [Chain] Ensure the chain continuation value is an integer.

  The `to_timestamp` function requires an integer, but in CRDB the result of a binary `/` operator may be a float. This switches to the `div` function instead to ensure it's always an integer.

- [Chunk] Restore `after_process/3` hook execution.

  An optimization to hooks prevented `after_process/3` from triggering for Chunk workers.

- [DynamicCron] Avoid array slice operations for CRDB compatibility.

  CRDB doesn't support array slice operations, e.g. `[1:2]`, which were used for DynamicCron operations. Now it uses a combination of functions to maintain insertion history.

- [DynamicScaler] Invert the comparison for the scaling cooldown check.

  The cooldown check was reversed and would scale during the opposite of the intended time. Scaling telemetry events now include the reason scaling was skipped, for introspection.

- [Smart] Cache `uniq_key` column existence by instance name.

  Multiple instances may share the same prefix name despite pointing to entirely different databases. This changes the unique-check cache key to use the instance name rather than the prefix.

- [Unique] Consistently sort unique keys and the hash value.

  To avoid differences with mixed-key maps or large map values, unique keys are sorted as strings and hashed rather than inspected.
v1.5.0-rc.5 – 2024-10-28

Enhancements

- [Chunk] Prevent single-job chunks after a brief delay.

  This removes the last-time check from chunks to simplify the logic, reduce queries, and, most importantly, prevent erroneously running a chunk with a single job after a delay in processing.

- [DynamicScaler] Accurately count upcoming `available`, `scheduled`, or `retryable` jobs.

  Restricting queue-size checks to jobs scheduled since the last scaling event artificially restricts the recorded size, e.g. it makes large queues look smaller than they really are. The new check considers all available, scheduled, or retryable jobs in the upcoming `lookback` window.

- [Chain] Inject `chain_id` along with `chain_key` in legacy chain worker jobs.

  The new `chain_id` format is now added to legacy chain worker jobs so they can be transitioned seamlessly.

Bug Fixes

- [Smart] Continue staging after resolving unique conflicts.

  Recursively continue staging until there aren't any remaining unique conflicts to handle.

- [Workflow] Include `ignore_*` in the `add_option/0` type.

  The ignore options were missing, though they're accepted as options when adding a job to a workflow.
v1.5.0-rc.4 – 2024-10-04

Enhancements

- [Smart] Use advisory locks for producer coordination when available.

  Row-level locks generate noisy error logs within Postgres and don't offer any benefit when advisory locks are available. Producers that need global coordination now use advisory locks when they're available and fall back to row-level locks in distributed databases.

- [Workflow] Optimize workflow queries to make better use of indexes.

  Frequently called workflow queries are now further optimized to ensure they use the compound workflow index and avoid slower bitmap index scans.

- [DynamicLifeline] Discard jobs stuck `available` with max attempts.

  Jobs may be left in the `available` state when they have exhausted their attempts. When that happens, they can "clog" a queue by preventing job fetching. Now those jobs are `discarded` or have their attempts bumped, as configured.

- [DynamicLifeline] Unblock stuck chains that missed scheduling events.

  Chains may become stuck after persistent errors while acking or an unexpected shutdown. This adds periodic chain rescuing to match the existing workflow rescuing.

Bug Fixes

- [DynamicQueues] Make queue updates more resilient to changes.

  Queues use an optimistic lock, and the `update/3` function could raise an error when a scale signal conflicted with an update call. Now updating only touches the database record once, and the update operation is retried after a `StaleEntry` exception.

- [DynamicCron] Clear insertion history on crontab upsert.

  Using `update/3` to update a DynamicCron entry would clear insertions when the expression or timezone changed, but upserting with `insert/2` wouldn't. Now calling `insert/2` or setting the `crontab` in the plugin options conditionally clears the insertion history as well.

- [DynamicPruner] Correct state timestamp checking in the pruning query.

  The query used an `or_where` rather than grouping `or` clauses together, which caused overrides to apply to `cancelled` and `discarded` jobs incorrectly.

- [Smart] Clear unique conflicts during staging operations.

  Uniqueness that includes `available` but lacks `scheduled` or `retryable` could cause a conflict during job staging. This adds a safeguard to clear the source of the conflict.

- [Chain] Disable uniqueness on chain workers when continuing.

  Uniqueness across partial states, e.g. `available` only, could stop a chain due to conflicts, which would prevent chains from running entirely.
v1.5.0-rc.3 – 2024-09-13

Upgrading from a Previous RC

The legacy, hybrid, and simple `unique` modes have been removed in favor of a single index-backed unique mechanism. You must re-run the `1.5.0` migration in order to drop and recreate the correct index:

```elixir
def up do
  Oban.Pro.Migration.down(version: "1.5.0")
  Oban.Pro.Migration.up(version: "1.5.0")
end
```

Enhancements

- [Worker] Add the `before_new/2` worker hook.

  The `before_new/2` hook can augment or override the `args` and `opts` used to construct new jobs. It's akin to overriding a worker's `new/2` callback, but the logic can be shared between workers or operate on a global scale.

  The hook is most useful for augmenting jobs from a central point or injecting custom meta, i.e. to add span ids for tracing. Here, we define a global hook that injects a `span_id` into the meta:

  ```elixir
  defmodule MyApp.SpanHook do
    def before_new(args, opts) do
      opts =
        opts
        |> Keyword.put_new(:meta, %{})
        |> put_in([:meta, :span_id], MyApp.Telemetry.gen_span_id())

      {:ok, args, opts}
    end
  end
  ```

- [Smart] Use the array concat operator instead of `array_append`.

  Using the concat operator, `||`, allows working with an errors column that's either an array of `jsonb` or a single `jsonb` column. This change allows the Smart engine to work with CockroachDB.

- [Smart] Remove the various unique modes in favor of a single, index-backed uniqueness mechanism.

  The new mechanism uses a generated column paired with a partial unique index. It can handle all job uniqueness without separate modes. The sole exception is partitioned tables, which don't support arbitrary unique indexes; in that case, we fall back to the legacy advisory lock and query mechanism.

Bug Fixes

- [DynamicQueues] Reboot dynamically started queues after the queue supervisor crashes.

  Oban's `Midwife` process is typically responsible for starting queues after the `Foreman` (queue supervisor) crashes. However, the `Midwife` isn't aware of dynamic queues, which would leave those queues unstarted after the `Foreman` restarts.

  Now the `DynamicQueues` plugin monitors the `Foreman` and automatically restarts queues after an unexpected crash.

- [Smart] Handle unique conflicts on state change while acking.

  Jobs with partial uniqueness, e.g. only `[:scheduled]`, are prone to unique conflicts on state change. Now, when an acking job encounters a unique violation, we clear the conflicting job's uniqueness to prevent a subsequent violation.
v1.5.0-rc.2 – 2024-08-28

Enhancements

- [Smart] Increase the base expected delay and retry counts for transaction safety.

  Expected retries apply to lock contention and serialization failures. Now there's more time between retries, and many more of them, to compensate for busy systems with many nodes.

- [DynamicPruner] Always use per-state timestamps for pruning.

  The DynamicPruner now always uses per-state timestamps for pruning, rather than making it opt-in via a `by_state_timestamp` option. However, the `completed` state still prunes using the `scheduled_at` timestamp so it can leverage the base compound index.

  This is a more sensible default, and the query performance is identical in a system with a typical balance of `completed`/`cancelled`/`discarded` jobs.

Bug Fixes

- [Testing] Stop using `gen_id` in `perform_callback/4`.

  New `Batch` workers don't implement `gen_id/0`. Now the perform callback helper generates a UUIDv7 directly when one isn't provided.

- [Smart] Prevent noisy logs caused by periodic refresh queries.

  Producer refresh queries no longer generate noisy logs every time they run. Multis don't use the same query options as the transaction, and the producer cleanup queries were missing transaction options.

- [DynamicPartitioner] Prevent retries when managing partitioned tables.
v1.5.0-rc.1 – 2024-08-15

Enhancements

- [Workflow] Add `get_recorded/3` and `all_recorded/3` helpers.

  Fetching workflow dependencies to retrieve their recorded output is a common operation. However, it's repetitive and slightly wasteful because it loads an entire job when only part of the `meta` is required.

  The new `get_recorded/3` and `all_recorded/3` functions simplify fetching recordings from other workflow jobs. For example, to retrieve all recordings:

  ```elixir
  %{"job_a" => job_a_return, "job_b" => job_b_return} = Workflow.all_recorded(job)
  ```

- [Workflow] Use decorated job names in workflow graph output.

  Show the decorated name, i.e. `Foo.bar/2`, instead of generically using `Oban.Pro.Decorator` in workflow graphs.

- [Smart] Async ack jobs in a single update.

  A carefully crafted query is capable of acking all job rows in a single query. That reduces the total queries per transaction and the data sent over the wire. In an engine benchmark that constantly inserts and executes jobs, it reduced the total CPU load from 46% to 28%.

- [Smart] Rely on upstream `Repo.transaction/3` for automatic retries.

  `Oban.Repo.transaction/3` now has built-in backoff for all transactions, so the `Smart` engine no longer needs to implement it manually. All other engine operations, as well as plugins, now benefit from more resilient database transactions.

- [Testing] Add a `:decorated` option to assert/refute enqueued.

  Testing decorated jobs is tricky because the worker is constant and the exact args are opaque. The new `:decorated` assertion option expands a captured function and optional args into the correct shape for testing.

  For example, use a capture to test that a decorated function with any args was enqueued:

  ```elixir
  assert_enqueued decorated: &Business.weekly_report/2
  ```

- [Migration] Include the current `prefix` in migration warnings.

  The migration warning now indicates which prefix is missing the migration. This will help applications that missed the migration in one of several prefixes.

Bug Fixes

- [Smart] Prevent indeterminate type errors in Postgres by casting literal values.

  PostgreSQL 16.2 (and possibly greater) can't reliably determine the type of values passed literally into a prepared union query.

- [Smart] Skip an unnecessary empty query while fetching jobs.

  The Smart engine made an "empty" producer query even when nothing required it. The empty query used the default `public` prefix, which caused problems for systems without `oban_*` tables in the public schema.

- [Worker] Update parameterized type usage for Ecto v3.12 compatibility.

  The recently released Ecto v3.12 changed the signature for parameterized types, e.g. enums, which broke enums in structured args.

- [Workflow] Apply structured args in the `after_cancelled/2` callback.

  The `after_cancelled` hook was applied outside normal worker execution, and it lacked structured args processing.

- [Workflow] Inject the current conf during `after_cancelled/2` callbacks.

  The `after_cancelled/2` callback was passed a job without the current Oban configuration, which broke workflow functions such as `all_jobs/2`.

  Now the `conf` is injected accordingly, and error messages from unhandled `after_cancelled/2` exceptions include the stacktrace.

- [Workflow] Explicitly differentiate references to the legacy Workflow worker.

  Conflicting aliases could cause compilation warnings and potential runtime issues for legacy workflows.

- [DynamicCron] Prevent use of the `:delete` option with automatic sync mode.

  The `:delete` option has no effect with `:automatic` sync mode and shouldn't be used. It's no longer an accepted option, because it could crash on boot and prevent any cron jobs from running.

- [DynamicLifeline] Delete conflicting unique jobs while rescuing.

  The DynamicLifeline may move `executing` jobs back to the `available` state on rescue. When a unique config includes the `available` state and omits the `executing` state, a duplicate may already be enqueued and waiting. Now the orphaned `executing` job is deleted on conflict to unblock subsequent rescue operations.
v1.5.0-rc.0 – 2024-07-26

Enhancements

- [Smart] Implement the `check_available/1` engine callback for faster staging queries.

  The Smart engine defines a custom `check_available/1` callback that vastly outperforms the `Basic` implementation on large tables. This table illustrates the difference in a benchmark of 24 queues with an even split of `available` and `completed` jobs on a local database with no additional load:

  | jobs | original | optimized | boost |
  |------|----------|-----------|-------|
  | 2.4m | 107.79ms | 0.72ms    | 149x  |
  | 4.8m | 172.10ms | 1.15ms    | 149x  |
  | 7.2m | 242.32ms | 4.28ms    | 56x   |
  | 9.6m | 309.46ms | 7.89ms    | 39x   |

  The difference in production may be much greater.

Worker

- [Worker] Add a `before_process/1` callback.

  The new callback is applied before `process/1` is called and is able to modify the `job` or stop processing entirely. Like `after_process`, it executes in the job's process after all internal processing (encryption, structuring) is applied.

- [Worker] Avoid re-running stages during `after_process/1` hooks.

  Stages such as structuring and encryption are only run once during execution, then the output is reused in hooks.

- [Worker] Prevent compile-time dependencies in worker hooks.

  Explicit worker hooks are now aliased to prevent a compile-time dependency. Initial validation for explicit hooks is reduced as a result, but it retains checks for global hooks.

Batch

- [Batch] Add new callbacks with clearer, batch-specific names.

  The old `handle_` names aren't clear in the context of hybrid job compositions. This introduces new `batch_`-prefixed callbacks for the `Oban.Pro.Batch` module. The legacy callbacks are still handled for backward compatibility.

- [Batch] Add `from_workflow/2` for simplified `Batch`/`Worker` composition.

  Create a batch from a workflow by augmenting all jobs to also be part of a batch.

- [Batch] Add `cancel_jobs/2` for easy cancellation of an entire batch.

- [Batch] Add `append/2` for extending batches with new jobs.

- [Batch] Uniformly accept a `job` or `batch_id` in all functions.

  Now it's possible to fetch batch jobs from within a `process/1` block via a `Job` struct, or from anywhere in code with the `batch_id` alone.

Workflow

- [Workflow] Add `get_job/3` for fetching a single dependent job.

  Internally it uses `all_jobs/3`, but the name and arguments make it clear that this is the way to get a single workflow job.

- [Workflow] Add `cancel_jobs/3` for easy cancellation of an entire workflow.

  The new helper function simplifies cancelling full or partial workflows using the same query logic behind `all_jobs` and `stream_jobs`.

- [Workflow] Drop the optional dependency on `libgraph`.

  The `libgraph` dependency hasn't had a new release in years and is no longer necessary; we can build the same simple graph with the `digraph` module available in OTP.

- [Workflow] Add `to_mermaid/1` for Mermaid graph output.

  Now that we're using `digraph`, we can also take control of rendering our own output, such as Mermaid.

- [Workflow] Uniformly accept a `job` or `workflow_id` in all functions.

  Now it's possible to fetch workflow jobs from within a `process/1` block via a `Job` struct, or from anywhere in code with the `workflow_id` alone.

Bug Fixes

- [Smart] Require a global queue lock with any flush handlers.

  Flush handlers for batches, chains, and workflows must be staggered without overlapping transactions. Without a mutex to stagger fetching, transactions on different nodes can't see all completed jobs, and handlers can misfire.

  This is most apparent with batches, as they don't have a lifeline to clean up after an issue the way workflows do.

- [DynamicPruner] Use state-specific timestamps when pruning.

  Previously, the most recent timestamp was used rather than the timestamp for the job's current state.

- [Workflow] Prevent incorrect `workflow_id` types by validating the options passed to `new/1`.

Deprecations

- [Oban.Pro.Workers.Batch] The worker is deprecated in favor of composing with the new `Oban.Pro.Batch` module.

- [Oban.Pro.Workers.Chain] The worker is deprecated in favor of composing with the new `chain` option in `Oban.Pro.Worker`.

- [Oban.Pro.Workers.Workflow] The worker is deprecated in favor of composing with the new `Oban.Pro.Workflow` module.