Changelog for Oban Pro v1.4

🚅 Streamlined Workflows

Workflows now use automatic scheduling to run workflow jobs in order, without any polling or snoozing. Jobs with upstream dependencies are automatically marked as on_hold and scheduled far into the future. After the upstream dependency executes they're made available to run, with consideration for retries, cancellations, and discards.

  • An optimized, debounced query system uses up to 10x fewer queries to execute slower workflows.

  • Queue concurrency limits don't impact workflow execution, and even complex workflows can quickly run in parallel with global_limit: 1 and zero snoozing.

  • Cancelled, deleted, and discarded dependencies are still handled according to their respective ignore_* policies.

All of the previous workflow "tuning" options like waiting_limit and waiting_snooze are gone, as they're not needed to optimize workflow execution. Finally, older "in flight" workflows will still run with the legacy polling mechanism to ensure backwards compatibility.

❤️‍🩹🌟 Upgrading Workflows in v1.4

Because workflow staging is now reactive, you must add a database trigger to handle deleted upstream dependencies. Otherwise, dependencies of deleted jobs may be left in a scheduled state.

defmodule MyApp.Repo.Migrations.AddWorkflowTriggers do
  use Ecto.Migration

  defdelegate change, to: Oban.Pro.Migrations.Workflow
end

⏲️ Job Execution Deadlines

Jobs that shouldn't run after some period of time can be marked with a deadline. After the deadline has passed the job will be pre-emptively cancelled on its next run, or optionally during its next run if desired.

defmodule DeadlinedWorker do
  use Oban.Pro.Worker, deadline: {1, :hour}

  @impl true
  def process(%Job{args: args}) do
    # If this doesn't run within an hour, it's cancelled
  end
end

Deadlines may be set at runtime as well:

DeadlinedWorker.new(args, deadline: {30, :minutes})

In either case, the deadline is always relative and computed at runtime. That also allows the deadline to consider scheduling—a job scheduled to run 1 hour from now with a 1 hour deadline will expire 2 hours in the future.

🧙 Automatic Crontab Syncing

Synchronizing persisted entries manually required two deploys: one to flag it with deleted: true and another to clean up the entry entirely. That extra step isn't ideal for applications that don't insert or delete jobs at runtime.

To delete entries that are no longer listed in the crontab automatically set the sync_mode option to :automatic:

[
  sync_mode: :automatic,
  crontab: [
    {"0 * * * *", MyApp.BasicJob},
    {"0 0 * * *", MyApp.OtherJob}
  ]
]

To remove unwanted entries, simply delete them from the crontab:

 crontab: [
   {"0 * * * *", MyApp.BasicJob},
-  {"0 0 * * *", MyApp.OtherJob}
 ]

With :automatic sync, the entry for MyApp.OtherJob will be deleted on the next deployment.

❤️‍🩹🌟 Upgrading DynamicCron in v1.4

Changes to DynamicCron require a migration to add the new insertions column. You must re-run the Oban.Pro.Migrations.DynamicCron migration when upgrading.

defmodule MyApp.Repo.Migrations.UpdateObanCron do
  use Ecto.Migration

  defdelegate change, to: Oban.Pro.Migrations.DynamicCron
end

v1.4.3 — 2024-04-08

Enhancements

  • [Workflow] Use a database trigger function to handle deleted deps.

    Querying the full oban_jobs table to detect abandonded workflow deps is too slow for sizable workflows. The only consistent intercept for deleted dependencies is the database, so this adds a Oban.Pro.Migrations.Workflow migration that creates a trigger specifically tailored to cleaning up deps after deleting a a workflow job.

    This approach is vastly faster and applied immediately, without waiting for a pruning event. However, it does require an optional migration to safely handle deleted workflow deps.

    See the admonition above on upgrading workflows, or the section on handling deleted deps in the docs for details.

  • [Smart] Handle Batch callback and Workflow deps checks after async acking, rather than forcing synchronous acking with a separate debounce cycle.

    This change improves the overall throughput of queues with Batch and Workflow jobs, while also increasing the responsiveness of batch callback enqeuing and deps staging.

    • Batch and Workflow checks are still grouped to avoid duplicate work
    • Acking is always synchronous, there's no blocking job execution for debounced checks
    • Queries for acking, fetching, and checking all run in a single transaction
    • Debounce options are ignored and no longer documented
  • [Testing] Mark test processes when draining to simplify async acking checks.

Bug Fixes

  • [Smart] Track producer meta changes without fetched jobs.

    An empty clause prevented tracking global changes for empty partitions, e.g. partitions without additional jobs to fetch. This was most noticeable for partitioned queues with a low limit and sparse jobs.

  • [Testing] Reload all jobs after draining in run_workflow/1,2.

    Appended jobs weren't included in the result from run_workflow/1, despite being inserted and executed. Now all jobs from the workflow are considered for summarization, or returned entirely without a summary.

v1.4.2 — 2024-03-26

  • [Workflow] Always resolve all workflow deps with single check.

    Workflow deps checks with a mismatched number of deps and finished jobs could be made available erroneously.

  • [Workflow] Reimplement deleted workflow handling for accuracy, performance, and memory.

    The workflow deps anti-join query was incorrect, causing it to return jobs that weren't actually part of on-hold workflows. The number of jobs returned grew with the total historic jobs, which could cause memory issues.

v1.4.1 — 2024-03-25

Bug Fixes

  • [Workflow] Fix workflow staging for jobs with multiple deps.

    The final optimization to workflow deps querying introduced a bug that caused jobs with multiple deps to be made available after the first dep completed. This restores the original, correct query.

v1.4.0 — 2024-03-21

Enhancements

  • [DynamicCron] Add :sync_mode for automatic entry management.

    Most users expect that when they delete an entry from the crontab it won't keep running after the next deploy. A new sync_mode option allows selecting between automatic and manual entry management.

    In addition, this moves cron entry management into the evaluate handler. Inserting and deleting at runtime can't leverage leadership, because in rolling deployments the new nodes are never the leader.

  • [DynamicCron] Use recorded job insertion timestamps for guaranteed cron.

    A new field on the oban_crons table records each time a cron job is inserted up to a configurable limit. That field is then used for guaranteed cron jobs, and optionally for historic inspection beyond a job's retention period.

  • [DynamicCron] Stop adding a unique option when inserting jobs, regardless of the guaranteed option.

    There's no need for uniqueness checks now that insertions are tracked for each entry. Inserting without uniqueness is significantly faster.

  • [DynamicCron] Inject cron information into scheduled job meta.

    This change mirrors the addition of cron: true and cron_expr: expression added to Oban's Cron in order to make cron jobs easier to identify and report on through tools like Sentry.

  • [Worker] Add :deadline option for auto cancelling jobs

    Jobs that shouldn't run after some period of time can be marked with a deadline. After the deadline has passed the job will be pre-emptively cancelled on its next run, or optionally during its next run if desired.

  • [Workflow] Invert workflow execution to avoid bottlenecks caused by polling and snoozing.

    Workflow jobs no longer poll or wait for upstream dependencies while executing. Instead, jobs with dependencies are "held" until they're made available by a facilitator function. This inverted flow makes fewer queries, doesn't clog queues with jobs that aren't ready, avoids snoozing, and is generally more efficient.

  • [Workflow] Expose functions and direct callback docs to function docs.

    Most workflow functions aren't true callbacks, and shouldn't be overwritten. Now all callbacks point to the public function they wrap. Exposing workflow functions makes it easier to find and link to documentation.

Bug Fixes

  • [DynamicCron] Don't consider the node rebooted until it is leader

    With rolling deploys it is frequent that a node isn't the leader the first time it evaluates. However, the :rebooted flag was set to true on the first run, which prevented reboots from being inserted when the node ultimately acquired leadership.

  • [DynamicQueues] Accept streamline partition syntax for global_limit and rate_limit options.

    DynamicQueues didn't normalize the newer partition syntax before validation. This was an oversight, and a sign that validation had drifted between the Producer and Queue schemas. Now schemas use the same changesets to ensure compatibility.

  • [Smart] Handle partitioning by :worker and :args regardless of order.

    The docs implied partitioning by worker and args was possible, but there wasn't a clause that handled any order correctly.

  • [Smart] Explicitly cast transactional advisory lock prefix to integer.

    Postgres 16.1 may throw an error that it's unable to determine the argument type while taking bulk unique locks.

  • [Smart] Preserve recorded values between retries when sync acking failed.

    Acking a recorded value for a batch or workflow is synchronous, and a crash or timeout failure would lose the recorded value on subsequent attempts. Now the value is persisted between retries to ensure values are always recorded.

  • [Smart] Revert "Force materialized CTE in smart fetch query".

    Forcing a materialized CTE in the fetch query was added for reliability, but it can cause performance regressions under heavy workloads.

  • [Testing] Use configured queues when ensuring all started.

    Starting a supervised Oban in manual mode with tests specified would fail because in :manual testing mode the queues option is overridden to be empty.