Dagster: Controlling Parallelism within a Run

Dagster has many ways to control parallelism. In Dagster Cloud deployments, you can control how many concurrent runs can happen at one time through deployment settings.

Within a run, you can also control how many parallel operations happen at once. By default, runs use the multi-process executor, and the number of parallel operations within a run is based on the number of parallel threads available. For example, if you are using Dagster Cloud Hybrid with Kubernetes, the number of parallel operations within a run will be based on the resources available in a pod.

This behavior can be changed by modifiying the executor settings.

Default

You can specify the default concurrency used by the executorm for all jobs by configuring the executor that is used within a project:

from dagster import Definitions, multiprocess_executor

defs = Definitions(
    executor = multiprocess_executor.configured({"max_concurrent": 1}),
    assets = [my_assets],
    ...
)

Per Job: In Code

You can over-ride this configuration for a specific job in code. For example:

analytics_job = define_asset_job(
     name = "refresh_analytics_model_job",
     selection=AssetSelection.keys(["ANALYTICS", "daily_order_summary"]).upstream(),
     tags = {"dagster/max_retries": "1"},
     config = {"execution": {"config": {"multiprocess": {"max_concurrent": 1}}}}
)

Per Job: Ad-hoc Launches

You can also over-ride this configuration for a specific job at run time via the launchpad. When go to launch a run for a job (or materialize a set of assets by shift clicking the materialize button), supply the following in the launch pad:

execution:
  config:
    multiprocess:
      max_concurrent: 1

slopp/README.md

Dagster: Controlling Parallelism within a Run

Default

Per Job: In Code

Per Job: Ad-hoc Launches