Overview
This page gives an overview of how the package works internally. See the Getting Started guide for a quick introduction to Dagster jobs.
Dynamic jobs
Dagster jobs are compiled by calling OpDefinition and GraphDefinition instances within the context of a function decorated by @dagster.job.
Runtime calculations using Python-native functions can only occur inside Dagster ops.
The body of the decorated function, however, is ordinary Python that runs when the
job is defined, so it can be used to change dynamically how a job is created.
Consider the following job:
    from dagster import job, op

    @op
    def return_five():
        return 5

    @job
    def several_calls():
        return_five()
        return_five()
        return_five()
This is a job that executes the same op three times. Instead, a Python for loop could be used to obtain the same behavior:
    @job
    def several_calls():
        for _ in range(3):
            return_five()
Extending this idea further, the number of calls to return_five need not be fixed. It may instead depend on a value obtained from elsewhere. To this end, a builder function is introduced:
    def job_builder(n_calls):
        @job
        def wrapped_job():
            for _ in range(n_calls):
                return_five()

        return wrapped_job
This approach is widely used and is known as the factory pattern. A composable graph builds on this concept by formalizing the interface to the builder and simplifying the definition of the job.
Further reading
- “Factory Patterns in Python” on the Dagster blog.
- “Unlocking Flexible Pipelines: Customizing the Asset Decorator” on the Dagster blog.
- “Abstracting Pipelines for Analysts with a YAML DSL” on the Dagster blog.
- “YAML DSL for Asset Graphs” example on the Dagster GitHub.