zduymz · December 9, 2020 18:53
diff --git a/airflow-scheduler-pseudo-code.txt b/airflow-scheduler-pseudo-code.txt
 Step 0. Load available DAG definitions from disk (fill DagBag)

 While the scheduler is running:
 	Step 1. The scheduler uses the DAG definitions to 
 	        identify and/or initialize any DagRuns in the
 	        metadata db.
 	
 	Step 2. The scheduler checks the states of the 
 	        TaskInstances associated with active DagRuns, 
 		resolves any dependencies amongst TaskInstances, 
 		identifies TaskInstances that need to be executed, 
 		and adds them to a worker queue, updating the status 
 		of newly-queued TaskInstances to "queued" in the
 		datbase.
 	
 	Step 3. Each available worker pulls a TaskInstance from 
 		the queue and starts executing it, updating the 
 	        database record for the TaskInstance from "queued" 
 	        to "running".
 	
 	Step 4. Once a TaskInstance is finished running, the 
 	        associated worker reports back to the queue 
 	        and updates the status for the TaskInstance 
 	        in the database (e.g. "finished", "failed", 
 	        etc.)
 	
 	Step 5. The scheduler updates the states of all active 
 	        DagRuns ("running", "failed", "finished") according 
 	        to the states of all completed associated 
 	        TaskInstances.
 	
 	Step 6. Repeat Steps 1-5
	Step 0. Load available DAG definitions from disk (fill DagBag)

	While the scheduler is running:
	Step 1. The scheduler uses the DAG definitions to
	identify and/or initialize any DagRuns in the
	metadata db.

	Step 2. The scheduler checks the states of the
	TaskInstances associated with active DagRuns,
	resolves any dependencies amongst TaskInstances,
	identifies TaskInstances that need to be executed,
	and adds them to a worker queue, updating the status
	of newly-queued TaskInstances to "queued" in the
	datbase.

	Step 3. Each available worker pulls a TaskInstance from
	the queue and starts executing it, updating the
	database record for the TaskInstance from "queued"
	to "running".

	Step 4. Once a TaskInstance is finished running, the
	associated worker reports back to the queue
	and updates the status for the TaskInstance
	in the database (e.g. "finished", "failed",
	etc.)

	Step 5. The scheduler updates the states of all active
	DagRuns ("running", "failed", "finished") according
	to the states of all completed associated
	TaskInstances.

	Step 6. Repeat Steps 1-5