In the context of distributed systems orchestration, a Saga Execution Coordinator (SEC) is defined as a centralized, persistent Finite State Automaton (FSA). Its purpose is to manage the integrity of a distributed transaction by enforcing a strict execution path across a pre-ordered sequence of local transactions and their corresponding inverse operations (compensations).
Let a Saga $S$ be defined as the pair $S = (T, C)$, where:
- $T = \langle t_1, t_2, \dots, t_n \rangle$ is a sequence of local transactions, where $n \in \mathbb{N}$, $n \geq 1$.
- $C = \langle c_1, c_2, \dots, c_n \rangle$ is a sequence of compensating transactions.
- $c_i$ is the semantic inverse of $t_i$, such that executing $t_i$ followed by $c_i$, written $(c_i \circ t_i)$, restores the system to a state semantically equivalent to the state before $t_i$ occurred.
The Saga Execution Coordinator is a state machine $M = (Q, \Sigma, \delta, q_0, F)$ with the following components:
- $Q$ (States): The finite set of states representing the progress of the saga. $Q = \{ q_0, q_{\text{success}}, q_{\text{aborted}} \} \cup \{ \text{Exec}_i \mid 1 \le i \le n \} \cup \{ \text{Comp}_i \mid 1 \le i < n \}$
  - $\text{Exec}_i$: State where the SEC is attempting to execute transaction $t_i$.
  - $\text{Comp}_i$: State where the SEC is attempting to execute compensation $c_i$.
- $\Sigma$ (Input Alphabet): The set of possible responses from the participant services. $\Sigma = \{ \omega, \varepsilon \}$
  - $\omega$: Represents a successful execution (ACK/Commit).
  - $\varepsilon$: Represents a failed execution (NACK/Error).
- $q_0$ (Initial State): The starting state is the execution of the first transaction. $q_0 = \text{Exec}_1$
- $F$ (Final States): The set of terminal states where the SEC halts. $F = \{ q_{\text{success}}, q_{\text{aborted}} \}$
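The core logic is defined by the transition function $\delta: Q \times \Sigma \to Q$.
If the current transaction succeeds, the SEC advances to the next transaction, or to $q_{\text{success}}$ after the last one: $\delta(\text{Exec}_i, \omega) = \text{Exec}_{i+1}$ for $i < n$, and $\delta(\text{Exec}_n, \omega) = q_{\text{success}}$.
If a transaction fails, the SEC switches to "compensating mode." If $t_1$ fails there is nothing to undo, so the SEC aborts directly: $\delta(\text{Exec}_i, \varepsilon) = \text{Comp}_{i-1}$ for $i > 1$, and $\delta(\text{Exec}_1, \varepsilon) = q_{\text{aborted}}$.
Once in compensating mode, successful execution of a compensation moves the SEC to the preceding compensation until the first step has been undone: $\delta(\text{Comp}_i, \omega) = \text{Comp}_{i-1}$ for $i > 1$, and $\delta(\text{Comp}_1, \omega) = q_{\text{aborted}}$.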
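The transition behaviour can be written as a small pure function. This is a minimal sketch, not a reference implementation: the tuple encoding of states and the names `delta`, `Response`, `SUCCESS` and `ABORTED` are assumptions made for illustration. It also assumes that a failure of $t_i$ starts compensation at $c_{i-1}$, which matches the state set, where $\text{Comp}_i$ exists only for $i < n$.

```python
from enum import Enum
from typing import Tuple, Union


class Response(Enum):
    OK = "omega"      # ω: participant acknowledged / committed
    FAIL = "epsilon"  # ε: participant reported an error


# A state is ("Exec", i), ("Comp", i), or one of the two terminal labels.
State = Union[Tuple[str, int], str]
SUCCESS = "q_success"
ABORTED = "q_aborted"


def delta(state: State, response: Response, n: int) -> State:
    """Transition function for a saga with n local transactions."""
    if state in (SUCCESS, ABORTED):
        raise ValueError("terminal states have no outgoing transitions")
    kind, i = state
    if kind == "Exec":
        if response is Response.OK:
            # Advance, or finish after the last transaction.
            return ("Exec", i + 1) if i < n else SUCCESS
        # t_i failed: undo t_{i-1}, ..., t_1 (nothing to undo when i == 1).
        return ("Comp", i - 1) if i > 1 else ABORTED
    if kind == "Comp":
        if response is Response.OK:
            return ("Comp", i - 1) if i > 1 else ABORTED
        # delta(Comp_i, FAIL) is deliberately left undefined by the model.
        raise ValueError("compensation failure is outside the model")
    raise ValueError(f"unknown state kind: {kind}")
```

Keeping `delta` free of side effects makes it easy to persist the current state between calls, which is what lets the coordinator resume after a crash.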
The SEC enforces that valid traces belong to one of two patterns:
- Commit Scenario: $\tau_{\text{commit}} = \langle t_1, t_2, \dots, t_n \rangle$
- Rollback Scenario (failure at $k$): $\tau_{\text{rollback}} = \langle t_1, \dots, t_{k-1}, t_k(\text{fail}), c_{k-1}, \dots, c_1 \rangle$
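The two trace patterns can be reproduced with a short coordinator loop. The sketch below is illustrative only: `run_saga` and the representation of each step as a pair of callables returning `True` on success are hypothetical choices, not part of the model above.

```python
from typing import Callable, List, Tuple

# Each step is (t_i, c_i); a callable returns True on success, False on failure.
Step = Tuple[Callable[[], bool], Callable[[], bool]]


def run_saga(steps: List[Step]) -> Tuple[str, List[str]]:
    """Run the steps in order and return (outcome, trace)."""
    trace: List[str] = []
    for k, (transaction, _) in enumerate(steps, start=1):
        if transaction():
            trace.append(f"t{k}")
            continue
        trace.append(f"t{k}(fail)")
        # Roll back c_{k-1}, ..., c_1 in reverse order.
        for j in range(k - 1, 0, -1):
            compensation = steps[j - 1][1]
            if not compensation():
                # Outside the model: compensations are assumed to succeed.
                raise RuntimeError(f"c{j} failed; manual intervention needed")
            trace.append(f"c{j}")
        return "aborted", trace
    return "committed", trace
```

With three steps that all succeed, the trace is `t1, t2, t3`. If the third step fails, the trace is `t1, t2, t3(fail), c2, c1`, which is the rollback pattern for $k = 3$.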
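The mathematical model above relies on the omission of the transition $\delta(\text{Comp}_i, \varepsilon)$: the case in which a compensation itself fails is left undefined.
The Perfect Compensation Hypothesis asserts that a compensating transaction $c_i$, once invoked, eventually succeeds, if necessary after repeated retries.
Mathematically, this transforms compensation into a deterministic termination property: every run that enters a state $\text{Comp}_i$ reaches $q_{\text{aborted}}$ after finitely many steps.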
- Importance: It guarantees mathematical closure. Without it, the automaton contains "trap states" where the system is neither successful nor aborted.
- Consequence of Failure: If a compensation fails permanently, the system enters a Heuristic Failure State.
- Broken Atomicity: The system is partially committed.
- Manual Intervention: The automaton cannot resolve the state; human intervention or external reconciliation scripts are required to restore consistency.
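In practice the hypothesis is usually approximated by retrying a compensation a bounded number of times and escalating if it still fails. The following is a minimal sketch under that assumption; `compensate_with_retry`, `HeuristicFailure`, and the backoff parameters are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


class HeuristicFailure(Exception):
    """Raised when a compensation cannot be completed automatically."""


def compensate_with_retry(compensation: Callable[[], bool],
                          max_attempts: int = 5,
                          base_delay: float = 0.1) -> int:
    """Retry until the compensation succeeds; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if compensation():
            return attempt
        if attempt < max_attempts:
            # Exponential backoff between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))
    # The automaton has no transition for this case; hand over to operators.
    raise HeuristicFailure(
        f"compensation failed after {max_attempts} attempts"
    )
```

A bounded retry count is a design trade-off: it prevents the coordinator from looping forever, at the cost of making the Heuristic Failure State reachable.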
For the SEC to function correctly, participating services must adhere to the following constraints.
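Due to the retry mechanism required by the Perfect Compensation Hypothesis, participants must guarantee idempotence. Let $f$ be any operation invoked by the SEC (a transaction $t_i$ or a compensation $c_i$) and $s$ a participant state; then $f(f(s)) = f(s)$.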
Applying the same operation multiple times produces the same side effect as applying it once. Without this, retries would corrupt data (e.g., double refunds).
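One common way to obtain idempotence is to attach a unique key to each request and have the participant ignore keys it has already processed. The sketch below assumes an in-memory participant; `RefundService` and `idempotency_key` are hypothetical names, and a real service would store processed keys durably alongside the balance.

```python
from typing import Dict, Set


class RefundService:
    """Toy participant that applies each refund request at most once."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}
        self._processed: Set[str] = set()

    def refund(self, idempotency_key: str, account: str, amount: int) -> int:
        # Apply the side effect only the first time this key is seen.
        if idempotency_key not in self._processed:
            self.balances[account] = self.balances.get(account, 0) + amount
            self._processed.add(idempotency_key)
        return self.balances.get(account, 0)
```

Calling `refund("saga-42-c1", "alice", 50)` twice leaves the balance at 50, so a retried compensation does not produce a double refund.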
Sagas are often structured around a "Pivot Point" rather than uniform compensability.
- Compensatable Transactions ($T_c$): Steps with a defined $c_i$.
- Pivot Transaction ($T_p$): The point of no return. If $T_p$ commits, the Saga guarantees success.
- Retriable Transactions ($T_r$): Steps after the pivot. They do not require compensation logic because the system is now committed to forward recovery only.
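The ordering implied by this classification can be checked before a saga is started. The sketch below assumes exactly one pivot per saga, which the text does not state explicitly; `Kind` and `is_valid_pivot_layout` are hypothetical names.

```python
from enum import Enum
from typing import Sequence


class Kind(Enum):
    COMPENSATABLE = "T_c"
    PIVOT = "T_p"
    RETRIABLE = "T_r"


def is_valid_pivot_layout(kinds: Sequence[Kind]) -> bool:
    """True if the steps read: compensatable*, pivot, retriable*."""
    steps = list(kinds)
    if steps.count(Kind.PIVOT) != 1:  # assumption: exactly one pivot
        return False
    p = steps.index(Kind.PIVOT)
    before, after = steps[:p], steps[p + 1:]
    return (all(k is Kind.COMPENSATABLE for k in before)
            and all(k is Kind.RETRIABLE for k in after))
```

A layout such as `[COMPENSATABLE, PIVOT, RETRIABLE]` passes, while a compensatable step placed after the pivot is rejected, since the saga can no longer roll back at that point.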
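Each participant must guarantee that local transaction $t_i$ is atomic and durable within its own service boundary, so that the response it reports ($\omega$ or $\varepsilon$) reflects a definite local outcome.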
Sagas typically lack global Isolation.
- Assumption: The business domain accepts "Dirty Reads."
- Consequence: External systems may view the changes made by $t_1$ before $t_n$ is reached. If the Saga subsequently rolls back, the external system may have acted on data that no longer exists.
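The dirty-read consequence can be illustrated with a toy inventory. This is a deliberately simplified sketch: the `stock` dictionary, the quantities, and the function names are invented for the example and stand in for a participant's database and an external reader.

```python
from typing import Dict, Tuple


def dirty_read_demo() -> Tuple[int, int]:
    """Return (value seen by an external reader, value after rollback)."""
    stock: Dict[str, int] = {"widget": 10}

    def t1_reserve() -> None:   # t_1: reserve 3 units, committed locally
        stock["widget"] -= 3

    def c1_release() -> None:   # c_1: semantic inverse of t_1
        stock["widget"] += 3

    t1_reserve()
    observed = stock["widget"]  # external system reads mid-saga
    # ... a later transaction fails, so the saga rolls back ...
    c1_release()
    return observed, stock["widget"]
```

The reader observes 7 units while the final value is 10, so any decision it made based on 7 rests on a state that was later undone.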