THEORY: Distributed Transactions and why you should avoid them (2 Phase Commit, Saga Pattern, TCC, Idempotency, etc.)

Distributed Transactions and why you should avoid them

  1. Modern technologies don't support them (RabbitMQ, Kafka, etc.);
  2. They are a form of synchronous Inter-Process Communication, which reduces availability;
  3. All participants of the distributed transaction need to be available for the distributed commit, which, again, reduces availability.

Implementing business transactions that span multiple services is not straightforward. Distributed transactions are best avoided because of the CAP theorem. Moreover, many modern (NoSQL) databases don’t support them. The best solution is to use the Saga Pattern.

[...]

One of the most well-known patterns for distributed transactions is called Saga. The first paper about it was published back in 1987, and it has been a popular solution since then.

There are a couple of different ways to implement a saga transaction, but the two most popular are:

  • Events/Choreography: when there is no central coordination, each service produces and listens to other services' events and decides whether an action should be taken or not (see the sketch after this list);
  • Command/Orchestration: when a coordinator service is responsible for centralizing the saga's decision making and sequencing of business logic.
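
As a rough illustration of the choreography style, here is a minimal Java sketch. The EventBus interface and the event names (OrderCreated, PaymentApproved, PaymentFailed) are hypothetical placeholders, not part of any real framework:

```java
import java.util.function.Consumer;

// Minimal sketch of the Events/Choreography style: there is no central
// coordinator; each service subscribes to other services' events and decides
// locally whether to act and which event to publish next.
public class PaymentService {

    // Hypothetical messaging abstraction, only for illustration.
    public interface EventBus {
        void publish(String eventType, String orderId);
        void subscribe(String eventType, Consumer<String> handler);
    }

    private final EventBus bus;

    public PaymentService(EventBus bus) {
        this.bus = bus;
        // React to the order service's event; nobody tells this service what to do.
        bus.subscribe("OrderCreated", this::onOrderCreated);
    }

    private void onOrderCreated(String orderId) {
        try {
            charge(orderId);
            // The next service in the saga reacts to this event.
            bus.publish("PaymentApproved", orderId);
        } catch (RuntimeException e) {
            // Upstream services listen to this event and compensate (e.g. cancel the order).
            bus.publish("PaymentFailed", orderId);
        }
    }

    private void charge(String orderId) {
        // ... call the payment gateway here ...
    }
}
```

In the orchestration style, by contrast, this decision making would live in a single coordinator service instead of being spread across event handlers.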
rponte commented Sep 5, 2024

(image: distributed-SAGA-animated — an animated diagram of a distributed Saga)

rponte commented Sep 10, 2024

Exactly-once message processing

Distributed algorithms are difficult. If you find yourself struggling to understand one of them, we assure you – you are not alone. We have spent the last couple of years researching ways to ensure exactly-once message processing in systems that exchange messages in an asynchronous and durable way (a.k.a. message queues) and you know what? We still struggle and make silly mistakes. The reason is that even a very simple distributed algorithm generates vast numbers of possible execution paths.

A very good article, Exactly-once intuition, describes a set of heuristics that are very helpful in sketching the structure of an algorithm that achieves exactly-once message processing. Below is a summary of those heuristics:

  1. The transaction and the side effects: The outcome of processing a message consists of two parts: a transactional part and a side-effects part. The transaction consists of the application state change and of marking the incoming message as processed. The side effects include things like creating objects in non-transactional data stores (e.g. uploading a blob) and sending messages;
  2. Until the transaction is committed, nothing happened: In order for an algorithm to behave correctly, it has to guarantee that until a transaction is committed, no effects of the processing are visible to the external observers.
  3. Prepare - Commit - Publish: [...] For this reason, any correct algorithm has to make sure the side effects are made durable, but not visible (prepared), before the transaction is committed. Then, after the commit, the side effects are published (see the outbox sketch after this list).
  4. Side effects stored in each processing attempt are isolated: [...] In our PDF example each processing attempt would generate its own PDF document but only the attempt that succeeded to commit would publish its outgoing messages, announcing to the world the true location of the PDF.
  5. Register - Cleanup: Although we can’t avoid generating garbage, a well-behaved algorithm ensures that the garbage is eventually cleaned up.
  6. Concurrency control ensures serialization of processing: [...] the outbox record also contains the side effects information. It can exist in only two states: created and dispatched. The transition from created to dispatched does not generate any new information so it does not require concurrency control to prevent lost writes.
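
To make heuristics 2, 3 and 6 more concrete, below is a minimal transactional-outbox sketch in plain Java/JDBC. The table names (processed_messages, outbox), the assumed unique constraint on message_id, and the two-state outbox rows (created → dispatched) are illustration-only assumptions that follow the spirit of the article, not its actual code:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of "Prepare - Commit - Publish" with a transactional outbox.
public class OutboxProcessor {

    // Transactional part: application state change + marking the incoming
    // message as processed + recording the side effect (outbox row in state
    // 'created'). Nothing is visible to the outside until commit.
    void process(Connection cn, String messageId, String payload) throws SQLException {
        cn.setAutoCommit(false);
        try {
            try (PreparedStatement st = cn.prepareStatement(
                    "INSERT INTO processed_messages (message_id) VALUES (?)")) {
                // message_id is assumed unique: a redelivered message violates
                // the constraint, the transaction rolls back, nothing is published twice.
                st.setString(1, messageId);
                st.executeUpdate();
            }
            // ... apply the application state change here ...
            try (PreparedStatement st = cn.prepareStatement(
                    "INSERT INTO outbox (message_id, payload, status) VALUES (?, ?, 'created')")) {
                st.setString(1, messageId);
                st.setString(2, payload);
                st.executeUpdate();
            }
            cn.commit(); // until this line, "nothing happened"
        } catch (SQLException e) {
            cn.rollback();
            throw e;
        }
    }

    // Publish part, run after the commit (possibly by a separate dispatcher):
    // send the prepared side effect and flip the outbox row to 'dispatched'.
    void dispatch(Connection cn, String messageId, String payload, MessageBroker broker) throws SQLException {
        broker.send(payload); // safe to repeat: downstream consumers deduplicate by messageId
        try (PreparedStatement st = cn.prepareStatement(
                "UPDATE outbox SET status = 'dispatched' WHERE message_id = ?")) {
            st.setString(1, messageId);
            st.executeUpdate();
        }
    }

    // Hypothetical broker abstraction, only for illustration.
    interface MessageBroker { void send(String payload); }
}
```

The key point is the ordering: the side effect is made durable inside the same transaction that marks the message as processed (prepare), nothing is visible until commit, and publication happens only afterwards.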

rponte commented Sep 18, 2024

Scaling Shared Data in Distributed Systems

  • Consistency, by definition, requires linearizability. In multi-threaded programs, we achieve this with mutexes. In distributed systems, we use transactions and distributed locking. Intuitively, both involve performance trade-offs.

  • There are several different strategies, each with their own pros and cons: Immutable Data > Last-Write Wins > Application-Level Conflict Resolution > Causal Ordering > Distributed Data Types (a minimal last-write-wins sketch follows this list)

  • Use weakly consistent models when you can because they afford you high availability and low latency, and rely on stronger models only when absolutely necessary. Do what makes sense for your system.
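
As a tiny illustration of one of the weakly consistent strategies listed above, here is a minimal last-write-wins register sketch in Java. It is only meant to show the trade-off (concurrent writes are resolved by timestamp, so the older write is silently dropped), not a production-grade conflict-resolution mechanism:

```java
import java.time.Instant;

// Minimal Last-Write-Wins register: each replica keeps the value with the
// highest timestamp; concurrent writes are resolved by "latest timestamp wins".
public class LwwRegister<T> {

    private T value;
    private Instant writtenAt = Instant.MIN;

    // Merge an update coming from another replica (or a local write).
    public synchronized void merge(T incomingValue, Instant incomingTimestamp) {
        if (incomingTimestamp.isAfter(writtenAt)) {
            this.value = incomingValue;
            this.writtenAt = incomingTimestamp;
        }
        // Older (or equal) timestamps lose: that write is silently dropped,
        // which is exactly the availability-vs-consistency trade-off of LWW.
    }

    public synchronized T get() {
        return value;
    }
}
```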

rponte commented Oct 11, 2024

Some interesting articles from the LittleHorse blog:

While Saga is very hard to implement, it's simple to describe:

  • Try to perform the actions across the multiple systems.
  • If one of the actions fails, then run a compensation for all previously-executed tasks.

The compensation is simply an action that "undoes" the previous action. For example, the compensation for a payment task might be to issue a refund.
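
A minimal sketch of that "try, then compensate on failure" loop in Java. The Step interface and the payment/refund pairing are illustrative; a real saga implementation would also persist its progress so compensations survive a crash:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Sketch of a saga run: execute each step in order and, if one fails, run
// the compensation of every previously-executed step in reverse order.
public class Saga {

    public interface Step {
        void execute();     // e.g. charge the payment
        void compensate();  // e.g. issue a refund
    }

    public void run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        try {
            for (Step step : steps) {
                step.execute();
                completed.push(step); // remember it so it can be undone later
            }
        } catch (RuntimeException failure) {
            // One action failed: compensate all previously-executed steps (LIFO).
            while (!completed.isEmpty()) {
                completed.pop().compensate();
            }
            throw failure;
        }
    }
}
```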

But what is a workflow engine?

It is a system that allows you to reliably execute a series of steps while being robust to technical failures (network outages, crashes) and business process failures. A step in a workflow can be calling a piece of code on a server, reaching out to an external API, waiting for a callback from a person or external system, or more.

A core challenge when automating a business process is Failure and Exception Handling: figuring out what to do when something doesn't happen, happens with an unexpected outcome, or simply fails outright. This is often difficult to reason about, leaving your applications vulnerable to uncaught exceptions, incomplete business workflows, or data loss.

A workflow engine standardizes how to throw an exception, where the exception is logged, and the logic around when/how to retry. This gives you peace of mind that once a workflow run is started, it will reliably complete.
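
As a rough sketch of the retry part of that standardization, here is a minimal retry-with-backoff runner in Java. The numbers (attempt limit, exponential backoff starting at one second) are arbitrary illustration values, not defaults of any particular workflow engine:

```java
import java.time.Duration;

// Sketch of retrying a single workflow step on technical failures.
public class RetryingStepRunner {

    public interface Step { void run() throws Exception; }

    public void runWithRetry(Step step, int maxAttempts) throws Exception {
        Duration backoff = Duration.ofSeconds(1);
        for (int attempt = 1; ; attempt++) {
            try {
                step.run();
                return; // step completed, the workflow can move on
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    // Out of retries: surface the failure so the workflow can
                    // route it to an exception handler / compensation path.
                    throw e;
                }
                Thread.sleep(backoff.toMillis()); // wait before trying again
                backoff = backoff.multipliedBy(2); // exponential backoff
            }
        }
    }
}
```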

rponte commented Nov 6, 2024

Sequin

Sequin is a tool for capturing changes and streaming data out of your Postgres database.

  • No such thing as exactly-once delivery
    • Processing is the full message lifecycle: the message was delivered to the receiver, the receiver did its job, and then the receiver acknowledged the message.

      With that definition, SQS, Kafka, and Sequin are all systems that guarantee exactly-once processing. The term processing captures both the delivery of the message and the successful acknowledgment of the message.

    • In my mind, the terms at-most-once and at-least-once delivery help us distinguish between delivery mechanics. And the term "exactly-once processing" indicates it's a messaging system with at-least-once delivery and acknowledgments.

    • A debate over a GitHub issue - At the end of the day, perfect exactly-once mechanics are a platonic ideal. And a system can only bring you so far; at some point you must implement idempotency on the client if your requirements demand it (see the idempotent consumer sketch below).
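
A minimal sketch of what "idempotency on the client" can look like in Java: the consumer deduplicates by message id, so at-least-once delivery still results in at most one execution of the business logic per message. In a real system the set of processed ids would live in the same transactional store as the state change (as in the outbox heuristics above); an in-memory set is used here only to keep the example short:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an idempotent consumer sitting behind an at-least-once queue.
public class IdempotentConsumer {

    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    public void onMessage(String messageId, String payload) {
        // add() returns false if the id was already there -> duplicate delivery
        if (!processedIds.add(messageId)) {
            return; // already processed: acknowledging again is harmless
        }
        handle(payload); // the actual business logic runs once per message id
    }

    private void handle(String payload) {
        // ... apply the state change ...
    }
}
```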
