(the video is here, the original talk notes are here)
At least one of these four patterns is in play when people talk about "event-driven" architectures:
- Event Notification: components communicating via events
- Event-carried State Transfer: allowing components to access data without calling the source
- Event Sourcing: using an event log as the primary record for a system
- CQRS: having a separate component for updating a store from any readers of the store
Event Notification reverses the dependency between systems (a "dependency switch").
E.g. instead of having
Customer management (depends on) -> Insurance quoting
you have
Customer management -> fires event "address changed" -> Insurance quoting listens to that event and decides how to react, what process to trigger (e.g. re-quoting the insurance)
You "bottle" the change: the change becomes a first class citizen, we wrap the change as a record, an object that we can refer to, pass around, store, etc
"first class thing imagicks"
(when do we have events and when do we have commands?)
One more benefit: many other systems can hook into the event stream without touching the existing system.
So we are able to:
- Decouple receiver from sender
BUT we have
- No statement of overall behavior: the inability to see what's going on in the system as a whole (this is the dark side of event-driven architecture)
It's a trade-off.
Event Notification often involves additional traffic: the "consumer systems" may still have to go back to the "original source" to get more information about the change event they just consumed. You can reduce this burden by putting more details in the fired event.
A more extreme pattern is the "Event-carried State Transfer":
To avoid going back to the original source, the consumer system keeps a copy of all the data it is ever going to need.
The event source system has to broadcast in its events all the data that the downstream systems will need.
- => more decoupling
- => improved performance (reduced traffic between the systems)
- => reduced load on the supplier (the event source system)
- => improved availability (if the event source system goes down, the downstream systems can still work)
But... you have to replicate the data, and you now have eventual consistency!
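A minimal sketch of Event-carried State Transfer (the CustomerUpdated event, the local store, and the quoting function are illustrative assumptions, not from the talk):

```typescript
// Illustrative sketch, not from the talk: the event carries the full
// customer state, so the consumer never calls back to the source.
interface CustomerUpdated {
  customerId: string;
  name: string;
  address: string;
}

// The consumer keeps its own (eventually consistent) copy of the data.
const localCustomers = new Map<string, CustomerUpdated>();

function onCustomerUpdated(event: CustomerUpdated): void {
  localCustomers.set(event.customerId, event);
}

function quoteInsurance(customerId: string): string {
  // Works even while the customer system is down; the copy may lag
  // behind the source (eventual consistency).
  const customer = localCustomers.get(customerId);
  if (!customer) throw new Error(`unknown customer ${customerId}`);
  return `quote for ${customer.name} at ${customer.address}`;
}

onCustomerUpdated({ customerId: "c-42", name: "Ada", address: "9 Elm Rd" });
console.log(quoteInsurance("c-42"));
```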
With Event Sourcing you have a representation of the current state of the world, but you also have a log of all the events that happened.
E.g. changing a user's address:
- fire an event object ("address changed") and keep it in a separate persistent area (the event store)
- process the event to actually change the address
You now have two representations of the world:
- the Application State, the current representation of the world, and
- a log of all the events that changed that world
The defining test of Event Sourcing: at any time we can blow away the application state and confidently rebuild it from the log.
You are doing Event Sourcing if this statement is true for you.
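A sketch of that test in code (the event shape and the fold are illustrative assumptions): the log is the only input to rebuild(), so the application state can be thrown away at any time.

```typescript
// Illustrative sketch of the test: rebuild() takes nothing but the log.
interface AddressChanged {
  kind: "address-changed";
  customerId: string;
  address: string;
}

// A real system would have a union of many event types here.
type Event = AddressChanged;

type AppState = Map<string, string>; // customerId -> current address

function apply(state: AppState, event: Event): AppState {
  state.set(event.customerId, event.address);
  return state;
}

// Blow away the application state, rebuild it from the log alone.
function rebuild(log: Event[]): AppState {
  return log.reduce(apply, new Map<string, string>());
}

const log: Event[] = [
  { kind: "address-changed", customerId: "c-42", address: "1 Main St" },
  { kind: "address-changed", customerId: "c-42", address: "9 Elm Rd" },
];
console.log(rebuild(log).get("c-42")); // "9 Elm Rd"
```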
A real-world example of an event-sourced system is a version control system (Git, Subversion, ...):
- the application state is the working copy, the tree of files
- the log is the sequence of commits
And you also have snapshots: the application state at a given point in time, obtained by applying the stream of events up to that point.
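A sketch of snapshots, extending the rebuild()/apply() sketch above (the Snapshot shape is an assumption):

```typescript
// Illustrative sketch, reusing AppState, Event, apply, and rebuild
// from the previous snippet.
interface Snapshot {
  state: AppState;
  upToIndex: number; // index of the last event included in the state
}

function takeSnapshot(log: Event[], upToIndex: number): Snapshot {
  return { state: rebuild(log.slice(0, upToIndex + 1)), upToIndex };
}

function restore(snapshot: Snapshot, log: Event[]): AppState {
  // Copy the snapshot state so it stays reusable, then replay only
  // the events that came after the snapshot point.
  return log.slice(snapshot.upToIndex + 1).reduce(apply, new Map(snapshot.state));
}
```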
Another real-life example is a bank account: the transaction history is the event log, the current balance is the application state.
Benefits:
- Audit (the log is a complete record of what happened)
- Debugging (replay the events to reproduce a problem)
- Historic State (rebuild the state as of any past moment)
- Alternative State (explore "what if" scenarios by replaying a modified event log)
- Memory Image (keep the application state purely in memory, since it can always be rebuilt from the log)
Cons:
- More complexity
- Unfamiliar
- Dealing with external systems (every interaction with an external system should be turned into a corresponding event, so that you can save it and replay it if you need to)
- Event schema: how do you store events in a way that you can confidently replay them even as you change the code that processes them? (see the sketch after this list)
- Identifiers
- Versioning can be really tricky (unless you use frequent snapshots)
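One common way to cope with an evolving event schema is versioned events plus an "upcaster" that migrates old events to the current shape before replay. The talk names the problem, not this particular solution, and all the shapes below are illustrative:

```typescript
// Illustrative sketch of one common technique (versioned events plus
// an "upcaster"); the talk names the problem, not this solution.
interface AddressChangedV1 {
  version: 1;
  customerId: string;
  address: string; // old shape: one free-text field
}

interface AddressChangedV2 {
  version: 2;
  customerId: string;
  street: string;
  city: string;
}

type StoredAddressChanged = AddressChangedV1 | AddressChangedV2;

// Migrate old events to the current shape before replaying them,
// so the processing code only ever sees the latest version.
function upcast(event: StoredAddressChanged): AddressChangedV2 {
  if (event.version === 2) return event;
  const [street, city = "unknown"] = event.address.split(",");
  return {
    version: 2,
    customerId: event.customerId,
    street: street.trim(),
    city: city.trim(),
  };
}
```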
How do you capture business intention in the events?
For example, you may have
- an input event, which captures business semantics,
- an internal event, which captures change in records, and
- an output event, which captures observable change
Which is more important, the internal or the input event? Be clear about what you're storing. Most of the time you have to store both.
Greg Young: "Don't have any business logic in between your event and its storage", to avoid tangled versioning problems (e.g. replaying events after fixing a bug in the code).
But if you don't store the input event you lose the business intention.
Often the solution is: store both.
- Store the external event (the input event)
- And all the cascading events triggered by processing the incoming external event
So a canonical solution is to end up with a pipeline of events:
PROCESS -> (triggering) EVENTS -> PROCESSES -> EVENTS -> ...
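A sketch of such a pipeline (the event store, the event types, and the causedBy link are all illustrative assumptions):

```typescript
// Illustrative sketch: input events are stored as-is, and processing
// them appends cascading events linked back to their cause.
interface PipelineEvent {
  type: string;
  payload: unknown;
  causedBy?: number; // index of the event that triggered this one
}

const eventStore: PipelineEvent[] = [];

function record(event: PipelineEvent): number {
  eventStore.push(event);
  return eventStore.length - 1;
}

// 1. Store the input event first, with no business logic in between,
//    so it can always be replayed.
const inputIndex = record({
  type: "address-change-requested", // captures the business intention
  payload: { customerId: "c-42", address: "9 Elm Rd" },
});

// 2. Processing it emits cascading events, each pointing at its cause.
record({
  type: "customer-record-updated", // internal: change in records
  payload: { customerId: "c-42" },
  causedBy: inputIndex,
});
record({
  type: "requote-scheduled", // output: observable change downstream
  payload: { customerId: "c-42" },
  causedBy: inputIndex,
});
```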
CQRS: you separate the components that read from and write to your permanent store.
You have two separate models (they are effectively two separate pieces of software):
- the "Command Model", for dealing with updates, and
- the "Query Model", for dealing with reads.
Be careful: use it if you really need it, and really understand how it works, because it is difficult.
Many people confuse CQRS with having a separate READ store, but that is already a well-known solution: separate operational and reporting databases. It's not CQRS.
The key point of CQRS is that the command model (which updates) isn't used by anything for reading. They are distinct models!
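A minimal CQRS sketch (the two model classes are illustrative assumptions; here they share one in-memory map for brevity, where a real system might keep separate stores synchronized by events):

```typescript
// Illustrative sketch: two distinct models, where nothing reads
// through the command model.
interface ChangeAddress {
  customerId: string;
  address: string;
}

// Command Model: validates and applies updates; it is never queried.
class CustomerCommandModel {
  constructor(private store: Map<string, string>) {}

  handle(command: ChangeAddress): void {
    if (!command.address) throw new Error("address required");
    this.store.set(command.customerId, command.address);
  }
}

// Query Model: answers reads, possibly from a differently shaped
// (even denormalized) view; it never goes through the command model.
class CustomerQueryModel {
  constructor(private view: Map<string, string>) {}

  addressOf(customerId: string): string | undefined {
    return this.view.get(customerId);
  }
}

// One shared in-memory map for brevity; a real system might keep
// separate stores synchronized by events.
const store = new Map<string, string>();
new CustomerCommandModel(store).handle({ customerId: "c-42", address: "9 Elm Rd" });
console.log(new CustomerQueryModel(store).addressOf("c-42")); // "9 Elm Rd"
```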