State machines are everywhere in interactive systems, but they're rarely defined clearly and explicitly. Given some big blob of code including implicit state machines, which transitions are possible and under what conditions? What effects take place on what transitions?
There are existing design patterns for state machines, but all the patterns I've seen complect side effects with the structure of the state machine itself. Instances of these patterns are difficult to test without mocking, and they end up with more dependencies. Worse, the classic patterns compose poorly: hierarchical state machines are typically not straightforward extensions. The functional programming world has solutions, but they don't transpose neatly enough to be broadly usable in mainstream languages.
Here I present a composable pattern for pure state machines with effects, meant to be idiomatic in a typical imperative context. The pure-impure separation is inspired by functional core, imperative shell; the modeling of novelty and effect by Elm; the hierarchical composition by Harel statecharts.
The solution is implemented in Swift because it's a typical imperative context, and because Swift's types help illustrate the structure I'm suggesting.
The key idea is that states, transitions, and specifications of effect are modeled by a pure value type; a separate dumb object actuates those effects (as in Elm; or think of the traditional Command pattern).
protocol StateType {
/// Events are effectful inputs from the outside world which the state reacts to, described by some
/// data type. For instance: a button being clicked, or some network data arriving.
associatedtype InputEvent
/// Commands are effectful outputs which the state desires to have performed on the outside world.
/// For instance: showing an alert, transitioning to some different UI, etc.
associatedtype OutputCommand
/// In response to an event, a state may transition to some new value, and it may emit a command.
mutating func handleEvent(event: InputEvent) -> OutputCommand
// If you're not familiar with Swift, the mutation semantics here may seem like a very big red
// flag, destroying the purity of this type. In fact, because states have *value semantics*,
// mutation becomes mere syntax sugar. From a semantic perspective, any call to this method
// creates a new instance of StateType; no code other than the caller has visibility to the
// change; the normal perils of mutability do not apply.
//
// If this is confusing, keep in mind that we could equivalently define this as a function
// which returns both a new state value and an optional OutputCommand (it just creates some
// line noise later):
// func handleEvent(event: InputEvent) -> (Self, OutputCommand)
/// State machines must specify an initial value.
static var initialState: Self { get }
// Traditional models often allow states to specify commands to be performed on entry or
// exit. We could add that, or not.
}
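The equivalence claimed in the comments above (a mutating method versus a function returning a new state plus a command) can be sketched with a toy machine. This is only an illustration: `ToggleState`, its events and commands, and the `handling(_:)` helper are hypothetical names that do not appear elsewhere in this document, and the snippet uses current Swift syntax, so the event parameter is explicitly unlabeled with `_`.

```swift
// A hypothetical two-state toggle, used only to compare the two signatures.
enum ToggleState {
    case Off
    case On

    enum Event { case Press }
    enum Command { case TurnLightOn, TurnLightOff }

    // Mutating form, as in StateType: updates self in place and returns a command.
    mutating func handleEvent(_ event: Event) -> Command? {
        switch (self, event) {
        case (.Off, .Press):
            self = .On
            return .TurnLightOn
        case (.On, .Press):
            self = .Off
            return .TurnLightOff
        }
    }

    // Pure form: copies the value, mutates the copy, and returns both results.
    func handling(_ event: Event) -> (ToggleState, Command?) {
        var next = self
        let command = next.handleEvent(event)
        return (next, command)
    }
}

let original = ToggleState.Off
let (next, command) = original.handling(.Press)
// `original` is still .Off: only the local copy inside handling(_:) was mutated.
```

Because `ToggleState` is a value type, the mutation inside `handling(_:)` never escapes the copy, which is the sense in which `mutating` is only syntax sugar here.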
Here's an example based on "A Pattern Language of Statecharts," figure 3:
/// First up, the StateType. Note that this type is isolated, pure, trivial to fully test with zero mocks/stubs, etc.
enum TurnstyleState: StateType {
case Locked(credit: Int)
case Unlocked
indirect case Broken(oldState: TurnstyleState)
enum Event {
case InsertCoin(value: Int)
case AdmitPerson
case MachineDidFail
case MachineRepairDidComplete
}
enum Command {
case SoundAlarm
case CloseDoors
case OpenDoors
}
static let initialState = TurnstyleState.Locked(credit: 0)
private static let farePrice = 50
// Try to picture the boxes-and-arrows diagram of this state machine using this
// method. Hopefully the structure makes this translation mostly straightforward.
mutating func handleEvent(event: Event) -> Command? {
switch (self, event) {
case (.Locked(let credit), .InsertCoin(let value)):
let newCredit = credit + value
if newCredit >= TurnstyleState.farePrice {
self = .Unlocked
return .OpenDoors
} else {
self = .Locked(credit: newCredit)
}
case (.Locked, .AdmitPerson):
return .SoundAlarm
case (.Locked, .MachineDidFail):
self = .Broken(oldState: self)
case (.Unlocked, .AdmitPerson):
self = .Locked(credit: 0)
return .CloseDoors
case (.Unlocked, .MachineDidFail):
self = .Broken(oldState: self)
case (.Broken, .MachineRepairDidComplete):
self = .Locked(credit: 0)
default: break
// Or should we throw?
// In a more ideal world, it should be impossible to write code which reaches this line (I discuss
// how this might be achieved at the very end).
}
return nil
}
}
/// Now, an imperative shell that hides the enums and delegates to actuators.
/// Note that it has no domain knowledge: it just connects object interfaces.
class TurnstyleController {
private var currentState = TurnstyleState.initialState
private let doorHardwareController: DoorHardwareController
private let speakerController: SpeakerController
init(doorHardwareController: DoorHardwareController, speakerController: SpeakerController) {
self.doorHardwareController = doorHardwareController
self.speakerController = speakerController
self.doorHardwareController.opticalPersonPassingSensorSignal
.takeUntilObjectDeallocates(self)
.subscribeNext { [unowned self] in self.handleEvent(.AdmitPerson) }
}
func customerDidInsertCoin(value: Int) {
self.handleEvent(.InsertCoin(value: value))
}
func mechanicDidCompleteRepair() {
self.handleEvent(.MachineRepairDidComplete)
}
private func handleEvent(event: TurnstyleState.Event) {
switch currentState.handleEvent(event) {
case .SoundAlarm?:
self.speakerController.soundTheAlarm()
case .CloseDoors?:
self.doorHardwareController.sendControlSignalToCloseDoors()
case .OpenDoors?:
self.doorHardwareController.sendControlSignalToOpenDoors()
case nil:
break
}
}
}
protocol DoorHardwareController {
func sendControlSignalToOpenDoors()
func sendControlSignalToCloseDoors()
/// A signal which fires an event whenever the physical door hardware detects a person passing through.
var opticalPersonPassingSensorSignal: Signal<(), NoError> { get }
// If the idea of a Signal is not familiar, don't worry about it: it's not terribly
// important here. This could be a callback or a delegate method instead.
}
protocol SpeakerController {
func soundTheAlarm()
}
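To make the "zero mocks/stubs" claim concrete, here is a rough sketch of what a test of the pure state type could look like. It assumes a trimmed copy of `TurnstyleState` (the `Broken` case and its events are left out so the snippet stands alone), adds an `Equatable` conformance for the assertions, and introduces a hypothetical `run` helper; it is written for current Swift, so the event parameter is unlabeled with `_`.

```swift
// Trimmed copy of TurnstyleState so the snippet is self-contained
// (the Broken case and the failure/repair events are omitted).
enum TurnstyleState: Equatable {
    case Locked(credit: Int)
    case Unlocked

    enum Event {
        case InsertCoin(value: Int)
        case AdmitPerson
    }
    enum Command { case SoundAlarm, CloseDoors, OpenDoors }

    static let initialState = TurnstyleState.Locked(credit: 0)
    private static let farePrice = 50

    mutating func handleEvent(_ event: Event) -> Command? {
        switch (self, event) {
        case (.Locked(let credit), .InsertCoin(let value)):
            let newCredit = credit + value
            if newCredit >= TurnstyleState.farePrice {
                self = .Unlocked
                return .OpenDoors
            }
            self = .Locked(credit: newCredit)
            return nil
        case (.Locked, .AdmitPerson):
            return .SoundAlarm
        case (.Unlocked, .AdmitPerson):
            self = .Locked(credit: 0)
            return .CloseDoors
        case (.Unlocked, .InsertCoin):
            return nil
        }
    }
}

// Hypothetical helper: feeds events in order and collects the emitted commands.
func run(_ events: [TurnstyleState.Event]) -> (TurnstyleState, [TurnstyleState.Command]) {
    var state = TurnstyleState.initialState
    var commands: [TurnstyleState.Command] = []
    for event in events {
        if let command = state.handleEvent(event) {
            commands.append(command)
        }
    }
    return (state, commands)
}

let (finalState, emitted) = run([
    .AdmitPerson,            // locked, so the alarm sounds
    .InsertCoin(value: 20),  // credit 20, still locked
    .InsertCoin(value: 30),  // credit 50 meets the fare: unlock
    .AdmitPerson             // person passes, relock
])
// finalState and emitted can now be compared against plain values.
```

No door or speaker hardware is involved: the test inspects values only, and the controller's wiring can be checked separately.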
Now, if you have a lot of states, you might want to start grouping them into "child" state machines, especially if there's a significant shared semantic. Here I rephrase the example state machine hierarchically: an inner state machine and an outer one, following figure 6 from "A Pattern Language of Statecharts."
Critically, no other types need to change: the code above can use HierarchicalTurnstyleState as a drop-in replacement. StateTypes are closed under hierarchical composition. This is a natural consequence of the data-oriented design, which takes advantage of the simpler fact that enums are closed over carrying instances of other enums as an associated value, and that functions are closed over calling other functions. Nothing more complicated is required.
enum HierarchicalTurnstyleState: StateType {
case Functioning(FunctioningTurnstyleState)
case Broken(oldState: FunctioningTurnstyleState)
enum Event {
case InsertCoin(value: Int)
case AdmitPerson
case MachineDidFail
case MachineRepairDidComplete
}
typealias Command = FunctioningTurnstyleState.Command // Of course, this could be some more complicated composition.
static let initialState = HierarchicalTurnstyleState.Functioning(FunctioningTurnstyleState.initialState)
mutating func handleEvent(event: Event) -> Command? {
switch (self, event) {
case (.Functioning(let state), .MachineDidFail):
self = .Broken(oldState: state)
case (.Functioning(let state), _) where FunctioningTurnstyleState.Event(event) != nil:
var newState = state
let command = newState.handleEvent(FunctioningTurnstyleState.Event(event)!)
self = .Functioning(newState)
return command
case (.Broken(let oldState), .MachineRepairDidComplete):
self = .Functioning(oldState)
default:
break // or maybe throw, etc
}
return nil
}
}
// Note that this could be private, or not. It's just as much of a StateType as HierarchicalTurnstyleState.
enum FunctioningTurnstyleState: StateType {
case Locked(credit: Int)
case Unlocked
enum Event {
case InsertCoin(value: Int)
case AdmitPerson
}
enum Command {
case SoundAlarm
case CloseDoors
case OpenDoors
}
static let initialState = FunctioningTurnstyleState.Locked(credit: 0)
private static let farePrice = 50
mutating func handleEvent(event: Event) -> Command? {
// Now somewhat simpler because the turnstyle can't be broken.
switch (self, event) {
case (.Locked(let credit), .InsertCoin(let value)):
let newCredit = credit + value
if newCredit >= FunctioningTurnstyleState.farePrice {
self = .Unlocked
return .OpenDoors
} else {
self = .Locked(credit: newCredit)
}
case (.Locked, .AdmitPerson):
return .SoundAlarm
case (.Unlocked, .AdmitPerson):
self = .Locked(credit: 0)
return .CloseDoors
default:
break // or maybe throw, etc
}
return nil
}
}
// Conversion between the layers. You could approach this a variety of ways.
extension FunctioningTurnstyleState.Event {
init?(_ event: HierarchicalTurnstyleState.Event) {
switch event {
case .InsertCoin(let value):
self = .InsertCoin(value: value)
case .AdmitPerson:
self = .AdmitPerson
case .MachineDidFail, .MachineRepairDidComplete:
return nil
}
}
}
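As noted in the comment above, the conversion between layers can be approached in several ways. One alternative, sketched here under my own assumptions, is for the parent event to wrap child events in a dedicated case instead of converting with a failable initializer. The `Inner` case is hypothetical, commands are omitted to keep the focus on state, the fare of 50 is inlined, and the syntax is current Swift.

```swift
// Trimmed stand-ins for the two layers; commands are left out for brevity.
enum FunctioningTurnstyleState: Equatable {
    case Locked(credit: Int)
    case Unlocked

    enum Event { case InsertCoin(value: Int), AdmitPerson }

    mutating func handleEvent(_ event: Event) {
        switch (self, event) {
        case (.Locked(let credit), .InsertCoin(let value)):
            let newCredit = credit + value
            if newCredit >= 50 {
                self = .Unlocked
            } else {
                self = .Locked(credit: newCredit)
            }
        case (.Unlocked, .AdmitPerson):
            self = .Locked(credit: 0)
        default:
            break
        }
    }
}

enum HierarchicalTurnstyleState: Equatable {
    case Functioning(FunctioningTurnstyleState)
    case Broken(oldState: FunctioningTurnstyleState)

    enum Event {
        case Inner(FunctioningTurnstyleState.Event)  // hypothetical wrapper for child events
        case MachineDidFail
        case MachineRepairDidComplete
    }

    mutating func handleEvent(_ event: Event) {
        switch (self, event) {
        case (.Functioning(let state), .MachineDidFail):
            self = .Broken(oldState: state)
        case (.Functioning(let state), .Inner(let innerEvent)):
            var newState = state
            newState.handleEvent(innerEvent)  // delegate to the child machine
            self = .Functioning(newState)
        case (.Broken(let oldState), .MachineRepairDidComplete):
            self = .Functioning(oldState)
        default:
            break
        }
    }
}

var turnstile = HierarchicalTurnstyleState.Functioning(.Locked(credit: 0))
turnstile.handleEvent(.Inner(.InsertCoin(value: 20)))
turnstile.handleEvent(.MachineDidFail)
turnstile.handleEvent(.Inner(.InsertCoin(value: 30)))  // ignored while broken
turnstile.handleEvent(.MachineRepairDidComplete)
// The credit of 20 survives the failure and the repair.
```

The trade-off is that callers must know which layer an event belongs to, whereas the failable-initializer approach above keeps a flat event vocabulary at the parent level.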
The hierarchical state machines introduced above were always alternations: the parent machine is always in a state of one of its child machines. But instead the composition might be a Cartesian product, where the parent runs the child machines simultaneously, and its effective state is a tuple of the child states.
The relationship between this kind of composition (orthogonal) and the earlier kind of composition (alternative) is the same as the relationship between product types (structs, tuples) and sum types (enums), so we see that reflected in the declaration below.
For instance, a keyboard might have two different state machines running simultaneously: one handles the toggling behavior of the main keypad's caps lock key and its impact on keypad behavior; the other handles the same thing for the numpad and num lock.
struct KeyboardState: StateType {
var mainKeypadState: MainKeypadState
var numericKeypadState: NumericKeypadState
enum HardwareKeyEvent {
// A quick and dirty definition.
case Alpha(Character)
case Number(Int)
case NumLock
case CapsLock
}
static let initialState = KeyboardState(mainKeypadState: MainKeypadState.initialState, numericKeypadState: NumericKeypadState.initialState)
mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
if let mainKeypadEvent = MainKeypadState.HardwareKeyEvent(event) {
return mainKeypadState.handleEvent(mainKeypadEvent)
} else if let numericKeypadEvent = NumericKeypadState.HardwareKeyEvent(event) {
return numericKeypadState.handleEvent(numericKeypadEvent)
} else {
return nil
}
}
}
enum MainKeypadState: StateType {
case CapsLockOff
case CapsLockOn
enum HardwareKeyEvent {
case Alpha(Character)
case CapsLock
}
static let initialState = MainKeypadState.CapsLockOff
mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
switch (self, event) {
case (.CapsLockOff, .CapsLock):
self = .CapsLockOn
case (.CapsLockOff, .Alpha(let c)):
return .AlphaNumeric(c)
case (.CapsLockOn, .CapsLock):
self = .CapsLockOff
case (.CapsLockOn, .Alpha(let c)):
return .AlphaNumeric(c.uppercaseCharacter)
}
return nil
}
}
enum NumericKeypadState: StateType {
case NumLockOff
case NumLockOn
enum HardwareKeyEvent {
case Number(Int)
case NumLock
}
static let initialState = NumericKeypadState.NumLockOn
mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
switch (self, event) {
case (.NumLockOff, .NumLock):
self = .NumLockOn
case (.NumLockOff, .Number(let n)):
if let arrowDirection = ArrowDirection(numericKeypadInput: n) {
return .Arrow(arrowDirection)
}
case (.NumLockOn, .NumLock):
self = .NumLockOff
case (.NumLockOn, .Number(let n)):
return .AlphaNumeric("\(n)".characters.first!)
}
return nil
}
}
enum KeyboardOutputCommand {
case AlphaNumeric(Character)
case Arrow(ArrowDirection)
}
enum ArrowDirection {
case Up, Down, Left, Right
}
extension ArrowDirection {
// Mapping the numpad keys to arrow keys.
init?(numericKeypadInput: Int) {
switch numericKeypadInput {
case 2: self = .Up
case 4: self = .Left
case 6: self = .Right
case 8: self = .Down
default: return nil
}
}
}
// We can imagine various other constructions which might make this kind of glue lighter.
extension MainKeypadState.HardwareKeyEvent {
init?(_ hardwareKeyEvent: KeyboardState.HardwareKeyEvent) {
switch hardwareKeyEvent {
case .Alpha(let c): self = .Alpha(c)
case .CapsLock: self = .CapsLock
case .Number, .NumLock: return nil
}
}
}
extension NumericKeypadState.HardwareKeyEvent {
init?(_ hardwareKeyEvent: KeyboardState.HardwareKeyEvent) {
switch hardwareKeyEvent {
case .Alpha, .CapsLock: return nil
case .Number(let n): self = .Number(n)
case .NumLock: self = .NumLock
}
}
}
extension Character {
var uppercaseCharacter: Character {
return Character(String(self).uppercaseString)
}
}
Or more generally (but it's ugly and non-idiomatic):
struct OrthogonalState<StateA: StateType, StateB: StateType>: StateType {
var stateA: StateA
var stateB: StateB
static var initialState: OrthogonalState<StateA, StateB> {
return OrthogonalState(stateA: StateA.initialState, stateB: StateB.initialState)
}
mutating func handleEvent(event: OrthogonalStateEvent<StateA, StateB>) -> OrthogonalStateCommand<StateA, StateB>? {
// In some orthogonal state machines, certain events will want to be dispatched to both children. We'd need a somewhat more complex structure for that.
switch event {
case .A(let event):
return stateA.handleEvent(event).map(OrthogonalStateCommand.A)
case .B(let event):
return stateB.handleEvent(event).map(OrthogonalStateCommand.B)
}
}
}
// Not allowed to nest types in generic types.
enum OrthogonalStateEvent<StateA: StateType, StateB: StateType> {
case A(StateA.Event)
case B(StateB.Event)
}
enum OrthogonalStateCommand<StateA: StateType, StateB: StateType> {
case A(StateA.Command)
case B(StateB.Command)
}
// But hey, I can now define the same KeyboardState interface as an Adapter on OrthogonalState. Not that this is particularly cleaner, but it does contain less domain knowledge:
struct KeyboardState2: StateType {
private typealias OrthogonalStateType = OrthogonalState<MainKeypadState, NumericKeypadState>
private var orthogonalState: OrthogonalStateType
enum HardwareKeyEvent {
// A quick and dirty definition.
case Alpha(Character)
case Number(Int)
case NumLock
case CapsLock
private var orthogonalEvent: OrthogonalStateType.Event {
switch self {
case .Alpha(let c): return .A(.Alpha(c))
case .Number(let n): return .B(.Number(n))
case .NumLock: return .B(.NumLock)
case .CapsLock: return .A(.CapsLock)
}
}
}
static let initialState = KeyboardState2(orthogonalState: OrthogonalStateType.initialState)
mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
switch orthogonalState.handleEvent(event.orthogonalEvent) {
case .A(let c)?: return c
case .B(let c)?: return c
case nil: return nil
}
}
}
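The comment inside OrthogonalState's handleEvent mentions that some events may need to reach both children. One possible shape for that "somewhat more complex structure" is sketched below. Everything here is an assumption made for illustration: the simplified `StateType` restatement (with associated types named `Event` and `Command`), `BroadcastEvent`, `BroadcastCommand`, `BroadcastingOrthogonalState`, and the toy `LockKeyState` are all hypothetical, and the syntax is current Swift.

```swift
// Simplified restatement of StateType for current Swift (names are assumptions).
protocol StateType {
    associatedtype Event
    associatedtype Command
    static var initialState: Self { get }
    mutating func handleEvent(_ event: Event) -> Command?
}

// Like OrthogonalStateEvent, plus a case addressed to both children.
enum BroadcastEvent<StateA: StateType, StateB: StateType> {
    case A(StateA.Event)
    case B(StateB.Event)
    case Both(StateA.Event, StateB.Event)
}

enum BroadcastCommand<StateA: StateType, StateB: StateType> {
    case A(StateA.Command)
    case B(StateB.Command)
}

struct BroadcastingOrthogonalState<StateA: StateType, StateB: StateType> {
    var stateA = StateA.initialState
    var stateB = StateB.initialState

    // Returns zero, one, or two commands, because both children may respond.
    mutating func handleEvent(_ event: BroadcastEvent<StateA, StateB>) -> [BroadcastCommand<StateA, StateB>] {
        var commands: [BroadcastCommand<StateA, StateB>] = []
        switch event {
        case .A(let eventA):
            if let command = stateA.handleEvent(eventA) { commands.append(.A(command)) }
        case .B(let eventB):
            if let command = stateB.handleEvent(eventB) { commands.append(.B(command)) }
        case .Both(let eventA, let eventB):
            if let command = stateA.handleEvent(eventA) { commands.append(.A(command)) }
            if let command = stateB.handleEvent(eventB) { commands.append(.B(command)) }
        }
        return commands
    }
}

// A hypothetical lock-key machine, reused for both caps lock and num lock.
enum LockKeyState: StateType {
    case Off
    case On

    enum Event { case Toggle, Reset }
    enum Command { case SetIndicatorLight(Bool) }

    static let initialState = LockKeyState.Off

    mutating func handleEvent(_ event: Event) -> Command? {
        switch event {
        case .Toggle: self = (self == .Off) ? .On : .Off
        case .Reset: self = .Off
        }
        return .SetIndicatorLight(self == .On)
    }
}

var keyboard = BroadcastingOrthogonalState<LockKeyState, LockKeyState>()
_ = keyboard.handleEvent(.A(.Toggle))                            // first child turns on
let resetCommands = keyboard.handleEvent(.Both(.Reset, .Reset))  // one event reaches both children
```

Returning an array changes the command type compared with OrthogonalState, so an adapter like KeyboardState2 would need to decide how to flatten or sequence multiple commands.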
The use of enums in this pattern is not terribly idiomatic. In a fundamentally imperative language, we usually describe events with methods. Unfortunately, methods aren't as flexible in use-vs-reference as enums, so state machines defined in this way don't compose quite so cleanly.
Another issue with using functions to specify events is that they make the machine harder to "read" by emphasizing transitions over states. Compare this definition of TurnstyleState to the initial one; I argue it's much harder to picture the boxes-and-arrows graph. (Note that it's impossible to implement OrthogonalState on top of this structure.)
enum TurnstyleState2 {
case Locked(credit: Int)
case Unlocked
indirect case Broken(oldState: TurnstyleState2)
enum Command {
case SoundAlarm
case CloseDoors
case OpenDoors
}
static let initialState = TurnstyleState2.Locked(credit: 0)
private static let farePrice = 50
func insertCoin(value: Int) -> (TurnstyleState2, Command?) {
switch self {
case .Locked(let credit):
let newCredit = credit + value
if newCredit >= TurnstyleState2.farePrice {
return (.Unlocked, .OpenDoors)
} else {
return (.Locked(credit: newCredit), nil)
}
case .Unlocked, .Broken:
return (self, nil)
}
}
func admitPerson() -> (TurnstyleState2, Command?) {
switch self {
case .Locked:
return (self, .SoundAlarm)
case .Unlocked:
return (.Locked(credit: 0), .CloseDoors)
case .Broken:
return (self, nil)
}
}
func machineDidFail() -> TurnstyleState2 {
switch self {
case .Locked, .Unlocked:
return .Broken(oldState: self)
case .Broken:
return self // Or throw?
}
}
func machineRepairDidComplete() -> TurnstyleState2 {
switch self {
case .Broken(let oldState):
return oldState
case .Locked, .Unlocked:
return self // Or throw?
}
}
}
All those places where I wrote "or throw?"... can we do better?
One key issue is that event handlers are fairly indiscriminate. If we're listening to events from some hardware sensor, those events are going to arrive regardless of what our state is, so we have to branch somewhere to decide whether to handle it or not. Right now, we branch in handleEvent. If individual states had their own types (only some of which included certain events), we'd just kick the branch up to whatever component dispatches events to states.
So a rigorous solution must meaningfully affect dispatch of the input events which lead to state changes. One solution (similar to Elm's) would be to allow states to (directly or indirectly) specify which upstream subscriptions are actually active. Those specifications could include enough type information to directly enforce that states only receive events with corresponding transitions.
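A very rough sketch of the subscription idea, under heavy assumptions, might look like the following. None of these names come from the design above: `Subscription`, `SubscribingTurnstyleState`, `HardwareEvent`, and `dispatch` are hypothetical, commands and the broken state are omitted, and the syntax is current Swift. Each state lists only the upstream sources it has a transition for, and each entry carries a typed transition, so a state never sees an event it has not subscribed to.

```swift
// Hypothetical: a subscription names an upstream source and carries the transition
// to run when that source fires. A state omits sources it has no transition for.
enum Subscription<State> {
    case CoinInserted((Int) -> State)
    case PersonPassed(() -> State)
}

enum SubscribingTurnstyleState: Equatable {
    case Locked(credit: Int)
    case Unlocked

    static let farePrice = 50

    // The set of active subscriptions depends on the current state.
    var subscriptions: [Subscription<SubscribingTurnstyleState>] {
        switch self {
        case .Locked(let credit):
            return [.CoinInserted({ value in
                let newCredit = credit + value
                if newCredit >= SubscribingTurnstyleState.farePrice {
                    return SubscribingTurnstyleState.Unlocked
                }
                return SubscribingTurnstyleState.Locked(credit: newCredit)
            })]
        case .Unlocked:
            return [.PersonPassed({ SubscribingTurnstyleState.Locked(credit: 0) })]
        }
    }
}

enum HardwareEvent {
    case CoinInserted(Int)
    case PersonPassed
}

// Domain-free glue: routes a hardware event to the matching active subscription, if any.
func dispatch(_ event: HardwareEvent, to state: SubscribingTurnstyleState) -> SubscribingTurnstyleState {
    for subscription in state.subscriptions {
        switch (subscription, event) {
        case (.CoinInserted(let transition), .CoinInserted(let value)):
            return transition(value)
        case (.PersonPassed(let transition), .PersonPassed):
            return transition()
        default:
            continue
        }
    }
    return state  // No active subscription: the event is dropped before reaching the state.
}

var subscribingTurnstile = SubscribingTurnstyleState.Locked(credit: 0)
subscribingTurnstile = dispatch(.PersonPassed, to: subscribingTurnstile)      // Locked is not subscribed here
subscribingTurnstile = dispatch(.CoinInserted(50), to: subscribingTurnstile)  // fare met
subscribingTurnstile = dispatch(.CoinInserted(10), to: subscribingTurnstile)  // Unlocked ignores coins
```

The dispatcher still contains a default branch, but it holds no domain knowledge; the per-state "or throw?" cases disappear because each state only declares transitions it actually supports.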
I'll save a detailed treatment of this topic for another day.
My thanks to Colin Barrett, Chris Eidhof, Justin Spahr-Summers, and Rob Rix for useful feedback on a first draft of this concept.
My thanks to Matthew Johnson for suggesting that the optionality of OutputCommand in handleEvent() can itself be optional.
@alexsullivan114 That's a curious and thought-provoking observation! Normally a "state" might be considered the set or composition of all run-time dependencies or links within the system, and the "next state" simply the next discrete form of that set after any change occurs to any dependency, but in these cases these "state machines" are severe, severe abstractions aiding knowledge share by creating a reasonable mental model or abstraction.
With that in mind, and given the fact that statecharts (orthogonal, hierarchical state machines as described above) are canonical UML, let's presume that the arrows in the graphics not only represent event transitions, but, in code form, also represent a code dependency as you're suggesting. They will certainly represent a knowledge dependency if you use some sort of event bus or router, but given that these implemented-in-code state machines serve the abstraction, and not necessarily the system (the system de facto is a state machine at all times, regardless of how you represent it in code), then creating either dependency becomes a simple quality issue. By this I mean are there any external constraints that might cause conflict in creating a direct coupling that hinders you from representing this abstraction efficiently? I would argue no, as the amount of knowledge encapsulated within each state is simply the state itself, and not any larger system demands.