This machine determines if a specified substring exists within a string. The initial machine is the first machine I wrote in CoffeeScript, so there should be plenty of room for improvement.
Also interesting, I think there is an opportunity for some meaningful use of CoffeeScript classes.
First we'll add a shim for alert.
alert = (value) -> console.log value
Now we can get started with our conversion, we'll start from the bottom up this time.
Here is our current test code.
test1 = "Windows is an operating system installed on many machines."
test2 = "Welcome to the operating room, the Doctor is just finishing his martini."
getMachineResult = (b) ->
result = ""
if b == true
result = " Accepted. "
else
result = " Not Accepted. "
result
machine = createMachine(rules, 0, finalStates)
machine.readAll test1
alert getMachineResult(machine.acceptedInput()) + test1
machine2 = createMachine(rules, 0, finalStates)
machine2.readAll test2
alert getMachineResult(machine2.acceptedInput()) + test2
Lets remove the duplication and the weirdo getMachineResult
method.
testMachine = (input) ->
machine = createMachine(rules, 0, finalStates)
machine.readAll input
alert [machine.acceptedInput(), input].join '--'
testMachine "Windows is an operating system installed on many machines."
testMachine "Welcome to the operating room, the Doctor is just finishing his martini."
Our testMachine
function currently closes over rules
and finalStates
from the CoffeeScript "module" scope. We also have these five lines which represent the logic necessary for getting ready to create our machine.
source = "operating system".split ''
rules = (createRule x, source for x in [0..source.length])
altRules = (createAltRule x, source for x in [0..source.length])
rules.concat altRules
finalStates = [ source.length ]
And the machine is actually created in testMachine
machine = createMachine(rules, 0, finalStates)
Lets encapsulate all this logic into one function, get it out of the module scope, and generalize it so that it can work with any search string.
createSearchMachine = (searchString) ->
source = searchString.split ''
states = [0..source.length]
rules = (createRule x, source for x in states)
.concat (createAltRule x, source for x in states)
createMachine rules, 0, [ source.length ]
testMachine = (input, searchString) ->
machine = createSearchMachine searchString
# ...
I did a little experimenting with this one and I'm prettly pleased with the results. I think its clearer now that our list comprehensions actually need the character array index values, because these represent the states the search machine can take. I also decided to see if I could call concat
right off of the list comphrenshion and that worked. Then the line was really long, so i decided to see if significant whitespace would prevent me from continuing the statement on the next line, but it worked just fine. Note that the ()
around the list comprehension are necessary both times. The parenthesis aren't part of calling concat
, they are part of the list comprehension.
I saved another line by inlining finalStates
, then updated the signature and call site in testMachine
.
Here is our current createMachine
implementation. I don't see any advantage to making it into a class, so I'll just clean it up.
createMachine = (rules, initialState, finalStates) ->
{
currentState: initialState
transitionRules: rules
acceptanceStates: finalStates
read: (c) ->
for i in [0..this.transitionRules.length - 1]
if this.transitionRules[i].match this.currentState, c
return this.currentState = this.transitionRules[i].resultState
readAll: (s) -> this.read x for x in s.split ''
acceptedInput: () -> this.currentState in this.acceptanceStates
}
I got rid of the object properties and capture with closure instead. I also converted the counted loop into a cleaner for..in
loop. Finally, I picked some better variable names.
createMachine = (transitionRules, currentState, acceptanceStates) ->
readAll: (input) -> this.read inputChar for inputChar in input.split ''
read: (inputChar) ->
for rule in transitionRules
if rule.match currentState, inputChar
return currentState = rule.resultState
acceptedInput: -> currentState in acceptanceStates
After testing I see that this still works as expected.
Our rule creation is really spread out and contains a bit of duplication:
createRule = (index, source) -> {
initialState: index
inputChar: source[index]
resultState: index + 1
match: (q, c) -> true if (this.initialState == q && this.inputChar == c)
}
createAltRule = (index, source) ->
if index == source.length
{
# trap in acceptance state
initialState: index
inputChar: null
resultState: index
match: (q, c) -> true if (this.initialState == q)
}
else
{
# return to start state
initialState: index
inputChar: source[index]
resultState: 0
match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
}
Primarily these rules only vary by thier method for evaluating match
. Our POR (plain old rule) takes an index (which in this case represents a state) and a source string (of which we are only interested in one character) then computes the next state. The match function is true if the passed in state and input character match the stored values. This method can definitely be cleaned up.
createRule = (CurrentState, InputChar, NextState) ->
getNextState: -> NextState
match: (machineState, inputChar) ->
machineState is CurrentState and inputChar is InputChar
With createRule
defined this way, we need to make some changes to createSearchMachine
so that it passes the needed data, and to createMachine
so that it calls getNextState
. If the machine is going to query getNextState
, then the alternate rules will need them as well.
createAltRule = (index, source) ->
if index == source.length
{
# trap in acceptance state
initialState: index
inputChar: null
resultState: index
getNextState: -> @resultState
match: (q, c) -> true if (this.initialState == q)
}
else
{
# return to start state
initialState: index
inputChar: source[index]
resultState: 0
getNextState: -> @resultState
match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
}
So I'm beginning to sense a relationship between the POR and the alternate rules. First, I test the new code and once again find out that it works.
Lets convert createRule
into a class, then use the createRule
method as a shim to call the constructor.
class Rule
constructor: (@CurrentState, @InputChar, @NextState) ->
getNextState: => @NextState
match: (machineState, inputChar) =>
machineState is @CurrentState and inputChar is @InputChar
createRule = (CurrentState, InputChar, NextState) ->
new Rule CurrentState, InputChar, NextState
The purpose of the shim is just to ensure that we don't need to update all the call sites while we are refactoring the rules. With this new code in place, I test and everything still works. Now we can look at extending Rule
to take over parts of createAltRule
. Lets look at the top half of createAltRule
first.
createAltRule = (index, source) ->
if index == source.length
{
# trap in acceptance state
initialState: index
inputChar: null
resultState: index
getNextState: -> @resultState
match: (q, c) -> true if (this.initialState == q)
}
The purpose of this branch is to define a rule that traps the machine in a certain state. In this case, it will be the acceptance state. If the machine manages to reach the state identified by the same nubmer as the length of the source, then it means we sucessfully read each character in the search string. Now, the machine must finish reading the string, but we don't want it to evaluate any more rules or change state anymore. We just want it to stay in the same state, so the match function only checks the machine's state. If the machine state is the same as the trap rule's stored state, then the rule will match. When the machine queries the rule for the next state, the rule will simply return the same state again.
Here is how we can extend Rule
to support this.
class TrapRule extends Rule
constructor: (@CurrentState) ->
super @CurrentState, null, @CurrentState
match: (machineState, inputChar) =>
machineState is @CurrentState
The TrapRule
class extends Rule
by replacing the match
implementation with an implementation with the same logic as the old implementation. I use super
to call the Rule
constructor with the @CurrentState
value as both the current and next state. I pass null
as the character to make it explicit that the input character is not used.
Now we can update the createAltRule
function.
createAltRule = (index, source) ->
if index == source.length
new TrapRule index
else
{
# return to start state
initialState: index
inputChar: source[index]
resultState: 0
getNextState: -> @resultState
match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
}
The bottom half of createAltRule
describes a "return rule". This rule handles any scenario where the machine reads a character that is not in the search string, and also any scneario where the character is in the search string, but not in the correct position. For example, both of our test strings start with the character W. The return rule would apply because W does not appear in operating system. As the machine reads along it will encounter o at position 5. Finding this it will advance from state 0 to state 1. The next character it wants to read is p. But in either test string, p is not the character at position 6. So, the return rule will apply and return the machine to state 0. Without the return rules, the machine could only accept strings that start with the search string, but we want a machine that accept strings that contain the search string. So, the match
function looks at the input character passed in by the machine, and matches when the character does not match the stored character, and getNextState
always returns 0
.
Here is a ReturnRule
class that implements the same logic.
class ReturnRule extends Rule
constructor: (@CurrentState, @InputChar) ->
super @CurrentState, @InputChar, 0
match: (machineState, inputChar) =>
machineState is @CurrentState and inputChar isnt @InputChar
And we can reduce our createAltRule
method to this:
createAltRule = (index, source) ->
if index is source.length
new TrapRule index
else
new ReturnRule index, source[index]
And this works too. Now that we have cleaned up our rules we can see that our shim is only called in one place, and does very little, so we can inline the Rule
constructor and delete createRule
.
createSearchMachine = (searchString) ->
source = searchString.split ''
states = [0..source.length]
rules = (new Rule x, source[x], x + 1 for x in states)
.concat (createAltRule x, source for x in states)
createMachine rules, 0, [ source.length ]
Lets try to get rid of createAltRule
with better list comprehension.
createSearchMachine = (searchString) ->
source = searchString.split ''
states = [0..source.length]
rules = (new Rule x, source[x], x + 1 for x in states)
.concat (new ReturnRule x, source[x] for x in states when x isnt source.length)
createMachine rules, 0, [ source.length ]
As an intermediate step, I simply filtered out the state that would have led to creating the trap rule and repalced the call to createAltRule
with a direct instantiation of ReturnRule
. I expect this to fail, but it doesn't! Turns our our machine has a bug in it, since I designed it knowing that every possible state/character combination would have a rule, I did not handle the case in read
where no rules matched. Just to be on the safe side, I'll update read by moving the current state to -1
when no rules match.
read: (inputChar) ->
for rule in transitionRules
if rule.match currentState, inputChar
return currentState = rule.getNextState()
currentState = -1
Now, the machine fails as expected. And we can add the TrapRule
and push it into the rules collection.
createSearchMachine = (searchString) ->
source = searchString.split ''
states = [0..source.length]
rules = (new Rule x, source[x], x + 1 for x in states)
.concat (new ReturnRule x, source[x] for x in states when x isnt source.length)
rules.push new TrapRule source.length
createMachine rules, 0, [ source.length ]
This works so we can get rid of createAltRule
.