After writing up the Finite State Machine from SoCalCodeCamp, I had CoffeeScript on the brain and decided to go back and look at my first CoffeeScript experiment to see if there was any room for improvement. Actually, I knew there was room for improvement. One of the times when I was stuck in my presentation I flipped over to this machine and I’m pretty sure I blanched visibly and the counted loop in the “read” function.
So lets revisit my first CoffeeScript state machine. This machine determines if a specified substring exists within a string. In other words, it implements bool String.Contains(String). Besides all the room for improvement, I think there is an opportunity for some meaningful use of CoffeeScript classes.
First we’ll warm up by adding a shim for alert.
alert = (value) -> console.log value
Now we can get started with our conversion, we’ll start from the bottom up this time.
Remove Test Duplication
Here is our current test code.
test1 = "Windows is an operating system installed on many machines." test2 = "Welcome to the operating room, the Doctor is just finishing his martini." getMachineResult = (b) -> result = "" if b == true result = " Accepted. " else result = " Not Accepted. " result machine = createMachine(rules, 0, finalStates) machine.readAll test1 alert getMachineResult(machine.acceptedInput()) + test1 machine2 = createMachine(rules, 0, finalStates) machine2.readAll test2 alert getMachineResult(machine2.acceptedInput()) + test2
Lets remove the duplication and the weirdo getMachineResult method.
testMachine = (input) -> machine = createMachine(rules, 0, finalStates) machine.readAll input alert [machine.acceptedInput(), input].join '--' testMachine "Windows is an operating system installed on many machines." testMachine "Welcome to the operating room, the Doctor is just finishing his martini."
Encapsulate Machine Creation
Our testMachine function currently closes over rules and finalStates from the CoffeeScript “module” scope. We also have these five lines which represent the logic necessary for getting ready to create our machine.
source = "operating system".split '' rules = (createRule x, source for x in [0..source.length]) altRules = (createAltRule x, source for x in [0..source.length]) rules.concat altRules finalStates = [ source.length ]
And the machine is actually created in testMachine
machine = createMachine(rules, 0, finalStates)
Lets encapsulate all this logic into one function, get it out of the module scope, and generalize it so that it can work with any search string.
createSearchMachine = (searchString) -> source = searchString.split '' states = [0..source.length] rules = (createRule x, source for x in states) .concat (createAltRule x, source for x in states) createMachine rules, 0, [ source.length ] testMachine = (input, searchString) -> machine = createSearchMachine searchString # ...
I did a little experimenting with this one and I’m pretty pleased with the results. Because the character array indexes actually represent the states the search machine can take, we cant just iterate over the characters. I think the new version makes it clear that the use of the index range is not a mistake. I also decided to see if I could chain the call to concat right off of the list comprehension and that worked. Then the line was really long, so i decided to see if significant whitespace would prevent me from continuing the statement on the next line, but it worked just fine. Note that the () around the list comprehension are necessary both times. The parenthesis aren’t part of calling concat, they are part of the list comprehension.
I saved another line by moving finalStates inline, then updated the signature and call site in testMachine.
Creating Machines
Here is our current createMachine implementation. I don’t see any advantage to making it into a class, so I’ll just clean it up.
createMachine = (rules, initialState, finalStates) ->
{
currentState: initialState
transitionRules: rules
acceptanceStates: finalStates
read: (c) ->
for i in [0..this.transitionRules.length - 1]
if this.transitionRules[i].match this.currentState, c
return this.currentState = this.transitionRules[i].resultState
readAll: (s) -> this.read x for x in s.split ''
acceptedInput: () -> this.currentState in this.acceptanceStates
}
I got rid of the object properties and captured with closure instead. I also converted the counted loop into a cleaner for..in loop. Finally, I picked some better variable names.
createMachine = (transitionRules, currentState, acceptanceStates) -> readAll: (input) -> this.read inputChar for inputChar in input.split '' read: (inputChar) -> for rule in transitionRules if rule.match currentState, inputChar return currentState = rule.resultState acceptedInput: -> currentState in acceptanceStates
After testing I see that this still works as expected.
Creating Rules
Our rule creation is really spread out and contains a bit of duplication:
createRule = (index, source) -> {
initialState: index
inputChar: source[index]
resultState: index + 1
match: (q, c) -> true if (this.initialState == q && this.inputChar == c)
}
createAltRule = (index, source) ->
if index == source.length
{
# trap in acceptance state
initialState: index
inputChar: null
resultState: index
match: (q, c) -> true if (this.initialState == q)
}
else
{
# return to start state
initialState: index
inputChar: source[index]
resultState: 0
match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
}
Primarily these rules only vary by their method for evaluating match. Our POR (plain old rule) takes an index (which in this case represents a state) and a source string (of which we are only interested in one character) then computes the next state. The match function is true if the passed in state and input character match the stored values. This method can definitely be cleaned up.
createRule = (CurrentState, InputChar, NextState) ->
getNextState: -> NextState
match: (machineState, inputChar) ->
machineState is CurrentState and inputChar is InputChar
With createRule defined this way, we need to make some changes to createSearchMachine so that it passes the needed data, and to createMachine so that it calls getNextState. If the machine is going to query getNextState instead of accessing a property, then the alternate rules will need functions with the same name and semantics.
createAltRule = (index, source) -> if index == source.length { # trap in acceptance state initialState: index inputChar: null resultState: index getNextState: -> @resultState match: (q, c) -> true if (this.initialState == q) } else { # return to start state initialState: index inputChar: source[index] resultState: 0 getNextState: -> @resultState match: (q, c) -> true if (this.initialState == q && this.inputChar != c) }
So I’m beginning to sense a relationship between the POR and the alternate rules. First, I test the new code and once again find out that it works. Lets convert createRule into a class, then use the createRule method as a shim to call the constructor.
class Rule constructor: (@CurrentState, @InputChar, @NextState) -> getNextState: => @NextState match: (machineState, inputChar) => machineState is @CurrentState and inputChar is @InputChar createRule = (CurrentState, InputChar, NextState) -> new Rule CurrentState, InputChar, NextState
The purpose of the shim is just to ensure that we don’t need to update all the call sites while we are refactoring the rules. With this new code in place, I test and everything still works. Now we can look at extending Rule to take over parts of createAltRule. Lets look at the top half of createAltRule first.
createAltRule = (index, source) -> if index == source.length { # trap in acceptance state initialState: index inputChar: null resultState: index getNextState: -> @resultState match: (q, c) -> true if (this.initialState == q) }
The purpose of this branch is to define a rule that traps the machine in a certain state. In this case, it will be the acceptance state. If we count the number of characters in the search string, we find a number that represents the machine’s acceptance state. If we manage to reach that acceptance state, we want to stay in it. Once we have determined that the source string contains the search string, the machine should accept the input, no matter what the rest of the string says. The trap state implementation just compares the machine state to the acceptance state, and returns the acceptance state when the machine calls getNextState.
Here is how we can extend Rule to support this.
class TrapRule extends Rule constructor: (@CurrentState) -> super @CurrentState, null, @CurrentState match: (machineState, inputChar) => machineState is @CurrentState
The TrapRule class extends Rule by replacing the match implementation with the same logic comparison implemented in the old code. I use super to call the Rule constructor with the @CurrentState value as both the current and next state. I pass null as the character to make it explicit that the input character is not used.
Now we can update the createAltRule function.
createAltRule = (index, source) -> if index == source.length new TrapRule index else { # return to start state initialState: index inputChar: source[index] resultState: 0 getNextState: -> @resultState match: (q, c) -> true if (this.initialState == q && this.inputChar != c) }
The bottom half of createAltRule describes a “return rule”. This rule handles any scenario where the machine reads a character that is not in the search string, and also any scenario where the character is in the search string, but not in the correct position. For example, both of our test strings start with the character W. The return rule would apply because W does not appear in operating system. As the machine reads along it will encounter o at position 5. Finding this character, it will advance from state 0 to state 1. The next character it wants to read is p. But in either test string, p is not the character at position 6. So, the return rule will apply and return the machine to state 0. Without the return rules, the machine could only accept strings that start with the search string, but we want a machine that accept strings which contain the search string. So, the match function looks at the input character passed in by the machine, and matches when the character does not match the stored character, and getNextState always returns 0.
Here is a ReturnRule class that implements the same logic.
class ReturnRule extends Rule constructor: (@CurrentState, @InputChar) -> super @CurrentState, @InputChar, 0 match: (machineState, inputChar) => machineState is @CurrentState and inputChar isnt @InputChar
And we can reduce our createAltRule method to this:
createAltRule = (index, source) -> if index is source.length new TrapRule index else new ReturnRule index, source[index]
And this works too. Now that we have cleaned up our rules we can see that our shim is only called in one place, and does very little, so we can inline the Rule constructor and delete createRule.
createSearchMachine = (searchString) -> source = searchString.split '' states = [0..source.length] rules = (new Rule x, source[x], x + 1 for x in states) .concat (createAltRule x, source for x in states) createMachine rules, 0, [ source.length ]
Lets try to get rid of the call to createAltRule with better list comprehension.
createSearchMachine = (searchString) -> source = searchString.split '' states = [0..source.length] rules = (new Rule x, source[x], x + 1 for x in states) .concat (new ReturnRule x, source[x] for x in states when x isnt source.length) createMachine rules, 0, [ source.length ]
As an intermediate step, I simply filtered out the state that would have led to creating the trap rule and replaced the call to createAltRule with a direct instantiation of ReturnRule. I expect this to fail, but it doesn’t! Turns our our machine has a bug in it, since I designed it “knowing” that every possible state/character combination would have a rule, I did not handle the case in read where no rules matched. Just to be on the safe side, I’ll update read by moving the current state to -1 when no rules match. Although it’s conventional to use non-negative integer states when simulating state machines, the truth is that you can use any symbol, positive, negative, or Celtic runes, it’s up to you. Here is the updated Read implementation:
read: (inputChar) ->
for rule in transitionRules
if rule.match currentState, inputChar
return currentState = rule.getNextState()
currentState = -1
Now, the machine fails as expected. We can repair it by creating the TrapRule and pushing it into the rules collection.
createSearchMachine = (searchString) -> source = searchString.split '' states = [0..source.length] rules = (new Rule x, source[x], x + 1 for x in states) .concat (new ReturnRule x, source[x] for x in states when x isnt source.length) rules.push new TrapRule source.length createMachine rules, 0, [ source.length ]
This works so we can get rid of the createAltRule function. I made all these changes to the existing gist and I committed a few intermediate steps along the way. Enjoy.