Use Closures to Put C# on a Diet

I enjoy playing with state machines. They are simple enough to implement quickly, and complex enough to give the implementation language a little workout. Followers of this blog will know that I’ve enjoyed using Finite State Machines to explore CoffeeScript.

But, when I look at the search terms leading to my blog, I see that people want to see FSMs implemented in C#, C++, or even VB. About the only language I haven’t seen people looking for is CoffeeScript. I’m not too surprised, since most people searching for FSM implementations are probably students trying to cheat get help on their homework. I doubt any professors are accepting CoffeeScript implementations.

When I see these search results, I consider quickly posting a solution in the requested language. I don’t mind if students crib solutions from me. I’m a big believer in learning by example. Certainly if by looking at my FSMs I can save one student from writing a switch statement, then I’ll consider it a public service. On the other hand, I’m not really interested in producing solutions by language on demand. Its not so much that I’m worried about students reading my solutions, as much as it is that implanting FSMs over and over again in different languages isn’t very interesting on its own merits.

While I was refactoring the String.Contains FSM for my previous post, I saved a few lines by taking advantage of closure. My inspiration for this refactoring came from a question one of the attendee’s asked at my CoffeeScript Crash Course presentation at SoCalCodeCamp. If you watch the video you will hear him ask, “Does the @ imply closure, and if so why?” As I said in my presentation and on this blog, I’m no expert at CoffeeScript or JavaScript, so I can’t claim to have instantly grasped the context of his question. However, as I listened to it again, and listened to his comments to another attendee, I began to understand why he asked the question. The result was the new implementation I wrote about in my last post.

C# also supports closure, so I began to wonder what a C# String.Contains implementation would look like if we wanted to use as much closure as possible. I’ve implemented this machine in C# and C++ before, but I’m intrigued enough by the closure idea to try C# again.

Getting Started

This sample should be simple, Visual Studio seems like overkill, so I’m going to code it in Sublime Text 2 and use the C# compiler on the command line.

csc .\StringCompare.cs 

If StringCompare.cs contains this skeleton program, it should compile to a do-nothing program:

public class Program
{
    public static void Main(params string[] args)
    {
    }
}

Rules

In our CoffeeScript implementation we have a class called Rule with two subclasses. Lets see how this looks in C#:

public class Rule
{
    public Rule(int currentState, char inputChar, int nextState)
    {
        this.NextState = nextState;
        this.Match = (machineState, readChar) =>
        {
            return currentState == machineState && readChar == inputChar; 
        };
    }

    public int NextState { get; private set; }
    public Func<int, char, bool> Match { get; private set; }
}

Remember, our goal is to use as much closure as possible, so if we can avoid declaring data properties, we will. All of our data comes in through the constructor, so we can create a lambda expression that closes over the currentState and inputChar and use it as our Match function. We still need to declare the Match function and its signature, but there is no need to create instance properties for the data. For NextState we could define a Func<int>, but that would have no benefit compared to a simple int property, so we’ll use a property.

That’s all fine in theory, but does it work? Let’s write a little test in the Main function.

public class Program
{
    public static void Main(params string[] args)
    {
        var myRule = new Rule(0, '1', 1);
        Console.WriteLine(myRule.Match(0,'1'));
        Console.WriteLine(myRule.Match(1,'1'));
    }
}

As expected, this test prints True followed by False. It’s clear from this test that closure works just as well in constructors as anywhere else. Having satisfied my intuition, I can delete this test code If I want to and move on to implementing the specialized TrapRule and ReturnRule classes.

public class TrapRule : Rule
{
    public TrapRule(int currentState)
        : base(currentState, (char)0, currentState)
    {
        this.Match = (machineState, readChar) =>
        {
            return currentState == machineState;
        };
    }
}

Like the CoffeeScript implementation, I pass currentState to the base implementation twice. However, I can’t pass null for a char value in C#. Instead, I pass in the null character 0. Its also interesting to note that I don’t need to make Match virtual, and I don’t need to override it. All I need to do is make the setter protected so that the subclass can replace the implementation. We can update the test in Main to verify that this subclass works.

public class Program
{
    public static void Main(params string[] args)
    {
        var myRule = new TrapRule(0);
        Console.WriteLine(myRule.Match(0,'1'));
        Console.WriteLine(myRule.Match(1,'1'));
    }
}

As expected, we see True printed, followed by False. Next, we can implement the ReturnRule.

public class ReturnRule : Rule
{
    public ReturnRule(int currentState, char inputChar)
        : base(currentState, inputChar, 0)
    {
        this.Match = (machineState, readChar) =>
        {
            return currentState == machineState && readChar != inputChar;
        };
    }
}

We actually don’t need to pass currentState and inputChar to the base class, because it only needs to know the nextState. However, it doesn’t hurt to pass it either. Once again, the child class replaces the Match implementation with one appropriate for the return rule. As before, we can update our test to verify the ReturnRule behavior.

public static void Main(params string[] args)
{
    var myRule = new ReturnRule(0, '0');
    Console.WriteLine(myRule.Match(0,'0'));
    Console.WriteLine(myRule.Match(0,'1'));
}

As expected, this prints False followed by True. In theory, we’ve implemented Rule classes that should work with a String.Contains implementation. We need to see how this works with a machine implementation.

Machine

In the CoffeeScript implementation, I saw no benefit in converting the createMachine function into a class. In C#, I thought I might be able to do something similar with dynamic. It turns out that this doesn’t work though. The C# compiler produces error CS0828 because you cannot assign a lambda expression to an anonymous type property. I don’t use dynamic very often, so this is news to me. It seems I’m already learning more about C# through this exercise.

So, with this restriction in place, I will need to create a class to implement the machine.

public class Machine 
{
    public Machine(Rule[] transitionRules, int currentState, int[] acceptanceStates)
    {
        this.ReadAll = (input) =>
        {
            foreach(var inputChar in input)
            {
                Read(inputChar);
            }
        };
        this.Read = (inputChar) => 
        {
            foreach(var rule in transitionRules) 
            {
                if (rule.Match(currentState, inputChar))
                {
                    currentState = rule.NextState;
                    return;
                }
            }
            currentState = -1;
        };
        this.AcceptedInput = () => Array.IndexOf(acceptanceStates, currentState) >= 0;
    }

    public Action<string> ReadAll { get; private set; }
    public Action<char> Read { get; private set; }
    public Func<bool> AcceptedInput { get; private set; }
}

This implementation compiles, but does it work? We’ll need some test code to determine for sure. To do that we’ll need to implement createSearchMachine and testMachine. These implementations should be possible using Func<> instances in Main without having to create classes.

Here is createSearchMachine:

Func<string, Machine> createSearchMachine = (searchString) =>
{
    var source = searchString.ToCharArray();
    var states = Enumerable.Range(0, source.Length);
    var rules = states.Select(x => new Rule(x, source[x], x + 1))
        .Concat(states.Where(x => x != source.Length).Select(x => new ReturnRule(x, source[x])))
        .Concat(new []{ new TrapRule(source.Length) });
    return new Machine(rules.ToArray(), 0, new[]{ source.Length });
};

It’s interesting that, while slightly more verbose and noisy, C# supports a nearly identical implementation as CoffeeScript. Here is testMachine:

Action<string, string> testMachine = (input, searchString) =>
{
    var machine = createSearchMachine(searchString);
    machine.ReadAll(input);
    Console.WriteLine("{0}--{1}", machine.AcceptedInput(), input);
};

Now we can finally run the machine and see how it works.

var target = "operating system";
testMachine("Windows is an operating system installed on many machines.", target);
testMachine("Welcome to the operating room, the Doctor is just finishing his martini.", target);

And our output is just what we expect:

C:\...\gist-3063755> .\StringContains.exe 
True--Windows is an operating system installed on many machines. 
False--Welcome to the operating room, the Doctor is just finishing his martini. 

Great, Closure All The Things!

No, not really.  In CoffeeScript/JavaScript closures are the norm and expected.  In C# the question you have to ask yourself is why would you close over everything.  Keep in mind that, although you do not declare classes with data members, they still exist.  The compiler generates them for you, so its not like you’ve “saved” anything by using closure.  Maybe closures make your code harder to reflect into, since the compiler will use unspeakable names to refer to the classes it generates.  This obscurity might appeal to some, but for me it’s simply a learning exercise.  I hope you learned something too.

You can check out the finished code at this gist.