Coding

Implementing String.Contains

After writing up the Finite State Machine from SoCalCodeCamp, I had CoffeeScript on the brain and decided to go back and look at my first CoffeeScript experiment to see if there was any room for improvement.  Actually, I knew there was room for improvement.  One of the times when I was stuck in my presentation, I flipped over to this machine and I’m pretty sure I blanched visibly at the counted loop in the “read” function.

So let’s revisit my first CoffeeScript state machine.  This machine determines whether a specified substring exists within a string. In other words, it implements bool String.Contains(String).  Besides all the room for improvement, I think there is an opportunity for some meaningful use of CoffeeScript classes.

First we’ll warm up by adding a shim for alert.

alert = (value) -> console.log value 

Now we can get started with our conversion; we’ll work from the bottom up this time.

Remove Test Duplication

Here is our current test code.

test1 = "Windows is an operating system installed on many machines."
test2 = "Welcome to the operating room, the Doctor is just finishing his martini."

getMachineResult = (b) -> 
  result = ""
  if b == true 
   result = " Accepted. "
  else
   result = " Not Accepted. "
  result

machine = createMachine(rules, 0, finalStates)
machine.readAll test1
alert getMachineResult(machine.acceptedInput()) + test1

machine2 = createMachine(rules, 0, finalStates)
machine2.readAll test2
alert getMachineResult(machine2.acceptedInput()) + test2

Let’s remove the duplication and the weirdo getMachineResult method.

testMachine = (input) ->
  machine = createMachine(rules, 0, finalStates)
  machine.readAll input  
  alert [machine.acceptedInput(), input].join '--'

testMachine "Windows is an operating system installed on many machines."
testMachine "Welcome to the operating room, the Doctor is just finishing his martini."

Encapsulate Machine Creation

Our testMachine function currently closes over rules and finalStates from the CoffeeScript “module” scope. We also have these five lines which represent the logic necessary for getting ready to create our machine.

source = "operating system".split '' 
rules = (createRule x, source for x in [0..source.length])
altRules = (createAltRule x, source for x in [0..source.length])
rules = rules.concat altRules
finalStates = [ source.length ]

And the machine is actually created in testMachine

machine = createMachine(rules, 0, finalStates)

Let’s encapsulate all this logic into one function, get it out of the module scope, and generalize it so that it can work with any search string.

createSearchMachine = (searchString) ->
  source = searchString.split '' 
  states = [0..source.length]
  rules = (createRule x, source for x in states)
    .concat (createAltRule x, source for x in states)
  createMachine rules, 0, [ source.length ]

testMachine = (input, searchString) ->
  machine = createSearchMachine searchString
  # ...

I did a little experimenting with this one and I’m pretty pleased with the results.  Because the character array indexes actually represent the states the search machine can take, we can’t just iterate over the characters.  I think the new version makes it clear that the use of the index range is not a mistake.  I also decided to see if I could chain the call to concat right off of the list comprehension, and that worked. Then the line was really long, so I decided to see if significant whitespace would prevent me from continuing the statement on the next line, but it worked just fine. Note that the () around each list comprehension are necessary. The parentheses aren’t part of calling concat, they are part of the list comprehension.

I saved another line by moving finalStates inline, then updated the signature and call site in testMachine.

Creating Machines

Here is our current createMachine implementation. I don’t see any advantage to making it into a class, so I’ll just clean it up.

createMachine = (rules, initialState, finalStates) ->
 {
   currentState: initialState
   transitionRules: rules
   acceptanceStates: finalStates
   read: (c) ->
    for i in [0..this.transitionRules.length - 1]
      if this.transitionRules[i].match this.currentState, c
        return this.currentState = this.transitionRules[i].resultState
   readAll: (s) -> this.read x for x in s.split ''
   acceptedInput: () -> this.currentState in this.acceptanceStates
 }

I got rid of the object properties and captured with closure instead. I also converted the counted loop into a cleaner for..in loop. Finally, I picked some better variable names.

createMachine = (transitionRules, currentState, acceptanceStates) ->
  readAll: (input) -> this.read inputChar for inputChar in input.split ''
  read: (inputChar) ->
    for rule in transitionRules
      if rule.match currentState, inputChar
        return currentState = rule.resultState  
  acceptedInput: -> currentState in acceptanceStates

After testing I see that this still works as expected.
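For readers who want to see the closure capture without the CoffeeScript sugar, here is roughly what this pattern amounts to in plain JavaScript (a hand-written sketch, not actual compiler output):

```javascript
// Hand-written JavaScript sketch of the closure-based createMachine.
// The three parameters are captured by the closure, so the returned
// object needs no data properties of its own.
function createMachine(transitionRules, currentState, acceptanceStates) {
  return {
    readAll: function (input) {
      for (const inputChar of input.split('')) {
        this.read(inputChar);
      }
    },
    read: function (inputChar) {
      for (const rule of transitionRules) {
        if (rule.match(currentState, inputChar)) {
          return (currentState = rule.resultState);
        }
      }
    },
    acceptedInput: function () {
      return acceptanceStates.indexOf(currentState) !== -1;
    }
  };
}
```

Note that read still pulls rule.resultState directly off the rule object; that detail changes once the rules grow a getNextState function.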

Creating Rules

Our rule creation is really spread out and contains a bit of duplication:

createRule = (index, source) -> {
 initialState: index
 inputChar: source[index]
 resultState: index + 1
 match: (q, c) -> true if (this.initialState == q && this.inputChar == c)
 }

createAltRule = (index, source) -> 
 if index == source.length
  {
    # trap in acceptance state
    initialState: index
    inputChar: null
    resultState: index
    match: (q, c) -> true if (this.initialState == q) 
  }
 else
  {
    # return to start state
    initialState: index
    inputChar: source[index]
    resultState: 0
    match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
  }

These rules vary primarily in how they evaluate match. Our POR (plain old rule) takes an index (which in this case represents a state) and a source string (of which we are only interested in one character), then computes the next state. The match function is true if the passed-in state and input character match the stored values. This method can definitely be cleaned up.

createRule = (CurrentState, InputChar, NextState) ->
  getNextState: -> NextState
  match: (machineState, inputChar) ->
    machineState is CurrentState and inputChar is InputChar

With createRule defined this way, we need to make some changes to createSearchMachine so that it passes the needed data, and to createMachine so that it calls getNextState. If the machine is going to query getNextState instead of accessing a property, then the alternate rules will need functions with the same name and semantics.

createAltRule = (index, source) -> 
 if index == source.length
  {
    # trap in acceptance state
    initialState: index
    inputChar: null
    resultState: index
    getNextState: -> @resultState
    match: (q, c) -> true if (this.initialState == q) 
  }
 else
  {
    # return to start state
    initialState: index
    inputChar: source[index]
    resultState: 0
    getNextState: -> @resultState
    match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
  }

So I’m beginning to sense a relationship between the POR and the alternate rules. First, I test the new code and once again find that it works.  Let’s convert createRule into a class, then use the createRule method as a shim to call the constructor.

class Rule
  constructor: (@CurrentState, @InputChar, @NextState) ->
  getNextState: => @NextState
  match: (machineState, inputChar) =>
    machineState is @CurrentState and inputChar is @InputChar

createRule = (CurrentState, InputChar, NextState) ->
  new Rule CurrentState, InputChar, NextState

The purpose of the shim is just to ensure that we don’t need to update all the call sites while we are refactoring the rules. With this new code in place, I test and everything still works. Now we can look at extending Rule to take over parts of createAltRule. Let’s look at the top half of createAltRule first.

createAltRule = (index, source) -> 
 if index == source.length
  {
    # trap in acceptance state
    initialState: index
    inputChar: null
    resultState: index
    getNextState: -> @resultState
    match: (q, c) -> true if (this.initialState == q) 
  }

The purpose of this branch is to define a rule that traps the machine in a certain state. In this case, it will be the acceptance state. If we count the number of characters in the search string, we find a number that represents the machine’s acceptance state.  If we manage to reach that acceptance state, we want to stay in it.  Once we have determined that the source string contains the search string, the machine should accept the input, no matter what the rest of the string says.  The trap state implementation just compares the machine state to the acceptance state, and returns the acceptance state when the machine calls getNextState.

Here is how we can extend Rule to support this.

class TrapRule extends Rule
  constructor: (@CurrentState) ->  
    super @CurrentState, null, @CurrentState  
  match: (machineState, inputChar) =>
    machineState is @CurrentState

The TrapRule class extends Rule by replacing the match implementation with the same logic comparison implemented in the old code. I use super to call the Rule constructor with the @CurrentState value as both the current and next state. I pass null as the character to make it explicit that the input character is not used.

Now we can update the createAltRule function.

createAltRule = (index, source) -> 
 if index == source.length
    new TrapRule index
 else
  {
    # return to start state
    initialState: index
    inputChar: source[index]
    resultState: 0
    getNextState: -> @resultState
    match: (q, c) -> true if (this.initialState == q && this.inputChar != c)
  }

The bottom half of createAltRule describes a “return rule”. This rule handles any scenario where the machine reads a character that is not in the search string, and also any scenario where the character is in the search string, but not in the correct position. For example, both of our test strings start with the character W. The return rule would apply because W does not appear in operating system. As the machine reads along, it will encounter o at position 5. Finding this character, it will advance from state 0 to state 1. The next character it wants to read is p. But in either test string, p is not the character at position 6, so the return rule will apply and return the machine to state 0. Without the return rules, the machine could only accept strings that start with the search string, but we want a machine that accepts strings which contain the search string. So, the match function looks at the input character passed in by the machine and matches when that character does not match the stored character, and getNextState always returns 0.

Here is a ReturnRule class that implements the same logic.

class ReturnRule extends Rule
  constructor: (@CurrentState, @InputChar) ->
    super @CurrentState, @InputChar, 0
  match: (machineState, inputChar) =>
    machineState is @CurrentState and inputChar isnt @InputChar

And we can reduce our createAltRule method to this:

createAltRule = (index, source) -> 
 if index is source.length
    new TrapRule index
 else
    new ReturnRule index, source[index]

And this works too. Now that we have cleaned up our rules we can see that our shim is only called in one place, and does very little, so we can inline the Rule constructor and delete createRule.

createSearchMachine = (searchString) ->
  source = searchString.split '' 
  states = [0..source.length]
  rules = (new Rule x, source[x], x + 1 for x in states)
    .concat (createAltRule x, source for x in states)
  createMachine rules, 0, [ source.length ]

Let’s try to get rid of the call to createAltRule with a better list comprehension.

createSearchMachine = (searchString) ->
  source = searchString.split '' 
  states = [0..source.length]
  rules = (new Rule x, source[x], x + 1 for x in states)
    .concat (new ReturnRule x, source[x] for x in states when x isnt source.length)
  createMachine rules, 0, [ source.length ]

As an intermediate step, I simply filtered out the state that would have led to creating the trap rule and replaced the call to createAltRule with a direct instantiation of ReturnRule. I expect this to fail, but it doesn’t! Turns out our machine has a bug in it: since I designed it “knowing” that every possible state/character combination would have a rule, I did not handle the case in read where no rules matched. Just to be on the safe side, I’ll update read by moving the current state to -1 when no rules match.  Although it’s conventional to use non-negative integer states when simulating state machines, the truth is that you can use any symbol, positive, negative, or Celtic runes; it’s up to you.  Here is the updated read implementation:

  read: (inputChar) ->
    for rule in transitionRules
      if rule.match currentState, inputChar
        return currentState = rule.getNextState()  
    currentState = -1

Now, the machine fails as expected. We can repair it by creating the TrapRule and pushing it into the rules collection.

createSearchMachine = (searchString) ->
  source = searchString.split '' 
  states = [0..source.length]
  rules = (new Rule x, source[x], x + 1 for x in states)
    .concat (new ReturnRule x, source[x] for x in states when x isnt source.length)
  rules.push new TrapRule source.length
  createMachine rules, 0, [ source.length ]

This works, so we can get rid of the createAltRule function.  I made all these changes to the existing gist and I committed a few intermediate steps along the way.  Enjoy.
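As a recap, here is the finished design hand-translated to JavaScript, for anyone following along without a coffee compiler (a sketch of the same logic, not compiler output):

```javascript
// Hand translation of the finished CoffeeScript design.
class Rule {
  constructor(currentState, inputChar, nextState) {
    this.currentState = currentState;
    this.inputChar = inputChar;
    this.nextState = nextState;
  }
  getNextState() { return this.nextState; }
  match(machineState, inputChar) {
    return machineState === this.currentState && inputChar === this.inputChar;
  }
}

// Traps the machine in the acceptance state, ignoring further input.
class TrapRule extends Rule {
  constructor(currentState) { super(currentState, null, currentState); }
  match(machineState, inputChar) { return machineState === this.currentState; }
}

// Returns the machine to the start state on any non-matching character.
class ReturnRule extends Rule {
  constructor(currentState, inputChar) { super(currentState, inputChar, 0); }
  match(machineState, inputChar) {
    return machineState === this.currentState && inputChar !== this.inputChar;
  }
}

function createMachine(transitionRules, initialState, acceptanceStates) {
  let currentState = initialState;
  return {
    readAll(input) { for (const c of input) this.read(c); },
    read(inputChar) {
      for (const rule of transitionRules) {
        if (rule.match(currentState, inputChar)) {
          return (currentState = rule.getNextState());
        }
      }
      currentState = -1; // no rule matched: move to the dead state
    },
    acceptedInput() { return acceptanceStates.includes(currentState); }
  };
}

function createSearchMachine(searchString) {
  const source = searchString.split('');
  const states = [...Array(source.length + 1).keys()]; // [0..source.length]
  const rules = states.map(x => new Rule(x, source[x], x + 1))
    .concat(states.filter(x => x !== source.length)
      .map(x => new ReturnRule(x, source[x])));
  rules.push(new TrapRule(source.length));
  return createMachine(rules, 0, [source.length]);
}
```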

Coding

Shadows vs. Overrides

Photo Credit: ludovic.celle

I spent some time recently thinking about the difference between override methods, which replace virtual methods (as far as outside callers are concerned), and new methods, which merely hide base class methods (not necessarily virtual methods either). While refactoring the other day I stumbled onto something I thought was clever, but that all hinges on whether shadowing and overriding behave the way I think they do.  So it seemed worthwhile to research it a bit.

After a simple Google search I found plenty of examples, but none that confirmed or denied all of my intuitions about the behavior of each method type.  As is often the case, it was more instructive to start a test project and explore the behavior myself.

Override Basics

Let’s start with an example from csharpfaq.

public class Base
{
    public virtual void SomeMethod()
    {
    }
}

public class Derived : Base
{
    public override void SomeMethod()
    {
    }
}

That’s rather dull, we can jazz it up a bit.

public class BasicLogger
{
    protected virtual void WriteMessage(string source, string message)
    {
        EventLog.WriteEntry(source, message);
    }

    public void CallOutTheHour(string post, string hour)
    {
        this.WriteMessage(post, hour + " o'clock and all is well.");
    }
}

public class TraceLogger : BasicLogger
{
    private readonly TraceSource Log;
    public TraceLogger(TraceListener listener)
    {
        this.Log = new TraceSource("LoggerApp");
        this.Log.Switch.Level = SourceLevels.All;
        if (listener != null)
        {
            this.Log.Listeners.Add(listener);
        }
    }

    protected override void WriteMessage(string source, string message)
    {
        this.Log.TraceEvent(
            TraceEventType.Information, 
            0, 
            "{0}: {1}", 
            source,
            message);
    }
}

On versions of Windows that have UAC (Vista+ or 2008+), BasicLogger is basically untestable/unusable unless you want to run as administrator. That should make our tests clear: I should get an exception any time BasicLogger.WriteMessage is called.

[TestClass]
public class ShadowsVsOverride
{
    [TestMethod]
    [ExpectedException(typeof(SecurityException))]
    public void TestBasicLogger()
    {
        new BasicLogger().CallOutTheHour("One", "Six");
    }
}

Likewise, we expect that TraceLogger instances will write out the message.

[TestMethod]
public void TestTraceLogger()
{
    using (var listener = new StringWriter())
    {
        new TraceLogger(new TextWriterTraceListener(listener))
            .CallOutTheHour("One", "Six");                
        Assert.AreEqual(
            "LoggerApp Information: 0 : One: Six o'clock and all is well." + Environment.NewLine, 
            listener.ToString());
    }
}

Note that CallOutTheHour is declared on the base, but that the WriteMessage method it actually calls is on the derived class. This is what people mean when they say that the virtual method is replaced; the base class can’t even call its own WriteMessage method. Once something is overridden, only the derived class has the option of calling the original method. We could write a method like this on TraceLogger:

public void WriteToEventLogDirect(string source, string message)
{
    base.WriteMessage(source, message);
}

We can see that this “works” by observing the exception.

[TestMethod]
[ExpectedException(typeof(SecurityException))]
public void TestWriteDirectlyToEventLog()
{
    new TraceLogger(null).WriteToEventLogDirect("Foo", "Bar");
}

To beat this to death, we can see that it doesn’t matter what type the calling code thinks it’s using; the override replaces the virtual.

[TestMethod]
public void TestCasting()
{
    using (var listener = new StringWriter())
    {
        TraceLogger t = new TraceLogger(new TextWriterTraceListener(listener));
        BasicLogger target = t;

        target.CallOutTheHour("One", "Six");
        Assert.AreEqual(
            "LoggerApp Information: 0 : One: Six o'clock and all is well." + Environment.NewLine,
            listener.ToString());
    }
}

Shadowing Basics

Shadowing, using the Shadows keyword in Visual Basic, or the new keyword in C#, “merely hides” the method in the base class. What does this mean? Let’s modify our loggers and see. First I’m going to make BasicLogger.WriteMessage a non-virtual method.

protected void WriteMessage(string source, string message)
{
    EventLog.WriteEntry(source, message);
}

This generates a compiler error:

'LoggerApp.Shadow.TraceLogger.WriteMessage(string, string)': cannot override inherited member 'LoggerApp.Shadow.BasicLogger.WriteMessage(string, string)' because it is not marked virtual, abstract, or override 

The Visual Basic compiler is even more to the point.

'Protected Overrides Sub WriteMessage(source As String, message As String)' cannot override 'Protected Sub WriteMessage(source As String, message As String)' because it is not declared 'Overridable' 

In both languages, though, “it” is ambiguous. We know that “it” refers to the base class method, because that’s all I changed.  We can resolve the error by removing the override declaration.

protected void WriteMessage(string source, string message)
{
    this.Log.TraceEvent(
        TraceEventType.Information, 
        0, 
        "{0}: {1}", 
        source,
        message);
}

Removing the override keyword in TraceLogger changes the error into a warning in C#.

'LoggerApp.Shadow.TraceLogger.WriteMessage(string, string)' hides inherited member 'LoggerApp.Shadow.BasicLogger.WriteMessage(string, string)'. Use the new keyword if hiding was intended. 

Visual Basic also generates a warning, but doesn’t suggest the Shadows keyword, instead it suggests Overloads.

sub 'WriteMessage' shadows an overloadable member declared in the base class 'BasicLogger'. If you want to overload the base method, this method must be declared 'Overloads'. 

I guess we would have to look at the IL to see if there is a difference between Shadows and Overloads in this situation, but I’m not going to go that far today. For now we’ll add the new keyword to our TraceLogger.WriteMessage declaration and see how our tests run.

protected new void WriteMessage(string source, string message)
{
    this.Log.TraceEvent(
        TraceEventType.Information, 
        0, 
        "{0}: {1}", 
        source,
        message);
}

When using shadowing, TestCasting and TestTraceLogger both fail with exceptions. In fact, every method now throws an exception; it’s just that the other two tests were expecting exceptions, and these two were not. It seems now that BasicLogger.WriteMessage is always called. Before exploring why, I’ll decorate these two tests with ExpectedExceptionAttributes so that they pass, then we’ll write some new tests to continue to explore shadowing.

CallOutTheHour is declared on BasicLogger and so is “underneath” the shadow cast by TraceLogger. TraceLogger is doing its best to hide the method with its own implementation, but it can’t hide it from BasicLogger‘s own methods. However, from any code that recognizes TraceLogger as a TraceLogger (including TraceLogger itself), the shadow prevents the BasicLogger.WriteMessage method from being executed. Let’s give TraceLogger its own public method to see that it can still call its own method:

public class TraceLogger : BasicLogger
{
    private readonly TraceSource Log;
    public TraceLogger(TraceListener listener)
    {
        // ...
    }

    protected new void WriteMessage(string source, string message)
    {
        // ...
    }

    // ...

    public void CallOutAlarm(string source, string message)
    {
        this.WriteMessage(source, "To Arms! It's " + message + "!");
    }
}

And here’s the test to verify:

[TestMethod]
public void TestSibling()
{
    using (var listener = new StringWriter())
    {
        new TraceLogger(new TextWriterTraceListener(listener))
            .CallOutAlarm("One", "Robin Hood");
        Assert.AreEqual(
            "LoggerApp Information: 0 : One: To Arms! It's Robin Hood!" + Environment.NewLine,
            listener.ToString());
    }
}

TraceLogger can still call the BasicLogger method using the base keyword. We don’t need a new test for this because our existing TestWriteDirectlyToEventLog test already covers that scenario.

Shadowing has an interesting capability which overriding lacks. You can use shadowing to change the visibility of a method. This lets us make WriteMessage public on TraceLogger if we like.

public new void WriteMessage(string source, string message) 

After this change, we have no errors or warnings, and our tests still pass. But now we can test WriteMessage directly if we want to.

[TestMethod]
public void TestWriteMessage()
{
    using (var listener = new StringWriter())
    {
         new TraceLogger(new TextWriterTraceListener(listener))
            .WriteMessage("Foo", "Bar");
        Assert.AreEqual(
            "LoggerApp Information: 0 : Foo: Bar" + Environment.NewLine,
            listener.ToString());
    }
}

This capability is interesting, but confusing. Compared to overriding, there is more mental baggage to keep track of: you have to remember which side of the shadow each piece of code is on, and you have to remember what interface the caller is using. Callers that create objects like this:

var t = new TraceLogger(...); TraceLogger v = new TraceLogger(...); 

Will get different behavior than those who declare the reference like this:

BasicLogger b = new TraceLogger(...); 

That’s something to think about before jumping into method hiding.

Considerations

Some people, when learning about shadowing, become very consternated. They think hiding methods is a bad idea, and it probably is. Then they go one step too far and use it as an example of bad design in .NET. But it has nothing to do with .NET; C++ has always had the ability to hide non-virtual methods. Consider this code from SO#5289774:

class Foo;
struct B : A<Foo> {
  void doSomething();
};

// and in the .cpp file:
#include "Foo.h"

void B::doSomething() {
  A<Foo>::doSomething();
}

Instead of just letting this happen, .NET has a keyword that lets you do it explicitly, and the compiler has a warning that might give you a clue that what you’re about to do is not what you intend. But the point in both languages is that hiding methods is a confusing implementation choice and should be used sparingly if at all. In simple cases like our logger, we should have just added another public method instead of using shadowing to increase the visibility of WriteMessage. If you only want increased visibility for testing purposes, you can subclass in your test project and place the extra accessor method there.

Coding, Presentations

CoffeeScript Crash Course

CoffeeScript Crash Course | SoCalCodeCamp @ UCSD | 6/24/2012

As I mentioned in recent posts, SoCalCodeCamp was last weekend.  I presented two sessions:

  • Stop Guessing About MEF Composition, Start Testing
  • CoffeeScript Crash Course

I did not record video for the MEF session, but you can find a recording of my MEF session from SoCalCodeCamp @ CSU Fullerton linked from this previous post.  The video for the CoffeeScript session is embedded above.  Below, find links to the presentation materials.

Since I’m not an expert in CoffeeScript or JavaScript, this session was targeted at beginners and was meant to be my fun or challenging session.  It was also an excuse to sneak in a Computer Science topic and talk about Finite State Machines.  During the session, I live coded a FSM that accepts even length strings.  Here’s what it looks like:

UCSD.FSM

I explain a bit about how this works in the video and I was able to work through it during the session.  But there were two bugs that prevented the machine from providing the correct results.  Oh well, people were mostly there to see CoffeeScript, getting the machine to work was secondary.  After getting home, I figured out the bugs and posted the code.  I also wanted to create this supplemental post to show a cleaner version of the working code.  In this article, I’ll also show how to implement the machine using CoffeeScript classes, which I wasn’t able to do in the demo.

The code is available as a gist, and if you are really interested in its evolution, you can step back and forth between the different commits and study the changes.

Let’s get started.

Instant Feedback With Node.js

First install node.js and the coffee-script NPM module. You can find instructions on coffeescript.org.  Watch-compile your source file with this command:

coffee --watch --compile .\coffeescriptcrashcourse.coffee 

A *.js file should appear with the same name as the CoffeeScript file. In our case, coffeescriptcrashcourse.js. Open both files in Sublime Text 2, side by side.

When you make changes in the CoffeeScript file and save, the watch compiler will update the JavaScript as soon as the file hits the disk. To see the changes, just click in the Sublime window displaying the js, and Sublime should instantly refresh (as long as you haven’t made any changes in that window).

Shrinking the Implementation

Creating rules

My initial implementation of createRule (created live during my SoCalCodeCamp session) looked like this:

createRule = (currentState, inputCharacter, nextState) ->
 state: currentState
 char: inputCharacter
 next: nextState
 Match: (machineState, i) ->   
   @state is machineState and @char is i

This works fine, and it’s the most obvious way to set up a function that does what we want. Like I told my session attendees, just try what you think will work and it probably will. But if we try, we can take advantage of some additional CoffeeScript language features to make this implementation shorter:

createRule = (CurrentState, InputChar, NextState) ->
  GetNextState: () -> NextState
  Match: (machineState, inputChar) -> 
    machineState is CurrentState and inputChar is InputChar

CoffeeScript doesn’t require us to manually create fields to store the state passed into the function; the variables will be automatically visible to our object. But they won’t be available later on unless we close over them. Hence, while we removed three lines, we added one back in order to make NextState accessible. I also renamed some of the variables, and then updated the machine implementation to use the new GetNextState function instead of directly accessing data.

I test, and everything works.
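If you’re curious what the closure buys us, here is a hand-written JavaScript sketch of what this createRule amounts to (not actual compiler output):

```javascript
// The three parameters live in the closure, so the returned object
// carries only the two functions; there are no data fields at all.
function createRule(CurrentState, InputChar, NextState) {
  return {
    GetNextState: function () { return NextState; },
    Match: function (machineState, inputChar) {
      return machineState === CurrentState && inputChar === InputChar;
    }
  };
}
```

Nothing outside the object can read CurrentState or InputChar, and the only way to get at NextState is through GetNextState.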

Creating machines

Likewise, I can tighten up the createMachine implementation with the same technique. Here is the original:

createMachine = (initialState, rules, finalStates) ->
  currentState: initialState
  ruleSet: rules
  acceptance: finalStates 
  ReadAll: (s) ->
    @Read x for x in s.split ''
    return undefined

  Read: (c) ->
    return if @currentState == 2 
    for rule in @ruleSet
      if rule.Match @currentState, c
        return @Update rule
    @currentState = 2

  Update: (r) ->
    @currentState = r.GetNextState()
  AcceptedInput: () ->
    @currentState in @acceptance

And the new:

createMachine = (CurrentState, Rules, FinalStates) ->
  ReadAll: (value) ->
    @Read x for x in value.split ''
    return undefined
  Read: (inputChar) ->
    return if CurrentState == 2 
    for rule in Rules
      return @Update rule if rule.Match CurrentState, inputChar        
    CurrentState = 2   
  Update: (rule) ->
    CurrentState = rule.GetNextState()
  AcceptedInput: () ->
    CurrentState in FinalStates

I also eliminated one level of nesting by turning the loop’s inner conditional statement into a postfix condition. Since nothing outside the object manipulated the object’s fields, I don’t need to provide any getters for CurrentState, Rules, or FinalStates. Although I can just take advantage of closure to capture my data members, I still need to use the @ symbol when referring to non-closure data (in this case the non-closure fields are all functions, but the same should be true if they were just data values).

I test this and everything still works.

Machine Factory

During the demo I created one machine instance by writing down the rules and then invoking createMachine. Once I had the machine, I used it to evaluate a string and query whether the machine accepted the input. Due to a couple bugs, the machine rejected input that it should have accepted. However, once I got home and debugged it, the machine worked as expected.

Creating a one-off machine was fine for the demo, but not very useful when we want to test the machine with different values. This is because a new machine must be created for each input string, since there is no way to reset a machine after it has run. createMachine is (almost) a method for creating any finite state machine. I say “almost” because the “dead” state is hard coded as 2. To make createMachine useful, we must create a rule set and pass it to the machine along with the start state, final states, and (ideally) the dead state.

Let’s fix the dead state issue first.

createMachine = (CurrentState, Rules, FinalStates, DeadState) ->
  ReadAll: (value) ->
    @Read x for x in value.split ''
    return undefined
  Read: (inputChar) ->
    return if CurrentState == DeadState 
    for rule in Rules
      return @Update rule if rule.Match CurrentState, inputChar        
    CurrentState = DeadState   
  Update: (rule) ->
    CurrentState = rule.GetNextState()
  AcceptedInput: () ->
    CurrentState in FinalStates

By promoting the dead state to a parameter, we should be able to use this code to create just about any (deterministic) Finite State Machine. Of course, we need to pass 2 when we call createMachine for our current example to continue to work.

myMachine = createMachine 0, rules, [0], 2 

Now, we can wrap up the code that creates the specific FSM from our demo.

createMyMachine = () ->
  rules = [
    createRule 0, '0', 1
    createRule 0, '1', 1
    createRule 1, '0', 0
    createRule 1, '1', 0
  ]
  createMachine 0, rules, [0], 2

With createMyMachine we can test with blocks like this:

myMachine = createMyMachine()
myMachine.ReadAll "00"
alert myMachine.AcceptedInput()

myMachine2 = createMyMachine()
myMachine2.ReadAll "1"
alert myMachine2.AcceptedInput()

But clearly we have an opportunity to remove duplication, so let’s do that. While we’re in there, we can make the output a little more informative too.

testMyMachine = (input) ->
  myMachine = createMyMachine()
  myMachine.ReadAll input
  alert [input, myMachine.AcceptedInput()].join ','

Now we have a prettier implementation that’s more useful than the original, though it happens to use the same number of lines.

Get Classy

During the demo I couldn’t show the attendees CoffeeScript’s class support. When I watched the video I realized why: I forgot to put the skinny arrow after my constructor! I should have kept a closer eye on the top right corner of coffeescript.org; there was a message up there which probably would have taken me to the right line.

Since I couldn’t do it then, let’s do it now and convert our two objects into classes, starting with the machine.

class Machine
  constructor: (@CurrentState, @Rules, @FinalStates, @DeadState) ->
  ReadAll: (value) =>
    @Read x for x in value.split ''
    return undefined
  Read: (inputChar) => 
    return if @CurrentState is @DeadState
    for rule in @Rules
      return @Update rule if rule.Match @CurrentState, inputChar
    @CurrentState = @DeadState
    return undefined
  Update: (rule) =>
    @CurrentState = rule.GetNextState()
    return undefined
  AcceptedInput: () => 
    @CurrentState in @FinalStates

Here is our Machine class. It’s not that different from our createMachine implementation. However, we do go back to using @ everywhere, and we are using the fat arrow. The fat arrow seems like the right thing to do, according to my reading of the CoffeeScript documentation:

When used in a class definition, methods declared with the fat arrow will be automatically bound to each instance of the class when the instance is constructed.

coffeescript.org

However, I tried it with the skinny arrow and in our use case, it still worked. So, I still can’t claim to understand this binding. Oh well, we also need to update createMyMachine to use the new operator:

createMyMachine = () ->
  rules = [
    createRule 0, '0', 1
    createRule 0, '1', 1
    createRule 1, '0', 0
    createRule 1, '1', 0
  ]
  new Machine 0, rules, [0], 2
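
As an aside, the fat-arrow binding can be demonstrated in plain JavaScript, which is what CoffeeScript compiles to. A method that reads "this" loses its receiver when detached from its instance unless it was bound at construction time, which is roughly what the fat arrow arranges. This is only a hedged sketch, not CoffeeScript’s exact compiled output:

```javascript
// Sketch: what fat-arrow binding buys you. Binding in the
// constructor keeps `this` correct even when the method is
// detached from its instance.
function Machine(currentState) {
  this.currentState = currentState;
  // Rough equivalent of a fat-arrow method: bind it per instance.
  this.acceptedInput = this.acceptedInput.bind(this);
}

Machine.prototype.acceptedInput = function () {
  return this.currentState === 0;
};

var machine = new Machine(0);
var detached = machine.acceptedInput; // handed off without its instance
console.log(detached()); // true -- bind() preserved `this`
```

This also hints at why the skinny arrow “still worked” in our use case: we always call the methods through the instance (myMachine.ReadAll ...), so this is bound correctly at each call site and the extra binding never gets exercised.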

Now let’s turn createRule into a class.

class Rule 
  constructor: (@CurrentState, @InputChar, @NextState) ->
  GetNextState: () => @NextState  
  Match: (machineState, inputChar) =>
    machineState is @CurrentState and inputChar is @InputChar

Nothing very exciting, just more of what we saw when we converted Machine. We just need to update createMyMachine to get this to work with our test.

createMyMachine = () ->
  rules = [
    new Rule 0, '0', 1
    new Rule 0, '1', 1
    new Rule 1, '0', 0
    new Rule 1, '1', 0
  ]
  new Machine 0, rules, [0], 2

In our scenario, there really isn’t a big advantage to using classes, because we have no need to extend Rule or Machine. But, I wanted to show what this looks like because I’m sure it will be interesting to someone getting started with CoffeeScript.

Someday…

Someday I’ll do one of these machines without iterating over a set of rules. The approach in this example is O(n × m), where n is the length of the input and m is the number of rules. If we had long strings and large rule sets, we would spend a lot of time looping in the “Read” method.
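
For the record, here is what that might look like: a transition table keyed by state and input character makes each read an O(1) lookup instead of a scan over the rule list. This is a hedged sketch in JavaScript (roughly what the CoffeeScript compiles to), reusing the even-length machine from the demo:

```javascript
// Finite state machine using an O(1) transition-table lookup
// instead of scanning a rule list on every character.
function createTableMachine(startState, table, finalStates, deadState) {
  var currentState = startState;
  return {
    read: function (ch) {
      if (currentState === deadState) return;
      var next = table[currentState + ":" + ch];
      currentState = (next === undefined) ? deadState : next;
    },
    readAll: function (input) {
      for (var i = 0; i < input.length; i++) this.read(input[i]);
    },
    acceptedInput: function () {
      return finalStates.indexOf(currentState) !== -1;
    }
  };
}

// Same machine as the demo: accepts even-length strings over {0,1}.
var table = { "0:0": 1, "0:1": 1, "1:0": 0, "1:1": 0 };
var m = createTableMachine(0, table, [0], 2);
m.readAll("00");
console.log(m.acceptedInput()); // true
```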

Since there was so much interest in this session, maybe I’ll do it again at the next CodeCamp, but there are still plenty of other languages I’d like to play with including .NET languages like Nemerle and F#.  Whatever crash course I do in the future, it will be for beginners and it will be for fun.

Coding, Testing

When is the ExportProvider Interesting?

Recently, when writing about the CompositionTests API (here) I wrote that our test should focus on the interesting part of the composition and that most of the time the catalog is the only interesting item.  Later I wrote that there were other scenarios where the ExportProvider was also interesting and when I wrote that I was thinking of custom ExportProvider implementations.  Just Google “mef custom exportprovider” and you will find a wealth of articles written by people who created their own ExportProviders to adapt MEF for their specific needs.  I think I even built one back when I first started out with MEF.

But let’s say you haven’t needed or wanted to roll your own ExportProvider. Does trusty old CompositionContainer ever affect composition? Can it ever be interesting enough to motivate you to use VerifyCompositionInfo?

A Simple Mailer

Here’s an example of when you might want to use VerifyCompositionInfo because the CompositionContainer is interesting.  Say you are working on an application that wants to send emails. You might start with a little mail sender like this:

[Export(typeof(ISendMail))]
public class BasicMailSender : ISendMail
{
    [ImportingConstructor]
    public BasicMailSender(
        SmtpHost host,
        Port port)
    {
        this.port = port;
        this.host = host;
    }

    public void SendMessage(MailMessage message)
    {
        // ...
    }
}

You have read the CompositionTests readme, and written a simple Program class that allows you to share your ComposablePartCatalog (in this case a TypeCatalog) with your test.

public class Program
{
    static Program()
    {
        Catalog = new TypeCatalog(typeof(BasicMailSender));
        Host = new CompositionContainer(Catalog);
    }

    public static TypeCatalog Catalog { get; private set; }

    public static CompositionContainer Host { get; private set; }

    public static void Main(string[] args)
    {
    }
}

And written a test.

using ApprovalTests.Reporters;
using CompositionTests;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace MailApp.Tests
{
    [TestClass]
    [UseReporter(typeof(DiffReporter))]
    public class CompositionTest
    {
        [TestMethod]
        public void VerifyComposition()
        {
            MefComposition.VerifyTypeCatalog(Program.Catalog);
        }
    }
}

Given our current setup, we would expect an unhappy composition, shown below.

[Part] MailApp.BasicMailSender from: TypeCatalog (Types='MailApp.BasicMailSender').
  [Primary Rejection]
  [Export] MailApp.BasicMailSender (ContractName="MailApp.BasicMailSender")
  [Import] MailApp.BasicMailSender..ctor (Parameter="host", ContractName="MailApp.SmtpHost")
    [Exception] System.ComponentModel.Composition.ImportCardinalityMismatchException: No exports were found that match the constraint: 
    ContractName    MailApp.SmtpHost
    RequiredTypeIdentity    MailApp.SmtpHost
   at System.ComponentModel.Composition.Hosting.ExportProvider.GetExports(ImportDefinition definition, AtomicComposition atomicComposition)
   at System.ComponentModel.Composition.Hosting.ExportProvider.GetExports(ImportDefinition definition)
   at Microsoft.ComponentModel.Composition.Diagnostics.CompositionInfo.AnalyzeImportDefinition(ExportProvider host, IEnumerable`1 availableParts, ImportDefinition id)
  [Import] MailApp.BasicMailSender..ctor (Parameter="port", ContractName="MailApp.Port")
    [Exception] System.ComponentModel.Composition.ImportCardinalityMismatchException: No exports were found that match the constraint: 
    ContractName    MailApp.Port
    RequiredTypeIdentity    MailApp.Port
   at System.ComponentModel.Composition.Hosting.ExportProvider.GetExports(ImportDefinition definition, AtomicComposition atomicComposition)
   at System.ComponentModel.Composition.Hosting.ExportProvider.GetExports(ImportDefinition definition)
   at Microsoft.ComponentModel.Composition.Diagnostics.CompositionInfo.AnalyzeImportDefinition(ExportProvider host, IEnumerable`1 availableParts, ImportDefinition id)

So we need to provide SmtpHost and Port implementations. The decision on the Port implementation is easy. While we might want to use a non-default Port during testing, we feel pretty safe determining that we want to use the default SMTP port (25) in production. So we can whip up a DefaultPort implementation and export it.

[Export(typeof(Port))]
public class DefaultPort : Port
{
    public DefaultPort() :
        base(25)
    {
    }
}

Everything works as expected if we update our TypeCatalog with this new type.

static Program()
{
    Catalog = new TypeCatalog(typeof(BasicMailSender), typeof(DefaultPort));
    Host = new CompositionContainer(Catalog);
}

And now the test produces this result:

[Part] MailApp.BasicMailSender from: TypeCatalog (Types='MailApp.BasicMailSender, MailApp.DefaultPort').
  [Primary Rejection]
  [Export] MailApp.BasicMailSender (ContractName="MailApp.BasicMailSender")
  [Import] MailApp.BasicMailSender..ctor (Parameter="host", ContractName="MailApp.SmtpHost")
    [Exception] System.ComponentModel.Composition.ImportCardinalityMismatchException: No exports were found that match the constraint: 
    ContractName    MailApp.SmtpHost
    RequiredTypeIdentity    MailApp.SmtpHost
   at System.ComponentModel.Composition.Hosting.ExportProvider.GetExports(ImportDefinition definition, AtomicComposition atomicComposition)
   at System.ComponentModel.Composition.Hosting.ExportProvider.GetExports(ImportDefinition definition)
   at Microsoft.ComponentModel.Composition.Diagnostics.CompositionInfo.AnalyzeImportDefinition(ExportProvider host, IEnumerable`1 availableParts, ImportDefinition id)
  [Import] MailApp.BasicMailSender..ctor (Parameter="port", ContractName="MailApp.Port")
    [SatisfiedBy] MailApp.DefaultPort (ContractName="MailApp.Port") from: MailApp.DefaultPort from: TypeCatalog (Types='MailApp.BasicMailSender, MailApp.DefaultPort').

[Part] MailApp.DefaultPort from: TypeCatalog (Types='MailApp.BasicMailSender, MailApp.DefaultPort').
  [Export] MailApp.DefaultPort (ContractName="MailApp.Port")

The Port is now satisfied, but SmtpHost is trickier. While the default port for SMTP is defined by standard, there is no such thing as a default server. Now, I would note that one option is to simply approve this result. There is no rule that says you can only approve a CompositionTest when it has no exceptions. The fact that SmtpHost can’t be satisfied could just mean that this part must be provided by a third-party at deployment time. However, there are two reasons why I’ll try to satisfy all of the parts in this test.

  • Ignoring exceptions is like ignoring warnings. When there are only a few, you can say to yourself, “Well, that one is there for a reason, and I understand why.” But as the number of warnings/exceptions grows, the signal-to-noise ratio drops and real errors are eventually obscured.
  • If I don’t try to satisfy the part, I won’t be able to show you a scenario where the ExportProvider is interesting.

Let’s say we have an SmtpHost in our test project that we use for development; it points at localhost. When making these kinds of test collaborators I usually nest the Testing class inside the test it supports.

[TestClass]
[UseReporter(typeof(DiffReporter))]
public class CompositionTest
{
    [TestMethod]
    public void VerifyComposition()
    {
        MefComposition.VerifyTypeCatalog(Program.Catalog);
    }

    [Export(typeof(SmtpHost))]
    public class TestingSmtpHost : SmtpHost
    {
        public TestingSmtpHost()
            : base("localhost")
        {
        }
    }
}

It would be nice to use this part when verifying composition, but since it is in our test class, our production code can’t see it. One way around this is to add an instance to our CompositionContainer using a CompositionBatch. We’ll need to add a method, Program.AddInstance, to allow this.

public class Program
{
    static Program()
    {
        Catalog = new TypeCatalog(typeof(BasicMailSender), typeof(DefaultPort));
        Host = new CompositionContainer(Catalog);
    }

    public static TypeCatalog Catalog { get; private set; }

    public static CompositionContainer Host { get; private set; }

    public static void AddInstance(object part)
    {
        var batch = new CompositionBatch();
        batch.AddPart(part);
        Host.Compose(batch);
    }

    // ...
}

Now we can add a class constructor to our test, so that Program is only manipulated once during the course of the test.

[TestClass]
[UseReporter(typeof(DiffReporter))]
public class CompositionTest
{
    static CompositionTest()
    {
        Program.AddInstance(new TestingSmtpHost());
    }

Note that you can’t just throw any object into the method and expect it to compose, it has to be an attributed MEF part. Ok, now we might expect our test result to show a happy composition, but the BasicMailSender still fails to compose.

[Import] MailApp.BasicMailSender..ctor (Parameter="host", ContractName="MailApp.SmtpHost")
  [Exception] System.ComponentModel.Composition.ImportCardinalityMismatchException: No exports were found that match the constraint: 
    ContractName    MailApp.SmtpHost
    RequiredTypeIdentity    MailApp.SmtpHost

What’s up with that? Look closely at the AddInstance method. Nowhere in that method do we manipulate the Catalog.  When we use CompositionBatch, we are adding the part directly to the CompositionContainer, outside the catalog. So our ExportProvider (in this case just a plain old CompositionContainer) is suddenly interesting. Without it, there is no way for the CompositionInfo used by CompositionTests to know about the instance of TestingSmtpHost that we added.

Let’s rewrite our test to use an overload of VerifyCompositionInfo so that we can pass in the ExportProvider.

[TestMethod]
public void VerifyComposition()
{
    MefComposition.VerifyCompositionInfo(Program.Catalog, Program.Host);
}

And now our composition is happy.

[Part] MailApp.BasicMailSender from: TypeCatalog (Types='MailApp.BasicMailSender, MailApp.DefaultPort').
  [Export] MailApp.BasicMailSender (ContractName="MailApp.BasicMailSender")
  [Import] MailApp.BasicMailSender..ctor (Parameter="host", ContractName="MailApp.SmtpHost")
    [SatisfiedBy] MailApp.Tests.CompositionTest+TestingSmtpHost (ContractName="MailApp.SmtpHost") from: MailApp.Tests.CompositionTest+TestingSmtpHost
  [Import] MailApp.BasicMailSender..ctor (Parameter="port", ContractName="MailApp.Port")
    [SatisfiedBy] MailApp.DefaultPort (ContractName="MailApp.Port") from: MailApp.DefaultPort from: TypeCatalog (Types='MailApp.BasicMailSender, MailApp.DefaultPort').

[Part] MailApp.DefaultPort from: TypeCatalog (Types='MailApp.BasicMailSender, MailApp.DefaultPort').
  [Export] MailApp.DefaultPort (ContractName="MailApp.Port")

Notice that while the TestingSmtpHost satisfies the host parameter on the BasicMailSender constructor, its “from” tag doesn’t indicate that it comes from a catalog (in contrast, you can see that DefaultPort came from the TypeCatalog). Likewise, TestingSmtpHost has no [Part] listing, while DefaultPort does.

Most of the time, you can analyze your composition using just the catalog, but the container can have parts too. When your container has parts, that’s when you’ll need to use VerifyCompositionInfo directly to get the full analysis.

You can download and play with the MailApp code from the Sample directory in the CompositionTests github repository.

Coding, Testing

Catching Email With Rnwood.SmtpServer

Not too long ago, Llewellyn Falco posted Using ApprovalTests in .Net 19 Email, where he describes a really easy way to test email messages using ApprovalTests.  The video describes a testing seam that separates message creation from message sending, and this makes testing email straightforward.  If you are currently working with .NET source, then you really should follow the simple instructions in that video and stop reading this post; it’s not for you.

This post is for you if:

  • You don’t control the source of the email you want to test.
  • You control the source, but it’s not .NET.
  • You control the source in theory, but you can’t change it (e.g. the boss says no).

Context

In my case, I’m moving a little reporting script from Perl to .NET.  Let’s call the script “report.pl” for the sake of discussion.  We want to move from Perl to .NET using TDD and without any noticeable changes for the users (who happen to be the Executive Committee where I work).  I don’t want to create a system that works more or less the same; I want one that works exactly the same.  I need to lock down the current system and use it as the gold standard for the new system.

Here’s what the current script looks like with the responsibilities color-coded:

image

The blue blocks are responsibilities that are implementation-specific: when the script wakes up, the first thing it does is read its config and validate its environment.  It’s not likely that these tasks will transfer to a new system.

The script creates an Excel Workbook; this responsibility is identified by green blocks.

The script reads its data from a database; this responsibility is identified by red blocks.

The script manipulates the data before writing it to the Workbook; these blocks are purple.

The script sends the Workbook to its recipients over email; this block is yellow.  This is the first block I’ll work on, since it’s the only one that isn’t interwoven with other responsibilities.

Strategy

1. Identify a responsibility

image

2. Let’s assume that this responsibility was encapsulated in a subroutine.  Copy that subroutine to a new Perl script “sendReport.pl”.  If the responsibility was not encapsulated in a subroutine, make sure to wrap the lines of code you copy with a sub{} in sendReport.pl.

image

3. Use Perl’s “require” mechanism to include “SendReport.pl” in “Report.pl” and remove the local definition of the “sendReport” subroutine.  The script will call the imported definition instead.

image

4. Create a third Perl script “SendReportRunner.pl” which is just a thin shell around the extracted responsibility; it lets me execute the responsibility with any parameters I like.

image

5. Create a unit test in C# that uses a Process object to invoke SendReportRunner.pl.  Notice that Report.pl is no longer in the picture.

image

6. Capture the output in an ApprovalTest.  Because SendReport.pl actually wants to send email over SMTP, this is where the plan starts to go off the rails, but we can work through it.

image

7. Build a C# implementation (Report.dll) that passes the same ApprovalTest.  Once we have a C# implementation, we can create a seam that separates the message creation from the message sending, and use EmailApprovals, just like Llewellyn described, but getting past step 6 will be tricky.

image

8. Now we need to get Report.dll working with Report.pl.  I’ll create a C# shell that invokes Report.dll from the command line.

image

9. Replace the sendReport() call in Report.pl with a system() call to ReportRunner.exe.

image

10. Repeat until Report.pl is just a bunch of blue blocks containing system() calls to C# code runners.

11. Replace Report.pl with a C# executable, PowerShell script or whatever.

Working the responsibility through Step 4 was relatively easy.  I spent some tedious hours tracing variables to determine their scope and effect.  Then I spent half a day wrestling with the COM object the Perl script uses to send the mail.  At the end of the day I could use SendReportRunner.pl to send emails and catch them with smtp4dev.  But to get past Step 6 I needed to answer the question: how do I get those messages into a .NET unit test so I can capture and approve them?

Catching Email

Smtp4dev is a nice little application that fills a similar niche to CassiniDev, a webserver I wrote about previously.  Smtp4dev sits in your system tray and listens on the SMTP port for incoming mail.  When it gets a message, it logs the message arrival in its window, and you can double-click the message to see it in your default email program:

image

Since smtp4dev lives on CodePlex, I figured there was a good chance that it was written in .NET, and sure enough it was written in C#.  Thinking back to CassiniDev, I wondered if there was a way I could host smtp4dev in my unit test, catch the messages from the Perl process, and then hand them over to ApprovalTests.  I grabbed the source for the project and found an example named “SimpleServer” that looked like it could be used to create a test fixture similar to the CassiniDev fixture I used when testing MVC views.

I created an empty class library and a test project to go with it.  The test project will need ApprovalTests and a reference to Rnwood.SmtpServer, which is the server that powers smtp4dev.  The server wasn’t on NuGet yet, so I put it there and used NuGet to add both references.  The pattern for creating the test fixture was nearly the same as creating a fixture for CassiniDev:

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Rnwood.SmtpServer;

namespace Report.Tests
{
  [TestClass]
  public class SmtpFixture : DefaultServer
  {
    public SmtpFixture(Ports port)
      : base(port)
    {
    }

    [TestInitialize]
    public void StartServer()
    {
      this.Start();
    }

    [TestCleanup]
    public void StopServer()
    {
      this.Stop();
    }
  }
}

To implement a test, I’ll extend this fixture.  In the broad strokes, we want something like this:

public SendReportTest()
  : base(Ports.SMTP)
{
}

[TestMethod]
public void SendReportOverEmail()
{
  try
  {
    this.MessageReceived += CatchMessage;
    GenerateMessage();
    ApprovalTests.Email.EmailApprovals.Verify("??");
  }
  finally
  {
    this.MessageReceived -= CatchMessage;
  }
}

I specify the default SMTP port in the constructor; I could have used “Ports.AssignAutomatically” and the server would have picked an empty port.  That’s nice functionality, but the Perl script wants to use the default port.  I’ve declared some methods but not implemented them, and I’m still not sure what I’m going to give to ApprovalTests.

When we get a MessageReceivedEvent, it will come with a MessageEventArgs and we need to figure out if we can somehow get a MailMessage from that, which is what EmailApprovals is expecting from us.  CatchMessage needs to do that for us.

We also need to generate a message. I’ll do that first, since once I can generate and catch messages I’ll be able to look at a live instance of MessageEventArgs and see what its guts look like.  Our goal is for the message to come from Perl, but for the moment we’ll just use an SmtpClient to stand in for the script.  Notice that we pass the fixture’s port number to the SmtpClient; if we were using a random port, this would ensure that we actually send it to the right place.

private void GenerateMessage()
{
  using (var client = new SmtpClient("localhost", this.PortNumber))
  {
    using (var message = new MailMessage(
      "noreply@localhost", 
      "jim@localhost", 
      "Hello World", 
      "Well, you caught me."))
    {
      client.Send(message);
    }
  }
}

Implementing CatchMessage gave me some pause.  If I use a lambda, I can’t easily unsubscribe from the event.  Maybe that doesn’t matter in the context of a test, but it’s a bad idea to leave events attached, and I don’t want to be in the habit.  I could unsubscribe safely if I had a regular method, but then I would need some plumbing to get the data back to the test method. I thought about it for a minute or two and decided to create a class to handle the event.  Later on this turned out to be a pretty good decision, because I was able to substitute some special logic to handle the Perl message in the catcher class without obscuring the test intention.

The basic MessageCatcher just needs to handle the event and store the message data.  Then we can create one of these in our test and use it there.

public class MessageCatcher
{
  public IMessage Message { get; private set; }
  public void CatchMessage(object sender, MessageEventArgs e)
  {
    this.Message = e.Message;
  }
}

[TestMethod]
public void SendReportOverEmail()
{
  var catcher = new MessageCatcher();
  try
  {
    this.MessageReceived += catcher.CatchMessage;
    GenerateMessage();
    ApprovalTests.Email.EmailApprovals.Verify(catcher.Message);
  }
  finally
  {
    this.MessageReceived -= catcher.CatchMessage;
  }
}

But it turns out that the IMessage interface is not what we want, because it’s not what EmailApprovals wants, and it’s not convertible into a MailMessage.  At the moment it doesn’t look like we can use EmailApprovals, but that doesn’t mean we can’t use ApprovalTests.  The SimpleServer example code shows how to dump the IMessage to an .eml file:

// If you wanted to write the message out to a file, then could do this...
File.WriteAllBytes("myfile.eml", e.Message.GetData());

It turns out that *.eml is just a fancy name for “text file”.  I don’t want to dump it to the file system if I can avoid it.  Since GetData() returns a Stream, I should be able to read it directly.

public class MessageCatcher
{
  public string Message { get; private set; }

  public void CatchMessage(object sender, MessageEventArgs e)
  {
    using (var reader = new StreamReader(e.Message.GetData()))
    {
      this.Message = reader.ReadToEnd();
    }
  }
}

Then I can update my test to be an ordinary Approval instead of an EmailApproval.  Since I think I’m about ready to run this, I add a FileLauncherReporter.

[TestMethod]
[UseReporter(typeof(FileLauncherReporter))]
public void SendReportOverEmail()
{
  var catcher = new MessageCatcher();
  try
  {
    this.MessageReceived += catcher.CatchMessage;
    GenerateMessage();
    ApprovalTests.Approvals.Verify(catcher.Message);
  }
  finally
  {
    this.MessageReceived -= catcher.CatchMessage;
  }
}

The test run completes and notepad launches:

image

This is both really cool and kind of a bummer.  It’s really cool because I can (in theory) catch the Perl script’s messages and use them as a baseline for developing my C# implementation.  Before moving on, though, I see one thing I need to take care of, and that is the timestamp in the middle of the message.  A little regex should take care of that:

public void CatchMessage(object sender, MessageEventArgs e)
{
  using (var reader = new StreamReader(e.Message.GetData()))
  {
    this.Message = Regex.Replace(
      reader.ReadToEnd(),
      @"Date:\s[\d\s\w,:-]+\d+\r\n",
      string.Empty);
  }
}
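
The same scrubbing idea translates to other languages; here it is sketched in JavaScript against a made-up raw message (the pattern mirrors the C# regex above):

```javascript
// Strip the volatile Date header from a raw message so that
// approval comparisons stay stable between runs.
function scrubDateHeader(rawMessage) {
  return rawMessage.replace(/Date:\s[\d\s\w,:-]+\d+\r\n/, "");
}

var raw =
  "From: noreply@localhost\r\n" +
  "Date: Mon, 1 Jan 2024 12:34:56 -0800\r\n" +
  "Subject: Hello World\r\n";

console.log(scrubDateHeader(raw));
// From: noreply@localhost
// Subject: Hello World
```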

The bigger disappointment is that notepad launched at all.  When Llewellyn used a FileLauncherReporter in his video, Thunderbird launched.  That was cool.  I’m jealous.  Luckily ApprovalTests is open source, so I can go see how he did that.  It turns out to be pretty simple: we just need to make sure that when ApprovalTests saves the received file, it uses the .eml extension.  To do this, I make a small change to the way I call Verify().

[TestMethod]
[UseReporter(typeof(FileLauncherReporter))]
public void SendReportOverEmail()
{
  var catcher = new MessageCatcher();
  try
  {
    this.MessageReceived += catcher.CatchMessage;
    GenerateMessage();
    Approvals.Verify(new ApprovalTextWriter(catcher.Message, "eml"));
  }
  finally
  {
    this.MessageReceived -= catcher.CatchMessage;
  }
}

And now the file launches in my default mail client, which happens to be Outlook.

image

Catching Perl

Now that I understand how to catch mail using Rnwood.SmtpServer, and my childish need to see my message in an email client is satisfied, I can get this working with Perl.  I’m going to create a PerlMessageGenerator class for that.

public class PerlMessageGenerator : IMessageGenerator
{
  private const string MissingPerlMessage = "You must have a 32-bit perl at [{0}]. Please visit http://www.activestate.com/ to acquire Perl.";
  private const string PerlPath = @"C:\Perl\bin\perl.exe";

  public PerlMessageGenerator()
  {
    if (!File.Exists(PerlPath))
    {
      throw new InvalidOperationException(MissingPerlMessage.FormatWith(PerlPath));
    }
  }

  public void GenerateMessage(string host, string to, string attachmentPath)
  {
    var binPath = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
    var arguments = "sendReportRunner.pl {0} {1} {2}".FormatWith(host, to, attachmentPath);
    var pi = new ProcessStartInfo(PerlPath, arguments)
    {
      UseShellExecute = false,
      WorkingDirectory = binPath,
      CreateNoWindow = true
    };
    using (var p = new Process { StartInfo = pi })
    {
      p.Start();
      p.WaitForExit();
    }
  }
}

Now I just need to get my scripts into place by adding them as linked files to my test project, with “Copy To Output Directory” set to “Copy Always.”  This actually works: the test catches the Perl message, but as I mentioned, the Perl output needs some additional scrubbing over and above the simple message.  SendReport.pl adds another timestamp in the subject line, a Message-ID field that varies on each run, and because it has an attachment, there are MIME boundaries that need to be ditched.  I’ll spare you the gory details.

The important part is that we caught the message.  After creating a separate PerlMessageCatcher to handle all the special cases, my test passes consistently in Visual Studio.  Just for kicks, and because this will eventually be production code, I turn on NCrunch.  And I’m very happy to see that the test passes under NCrunch as well.

Here’s the final test class:

[TestClass]
public class SendReportTest : SmtpFixture
{
  public SendReportTest()
    : base(Ports.SMTP)
  {
  }

  [TestMethod]
  public void SendReportOverEmail()
  {
    var catcher = new PerlMessageCatcher();
    try
    {
      this.MessageReceived += catcher.CatchMessage;
      new PerlMessageGenerator().GenerateMessage("localhost", "jim@contoso.com", "sendreport.pl");
      Approvals.Verify(new ApprovalTextWriter(catcher.Message, "eml"));
    }
    finally
    {
      this.MessageReceived -= catcher.CatchMessage;
    }
  }
}

That’s probably enough for one day.  I’ve made it past Step 6 in my porting list.  PerlMessageCatcher is pretty twisted code and could use some refactoring.  On the other hand, once I make it to Step 9, the (as yet non-existent) .NET implementation will be the canonical implementation, and I can simply use EmailApprovals directly.  The need for the PerlMessageCatcher will go away, so perhaps getting to Step 9 is a more worthy goal than refactoring the catcher.  We’ll see.

Coding, Testing

MEF Composition Tests, Redux

Photo Credit: FutUndBeidl

A while back I gave a talk at SoCalCodeCamp about testing MEF composition.  Because there are a few steps involved in setting up the test, I wrote Stop Guessing About MEF Composition And Start Testing.  I had hoped that the article, along with the source code from my talk, would help more people cross the gap from having once heard that it was possible to test composition to actually implementing tests.  Since there were only about five people at my talk, it didn’t take long for the number of blog readers to exceed the number of attendees, and yet I still wonder how many people conclude that setting up the test is too complicated and just use MEFX or Visual MEFX “as needed”.

While the MEFX tools are valuable (and these composition tests are just a more powerful way to automate MEFX), running tests on an ad-hoc basis is never ideal.  Bugs creep in when you don’t expect them to, and automated tests are your defense against them.  I also find composition tests provide valuable design-time feedback.  While taking questions at the end of the talk, I was asked about my process for designing MEF parts.  I answered that I more or less disregarded MEF until the end of the process, and then just added the needed attributes.  That was a truthful answer, but over the intervening months, I’ve realized it was the wrong answer.  What finally put the nail in the coffin was this great article: How test-driven development works (and more!), by J. B. Rainsberger.  After reading it, it hit me that I had set up a little waterfall process.  First, design and build a unit.  Then, glue MEF onto it.  The problem is that even when “keeping MEF in mind” I would sometimes do foolish things like pass a bunch of primitives into a constructor, and this would lead to rework at the end of the development cycle when I tried to shove my class into a container.

So, to avoid rework and break some bad habits, I decided to start testing composition to get feedback during the design phase as soon as I know I’m going to use MEF (which these days is most of the time).  It’s already paying dividends on my current project: as soon as I see the need to pass in a simple piece of data like a string or an integer, my composition test will blow up if I pass it through the constructor.  A lot of times I still prefer to pass the data in through the constructor, but since my test is blowing up, I’m forced to design and test the system for getting that data to the constructor immediately, rather than at the end of the cycle.
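To make that concrete, here is a hypothetical part (not one from the talk or the sample project) that passes a primitive through its constructor.  With no exported string in the catalog, the composition test flags it as rejected immediately:

```csharp
using System.ComponentModel.Composition;

[Export]
public class ReportGenerator
{
    // MEF must find an export matching this string parameter to satisfy
    // the importing constructor; with no [Export]ed string in the catalog,
    // composition diagnostics report this part as rejected.
    [ImportingConstructor]
    public ReportGenerator(string outputPath)
    {
        this.OutputPath = outputPath;
    }

    public string OutputPath { get; private set; }
}
```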

I started testing composition out of necessity: I had a broken composition that needed fixing.  Now, I do it to save time and effort, and it seems more valuable to me every day.  More people should do it, but I’m afraid some will be turned off by the number of steps and concepts involved.

Here are a few of them:

  1. Synchronizing catalogs between test and production
  2. For some, this could also be their first exposure to ApprovalTests
  3. Finding a copy of MEFX
  4. Setting up the test

So, does it have to be so complicated?

Reducing the Surface Area

Let’s look at a sample project and implement the integration test I described in my last article about composition tests.  Some people have a one-track mind.  Mine has two tracks: cars and pizza.  This time we will go with cars, so our sample project will be a simple program called CarDealership with a few car-related parts in it.  It doesn’t have to be very complex since our primary interest is the test.

Let’s look at that test:

[TestClass]
public class IntegrationTest
{
  [TestMethod]
  public void DiscoverParts()
  {
    try
    {
      var catalog = new DirectoryCatalog(".");
      var host = new CompositionContainer(catalog);
      var compositionInfo = new CompositionInfo(catalog, host);
      using (var stringWriter = new StringWriter())
      {
        CompositionInfoTextFormatter.Write(
            compositionInfo,
            stringWriter);
        Approvals.Verify(stringWriter.ToString());
      }
    }
    catch (ReflectionTypeLoadException ex)
    {
      Array.ForEach(
          ex.LoaderExceptions,
          lex => Console.WriteLine(lex.ToString()));
      throw;
    }
  }
}

A few things are already different about the test.  There’s no UseReporterAttribute on the class anymore, because I usually define this in AssemblyInfo.cs now, using an assembly-scoped attribute:

// AssemblyInfo.cs
using ApprovalTests.Reporters;
[assembly: UseReporter(typeof(DiffReporter))]

I also moved a few more lines inside the test body because I don’t usually have a reason to run more than one test with the same CompositionInfo.  Also, since my intention is to break this test down into something cleaner, I want to see it all in one place.

Right now, our test won’t compile because we don’t have references to ApprovalTests or MEFX.  ApprovalTests has been available on NuGet for some time, and lately the NuGet package has been kept up to date with releases on SourceForge.  So, go ahead and use NuGet to install ApprovalTests.

In my previous post I explained where to find MEFX, which includes the CompositionDiagnostics library that we need for our test.  But you had to download it or compile it, and then add the reference by hand… icky.  We had no choice, because this library wasn’t available on NuGet.  Well, it is available now, because I put it there a few months ago.  Now it’s much easier to get your hands on a copy of the library.


Now our test compiles and runs.  While you weren’t looking, I added some parts to the CarDealership program.  But it doesn’t matter, because I haven’t set up a pre-build event to copy assemblies from production to test.

I never liked doing that.  I knew it would be brittle, and it was.  Ideally, I’d just like to use the exact catalog from production in my test.  It turns out this is a lot easier to arrange if you don’t use a DirectoryCatalog: AssemblyCatalogs and TypeCatalogs are easy to share programmatically.  So, ask yourself if you really need to use a DirectoryCatalog.  Let’s see what the test looks like when we use an AssemblyCatalog:

[TestMethod]
public void DiscoverParts()
{
  try
  {      
    var compositionInfo = new CompositionInfo(
      Program.Catalog, 
      Program.Host);
    using (var stringWriter = new StringWriter())
    {
      CompositionInfoTextFormatter.Write(
          compositionInfo,
          stringWriter);
      Approvals.Verify(stringWriter.ToString());
    }
  }
  catch (ReflectionTypeLoadException ex)
  {
    Array.ForEach(
        ex.LoaderExceptions,
        lex => Console.WriteLine(lex.ToString()));
    throw;
  }
}

To facilitate sharing, I’ve added static properties to the Program class.  They are simple properties, initialized in Program’s static constructor:

using System.ComponentModel.Composition.Hosting;
using System.ComponentModel.Composition.Primitives;
using System.Reflection;

namespace CarDealership
{
  public class Program
  {
    static Program()
    {
      Catalog = new AssemblyCatalog(Assembly.GetAssembly(typeof(Program)));
      Host = new CompositionContainer(Catalog);
    }

    public static ComposablePartCatalog Catalog { get; private set; }

    public static ExportProvider Host { get; private set; }

    private static void Main(string[] args)
    {
    }
  }
}
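
A TypeCatalog, mentioned above, would be just as easy to share.  A sketch of the static constructor using one, with hypothetical part types that are not in the sample project:

```csharp
static Program()
{
  // Hypothetical alternative: enumerate the part types explicitly
  // instead of scanning the whole assembly.
  Catalog = new TypeCatalog(typeof(Engine), typeof(Transmission));
  Host = new CompositionContainer(Catalog);
}
```

Either way, the test consumes exactly the catalog that production uses, so nothing can drift out of sync.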

Looking back at the test, we can see the effect of sharing the catalog and host from Program.  We can now get the CompositionInfo with just one statement:

var compositionInfo = new CompositionInfo(Program.Catalog, Program.Host);

Another interesting development: the catalog and host are now referenced only on this line.  Could we extract a method that takes “compositionInfo” as a parameter without changing this test’s behavior?  The answer is no, but let’s try it and see why.

[TestMethod]
public void DiscoverParts()
{
  var compositionInfo = new CompositionInfo(
    Program.Catalog,
    Program.Host);
  DiscoverParts(compositionInfo);
}

private static void DiscoverParts(CompositionInfo compositionInfo)
{
  try
  {
    using (var stringWriter = new StringWriter())
    {
      CompositionInfoTextFormatter.Write(
          compositionInfo,
          stringWriter);
      Approvals.Verify(stringWriter.ToString());
    }
  }
  catch (ReflectionTypeLoadException ex)
  {
    Array.ForEach(
        ex.LoaderExceptions,
        lex => Console.WriteLine(lex.ToString()));
    throw;
  }
}

To make the CompositionInfo a parameter, we pulled it out of the try-catch block.  Since the CompositionInfo constructor can throw the ReflectionTypeLoadException we’re trying to catch, moving it out of the try-catch block defeats the purpose of the exception handler.  Moving the exception handler back to the test method requires us to implement the handler (which never changes) every time we write a test.  We need to control when “new” is called on CompositionInfo.  We can do so by changing our parameter into a Func<CompositionInfo>:

[TestMethod]
public void DiscoverParts()
{
  DiscoverParts(() => new CompositionInfo(Program.Catalog, Program.Host));
}

private static void DiscoverParts(Func<CompositionInfo> getCompositionInfo)
{
  try
  {
    using (var stringWriter = new StringWriter())
    {
      CompositionInfoTextFormatter.Write(
          getCompositionInfo(),
          stringWriter);
      Approvals.Verify(stringWriter.ToString());
    }
  }
  catch (ReflectionTypeLoadException ex)
  {
    Array.ForEach(
        ex.LoaderExceptions,
        lex => Console.WriteLine(lex.ToString()));
    throw;
  }
}

Now, I could still create a CompositionInfo instance in the test method, then use the delegate to pass back that instance.  Doing so, I would not be protected by the exception handler, but that would be a minor problem.  The exception handler only exists as a time saver: when a ReflectionTypeLoadException occurs, it prints additional information not included in the default dump.  Without the handler, you would need to rerun the test in the debugger in order to see the loader exceptions.  So, it’s not the end of the world, but at least now it’s possible to reuse the exception handler, as long as we remember how to build the delegate correctly.

Can we make DiscoverParts easier to use by removing the caveat that the constructor must be part of the delegate?  We could provide an overload that creates the delegate for the caller:

[TestMethod]
public void DiscoverParts()
{
  DiscoverParts(Program.Catalog, Program.Host);
}

private static void DiscoverParts(ComposablePartCatalog catalog, ExportProvider host)
{
  DiscoverParts(() => new CompositionInfo(catalog, host));
}

private static void DiscoverParts(Func<CompositionInfo> getCompositionInfo)
{
   // ...
}

That’s better, and maybe we don’t need that delegate after all.  For the time being, I’m going to leave it there because I think the more interesting question is: Do we need the host?  If you look again at the construction of the host in Program, you’ll see that it’s the simplest possible CompositionContainer, and the only interesting thing about it is the catalog it consumes.  So why not focus our test on the catalog?

[TestMethod]
public void DiscoverParts()
{
  DiscoverParts(Program.Catalog);
}

private static void DiscoverParts(ComposablePartCatalog catalog)
{
  DiscoverParts(catalog, new CompositionContainer(catalog));
}

private static void DiscoverParts(ComposablePartCatalog catalog, ExportProvider host)
{
  DiscoverParts(() => new CompositionInfo(catalog, host));
}

In a great many cases, the container itself is very boring, but there certainly are plenty of cases where some other, more interesting, ExportProvider comes into play.  So, both overloads are useful.  When only the catalog is interesting, use the overload without the host; this sends the signal that the host is not important.  On the other hand, if you choose the overload with the host specified, you’re sending the signal that the host is interesting and important.  Who are you sending the signal to?  Probably your future self.

So we’ve accomplished a lot.  Our test method has been reduced to a simple method call that can take as little as one parameter.  The best thing about it is that this parameter is part of the “normal” set of composition types; it does not require the caller to know anything about CompositionInfo, CompositionInfoTextFormatter, or ReflectionTypeLoadException.  All that code is still there though, and it’s not conveniently packaged for distribution.  Can we make this test even easier by completely hiding all this code?

Introducing CompositionTests

One could take everything we’ve looked at here, bundle it up into a library, and put it on NuGet and GitHub.  I’ve done just that.


It’s just a little library but it should make composition tests dead simple:

using CompositionTests;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace CarDealership.Tests
{
  [TestClass]
  public class IntegrationTest
  {
    [TestMethod]
    public void DiscoverParts()
    {
      Composition.DiscoverParts(Program.Catalog);
    }
  }
}

That’s the whole test.

You can also scrub the formatted text before sending it to ApprovalTests.

using ApprovalUtilities.Utilities;
using CompositionTests;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace CarDealership.Tests
{
  [TestClass]
  public class IntegrationTest
  {
    [TestMethod]
    public void DiscoverParts()
    {
      Composition.DiscoverParts(
        Program.Catalog,
        StringExtensions.ScrubVersionNumber,
        s => s.ScrubPath(@"C:\Test"));
    }
  }
}

You can use any number of scrubbers to transform your text into a consistent state.  For example, scrubbing version numbers is useful when you use an AssemblyCatalog with an assembly that has an auto incrementing build number.  I’ve also included an extension to scrub the public key token, which is useful when using NCrunch (which disables assembly signing in its builds) alongside other test runners (which do sign the assemblies).  Or, you can use any Func<string, string> that you can think of.
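Since a scrubber is just a Func&lt;string, string&gt;, writing your own is trivial.  For example, a hypothetical scrubber (not included in the library) that masks ISO-style dates so an auto-stamped build date doesn’t churn the approved file:

```csharp
using System;
using System.Text.RegularExpressions;

// A hypothetical custom scrubber; any Func<string, string> works.
// This one replaces ISO-style dates with a stable placeholder.
Func<string, string> scrubDates =
    s => Regex.Replace(s, @"\d{4}-\d{2}-\d{2}", "<date>");

Composition.DiscoverParts(Program.Catalog, scrubDates);
```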

The library also includes (and uses by default) a text formatter that orders the parts by name.  The default text formatter does not guarantee that the parts come out in any particular order.  I have seen them printed in different orders in different build environments, so enforcing order is a must if we want our tests to play nice with VS, NCrunch, and a build server.  I’m still not sure that I’ve covered all the scenarios that can lead to things appearing out of order, because the behavior is not easy to trigger.  Luckily I had an example in production which I was able to recreate in the library’s unit tests.  If I encounter any more ordering problems, they will be addressed in future versions of the library.

Check It Out

Coding, Tips

Tip: Modern INotifyPropertyChanged

Photo Credit: Sanne Roemen

I don’t get much chance to work on the desktop or in Silverlight so I haven’t had too many opportunities to work with the INotifyPropertyChanged interface.  While refactoring an old class toward the Single Responsibility Principle today, I thought to myself “I need an event here,” and since the event was related to a property changing I decided to implement INotifyPropertyChanged instead of using a basic event.

Being rusty with INotifyPropertyChanged I used Google to search for any cool tricks people have come up with since the last time I gave this interface any thought.  I think the most cutting edge way to do it is probably with AOP, but there are some other cool tricks too.  Maybe this is old hat, but it’s new to me.

Generic Setter

There are a few posts out there about using generic setters.  Some of them smell bad.  This one smells good to me: INotifyPropertyChanged, the Anders Hejlsberg Way, by Dan Rigby.  Tips from Anders are always nice :).

Here is what the setter looks like:

private void SetProperty<T>(ref T field, T value, string name)
{
    if (!EqualityComparer<T>.Default.Equals(field, value))
    {
        field = value;
        var handler = PropertyChanged;
        if (handler != null)
        {
          handler(this, new PropertyChangedEventArgs(name));
        }
    }
}

The only issue I have with the implementation is that it still requires a magic string.  We’d like to avoid magic strings because they aren’t code, and because they aren’t code they are usually ignored by refactoring tools and can’t be checked by the compiler.

No Magic

Several posts explain how to get rid of the magic string using a lambda expression.  I used this article as reference: Silverlight/WPF: Implementing PropertyChanged with Expression Tree, by Michael Sync.  Michael’s post takes a different approach and just passes the lambda into the setter.  Later he shows some extension methods that can eliminate the need to specify the type parameter explicitly.  But, when I combined his technique with Dan’s I didn’t find the need to specify the type parameter, since the compiler can infer this information from the field and value parameters.

Here’s what we get after combining the two:

private void SetProperty<T>(ref T field, T value, Expression<Func<T>> member)
{
    Contract.Requires(member != null, "member must not be null.");
    var me = member.Body as MemberExpression;
    if (me == null)
    {
        throw new InvalidOperationException("member.Body must be a MemberExpression");
    }

    if (!EqualityComparer<T>.Default.Equals(field, value))
    {
        field = value;
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(me.Member.Name));
        }
    }
}

Here is how we use it:

public int Threshold
{
    get
    {
        return this.threshold;
    }
    private set
    {
        this.SetProperty(ref this.threshold, value, () => this.Threshold);
    }
}

The property name now survives refactoring, yay!  I think the error-checking in the setter is a little ugly though.

Cleaner Code with .NET 4.5

I actually found this article first: INotifyPropertyChanged, The .NET 4.5 Way, by Dan Rigby, then followed a link there to the Anders-inspired example.  In the post, Dan shows how to use the CallerMemberNameAttribute (new in .NET 4.5) to eliminate the need for the expression tree.  Without the expression tree, we don’t need the error checking code related to the tree and we can implement a cleaner version of SetProperty.  Be sure to check out Dan’s article as he mentions a couple more useful attributes coming in .NET 4.5.
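
Roughly, the CallerMemberName version looks like this; a sketch based on Dan’s description, wrapped in a hypothetical Counter class for context:

```csharp
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.CompilerServices;

public class Counter : INotifyPropertyChanged
{
    private int threshold;

    public event PropertyChangedEventHandler PropertyChanged;

    public int Threshold
    {
        get { return this.threshold; }
        // No lambda, no string: the compiler supplies "Threshold".
        private set { this.SetProperty(ref this.threshold, value); }
    }

    // CallerMemberName fills in the calling property's name at compile
    // time, so there is no magic string and no expression tree to validate.
    private void SetProperty<T>(ref T field, T value, [CallerMemberName] string name = null)
    {
        if (!EqualityComparer<T>.Default.Equals(field, value))
        {
            field = value;
            var handler = this.PropertyChanged;
            if (handler != null)
            {
                handler(this, new PropertyChangedEventArgs(name));
            }
        }
    }
}
```

Because the attribute is resolved at compile time, renaming the property with a refactoring tool keeps the notification correct, with none of the expression-tree error checking from the previous version.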