Saturday, December 20, 2008

Enterprise Windows Workflow Foundation Development Part II

Over the last week I’ve changed from using the manual thread scheduling service to the default CLR thread pool based one. Some of the techniques I mentioned before have been updated in this post and also a few new items discussed
  • Exception handling
  • Workflow performance
  • Workflow custom tracking service for state mapping
  • Unit testing
  • WF design

Exception handling

Using the default thread scheduling service and checking a LastExceptionRaised property from a local service to check exceptions simply will not work now. As the WF execution is performed asyncronously in all cases the property will simply be null when checked straight after calling a WF local service method. Only some time later when the WF thread is scheduled will this property be set.

The best approach I've found to handle this case is to firstly log all exceptions passed to your local service property setter when the WF fault handler kicks in. You can use the enterprise libraries HandleException method or something similar depending on what logging framework you use. This ensures the rich detailed exception information, including all inner exceptions and their stack traces, is captured and logged.

You should also manage a collection of exceptions within the local service to ensure you don’t miss any e.g. if two WF threads raise exceptions at a similar time. To handle the exceptions you will to create a timer which periodically checks whether any exceptions exist in the collection and simply throws the first exception in the list. This can then be caught and handled/ logged / rethrown in your application depending on the exception handling policies defined. You need to use a timer to check the exception and re throw otherwise the workflow would terminate if an exception re thrown on the WF thread.

In terms of "handling" unexpected exceptions, our application uses the AppDomain UnhandledException filter which processes the exception and uses the “Unhandled Exception” enterprise library policy to define the handling. By default we configure this policy to log the exception to the event log and presents the user with a friendly dialog that a serious issue has occurred ( well about as friendly as an unexpected dialog type window can appear )

Workflow performance

In some scenarios you may have many workflows which continually load / unload e.g. if they are triggered by a delay activity. If the delay activity is relatively short e.g. few minutes and you have a few hundred WF's you may find the WF runtime spending most of it's time loading / unloading WF's. There are a few approaches you could take

  1. Develop a customised persistence service and override the UnloadOnIdle method e.g. you might customised SqlWorkflowPersistence. In there you could put some custom logic to only unload some workflows. E.g. you might not unload workflows which are in a state with logic driven by a DelayActivity.
  2. Don’t set UnloadOnIdle to true in the constructor of the persistence service and in the WorkflowIdled event you can decide if a workflow should be unloaded using similar rules as in 1.

In relation to diagnosing general WF performance issues I highly recommend reading Performance Characteristics of Windows Workflow Foundation. There are a variety of useful WF performance counters you can use to inspect your running WF host to avoid second guessing where any bottlenecks might be. As always good detailed analysis is essential so you really are tuning the least performant parts of your app and not those you most like to tinker with :)

Workflow custom tracking service for state mapping

The original approach I mentioned works by mapping the current state to the workflow state in the workflow idle event, however this has the following implications.

  1. The StateMachineWorkflowInstance class can be expensive to construct every time your workflow idles.
  2. The change of status in the state machine may not be immediately reflected in your entities mapped state. i.e. some time passes between entering a specific state and the workflow idling e.g. if performing some work in a StateInitializationActivity within that state the workflow will not idle until the StateInitializationActiviy has completed.

Matt Milner wrote an excellent article on tracking services and presents one which tracks state changes in a state machine activity. As soon as a new state activity starts executing the tracking service is called. I’ve adapted this service so it fires an event on a state change which a local service responds to by mapping the current state to an enumerated status value. The status of the WF state machine and status in the entities record are now immediately in sync with each other

Unit testing

There are many excellent articles on unit test approaches to WF i.e. Matt wrote one and Cliffard has also produced an excellent lab on applying TDD with WF. Two important points when creating tests to add to these articles.

  1. Always try and refactor logic within activities / workflow into separate components and test them in isolation first. Keep the minimal amount of code in your activity. It’s fine for an activity to simply provide a simple façade around an existing service. If you adopt a TDD approach you would actually find yourself more likely to write those services first, and thus make the activities easier to test using mock objects.
  2. Use IOC mechanisms such as dependency injection so you can mock most of the objects your workflow / activities use. This means your tests can focus as much as possible on testing just the activity or workflow class and not many other dependent classes.

In testing a state machine I’d like to test everything about the model I’ve created i.e. the start state / completed states, possibles transitions from each state and assert any rules embedded within the workflow are evaluated, and ensure ALL activities end up being executed and covered. This is very important when you have many IfElse branch activities in your workflow based on rules. The last thing you want to do is change a condition within a branch only to find the branches beneath it can never be reached :(

WF Design

One thing I’ve found out over the last month working on WF is the vast number of approaches there are to implement any given design. WF is highly extensible, both in the new services you can plug in and existing services that can be customised using standard mechanisms such as inheritance / events / overriding methods.

What ever approach you choose it’s important you try and fit within the WF framework as much as possible. It’s up to you how little or much of your workflow logic you decide to implement in WF. Even if you don’t think your design is ideal and doesn’t feel right I would encourage you to simply start and be prepared to refactor as you learn more about WF.

No comments: