Saturday, December 20, 2008

Enterprise Windows Workflow Foundation Development Part II

Over the last week I’ve changed from using the manual thread scheduling service to the default CLR thread pool based one. Some of the techniques I mentioned before have been updated in this post and also a few new items discussed
  • Exception handling
  • Workflow performance
  • Workflow custom tracking service for state mapping
  • Unit testing
  • WF design

Exception handling

Using the default thread scheduling service and checking a LastExceptionRaised property from a local service to check exceptions simply will not work now. As the WF execution is performed asyncronously in all cases the property will simply be null when checked straight after calling a WF local service method. Only some time later when the WF thread is scheduled will this property be set.

The best approach I've found to handle this case is to firstly log all exceptions passed to your local service property setter when the WF fault handler kicks in. You can use the enterprise libraries HandleException method or something similar depending on what logging framework you use. This ensures the rich detailed exception information, including all inner exceptions and their stack traces, is captured and logged.

You should also manage a collection of exceptions within the local service to ensure you don’t miss any e.g. if two WF threads raise exceptions at a similar time. To handle the exceptions you will to create a timer which periodically checks whether any exceptions exist in the collection and simply throws the first exception in the list. This can then be caught and handled/ logged / rethrown in your application depending on the exception handling policies defined. You need to use a timer to check the exception and re throw otherwise the workflow would terminate if an exception re thrown on the WF thread.

In terms of "handling" unexpected exceptions, our application uses the AppDomain UnhandledException filter which processes the exception and uses the “Unhandled Exception” enterprise library policy to define the handling. By default we configure this policy to log the exception to the event log and presents the user with a friendly dialog that a serious issue has occurred ( well about as friendly as an unexpected dialog type window can appear )

Workflow performance

In some scenarios you may have many workflows which continually load / unload e.g. if they are triggered by a delay activity. If the delay activity is relatively short e.g. few minutes and you have a few hundred WF's you may find the WF runtime spending most of it's time loading / unloading WF's. There are a few approaches you could take

  1. Develop a customised persistence service and override the UnloadOnIdle method e.g. you might customised SqlWorkflowPersistence. In there you could put some custom logic to only unload some workflows. E.g. you might not unload workflows which are in a state with logic driven by a DelayActivity.
  2. Don’t set UnloadOnIdle to true in the constructor of the persistence service and in the WorkflowIdled event you can decide if a workflow should be unloaded using similar rules as in 1.

In relation to diagnosing general WF performance issues I highly recommend reading Performance Characteristics of Windows Workflow Foundation. There are a variety of useful WF performance counters you can use to inspect your running WF host to avoid second guessing where any bottlenecks might be. As always good detailed analysis is essential so you really are tuning the least performant parts of your app and not those you most like to tinker with :)

Workflow custom tracking service for state mapping

The original approach I mentioned works by mapping the current state to the workflow state in the workflow idle event, however this has the following implications.

  1. The StateMachineWorkflowInstance class can be expensive to construct every time your workflow idles.
  2. The change of status in the state machine may not be immediately reflected in your entities mapped state. i.e. some time passes between entering a specific state and the workflow idling e.g. if performing some work in a StateInitializationActivity within that state the workflow will not idle until the StateInitializationActiviy has completed.

Matt Milner wrote an excellent article on tracking services and presents one which tracks state changes in a state machine activity. As soon as a new state activity starts executing the tracking service is called. I’ve adapted this service so it fires an event on a state change which a local service responds to by mapping the current state to an enumerated status value. The status of the WF state machine and status in the entities record are now immediately in sync with each other

Unit testing

There are many excellent articles on unit test approaches to WF i.e. Matt wrote one and Cliffard has also produced an excellent lab on applying TDD with WF. Two important points when creating tests to add to these articles.

  1. Always try and refactor logic within activities / workflow into separate components and test them in isolation first. Keep the minimal amount of code in your activity. It’s fine for an activity to simply provide a simple façade around an existing service. If you adopt a TDD approach you would actually find yourself more likely to write those services first, and thus make the activities easier to test using mock objects.
  2. Use IOC mechanisms such as dependency injection so you can mock most of the objects your workflow / activities use. This means your tests can focus as much as possible on testing just the activity or workflow class and not many other dependent classes.

In testing a state machine I’d like to test everything about the model I’ve created i.e. the start state / completed states, possibles transitions from each state and assert any rules embedded within the workflow are evaluated, and ensure ALL activities end up being executed and covered. This is very important when you have many IfElse branch activities in your workflow based on rules. The last thing you want to do is change a condition within a branch only to find the branches beneath it can never be reached :(

WF Design

One thing I’ve found out over the last month working on WF is the vast number of approaches there are to implement any given design. WF is highly extensible, both in the new services you can plug in and existing services that can be customised using standard mechanisms such as inheritance / events / overriding methods.

What ever approach you choose it’s important you try and fit within the WF framework as much as possible. It’s up to you how little or much of your workflow logic you decide to implement in WF. Even if you don’t think your design is ideal and doesn’t feel right I would encourage you to simply start and be prepared to refactor as you learn more about WF.

Saturday, December 06, 2008

Enterprise Windows Workflow Foundation Development

It's nearly a year ago since I blogged about some of the really cool features of WF and it's only recently I've had an opportunity to apply this technology in my day job. Although other projects I've worked on until now could have used WF equally well, they were simply too large and complex to refactor WF in easily. However they would have been a great candiate to use WF from the start as they use complex hierarchical state machines for which the logic is unfortunately littered throughout the application.

My previous WF projects were simply for my own learning ( aiming for an MCP qualification eventually ) and although I covered a lot of different areas of WF, it's only through developing a production application I've come across areas that needed to be reviewed further. e.g.

  • Workflow authoring mode - Out of the three options which is the most appropriate
  • Unhandled exceptions in the workflow . Exceptions need to be logged by the host and in some cases we want to avoid termination of the workflow
  • Querying workflow state - It is often not feasible to simply load every workflow to query it's state.
  • Dependency injection - How can types be resolved
  • Dynamic updates - Useful for simple customisation
  • Associating business objects with workflow instances
  • Runtime scheduling service - Is the default service always appropriate
  • Workflow runtime and instance wrapper classes

  • Workflow authoring mode

    Although there are three authoring modes to choose from code, code with separation, and no code, visual studio only lets you choose you the first two. All the guidance on this mentions you should evaluate how much flexibility the end user requires to change activities and workflows. Based on this advice though I found it hard to choose an appropriate model. On the one hand we should keep our applications as simple as possible right now based on our requirments, however on the other hand we need to design to make future changes easier :).

    I actually think the no code authoring mode is probably the preferred route as it encourages developing good practises i.e. building your activities upfront and then building your workflow around them being clear to separate the rules and the binding of properties between activities as separate tasks. However given the poor tool support in VS 2008 and also 2010 for no code workflows I find it hard to recommend. If you really can't perceive a need for this level of flexibility i.e. hosting the designer and allowing users to change the workflow, then it maybe overkill.

    Note if you do choose a code only route and want to move to code separation it is possible with a bit of work. Simply copy all activities within the designer and then paste into a workflow with code separation. This will created the XOML and then you can add all code activities in as required.

    For the project I worked on I applied the code separation model which seems to be no more complex to manage than the code only model, but at least provides some flexibility. The workflow and rules could be modified relatively easily by hosting the designers and new workflow types compiled as required.

    Exception handling

    It's inenvitable your application will have some bugs no matter how hard you try and in this case it's important that the application ( or more specifically a data store ) maintains it's integrity so users can continue using the system. It's also important to capture these exceptions and log them appropraite e.g. we use enterprise library 4.0 which provides rich support to log exceptions.

    The default behaviour for unhandled exceptions is to terminate the workflow which also has the effect of removing the workflow from the persistence store. However if your using a state machine which is currently in one of states you may not wish to terminate the workflow abruptly. If the workflow ends you would have to create a new workflow and start the state machine in the same state perhaps and also ensure workflow properties were set correctly.

    In our project we managed this by setting up fault handlers in nearly every state which could throw an exception. ( it was acceptable for the first state state to throw an exception and terminate the workflow ). In this fault handler we always handle the generic Exception type and when the exception is raised a local service method is called on the host application which passes the Exception object. The host application can then perform any necessary action e.g. transitioning to a failed state , logging the exception and in most cases presenting the user with an exception has occurred type window. Now we should have good information to diagnose this exception further and the state of the workflow is maintained.

    Querying workflow state

    There are many mechanisms to query the workflow state e.g. from StateMachineWorkflowInstance there are properties such as CurrentState / PossibleStateTransitions which provide you with a "picture" of the workflow state. However this does require the workflow to be loaded to obtain this state. For many applications it wouldn't be feasible to load all workflow instances just to find out which workflows are in a specific state. It also doesn't lend itself well to standard querying techniques used on a data store e.g. select all invoices from a table which are in state x

    Most application entities typically contain a state column which usually maps to an enumerated value defined in the application e.g. InvoiceState with values New, Awaiting Approval / Approved. It would be nice if we could still use this state column to filter out the records were interested in, but at the same time use the workflow instance to drive the workflow, determining possible state transitions / events that can be raised e.t.c. My first thought was to use a custom tracking service to track the changes to the state machine and report them back to the host application which could then update the state column of the related entity to reflect the current state. This way both the workflow instance and state column would always be in sync.

    However after creating a custom tracking service to do this I found the excellent article Matt has written on tracking services which also builds a similar service for tracking state machines, well at least I'd learnt tracking services well :) . I then found a much simpler solution which works well and doesn't require the use of tracking services. In the workflow idle event the current state can be checked from the StateMachineWorkflowInstance wrapper class and then the related entitiy updated. This works well because the workflow always idles between transitioning to new states.

    Dependency Injection

    Dependency injection / service locator which are implementations of the IOC pattern are well known good practises however it's not clear how you can inject instances into workflow types / activities. I did raise this question here and got some useful links. I haven't followed up the related links in detail however I did manage to find a relatively simple solution by using the services based infrastructure which WF provides.

    The host application which creates workflow instances creates the types which need to be resolved by dependency injection. Those objects are then passed to AddService from the WorkflowRuntime instance. You then need to access the interfaces through the IServiceProvider interface or the ActivityExecutionContext instance calling GetService . If you only need to access the interfaces in the Activity.Execute method then you can simply use the ActivityExecutionContext instance directly within Execute. Otherwise override the method OnActivityExecutionContextLoad and store away the interface references. If you do this however be sure to mark the fields in the activity / workflow with the NonSerialized attributes to prevent them also being serialised when the workflow is persisted.


    Scheduling services

    The default scheduling service uses a thread pool to schedule workflows to be executed as required. However given the workflow types in our project AND the host application access the same business object underling the workflow, this does introduce potential concurrency issues. The manual scheduling service provides a model which is simpler to manage and would be more appropriate here.

    You have to specifically schedule the ManualWorkflowSchedulerService to run a workflow using the RunWorkflow method. The host application simply does this after raising any events into the workflow through the local service. However some parts of the state machine activities require work to be continually performed in the background waiting for a specific rule evaluate to true and using a DelayActivity to sleep the workflow. To handle this a simple WPF DispatcherTimer can be used with DispatcherPriority.ApplicationIdle priority to execute any workflows which have expired timers. The timer can be setup using the following code

    DispatcherTimer workflowPollingTimer = new DispatcherTimer(new TimeSpan(0, 0, 0, 5)),DispatcherPriority.ApplicationIdle,new EventHandler(workflowPollingTimer_Tick), Dispatcher.CurrentDispatcher);workflowPollingTimer.Tick += new EventHandler(workflowPollingTimer_Tick);workflowPollingTimer.Start();

    Within the timer the workflows will be executed when their timers expire

    ManualWorkflowSchedulerService scheduler = workflowManager.WorkflowRuntime.GetService(<>);

    foreach(SqlCePersistenceWorkflowInstanceDescription workflowDesc in persistence.GetAllWorkflows()){

    if (workflowDesc.NextTimerExpiration <>

    {

    scheduler.RunWorkflow(workflowDesc.WorkflowInstanceId);

    }


    Dynamic updates for customisation

    Although dynamically updating a workflow is often used for versioning reasons e.g. migrating old to new workflow versions, or making ad hoc changes to specific workflows at runtime it can often be used to provide flexible customisation within a workflow. Although rules can be managed directly within the workflow, they can also be managed within the application e.g. preferences for document approval. You may find a need where a number of different activities ( based on runtime conditions ) need to be executed and using If / Else type branches within the workflow can quickly make your workflow very complex and hard to maintain.

    It's very simple to use a place holder activity within your workflow and then dynamically add activities to add at runtime based on application defined rules. Perhaps you have a number of different approval strategies which need to be executed, these could be added at the point of approval in the workflow. This does provide another option you can use to get the customisation you require without having to write designer hosting support in your applcation :)

    Workflow runtime and instance wrapper classes

    As you develop the host application you will probably find yourself writing proxies over the WorkflowInstance and WorkflowRuntime CLR classes to provide additional services when executing workflows. They can also make it easier to execute those workflows e.g. defining a ManualResetEvent within the workflow wrapper and then setting this when the workflow completed event fires so you can execute workflows and wait until they've completed. Bruce Bukovics has developed a good set of wrapper classes you can use as a base and adopt as required for your own requirements.

    Associating business objects with workflow instances

    One of the requirements in a workflow applications is to load a specific business object and then raise a workflow event. To do this you need to maintain a link between the business object and it's associated WF workflow instance. You can add the business object link directly into the workflow or add the workflow instance link into the business object or even both.

    If you do both so the business object contains a WorkflowId of type Guid in it's entity table, and the workflow itself contains a reference to the identifier of the business object. In this way the business object can be loaded by the client application, when a workflow event needs to be raised into the business object the correct workflow instance can be retrieved and called on. Having the business object reference directly in the workflow also allows the workflow to access the object to query any data within it or to make modifications and save to the data store.

    You may find it preferable to you use an object identifier rather than the object reference otherwise for long running workflows you may find yourself using stale data e.g. if the host application modifies the business object and workflow is using a stale copy. Using an object identifier or even using an Identity pattern ( link to Martn fowlers identify pattern reference ) means you can load the object as and when required in the workflow instance.

    Conclusion

    I hope I've shown you a few techniques you can use to make your life a little simpler when developing WF applications. Both Maurice de Beijer and Bruce Bukovics have provided me with useful feedback when tackling WF design issues. Bruce has written an excellent book on WF which is even worth getting just for the additional chapters covering NET 3.5 changes. Maurice also teaches a highly respected course on WF.