Monday, November 02, 2009

Monitoring your Drools Flow processes

You need to actively monitor your processes to make sure you can detect any anomalies and react to unexpected events as soon as possible. Business Activity Monitoring (BAM) is concerned with real-time monitoring of your processes and the option of intervening directly, possibly even automatically, based on the analysis of these events.

There are numerous technical ways to monitor your processes, and this post describes two of them: analyzing the low-level process events emitted by the process engine, and using custom business events. It closes with a preview screencast of the BAM web console.

Analyzing low-level process events

Drools Flow can be configured to emit events about the execution of your processes (start / stop) and of each of the nodes inside them (triggered / left). Using Drools Fusion, these events can be processed with event processing rules (CEP) to detect anomalies, derive higher-level business events, etc. To start processing these (low-level, generic) events, add a process listener to the session that forwards all related process events to a session responsible for processing them (this could be the same session as the one executing the processes, or an entirely independent one).
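
To make the forwarding idea concrete, here is a minimal sketch in plain Java. It does not use the Drools API: `LowLevelEvent`, `EngineListener` and `ForwardingListener` are hypothetical stand-ins for the engine's event classes, its listener interface, and the listener you would register on the session. The sink stands for whatever receives the events, for example a call that inserts them into the event processing session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a low-level process event (not a Drools class). */
final class LowLevelEvent {
    final String kind;           // "processStarted", "processCompleted", "nodeTriggered" or "nodeLeft"
    final String processId;
    final String nodeName;       // null for process-level events
    final long timestampMillis;

    LowLevelEvent(String kind, String processId, String nodeName, long timestampMillis) {
        this.kind = kind;
        this.processId = processId;
        this.nodeName = nodeName;
        this.timestampMillis = timestampMillis;
    }
}

/** Hypothetical stand-in for the callbacks a process engine invokes. */
interface EngineListener {
    void processStarted(String processId, long timestampMillis);
    void processCompleted(String processId, long timestampMillis);
    void nodeTriggered(String processId, String nodeName, long timestampMillis);
    void nodeLeft(String processId, String nodeName, long timestampMillis);
}

/** Forwards every callback to a sink, e.g. the session that runs the CEP rules. */
final class ForwardingListener implements EngineListener {
    private final Consumer<LowLevelEvent> sink;

    ForwardingListener(Consumer<LowLevelEvent> sink) {
        this.sink = sink;
    }

    @Override
    public void processStarted(String processId, long timestampMillis) {
        sink.accept(new LowLevelEvent("processStarted", processId, null, timestampMillis));
    }

    @Override
    public void processCompleted(String processId, long timestampMillis) {
        sink.accept(new LowLevelEvent("processCompleted", processId, null, timestampMillis));
    }

    @Override
    public void nodeTriggered(String processId, String nodeName, long timestampMillis) {
        sink.accept(new LowLevelEvent("nodeTriggered", processId, nodeName, timestampMillis));
    }

    @Override
    public void nodeLeft(String processId, String nodeName, long timestampMillis) {
        sink.accept(new LowLevelEvent("nodeLeft", processId, nodeName, timestampMillis));
    }
}

final class ForwardingDemo {
    public static void main(String[] args) {
        // The list plays the role of the event processing session.
        List<LowLevelEvent> received = new ArrayList<>();
        EngineListener listener = new ForwardingListener(received::add);
        listener.processStarted("com.sample.order.OrderProcess", 0L);
        listener.nodeTriggered("com.sample.order.OrderProcess", "Inform", 5L);
        System.out.println(received.size() + " events forwarded");
    }
}
```

Because the listener only knows about the sink, the same class works whether the events go to the session executing the processes or to an independent one.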

You can then define CEP rules that process these low-level events. For example, the following rule accumulates all process started events for one specific order process over the last hour, using the sliding window support. It prints a warning if more than 1000 process instances were started in the last hour (e.g., to detect a possible overload of the server).
// Assumes the Drools 5 event API; adjust the package to your version.
import org.drools.event.process.ProcessStartedEvent

declare ProcessStartedEvent
    @role( event )
end

rule "Number of process instances above threshold"
when
    Number( nbProcesses : intValue > 1000 )
        from accumulate(
            // window:time defines a time-based sliding window
            e: ProcessStartedEvent( processInstance.processId == "com.sample.order.OrderProcess" )
                over window:time(1h),
            count(e) )
then
    System.err.println( "WARNING: Nb of order processes in the last hour > 1000: " + nbProcesses );
end
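
For readers less familiar with sliding windows, the computation behind the rule above can be sketched in plain Java. This is not how Drools Fusion implements it; `SlidingWindowCounter` is a hypothetical helper that keeps the timestamps of recent events and drops those that have fallen out of the window. It assumes timestamps arrive in non-decreasing order, and it treats an event that is exactly one window old as expired, which is a choice made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Counts events inside a time-based sliding window (illustrative helper, not a Drools class). */
final class SlidingWindowCounter {
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records one event at the given time and returns the number of events still in the window. */
    int record(long nowMillis) {
        timestamps.addLast(nowMillis);
        return count(nowMillis);
    }

    /** Returns the number of events in the window ending at the given time. */
    int count(long nowMillis) {
        // Evict everything that is at least one full window old.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}

final class SlidingWindowDemo {
    public static void main(String[] args) {
        long oneHour = 60L * 60L * 1000L;
        int threshold = 1000;
        SlidingWindowCounter started = new SlidingWindowCounter(oneHour);
        // Simulate 1500 process starts, one per second.
        for (int i = 0; i < 1500; i++) {
            int inWindow = started.record(i * 1000L);
            if (inWindow == threshold + 1) {
                System.err.println("WARNING: more than " + threshold + " order processes in the last hour");
            }
        }
    }
}
```

The rule engine does the equivalent bookkeeping for you, which is why the rule only has to state the window and the threshold.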

Defining custom business events

While processing generic, low-level process events could allow you to derive higher-level business events, defining these derivation rules could be complex. In many cases, people simply want to annotate their process with meta-data that indicates when specific business events are happening. For example, one node in the process might be annotated as a "New Customer" event to indicate that, when processing that node, we are actually registering a new customer. Similar annotations could be used to annotate all nodes that fall under the "Inform Customer" category, etc. During the execution of the process, this meta-data can then be used to generate higher-level business events.

The first requirement, then, is to be able to annotate nodes with custom meta-data. Luckily, the BPMN2 specification provides an extensibility mechanism that allows you to add custom extensions, in our case for monitoring meta-data. The Drools XML framework also supports plugging in custom XML handlers, which allows us to handle these custom XML tags and add them as meta-data to the nodes.

For example, nodes in a BPMN2 process could then be annotated with this (very simple) custom monitoring data:
<userTask id="_15" name="Inform" implementation="humanTaskWebService" >
  <bam:event name="Inform" type="onEntry" data="#{request.customerId}" />
  ...
</userTask>
A custom event listener can then use this meta-data to derive when these business events should be created and processed.
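
As a rough illustration of such a listener, the sketch below derives business events from node annotations in plain Java. Everything in it is hypothetical: `BamAnnotation`, `BamEvent` and `BamEventDeriver` are stand-ins, not Drools classes, and the `data` expression is simplified to a plain variable name looked up in a map rather than an evaluated `#{...}` expression. The `onExit` trigger is an assumption made for symmetry; the example above only shows `onEntry`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory form of a bam:event annotation on a node. */
final class BamAnnotation {
    final String eventName;
    final String trigger;        // "onEntry" or (assumed) "onExit"
    final String dataVariable;   // simplified: a variable name instead of an expression

    BamAnnotation(String eventName, String trigger, String dataVariable) {
        this.eventName = eventName;
        this.trigger = trigger;
        this.dataVariable = dataVariable;
    }
}

/** Hypothetical higher-level business event. */
final class BamEvent {
    final String name;
    final Object data;
    final long timestampMillis;

    BamEvent(String name, Object data, long timestampMillis) {
        this.name = name;
        this.data = data;
        this.timestampMillis = timestampMillis;
    }
}

/** Turns low-level node callbacks into business events, based on node annotations. */
final class BamEventDeriver {
    private final Map<String, List<BamAnnotation>> annotationsByNode = new HashMap<>();
    private final List<BamEvent> emitted = new ArrayList<>();

    void annotate(String nodeId, BamAnnotation annotation) {
        annotationsByNode.computeIfAbsent(nodeId, k -> new ArrayList<>()).add(annotation);
    }

    void nodeTriggered(String nodeId, Map<String, Object> variables, long timestampMillis) {
        derive(nodeId, "onEntry", variables, timestampMillis);
    }

    void nodeLeft(String nodeId, Map<String, Object> variables, long timestampMillis) {
        derive(nodeId, "onExit", variables, timestampMillis);
    }

    List<BamEvent> emitted() {
        return emitted;
    }

    private void derive(String nodeId, String trigger, Map<String, Object> variables, long timestampMillis) {
        List<BamAnnotation> annotations =
                annotationsByNode.getOrDefault(nodeId, Collections.emptyList());
        for (BamAnnotation a : annotations) {
            if (a.trigger.equals(trigger)) {
                emitted.add(new BamEvent(a.eventName, variables.get(a.dataVariable), timestampMillis));
            }
        }
    }
}

final class BamEventDemo {
    public static void main(String[] args) {
        BamEventDeriver deriver = new BamEventDeriver();
        deriver.annotate("_15", new BamAnnotation("Inform", "onEntry", "customerId"));
        Map<String, Object> variables = new HashMap<>();
        variables.put("customerId", "C-1");
        deriver.nodeTriggered("_15", variables, 0L);
        for (BamEvent e : deriver.emitted()) {
            System.out.println(e.name + " " + e.data);
        }
    }
}
```

Nodes without annotations produce no business events, so the process itself does not need to change when monitoring is added or removed.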

Finally, an event processing rule can use these higher-level events to derive crucial monitoring information, for example that a customer has received no information within 6 hours of the initial processing of their request:
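// BAMEvent must be declared as an event for temporal operators such as "after".
declare BAMEvent
    @role( event )
end

rule "Verify time after request"
when
    start: BAMEvent( name == "Process" )
    not ( BAMEvent( name == "Inform", this after[0h,6h] start ) )
then
    System.out.println("Customer not informed for over 6h!");
end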
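
The condition this rule expresses can also be spelled out in plain Java, which may help when reasoning about the temporal constraint. The sketch below is not Drools code: `TimedEvent` and `AbsenceCheck` are hypothetical, and like the rule it does not correlate events by customer. It assumes a request is only reported once the full deadline has passed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical business event with a name and a timestamp. */
final class TimedEvent {
    final String name;
    final long timestampMillis;

    TimedEvent(String name, long timestampMillis) {
        this.name = name;
        this.timestampMillis = timestampMillis;
    }
}

final class AbsenceCheck {
    /** Returns the "Process" events that had no "Inform" event within the deadline. */
    static List<TimedEvent> uninformed(List<TimedEvent> events, long nowMillis, long deadlineMillis) {
        List<TimedEvent> result = new ArrayList<>();
        for (TimedEvent start : events) {
            if (!start.name.equals("Process")) {
                continue;
            }
            if (nowMillis - start.timestampMillis < deadlineMillis) {
                continue; // deadline not reached yet, too early to judge
            }
            boolean informed = false;
            for (TimedEvent e : events) {
                long delta = e.timestampMillis - start.timestampMillis;
                if (e.name.equals("Inform") && delta >= 0 && delta <= deadlineMillis) {
                    informed = true;
                    break;
                }
            }
            if (!informed) {
                result.add(start);
            }
        }
        return result;
    }
}

final class AbsenceDemo {
    public static void main(String[] args) {
        long hour = 60L * 60L * 1000L;
        List<TimedEvent> events = new ArrayList<>();
        events.add(new TimedEvent("Process", 0L));
        events.add(new TimedEvent("Inform", 2 * hour));
        events.add(new TimedEvent("Process", 3 * hour));
        for (TimedEvent late : AbsenceCheck.uninformed(events, 10 * hour, 6 * hour)) {
            System.out.println("Customer not informed for over 6h, request at " + late.timestampMillis);
        }
    }
}
```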


BAM console

Finally, monitoring information like that derived above should not just be printed to the console, but displayed using charts, graphs, etc. The Service Activity Monitoring (SAM) project is planning to offer just that: a GWT-based web console that lets you view these charts. A simple example was recently presented, and we have adapted it to generate a chart that continuously shows the number of started process instances. While its functionality is still very limited, it already shows the direction we're going, and we hope to extend it steadily.