Monday, November 30, 2009

Camel Integration with Drools Pipeline

Here Lucaz (Lucas Amador) and I take a first pass at integrating Apache Camel with the Drools Pipelines project (http://blog.athico.com/2009/04/batchexecutor.html).
This first stage of the drools-pipeline/drools-camel project includes just the Apache Camel dependencies and two simple tests that
use Drools VSM (Virtual Service Manager) to show how we can interact with a Drools session through a pipeline that transforms
a piece of data into executable commands. The main idea of the integration at this stage is to show how Camel can do exactly what
Drools Pipeline does today, while adding more flexibility through all the built-in Camel components (http://camel.apache.org/components.html).

The other advantage that Apache Camel brings is the ability to implement more advanced enterprise integration patterns than a simple transformation pipeline.
Take a look at the following URL to discover all the patterns that Apache Camel supports, which you can start using right away to interact with a Drools session from the outside world: http://www.enterpriseintegrationpatterns.com/toc.html

Here we also provide an example that shows how to configure Apache Camel to listen to a directory on your hard drive, take the content of the files stored there, and then start a pipeline that transforms those XML files into commands to run against a Drools session. Finally, the pipeline sends an email and logs the results, showing how we can route the results to different outcomes.

In the following route we describe how we chain the different endpoints that will process the message; in this case we start with a file that will be transformed. The message is transformed by Processors, which are defined by implementing the Camel Processor interface.



Because we are using the SpringCamelContext implementation, this route can be defined inside the applicationContext.xml file. The following XML snippet shows how we can express the route shown above. Please note that all the Processors used in this example need to be defined as Spring beans too. Take a look inside the project to see the full Spring configuration.

<camel:camelContext id="directoryEmailCamelContext">
    <camel:route>
        <camel:from uri="file://src/test/resources/xml/?noop=true"/>
        <camel:process ref="droolsContextInitProcessor" />
        <camel:process ref="xmlNodeTransformer" />
        <camel:to uri="direct:xstreamTransformer" />
    </camel:route>
    <camel:route>
        <camel:from uri="direct:xstreamTransformer" />
        <camel:process ref="camelXStreamFromXmlVsmTransformer" />
        <camel:to uri="direct:executor" />
    </camel:route>
    <camel:route>
        <camel:from uri="direct:executor" />
        <camel:process ref="batchExecutorProcessor" />
        <camel:to uri="direct:xstreamTransformerResult" />
    </camel:route>
    <camel:route>
        <camel:from uri="direct:xstreamTransformerResult" />
        <camel:process ref="camelXStreamToXmlVsmTransformer" />
        <camel:to uri="direct:finalResult" />
    </camel:route>
    <camel:route>
        <camel:from uri="direct:finalResult" />
        <camel:process ref="assignResultProcessor" />
        <camel:to uri="direct:executeResult" />
    </camel:route>
    <camel:route>
        <camel:from uri="direct:executeResult" />
        <camel:process ref="executeResultProcessor" />
        <camel:to uri="file://src/test/resources/xml/output" />
        <camel:to uri="log:org.apache.camel.example.result?level=INFO" />
    </camel:route>
</camel:camelContext>
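
Each process ref above points to a Spring bean implementing Camel's org.apache.camel.Processor interface. As a rough idea of the shape of such a bean, here is a minimal, hypothetical processor (a trivial transformer for illustration, not one of the actual drools-camel processors):

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

// Hypothetical example: a trivial transformer, not part of the project code.
public class UpperCaseProcessor implements Processor {
    public void process(Exchange exchange) throws Exception {
        // take the current message body and replace it with a transformed version
        String body = exchange.getIn().getBody(String.class);
        exchange.getIn().setBody(body.toUpperCase());
    }
}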

In the previous figure we show the full pipeline configured with Spring, taking advantage of the camel-spring module. This pipeline describes the route that a file containing commands will pass through in order to be executed. As you can see in the example project, a directory is polled looking for files. When a file is found, the first step is to initialize the Drools context. This context is inserted into the Message that passes through all the Camel endpoints, and it represents the services needed to execute everything Drools-related inside Camel. If you want to use Apache Camel with Drools you will probably need to initialize this context using the pluggable Camel Processor called DroolsContextInitProcessor.
The steps executed in this pipeline are:

* Transform the content of the file into a DOM document
* Convert the DOM document using XStream into a Set of commands (BatchExecution)
* Execute all these commands
* Transform the Results objects into XML
* Assign the result to the ResultHandler


The last two steps are pure Camel endpoints. One of them logs the result to the standard output using Commons Logging, and the other creates a file with the obtained results. As you can see, we are taking advantage of the routing capabilities of Camel, which let us send the message to two different endpoints. It's important to know that multiple endpoints can be added here, for example to send an email with the results, print them, and so on.
In this case we use the following file, which will be consumed by the first endpoint polling the directory file://src/test/resources/xml/:

sample.xml:

<batch-execution lookup="ksession1">
    <insert identifier="lucaz">
        <org.drools.camel.model.Person>
            <name>lucaz</name>
            <age>0</age>
        </org.drools.camel.model.Person>
    </insert>
    <fire-all-rules/>
</batch-execution>
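
For reference, outside of Camel this kind of batch-execution XML is typically turned into commands and executed against a session along these lines. This is a sketch from memory of the Drools 5 knowledge-api, so treat the class and package names as assumptions to verify against your version:

import org.drools.command.Command;
import org.drools.runtime.ExecutionResults;
import org.drools.runtime.StatefulKnowledgeSession;
import org.drools.runtime.help.BatchExecutionHelper;
import com.thoughtworks.xstream.XStream;

public class BatchXmlRunner {
    public static String run(StatefulKnowledgeSession ksession, String xml) {
        // XStream instance pre-configured for the batch-execution XML format
        XStream xstream = BatchExecutionHelper.newXStreamMarshaller();
        @SuppressWarnings("unchecked")
        Command<ExecutionResults> batch = (Command<ExecutionResults>) xstream.fromXML(xml);
        // run the commands (insert + fire-all-rules) against the session
        ExecutionResults results = ksession.execute(batch);
        // marshal the results back into the <execution-results> XML shown below
        return xstream.toXML(results);
    }
}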

When the pipeline is executed we will get the following outputs:

* A file inside the src/test/resources/xml/output directory
* A log in the system console
* The result from the resultHandler

Each of these three outputs will contain the following content:

sample.xml (in the output directory):

<execution-results>
    <result identifier="lucaz">
        <org.drools.camel.model.Person>
            <name>lucaz</name>
            <age>25</age>
        </org.drools.camel.model.Person>
    </result>
    <fact-handle identifier="lucaz" external-form="0:1:9695314:9695314:2"/>
</execution-results>

One last thing to note: for each new input type that you use, you will need to implement a Processor to extract the XML content. For more details, take a look at the FileContextInitProcessor implementation.

Have Fun!

Download the example project: Pipelines with Camel

Friday, November 20, 2009

Should we rename Drools Solver to Drools Planner?

Drools Solver solves planning problems. The problem is that developers can't seem to find it when they are looking for a library to help with their problems, such as nurse rostering, bin packing, course scheduling, ... Only when someone tells them what Drools Solver does do they react with "yes, that's what I am looking for".

According to Wikipedia, a solver is a software tool for 'solving' mathematical systems of equations.
  • Drools Solver is specifically designed for planning problems, so the term "solver" is too broad.

  • It's not equations based, but score rules based (which is more developer friendly).

  • It's confusing with the Excel solver or openoffice.org solver, which do solve equations.

So Solver is not the best name.

So I propose to rename it to "Drools Planner".
  • Pro:

    • "Planning problem" is the only common name I can think of for its use cases.

    • A good meme could be "Drools Planner optimizes your resources."

  • Against:

    • Maven dependencies change from drools-solver to drools-planner

    • Need to replace all imports (but that will be clearly documented in UpgradeFromPreviousVersionRecipe.txt).

Do you think it's a good idea to rename "Drools Solver" to "Drools Planner"?
Or maybe you can think of a better name?

Thursday, November 19, 2009

Guvnor Brazilian Portuguese Translation

We would like to thank Hugo Amaral from Brazil, who has been contributing the Brazilian Portuguese translation for Guvnor.



At the moment Guvnor can be used with the following languages:
- English
- Japanese
- Simplified Chinese
- Spanish
- Brazilian Portuguese (now on the trunk)

It would be nice to see German on the list too, so if any German-speaking contributor can take the lead on that, it would be great!

Two great pieces of news from the Drools team: Mark Proctor to join RuleML as a Director and US Navy and OSDE community involvement.

(pdf version)

Mark Proctor has been invited to join RuleML as a director, an invitation he has graciously accepted. RuleML is best known for its work in relation to standards, both the RuleML standards and the W3C's RIF standard. However, the remit of the RuleML group extends much further than this, as was seen at this year's RuleML Symposium. RuleML is more than just rule standards; its interests cover a wide variety of topics, from event processing to the semantic web, and it is a working group for general logic-based collaboration and knowledge sharing. The Symposium is in itself a great opportunity for researchers and engineers to meet and discuss the future, as well as listen to other leaders in the field. While other conferences have shrunk in size, it was good to see RuleML continue to grow again this year, and it's always a pleasure to see students giving Drools-related talks. RuleML, along with the October Rules Festival, is leading the way for research and engineering related events.

At the same time we are pleased to announce that the Drools team will be tripling its core community size from 5 to 15 developers, thanks to development projects within the US Department of Defense and OSDE (Argentina's largest healthcare organisation). A US Navy-led Clinical Decision Support research effort will be coordinating the work of its 5 full-time developers with those of the core Drools team. OSDE will similarly align 4 full-time and two part-time (50%) developers. All members have the common goal of building the ultimate in Enterprise Decision Management and Business Automation; their commitment to community coordination is appreciated.
The coordinated work will focus on building enterprise tooling and capabilities, with Mark Proctor providing subject matter expertise. The continued work of unifying and integrating rules, workflow and event processing for seamless use will remain a central tenet for all aspects of the development. Activities include extending existing authoring metaphors, adding powerful templating capabilities and meta-authoring features, building sophisticated deployment and runtime management systems, enhancing the existing WS-HT based human task system for the demanding needs of the healthcare industry, and completing the current BPMN2 solution. There will also be a focus of work around grid and cloud, and around interactive debugging, simulation and testing.

Healthcare is a demanding industry that often rides the cutting edge of technology. The recent advances of Drools, which brings fully unified and integrated workflow and event processing capabilities to the table, with our already powerful rule engine, positions Drools as a disruptive and enabling piece of technology for the Healthcare industry. In support of this and in true Open Source tradition of “We can do more when we work together” we are pleased to welcome these new community members.

For the Drools team, and the entire community involved in our project, things have never been more interesting and the future never looked so bright.

Sponsoring community involvement is one of the best ways to take control of your IT investment and get a great ROI, with the accelerated results in Drools helping you to build a more flexible and agile business. If you would like to follow in the footsteps of OSDE and the US Navy and get involved, please don't hesitate to contact me - mproctor at codehaus d0t org.

Mark Proctor
Drools Project and Community Lead.

Pacman and the importance of BetaNode sharing - Rete Explained

The pacman.drl is starting to shape up; I just added the additional logic to have Pacman slow down while eating. The example is starting to show the value of using a rule engine, and hopefully I can use it to explain some interesting characteristics of the Rete network and node sharing.

Originally I had just two rules, one that detects when Pacman eats normal food and another for when he eats a power pill.
rule EatFood
    dialect "mvel" no-loop
when
    $char : Character( name == "Pacman" )
    $l : Location( character == $char )
    $target : Cell( row == $l.row, col == $l.col )
    $contents : CellContents( cell == $target, cellType == CellType.FOOD )
    $s : Score()
then
    modify( $contents ) { cellType = CellType.EMPTY };
    modify( $s ) { score += 1 };
end

rule EatPowerPill
    dialect "mvel" no-loop
when
    $char : Character( name == "Pacman" )
    $l : Location( character == $char )
    $target : Cell( row == $l.row, col == $l.col )
    $contents : CellContents( cell == $target, cellType == CellType.POWER_PILL )
    $s : Score()
then
    modify( $contents ) { cellType = CellType.EMPTY };
    modify( $s ) { score += 5 };
end
Notice those two rules share the first three patterns, but not the fourth. This means the evaluation of that shared logic only happens once, yet works for both rules.
I then added a third rule that does monster collision detection. Within this set of rules it only shares the first pattern, although it actually has a larger degree of sharing with rules in other packages.
rule MonsterCollision
    dialect "mvel" no-loop
when
    $pac : Character( name == "Pacman" )
    $pacLoc : Location( character == $pac )
    $mon : Character( name == "Monster" )
    $monLoc : Location( character == $mon, col == $pacLoc.col, row == $pacLoc.row )
    $t : Tick()
then
    retract( $t );
end
Then, much later, I thought about the logic to slow Pacman down and added it. What I like about this is that I was able to think about this logic in isolation, without worrying about the other rules.
rule SlowWhenEating
    dialect "mvel" no-loop
when
    $char : Character( name == "Pacman" )
    $l : Location( character == $char )
    $target : Cell( row == $l.row, col == $l.col )
    $contents : CellContents( cell == $target, cellType == CellType.FOOD || == CellType.POWER_PILL )
    $update : ScheduledLocationUpdate( character == $char )
then
    modify ( $update ) { tock += 2 };
end
This rule adds a few more tocks to the currently scheduled location update, effectively adding a small delay that is perceived as Pacman slowing down. The rule again shares the first three patterns, with a nice compact syntax for the fourth pattern. But then I thought: hang on, of the other two rules it shares with, one checks FOOD and the other POWER_PILL, and the logic is mutually exclusive. If I were to use the 'or' conditional element it would actually generate two rules, one for each branch of the logic, and this would allow each branch to share the fourth pattern. I then changed the rule to this:
rule SlowWhenEating
    dialect "mvel" no-loop
when
    $char : Character( name == "Pacman" )
    $l : Location( character == $char )
    $target : Cell( row == $l.row, col == $l.col )
    (or $contents : CellContents( cell == $target, cellType == CellType.FOOD )
        $contents : CellContents( cell == $target, cellType == CellType.POWER_PILL ) )
    $update : ScheduledLocationUpdate( character == $char )
then
    modify ( $update ) { tock += 2 };
end
If we look at those nodes in the Rete viewer, we get something like below:

(click to enlarge)


The first thing you'll notice is that there are 4 black terminal nodes, yet we have 3 rules. That's because of the 'or': remember, an 'or' conditional element actually uses a series of logic transformations to remove the 'or's, replacing them with rules that represent each possible outcome. All resulting rules are independent of each other and can match and fire, so be careful, as this does not have the same behaviour as a pattern infix '||'.

All four rules share the first pattern:
$char : Character( name == "Pacman" )
The leftmost blue alpha node (1), constraining 'name == "Pacman"', is the root node for all rules, so it's tested once and is true for all four rules. The connecting yellow node is the left input adapter, which is necessary for the first pattern to allow it to propagate to the green beta nodes.

Terminal 3 is the "MonsterCollision" rule; other than the first shared node, notice that all its other patterns, represented by the green beta nodes (which are of the join node type), are independent and exclusive to that rule.

The join node (2) represents the three patterns shared by the "EatFood", "EatPowerPill" and "SlowWhenEating" rules which constrains to the correct Cell:
Cell( row == $l.row, col == $l.col)
At this point we have a split. 4a and 4b are the two possible outcomes of the "SlowWhenEating" rule, due to the 'or' conditional element, so each shares the 4th pattern, one for FOOD and the other for POWER_PILL. Notice that while both outcomes have the same next pattern, the node (5 and 6) is repeated twice:
$update : ScheduledLocationUpdate( character == $char )
That's because sharing only happens while the sequence of patterns is the same from the root pattern; once the split occurs, the network stays split.

Finally you'll notice 7a and 7b, which refer to the Score pattern of the "EatFood" and "EatPowerPill" rules.

I hope that has given a bit of insight into how both Rete and beta node sharing work, as well as into the current Pacman implementation.

Wednesday, November 18, 2009

Complex Event Processing (CEP) - The industry that never should have happened (Part 1)

PART 1

CEP is all the rage these days; everyone has to have one. I'm going to be lazy and just quote Wikipedia as to what CEP is, so I can quickly head into the meat of this article:
“Complex Event Processing, or CEP, is primarily an event processing concept that deals with the task of processing multiple events with the goal of identifying the meaningful events within the event cloud.

CEP employs techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, and relationships between events such as causality, membership, and timing, and event-driven processes.”
(Wikipedia)

Some CEP examples:
  • When the average price of a stock falls below $25 over any 5 minute period, then sell.
  • When 2 transactions happen on an account from radically different geographic locations within a certain time window then report as potential fraud.
  • When a gold customer's trouble ticket is not resolved within 1 hour, then escalate.
  • When a team meeting request overlaps with my lunch break, then deny the team meeting and demote the meeting organizer.
When talking to people about CEP, the second question they ask is "aren't CEP statements just rules?" (the first being "What's CEP?"). They struggle to understand why we have two separate industries and approaches to what appears, on the surface, to be the same thing – i.e. when some scenario/situation in your data is detected, do something.

The reason for this is simple... they are right. Let me explain why <tongue in cheek>CEP is the industry that never should have happened</tongue in cheek>.

While CEP has become the adopted term for this industry, there is some general concern that the term is inappropriate; just google for “Tim Bass” and you'll see plenty of comments on this. CEP is in actuality a huge research area, and many CEP products, including Drools, only touch a fraction of what is possible. David Luckham's book “The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems” is considered the “bible” on this subject and shows how we are indeed only just scratching the surface. I had the luck to be at RuleML2008, where David was a keynote speaker, which was very inspirational.

Most of the engines in the market currently focus on what is actually event stream processing (ESP), which many consider to be a subset of CEP. Again Wikipedia to the rescue:

“ESP deals with the task of processing multiple streams of event data with the goal of identifying the meaningful events within those streams, employing techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, and relationships between events such as causality, membership, and timing, and event-driven processes.”
(Wikipedia)

Tim Bass has a nice presentation on this subject “Mythbusters: Event Stream Processing v. Complex Event Processing”. While most of the industry is actually ESP, for continuity I'll continue to refer to it as CEP throughout this article.

So now, back to the subject: “CEP is the industry that never should have happened”. Over ten years ago ILog introduced event management and alarm capabilities into their flagship rule engine product, and things were good; this really was cutting-edge stuff for a mainstream product at that time. It allowed for temporal event correlation as a natural extension to their existing rule language, with fully managed life cycles, so the user didn't have to worry about retracting all their objects and the engine memory blowing up. This wasn't a separate engine; it was an integral part of their existing engine, the same way that truth maintenance and other advanced features are part of the engine. ILog had some high-profile customers and rapid growth in the telecoms market, which was the target market for this functionality. At the same time they also had a whole host of really interesting products as the result of ongoing artificial intelligence (AI) research.

At the Business Rules Forum 2008 (BRF08) I had the pleasure of spending some time with the ILog guys – it's great when we can put the marketing and competition to one side and just chat as engineers. We talked about all the really cool things that ILog R&D had produced, but have had to trim back, or put on hold, over the last 15 years as business focuses changed. It was all fascinating stuff. So what happened? Two things. The AI winter happened, where some of the hype cooled as reality hit. It became obvious that, while interesting, some of these things only worked well in a very narrow scope, making it hard to build a growing business around them. While event management did have a strong and broad benefit, the AI winter was shortly followed by the telecoms downturn, and spending on research and new IT projects was vastly reduced. As the telecoms downturn set in, business rules started to emerge, and a change in company focus happened as they realized the financial potential. Rather than pushing for more AI features, the push was for simplification, aimed at something business analysts could use for business automation and decision management. Slowly the advanced AI features were put on pause, slimmed down or removed. Apparently many cool features are still retained in the capabilities of the engine today, just not exposed to the end user. The event management was put into maintenance, and while it's still in the product today, it is no longer the cutting edge of technology that it once was.

Fast forward to 2009, and CEP is now everywhere; everyone understands it, wants it and needs it. A whole new industry has grown out of this, with companies such as TIBCO, StreamBase and EsperTech. Many of the CEP companies denounce rule engines, and the Rete algorithm especially, as outdated technologies not suitable for CEP, promoting their own specialist algorithms. These systems tend to be based around an extended SQL, called “streaming SQL”, that detects event patterns in streams and executes actions. These tools do not tend to offer inferencing or provide capabilities for reasoning over sets of data, such as that found in a decision table. TIBCO, one of the main players in the field, unlike many of their rivals does implement its technology on an enhanced Rete algorithm and is able to offer full rule engine capabilities beyond just event processing.

At BRF08 I hypothesised to the ILog engineers that, had ILog stayed the course and continued to develop and promote their event management capabilities, the CEP industry as something standalone from rule engines might never have occurred. ILog could have emerged as the leader, being the early entrant and market creator. I had great fun pointing this out to the ILog engineers and joking that CEP is the industry that never should have happened. :)

Part 2 of this article will be a lot more technical, showing how Drools, originally a Rete-based rules engine, was easily, cleanly and orthogonally extended into a platform for (complex) event processing.

Drools does Pacman

In the interests of finding a fun and more complex problem with multiple things happening, I decided to start writing a Pacman implementation. The basics are now in place: I can load a grid and guide a Pacman around it with a Monster (Ghost) tracking it.


The grid is loaded from a text file that uses symbols to map the layout, currently I use a very simple layout that looks like this:
* * * * * * * * * * *
* # . . . _ . . . # *
* . * * * * * * * . *
* . * * * * * * * . *
* . . . . # . . . . *
* . * * * * * * * . *
* . * * * * * * * . *
* . . . . # . . . . *
* . * * * * * * * . *
* . * * * * * * * . *
* # . . . _ . . . # *
* * * * * * * * * * *

* Wall
. Food
# Power Pill
_ Empty
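
Parsing this layout into facts is straightforward. Below is a hypothetical sketch of a loader; the Cell, CellContents and CellType classes come from the example, but this parsing code and the constructor signatures are my own illustration:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import org.drools.runtime.StatefulKnowledgeSession;

// Hypothetical loader sketch: Cell(row, col) and CellContents(cell, type)
// constructors are assumptions, not necessarily the actual example code.
public class GridParser {

    // map a layout symbol to the CellType used by the rules
    static CellType toCellType(char symbol) {
        switch (symbol) {
            case '*': return CellType.WALL;
            case '.': return CellType.FOOD;
            case '#': return CellType.POWER_PILL;
            default : return CellType.EMPTY; // '_'
        }
    }

    public static void load(String fileName, StatefulKnowledgeSession ksession) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(fileName));
        try {
            String line;
            int row = 0;
            while ((line = reader.readLine()) != null) {
                String[] symbols = line.trim().split(" +"); // symbols are space separated
                for (int col = 0; col < symbols.length; col++) {
                    Cell cell = new Cell(row, col);
                    ksession.insert(cell);
                    ksession.insert(new CellContents(cell, toCellType(symbols[col].charAt(0))));
                }
                row++;
            }
        } finally {
            reader.close();
        }
    }
}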

When the game starts, Pacman is in the lower empty cell and the Monster in the top empty cell. The arrow keys move Pacman around, and the Monster tracks Pacman. The rules are split into four drl files: base, key-handlers, Pacman and Monster.

A KeyListener implementation is hooked up to a WorkingMemory entry point and feeds in key presses. From the KeyEvent it creates a derived (not in WorkingMemory) possible Direction and validates that Direction. If the new Direction is valid, the old Direction is retracted and the new one inserted. The exit point is used to send print information to a channel, which is appended to the GUI.
rule KeyListenerRule
    dialect "mvel"
when
    $keyEvent : KeyEvent() from entry-point "KeyListener"
    $char : Character( name == "Pacman" )
    $l : Location( character == $char )
    $newD : Direction() from createDirection( $l.character, $keyEvent )
    $target : Cell( row == ($l.row + $newD.vertical), col == ($l.col + $newD.horizontal) )
    CellContents( cell == $target, cellType != CellType.WALL )
    $oldD : Direction( character == $l.character )
then
    exitPoints["ConsoleExitPoint"].insert( "insert " + $newD + "\n" );
    retract( $keyEvent );
    retract( $oldD );
    insert( $newD );
end
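
On the Java side, hooking a Swing listener up to that entry point is only a few lines. This is a minimal sketch using the Drools 5 entry point API; the class itself is illustrative rather than the actual example code:

import java.awt.event.KeyAdapter;
import java.awt.event.KeyEvent;
import org.drools.runtime.StatefulKnowledgeSession;
import org.drools.runtime.rule.WorkingMemoryEntryPoint;

public class PacmanKeyListener extends KeyAdapter {
    private final WorkingMemoryEntryPoint entryPoint;

    public PacmanKeyListener(StatefulKnowledgeSession ksession) {
        // the name must match the one used in "from entry-point" in the rule
        this.entryPoint = ksession.getWorkingMemoryEntryPoint("KeyListener");
    }

    public void keyPressed(KeyEvent e) {
        // every key press becomes a fact on the entry point,
        // where KeyListenerRule picks it up
        entryPoint.insert(e);
    }
}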
As the Tick (simulated time) increases, we attempt to change a Character's Location based on the given Direction. The rule makes sure the new Location is valid and, if so, schedules the move to the new Location, in time with the Tick.
rule validDirection
    dialect "mvel"
when
    $l : Location( )
    $d : Direction( character == $l.character )
    $target : Cell( row == ($l.row + $d.vertical), col == ($l.col + $d.horizontal) )
    CellContents( cell == $target, cellType != CellType.WALL )
    not ScheduledLocationUpdate( location == $l )
    $t : Tick()
then
    insert( new ScheduledLocationUpdate($l, $l.row += $d.vertical, $l.col += $d.horizontal, $t.tock + 1) );
end

rule setNewDirection
    dialect "mvel"
when
    $s : ScheduledLocationUpdate()
    $l : Location( this == $s.location )
    Tick( tock == $s.tock )
then
    exitPoints["ConsoleExitPoint"].insert( "set new Location " + $l + "\n" );
    modify( $l ) { row = $s.row, col = $s.col };
    retract( $s );
end
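
The Tick facts that drive these rules have to come from somewhere. The driver isn't shown in this post, so the following is a rough, hypothetical sketch of a clock loop that inserts a fresh Tick each cycle; the Tick constructor and the frame rate are assumptions:

import org.drools.runtime.StatefulKnowledgeSession;
import org.drools.runtime.rule.FactHandle;

// Hypothetical clock driver: inserts a fresh Tick each cycle and retracts
// the previous one (unless a rule, e.g. MonsterCollision, already did).
public class TickClock implements Runnable {
    private final StatefulKnowledgeSession ksession;
    private volatile boolean running = true;

    public TickClock(StatefulKnowledgeSession ksession) {
        this.ksession = ksession;
    }

    public void stop() {
        running = false;
    }

    public void run() {
        int tock = 0;
        FactHandle previous = null;
        while (running) {
            // only retract the previous Tick if it is still in working memory
            if (previous != null && ksession.getObject(previous) != null) {
                ksession.retract(previous);
            }
            previous = ksession.insert(new Tick(tock++));
            ksession.fireAllRules();
            try {
                Thread.sleep(100); // rough frame rate, an assumption
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}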
As Pacman moves around, it detects the CellContents; if it's Food, it increases the score by 1.
rule EatFood
    dialect "mvel" no-loop
when
    $char : Character( name == "Pacman" )
    $l : Location( character == $char )
    $target : Cell( row == $l.row, col == $l.col )
    $contents : CellContents( cell == $target, cellType == CellType.FOOD )
    $s : Score()
then
    modify( $contents ) { cellType = CellType.EMPTY };
    modify( $s ) { score += 1 };
end
Among other things it's also looking out for Monster collisions.
rule MonsterCollision
    dialect "mvel" no-loop
when
    $pac : Character( name == "Pacman" )
    $pacLoc : Location( character == $pac )
    $mon : Character( name == "Monster" )
    $monLoc : Location( character == $mon, col == $pacLoc.col, row == $pacLoc.row )
    $t : Tick()
then
    retract( $t );
end

rule FinishedKilled
    dialect "mvel"
when
    $pac : Character( name == "Pacman" )
    $pacLoc : Location( character == $pac )
    $mon : Character( name == "Monster" )
    $monLoc : Location( character == $mon, col == $pacLoc.col, row == $pacLoc.row )
    not Tick()
    $s : Score()
then
    exitPoints["ConsoleExitPoint"].insert( "Killed!!!! score = " + $s.score + " \n" );
    kcontext.knowledgeRuntime.halt();
end
The implementation currently uses a simple distance diff from the Monster to the Pacman to determine the Monster's direction. The direction must be valid, and if both a horizontal and a vertical direction are valid, it uses dynamic salience to pick the one with the largest difference. This is a simplistic approach, just to get the ball rolling; ideally we would implement the logic as in the original arcade game.
rule GoRight
    dialect "mvel"
    salience (Math.abs( $df.colDiff ))
when
    $df : DirectionDiff( colDiff > 0 )
    $target : Cell( row == $df.row, col == ($df.col + 1) )
    CellContents( cell == $target, cellType != CellType.WALL )
    $d : Direction( character == $df.fromChar, horizontal != Direction.RIGHT )
then
    retract( $d );
    retract( $df );
    insert( new Direction($df.fromChar, Direction.RIGHT, 0 ) );
end

rule GoDown
    dialect "mvel"
    salience (Math.abs( $df.rowDiff ))
when
    $df : DirectionDiff( rowDiff < 0 )
    $target : Cell( col == $df.col, row == ($df.row - 1) )
    CellContents( cell == $target, cellType != CellType.WALL )
    $d : Direction( character == $df.fromChar, vertical != Direction.DOWN )
then
    retract( $d );
    retract( $df );
    insert( new Direction($df.fromChar, 0, Direction.DOWN ) );
end
Running the game and pressing the left arrow gives the following output. Notice Pacman moves to the left and stops when he reaches the wall; the Monster tracks him to the left and then comes down for the kill.
insert Direction Pacman speed = 5 LEFT
set new Location Location Monster speed = 3 10:4
set new Location Location Pacman speed = 5 1:4
set new Location Location Monster speed = 3 10:3
set new Location Location Pacman speed = 5 1:3
set new Location Location Monster speed = 3 10:4
set new Location Location Monster speed = 3 10:3
set new Location Location Pacman speed = 5 1:2
set new Location Location Monster speed = 3 10:2
retract Direction Monster speed = 3 LEFT
set new Location Location Monster speed = 3 10:1
set new Location Location Pacman speed = 5 1:1
retract Direction Pacman speed = 5 LEFT
set new Location Location Monster speed = 3 9:1
set new Location Location Monster speed = 3 8:1
set new Location Location Monster speed = 3 7:1
set new Location Location Monster speed = 3 6:1
set new Location Location Monster speed = 3 5:1
set new Location Location Monster speed = 3 4:1
set new Location Location Monster speed = 3 3:1
set new Location Location Monster speed = 3 2:1
set new Location Location Monster speed = 3 1:1
Killed!!!! score = 8
I've committed everything to drools-examples; you'll need Drools trunk, as there are a few fixes necessary for this to work:
trunk/drools-examples/drools-examples-drl/src/main/java/org/drools/examples/pacman/
trunk/drools-examples/drools-examples-drl/src/main/resources/org/drools/examples/pacman/
trunk/drools-examples/drools-examples-drl/src/main/rules/org/drools/examples/pacman/

It's still very basic. Next it needs to be hooked up to a GUI, such as SwtPacman, the source code of which is provided here on a wiki page where you can also add notes:
http://www.jboss.org/community/docs/DOC-14378

Then it should be updated to real Pacman grid layouts and all the monsters added, each with its own custom logic. There is also additional logic, like ghosts slowing down when they turn corners and Pacman slowing down when he eats food. You can find all the details at Wikipedia; the ghost personalities are summarised below.
Ghost    Original Pac-Man character    Original nickname     Alternate character / nickname    American character / nickname
Red      Oikake (chaser)               Akabei (red guy)      Urchin / Macky                    Shadow / Blinky
Pink     Machibuse (ambusher)          Pinky (pink guy)      Romp / Micky                      Speedy / Pinky
Cyan     Kimagure (fickle)             Aosuke (blue guy)     Stylist / Mucky                   Bashful / Inky
Orange   Otoboke (stupid)              Guzuta (slow guy)     Crybaby / Mocky                   Pokey / Clyde


I hope everyone finds this useful, and if anyone wants to help me finish this please dive straight in :)

Sunday, November 15, 2009

Rete DSL testing harness

As our Rete implementation gets more complicated, we need to find easier and more maintainable ways to test our node implementations. Currently we have unit tests for all our nodes done in pure Java; setting these up is laborious, and the amount of Java code involved makes it hard to read the intention. I'm finding that we are relying more on higher-level integration tests, which is lazy and not as good at catching problems early and in an isolated manner. Take, for example, the JoinNode; this is the sample code necessary to set up a node for testing and apply some assertion tests:
public void setUp() {
    // create mock objects
    constraint = mockery.mock( BetaNodeFieldConstraint.class );
    final ContextEntry c = mockery.mock( ContextEntry.class );

    // set mock objects expectations
    mockery.checking( new Expectations() {
        {
            // allowed calls and return values
            allowing( constraint ).createContextEntry();
            will( returnValue( c ) );

            allowing( c ).updateFromFactHandle( with( any( InternalWorkingMemory.class ) ),
                                                with( any( InternalFactHandle.class ) ) );
            allowing( c ).updateFromTuple( with( any( InternalWorkingMemory.class ) ),
                                           with( any( LeftTuple.class ) ) );
            allowing( c ).resetTuple();
            allowing( c ).resetFactHandle();
        }
    } );

    this.rule = new Rule( "test-rule" );
    this.context = new PropagationContextImpl( 0, PropagationContext.ASSERTION, null, null, null );
    this.workingMemory = new ReteooWorkingMemory( 1, (ReteooRuleBase) RuleBaseFactory.newRuleBase() );

    this.tupleSource = new MockTupleSource( 4 );
    this.objectSource = new MockObjectSource( 4 );
    this.sink = new MockLeftTupleSink();

    final RuleBaseConfiguration configuration = new RuleBaseConfiguration();

    ReteooRuleBase ruleBase = (ReteooRuleBase) RuleBaseFactory.newRuleBase();
    BuildContext buildContext = new BuildContext( ruleBase, ruleBase.getReteooBuilder().getIdGenerator() );

    this.node = new JoinNode( 15, this.tupleSource, this.objectSource,
                              new DefaultBetaConstraints( new BetaNodeFieldConstraint[]{ this.constraint }, configuration ),
                              Behavior.EMPTY_BEHAVIOR_LIST, buildContext );

    this.node.addTupleSink( this.sink );

    this.memory = (BetaMemory) this.workingMemory.getNodeMemory( this.node );

    // check memories are empty
    assertEquals( 0, this.memory.getLeftTupleMemory().size() );
    assertEquals( 0, this.memory.getRightTupleMemory().size() );
}

public void testRetractTuple() throws Exception {
    // set mock objects expectations
    mockery.checking( new Expectations() {
        {
            // allowed calls and return values
            allowing( constraint ).isAllowedCachedLeft( with( any( ContextEntry.class ) ),
                                                        with( any( InternalFactHandle.class ) ) );
            will( returnValue( true ) );
            allowing( constraint ).isAllowedCachedRight( with( any( LeftTuple.class ) ),
                                                         with( any( ContextEntry.class ) ) );
            will( returnValue( true ) );
        }
    } );

    // setup 2 tuples and 3 right-side fact handles
    final DefaultFactHandle f0 = (DefaultFactHandle) this.workingMemory.insert( "test0" );
    this.node.assertObject( f0, this.context, this.workingMemory );

    final DefaultFactHandle f1 = (DefaultFactHandle) this.workingMemory.insert( "test1" );
    final LeftTuple tuple1 = new LeftTuple( f1, this.node, true );
    this.node.assertLeftTuple( tuple1, this.context, this.workingMemory );

    final DefaultFactHandle f2 = (DefaultFactHandle) this.workingMemory.insert( "test2" );
    final LeftTuple tuple2 = new LeftTuple( f2, this.node, true );
    this.node.assertLeftTuple( tuple2, this.context, this.workingMemory );

    final DefaultFactHandle f3 = (DefaultFactHandle) this.workingMemory.insert( "test3" );
    this.node.assertObject( f3, this.context, this.workingMemory );

    final DefaultFactHandle f4 = (DefaultFactHandle) this.workingMemory.insert( "test4" );
    this.node.assertObject( f4, this.context, this.workingMemory );

    assertLength( 6, this.sink.getAsserted() );

    // double check the item is in memory
    final BetaMemory memory = (BetaMemory) this.workingMemory.getNodeMemory( this.node );
    assertTrue( memory.getRightTupleMemory().contains( f0.getFirstRightTuple() ) );

    // retract an object, check propagations and memory
    this.node.retractRightTuple( f0.getFirstRightTuple(), this.context, this.workingMemory );
    assertLength( 2, this.sink.getRetracted() );

    List tuples = new ArrayList();
    tuples.add( ((Object[]) this.sink.getRetracted().get( 0 ))[0] );
    tuples.add( ((Object[]) this.sink.getRetracted().get( 1 ))[0] );

    assertTrue( tuples.contains( new LeftTuple( tuple1, f0.getFirstRightTuple(), this.sink, true ) ) );
    assertTrue( tuples.contains( new LeftTuple( tuple2, f0.getFirstRightTuple(), this.sink, true ) ) );

    // now check the item is no longer in memory
    assertFalse( memory.getRightTupleMemory().contains( f0.getFirstRightTuple() ) );

    this.node.retractLeftTuple( tuple2, this.context, this.workingMemory );
    assertEquals( 4, this.sink.getRetracted().size() );

    tuples = new ArrayList();
    tuples.add( ((Object[]) this.sink.getRetracted().get( 2 ))[0] );
    tuples.add( ((Object[]) this.sink.getRetracted().get( 3 ))[0] );

    assertTrue( tuples.contains( new LeftTuple( tuple2, f3.getFirstRightTuple(), this.sink, true ) ) );
    assertTrue( tuples.contains( new LeftTuple( tuple2, f4.getFirstRightTuple(), this.sink, true ) ) );
}
I think everyone agrees that's a lot of code, and that it's hard for anyone, especially newbies, to understand its intent.

This means that developers can be apathetic about adding more similar tests for edge cases, and we have a long-term maintenance problem when bringing new developers on board.

Enter the "Rete DSL testing harness". This is an indentation based DSL for setting up and testing nodes. My plan is next to have it working with JUnit4 with a customised test suite. Hopefully everyone can understand what this is doing, which is actually doing and testing more than the above java code.
// setup the nodes
ObjectTypeNode
    otn1, java.lang.Integer
LeftInputAdapterNode
    lian0, otn1
ObjectTypeNode
    otn2, java.lang.Integer
ObjectTypeNode
    otn3, java.lang.Integer

// create a binding to be used in the JoinNode creation
Binding
    p1, 0, java.lang.Integer, intValue

JoinNode
    join1, lian0, otn2
    intValue, !=, p1
JoinNode
    join2, join1, otn3
    intValue, !=, p1

// insert some facts; this returns and stores an array called "h"
Facts
    0, 1, 2, 3, 4

// h\d+ is used for compactness (not too many brackets) but is internally rewritten
// as h[\d+] and evaluated with MVEL against "h"
assert
    otn1 [h1, h3]
    otn2 [h0, h2]
    otn3 [h4]

// we can now test some memories; memory order is deterministic
join1
    leftMemory [[h1], [h3]] // matches with only one fact
    rightMemory [h0, h2]
join2
    leftMemory [[h1, h0], [h3, h0],
                [h1, h2], [h3, h2]] // matches with two chained facts
    rightMemory [h4]
retract
    otn1 [h1]
    otn2 [h2]
join1
    leftMemory [[h3]]
    rightMemory [h0]
join2
    leftMemory [[h3, h0]]
    rightMemory [h4]

Friday, November 13, 2009

Are we there yet?

This presentation by Rich Hickey, published at InfoQ, has so much to do with the things I am researching for Drools right now, and summarizes so well some of the important notions we will have to deal with in the near future, that I highly recommend it:

Are We There Yet?

Money quotes:

"The future is a function from the past, it doesn't change it."

"Time is atomic, epochal succession of process events."

"There is no such thing as mutable objects. If you can really believe in that, you can build better systems."

If you want to understand the above quotes, watch the presentation. :)

Edson

Monday, November 09, 2009

What is inference and how does it facilitate good rule design and maintenance

Inference has a bad name these days, as something not relevant to business use cases and just too complicated to be useful. It is true that contrived and complicated examples of inference exist, but that should not detract from the fact that simple and useful ones exist too. More than this, the correct use of inference can create more agile and less error-prone businesses with easier-to-maintain software.

So what is inference? Something is inferred when we gain knowledge of something from using previous knowledge. For example given a Person fact with an age field and a rule that provides age policy control, we can infer whether a Person is an adult or a child and act on this.
rule "Infer Adult"
when
$p : Person( age >= 18 )
then
insert( new IsAdult( $p ) )
end
So in the above, every Person who is 18 or over will have an instance of IsAdult inserted for them. This fact is special in that it is known as a relation. We can use this inferred relation in any rule:
$p : Person()
IsAdult( person == $p )
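
The IsAdult relation itself can be a trivial Java fact class, along these lines (a sketch; the field and accessor names are assumptions that simply need to line up with the 'person == $p' constraint above):

// A relation fact: it records that a particular Person is an adult.
public class IsAdult {
    private final Person person;

    public IsAdult(Person person) {
        this.person = person;
    }

    public Person getPerson() {
        return person;
    }
}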
In the future we hope to improve the language so you can have special handling for known relation facts, letting you just do the following, with the join implicit:

Person() IsAdult( )
So now we know what inference is, and have a basic example. How does this facilitate good rule design and maintenance?

Let's take a government department that is responsible for issuing ID cards when children become adults, henceforth referred to as the ID department. They might have a decision table that includes logic like this, which says that when a person living in London is 18 or over, issue the card:


However, the ID department does not set the policy on who is an adult. That's done at a central government level. If the central government were to change that age to 21, there would be a change management process: someone has to liaise with the ID department and make sure their systems are updated in time for the law going live.

This change management process and communication between departments is not ideal for an agile environment, and changes become costly and error-prone. The ID department is also managing more information than it needs to, with a "monolithic" approach to rules management that "leaks" information better placed elsewhere. By this I mean that it shouldn't care that the explicit "age >= 18" check determines whether someone is an adult, only that they are an adult.

Instead, what if we were to split (de-couple) the authoring responsibility, so that the central government maintains its rules and the ID department maintains its own?

So it's the central government's job to determine who is an adult, and if they change the law they just update their central repository with the new rules, which others use:


The IsAdult fact, as discussed previously, is inferred from the policy rules. It encapsulates the seemingly arbitrary piece of logic "age >= 18" and provides a semantic abstraction for its meaning. Now anyone using the above rules no longer needs to be aware of the explicit information that determines whether someone is an adult or not; they can just use the inferred fact:


While the example is very minimal and trivial, it illustrates some important points. We started with a monolithic and leaky approach to our knowledge engineering: a single decision table that had all possible information in it, leaking information from central government that the ID department did not care about and did not want to manage.

We first de-coupled the knowledge process so each department was responsible only for what it needed to know. We then encapsulated this leaky knowledge using an inferred fact, IsAdult. The use of the term IsAdult also gave a semantic abstraction to the previously arbitrary logic "age >= 18".

So a general rule of thumb when doing your knowledge engineering is:

Bad
  • Monolithic
  • Leaky
Good
  • De-couple knowledge responsibilities
  • Encapsulate knowledge
  • Provide semantic abstractions for those encapsulations

Sunday, November 08, 2009

Drools Solver at Devoxx (Javapolis)

Devoxx (AKA Javapolis) is little more than a week away.

I'll be holding a BOF presentation about the examination timetabling problem and how I've solved it with Drools Solver. You are welcome at my BOF on Monday 16 November at 20:00.

Here's a preview of one of the slides, which shows a tiny examination instance:



Update: I've posted an article on DZone which explains that slide in detail.

Monday, November 02, 2009

Monitoring your Drools Flow processes

You need to actively monitor your processes to make sure you can detect any anomalies and react to unexpected events as soon as possible. Business Activity Monitoring (BAM) is concerned with real-time monitoring of your processes and the option of intervening directly, possibly even automatically, based on the analysis of these events.

There are numerous technical ways to monitor your processes, and this blog will describe two options: analyzing the low-level process events emitted by the process engine or using custom business events. Finally, a preview screencast on the BAM web-console is presented.

Analyzing low-level process events

Drools Flow can be configured to emit events about the execution of your processes (start / stop) and each of the nodes inside (triggered / left). Using Drools Fusion, these events can then be processed using event processing (CEP) rules to detect anomalies, derive higher-level business events, etc. To start processing these (low-level, generic) events, add a process event listener to the session that forwards all related process events to a session responsible for processing them; this could be the same session as the one executing the processes, or an entirely independent one. A minimal sketch of such a forwarding listener is shown below.
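
This sketch uses the Drools 5 knowledge-api event classes as I remember them; double-check the package names against your version:

import org.drools.event.process.DefaultProcessEventListener;
import org.drools.event.process.ProcessCompletedEvent;
import org.drools.event.process.ProcessStartedEvent;
import org.drools.runtime.StatefulKnowledgeSession;

// Forwards process events from the session running the processes
// into the session that runs the monitoring (CEP) rules.
public class MonitoringForwarder extends DefaultProcessEventListener {
    private final StatefulKnowledgeSession monitoringSession;

    public MonitoringForwarder(StatefulKnowledgeSession monitoringSession) {
        this.monitoringSession = monitoringSession;
    }

    public void afterProcessStarted(ProcessStartedEvent event) {
        monitoringSession.insert(event);
    }

    public void afterProcessCompleted(ProcessCompletedEvent event) {
        monitoringSession.insert(event);
    }
}

// usage: processSession.addEventListener(new MonitoringForwarder(monitoringSession));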

You can then define CEP rules that process these low-level events. For example, the following rule accumulates all start-process events for one specific order process over the last hour, using the sliding window support. The rule prints an error message if more than 1000 process instances were started in that hour (e.g., to detect a possible overload of the server).
declare ProcessStartedEvent
    @role( event )
end

rule "Number of process instances above threshold"
when
    Number( nbProcesses : intValue > 1000 )
        from accumulate(
            e : ProcessStartedEvent( processInstance.processId == "com.sample.order.OrderProcess" )
                over window:time(1h),
            count(e) )
then
    System.err.println( "WARNING: Nb of order processes in the last hour > 1000: " + nbProcesses );
end
Defining custom business events

While processing generic, low-level process events could allow you to derive higher-level business events, defining these derivation rules could be complex. In many cases, people simply want to annotate their process with meta-data that indicates when specific business events are happening. For example, one node in the process might be annotated as a "New Customer" event to indicate that, when processing that node, we are actually registering a new customer. Similar annotations could be used to annotate all nodes that fall under the "Inform Customer" category, etc. During the execution of the process, this meta-data can then be used to generate higher-level business events.

The first requirement, then, is being able to annotate nodes with custom meta-data. Luckily, the BPMN2 specification provides an extensibility mechanism that allows you to add custom extensions to the specification, as in our case for monitoring meta-data. The Drools XML framework also supports plugging in custom XML handlers, which allows us to handle these custom XML tags and add them as meta-data to the nodes.

For example, nodes in a BPMN2 process could then be annotated with this (very simple) custom monitoring data:
<userTask id="_15" name="Inform" implementation="humanTaskWebService" >
<bam:event name="Inform" type="onEntry" data="#{request.customerId}" />
...
</userTask>
A custom event listener can then use this meta-data to derive when these business events should be created and processed.
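
As an illustration only, such a listener might look like the following; the BAMEvent fact class and the meta-data lookup key are hypothetical, invented for this sketch rather than taken from the actual implementation:

import java.util.Map;
import org.drools.event.process.DefaultProcessEventListener;
import org.drools.event.process.ProcessNodeTriggeredEvent;
import org.drools.runtime.StatefulKnowledgeSession;

// Turns "bam:event" annotations of type onEntry into BAMEvent facts
// for the monitoring session. BAMEvent is our own hypothetical fact class,
// and the meta-data key/accessor is assumed, not actual Drools Flow API.
public class BusinessEventListener extends DefaultProcessEventListener {
    private final StatefulKnowledgeSession monitoringSession;

    public BusinessEventListener(StatefulKnowledgeSession monitoringSession) {
        this.monitoringSession = monitoringSession;
    }

    public void afterNodeTriggered(ProcessNodeTriggeredEvent event) {
        // the custom XML handler stored the annotation in the node's meta-data
        Map<String, Object> metaData = event.getNodeInstance().getNode().getMetaData();
        String eventName = (String) metaData.get("bam.event.onEntry");
        if (eventName != null) {
            monitoringSession.insert(new BAMEvent(eventName));
        }
    }
}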

Finally, an event processing rule can use these higher-level events to derive crucial monitoring information, for example that a user has not received any information within 6 hours of the initial processing of his request:
rule "Verify time after request"
when
start: BAMEvent( name == "Process" )
not ( BAMEvent( name == "Inform", this after[0h,6h] start ) )
then
System.out.println("Customer not informed for over 6h!");
end


BAM console

Finally, monitoring information like that derived above should not just be printed to the console, but displayed using charts, graphs, etc. The Service Activity Monitoring (SAM) project is planning to offer just that: a GWT-based web console that allows you to view these charts. A simple example was recently presented, and we have adapted it to generate a chart that continuously shows the number of started process instances. While its functionality is still very limited, I hope this already shows the direction we're going, and we hope to extend it steadily.