Your hosts: Mark Proctor, Michael Neale, Edson Tirelli and Kris Verlaenen.

Wednesday, July 08, 2009

A take on Drools Boot Camp 09...

These are my notes and pictures - I am sure others will have notes and pictures to post (if you do have pictures, feel free to email them to me and I will share them here, or links to Flickr or whatever...).

First day, "state of the union":

We had a few kind-of presentations each day, but mostly unstructured. I think it worked pretty well for the most part.

Also had JavaOne duties:

Just a few of the interesting things I worked on with others:

Franklin American crew:

  • a Guvnor plugin to persist rule data to a RDBMS in a reasonably normalised format (that they could mine, use for other things)
  • a design for "smart views" - searches that are persistent based on category, status etc, which can be used to build/run tests and "deploy" (so you are only deploying certain rules in a certain status) - selectors on steroids.
NHIN:
Talked with Emory Fry about a standard rule-based "platform" for the National Health Information Network - the idea being a common model, execution server and environment so that rules can be sent as messages and processed close to where the data lives (rather than having to pull all the data around).

Michael Finger worked on making Jackrabbit clusterable for failover (I expect a blog from him sometime !).

A general point I took away was the different ways people will want to integrate and re-use your code, often in unexpected ways. I need to be more vigilant in leaving clear interfaces, with documentation for others to use, and think more about plug-in architectures that allow people to extend things in a safe and upgrade-friendly way.

For some R&R some of us went on an epic ride across the Golden Gate Bridge:


Kris Verlaenen tried to kill us a few times, taking us down dead end hills only to ride back up. Kris is quite fit and this meant nothing to him ;) Another fact about Kris: please address him as "Dr Kris" as he actually has a doctorate.

I look forward to next time.

Thursday, July 02, 2009

Drools Job requested - Buenos Aires Argentina

Hi guys and girls out there, my name is Mauricio Salatino (a.k.a. Salaboy), I'm working at a health care company in Argentina, located in Capital Federal, Buenos Aires. I've been working, almost full time, with the Drools team as a contributor for about 8 months now. While the work has been very demanding, requiring a lot of initiative and hard work, it's been hugely rewarding and a fantastic experience both professionally and personally.

My employers would like to hire someone else to work with me (located in Capital Federal, Buenos Aires - Argentina).
The ideal skillset is:

- GWT
- Eclipse RPC
- BAM
- Drools Basics
- Tenacity, Passion and Initiative

While the technical skills are important, it is far more important to find someone with the interest and motivation to work on Drools. While the best mentoring is available to hand, it's very passive and you need to ask - you will not be spoon fed. The ideal candidate will be able to work with minimal guidance and direction under their own initiative. Working on Drools and OSS should be a passion that will drive many long coding nights.


If you know GWT and have experience with BPM, BAM, Business Rules, CEP, or if you are already involved with another Open Source project, this is a great chance for you to get involved.

Interested? Think you could meet this challenge and work with one of the most exciting OSS teams in the world? If so, then email me with your resume and a covering letter explaining why you are perfect for the position. Please use the email subject heading "DROOLS JOB".

Here is my mail: salaboy@gmail.com

Introducing a new smart forms tool: Drools Advisor

A new tool is in the works that we are currently code naming Advisor (open to other ideas? perhaps Smart Forms?). This idea came about from talking with the good folks at Solnet Solutions in New Zealand (NZ is also known as Middle Earth!) - they had a need to create Questionnaire-style form apps quickly - which were generated dynamically, entirely from writing rules (so they could quickly change, and roll out new apps etc). This was a forms engine with a few twists, and sounded very interesting...

In the last couple of months Solnet have been beavering away on this tool, which will be open sourced soon. There are a few components to it, but first let's look at some samples:

1) "Earth departure card":
(iPhone)

(and web):


2) Tax calculator:


To get the above to work in this tool, only a few things were needed: 1) Some rules to express the question logic, 2) CSS styles to make the forms look how you want (everything is style-able) - to fit in with a given look. It's not just dumb form fields either: there are lots of things you can do, the questions can be quite complex structures etc.

If you were writing rules by hand, they would take the form of:


// the rule name below is illustrative only
rule "ask flight"
when
// some conditions go here, or none if you like
then
Question question = new Question();
question.setId("flight");
question.setAnswerType("text");
question.setPreLabel("Flight number / Name of ship");
insertLogical(question);
end

In the condition part - you do whatever you normally do. If there is no condition part, then the rule always fires. The RHS is where it gets interesting: the tool provides some built-in model classes which allow you to "ask questions".
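The Advisor model classes are not yet released, so the following is only a guess at the minimal shape a Question fact would need for the rules in this post. The setters setId, setAnswerType and setPreLabel and the id and answer properties are taken from the examples here; the field types and the id-based equals/hashCode are assumptions made for illustration.

```java
import java.util.Objects;

// Hypothetical sketch of a question fact; NOT the actual Advisor class.
class Question {
    private String id;          // unique key, e.g. "flight"
    private String answerType;  // e.g. "text"
    private String preLabel;    // label shown before the input control
    private String answer;      // stays null until the user has answered

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getAnswerType() { return answerType; }
    public void setAnswerType(String answerType) { this.answerType = answerType; }

    public String getPreLabel() { return preLabel; }
    public void setPreLabel(String preLabel) { this.preLabel = preLabel; }

    public String getAnswer() { return answer; }
    public void setAnswer(String answer) { this.answer = answer; }

    // Assumption: two questions with the same id are treated as the same fact.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Question)) return false;
        return Objects.equals(id, ((Question) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}
```

Getters are included because a pattern such as Question(id == "stayingInNZ", answer != "true") reads the id and answer properties through them.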

You can then use the answers to that question to drive further questions, and so on (creating a chain of logic, like a decision tree) - so it is intelligently asking the user more questions depending on the answers received. (Truth maintenance - logical assertions, are very useful here, as the user could change an answer, and then the items in the "subtree" below it are no longer needed, and will be retracted automatically):


rule "next country"
when
Question(id == "stayingInNZ", answer != "true");
then
Question question = new Question();
question.setId("countryNext");
question.setAnswerType("text");
question.setPreLabel("or");
insertLogical(question);
end
The above says that when the "stayingInNZ" question is answered (to "not true") then ask them for another country etc...
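To make the truth maintenance idea concrete, here is a toy model in plain Java. It is not how Drools implements logical assertions; the class and method names are made up for this sketch. It only shows the bookkeeping: a derived fact stays around while at least one justification supports it, and disappears when the last one goes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of truth maintenance; an illustration, NOT the Drools implementation.
class ToyTruthMaintenance {
    // derived fact -> justifications (think: rule activations) currently supporting it
    private final Map<String, Set<String>> support = new HashMap<>();

    // Loosely analogous to insertLogical: the fact exists while a justification holds.
    void insertLogical(String fact, String justification) {
        support.computeIfAbsent(fact, k -> new HashSet<>()).add(justification);
    }

    // Called when a justification stops being true, e.g. the user changes an answer.
    void retractJustification(String justification) {
        support.values().forEach(js -> js.remove(justification));
        // Facts left with no support are removed automatically.
        support.values().removeIf(Set::isEmpty);
    }

    boolean contains(String fact) {
        return support.containsKey(fact);
    }
}
```

In the "next country" rule, the justification would be the match on the "stayingInNZ" answer: once the user changes that answer, the "countryNext" question loses its support and goes away. The toy does not cascade through a whole subtree; in the engine that happens because removing a question also invalidates the rules that matched on it.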


(by clicking on "Yes" radio button, other questions are revealed - this happens via ajax - talking to the server as the user enters data, executing rules etc).

There are a few components at work: A web front end (based on jQuery) and the Advisor logic itself (which, no surprise, is implemented itself in rules). Obviously this isn't locking it in to web front ends (so you could have other front ends or systems answering questions, not just a user via a web browser).


My sketch above tries to convey the different parts. The Advisor logic lives on the server - hosted in the new "execution server" (more on that in a future post) which is a web server that talks XML to a stateful knowledge session (and of course the Advisor is a model + rules itself). In the web case, the "client" is a javascript library, using jQuery that renders questions based on responses from the knowledge session.

This "client" is generic, and can be hosted in any web page or app (doesn't require java) - it can "hang" off a html "div". CSS provides styling for all controls/fonts/colours (I even saw an example where a slider was used to provide a numeric value).

"Pixie Dust": As we were designing this, we realised most of the logic can and should be implemented in built in rules - we affectionately called these "pixie dust" rules - not meant to be edited or viewed by end users, but they do the magic behind the scenes.

There is much more that is possible: for instance mapping from the "answers" coming back from a user to a pojo which you use in your business rules (so you can use the Q&A stuff as a means to request for more data when needed).

Why this is interesting: using rules and truth maintenance to establish chains of logic allows large Q&A apps to be developed that intelligently ask for more data as needed (eg when evaluating someone for life insurance). Of course, there are features built in such as Groups, which allow more traditional "form" and "page" style behavior that people will expect (much more than can be shown here) - and you have the full power of all the normal things you might want to do with Drools.

Next steps: One of the key aims is to have a domain specific GUI in Guvnor to assist with building these apps end to end (guiding the user to make it as easy as is possible) - and adding the Advisor "pixie dust" to any set of rules, and running these apps (and deploying etc..).

I hope this is an interesting introduction, there is more to come...

Tuesday, June 30, 2009

Drools Video Series Released

My name is Ray Ploski. I've been a JBoss Solutions Architect for some time. It's a fantastic job - I get to play with all the technologies produced by JBoss, learn how our customers are using the technologies to solve real world problems, and then share these ideas with you. When I first started to learn more about how people were using Drools in combination with traditional Java development I was blown away with the power and simplicity the technology brought to even the most complex (or even simple) business use cases.

If you wonder "What is a rule engine?", "Why is this different/better to an 'if' statement", "Where do I get started?" or "How do I learn more?" I've released a series of videos to introduce you to the project and its technologies:

Introduction to Drools Expert:
Expert is the core rule engine central to the Drools project. In this video we'll walk through the concepts relating to Facts, Rules, Agendas and Working Memory as well as how one sets up and debugs a Drools project within Eclipse. Learn how to use audit trails. (Video in Hi-Res or Low-Res)

Introduction to Drools Guvnor:
Guvnor is a centralized repository with rich web-based graphical user interfaces to create and manage your logic. In this video we walk through features of searching, creating, testing and deploying knowledge-based assets. (Video in Hi-Res or Low-Res)

Introduction to Drools Fusion:
Fusion is an extension to the Drools engine that enables Complex Event Processing and Temporal Reasoning. In this video we cover the concepts introduced with Fusion and walk through an example. (Video in Hi-Res or Low-Res)

Introduction to Drools Flow:
Flow provides workflow and process capabilities to Drools. The project tightly integrates rules with business process. In this video we walk through the tooling and concepts involved with flow including step-wise debugging. (Video in Hi-Res or Low-Res)

Hopefully you will share my enthusiasm and excitement of what you can accomplish with Drools. There will be more, short clips added soon on topics such as: Decision Tables, creating your own Work Items, What's new in Drools 5, Commands within Expert, Guvnor/Eclipse Integration. What else would you like to see?

Chatting on IRC - a reminder of web tools available

As people should know, the "virtual office" or "virtual bar" (depending on the time of day) where devs and users hang out is the IRC chat room generously provided by codehaus.org.

Server: irc.codehaus.org
Room: #drools

For those who don't want to or can't run IRC clients, don't despair, there are web interfaces. Codehaus has http://irc.codehaus.org, but I thought I would show a new ajaxy web one that I think works quite well:


Go to here and click on the "Server" link. Enter irc.codehaus.org as the server address, and #drools as a channel, and a short nick name (< 8 chars please !).

Sunday, June 28, 2009

Drools a reflection on 5 years.

5 years ago, when I first started to promote rule engines to the mainstream Java developer market, the questions I most often received were "What is a rule engine?" and "Why is this different/better to an 'if' statement". In a room of 25 developers maybe only 3 or 4 would have heard of Jess, JRules or Prolog and only 1 or 2 would have any actual experience.

It was a long hard slog of repeating the same information over and over and over and over again to get the message out.

5 years later and the picture is very different. I'm no longer asked what a rule engine is or having to explain the benefits, and everyone has heard of Drools. My personal feeling is that 2009 has become the tipping point for Drools, our "coming of age" year.

Could we have had the October Rules Festival 5 years ago and filled the room with over 100 people, the bulk of which were Java developers? Could I have had a Boot Camp backed with A-list names such as Wells Fargo, Boeing, Fedex, Lockheed Martin, Sony, HP, Sun?

Reflecting on this made me feel immensely proud of what the Drools team (Michael Neale, Edson Tirelli, Kris Verlaenen, Toni Rikkola and in the early years Bob McWhirter) and myself had achieved. We were actually responsible for making a whole mainstream market for Rule Engine technology. No other OSS engine has had any real market penetration and the commercial engines still do not target the mainstream Java developers - i.e. you don't see JRules or Blaze Advisor at JavaOne or Devoxx (JavaPolis) or other similar events.

Drools was first established in 2001 by Bob McWhirter; there was no Drools 1.0 release. For those that remember, the very early versions of Drools used Jelly (that xml scripting framework) and didn't even compile on Windows without cygwin, as they required bash shell scripts - Bob's handiwork ;) A little later I got involved in Drools, and together Bob, myself and others from the community finally managed to push out Drools 2.0, the first release, in June 2005; you can see the TSS announcement here. Drools 2.0 was a simple xml scripting language with a partial Rete-"like" implementation.

It was at this point that I became project lead, replacing Bob McWhirter who by then had become interested in other things, although he still remained involved in Drools, just to a lesser extent. When I first became involved in Drools I had zero background in rule engine technology, although I had an AI background in search space technology, specifically genetic algorithms.

Exactly 1 year later Drools 3.0 was released in June 2006, TSS announcement here. Drools 3.0 was a full Rete implementation aimed at the Jess market.

Just over 1 year from that in July 2007 Drools 4.0 was released, TSS announcement here. Drools 4.0 moved up the food chain and was aimed at the JRules BRE market.

It took two full years to finally release Drools 5.0, TSS announcement here. Drools 5.0 has no target market and innovates beyond what traditional rule engines do to become what we refer to as a Business Logic integration Platform (BLiP). Drools 5.0 integrates and unifies rules, workflow and event processing. Drools 5.0 also includes Drools Solver, which is led by community member Geoffrey De Smet. Probably the only comparable system now is Tibco Business Events, which is going in a similar direction.

For a bit of fun I thought I'd paste an old IRC entry from my early days with Drools (my handle is conan), when Bob McWhirter was my mentor - unfortunately my entries prior to 2004 are lost :( This paste provides a comical reference to where I'd told my employers that an unreleased piece of software was stable "as granite" and production ready in my efforts to sell Drools, only to find out otherwise. Should hopefully be encouraging for people to see me in my more "clue free" days, showing that if I can do it, anyone can :)

[2004-02-09 19:23:38] <conan> supposed to be getting the rules running at cisco this week, if drools is broken - I'm going to have a serious confidence problem with management.
[2004-02-09 19:24:16] <topping> yes, caveats are good when pushing unreleased software to managment :-)
[2004-02-09 19:25:03] <conan> topping: yeah I've been telling them its stable as granite!!!
[2004-02-09 19:52:12] <conan> I'm thinking it might just be easier to stick in as an example for now in drools-examples
[2004-02-09 19:52:21] <topping> i dunno, what problem are you having?
[2004-02-09 19:53:29] <conan> I add two "request" objects which have states. on reset even which is fired when one request state = "Q" and any other request state != "N" can end up with request1 and requet2 being the same.
[2004-02-09 19:53:35] <conan> which is fine
[2004-02-09 19:53:53] <conan> I then retract the object, but its seems to recurse around still, even though there should be no data.
[2004-02-09 19:56:05] <bob> howdy
[2004-02-09 19:56:16] <bob> if you've got a rule firing against a previously retracted object, then definitely a drools bug
[2004-02-09 19:56:20] <bob> probably in the Agenda management
[2004-02-09 19:56:39] <bob> Agenda isn't dropping rule activations that involve retracted objects
[2004-02-09 19:56:42] <bob> (just guessing)
[2004-02-09 19:57:10] <conan> could this be beause the object is referenced by two parameters?
[2004-02-09 19:57:13] <bob> nope
[2004-02-09 19:57:22] <bob> an object either is or is-not in the working-memory
[2004-02-09 19:57:31] <bob> if you take it out of the memory and it's still in a rule activation, then bug
[2004-02-09 19:57:44] <conan> bob: I'm going to knock up an example then and probe this.
[2004-02-09 19:57:51] <bob> entirely possible I broke this in beta-12
[2004-02-09 19:58:04] <bob> bad idea saying drools is "stable as granite" :)
[2004-02-09 19:58:13] <conan> bob: yeah I know :)

I would be remiss if I didn't take this opportunity to thank many of the wonderful community contributors (in no particular order beyond old and new school) that helped make Drools what it is. If I missed your name, please do let me know and I'll add it.

Old School:

Alexander Saint Croix (nalex), Thomas Deisler, Doug Bryant (doug), Brian Topping (topping), Peter Royal (proyal), Simon Harris (sharris), Peter Lin (woolfel), David Cramer, Roger F. Gay, Barry Kaplan (memelet), Andy Barnett (dbarnett), Matt Ho (savaki), Martin Hold (mhald), Pete Kazmier (kaz), Alexander Bagerman (bagerman), Michael Frandsen.

New School:
David Sinclair (stampy88), Ming Jin (ming), Ellen Zhao (ellen), Ben Truit, Wolfgang Laun (Laune) , Matthias Groch, Matt Geis, Joe White (joe), Michael Rhoden (mrhoden), Geoffrey De Smet (Ge0ffrey), Alexandre Porcelli (porcelli), Ahti Kitsik (Ahti), Tihomir Surdilovic, Salatino Mauricio (salaboy), Davide Sottara (sotty).

Thursday, June 25, 2009

Drools Flow performance

People sometimes ask for tests, benchmarks or numbers that they can use for evaluating whether Drools actually is fast enough. Fast enough always depends on your specific case. We've had various blogs before on performance for the rules engine itself, but so far we have never published anything for Drools Flow.

However, not publishing can sometimes lead to confusion as well (as for example here, where Drools Flow was used as one candidate in a performance evaluation and we at first sight only seemed a fraction faster, but it's difficult to actually figure out what the exact results were). That's why I will post some figures here anyway, simply as some kind of reference, to determine the kind of overhead the engine creates during the execution of your processes.

The test we're using here is actually a very simple one: we simply start an empty process (a start and end node connected to each other) and execute that 10.000x in sequence and measure the avg time it takes to execute that process. These results of course heavily depend on how you configure your engine and we will show these results in three different settings:

A. Simple POJO execution: The Drools engine is used as a simple local Java component (so without any persistence or transactions)

B. Persistence / transactions: The same process is executed but in a transactional context (a new transaction for each process instance), and the state of the engine is always persisted in the database.

C. Optimized Java mode: This is actually one of my pet side-projects, where we translate the Drools Flow process straight into Java code and execute that Java code for you (the client simply needs to change one simple configuration for the process). While this severely limits the types of nodes you're allowed to use in your process (no wait states for example), and reduces the flexibility of your process, it shows how we can make Drools Flow lightning fast (in specific circumstances) if necessary. And it is of course a good reference for showing what the limit is ;) This is again without persistence and transactions.
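For readers who want to run a similar measurement themselves, a sequential timing loop of this kind can be sketched as follows. This is not the actual test code; the class name is made up, and the Runnable is a placeholder for "start one empty process instance" in whichever of the three settings you are measuring.

```java
// Minimal timing harness; a sketch, NOT the benchmark code used for this post.
class AvgTimer {
    // Runs the task `runs` times in sequence and returns the average
    // wall-clock time per run in milliseconds.
    static double averageMillis(Runnable task, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would call the engine here.
        Runnable emptyProcess = () -> { };
        System.out.println(averageMillis(emptyProcess, 10_000) + " ms / process instance");
    }
}
```

Note that a loop like this includes JIT warm-up in the measured time, so for very short tasks the first few thousand iterations can dominate the average.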

Results [using IBM ThinkPad T61 laptop running RHEL, Java 1.6]

A. Simple: 388ms -> 0.04ms / process instance
B. Persistence / transactions: 21.9s -> 2ms / process instance
C. Optimized Java: 126ms -> 0.01ms / process instance

If you're using the engine itself without any persistence or transactions (those are added as orthogonal layers, not part of the core itself), we think it's pretty fast :)

As you can see, there's a certain price you have to pay for adding persistence and transactions. But since simply opening a JPA session and persisting one object in a transaction takes about 1.5ms here as well (75% of the total time), we believe we probably do limit the additional overhead.

The optimized Java mode shows that, if you really need to, you can still get about a 3x performance increase (388ms vs 126ms) by generating Java code from the process description. We hope to get this included into the code base at some time, and maybe even provide this functionality to certain parts of your process.

If these numbers are insufficient, you'll still be able to start looking at executing commands in parallel (they were all executed in sequence now), using multiple sessions to split up the work, etc.

For those who want to verify themselves, the actual test code can be found here.

Complex Logic Formulas #3

In my previous two posts I introduced configurable operators, but maybe some of you noticed that I (willingly) left one out: the negation NOT.

In Drools, negation appears essentially in three places, with slightly different semantics:


  • relational evaluators have their negated counterpart:
    Person( age == 18 , age != 18 )
  • custom evaluators support the "not" prefix keyword:
    Person( age not ~old )
  • the (non) existential quantifier is allowed:
    not Person( age < 18)


In particular, the second and the third case have different meanings.

The negation in "age not old" is a logical negation: it takes the result of the evaluator (be it a boolean or a generalized degree) and inverts it, mapping true to false and vice versa. The quantifier "not", instead, models the condition "when there is NO object matching the pattern...".

In fact, logicians use the term logical negation for the former, and negation as failure for the latter. Due to this ambiguity, in Drools Chance "not" is still supported with the usual, context-dependent semantics, but is deprecated. Instead, the two operators neg and naf are proposed.

neg can be used both before evaluators and between-pattern evaluators, and can be nested:


rule "NEG"
when
neg neg neg ( // equivalent to a single neg
$p : Person( age neg ~young )
and
Car( owner == $p, price neg ~low )
)
then
// this rule activates for each pair Person/Car in which
// either the owner is young or their car is cheap
// (De Morgan: neg( neg young and neg low ) = young or low)
end



naf, instead, must be used before a sub-formula:


rule "NAF"
when
naf Person( age < 18 )
then
// this rule will activate if there are no people under 18
// (i.e. every Person is 18 or older)
end



Notice that in Drools the two are connected by the relation naf <-> neg exists.
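The difference between the two negations, and the relation naf <-> neg exists, can be sketched in plain Java over degrees in [0, 1]. This is only an illustration: the class name is made up, and the choice of 1 - d for neg and of max as the aggregator for exists are assumptions (Drools Chance lets such operators be configured).

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustration of neg vs naf on degrees in [0, 1]; NOT Drools Chance code.
class Negations {
    // neg: logical negation of a single degree (standard complement assumed).
    static double neg(double degree) {
        return 1.0 - degree;
    }

    // exists: degree to which at least one fact matches the pattern
    // (max assumed as aggregator; an empty list yields 0).
    static <T> double exists(List<T> facts, ToDoubleFunction<? super T> pattern) {
        double best = 0.0;
        for (T fact : facts) {
            best = Math.max(best, pattern.applyAsDouble(fact));
        }
        return best;
    }

    // naf: negation as failure, defined through naf <-> neg exists.
    static <T> double naf(List<T> facts, ToDoubleFunction<? super T> pattern) {
        return neg(exists(facts, pattern));
    }
}
```

With a crisp pattern such as "age < 18" (degree 1 or 0), naf is 1 exactly when no fact in working memory matches, including the case where there are no facts at all, whereas neg always needs an existing degree to invert.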

Complex Logic Formulas #2

In the last post, I introduced logic operators in boolean rules. Drools, in its standard form, supports AND and, in a limited way, OR. In fact, these operators are sufficient to write a number of rules. The addition of the other common logic operators (XOR, EQUIV, IMPLY) is more syntactic sugar than a real valuable feature - in fact, no rule engine supports them openly.

In the presence of imperfection, instead, much changes. A degree is more than a simple boolean and thus carries additional information that can be combined in complex ways. Let's take the conjunction AND as an example. The common, general idea is that the result should tend to true (whatever true means) the more all the operands tend to true individually.
In practice this is a vague constraint that leaves many degrees of freedom.

Consider the basic case: imperfection is used to model fuzziness, and real numbers are used as degrees. This is perhaps the simplest case, since operators are truth-functional (i.e. they just require the degrees of their operands to be evaluated) and degrees themselves are extremely simple.
The logic conjunction of two degrees can be obtained by taking their minimum:

1) d(A && B) = min( d(A) , d(B) )

but also their product:

2) d(A && B) = d(A) * d(B)

or again:

3) d(A && B) = max( 0 , d(A) + d(B) -1 )

These operations (technically called t-norms) are but some - fundamental - examples of a whole family of operations, all of which are candidate implementations of the AND operator.
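The three formulas above translate directly into code. The sketch below uses a made-up class name and plain doubles in [0, 1] as degrees; it is only meant to show how differently the candidates behave on the same inputs.

```java
// The three t-norms from formulas 1) to 3), on degrees in [0, 1].
class TNorms {
    // 1) minimum
    static double min(double a, double b) {
        return Math.min(a, b);
    }

    // 2) product
    static double product(double a, double b) {
        return a * b;
    }

    // 3) max(0, a + b - 1), usually called the Lukasiewicz t-norm
    static double lukasiewicz(double a, double b) {
        return Math.max(0.0, a + b - 1.0);
    }
}
```

For d(A) = 0.7 and d(B) = 0.6 the three give roughly 0.6, 0.42 and 0.3 respectively, so the choice of AND is far from cosmetic. All three agree on the boolean corner cases (0 and 1), and each returns the other operand unchanged when one operand is 1.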

Things do not improve much if one chooses different types of imperfection: take, for example, probability. Supposing, again, that real values model the probability of truth (thus putting ourselves in the simplest probabilistic case), AND can be implemented by taking the product of the operand probabilities - BUT only if the operands are conditionally independent. If that is not the case, the operator will not be truth-functional and thus will have to perform more complicated calculations, possibly argument-dependent:

4) p(A && B) = p(A) * p(B|A)
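A small numeric sketch shows why the plain product breaks down for dependent operands. The joint distribution below is invented for the example, and the class name is hypothetical; the point is only that p(A) * p(B) and p(A) * p(B|A) disagree as soon as A and B are correlated.

```java
// Probabilities read off a 2x2 joint table: joint[a][b] = p(A == a, B == b),
// with index 0 = false and 1 = true. Illustration only.
class JointExample {
    static double pA(double[][] joint) {
        return joint[1][0] + joint[1][1];
    }

    static double pB(double[][] joint) {
        return joint[0][1] + joint[1][1];
    }

    // The true probability of the conjunction.
    static double pAandB(double[][] joint) {
        return joint[1][1];
    }

    // Conditional probability p(B|A) = p(A && B) / p(A); assumes p(A) > 0.
    static double pBgivenA(double[][] joint) {
        return pAandB(joint) / pA(joint);
    }
}
```

With the table { {0.4, 0.1}, {0.1, 0.4} }, p(A) = p(B) = 0.5, so the naive product gives 0.25, while the true p(A && B) is 0.4, which is exactly p(A) * p(B|A) = 0.5 * 0.8. The operator cannot get this right from the two operand degrees alone, which is what "not truth-functional" means here.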

Similar concepts apply to the other operators. An operator, then, is actually an abstract construct that can be customized and configured. To do so, attributes can be attached to each individual operator, choosing one or more among the following:


  • id : an identifier which can be used to reference the operator
  • kind : a string selecting a specific implementation of the operator
  • args : a string containing additional information required to configure the operator


Perhaps the best way to understand them is to imagine the following call:

ID = new OperatorKind(Args)

In fact, a centralized factory is used to instantiate the operators during the construction of the RETE network: it uses the value of kind to choose the concrete classes and args to provide arguments to the constructors. Obviously, the factory can be configured with a default type to return if no kind is specified explicitly.
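The following sketch shows one way such a factory could look. None of these names come from the Drools Chance code base: DegreeOperator, OperatorFactory, register and create are hypothetical, chosen only to mirror the kind/args/default behaviour described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical binary operator on degrees; not a Drools Chance interface.
interface DegreeOperator {
    double eval(double left, double right);
}

// Hypothetical factory: `kind` selects the implementation, `args` configures it.
class OperatorFactory {
    // kind -> constructor taking the raw args string
    private final Map<String, Function<String, DegreeOperator>> kinds = new HashMap<>();
    private final String defaultKind;

    OperatorFactory(String defaultKind) {
        this.defaultKind = defaultKind;
    }

    void register(String kind, Function<String, DegreeOperator> constructor) {
        kinds.put(kind, constructor);
    }

    // Falls back to the default kind when none is specified explicitly.
    DegreeOperator create(String kind, String args) {
        String key = (kind == null) ? defaultKind : kind;
        Function<String, DegreeOperator> constructor = kinds.get(key);
        if (constructor == null) {
            throw new IllegalArgumentException("Unknown operator kind: " + key);
        }
        return constructor.apply(args);
    }
}
```

A factory created with "min" as its default, and with "min" and "product" registered, would then return the minimum operator for create(null, null) and the product operator for create("product", null). The id attribute is not modelled here; it would simply be the name under which the created instance is stored for later reference.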

These attributes can be attached to operators, both within and between Patterns, and also to patterns themselves. The reason is simple: a pattern
Type ( constraints )
is transformed into the conjunction
object.class == Type && constraints
so the attributes are attached to the hidden conjunction.

The exact syntax is shown in the following example (attributes are optional) :


rule "Annotated_Ops_Example"
when
Type1(
field1 == "a"
op_within @( id="..." kind="..." args="..." )
field2 == "b"
)

op_between @( id="..." )

Type2(
field3 == "c"
) @( kind="..." args="..." )

then
...
end


The symbol "@" is used to introduce the metadata between the brackets, which, in the specific case, are given by the pairs attribute/value.

Wednesday, June 24, 2009

One model to rule them all... and in the darkness bind them

It seems to be a bit of a holy grail of Service Oriented Architecture, or in fact for any large organisation, to have a single canonical model/form of all important entities to their business.

After watching a colleague struggle with a 900K WSDL that defines something like that (900K of XML !), I happened to stumble across this interesting and amusing blog post: http://service-architecture.blogspot.com/2009/06/single-canonical-form-only-suicidal.html

Quote:
If SaaS is anywhere in your future, and it will be unless you are a military secure establishment and even then it might me, then GIVE UP NOW on the idea that you can mandate data standards in applications and create a single great big view that represents everything.

I have watched organisations spend millions even trying to define what the most "basic" entity looks like: A Customer! It's hard to even agree on the basics!

So what ends up happening is that some sort of a standard is reached, and projects have to pay an expensive "architecture tax" to use these huge models, fail, and then feel guilty for creating their own little models that at least allow them to build their app.

With modern mapping tech, such as this or this, the cost of mapping between models is much, much less; perhaps it's less than the tax of using huge complex models? (This is directly relevant to the models that rules use: rules *can* use the canonical models, depending on how complex you want them to be, but sometimes it is clearer to use a model tailored for where it is used, and map to it from the external model.)