Saturday, July 31, 2010

Slot Specific and Refraction

Slot Specific and Refraction are two techniques that help with recursive loop problems when writing rules. Drools currently implements neither.

Refraction
Refraction has been around a long time and was part of the original LEX and MEA conflict resolution strategies. When a rule fires for a set of facts (rule + fact[]), those facts usually continue to match the rule. With refraction, when fields on that set of facts are changed, any rule that has already fired against the fact[] is not re-added to the agenda. Simply put, a rule cannot refire for any given set of data, regardless of whether the data is modified by the current rule or by other rules. However, if a modification stops the fact[] from being true for a given rule, and the fact[] later becomes true again for that rule, the rule can fire again.
The OPS5 manual on refraction:

http://www.math-cs.gordon.edu/courses/cs323/OPS5/ops5.html
    REFRACTION

This term comes from the neurobiological observation of a refractory
period for a neuron, which means that the neuron is not able to fire
immediately without first going through a relaxation process. In a
similar way, OPS5 will not allow the same instantiation in the conflict
set from firing twice in a row. This prevents the inference engine from
entering into an infinite loop.
Refraction is supported in OPSJ and JRules; however, it is not supported in Drools, CLIPS or Jess (please correct me if I'm wrong).
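
To make the behaviour described above concrete, here is a minimal sketch of the bookkeeping refraction implies. It is not taken from Drools, OPSJ or JRules: the class RefractionAgenda, its methods and the use of numeric fact ids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical refraction bookkeeping; not code from any real engine. */
class RefractionAgenda {
    // Instantiations (rule name + fact ids) that have fired and are still true.
    private final Set<List<Object>> fired = new HashSet<>();

    private static List<Object> key(String rule, long... factIds) {
        List<Object> key = new ArrayList<>();
        key.add(rule);
        for (long id : factIds) {
            key.add(id);
        }
        return key;
    }

    /** The rule matches these facts (new match, or still true after a modify). Returns true if it may fire. */
    boolean onMatch(String rule, long... factIds) {
        return fired.add(key(rule, factIds)); // false when this instantiation has already fired
    }

    /** A modify or retract made the match false, so a later match may fire again. */
    void onUnmatch(String rule, long... factIds) {
        fired.remove(key(rule, factIds));
    }

    public static void main(String[] args) {
        RefractionAgenda agenda = new RefractionAgenda();
        System.out.println(agenda.onMatch("raiseSalary", 1L)); // true: first firing
        System.out.println(agenda.onMatch("raiseSalary", 1L)); // false: still true, no refire
        agenda.onUnmatch("raiseSalary", 1L);
        System.out.println(agenda.onMatch("raiseSalary", 1L)); // true: became false, then true again
    }
}
```

The key point is that a modify which leaves the match true never reaches the agenda again; only an unmatch clears the record.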

Slot Specific
Jess also recently introduced "slot-specific" as an alternative way of dealing with recursive looping. You can read a thread on "slot-specific" and refraction in Jess here: http://www.mail-archive.com/jess-users@sandia.gov/msg05488.html. Slot specific means a pattern will only propagate a modified fact when one of the fields it constrains on has changed. This means you can modify a value in the consequence and, if the rule does not constrain on that field, it will not refire. CLIPS COOL (CLIPS Object Oriented Language) has this feature as the default for all fields of an Object (but not for deftemplates). However, "slot-specific" is not implemented in Drools, JRules or OPSJ (please correct me if I'm wrong).
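
A rough sketch of the slot-specific check, assuming the engine knows which fields a pattern constrains on and which fields a modify touched. The class and method names are hypothetical and do not come from Jess or CLIPS.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical slot-specific filter for a single pattern. */
class SlotSpecificPattern {
    private final Set<String> constrainedFields;

    SlotSpecificPattern(String... constrainedFields) {
        this.constrainedFields = new HashSet<>(Arrays.asList(constrainedFields));
    }

    /** Propagate a modify only if it touched at least one field this pattern constrains on. */
    boolean shouldPropagate(Set<String> changedFields) {
        return !Collections.disjoint(constrainedFields, changedFields);
    }

    public static void main(String[] args) {
        // Stands for a pattern such as Person( age > 18 ), which constrains only on "age".
        SlotSpecificPattern adult = new SlotSpecificPattern("age");
        System.out.println(adult.shouldPropagate(Collections.singleton("name"))); // false
        System.out.println(adult.shouldPropagate(Collections.singleton("age")));  // true
    }
}
```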

Refraction + Slot-Specific + onChange
I think there are advantages and disadvantages to each idea and I've been trying to think of better ways to deal with recursion in Drools and realised the two can be used together.

In Drools 6.0 I propose refraction to be the default behaviour for the engine, so no rule that has fired and is still true can be reactivated by a modify without first being made false. However, "slot-specific" can override this, allowing a pattern to react to changes on specific fields. With "Differential Diff", previously called "True modify", refraction is trivial to support. A performant slot-specific implementation will not be trivial, especially with our support for nested accessors.
http://blog.athico.com/2010/03/drools-halves-memory-use-with-new-true.html
http://blog.athico.com/2010/01/rete-and-true-modify.html

I propose "onChange" property listeners as a mechanism to specify which slots can receive modification updates. "onChange" would be a magical field supported in patterns that specifies whether a pattern listens to and propagates changes. It would give us "slot-specific" type semantics, but with more user control and flexibility. It takes an array of field names; here are example semantics:

Person( onChange == [name]) // listen to any "name" property changes, if "name" is changed then propagate, other property changes will not be propagated.
// Notice we don't need quotes as property names never have spaces

Person( onChange == [name, age, location]) // listens and propagates "name", "age" and "location" changes.

Person( onChange == [*] ) // listen to all properties and propagates, this is the behaviour of current Drools.

Person( onChange == [!name, *]) // listen to all properties except "name".

Person( onChange == [name, *]) // is allowed but obviously redundant

Person( onChange == [name, !name, *]) // is not allowed

By default Drools will not listen to any field changes:
Person() // will match just once and never refire again for any given Person instance
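
The semantics listed above can be sketched in a few lines of Java. This is only an illustration of the proposal, not an implementation plan: OnChangeSpec and listensTo are made-up names, and the handling of a lone "!name" without "*" (treated here as listening to nothing) is my assumption, since the examples do not cover it.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the proposed onChange list for one pattern. */
class OnChangeSpec {
    private final boolean all;                             // true when "*" is present
    private final Set<String> included = new HashSet<>();  // e.g. name
    private final Set<String> excluded = new HashSet<>();  // e.g. !name

    OnChangeSpec(String... entries) {
        boolean wildcard = false;
        for (String entry : entries) {
            if (entry.equals("*")) {
                wildcard = true;
            } else if (entry.startsWith("!")) {
                excluded.add(entry.substring(1));
            } else {
                included.add(entry);
            }
        }
        // [name, !name, *] is contradictory and therefore rejected.
        for (String name : included) {
            if (excluded.contains(name)) {
                throw new IllegalArgumentException("'" + name + "' is both included and excluded");
            }
        }
        this.all = wildcard;
    }

    /** True if a change to this property should be propagated by the pattern. */
    boolean listensTo(String property) {
        if (excluded.contains(property)) {
            return false;
        }
        return all || included.contains(property);
    }

    public static void main(String[] args) {
        System.out.println(new OnChangeSpec("name").listensTo("age"));        // false
        System.out.println(new OnChangeSpec("!name", "*").listensTo("age"));  // true
        System.out.println(new OnChangeSpec().listensTo("name"));             // false: the proposed default
    }
}
```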

All in all onChange allows the user to take advantage of slot-specific and refraction qualities, while not locking the user into either. I've also previously discussed constraining on previous values of a field, which would give even more fine grained control.

Field Versioning
There are times when you need to compare the current and previous values of a field. Users can do this now with intermediary facts, i.e. inserting an Event to represent the before and after values of a field change, but it's a little clunky. Instead we can build support for this into the language, using an @ver(int) attribute. The idea is that Drools would store previous values only on demand, so you only pay the cost if you use this feature. The value for @ver is effectively a "diff" value counting backwards from 0, which is now. So @ver(0) is the current field value, @ver(-1) is the previous field value, and @ver(-2) is the value 2 versions ago.

SomeFact( fieldName != fieldName @ver( -1 ) )

So any field with no @ver is effectively @ver(0), and you could write it as such:
SomeFact( @ver(0) fieldName != fieldName @ver( -1 ) )

We can allow bindings to previous versions:
SomeFact( $var : @ver(-2) fieldName )
OtherFact( field == $var )

We should also support the ability to add a range of values to a list, for processing with accumulate:
SomeFact( $list : @ver(0....-5) fieldName )
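
To illustrate the counting scheme, here is a small Java sketch of a field that remembers a bounded number of previous values. It only models the @ver offsets described above; VersionedField, its maxVersions bound and the choice to return null for a version that was never recorded are assumptions of mine, not part of the proposal.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder that keeps the last few values of one field. */
class VersionedField<T> {
    private final List<T> history = new ArrayList<>(); // oldest value first
    private final int maxVersions;

    VersionedField(T initial, int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be at least 1");
        }
        this.maxVersions = maxVersions;
        history.add(initial);
    }

    void set(T value) {
        history.add(value);
        if (history.size() > maxVersions) {
            history.remove(0); // forget the oldest version
        }
    }

    /** offset 0 is the current value, -1 the previous one, and so on; null if not recorded. */
    T ver(int offset) {
        if (offset > 0) {
            throw new IllegalArgumentException("offset must be 0 or negative");
        }
        int index = history.size() - 1 + offset;
        return index >= 0 ? history.get(index) : null;
    }

    /** Values from offset 'from' down to 'to' inclusive, newest first, skipping unrecorded versions. */
    List<T> range(int from, int to) {
        if (from > 0 || to > from) {
            throw new IllegalArgumentException("expected 0 >= from >= to");
        }
        List<T> values = new ArrayList<>();
        for (int offset = from; offset >= to; offset--) {
            int index = history.size() - 1 + offset;
            if (index >= 0) {
                values.add(history.get(index));
            }
        }
        return values;
    }
}
```

With this model, a constraint like fieldName != fieldName @ver( -1 ) corresponds to comparing ver(0) with ver(-1).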

Tuesday, July 27, 2010

Two Great October 2010 Events : Rules Fest and RuleML

Members of the Drools and jBPM teams and I will be attending two conferences in October: Rules Fest and RuleML. Both are great conferences that I strongly recommend to anyone wanting to increase their understanding of rules, CEP, workflow or semantics. Rules Fest will also be running an all-day Drools Boot Camp, where we will cover Drools basics and help people with questions on their projects and architectures.

RuleML


The International Web Rule Symposium has evolved from an annual series of international workshops since 2002, international conferences in 2005 and 2006, and international symposia since 2007. This year, the 4th International Web Rule Symposium (RuleML-2010) will be held near Washington, DC, USA. RuleML-2010 is devoted to practical distributed rule technologies and rule-based applications, which need language standards for rules (inter)operating in, e.g., the Semantic Web, Enterprise Systems, Intelligent Multi-Agent Systems, Event-Driven Architectures, and Service-Oriented Applications.
Want to Sponsor RuleML?

Registration, Early Bird Specials still available

Rules Fest 2010


Rules Fest™ is the world's only technical conference devoted to the practical application of all rule-based and knowledge-based reasoning, inferencing, and decisioning technologies.

Rules Fest brings you the best and brightest speakers from industry, academia, and private research to share practical knowledge and techniques for creating, utilizing, and managing software that incorporates rule engines, inference engines, logical reasoners, or other rule-based and reasoning technologies.

Rules Fest exists to serve the:

  • Architects,
  • Engineers,
  • Developers, and
  • Programmers

who use these technologies to solve complex information processing and decision-making problems.

Want to Sponsor Rules Fest
Registration, Early Bird Specials still available

Friday, July 23, 2010

Glazed Lists examples for Drools Live Queries

A while back I talked about the new Live Queries feature in Drools:
http://blog.athico.com/2010/05/live-querries.html

With Live Queries you can open a query in Drools and receive event notifications for added, deleted and updated rows. I mentioned this could be used with Glazed Lists for filtering, sorting and transformation.

I just added a unit test to Drools, which people can use as a template for their own Drools integration with Glazed Lists. The test is based on the one in QueryTest.testOpenQuery():
DroolsEventList
DroolsEventListTest

The EventList implementation itself is very simple. At the moment it backs onto an ArrayList and uses linear searches for the updates and removes. Because Drools is likely to have a high volume of changes, it should probably be backed by a HashMap or something similar, to give constant-time performance for those lookups.
import java.util.ArrayList;
import java.util.List;

import ca.odell.glazedlists.AbstractEventList;
import org.drools.runtime.rule.Row;
import org.drools.runtime.rule.ViewChangedEventListener;

public class DroolsEventList extends AbstractEventList<Row> implements ViewChangedEventListener {
    // Backing store for the rows currently in the query result
    List<Row> data = new ArrayList<Row>();

    public Row get(int index) {
        return this.data.get( index );
    }

    public int size() {
        return this.data.size();
    }

    // Called by Drools when a new row matches the open query
    public void rowAdded(Row row) {
        int index = size();
        updates.beginEvent();
        updates.elementInserted(index, row);
        data.add(row);
        updates.commitEvent();
    }

    // Called by Drools when a row no longer matches; indexOf() is a linear search
    public void rowRemoved(Row row) {
        int index = this.data.indexOf( row );
        updates.beginEvent();
        Row removed = data.remove( index );
        updates.elementDeleted(index, removed);
        updates.commitEvent();
    }

    // Called by Drools when a matching row changes; indexOf() is a linear search
    public void rowUpdated(Row row) {
        int index = this.data.indexOf( row );
        updates.beginEvent();
        updates.elementUpdated(index, row, row);
        updates.commitEvent();
    }
}
Creating and using the EventList is also trivial; here is a snippet from the test using a Glazed Lists SortedList:
DroolsEventList list = new DroolsEventList();
// Open the LiveQuery
LiveQuery query = ksession.openLiveQuery( "cheeses", new Object[] { "cheddar", "stilton" } , list );

// Sort the rows by the price of the "stilton" cheese in each row
SortedList<Row> sorted = new SortedList<Row>( list, new Comparator<Row>() {
    public int compare(Row r1, Row r2) {
        Cheese c1 = ( Cheese ) r1.get( "stilton" );
        Cheese c2 = ( Cheese ) r2.get( "stilton" );
        return c1.getPrice() - c2.getPrice();
    }
});

assertEquals( 3, sorted.size() );
assertEquals( 1, ((Cheese)sorted.get( 0 ).get( "stilton" )).getPrice() );
assertEquals( 2, ((Cheese)sorted.get( 1 ).get( "stilton" )).getPrice() );
assertEquals( 3, ((Cheese)sorted.get( 2 ).get( "stilton" )).getPrice() );

Wednesday, July 21, 2010

Declarative REST Services for Drools using Spring, Camel and CXF

5.1CR1 has just been tagged and is being released. While that's happening, I thought I'd blog about the new declarative services for drools-server.

For those wanting to just dive in, download this .war and unzip it into Tomcat.
https://hudson.jboss.org/hudson/job/drools/lastSuccessfulBuild/artifact/trunk/target/drools-5.1.0.SNAPSHOT-server.war

Once that's unzipped you should be able to look at and run test.jsp to see it working. This example just executes a simple "echo" type application. It sends a message to the rule server, which prepends the word "echo" to the front and sends it back. By default the message is "Hello World"; different messages can be passed using the url parameter msg, e.g. test.jsp?msg="My Custom Message".

Under the hood the jsp invokes the Test.java class, which then calls out to Camel, which is where the meat of the work happens. The camel-client.xml defines the client with just a few lines of xml:

<camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">     
   <route>
      <from uri="direct://kservice"/>
      <policy ref="droolsPolicy">
         <to uri="cxfrs://http://localhost:8080/drools-server-app/kservice/rest"/>
      </policy>
   </route>         
 </camelContext>

"direct://kservice" is just a named hook, allowing java to grab a reference and push data into it. In this example the data is already in xml, so we don't need to add any DataFormats to do the marshalling. The DroolsPolicy adds some smarts to the route and you'll see it used on the server side too. If JAXB or XStream were used, it would inject custom paths and converters. On the server side it can also set the classloader, and on the client side it automatically unwraps the Response object.

Configuring a Rest server with Spring and Camel is just a few lines of xml:

<cxf:rsServer id="rsServer"
              address="/kservice/rest"
              serviceClass="org.drools.jax.rs.CommandExecutorImpl">
   <cxf:providers>
      <bean class="org.drools.jax.rs.CommandMessageBodyReader"/>
   </cxf:providers>
</cxf:rsServer>

With the server configured we can now set up our Camel route using Spring. This will unmarshal incoming payloads using xstream before executing against the drools runtime named "ksession1". The policy augments the XStream converter with some custom converters for Drools objects, as well as setting the ClassLoader to the one used by the ksession.

<bean id="droolsPolicy" class="org.drools.camel.component.DroolsPolicy" />  
   
<camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">        
   <route>
      <from uri="cxfrs://bean://rsServer"/>
      <policy ref="droolsPolicy">
         <unmarshal ref="xstream" />       
         <to uri="drools:node1/ksession1" />
         <marshal ref="xstream" />
      </policy>
   </route>           
</camelContext>

The final bit is the declaration of the Drools services themselves:

<drools:execution-node id="node1" />
 
<drools:kbase id="kbase1" node="node1">
   <drools:resources>
      <drools:resource type="DRL" source="classpath:test.drl"/>
   </drools:resources>
</drools:kbase>
        
<drools:ksession id="ksession1" type="stateless" kbase="kbase1" node="node1"/>  

The execution-node is optional and could have been left out; its role is to provide a context to store multiple ksessions. It then allows one rest endpoint to execute against those named ksessions, based on the given name, supplied either in the header or as an attribute in the root element.

The rule itself can be found here: test.drl. Notice that the type Message is declared as part of the drl and is thus not present on the classpath.

declare Message
   text : String
end
   
 
rule "echo" dialect "mvel"
when
   $m : Message();
then
   $m.text = "echo:" + $m.text;
end


Lucaz has also done a write up of the new Drools Server, http://lucazamador.wordpress.com/2010/07/20/drools-server-configuration-updated/.

Sunday, July 11, 2010

Simulated Annealing: a new algorithm for Drools Planner

Drools Planner has a very good tabu search implementation, but tabu search is just one meta-heuristic algorithm. There are plenty more, such as simulated annealing, great deluge, late acceptance, ...

Until recently, the simulated annealing implementation in Drools Planner was very experimental (not to say useless). But a little tender care and attention (and the pressure of a nurse rostering competition deadline) changed that in the last month.

Here's a comparison (created by the Benchmarker) between tabu search and the new simulated annealing, in which both configurations are roughly tweaked:



As you can see, for these 3 testdata sets, simulated annealing beats tabu search. This holds true for the other medium (10 minute) datasets of the nurse rostering example too, but not for all the other examples. So if you're using Drools Planner with tabu search, you might want to replace it with simulated annealing (from version 5.1.0.CR1). And that's easy! Just change a few lines. Or use the Benchmarker to compare both and select a winner:

<solverBenchmark>
  <name>tabuSearch</name>
  <localSearchSolver>
    <selector>
      ...
    </selector>
    <acceptor>
      <completeSolutionTabuSize>1000</completeSolutionTabuSize>
      <completePropertyTabuSize>11</completePropertyTabuSize>
    </acceptor>
    <forager>
      <minimalAcceptedSelection>800</minimalAcceptedSelection>
    </forager>
  </localSearchSolver>
</solverBenchmark>
<solverBenchmark>
  <name>simulatedAnnealing</name>
  <localSearchSolver>
    <selector>
      ...
    </selector>
    <acceptor>
      <simulatedAnnealingStartingTemperature>10.0</simulatedAnnealingStartingTemperature>
      <completePropertyTabuSize>5</completePropertyTabuSize>
    </acceptor>
    <forager>
      <pickEarlyType>FIRST_BEST_SCORE_IMPROVING</pickEarlyType>
      <minimalAcceptedSelection>4</minimalAcceptedSelection>
    </forager>
  </localSearchSolver>
</solverBenchmark>

Notice that I have even used a little bit of tabu in the simulated annealing configuration, although I haven't experimented much yet to prove that that is indeed better.

Below are best score over time graphs of each of the 3 testdata sets:


In medium05, the simulated annealing algorithm kicks in very late, apparently because a simulatedAnnealingStartingTemperature of 10.0 is way too high for it.
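
To see why a high starting temperature can delay progress, consider the textbook simulated annealing acceptance rule: a worsening move is accepted with probability exp(delta / temperature), where delta is the (negative) score change. The sketch below uses that textbook rule and assumes higher scores are better; it is not the Drools Planner implementation, and the class and method names are made up.

```java
import java.util.Random;

/** Textbook acceptance rule for simulated annealing; hypothetical, not Drools Planner code. */
class SimulatedAnnealingAcceptor {

    /** scoreDelta = candidate score - current score (higher is better); temperature must be > 0. */
    static double acceptanceProbability(double scoreDelta, double temperature) {
        if (temperature <= 0.0) {
            throw new IllegalArgumentException("temperature must be positive");
        }
        if (scoreDelta >= 0.0) {
            return 1.0; // improving or equal moves are always accepted
        }
        return Math.exp(scoreDelta / temperature);
    }

    static boolean accept(double scoreDelta, double temperature, Random random) {
        return random.nextDouble() < acceptanceProbability(scoreDelta, temperature);
    }

    public static void main(String[] args) {
        // A move that worsens the score by 1:
        System.out.println(acceptanceProbability(-1.0, 10.0)); // about 0.905: nearly always accepted
        System.out.println(acceptanceProbability(-1.0, 0.5));  // about 0.135: mostly rejected
    }
}
```

Under this rule, at a temperature of 10.0 the search accepts most small worsening moves and behaves almost like a random walk until the temperature has dropped, which would be consistent with the late start seen in medium05.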



In medium_hint03, the 2 algorithms are competitive at first, but later on simulated annealing is much better at escaping local optima.



In medium_late05, simulated annealing is always better than tabu search.

So what's next? Good implementations of the other meta-heuristics of course, but more importantly: phasing. Phasing can chain several meta-heuristics in a single run. For example, it can use simulated annealing the first 80% of the time, but switch to tabu search for the last 20% of the time.

Thursday, July 08, 2010

Events: "To be, or not to be: that is the question"

Drools, as readers know, is pushing the boundaries of conservative (read "conservative" in the "bad" sense) software modeling and proving that there is a strong synergy between different declarative modeling technologies. More than that, a solution designer should have the freedom to choose the best metaphor for each of the model's components, without worrying about the actual technology required to handle (read execute) such a model.

That is why, out of the box, Drools provides a platform that allows you to model processes and rules and to process events, using a single seamless and unified environment.

However, such power and flexibility requires that users learn the different tools/features at their disposal, and preferably choose the best fit for the job.

A simplistic analogy is the classic: "give someone a hammer, and everything will look like a nail". Stretch that scenario and imagine we also give him a screwdriver and he will have additional options:
  • he can continue to use only the hammer, and results will stay the same: still sub-optimal.
  • he can start handling screws more efficiently with the screwdriver, and we might imagine he will get better results.
  • or he can try to use the screwdriver for everything, including plain old nails, and results might be worse than they were with the old and reliable hammer.
In the pure open source spirit we always try to help users, and one of the most frequent confusions I have faced since the inception of Drools Fusion is users starting to see everything as an event. Like the third scenario above, that sometimes makes things more difficult for them than they should be.

The heart of the issue I would like to discuss is: what is an event?

There are several definitions for what an event is, but the one I like best is actually very simple:

"An event is a record of a significant change of state in a given point in time."

What I always tell people when I am asked if something is an event or not, is to use this definition as a starting guideline to decide. There are 3 important aspects to it:
  • An event is significant to the system: if your system deals with electronic air tickets sale, you probably will not care about a car accident in some random road, although the same event might be extremely important to a road traffic control system. Sounds obvious, but you would be surprised by how many users fall into this trap.
  • An event represents a change of state: if you are monitoring a given sensor, let's say a power outage sensor, repeated events stating that power levels were ok at 2pm are probably useless after the first one was acknowledged. Please note that if the events report on different points in time (like 2:05pm, 2:10pm, etc.), they still represent changes of state, from unknown to known, at those points in time.
  • An event has an associated timestamp: there usually is no value in knowing that a building was on fire if I don't know whether that happened 100 years ago or is happening now, as sending firefighters there now for something that happened 100 years ago will probably be ineffective.
If your entity under analysis fails any of the above criteria, you can rule it out as an event. It might still be a fact though, as, from a Business Logic perspective:

"All events are facts, but not all facts are events".

Using the 3rd example above, for an insurance system it might be worth knowing that a building was on fire in the past, even if I don't know when. In that case, that information would be a fact for my system, although not an event.

So, after knowing all that, I ask you the question:

Imagine an insurance claim processing system, is a claim an event?

And the answer is: "it depends!"

"WHAT?" you might ask. Yes, it depends. In the end, events and regular facts are just modeling tools at our disposal, and you might model a claim as either one. It depends on what your system will do with such a claim. More often than not, there will be two entities in your model: a Claim fact representing the transactional (and mutable) data in your system, and a ClaimEvent representing the actual claims coming into the system (remember you can't change the past, so events should be treated as immutable instances). But I digress. Back to the point.

What do I do when **I** analyze a problem domain and have to decide on how to model an entity? Is it an event or a regular fact?
  1. I go over the above checklist. If the entity fails any of those criteria, it is not an event.
  2. If it passes the criteria, I proceed to the practical aspects. Dealing with the temporal dimension is a complex matter that most users do not comprehend until they start dealing with it. There are only 2 features that can not be applied to regular facts: Sliding windows and automatic event life-cycle management (check the manual if you don't know what they are). If I won't need any of these features for the entity in question, I model it as a fact. I can always change my mind later and convert a fact into an event.
  3. If after the previous analysis I realize that I have an event in my hands, I will model it as an event and deal with it.
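
The three steps above can be written down as a tiny decision helper. This is only a restatement of the checklist in code; the class, method and parameter names are made up for illustration and nothing here is a Drools API.

```java
/** Hypothetical helper that restates the event-or-fact checklist; not part of Drools. */
class EventChecklist {

    /**
     * Returns true if the entity should be modeled as an event,
     * false if it should be modeled as a regular fact (if it is modeled at all).
     */
    static boolean modelAsEvent(boolean significantToSystem,
                                boolean representsChangeOfState,
                                boolean hasTimestamp,
                                boolean needsSlidingWindows,
                                boolean needsAutomaticLifecycle) {
        // Step 1: the definition. Failing any criterion rules the entity out as an event.
        boolean meetsDefinition = significantToSystem && representsChangeOfState && hasTimestamp;
        if (!meetsDefinition) {
            return false;
        }
        // Step 2: the practical side. Only sliding windows and automatic
        // life-cycle management are unavailable to regular facts.
        // Step 3: if one of them is needed, it really is an event.
        return needsSlidingWindows || needsAutomaticLifecycle;
    }

    public static void main(String[] args) {
        // A fire with an unknown date: significant and a change of state, but no timestamp.
        System.out.println(modelAsEvent(true, true, false, false, false)); // false
        // Incoming claims that need to be counted over a sliding window.
        System.out.println(modelAsEvent(true, true, true, true, false));   // true
    }
}
```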

With that I am not saying that you should not use events. On the contrary, please use the best features for the job at hand. It is just that sometimes we become so excited by the shiny new features that we forget that a hammer is still the best tool for plain old nails.

PS: this post is intentionally directed to users with a Rules Engine background coming into the event processing world. One might realize in the future that users coming from the event processing world might use an analogous post regarding Rules Engines features.

Happy Drooling,
Edson