Tuesday, December 24, 2013

Drools 6.0 Workbench and Editor Videos

6 workbench introduction videos. Watch full screen in 720p HD. Press Play All.
http://www.youtube.com/playlist?list=PLb9jQNHBKBRj9IJkc_F5nCJAvXaegOGW8


7 editor metaphor videos. Watch full screen in 720p HD. Press Play All.
http://www.youtube.com/playlist?list=PLb9jQNHBKBRipbtadRC-UaUObjwp0aBHJ


KIE Workbench
Drools Editor Metaphors








Sunday, December 22, 2013

Deployment with Drools 6.0

KieScanner
The 6.0 KieScanner replaces the 5.x KnowledgeAgent. It uses embedded Maven to resolve and retrieve jars at runtime, so 6.0 applications can now easily support dependencies and transitive dependencies, using well-known Maven semantics for versioning. It allows for deployment on the classpath and also dynamically at runtime. Currently it supports manual "scanNow" and interval polling; remoting will be added in the future. A KieScanner can be registered on a KieContainer as in the following example:

KieServices kieServices = KieServices.Factory.get();
ReleaseId releaseId = kieServices.newReleaseId( "org.acme", "myartifact", "1.0-SNAPSHOT" );
KieContainer kContainer = kieServices.newKieContainer( releaseId );
KieScanner kScanner = kieServices.newKieScanner( kContainer );
// Start the KieScanner polling the Maven repository every 10 seconds
kScanner.start( 10000L );

In this example the KieScanner is configured to run with a fixed time interval, but it is also possible to run it on demand by invoking the scanNow() method on it. If the KieScanner finds in the Maven repository an updated version of the Kie project used by that KieContainer it automatically downloads the new version and triggers an incremental build of the new project. From this moment all the new KieBases and KieSessions created from that KieContainer will use the new project version.
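Conceptually, the scanNow() behaviour boils down to "compare the newest deployed version against the version last built, and rebuild on change". The following stdlib-only Java sketch illustrates that idea with mock classes (MockRepository and MockScanner are illustrative names, not part of the KIE API):

```java
// Stdlib-only sketch of the scanNow() idea: the scanner compares the
// version it last built against the newest version in a (mock) repository
// and triggers a rebuild only when they differ.
public class ScanNowSketch {

    static class MockRepository {
        private String latestVersion = "1.0-SNAPSHOT-build1";
        void deploy(String version) { latestVersion = version; }
        String latest() { return latestVersion; }
    }

    static class MockScanner {
        private final MockRepository repo;
        private String builtVersion;
        int rebuilds = 0;

        MockScanner(MockRepository repo) {
            this.repo = repo;
            this.builtVersion = repo.latest();
        }

        // Mirrors the idea behind KieScanner.scanNow(): poll once,
        // rebuild on change, do nothing otherwise.
        void scanNow() {
            String latest = repo.latest();
            if (!latest.equals(builtVersion)) {
                builtVersion = latest;
                rebuilds++; // stands in for the incremental build
            }
        }
    }

    public static void main(String[] args) {
        MockRepository repo = new MockRepository();
        MockScanner scanner = new MockScanner(repo);

        scanner.scanNow();                  // nothing new: no rebuild
        repo.deploy("1.0-SNAPSHOT-build2"); // someone deploys an update
        scanner.scanNow();                  // change detected: rebuild

        System.out.println(scanner.rebuilds); // prints 1
    }
}
```

After a real rebuild, new KieBases and KieSessions created from the KieContainer use the new version, exactly as the paragraph above describes.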

Installation
Deployment
Settings.xml and Remote Repository Setup 
The Maven settings.xml is used to configure Maven execution. Detailed instructions can be found at the Maven website: http://maven.apache.org/settings.html The settings.xml file can be located in three places; the actual settings used are a merge of all three.

  • The Maven install: $M2_HOME/conf/settings.xml 
  • A user's install: ${user.home}/.m2/settings.xml 
  • A folder location specified by the system property kie.maven.settings.custom

The settings.xml is used to specify the location of remote repositories. It is important that you activate the profile that specifies the remote repository, typically this can be done using "activeByDefault":
<profiles>
  <profile>
    <id>profile-1</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    ...
  </profile>
</profiles>
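Inside that active profile you would then declare the remote repositories themselves. A minimal sketch (the id and URL below are placeholders, not a real repository):

```xml
<profiles>
  <profile>
    <id>profile-1</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <repositories>
      <repository>
        <id>my-remote-repo</id>
        <url>http://repo.example.com/maven2</url>
        <snapshots>
          <enabled>true</enabled>
          <updatePolicy>always</updatePolicy>
        </snapshots>
      </repository>
    </repositories>
  </profile>
</profiles>
```

Setting the snapshot updatePolicy to "always" matters for the KieScanner scenario above, since by default Maven only checks for new SNAPSHOTs once a day.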

Maven Versions and Dependencies
Maven supports a number of mechanisms to manage versioning and dependencies within applications. Modules can be published with specific version numbers, or they can use the SNAPSHOT suffix. Dependencies can specify version ranges to consume, or take advantage of the SNAPSHOT mechanism.

StackOverflow provides a very good description for this, which is reproduced below. http://stackoverflow.com/questions/30571/how-do-i-tell-maven-to-use-the-latest-version-of-a-dependency

If you always want to use the newest version, Maven has two keywords you can use as an alternative to version ranges. You should use these options with care as you are no longer in control of the plugins/dependencies you are using.

When you depend on a plugin or a dependency, you can use the a version value of LATEST or RELEASE. LATEST refers to the latest released or snapshot version of a particular artifact, the most recently deployed artifact in a particular repository. RELEASE refers to the last non-snapshot release in the repository. In general, it is not a best practice to design software which depends on a non-specific version of an artifact. If you are developing software, you might want to use RELEASE or LATEST as a convenience so that you don't have to update version numbers when a new release of a third-party library is released. When you release software, you should always make sure that your project depends on specific versions to reduce the chances of your build or your project being affected by a software release not under your control. Use LATEST and RELEASE with caution, if at all.

See the POM Syntax section of the Maven book for more details.

http://books.sonatype.com/mvnref-book/reference/pom-relationships-sect-pom-syntax.html
http://books.sonatype.com/mvnref-book/reference/pom-relationships-sect-project-dependencies.html

Here's an example illustrating the various options. In the Maven repository, com.foo:my-foo has the following metadata:
<metadata>
  <groupId>com.foo</groupId>
  <artifactId>my-foo</artifactId>
  <version>2.0.0</version>
  <versioning>
    <release>1.1.1</release>
    <versions>
      <version>1.0</version>
      <version>1.0.1</version>
      <version>1.1</version>
      <version>1.1.1</version>
      <version>2.0.0</version>
    </versions>
    <lastUpdated>20090722140000</lastUpdated>
  </versioning>
</metadata>

If a dependency on that artifact is required, you have the following options (other version ranges can be specified of course, just showing the relevant ones here): Declare an exact version (will always resolve to 1.0.1):
<version>[1.0.1]</version>
Declare an explicit version (will always resolve to 1.0.1 unless a collision occurs, when Maven will select a matching version):
<version>1.0.1</version>
Declare a version range for all 1.x (will currently resolve to 1.1.1):
<version>[1.0.0,2.0.0)</version>
Declare an open-ended version range (will resolve to 2.0.0):
<version>[1.0.0,)</version>
Declare the version as LATEST (will resolve to 2.0.0):
<version>LATEST</version>
Declare the version as RELEASE (will resolve to 1.1.1):
<version>RELEASE</version>
Note that by default your own deployments will update the "latest" entry in the Maven metadata, but to update the "release" entry, you need to activate the "release-profile" from the Maven super POM. You can do this with either "-Prelease-profile" or "-DperformRelease=true".
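To make the range resolution above concrete, here is a small stdlib-only Java sketch of how a half-open range like [1.0.0,2.0.0) picks the highest matching version. Note this is a simplified comparator for numeric dotted versions only, not Maven's actual ComparableVersion (which also handles qualifiers like -SNAPSHOT):

```java
import java.util.Arrays;
import java.util.List;

// Picks the highest version within a half-open range [low, high),
// mimicking how a range resolves against the repository metadata
// shown above. Simplified: numeric dotted versions only.
public class VersionRangeSketch {

    // Compare "1.0.1" vs "1.1" numerically, segment by segment;
    // missing segments count as 0 (so "1.0" == "1.0.0").
    static int compare(String a, String b) {
        String[] as = a.split("\\."), bs = b.split("\\.");
        int n = Math.max(as.length, bs.length);
        for (int i = 0; i < n; i++) {
            int ai = i < as.length ? Integer.parseInt(as[i]) : 0;
            int bi = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (ai != bi) return Integer.compare(ai, bi);
        }
        return 0;
    }

    // Highest available version v with low <= v < high.
    static String resolveRange(List<String> available, String low, String high) {
        return available.stream()
                .filter(v -> compare(v, low) >= 0 && compare(v, high) < 0)
                .max(VersionRangeSketch::compare)
                .orElse(null);
    }

    public static void main(String[] args) {
        List<String> versions =
                Arrays.asList("1.0", "1.0.1", "1.1", "1.1.1", "2.0.0");
        // [1.0.0,2.0.0) -> the highest 1.x available
        System.out.println(resolveRange(versions, "1.0.0", "2.0.0")); // 1.1.1
        // [1.0.0,) modelled here with a very high upper bound
        System.out.println(resolveRange(versions, "1.0.0", "999"));   // 2.0.0
    }
}
```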


Monday, December 16, 2013

Results from the Drools & jBPM Workshop @ Bcn (10th Dec)

I'm quite happy to say that the Barcelona Workshop was a complete success. Once again I would like to thank the @BarcelonaJug for all the effort they put into organising these meetups. This post shares some pics and some overall comments about the event.

The Barcelona Java User Group

Here are some pics of the group. With almost 30 attendees interested in Drools & jBPM, the workshop was a success; all of the attendees managed to run and play with the KIE Workbench:


If you are working at a company based in Barcelona, I strongly suggest you support your local JUG. It will help you build up your knowledge, bring in expert people from abroad, and share the knowledge being produced locally by the other members of the group.

Do the Workshop on your own Laptop


Once again I'm sharing the files that we distributed at the workshop so you can try it in your own environment. If you are interested in building all the projects from the source code, let me know and I can help you set up the environment to get you started.

Feedback from the Community

As usual, when we present the community projects we gather a lot of feedback from the people testing the tools, and we take notes to improve the user experience. I've noticed from this last meetup that there is a lot of interest in the project internals. For that reason I've proposed another meetup, probably for the beginning of February, to cover Drools & jBPM from the framework perspective, without showing the tooling. I do believe that knowing the framework internals is interesting for most developers, but at the same time knowing the tools gives a higher-level understanding of the main purpose of these frameworks. The UI provided by the KIE Workbench obviously targets a wider audience, showing how these tools can help the whole company spectrum with a knowledge-driven solution.
As a reminder, all the software we showed at the meetup was built from the master branch of the source code repositories, so you can expect some issues that are still being solved. The main idea of showing the community builds is to encourage people to participate in these community projects.
Feel free to leave a comment on this blog post with suggestions about what topics you would like to see in an internals talk about Drools & jBPM. I usually like to talk about both frameworks because they share a lot of common topics that need to be discussed in the same context.


Wednesday, December 11, 2013

Construction Heuristics with multiple variables or multiple planning entities

Lately, OptaPlanner users have started asking how to configure the construction heuristics for multiple variables or multiple planning entities. I've just enriched the docs to explain that better. Until those are published, I've copied the relevant section here.

Note that this is for advanced users, for normal users the simple configuration should suffice.


8.6. Advanced Greedy Fit

8.6.1. Algorithm description

Advanced Greedy Fit is a versatile, generic form of First Fit, First Fit Decreasing, Best Fit and Best Fit Decreasing.

8.6.2. Configuration

A Best Fit Decreasing configuration for a single entity class with a single variable (which is the verbose version of the simple constructionHeuristicType BEST_FIT_DECREASING configuration):

  <constructionHeuristic>
    <queuedEntityPlacer>
      <entitySelector id="placerEntitySelector">
        <cacheType>PHASE</cacheType>
        <selectionOrder>SORTED</selectionOrder>
        <sorterManner>DECREASING_DIFFICULTY</sorterManner>
      </entitySelector>
      <changeMoveSelector>
        <entitySelector mimicSelectorRef="placerEntitySelector"/>
        <valueSelector>
          <cacheType>PHASE</cacheType>
          <selectionOrder>SORTED</selectionOrder>
          <sorterManner>INCREASING_STRENGTH</sorterManner>
        </valueSelector>
      </changeMoveSelector>
    </queuedEntityPlacer>
    <!--<forager>-->
      <!--<pickEarlyType>FIRST_NON_DETERIORATING_SCORE</pickEarlyType>-->
    <!--</forager>-->
  </constructionHeuristic>

Per step, the QueuedEntityPlacer selects 1 uninitialized entity from the EntitySelector and applies the winning Move (out of all the moves for that entity generated by the MoveSelector). The mimic selection ensures that the winning Move changes (only) the selected entity.

To customize the entity or value sorting, see sorted selection. Other Selector customization (such as filtering) is supported too.

8.6.3. Multiple variables

There are 2 ways to deal with multiple variables, depending on how their ChangeMoves are combined:

  • Cartesian product of the ChangeMoves (default): all variables of the selected entity are assigned together. This has far better results (especially for timetabling use cases).
  • Sequential ChangeMoves: one variable is assigned at a time. This scales much better, especially for 3 or more variables.

For example, presume a course scheduling problem with 200 rooms and 40 periods.

This First Fit configuration for a single entity class with 2 variables, using a cartesian product of their ChangeMoves, will select 8000 moves per entity:

  <constructionHeuristic>
    <queuedEntityPlacer>
      <entitySelector id="placerEntitySelector">
        <cacheType>PHASE</cacheType>
      </entitySelector>
      <cartesianProductMoveSelector>
        <changeMoveSelector>
          <entitySelector mimicSelectorRef="placerEntitySelector"/>
          <valueSelector>
            <variableName>room</variableName>
          </valueSelector>
        </changeMoveSelector>
        <changeMoveSelector>
          <entitySelector mimicSelectorRef="placerEntitySelector"/>
          <valueSelector>
            <variableName>period</variableName>
          </valueSelector>
        </changeMoveSelector>
      </cartesianProductMoveSelector>
    </queuedEntityPlacer>
    ...
  </constructionHeuristic>

Warning: With 3 variables of 1000 values each, a cartesian product selects 1000000000 moves per entity, which will take far too long.

This First Fit configuration for a single entity class with 2 variables, using sequential ChangeMoves, will select 240 moves per entity:

  <constructionHeuristic>
    <queuedEntityPlacer>
      <entitySelector id="placerEntitySelector">
        <cacheType>PHASE</cacheType>
      </entitySelector>
      <changeMoveSelector>
        <entitySelector mimicSelectorRef="placerEntitySelector"/>
        <valueSelector>
          <variableName>period</variableName>
        </valueSelector>
      </changeMoveSelector>
      <changeMoveSelector>
        <entitySelector mimicSelectorRef="placerEntitySelector"/>
        <valueSelector>
          <variableName>room</variableName>
        </valueSelector>
      </changeMoveSelector>
    </queuedEntityPlacer>
    ...
  </constructionHeuristic>
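The move counts quoted above follow directly from the sizes of the value ranges: a cartesian product multiplies them, while sequential ChangeMoves add them. A quick sketch of the arithmetic:

```java
public class MoveCountSketch {

    // Cartesian product of the ChangeMoves: one move per value combination.
    static long cartesianMoves(long... valueCounts) {
        long product = 1;
        for (long c : valueCounts) product *= c;
        return product;
    }

    // Sequential ChangeMoves: one variable at a time, one move per value.
    static long sequentialMoves(long... valueCounts) {
        long sum = 0;
        for (long c : valueCounts) sum += c;
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(cartesianMoves(200, 40));          // 8000
        System.out.println(sequentialMoves(200, 40));         // 240
        // The warning case: 3 variables of 1000 values each.
        System.out.println(cartesianMoves(1000, 1000, 1000)); // 1000000000
    }
}
```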


Important: Especially for sequential ChangeMoves, the order of the variables is important. In the example above, it's better to select the period first (instead of the other way around), because there are more hard constraints that do not involve the room (for example: no teacher should teach 2 lectures at the same time). Let the Benchmarker guide you.

With 3 or more variables, it's possible to combine the cartesian product and sequential techniques:


  <constructionHeuristic>
    <queuedEntityPlacer>
      ...
      <cartesianProductMoveSelector>
        <changeMoveSelector>...</changeMoveSelector>
        <changeMoveSelector>...</changeMoveSelector>
      </cartesianProductMoveSelector>
      <changeMoveSelector>...</changeMoveSelector>
    </queuedEntityPlacer>
    ...
  </constructionHeuristic>

8.6.4. Multiple entity classes

The easiest way to deal with multiple entity classes is to run a separate construction heuristic for each entity class:


  <constructionHeuristic>
    <queuedEntityPlacer>
      <entitySelector id="placerEntitySelector">
        <cacheType>PHASE</cacheType>
        <entityClass>...DogEntity</entityClass>
      </entitySelector>
      <changeMoveSelector>
        <entitySelector mimicSelectorRef="placerEntitySelector"/>
      </changeMoveSelector>
    </queuedEntityPlacer>
    ...
  </constructionHeuristic>
  <constructionHeuristic>
    <queuedEntityPlacer>
      <entitySelector id="placerEntitySelector">
        <cacheType>PHASE</cacheType>
        <entityClass>...CatEntity</entityClass>
      </entitySelector>
      <changeMoveSelector>
        <entitySelector mimicSelectorRef="placerEntitySelector"/>
      </changeMoveSelector>
    </queuedEntityPlacer>
    ...
  </constructionHeuristic>


Friday, November 29, 2013

Drools & jBPM Workshop - Barcelona (10th Dec)

Hi All! This is me again with some great news! We are planning a meeting in Barcelona to show the new tooling provided by Drools & jBPM 6. The event will be hosted by the BarcelonaJUG, as it was last year, but this year we are aiming for a more workshop-oriented event, where you can attend with your laptop and play with the tooling. The main goal of the workshop is for you to leave with the tooling installed and working in your own environment. We will also be giving some guidelines on how to contribute to the community projects, so if you are interested in being part of these amazing communities this is a very good opportunity to get started.

The event

The event will take place at "La Fontana" (The Fountain):
Carrer Gran de Gràcia 190, 08012 Barcelona
On the 10th of December we will start at 6:30 pm, but feel free to arrive before that time; I'm sure we will be setting up the place at least an hour earlier.
It would be good if you show up with your laptop, because we will be trying to show you how the tools work in your own environment. A couple of days before the meetup I will share some links so you can download and pre-install everything if you want to, saving time at the event for more important and advanced questions. Another useful thing to bring is a 3G dongle, if you have one. We are not 100% sure about the internet connectivity at the venue, and some of the demos require an internet connection, so be prepared to share some bandwidth.
That's all for now, stay tuned!
Don't hesitate to drop a comment if there is something in particular that you want to see on the event.
PS: the event will be in Spanish, but if you are from abroad and want to attend we will do our best to share as much as we can with you.

The Event (translated from Spanish)

We will be meeting at "La Fontana", in the centre of Barcelona. The address of the event is:
Carrer Gran de Gràcia 190, 08012 Barcelona
On the 10th of December 2013 we will start at 6:30 pm, but feel free to arrive a bit earlier, since we will be preparing the venue at least an hour before the start.
The main idea of the event is to share with you the latest news about the Drools and jBPM community projects, and at the same time for you to leave with the tools provided by these projects installed on your laptops. In the days before the event I will be sharing some links to download and install the tools. This will save time, so we can go into more detail during the talk instead of spending it installing tools or copying files; check the blog or get in touch if you don't receive the links.
Besides your laptops, it would be great if you can bring a 3G dongle for internet connectivity during the event, since some of the demos need to connect to remote servers to download examples. If you have a 3G dongle, please bring it and be ready to share connectivity with the rest of the participants.
That's all for now! Don't hesitate to leave a comment about your personal interests regarding the event. It's always good to know the audience's expectations.
PS: the event will be in Spanish.


Thursday, November 28, 2013

Drools 5.x Developer's Guide Review

Last week I finally had the opportunity to read Drools JBoss Rules 5.x Developer's Guide, by Michal Bali, and I have to say I was not disappointed.

Michal wrote his first book on Drools 5.0 back in 2009 and with this new book he brings the content up to date with the Drools 5.5/5.6 release.

More importantly, different from other books written about Drools, his writing style is more of a guided tutorial. He starts from the beginning on why and how to write rules and builds on it as it progresses throughout the book. It is an approach that really helps beginners to learn following a logical sequence of content and gradually expanding their understanding of the complete platform.

The book includes plenty of code examples and several notes from the author touching up on design decisions, tips, food for thought and even some best practices.

Regarding content, he covers quite an extensive array of features from the platform, including native DRL rules, Domain Specific Languages, Decision Tables, Complex Event Processing and even Processes with jBPM. He presents not only how to build a solution using rules, but how to properly test and integrate it. He even discusses some of the internal details of the engine for those curious (and brave) enough to read!

Having said all that, I have only one minor nitpick: although most of the examples are didactic and clear, some are not often seen in the field (or at least I haven't seen them). In any case, it is not easy to find examples that are at the same time easy enough for new users to understand and representative enough of real-world applications.

All in all, a great book!

Drools JBoss Rules 5.X Developer’s Guide

Wednesday, November 27, 2013

Using OptaPlanner in Camel (for example to expose it as a REST service)

Camel 2.13 will probably include the OptaPlanner component.
This makes it easier to integrate OptaPlanner in Camel, for example to expose OptaPlanner as a REST or SOAP service.

Read the documentation of camel-optaplanner. Or get the code.

WARNING: This is only the first iteration of this component. The tests show it works. Feedback to expand its scope and features is welcome.


Special thanks to Charles Moulliard for explaining Camel basics to me and Claus Ibsen for polishing the code.


Thursday, November 21, 2013

Videos showing new Workbench

6 short videos, to be played individually or continuously (Play All) in the playlist, demonstrating the new Workbench.



Tuesday, November 19, 2013

Tennis club scheduling during a Devoxx lunch

Last week, while attending Devoxx (the biggest Java conference in Europe and arguably the best in the world), I met one of my colleagues, Tobias, who recently faced a tennis club planning problem which he solved manually. He wondered if OptaPlanner could solve it too. So we decided to implement it over lunch.

The problem is relatively simple: Each week his tennis club has 2 courts available, so 4 teams can play. There are 7 teams. Teams can be unavailable on some days. The number of times each team plays must be fairly balanced. Also the number of confrontations between any 2 teams must be evenly balanced.

The challenge worked out well. In a little over an hour, we had a working implementation, which included a rudimentary GUI and the data in XML format. OptaPlanner's solution, automatically generated in a few seconds, even improved Tobias's manual solution which took him several evenings to find (which is - given the size of the search space - not surprising).

Apparently this problem isn't even uncommon: later on, I found out that another colleague of mine, Cojan, also faced the same problem in his tennis club. Therefore it's now part of the official OptaPlanner examples.



Monday, November 11, 2013

Space Invaders in 8 minutes with Drools

Following in the same fashion as Pong and Wumpus World, I've written a simplified Space Invaders game. I've uploaded it to YouTube; make sure you watch it in high quality and full screen, to avoid blurred text:
http://www.youtube.com/watch?v=wORlAZoxttA

It's not the complete game, but in its current form it's simpler than Pong or Wumpus World, so it's a better place to start learning. I've written it with re-use in mind, so that parts such as the configuration classes and key rules can be re-used in other games, although they will need refactoring first. For Invaders I've committed each stage of the game as separate files, so it's easy to see the stages for yourself.

The model classes and the 6 mains are here:
https://github.com/droolsjbpm/drools/tree/master/drools-examples/src/main/java/org/drools/games/invaders

The 6 drl folders for each of the mains are here:
https://github.com/droolsjbpm/drools/tree/master/drools-examples/src/main/resources/org/drools/games




Friday, November 08, 2013

Cloud optimization with OptaPlanner video

Yet another video, to prepare for the 6.0 final release :)



Friday, November 01, 2013

R.I.P. RETE time to get PHREAKY

I've just written some high-level documentation for the new rule algorithm, which I've called PHREAK, a word play on Hybrid Reasoning. It's still a bit rough and high level, but hopefully still interesting. It builds on ReteOO, so it's good to read that section first.

(Please vote up on dzone http://www.dzone.com/links/rip_rete_time_to_get_phreaky.html)

Follow up article on queries and backward chaining:
http://blog.athico.com/2014/01/drools-phreak-stack-based-evaluations.html

Performance follow up:
http://blog.athico.com/2014/02/drools-6-performance-with-phreak.html

ReteOO Algorithm

The ReteOO was developed throughout the 3, 4 and 5 series releases. It takes the RETE algorithm and applies well known enhancements, all of which are covered by existing academic literature:
  • Node sharing
    Sharing is applied to both the alpha and beta network. The beta network sharing is always from the root pattern.
  • Alpha indexing
    Alpha Nodes with many children use a hash lookup mechanism, to avoid testing each result.
  • Beta indexing
    Join, Not and Exists nodes index their memories using a hash, reducing the number of join attempts for equality checks. Recently, range indexing was added to Not and Exists.
  • Tree based graphs
    Join matches did not contain any references to their parent or children matches. Deletions would have to recalculate all join matches again, which involves recreating all those join match objects, to be able to find the parts of the network where the tuples should be deleted. This is called symmetrical propagation. A tree graph provides parent and children references, so a deletion is just a matter of following those references. This is asymmetrical propagation. The result is faster and less impact on the GC, and more robust because changes in values will not cause memory leaks if they happen without the engine being notified.
  • Modify-in-place
    Traditional RETE implements a modify as a delete + insert. This causes all join tuples to be GC'd, many of which are recreated again as part of the insert. Modify-in-place instead propagates as a single pass; every node is inspected.
  • Property reactive
    Also called "new trigger condition". Allows more fine grained reactivity to updates. A Pattern can react to changes to specific properties and ignore others. This alleviates problems of recursion and also helps with performance.
  • Sub-networks
    Not, Exists and Accumulate can each have nested conditional elements, which forms sub-networks.
  • Backward Chaining
    Prolog style derivation trees for backward chaining are supported. The implementation is stack based, so does not have method recursion issues for large graphs.
  • Lazy Truth Maintenance
    Truth maintenance has a runtime cost, which is incurred whether TMS is used or not. Lazy TMS only turns it on, on first use. Further it's only turned on for that object type, so other object types do not incur the runtime cost.
  • Heap based agenda
    The agenda uses a binary heap queue to sort rule matches by salience, rather than any linear search or maintenance approach.
  • Dynamic Rules
    Rules can be added and removed at runtime, while the engine is still populated with data.

PHREAK Algorithm

Drools 6 introduces a new algorithm that attempts to address some of the core issues of RETE. The algorithm is not a rewrite from scratch; it incorporates all of the existing code from ReteOO and all its enhancements. While PHREAK is an evolution of the RETE algorithm, it is no longer classified as a RETE implementation, in the same way that once an animal evolves beyond a certain point and key characteristics change, it becomes classified as a new species. There are two key RETE characteristics that strongly identify any derivative strains, regardless of optimizations: it is an eager, data-oriented algorithm, where all work is done during the insert, update or delete actions, eagerly producing all partial matches for all rules. PHREAK in contrast is characterised as a lazy, goal-oriented algorithm, where partial matching is aggressively delayed.

This eagerness of RETE can lead to a lot of churn in large systems, and much wasted work, where wasted work is matching effort that does not result in a rule firing.

PHREAK was heavily inspired by a number of algorithms, including (but not limited to) LEAPS, RETE/UL and Collection-Oriented Match. PHREAK has all the enhancements listed in the ReteOO section. In addition it adds the following set of enhancements, which are explained in more detail in the following paragraphs.
  • Three layers of contextual memory; Node, Segment and Rule memories.
  • Rule, segment and node based linking.
  • Lazy (delayed) rule evaluation.
  • Isolated rule evaluation.
  • Set oriented propagations.
  • Stack based evaluations, with pause and resume.
When the PHREAK engine is started, all rules are said to be unlinked; no rule evaluation can happen while rules are unlinked. The insert, update and delete actions are queued before entering the beta network. A simple heuristic, based on the rule most likely to result in firings, is used to select the next rule for evaluation; this delays the evaluation and firing of the other rules. Only once a rule has all its right inputs populated will the rule be considered linked in, although no work is yet done. Instead a goal that represents the rule is created and placed into a priority queue, which is ordered by salience. Each queue is associated with an AgendaGroup. Only the active AgendaGroup will inspect its queue, popping the goal for the rule with the highest salience and submitting it for evaluation. So the work done shifts from the insert, update, delete phase to the fireAllRules phase. Only the rule for which the goal was created is evaluated; other potential rule evaluations from those facts are delayed. While individual rules are evaluated, node sharing is still achieved through the process of segmentation, which is explained later.
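The salience-ordered goal queue described above can be sketched with a plain PriorityQueue. This is a stdlib-only illustration of the idea, not the actual Drools agenda code (RuleGoal and evaluationOrder are made-up names):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class AgendaSketch {

    // A goal represents a linked-in rule waiting to be evaluated.
    static class RuleGoal {
        final String ruleName;
        final int salience;
        RuleGoal(String ruleName, int salience) {
            this.ruleName = ruleName;
            this.salience = salience;
        }
    }

    // Pops goals by descending salience, like the active AgendaGroup.
    static List<String> evaluationOrder(List<RuleGoal> linkedIn) {
        PriorityQueue<RuleGoal> queue = new PriorityQueue<>(
                Comparator.comparingInt((RuleGoal g) -> g.salience).reversed());
        queue.addAll(linkedIn);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().ruleName);
        }
        return order;
    }

    public static void main(String[] args) {
        // Rules become linked in (all right inputs populated) in any order...
        List<RuleGoal> goals = List.of(
                new RuleGoal("audit-log", 0),
                new RuleGoal("validate-order", 100),
                new RuleGoal("apply-discount", 50));
        // ...but are submitted for evaluation by descending salience.
        System.out.println(evaluationOrder(goals));
        // [validate-order, apply-discount, audit-log]
    }
}
```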

Each successful join attempt in RETE produces a tuple (or token, or partial match) that is propagated to the child nodes. For this reason it is characterised as a tuple-oriented algorithm. For each child node that it reaches, it will attempt to join with the other side of the node, and again each successful join attempt is propagated straight away. This creates a descent-recursion effect, thrashing the network of nodes as it ripples up and down, left and right, from the point of entry into the beta network to all the reachable leaf nodes.

PHREAK propagation is set-oriented (or collection-oriented), instead of tuple-oriented. For the rule being evaluated, it will visit the first node and process all queued inserts, updates and deletes. The results are added to a set, and the set is propagated to the child node. In the child node all queued inserts, updates and deletes are processed, adding the results to the same set. Once finished, that set is propagated to the next child node, and so on until the terminal node is reached. This creates a single-pass, pipeline-type effect that is isolated to the current rule being evaluated. This batch-processing effect can provide performance advantages for certain rule constructs, such as sub-networks with accumulates. In the future it will lend itself to exploiting multi-core machines in a number of ways.
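The single-pass, set-oriented pipeline can be sketched in a few lines: each node drains the batch it is handed and passes one result set to the next node, with no per-tuple recursion. Again, this is a conceptual stdlib-only illustration, not engine code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class SetPropagationSketch {

    // Stands in for a beta node: filters the batch it is handed.
    static class Node {
        final Predicate<Integer> test;
        Node(Predicate<Integer> test) { this.test = test; }

        List<Integer> process(List<Integer> batch) {
            List<Integer> out = new ArrayList<>();
            for (Integer fact : batch) {
                if (test.test(fact)) out.add(fact);
            }
            return out; // the whole set moves on to the next node at once
        }
    }

    // One pipeline pass: the set flows node by node, no per-tuple recursion.
    static List<Integer> propagate(List<Integer> staged, List<Node> path) {
        List<Integer> set = staged;
        for (Node node : path) {
            set = node.process(set);
        }
        return set;
    }

    public static void main(String[] args) {
        List<Integer> staged = List.of(1, 2, 3, 4, 5, 6); // queued inserts
        List<Node> path = List.of(
                new Node(f -> f % 2 == 0),  // node 1: keep even facts
                new Node(f -> f > 2));      // node 2: keep facts > 2
        System.out.println(propagate(staged, path)); // [4, 6]
    }
}
```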

The linking and unlinking uses a layered bit-mask system, based on network segmentation. When the rule network is built, segments are created for nodes that are shared by the same set of rules. A rule itself is made up of a path of segments; if there is no sharing, that path is a single segment. A bit-mask offset is assigned to each node in the segment. Another bit-mask (the layering) is assigned to each segment in the rule's path. When there is at least one input (data propagation), the node's bit is set to on. When each node has its bit set to on, the segment's bit is also set to on. Conversely, if any node's bit is set to off, the segment is also set to off. If each segment in the rule's path is set to on, the rule is said to be linked in, and a goal is created to schedule the rule for evaluation. The same bit-mask technique is used to track dirty nodes, segments and rules; this allows a rule that is already linked in to be scheduled for evaluation if it has become dirty since it was last evaluated.
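The layered bit masks can be made concrete with a little bit arithmetic. In this simplified sketch (not the real implementation), a segment is linked when all of its node bits are on, and a rule is linked when all of its segment bits are on:

```java
// Simplified sketch of layered bit-mask linking: node bits roll up
// into a segment bit, and segment bits roll up into the rule's path.
public class LinkingSketch {

    // A segment with 3 nodes: node bits 1, 2 and 4, all-on mask 0b111.
    static final long SEGMENT_ALL_NODES = 0b111;
    // A rule whose path has 2 segments: segment bits 1 and 2, all-on mask 0b11.
    static final long RULE_ALL_SEGMENTS = 0b11;

    static boolean segmentLinked(long nodeMask) {
        return (nodeMask & SEGMENT_ALL_NODES) == SEGMENT_ALL_NODES;
    }

    static boolean ruleLinked(long segmentMask) {
        return (segmentMask & RULE_ALL_SEGMENTS) == RULE_ALL_SEGMENTS;
    }

    public static void main(String[] args) {
        long nodeMask = 0L;
        nodeMask |= 0b001;                           // node 1 receives data
        nodeMask |= 0b010;                           // node 2 receives data
        System.out.println(segmentLinked(nodeMask)); // false: node 3 empty
        nodeMask |= 0b100;                           // node 3 receives data
        System.out.println(segmentLinked(nodeMask)); // true

        long segmentMask = 0b01;                     // first segment linked
        System.out.println(ruleLinked(segmentMask)); // false
        segmentMask |= 0b10;                         // second segment linked
        System.out.println(ruleLinked(segmentMask)); // true: create a goal
    }
}
```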

This ensures that no rule will ever evaluate partial matches if it is impossible for them to result in rule instances, because one of the joins has no data. This is possible in RETE, which will merrily churn away producing partial match attempts for all nodes, even if the last join is empty.

While the incremental rule evaluation always starts from the root node, the dirty bit masks are used to allow nodes and segments that are not dirty to be skipped.

Using the existence of at least one item of data per node is a fairly basic heuristic. Future work will attempt to delay linking even further, using techniques such as arc consistency to determine whether or not matching will result in rule instance firings.

Whereas RETE has just a single unit of memory, the node memory, PHREAK has three levels of memory. This allows for a much more contextual understanding during evaluation of a rule.

PHREAK 3 Layered memory system

Example 1 shows a single rule, with three patterns; A, B and C. It forms a single segment, with bits 1, 2 and 4 for the nodes.

Example 1: Single rule, no sharing



Example 2 demonstrates what happens when another rule is added that shares the pattern A. A is placed in its own segment, resulting in two segments per rule. Those two segments form a path for their respective rules; the first segment is shared by both paths. When A is linked, the segment becomes linked; it then iterates over each path the segment is shared by, setting bit 1 to on. If B and C are later turned on, the second segment for path R1 is linked in, which causes bit 2 to be turned on for R1. With bits 1 and 2 set to on for R1, the rule is now linked and a goal is created to schedule the rule for later evaluation and firing.

When a rule is evaluated, it is the segments that allow the results of matching to be shared. Each segment has a staging memory to queue all inserts, updates and deletes for that segment. If R1 were evaluated, it would process A, resulting in a set of tuples. The algorithm detects that there is a segmentation split and creates peered tuples for each insert, update and delete in the set, adding them to R2's staging memory. Those tuples are merged with any existing staged tuples and wait for R2 to eventually be evaluated.

Example 2: Two rules, with sharing



Example 3 adds a third rule and demonstrates what happens when A and B are shared. Only the bits for the segments are shown this time, demonstrating that R4 has 3 segments, R3 has 3 segments and R1 has 2 segments. A and B are shared by R1, R3 and R4, while D is shared by R3 and R4.

Example 3: Three rules, with sharing


Sub-networks are formed when a Not, Exists or Accumulate node contains more than one element. In Example 4, "B not( C )" forms the sub-network; note that "not( C )" alone is a single element, does not require a sub-network, and is merged inside of the Not node.

The sub-network gets its own segment. R1 still has a path of two segments. The sub-network forms another "inner" path. When the sub-network is linked in, it will link in the outer segment.

Example 4 : Single rule, with sub-network and no sharing



Example 5 shows that the sub-network nodes can be shared by a rule that does not have a sub-network. This results in the sub-network segment being split into two.

Example 5: Two rules, one with a sub-network and sharing
Not nodes with constraints and accumulate nodes have special behaviour: they can never unlink a segment, and are always considered to have their bits on.

All rule evaluations are incremental and will not waste work recomputing matches that have already been produced.

The evaluation algorithm is stack based, instead of using method recursion. Evaluation can be paused and resumed at any time via the use of a StackEntry to represent the current node being evaluated.

When a rule evaluation reaches a sub-network, a StackEntry is created for the outer path segment and the sub-network segment. The sub-network segment is evaluated first, and when the set reaches the end of the sub-network path it is merged into a staging list for the outer node it feeds into. The previous StackEntry is then resumed, where it can process the results of the sub-network. This has the added benefit that all work is processed in a batch before propagating to the child node, which is much more efficient for accumulate nodes.

The same stack system can be used for efficient backward chaining. When a rule evaluation reaches a query node it again pauses the current evaluation, by placing it on the stack. The query is then evaluated, which produces a result set that is saved in a memory location for the resumed StackEntry to pick up and propagate to the child node. If the query itself called other queries the process would repeat, with the current query being paused and a new evaluation set up for the current query node.
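A minimal sketch of the pause-and-resume mechanic in plain Java. The node names and the "subnet" marker are invented for this sketch; real StackEntries carry the node, its memories and the staged sets:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Sketch of stack-based evaluation: the position in each path is held on
// an explicit stack rather than the call stack. Hitting a sub-network
// pauses the outer path, evaluates the inner path to completion first,
// then resumes the outer path to consume the inner results.
public class StackEvalSketch {
    static List<String> evaluate(List<String> outer, List<String> subNetwork) {
        Deque<Iterator<String>> stack = new ArrayDeque<>(); // the "StackEntries"
        List<String> trace = new ArrayList<>();
        Iterator<String> current = outer.iterator();
        while (current.hasNext()) {
            String node = current.next();
            if (node.equals("subnet")) {         // pause the outer path...
                stack.push(current);
                current = subNetwork.iterator(); // ...and run the inner one first
            } else {
                trace.add(node);                 // "evaluate" the node
            }
            if (!current.hasNext() && !stack.isEmpty()) {
                current = stack.pop();           // resume the paused entry
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        // The inner path (B, notC) runs to completion before the outer
        // path resumes at its accumulate node.
        System.out.println(evaluate(List.of("A", "subnet", "Acc"),
                                    List.of("B", "notC"))); // [A, B, notC, Acc]
    }
}
```

The same loop structure works for queries: the query node pushes the current entry, evaluates the query path, and the popped entry picks up the result set.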

One final point on performance: a single rule in general will not evaluate any faster with PHREAK than it does with RETE. For a given rule and the same data set, when using a root context object to enable and disable matching, both attempt the same number of matches, produce the same number of rule instances, and take roughly the same time, except for the use case with sub-networks and accumulates.

PHREAK can however be considered more forgiving than RETE for poorly written rule bases, with a more graceful degradation of performance as the number of rules and the complexity increase.

RETE will also churn away producing partial matches for rules that do not have data in all the joins, whereas PHREAK will avoid this.

So it's not that PHREAK is faster than RETE, it just won't slow down as much as your system grows :)

AgendaGroups did not help RETE performance, as all rules were evaluated at all times, regardless of the group. The same is true for salience, which is why root context objects are often used to limit matching attempts. PHREAK only evaluates rules for the active AgendaGroup, and within that group will attempt to avoid evaluation of rules (via salience) that do not result in rule instance firings.

With PHREAK, AgendaGroups and salience now become useful performance tools. Root context objects are no longer needed and are potentially counter-productive to performance, as they force the flushing and recreation of matches for rules.
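As a sketch of how this can be exploited, a rule can be scoped to an agenda group and prioritised with salience directly in DRL, instead of being gated behind a root context fact. The rule, fact and group names here are hypothetical:

```
rule "validate order"
    agenda-group "validation"
    salience 10
when
    $o : Order( total > 0 )
then
    // handle the valid order
end
```

The application then activates the group with kSession.getAgenda().getAgendaGroup( "validation" ).setFocus(), and under PHREAK only the rules in the focused group are evaluated.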


Thursday, October 31, 2013

Configuration and Convention based Building and Utilization

Sneak peek at some of the 6.0 documentation we are writing. This introduces, via examples, the new ways to work with DRL and BPMN2 files, without needing to programmatically create a builder and load resources.

Enjoy
--
6.0 introduces a new configuration- and convention-based approach to building knowledge bases, instead of the programmatic builder approach in 5.x, although a builder is still available to fall back on, as it's used for the tooling integration.
Building now uses Maven and aligns with Maven practices. A KIE project or module is simply a Maven Java project or module with an additional metadata file, META-INF/kmodule.xml. The kmodule.xml file is the descriptor that selects resources into knowledge bases and configures those knowledge bases and sessions. There is also alternative XML support via Spring and OSGi BluePrints.
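A minimal kmodule.xml might look like the following; the kbase, ksession and package names are illustrative, not taken from the examples below:

```xml
<kmodule xmlns="http://www.drools.org/xsd/kmodule">
  <!-- selects the resources under the named package into a knowledge base -->
  <kbase name="kbase1" packages="org.acme.rules">
    <!-- a session that can be created by name at runtime -->
    <ksession name="ksession1"/>
  </kbase>
</kmodule>
```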
While standard Maven can build and package KIE resources, it will not provide validation at build time. There is a Maven plugin which is recommended, to get build-time validation. The plugin also pre-generates many classes, making runtime loading faster too.

Maven can either 'mvn install' to deploy a KieModule to the local machine, where all other applications on the local machine can use it, or 'mvn deploy' to push the KieModule to a remote Maven repository. Building the application will pull in the KieModule, populating its local Maven repository as it does so.

Jars can be deployed in one of two ways: either added to the classpath, like any other jar in a Maven dependency listing, or dynamically loaded at runtime. KIE will scan the classpath to find all the jars with a kmodule.xml in them. Each found jar is represented by the KieModule interface. The terms classpath KieModule and dynamic KieModule are used to refer to the two loading approaches. While dynamic modules support side-by-side versioning, classpath modules do not. Further, once a module is on the classpath, no other version may be loaded dynamically.
Detailed references for the API are included in the next sections; the impatient can jump straight to the examples section, which is fairly intuitive for the different use cases.

The best way to learn the new build system is by example. The source project "drools-examples-api" contains a number of examples, and can be found at github:
https://github.com/droolsjbpm/drools/tree/6.0.x/drools-examples-api
Each example is described below; the order starts with the simplest and most default, working its way up to more complex use cases.
The Deploy use cases here all involve 'mvn install'. Remote deployment of jars in Maven is well covered in Maven literature. Utilize refers to the initial act of loading the resources and providing access to the KIE runtimes, whereas Run refers to the act of interacting with those runtimes.

The kmodule.xml will produce a single named KieBase, 'kbase2', that includes all files found under the resources path, be they DRL, BPMN2, XLS etc. Further, it will include all the resources found from the KieBase 'kbase1', due to the use of the 'includes' attribute. KieSession 'ksession2' is associated with that KieBase and can be created by name.
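The descriptor described above would look roughly like the following sketch; the attribute values follow the text, though the exact file in the example project may differ:

```xml
<kmodule xmlns="http://www.drools.org/xsd/kmodule">
  <!-- kbase2 pulls in all of kbase1's resources via 'includes' -->
  <kbase name="kbase2" includes="kbase1">
    <ksession name="ksession2"/>
  </kbase>
</kmodule>
```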

This example requires that the previous example, 'named-kiesession', is built and installed to the local Maven repository first. Once installed it can be included as a dependency, using the standard Maven element.
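The dependency element would look like the following; the group id and version are assumed from the conventions used elsewhere in these examples ('org.drools', '6.0.0-SNAPSHOT'):

```xml
<dependency>
  <groupId>org.drools</groupId>
  <artifactId>named-kiesession</artifactId>
  <version>6.0.0-SNAPSHOT</version>
</dependency>
```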

Once 'named-kiesession' is built and installed, this example can be built and installed as normal. Again, the act of installing will force the unit tests to run, demonstrating the use case.

ks.getKieClasspathContainer() returns the KieContainer that contains the KieBases deployed onto the environment classpath. This time the KieSession uses the name 'ksession2'. You do not need to look up the KieBase first, as it knows which KieBase 'ksession2' is associated with. Notice two rules fire this time, showing that KieBase 'kbase2' has included the resources from the dependency KieBase 'kbase1'.
The kmodule.xml produces 6 different named KieBases. 'kbase1' includes all resources from the KieModule. The other KieBases include resources from other selected folders, via the 'packages' attribute. Note the use of the wildcard '*' to select a package and all packages below it.


Only part of the example is included below, as there is a test method per KieSession, but each one is a repetition of the others, with just different list expectations.
The pom.xml must include kie-ci as a dependency, to ensure Maven is available at runtime. As this uses Maven under the hood, you can also use the standard Maven settings.xml file.


In the previous examples the classpath KieContainer was used. This example creates a dynamic KieContainer as specified by the ReleaseId. The ReleaseId uses Maven conventions for group id, artifact id and version. It also obeys LATEST and SNAPSHOT for versions.
No kmodule.xml file exists. The projects 'named-kiesession' and 'kiebase-inclusion' must be built first, so that the resulting jars, in the target folders, can be referenced as Files.

This creates two resources. One is for the main KieModule 'exRes1', the other is for the dependency 'exRes2'. Even though kie-ci is not present, and thus Maven is not there to resolve the dependencies, this shows how you can manually specify the dependent KieModules for the vararg.

This programmatically builds a KieModule. It populates the model that represents the ReleaseId and kmodule.xml, as well as adding the resources. A pom.xml is generated from the ReleaseId.
Example 2.59. Utilize and Run - Java
KieServices ks = KieServices.Factory.get();
KieFileSystem kfs = ks.newKieFileSystem();

Resource ex1Res = ks.getResources().newFileSystemResource(getFile("named-kiesession"));
Resource ex2Res = ks.getResources().newFileSystemResource(getFile("kiebase-inclusion"));

ReleaseId rid = ks.newReleaseId("org.drools", "kiemodulemodel-example", "6.0.0-SNAPSHOT");
kfs.generateAndWritePomXML(rid);

KieModuleModel kModuleModel = ks.newKieModuleModel();
kModuleModel.newKieBaseModel("kiemodulemodel")
            .addInclude("kiebase1")
            .addInclude("kiebase2")
            .newKieSessionModel("ksession6");

kfs.writeKModuleXML(kModuleModel.toXML());
kfs.write("src/main/resources/kiemodulemodel/HAL6.drl", getRule());

KieBuilder kb = ks.newKieBuilder(kfs);
kb.setDependencies(ex1Res, ex2Res);
kb.buildAll(); // kieModule is automatically deployed to KieRepository if successfully built

if (kb.getResults().hasMessages(Level.ERROR)) {
    throw new RuntimeException("Build Errors:\n" + kb.getResults().toString());
}

KieContainer kContainer = ks.newKieContainer(rid);
KieSession kSession = kContainer.newKieSession("ksession6");
kSession.setGlobal("out", out);

Object msg1 = createMessage(kContainer, "Dave", "Hello, HAL. Do you read me, HAL?");
kSession.insert(msg1);
kSession.fireAllRules();

Object msg2 = createMessage(kContainer, "Dave", "Open the pod bay doors, HAL.");
kSession.insert(msg2);
kSession.fireAllRules();

Object msg3 = createMessage(kContainer, "Dave", "What's the problem?");
kSession.insert(msg3);
kSession.fireAllRules();

