Sunday, December 10, 2006

Dynamically Generated Classes as JBoss Rules Facts (Edson Tirelli)

As most users know, JBoss Rules is designed to use standard POJO JavaBeans as facts. This design decision is intended, among other things, to:

  • make integration faster, as there is no need to replicate your business model in an engine-proprietary model
  • improve performance, as the engine is designed to be as optimized as possible when working on POJO beans. There is also no need to copy attribute values between mapped structures.

In the real world, though, supporting POJO beans may not be enough to achieve the above goals.

As we all know, Java, as a platform, provides several ways of developing applications that put dynamic resources to good use, through both standard platform features and a wide range of tools, open source and proprietary.

For instance, we are often asked: "How can I use beans of dynamically generated classes as facts?" This usually happens in companies whose applications allow users to define part or all of the business model without touching Java code. These applications usually have an embedded rules engine, and you may want the engine to reason over these dynamically generated business models.

And the good news is: you can do that with JBoss Rules, and 3.0.5 made it even easier!

The only issue is to make sure you are using a tool to generate your beans that will truly generate standard JavaBeans. I mean, a tool that:

  1. allows you to state explicitly the package and class name of the generated class: this is mandatory in order to write efficient rules, as the first constraint the engine applies to your facts is the class type of the facts your rules reason over
  2. generates a no-argument default constructor (as per the JavaBeans spec)
  3. correctly generates the getXXX() methods for your properties (as per the JavaBeans spec)
  4. ideally will allow you to define your bean's key properties and will automatically generate equals()/hashCode() methods from these properties (allowing for consistent reasoning based on equality instead of only identity)
  5. ideally will generate fast accessors for your properties, allowing your beans to be high performing
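To make the checklist concrete, here is a hand-written sketch of the shape a generated class would need to be equivalent to. The Cheese class and its type property echo the rule example in this post; the price property, and the choice of type as the only key property, are assumptions made purely for illustration.

```java
import java.util.Objects;

// Requirement 1: a real generated class would have a fixed, known package and
// name (for example org.drools.sample.facts.Cheese) so rules can import it.
// The package declaration is left out here to keep the sketch self-contained.
class Cheese {
    private String type;  // assumed to be the bean's only key property
    private int price;    // hypothetical non-key property

    // Requirement 2: no-argument constructor.
    public Cheese() {
    }

    // Requirement 3: standard JavaBeans accessors.
    public String getType() {
        return type;
    }

    public void setType(String type) {
        this.type = type;
    }

    public int getPrice() {
        return price;
    }

    public void setPrice(int price) {
        this.price = price;
    }

    // Requirement 4: equality and hash code derived from the key property only.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Cheese)) {
            return false;
        }
        return Objects.equals(type, ((Cheese) other).type);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(type);
    }
}
```

With this shape, two Cheese instances of the same type are equal even when their prices differ, which is what equality-based reasoning relies on.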

Most bean generation tools allow this. To name just a few open source tools (this is not intended to be a complete list):

  • ASM: this is by far my preferred framework. It is a bit lower level than other tools, but it allows you to do anything you want (you are writing bytecode after all), and in my experience it is as fast as one can be.
  • BCEL: another popular framework from Apache.
  • CGLIB: a higher level framework that is used in a lot of projects.

The frameworks above allow you to create a dynamic POJO JavaBean business model and use it as the business model for your rules. This is just a small list, and I'm sure there are many other frameworks out there that allow you to do the same.

Unfortunately, one popular framework that does not work well with JBoss Rules is Apache DynaBeans.
I'm not saying that DynaBeans are bad at all, as there are some use cases for them described in their documentation. What I'm saying is simply that DynaBeans are a poor fit for integration with JBoss Rules. You can read more about it in Mark's post.

So, you have your business model (that happens to be dynamically generated), which complies at least with requirements 1-3 listed above, and you want to use it when writing business rules. What do you need to do?

You simply need to make your business model available to the engine. JBoss Rules needs your classes in two distinct situations, which may or may not occur in sequence:

  1. Rules compilation
  2. Rules execution

Let’s talk about how to use them in each of the above steps.

It is important to note that all of this is closely related to how Class Loaders work in the Java platform, but since you are using dynamically generated classes, I assume you are familiar with the Java Class Loader architecture.

1. Rules Compilation

When you write a rules file, whatever format you use (DRL, DSL, Decision Tables), JBoss Rules needs to compile it before using it. When compiling a rule base, the classes you use as facts must be "available".

For instance, if you write a rule base like this:

package org.drools.sample;

import org.drools.sample.facts.Person;
import org.drools.sample.facts.Cheese;

rule "Likes Cheese"
    when
        Person( $likes : likes )
        Cheese( type == $likes )
    then
        // do some stuff
end


You must have classes Person and Cheese "available" for compilation. The concept of "available" varies though according to the compiler you are using. Also, Person or Cheese or even both classes may be dynamically generated classes. The engine does not care about it, but the compiler certainly does.

JBoss Rules uses JCI (Java Compiler Interface) as an abstraction layer over the compiler. JBoss Rules is integrated and tested with two underlying compilers: JDT (default) and JANINO.

They generate the same results, but from a compilation requirements perspective they are a bit different, so let’s talk about how to make compilation work with each of them.

1.1. Janino

Janino is not the default compiler for JBoss Rules, but you can activate it either by calling the PackageBuilderConfiguration.setCompiler() method or by setting the "drools.compiler=JANINO" system property.

For JCI+Janino to compile your rule base, it is enough to have the dynamically generated classes available in your context class loader. So, for instance, if your dynamically generated classes were loaded by the same ClassLoader that loaded your Main application class, the only thing you need to do is to call, before creating your PackageBuilder:

Thread.currentThread().setContextClassLoader( Main.class.getClassLoader() );


This will ensure that your PackageBuilder will use the provided class loader to find the classes for compilation. That will obviously succeed as the given class loader is the one that loaded your dynamic classes.
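Since the context class loader is a per-thread setting, it can be worth restoring the previous loader once compilation is done. The helper below is a minimal sketch of that idea in plain Java (using a lambda, so Java 8 or later); ContextLoaderSupport and withContextClassLoader are hypothetical names and are not part of JBoss Rules.

```java
import java.util.function.Supplier;

class ContextLoaderSupport {

    /**
     * Runs the task with the given loader installed as the current thread's
     * context class loader, then restores whatever loader was set before,
     * even if the task throws.
     */
    static <T> T withContextClassLoader(ClassLoader loader, Supplier<T> task) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            // The PackageBuilder would be created and used inside the task.
            return task.get();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

The task passed in would be the place to create the PackageBuilder and add the rules. Whether restoring the loader matters depends on how the thread is reused afterwards, for example in an application server thread pool.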

1.2. JDT

JDT is the default compiler for JBoss Rules. If you don't set the "drools.compiler" property, nor change it using the PackageBuilderConfiguration.setCompiler() method, it will be used to compile your rules.

JCI+JDT, however, has an additional requirement: to compile your rule base, the context class loader must be able to provide the actual bytecode of the classes, not only the loaded classes. In other words, you must provide a custom class loader that can provide an input stream for the bytecode of each class used in your rule base.

In the above rule example, your custom class loader must return an input stream to the bytecode of the Person and Cheese classes, regardless of whether the classes were dynamically generated or not.

For dynamically generated classes, it is enough for the custom class loader to implement the method:

public InputStream getResourceAsStream(final String name);


I have written a simple example with a ClassLoader that uses a HashMap to store the bytecode of dynamically generated classes and returns it when the appropriate getResourceAsStream method is called.
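The following is not the code from that example, only a minimal sketch of the same idea. It assumes the compiler asks for class resources by path, such as "org/drools/sample/facts/Person.class"; ByteMapClassLoader and addClass are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

class ByteMapClassLoader extends ClassLoader {

    // Binary class name (e.g. "org.drools.sample.facts.Person") -> bytecode.
    // A plain HashMap is not thread-safe; this sketch assumes one thread.
    private final Map<String, byte[]> bytecode = new HashMap<>();

    ByteMapClassLoader(ClassLoader parent) {
        super(parent);
    }

    /** Registers the bytecode produced by the generation tool. */
    void addClass(String className, byte[] classBytes) {
        bytecode.put(className, classBytes.clone());
    }

    /** Defines a registered class when the parent loaders cannot find it. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = bytecode.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }

    /** Serves the stored bytecode for "<path>.class" resource requests. */
    @Override
    public InputStream getResourceAsStream(String name) {
        if (name.endsWith(".class")) {
            String className = name
                    .substring(0, name.length() - ".class".length())
                    .replace('/', '.');
            byte[] bytes = bytecode.get(className);
            if (bytes != null) {
                return new ByteArrayInputStream(bytes);
            }
        }
        // Anything else is resolved the usual way.
        return super.getResourceAsStream(name);
    }
}
```

Keeping the bytes around after the class is defined costs some memory, but it is what lets the same loader answer both "give me the class" and "give me its bytecode".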

So, what you need to do is similar to what you do in Janino:

Thread.currentThread().setContextClassLoader( myCustomClassLoader );


But your custom class loader must comply with the above requirement.

2. Rules Execution

Rules execution may happen immediately after rules compilation or not. For example, you may compile a rule base and serialize it at build time. At runtime you simply deserialize the rule base and execute it.
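When a serialized object graph refers to classes that only a custom class loader knows about, the stream reading it back has to resolve classes through that loader. Whether your JBoss Rules version needs this, or already handles it for you, is an assumption to verify. The sketch below shows only the generic Java technique, with hypothetical names (LoaderAwareObjectInputStream, SerializationSupport).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** ObjectInputStream that resolves classes against an explicit loader. */
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also covers primitives.
            return super.resolveClass(desc);
        }
    }
}

class SerializationSupport {

    /** Serializes a value to a byte array (e.g. at build time). */
    static byte[] serialize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the value back, resolving its classes through the given loader. */
    static Object deserialize(byte[] data, ClassLoader loader) {
        try (ObjectInputStream in =
                new LoaderAwareObjectInputStream(new ByteArrayInputStream(data), loader)) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The loader passed to deserialize would be the one that owns the dynamically generated fact classes.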

The only thing you need to do at execution time is to make sure that the same ClassLoader hierarchy used to load your rule base is used to load your fact classes. Again, this is not specific to dynamically generated classes, but the problem of multiple ClassLoaders shows up more frequently with them, because this is one of the situations in which people usually use multiple ClassLoaders inside the same application.

Just to understand the problem, remember that in Java, a class C loaded by loader L1 is different from the same class C loaded by loader L2. So, if you load the above rule base with class loader L1 and assert into working memory a Cheese instance whose class was loaded by class loader L2, your rule will never match and obviously never fire.

So make sure your rule base does not load your classes with a different class loader than the one your application uses to load them.
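The L1/L2 situation can be reproduced with nothing but the JDK. The sketch below does not use JBoss Rules at all. It writes the bytes of a minimal, member-less class by hand (standing in for a bytecode generation tool) and defines it in two separate loaders. ClassIdentityDemo, SingleClassLoader, emptyClass and loadTwice are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class ClassIdentityDemo {

    /** Loader that can define exactly one class from the bytes it was given. */
    static class SingleClassLoader extends ClassLoader {
        private final String className;
        private final byte[] bytecode;

        SingleClassLoader(String className, byte[] bytecode, ClassLoader parent) {
            super(parent);
            this.className = className;
            this.bytecode = bytecode;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!className.equals(name)) {
                throw new ClassNotFoundException(name);
            }
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    /** Builds the class file of a public class with no fields or methods. */
    static byte[] emptyClass(String className) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);   // magic number
            out.writeShort(0);          // minor version
            out.writeShort(52);         // major version (Java 8)
            out.writeShort(5);          // constant pool count (entries + 1)
            out.writeByte(7);           // #1: Class, name at #2
            out.writeShort(2);
            out.writeByte(1);           // #2: Utf8, internal class name
            out.writeUTF(className.replace('.', '/'));
            out.writeByte(7);           // #3: Class, name at #4
            out.writeShort(4);
            out.writeByte(1);           // #4: Utf8, superclass name
            out.writeUTF("java/lang/Object");
            out.writeShort(0x0021);     // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);          // this_class
            out.writeShort(3);          // super_class
            out.writeShort(0);          // interfaces
            out.writeShort(0);          // fields
            out.writeShort(0);          // methods
            out.writeShort(0);          // attributes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Loads the same bytecode through two independent loaders. */
    static Class<?>[] loadTwice(String className) {
        byte[] bytecode = emptyClass(className);
        ClassLoader parent = ClassLoader.getSystemClassLoader();
        try {
            Class<?> fromL1 = new SingleClassLoader(className, bytecode, parent).loadClass(className);
            Class<?> fromL2 = new SingleClassLoader(className, bytecode, parent).loadClass(className);
            return new Class<?>[] { fromL1, fromL2 };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ClassIdentityDemo.loadTwice("org.drools.sample.facts.Cheese") yields two Class objects with the same name that are nevertheless not the same class, which is why a pattern compiled against one of them cannot match instances of the other.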

Example

I created a very simple example of JBoss Rules using dynamically generated classes as facts and committed it here. It is not intended to be a general purpose solution, but rather a didactic example of one possible solution.

I am using ASM as the bytecode generation framework and kept the API functional but simple in order to not overload the example.

The example uses a Customer bean that is a regular POJO class and two dynamically generated classes: Order and LineItem.

Feel free to drop questions you may have to the JBoss Rules users list.

[]s
Edson

13 comments:

  1. There is one downside of going with a POJO approach instead of interpreted. Michael, Mark and I have had discussions about this in the past. With Sun's JVM, the PermGen can run out of memory when the number of classes grows too big. The generated classes approach runs into problems when it's a critical service and there's a ton of classes. Combine that with large rulesets, and Sun's JVM will quickly run out of PermGen memory even if it is set to something like 256Mb.

    If you use BEA's JRockit, the issue mostly goes away. A better approach is to use interpreted like CLIPS and JESS. This way, the model can be declared as deftemplates and cleared easily without restarting the JVM. For a long running application that is very dynamic, like machine learning, the POJO approach will run into issues :)

    Plus, interpreted approach can be just as fast as POJO, without the PermGen and classloader issues.

  2. Woolfel,

    There may be a confusion in the understanding of the article. You are right about permGen, but supporting POJO as facts has nothing to do with it. Supporting POJO as facts means only we use the same classes/objects the user is already using in his application to reason over it. No need to copy values from one data structure (class/template/etc) to another, no need for a mapping layer etc.

    I'm not suggesting people stop using what they are already using to use something else (generated classes in this case). What I'm pointing out is that "if you are already using dynamic classes, do this to continue using your classes as facts to reason over". This is a frequent question we get from users and it is now answered.

    From the post:
    " So, you have your Business Model (that happens to be dynamically generated), that complies at least with requirements 1-3 listed above, and you want to use it when writing business rules. What you need to do? "

    Hope it clarifies things a little bit.

    []s
    Edson

  3. The PermGen issue has little to do with facts, as you only have a small number of fact types in a rule base. Even if you had lots (say hundreds of fact types) it's still nothing.

  4. What I meant is this. Say I'm in a service environment where I provide a base model. My customers can extend my base model and those models get compiled. Now say I am providing a mobile service to millions of users and 10% of those users extend my base model.

    If I dynamically generate a class, how soon will the JVM tank and blow up? With an interpreted approach, you can declare those objects as deftemplates and easily clean them up without recycling the classloader or JVM.

    sorry for the bad explanation the first time.

  5. Yes, if *users* are extending a model somehow, then it has to be data driven, like deftemplates are; they would be good for that, that makes sense. It's an edge case however, but in that unlikely case templates would work better than classes (but from the engine's point of view it can still work the same).

  6. that was an actual requirement when I worked on mobile platforms back in 2000. It was an edge case, but think of it another way. In a 48 hour period, there may be that many users. Using a POJO approach, the webapp or ejb has to be recycled to prevent the container from crashing.

    Using a deftemplate approach, you can have service rules to remove the deftemplates when a session ends, so it's minimal impact. Realistically, there may only be 5-10K concurrent user sessions and the request/second may only be 50-60/second. Even though a mid level server can handle the load, the cost of generating classes becomes an issue due to the lack of support for dynamic classes in Java.

    Maybe one of these days Sun will fix that problem and make the JVM better for dynamic languages at the lowest levels :)

    Most use cases don't have to worry about these kinds of issues, unless you happen to work in a large institution like financial, telecom or the military.

  7. Peter,

    You are confusing "POJO" and "dynamic bytecode generation". One thing has nothing to do with the other.

    Supporting POJO means that we support regular classes as facts. So, the user can have his "static" business model (I mean, regular written/compiled code) used directly inside the rules engine.

    It also means that IF for ANY REASON the user has a dynamically generated business model, the engine will work with it too, without any problem!

    But again, using dynamic or static business model has nothing to do with the engine itself. It's a solution design issue.

    We know that for corner cases like the ones you are talking about there are two approaches to avoid problems: use static models or templates. That's why we developed support for templates in JBoss Rules 3.1, so the user can choose between POJO (for the engine does not matter if it is static or dynamic) or templates or even mix both in the same rulebase.

    For instance, look at JESS. It also allows you to use both templates AND POJO beans inside the same rulebase. And that is great, right?

    It defers to the solution architect the decision of what best matches his needs in each case, small or big, for telecom, finance or the bakery in the end of the street.

    There are technical differences in the JESS approach and JBoss Rules approaches, but the concept is exactly the same: provide solution developers with the tools they need to do their work, whatever it is.

    I hope it helps to clarify the issue.

    []s
    Edson

  8. You're right edson I wasn't clear. In my mind, I was thinking about jbossrules object oriented approach of compiling rules and producing code for each rule. In a purely interpreted approach, adding a new rule doesn't produce any new java classes.

    I haven't looked at the template support in jbossrules in a long time. Mark showed it to me when he first wrote it. It is good to support both. What I was trying to say, but in a totally retarded way is this. Compiling a rule and generating classes produces the same limitations as dynamically generated classes. I forget if you were on IRC when I tried to load 8K rules in the drl ide. Has that changed? I had to bump the PermGen size up to 256Mb to load 8K rules. I think i got up to 15K rules with 512Mb PermGen.

    Again, these are edge cases, but they do happen fairly regularly in large deployments, or in machine learning scenarios where a rule engine will have 500K rules.

    Having said that, any business that has 20K rules needs to rethink how they do rules, since managing 20K+ rules is going to be hard and become a huge issue. I do know of a few firms in DC that have over 40K rules. It's not uncommon for a customer to have 30-50K rules because they imported a spreadsheet or decision table with that many rows.

    I'll try to write something up and open a jira, so users have some tips on how to deal with edge cases. My apologies to you guys, after a 160 mile drive, it's impossible for my brain to think straight. I should go to bed, it's cold in Massachusetts and I have to drive in early tomorrow.

  9. Guys, I'm looking at the sample code there, but I've got one question: why is it creating 2 classes for each class definition?

    From what I understood, the logic is like this: create a Field Definition, which is then added to the Class Definition; this class definition is used to create the actual class.

    But what is the use of the defined class in the class definition? And why does it require FieldAccessorBuilder, which is also creating the same class?

  10. I am working for a project where we need to run rules by dynamically loading the classes. It works well for the first run.
    When I try to load the rules again, I get a compilation error: "Cannot be resolved to a type". Any guess?

  11. Hi, I am not sure this is the right forum, however I have an interesting problem. I am trying to use Drools 5 in a WebSphere app server in a secured environment. It's giving me a security exception when I try to add my DRLs to the knowledge base. This is because drools-core.jar tries to create a new ClassLoader, which causes the exception. In this environment we can't have permission to create a new classloader. I would like to get some help from this forum on how to get this resolved. Thanks in advance.

  12. Hi, sorry but the links to the example code are not working, would it be possible to fix them?

  13. This is years old now, we aren't likely to correct it as it's now deprecated. Drools now supports runtime generated pojos via "declared" types.
