Thursday, June 25, 2009

Drools Flow performance

People sometimes ask for tests, benchmarks or numbers that they can use for evaluating whether Drools actually is fast enough. Fast enough always depends on your specific case. We've had various blogs before on performance for the rules engine itself, but so far we have never published anything for Drools Flow.

However, not publishing anything can lead to confusion as well. For example here, Drools Flow was used as one candidate in a performance evaluation and at first sight seemed only a fraction faster, but it's difficult to figure out what the exact results were. That's why I will post some figures here anyway, simply as a reference for the kind of overhead the engine creates during the execution of your processes.

The test we're using here is a very simple one: we start an empty process (a start and end node connected to each other), execute it 10,000 times in sequence and measure the average time it takes to execute that process. These results of course depend heavily on how you configure your engine, so we will show them for three different settings:

A. Simple POJO execution: The Drools engine is used as a simple local Java component (so without any persistence or transactions)

B. Persistence / transactions: The same process is executed but in a transactional context (a new transaction for each process instance), and the state of the engine is always persisted in the database.

C. Optimized Java mode: This is actually one of my pet side-projects, where we translate the Drools Flow process straight into Java code and execute that Java code for you (the client simply needs to change one simple configuration setting for the process). While this severely limits the types of nodes you're allowed to use in your process (no wait states for example), and reduces the flexibility of your process, it shows how we can make Drools Flow lightning fast (in specific circumstances) if necessary. And it is of course a good reference for showing what the limit is ;) This is again without persistence and transactions.
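The measurement loop described above can be sketched roughly as follows. This is not the actual test code: the `Runnable` is a hypothetical placeholder for "start one instance of the empty process", and the class name, method names and the warm-up pass are my own assumptions for illustration.

```java
import java.util.concurrent.TimeUnit;

class SequentialBenchmark {

    /** Runs the task `iterations` times in sequence and returns the total elapsed nanoseconds. */
    static long timeSequential(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    /** Converts a total in nanoseconds into an average in milliseconds per iteration. */
    static double averageMillis(long totalNanos, int iterations) {
        return totalNanos / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder: in a real test this would start one
        // instance of the empty process on the engine under test.
        Runnable startEmptyProcess = () -> { };
        int iterations = 10_000;

        // Assumption: one untimed warm-up pass so JIT compilation does not dominate.
        timeSequential(startEmptyProcess, iterations);

        long total = timeSequential(startEmptyProcess, iterations);
        System.out.printf("%d ms -> %.4f ms / process instance%n",
                TimeUnit.NANOSECONDS.toMillis(total), averageMillis(total, iterations));
    }
}
```

Swapping the placeholder for a call into the engine (with or without persistence) is what distinguishes the three settings; the loop itself stays the same.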

Results [using IBM ThinkPad T61 laptop running RHEL, Java 1.6]

A. Simple: 388ms -> 0.04ms / process instance
B. Persistence / transactions: 21.9s -> 2ms / process instance
C. Optimized Java: 126ms -> 0.01ms / process instance

If you're using the engine itself without any persistence or transactions (those are added as orthogonal layers, not part of the core itself), we think it's pretty fast :)

As you can see, there's a certain price you have to pay for adding persistence and transactions. But since simply opening a JPA session and persisting one object in a transaction takes about 1.5ms here as well (75% of the total time), we believe we probably do limit the additional overhead.

The optimized Java mode shows that, if you really need to, you can still get about a 3x performance increase (388ms vs. 126ms) by generating Java code from the process description. We hope to get this included in the code base at some point, and maybe even offer this functionality for certain parts of your process only.
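For reference, the per-instance figures and ratios can be re-derived from the measured totals with a few lines of arithmetic. The sketch below only uses the numbers reported in this post (10,000 instances, the three totals, and the roughly 1.5ms for a bare JPA persist); the class and method names are made up for illustration.

```java
class ResultArithmetic {

    static final int INSTANCES = 10_000;

    /** Average time per process instance, given the total for all instances. */
    static double perInstanceMillis(double totalMillis) {
        return totalMillis / INSTANCES;
    }

    /** How many times faster the second total is compared to the first. */
    static double speedup(double baselineMillis, double improvedMillis) {
        return baselineMillis / improvedMillis;
    }

    public static void main(String[] args) {
        double simple = perInstanceMillis(388);        // 0.0388 ms, reported as 0.04ms
        double persistent = perInstanceMillis(21_900); // 2.19 ms, reported as 2ms
        double optimized = perInstanceMillis(126);     // 0.0126 ms, reported as 0.01ms

        // Ratio from the unrounded totals is roughly 3.1; the rounded
        // per-instance figures (0.04 / 0.01) would suggest 4.
        double gain = speedup(388, 126);

        // Share of a bare JPA persist (about 1.5 ms) in the persistent run:
        // about 68% of the unrounded 2.19 ms, 75% of the rounded 2 ms.
        double jpaShare = 1.5 / persistent;

        System.out.printf("simple=%.4f persistent=%.2f optimized=%.4f%n",
                simple, persistent, optimized);
        System.out.printf("speedup=%.2f jpaShare=%.2f%n", gain, jpaShare);
    }
}
```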

If these numbers are insufficient, you can still look at executing commands in parallel (in this test they were all executed in sequence), using multiple sessions to split up the work, etc.
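As a rough illustration of splitting the work, the sketch below divides the instances over a fixed pool of worker threads, each with its own "session". Nothing here is Drools API: `sessionFactory` is a hypothetical stand-in that returns a `Runnable` starting one process instance on a session created for that worker, and whether a session could instead be shared between threads is something to check for your own setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

class ParallelRunner {

    /**
     * Starts `totalInstances` process instances spread over `workers` threads.
     * Each worker asks `sessionFactory` for its own (hypothetical) session handle.
     * Returns the number of instances that were started.
     */
    static int runInParallel(int totalInstances, int workers, Supplier<Runnable> sessionFactory) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                // Spread the remainder over the first few workers.
                int share = totalInstances / workers + (w < totalInstances % workers ? 1 : 0);
                results.add(pool.submit(() -> {
                    Runnable startProcess = sessionFactory.get(); // one session per worker
                    for (int i = 0; i < share; i++) {
                        startProcess.run();
                    }
                    return share;
                }));
            }
            int started = 0;
            for (Future<Integer> result : results) {
                started += result.get(); // waits for the worker to finish
            }
            return started;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Timing a call such as `ParallelRunner.runInParallel(10_000, 4, factory)` the same way as the sequential run would show how much of the per-instance cost can actually be overlapped; with persistence enabled the database is likely to become the limiting factor.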

For those who want to verify themselves, the actual test code can be found here.

1 comment:

  1. Good numbers, probably we can do a bigger example to put into the official documentation.
