Wednesday, March 25, 2020

Learn DMN in 15 minutes

Today we have a new announcement for new DMN users: the learn-dmn-in-15-minutes.com course!

DMN is already simple and easy to understand at first glance. However, new adopters generally want a quick overview of the most important parts before jumping into a more in-depth journey. That's the goal of this course!

Now newcomers can:

  • Learn DMN in 15 minutes
  • Quickly create a DMN model on dmn.new
  • Execute their first decision model on kogito.kie.org
  • Stay tuned for new content! 🤓



    Monday, March 16, 2020

    Business Modeler Preview Now Available

    (originally posted here)


    Today we have an exciting new announcement for business automation developers and users. The KIE group team is releasing a preview version of the DMN and BPMN editors online! Once again, kudos to everyone involved.

    This online experience is perfect for quickly getting access to the editors without any local setup. Users and developers can take advantage of it to get familiar with the BPMN and DMN standards, to sketch ideas, or even to create fully functional models.

    [Screenshot: online editors preview]

    Quick tour

    We’ve been experimenting with the idea of an online presence for a while. After a few iterations, we consider ourselves ready to break the news, and here is a quick tour of the available features.

    Samples

    If you’re new to our editors or not very familiar with BPMN and DMN, the "Try Sample" link will provide you with a real-world, fully functional example of each standard. You can change the sample and download your latest updates.






    The current online version of the editors doesn’t store the opened models anywhere, so all changes are only available in your local browser session. If you want to "save" your work, you’ll need to download it.

    Uploading Models

    If you have downloaded a work-in-progress model, you can upload it back to the online editor and get back to editing.





    Open from source code

    This mechanism allows users to open a model from an external source; an example of this would be raw git access to a model. Note that you can use the URL generated in your browser to share the model.




    Sharing links from GitHub

    If you have the latest GitHub extension installed and you find a BPMN or DMN model while browsing a GitHub repository, you’ll see an icon that opens the model in the online editor. You can also share the created link.





    What about new models? Glad you asked…​


    So far, I’ve covered multiple ways to use the editors with existing models, but what about creating new models? Of course, we have a pair of "Create new" buttons that redirect you to the editors.

    However, this is not exactly the best or most natural way to start a new model…​





    Introducing DMN.new and BPMN.new

    Today we’re also making publicly available the preview of DMN.new and BPMN.new!

    The .new domain is an initiative from Google for creating new digital assets online. Other .new domains include docs.new, sheets.new, slides.new, playlist.new, and many more. To learn more about the .new domains, check whats.new.

    There’s not much more to say about it, other than that you can now type DMN.new or BPMN.new into any browser’s URL bar and create new models without any additional steps! Here’s a quick video to show how simple it is.





    There’s more, much more…​


    This new generation of KIE group tooling keeps setting the bar higher, and we won’t stop here… we have more to come.


    Stay tuned!

    Tuesday, March 10, 2020

    Kogito, ergo Rules: From Knowledge To Service, Effortless

    Welcome to another episode of this blog series on the Kogito initiative and our efforts to bring Drools to the cloud. The goal of these posts is to gather early user feedback on the features we are delivering to Kogito.
    In this post we present two new ways to realize a complete intelligent service:
    1. self-contained rule services
    2. integrated intelligent workflows with rule tasks

    Units of Execution in Kogito

    As you may already know, in Kogito we are making the new Unit concept front and center.
    “Unit of execution” is the term we use to indicate an executable piece of knowledge. A unit may be a process, a set of rules, a decision, etc. In the case of a set of rules, we call it a rule unit. If you opt in to using units, Kogito will take care of all the boilerplate required to generate a REST endpoint automatically.
    A rule unit is constituted primarily of:
    1) a data definition;
    2) the set of rules and queries that implement the behavior of the unit (the rules of the rule engine);
    3) optionally, event listeners that may be attached for a number of purposes.
    In this post we’ll focus on data definitions, rules and queries.
    Data definitions are given by declaring a Java class that may contain data sources. Each data source represents a partition of the working memory that your rules will pattern match against or insert to.
    For instance, suppose you want to declare an alerting service that receives events and produces alerts depending on some conditions. We declare Event and Alert objects as follows:
    package com.acme;
    public class Event {
       String type;
       int value;
       // getters and setters
    }
    
    public class Alert {
      String severity;
      String message;
      // getters and setters
    }
    
    The AlertingService unit type declaration is a class that implements the interface RuleUnitData.
    package com.acme;
    public class AlertingService implements RuleUnitData {
       private final DataStream<Event> eventData = DataSource.createStream();
       private final DataStream<Alert> alertData = DataSource.createStream();
       // getters and setters
    }
    
    Rules are defined in DRL files as usual, except that you now have to indicate their unit at the top of the file. For instance, you may define the rules for the AlertingService unit as follows:
    package com.acme;
    unit AlertingService;
    rule IncomingEvent when
       // matches when a temperature of 30 °C or higher is registered (OOPath syntax)
       $e : /eventData [ type == "temperature", value >= 30 ]
    then
       System.out.println("incoming event: " + $e);
       alertData.append( new Alert( "warning", "Temperature is too high" ) );
    end
    
    As you can see, rules may match against or insert to the given data sources.
    Queries are defined in DRL files like rules, and belong to a unit, too. If you declare at least one query, you will get a REST endpoint automatically generated for free. For instance:
    query Warnings
       alerts: /alertData [ severity == "warning" ]
    end
    
    will generate the REST endpoint /warnings that you will be able to invoke by POST-ing to it as follows:
        $ curl -X POST \
               -H 'Accept: application/json' \
               -H 'Content-Type: application/json' \
               -d '{ "eventData": [ { "type": "temperature", "value" : 40 } ] }' \
               http://localhost:8080/warnings
    
    This will generate the response:
    [ { "severity": "warning", "message" : "Temperature is too high" } ]
    
    The Java-based data definition is very familiar to programmers but, based on early user feedback, we decided to provide two alternative methods to declare a rule unit. We are publishing this blog post to gather more user feedback!

    Type Declaration

    The type declaration is the DRL feature to declare Java-compatible types, in a Java-agnostic way. In the 7 series, users may declare types with the syntax:
    package com.acme;
    
    declare Event
       type:  String
       value: int
    end
    
    declare Alert
      severity: String
      message:  String
    end
    
    This makes the DRL completely self-contained: entities and rules may all be defined using DRL. However, type declarations have a few limitations; for instance, they do not support implementing interfaces or generic type fields. In other words, the following declaration is syntactically invalid in the 7 series:
    package com.acme;
    declare AlertingService extends RuleUnitData
       eventData: DataStream<Event>
       alertData: DataStream<Alert>
    end
    
    In version 0.8.0, we are lifting these limitations: we allow limited inheritance for interfaces (only one is allowed for now) and generic type declarations for fields. With these new features, the following piece of code becomes valid DRL.
    Long story short: you are now able to declare a full microservice from a single DRL.
    Bootstrap your Kogito service with the archetype:
          mvn archetype:generate \
             -DarchetypeGroupId=org.kie.kogito \
             -DarchetypeArtifactId=kogito-quarkus-archetype \
             -DarchetypeVersion=0.8.0 \
             -DgroupId=com.acme \
             -DartifactId=sample-kogito
    
    At the moment, no Quarkus version bundles Kogito 0.8.0; otherwise, you would be able to use mvn io.quarkus:quarkus-maven-plugin:create instead.
    Now, clear the contents of src/main and then drop this DRL into the src/main/resources/com/acme folder instead:
    package com.acme;
    unit AlertingService;
    
    import org.kie.kogito.rules.DataStream;
    import org.kie.kogito.rules.RuleUnitData;
    
    declare Event
       type:  String
       value: int
    end
    
    declare Alert
      severity: String
      message:  String
    end
    
    declare AlertingService extends RuleUnitData
       eventData: DataStream<Event>
       alertData: DataStream<Alert>
    end
    
    rule IncomingEvent when
       // matches when a temperature of 30 °C or higher is registered (OOPath syntax)
       $e : /eventData [ type == "temperature", value >= 30 ]
    then
       System.out.println("incoming event: " + $e);
       alertData.append( new Alert( "warning", "Temperature is too high: " + $e ) );
    end
    
    query Warnings
       alerts: /alertData [ severity == "warning" ]
    end
    
    Now fire up the Quarkus service in developer mode with:
        $ mvn compile quarkus:dev
    
    There you go, you are now ready to curl your service:
        $ curl -X POST \
               -H 'Accept: application/json' \
               -H 'Content-Type: application/json' \
               -d '{ "eventData": [ { "type": "temperature", "value" : 40 } ] }' \
               http://localhost:8080/warnings
    

    Workflow Integration

    Another way to expose a rule-based service is through a workflow.
    A workflow (sometimes called a “business process”) describes a sequence of steps in a diagram, and it usually declares variables: data holders for values that are manipulated during execution. The data type of such a variable may be anything: you may use Java classes but, in this example, we will again use our declared data types.
    package com.acme;
    
    declare Event
       type:  String
       value: int
    end
    
    declare Alert
      severity: String
      message:  String
    end
    
    Let us call this workflow com.acme.AlertingWorkflow, and declare the variables eventData and alertData:
    [workflow diagram]
    A workflow that includes a rule task may skip the rule unit data declaration altogether: in this case, the rule unit is inferred directly from the structure of the process, and each variable will be inserted into a data source of the same name.
    [workflow diagram]
    The name of the unit is declared by the process, using the syntax unit:com.acme.AlertingService. You are still free to explicitly declare the unit com.acme.AlertingService; in that case, the process will pick up the declaration that you have hand-coded.
    Note: You may have noticed that we are using the “Rule Flow Group” field. We will implement more explicit support in the UI in the future.
    Bootstrap your Kogito service with the archetype:
          mvn archetype:generate \
             -DarchetypeGroupId=org.kie.kogito \
             -DarchetypeArtifactId=kogito-quarkus-archetype \
             -DarchetypeVersion=0.8.0 \
             -DgroupId=com.acme \
             -DartifactId=sample-kogito
    
    Caveat. Support for this feature is experimental, so it may not work seamlessly with Quarkus hot code reload; we also need the following extra step to enable it, but this will change in the future.
    Update your pom.xml with the following plugin declaration:
      <build>
        <plugins>
          <plugin>
            <groupId>org.kie.kogito</groupId>
            <artifactId>kogito-maven-plugin</artifactId>
            <version>0.8.0</version>
            <executions>
              <execution>
                <goals>
                  <goal>generateDeclaredTypes</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
          ...
        </plugins>
       </build>
    
    You can now clear the contents of src/main, and then drop the process and the following DRL to src/main/resources/com/acme folder.
    package com.acme;
    unit AlertingService;
    
    import org.kie.kogito.rules.DataStream;
    import org.kie.kogito.rules.RuleUnitData; 
    
    declare Event
       type:  String
       value: int
    end
    
    declare Alert
      severity: String
      message:  String
    end
    
    rule IncomingEvent when
       // matches when a temperature of 30 °C or higher is registered (OOPath syntax)
       $e : /eventData [ type == "temperature", value >= 30 ]
    then
       System.out.println("incoming event: " + $e);
       alertData.set( new Alert( "warning", "Temperature is too high: " + $e ) );
    end
    
    As you may have noticed, you are not required to declare a query explicitly: the process will display the contents of the variables as a response; it will generate the endpoint /AlertingWorkflow, which accepts a POST request of the following form:
        $ curl -X POST \
               -H 'Accept: application/json' \
               -H 'Content-Type: application/json' \
               -d '{ "eventData": { "type": "temperature", "value" : 40 } }' \
               http://localhost:8080/AlertingWorkflow
    
    The reply will be:
    {
      "id": ...,
      "eventData": {
        "type": "temperature",
        "value": 100
      },
      "alertData": {
        "severity": "warning",
        "message": "Temperature is too high: Event( type=temperature, value=100 )"
      }
    }
    
    However, if you do declare a query, a separate endpoint will be available as well. For instance if you declare the query Warnings you will still be able to POST to http://localhost:8080/warnings and invoke the rule service separately as follows:
    $ curl -X POST \
           -H 'Accept: application/json' \
           -H 'Content-Type: application/json' \
           -d '{ "eventData": { "type": "temperature", "value" : 40 } }' \
           http://localhost:8080/warnings
    
    Notice that the request no longer contains a list of Events. This is because process variables are mapped to single values instead of DataStreams.
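    For illustration, here is a minimal hand-written sketch of what such an inferred unit could look like, using singleton-style data sources rather than streams (note the alertData.set(...) call in the rule above). The SingletonStore type and the DataSource.createSingleton() factory are assumptions on our part and may be named differently in the Kogito version you use; Event and Alert refer to the types declared in the DRL.
    package com.acme;
    
    import org.kie.kogito.rules.DataSource;
    import org.kie.kogito.rules.RuleUnitData;
    import org.kie.kogito.rules.SingletonStore; // assumed type name, see note above
    
    // Hand-written equivalent of the unit inferred from the process variables:
    // one singleton data source per process variable, with matching names.
    public class AlertingServiceSketch implements RuleUnitData {
    
       private final SingletonStore<Event> eventData = DataSource.createSingleton();
       private final SingletonStore<Alert> alertData = DataSource.createSingleton();
    
       public SingletonStore<Event> getEventData() {
          return eventData;
       }
    
       public SingletonStore<Alert> getAlertData() {
          return alertData;
       }
    }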

    Conclusion

    We have given a sneak peek at the work we are doing to improve the getting-started experience with rules and processes in Kogito. With these changes, we hope to have provided a more streamlined way to define knowledge-based services. Developers will always be able to be more explicit about the data they want to process by opting in to writing Java; but if they want, they can embrace a fully DSL-centric development workflow.
    For the lazy, examples are available at https://github.com/evacchi/kogito-rules-example/tree/master/code. Have fun!


    Thursday, February 20, 2020

    PMML revisited

    Hi folks! The beginning of this year brings with it the initiative to redesign the Drools PMML module.
    In this post I will describe how we are going to approach it, the current status, ideas for future development, and so on... so stay tuned!

    Background

    PMML is a standard whose aim is to "provide a way for analytic applications to describe and exchange predictive models produced by data mining and machine learning algorithms." The PMML standard defines a series of models to be managed, and we will refer to them as "Models".
    The maybe not-so-obvious consequence of this is that, said differently, PMML may be thought of as an orchestrator of different predictive models, each with different requirements.
    Drools has its own PMML implementation. Its original design was 100% drools-engine based, but in the long term this proved not so satisfactory for all the models, so a decision was taken to implement a new version with a different approach. And here the current story begins...

    Requirements

    Stripped to its bare-bones essence, what a PMML implementation should allow is to (see the sketch below):
    1. load a PMML file (XML format)
    2. submit input data to it
    3. return predicted values
    Sounds simple, doesn't it? 
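    As a purely hypothetical sketch (these names are illustrative, not the actual Drools PMML API), the three requirements above map naturally to an interface along these lines:
    import java.io.InputStream;
    import java.util.Map;
    
    // Hypothetical shape only: names and signatures are illustrative.
    public interface PmmlPredictionSketch {
    
       // 1. load a PMML file (XML format)
       void load(InputStream pmmlXml);
    
       // 2. submit input data and 3. return the predicted values
       Map<String, Object> evaluate(String modelName, Map<String, Object> inputData);
    }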

    Approach

    The proposed architecture aims at fulfilling the requirements in a modular way, following “Clean Architecture” principles.
    To achieve that, components are defined with clear boundaries and visibility.
    The general idea is that there are specific tasks strictly related to the core functionality that should be kept agnostic of other “outer” features.
    Anyone wanting to delve deeper into the matter may read the book "Clean Architecture" by R. C. Martin, but in essence it is just a matter of applying good ol' design principles to the overall architecture.
    With this target clearly defined, the steps required to achieve it are:
    1. identify the core-logic and the implementation details (model-specific)
    2. implement the core-logic inside "independent" modules
    3. write code for the model-specific modules
    We chose to implement a plugin pattern to bind the core logic to the model-specific implementations, mostly for two reasons:
    1. incremental development and overall code-management: the core module itself does not depend on any of the model-specific implementations, so the latter may be provided/updated/replaced incrementally without any impact on the core
    2. possibility to replace the provided implementation with a custom one
    3. we also foresee the possibility to choose an implementation at runtime, depending on the original PMML structure (e.g. it may make sense to use a different implementation depending on the size of the given PMML)
    (I cheated: those are three) 


    Models

    KiePMMLModel

    1. This is the definition of the Kie representation of the original PMML model.
    2. For every actual model there is a specific implementation, and it may be any kind of object (Java map, Drools rule, etc.).
    Could we avoid it? Maybe. We could use the model generated directly from the specification's XSD. But that has been designed to describe all the predictive models, while each of them may use it in a different way and with different conventions; so this internal view will represent exactly what is needed for each specific model.
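    To make this concrete, here is an illustrative sketch (not the real class hierarchy): a minimal common representation, with each concrete model free to carry whatever internals it needs.
    import java.util.Map;
    
    // Illustrative only: the actual KiePMMLModel classes may differ.
    abstract class KiePmmlModelSketch {
    
       private final String modelName;
    
       protected KiePmmlModelSketch(String modelName) {
          this.modelName = modelName;
       }
    
       public String getModelName() {
          return modelName;
       }
    }
    
    // e.g. a regression model could keep a plain map of coefficients...
    class RegressionModelSketch extends KiePmmlModelSketch {
    
       private final Map<String, Double> coefficients;
    
       RegressionModelSketch(String modelName, Map<String, Double> coefficients) {
          super(modelName);
          this.coefficients = coefficients;
       }
    }
    
    // ...while a tree model could carry generated DRL text instead.
    class TreeModelSketch extends KiePmmlModelSketch {
    
       private final String generatedDrl;
    
       TreeModelSketch(String modelName, String generatedDrl) {
          super(modelName);
          this.generatedDrl = generatedDrl;
       }
    }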

    Components

    We identified the following main functional components:
    1. Compiler
    2. Assembler
    3. Executor

    Compiler

    This component reads the original PMML file and translates it to our internal format.
    The core side of it simply unmarshalls the XML data into Java objects. Then, it uses the Java SPI to retrieve the model compiler specific to the given PMML model (if it does not find one, the PMML is simply ignored).
    Last, the retrieved model compiler will “translate” the original PMML model into our model-specific representation (KiePMMLModels).
    The core-side part of this component has no direct dependency on any specific Model Compiler implementation, nor on anything drools/kie related - so basically it is a lightweight/standalone library.
    This component may be invoked at runtime (i.e. during the execution of the customer project), if its execution is not time-consuming, or during the compilation of the kjar (e.g. for drools-implemented models).
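    To give an idea of the SPI-based lookup described above, here is a minimal sketch using the standard java.util.ServiceLoader; the provider interface and its methods are hypothetical names, not actual Drools PMML classes.
    import java.util.Optional;
    import java.util.ServiceLoader;
    
    public final class ModelCompilerLookup {
    
       // Hypothetical plugin contract, implemented by each model-specific module
       // and registered via META-INF/services.
       public interface ModelCompilerProvider {
          boolean supports(String pmmlModelName);        // e.g. "RegressionModel", "TreeModel"
          Object compile(Object unmarshalledPmmlModel);  // returns the Kie-side representation
       }
    
       // Returns the first provider that declares support for the given model,
       // or empty if none is found (in which case the PMML is simply ignored).
       public static Optional<ModelCompilerProvider> find(String pmmlModelName) {
          for (ModelCompilerProvider provider : ServiceLoader.load(ModelCompilerProvider.class)) {
             if (provider.supports(pmmlModelName)) {
                return Optional.of(provider);
             }
          }
          return Optional.empty();
       }
    
       private ModelCompilerLookup() {
       }
    }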

    Assembler

    This component stores the KiePMMLModels created by the Compiler inside the KIE knowledge base. None of the other components should have any dependency on or knowledge of this one.
    In turn, it must not have any dependency on, knowledge of, or reference to actual Model Compiler implementations.

    Executor

    This component is responsible for the actual execution of PMML models. It receives the PMML input data, retrieves the KiePMMLModel specific to the input data and calculates the output.
    For each model there will be a specific “executor”, to allow different kinds of execution implementation (drools, external library, etc.) depending on the model type.
    The core side of it simply receives the input data and retrieves the model executor specific to the given PMML model (if it does not find one, the PMML is simply ignored).
    Last, the retrieved model executor will evaluate the prediction based on the input data.
    The core-side part of this component has no direct dependency on any specific Model Executor implementation, but of course it is strictly dependent on the Drools runtime.

    [Diagram: Overall Architecture (CleanPMMLArchitecture)]


    Model implementations

    Drools-based models

    Some models will delegate to the drools engine to allow the best performance under heavy load. Here are some details about the general scheme for such implementations.
    1. the compiler is invoked at kjar generation (or at runtime for hot-loading of a PMML file)
    2. the compiler reads the PMML file and transforms it into a "descr" object (see BaseDescr, DescrFactory, DescrBuilderTest)
    3. regardless of how the model-compiler is invoked, the drools compiler must be invoked soon after it to have Java classes generated based on the descr object
    4. the assembler puts the generated classes in the kie base
    5. the executor loads the generated "drools model" and invokes it with the input parameters

    DRL details

    • for each field in the DataDictionary, a specific DataType has to be defined
    • for each branch/leaf of the tree, a full-path rule has to be generated (i.e. a rule with the path to get to it - e.g. "sunny", "sunny_temperature", "sunny_temperature_humidity")
    • a "status-holder" object is created and contains the value of the rule fired - changing that value will fire the children branch/leaf rules matching it (e.g. the rule "sunny" will fire "sunny_temperature" that - in turns - will fire "sunny_temperature_humidity")
    • such "status-holder" may contain informations/partial result of evaluation, to be eventually used where combination of results is needed
    • missing value strategy may be implemented inside the status holder or as exploded rules
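    As mentioned in the list above, here is an illustrative Java sketch of such a status holder; class and method names are hypothetical and only meant to show the idea of tracking the fired-rule path and partial results.
    import java.util.HashMap;
    import java.util.Map;
    
    // Hypothetical sketch: rules set the identifier of the node they reached
    // (e.g. "sunny", then "sunny_temperature"), and child rules match on it.
    public class TreeStatusHolder {
    
       private String status = "";
       private final Map<String, Object> partialResults = new HashMap<>();
    
       public String getStatus() {
          return status;
       }
    
       // Changing the status is what lets the next branch/leaf rule match and fire.
       public void setStatus(String status) {
          this.status = status;
       }
    
       // Partial results collected along the path, for cases where results
       // need to be combined (e.g. missing value strategies).
       public void addPartialResult(String key, Object value) {
          partialResults.put(key, value);
       }
    
       public Map<String, Object> getPartialResults() {
          return partialResults;
       }
    }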

    Testing

    For each model there will be a set of standard unit tests, mostly to verify individual units of code. Besides that, inside the model-specific module (yes, it is a tongue twister) there will be an integration-test submodule. The latter will verify the overall correct execution of different, more or less complex, PMML files, to simulate as much as possible what may happen in real-world scenarios.

    Regression

    The Regression model is the first one to have been implemented. Due to its inherent simplicity, we chose to provide a pure Java-based implementation for it. For the time being it is still under PR, and new full tests are being added.

    Tree

    After evaluating all the pros and cons, we decided that this model could be a good candidate to be implemented with a drools-based approach. Being also a simple model to follow, we chose to use it as the first test of the drools approach.

    TO-DOs

    This is a list of missing features that are not implemented yet and are not strictly related to a specific model. It will be (well, it should be) updated during development:

    Needless to say, any comments (especially nice ones) and suggestions will be greatly appreciated.

    Come back in the following days and see what's next! 
    Bye!


    Monday, February 03, 2020

    KIE Decision Tooling blog

    KIE Decision Tooling is the team responsible for building web editors to support business decisions, and now it has a blog.

    We're still cross-posting feature releases here, but you can also find specific content there regarding the technologies that orbit the web tooling.

    In our first post, we're presenting the new code completion feature in the DMN editor; check it out: https://medium.com/kie-decision-tooling/feel-functions-and-the-dmn-editor-7f4462f9f012

    Follow the RSS here: https://medium.com/feed/kie-decision-tooling.

    Stay tuned for the next posts! :-)



    Monday, August 12, 2019

    Recent Drools DMN open source engine performance improvements

    We are always looking to improve the performance of the Drools DMN open source engine. We have recently reviewed a DMN use-case where the actual input population of Input Data nodes varied to some degree; this highlighted a suboptimal behavior of the engine, which we improved in recent releases. I would like to share our findings!

    Benchmark development


    As we started running a supporting benchmark for this use-case, especially when investigating the scenario of large DMN models with sparsely populated input data nodes, we noticed some strange results: the flamegraph data highlighted a substantial performance hit when logging messages, consuming very significant time in comparison to the application logic itself.


    This flamegraph highlights specifically that a large portion of time is consumed by stack trace synthesis, artificially induced by the logging framework. The correction, in this case, was to tune the logging configuration to avoid this problem; specifically, we disabled a feature of the logging framework which is very convenient during debugging activities, as it enables quickly locating the original calling class and method: unfortunately, this feature comes at the expense of synthesizing stack traces, which originally contaminated the benchmark results. Lesson learned here: always check first whether non-functional requirements are actually masking the real issue!

    This was a necessary and propaedeutic step before proceeding to investigate the use-case in more detail.


    Improving performance


    Moving on and focusing now on DMN optimizations, we specifically developed a benchmark that is general enough but also highlights the use-case that was presented to us. This benchmark consists of a DMN model with many (500) decision nodes to be evaluated. Another parameter controls the sparseness of input data node population for evaluation, ranging from a value of 1, where all inputs are populated, to 2, where only one out of two inputs is actually populated, etc.

    This specific benchmark proved to be a very instrumental tool to highlight some potential improvements. 

    Setting the comparison baseline to Drools release 7.23.0.Final, the first optimization, implemented with DROOLS-4204, focused on improving context handling while evaluating FEEL expressions and demonstrated a ~3x improvement, while a further optimization, implemented with DROOLS-4266 and focusing on a specific case for decision table input clauses, demonstrated an additional ~2x improvement on top of DROOLS-4204.

    We also collected these measurements in the following graphs.


    This graph highlights the compounding improvements in the case of sparseness factor equal to 1, where all inputs are populated; this was a very important result, as in fact it did represent the main, “happy path” scenario in the original use-case.

    In other words, we achieved a ~6x improvement in comparison to running the same use-case on 7.23.0.Final. The lesson I learned here is to always strive for these kinds of compounding improvements when possible, as they really build on top of each other for greater results!

    For completeness, we repeated the analysis with a sparseness factor equal to 2 (1 in every 2 inputs is actually populated) and 50 (1 in every 50 inputs is actually populated), with the following measurements:



    Results show that the optimizations were also significant for a sparseness factor equal to 2, but the improvements become less relevant as this factor grows -- which is expected, as the impact of the decision node evaluations on the overall execution becomes less relevant.

    For completeness, the analysis was also performed with another, already existing benchmark for a single decision table consisting of many rule rows:


    Results show that these code changes, considered as a whole, still offered a relevant improvement, although clearly not of the same magnitude as for the original use-case. This was another important check to ensure that these improvements were not overfitting the specific use-case.

    Conclusions


    Considering Drools release 7.23.0.Final as the baseline, and a reference benchmark consisting of a DMN model with many decision nodes to be evaluated, we implemented several optimizations that, once combined, demonstrated a total of ~6x speed-up on that specific use case!

    I hope this was an interesting post highlighting some of the dimensions to look into to achieve better performance; let us know your thoughts and feedback.

    You can already benefit today from these Kie DMN open source engine improvements in the most recent releases of Drools! 


