Sunday, June 06, 2010

Planner benchmarker: summary bar chart

Drools Planner's benchmarker lets you run different Drools Planner configurations on different data sets and compare the results.
But comparing results in an XML file is a pain, so as of the next release it also outputs them as a bar chart in a summary.png file (generated with JFreeChart).

I ran a benchmark on the new, unfinished nurse rostering example, for 4 different configurations on 10 datasets in the medium track. Here's the result. Disclaimer: these results are not yet my final submission for the competition, which ends in 2 weeks.



At the top of the chart, a note says "higher is better". But because most planning problems have negative scores, the best results have the highest score and therefore the smallest bars. Do you think that the "higher is better" note is helpful, or do you think it is confusing?
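To make the negative-score point concrete, here is a minimal sketch in plain Java. It is not the benchmarker's actual code: the class name, the method names, the configuration names and the scores are all made up for illustration. It only shows that the configuration with the highest average score is the one whose bar would be the shortest when every score is negative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not part of Drools Planner.
public class HigherIsBetterSketch {

    /** Average of the given scores (assumes a non-empty array). */
    public static double average(int[] scores) {
        double sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return sum / scores.length;
    }

    /** Name of the configuration with the highest average score. */
    public static String bestConfiguration(Map<String, int[]> scoresPerConfiguration) {
        String best = null;
        double bestAverage = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, int[]> entry : scoresPerConfiguration.entrySet()) {
            double avg = average(entry.getValue());
            if (avg > bestAverage) {
                bestAverage = avg;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up scores: one array per configuration, one entry per data set.
        Map<String, int[]> scores = new LinkedHashMap<>();
        scores.put("configA", new int[] {-120, -80});
        scores.put("configB", new int[] {-40, -60});
        scores.put("configC", new int[] {-300, -100});

        String best = bestConfiguration(scores);
        for (Map.Entry<String, int[]> entry : scores.entrySet()) {
            double avg = average(entry.getValue());
            // The drawn bar length is the distance from the 0 line.
            System.out.println(entry.getKey()
                    + " average=" + avg
                    + " barLength=" + Math.abs(avg)
                    + (entry.getKey().equals(best) ? " <- best" : ""));
        }
    }
}
```

With these made-up numbers, configB has the highest average (-50.0) and also the shortest bar (50.0), which is exactly the case where "higher is better" reads oddly.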

5 comments:

  1. My first reaction when I saw the graph was: what does "higher is better" mean? I think it is too easy for people to think of height as an absolute value (e.g. as a length) and so get confused.

    You could say "closest to top is better" but I'm not sure if that is much better.

  2. Maybe I should leave out the note entirely? Or name it "higher score is better"?

  3. Hi Geoffrey

    I like the idea of the graph.

    "Higher score is better" is a good idea.

    Not sure if I would give this to an end user, anyway.

    Torsten

  4. "Higher is better" confused me until I realized those were negative scores. Either note that the scores are negative or continue the chart beyond the 0 line? Not sure, but it definitely confused me.

  5. It's not for the end user, but for the domain-specific planning software developer (so the user of Drools Planner).

    @OsoRojo Most planning problems have negative scores, but Drools Planner supports positive scores too. I want to support both. I agree it's confusing. For now I added a clear "(winner)" tag to the configuration with the best average.
