Drools Planner's benchmarker allows you to run different Drools Planner configurations for different data sets and compare the results.
But comparing results in an XML file is a pain, so as of the next release the benchmarker also writes them as a bar chart to a summary.png file (generated with JFreeChart).
I ran a benchmark on the new, unfinished nurse rostering example, with 4 different configurations on 10 data sets from the medium track. Here's the result. Disclaimer: these results are not yet my final submission for the competition, which ends in 2 weeks.
At the top of the chart, a note says "higher is better". But because most planning problems have negative scores, the best results (those with the highest score) have the smallest bars. Do you think that the "higher is better" note is helpful or do you think it is confusing?
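To make the bar-versus-score point concrete, here is a minimal sketch in plain Java. It does not use the Drools Planner or JFreeChart APIs; the class name `ScoreBarSketch`, its methods, and the three scores are all made up for illustration and are not the benchmark results. It assumes each bar is drawn from the zero axis, so its length is the absolute value of the score.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ScoreBarSketch {

    // Returns the name of the configuration with the highest (best) score.
    public static String bestConfig(Map<String, Integer> scores) {
        String best = null;
        for (Map.Entry<String, Integer> entry : scores.entrySet()) {
            if (best == null || entry.getValue() > scores.get(best)) {
                best = entry.getKey();
            }
        }
        return best;
    }

    // Assumption: a bar starts at the zero axis, so its length is |score|.
    public static int barLength(int score) {
        return Math.abs(score);
    }

    public static void main(String[] args) {
        // Hypothetical scores, not taken from the nurse rostering benchmark.
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("configA", -350);
        scores.put("configB", -120);
        scores.put("configC", -480);

        for (Map.Entry<String, Integer> entry : scores.entrySet()) {
            System.out.println(entry.getKey()
                    + " score=" + entry.getValue()
                    + " barLength=" + barLength(entry.getValue()));
        }
        // configB has the highest score and therefore the shortest bar.
        System.out.println("best: " + bestConfig(scores));
    }
}
```

With these made-up numbers, the highest score (-120) produces the shortest bar (120), which is exactly why "higher is better" can read as backwards on a chart of negative scores.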