This wiki page is a work in progress. If you know a good benchmark, please post it to the GP mailing list, or send it directly to us at gpbenchmarks@googlegroups.com, and we'll add a reference here. We are keen to link to implementations and data sets as well as papers. In the near future we will open this wiki page to public editing.

GECCO Dublin 2011 Meeting

A group of interested parties met in M. O'Brien's pub for lunch during GECCO 2011 in Dublin.

Participants: James McDermott, Una-May O'Reilly, Sean Luke, David White, Robin Harper, Leonardo Vanneschi, Sara Silva, Wojciech Jaskowski, Krzysztof Krawiec, Mauro Castelli, Luca Manzoni, Rui Lopes, Terry Soule.

We discussed some of the drawbacks of existing benchmarks, and listed some of the properties desirable in benchmarks (see above).

SL reiterated important issues which led to the initial GP mailing list discussion (again, see above).

SS pointed out that different benchmarks have different purposes and that such information should be included in the wiki. For example, some benchmarks are useful for comparing performance; others induce bloat and so are useful for comparing how methods respond to it.

KK said that for many problems it would be useful for the page to give not only a description of the problem, but also a complete set of links to work using the problem, and a statement of the purpose or key features of the problem that led the authors to create or use it. Such information would include the presence of noise, known solutions, and similar properties.

DW looked outside the field, to areas such as machine learning and systems, where it is common for a group of researchers to develop a suite of agreed benchmarks and publish them online with code and documentation, and eventually as a paper. One common feature is for each suite to be labelled with the year in which it is published (e.g. "GP-2011"). This creates a built-in social pressure against using old or obsolete benchmarks. See also Da Capo.

All agreed that our effort could lead towards a paper of this type.

DW & SL volunteered to take on items in the TODO list.

