- Add hard symbolic regression problems to ECJ

ECJ is the closest thing we have to a de facto standard set of benchmarks. Because it already includes a great deal of pre-written infrastructure, adding new symbolic regression problems is not too hard. **Status: almost done [Sean Luke and David White]**

- Add some constructed problems to ECJ

**Status: almost done [James McDermott]**

- Choose a suite (10?) of symbolic regression problems

It has been pointed out that some of these hard symbolic regression problems are rendered trivial by the choice of language. We should avoid those problems, so that the problem statement does not need to mandate a particular language. We should also aim to include extrapolative, multi-dimensional problems.
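As a rough illustration of what "extrapolative, multi-dimensional" could mean in practice, the sketch below builds training cases from one region of a two-dimensional input space and test cases from a disjoint region outside it. The target function, the sampling ranges and all function names are hypothetical placeholders chosen for this example. They are not proposed benchmarks and not part of ECJ.

```python
import random


def target(x, y):
    """Hypothetical two-variable target function, for illustration only."""
    return x * x * y + 0.5 * y


def sample_points(n, low, high, rng):
    """Draw n points uniformly from the square [low, high] x [low, high]."""
    return [(rng.uniform(low, high), rng.uniform(low, high)) for _ in range(n)]


def make_split(n_train=50, n_test=50, seed=0):
    """Return (train, test) lists of ((x, y), output) cases.

    Training inputs come from [-1, 1]^2 and test inputs from [1, 2]^2
    (assumed ranges), so every test case lies outside the training region.
    """
    rng = random.Random(seed)
    train_inputs = sample_points(n_train, -1.0, 1.0, rng)
    test_inputs = sample_points(n_test, 1.0, 2.0, rng)
    train = [(p, target(*p)) for p in train_inputs]
    test = [(p, target(*p)) for p in test_inputs]
    return train, test


def mean_squared_error(candidate, cases):
    """Mean squared error of a candidate function over a list of cases."""
    return sum((candidate(*p) - out) ** 2 for p, out in cases) / len(cases)
```

A candidate that only fits the training region well would typically score much worse on the test cases than on the training cases, which is the behaviour an extrapolative benchmark is meant to expose.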

- Choose a symbolic classification problem

- Add symbolic classification problems to ECJ

Adding facilities for general classification problems might also be useful. Sean suggests using ECJ's boolean problems as the basis for this. The work has two parts:

1. Adding facilities for importing data, as opposed to synthesizing training and test data from multiplexer and similar functions.
2. Calculating fitness from the counts of true positives, true negatives, false positives and false negatives, perhaps with weighting towards the difficult class, etc., rather than just summing true positives and true negatives.
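The two parts described above can be sketched in a language-neutral way. The following is a minimal Python illustration, not ECJ code: the CSV layout, the `label` column name, the function names and the particular weighting scheme (a weighted average of per-class recall) are all assumptions made for this example. It shows data being imported rather than synthesized, and a fitness that does not reward a classifier for simply ignoring a rare class.

```python
import csv
import io
from collections import Counter


def load_dataset(text, label_column="label"):
    """Parse CSV text into (features, label) pairs; labels are 0 or 1.

    The column layout is an assumed format for this sketch.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        label = int(record.pop(label_column))
        features = {name: float(value) for name, value in record.items()}
        rows.append((features, label))
    return rows


def confusion_counts(predictions, labels):
    """Count true/false positives and negatives for binary labels."""
    counts = Counter()
    for predicted, actual in zip(predictions, labels):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["tn" if actual == 0 else "fn"] += 1
    return counts


def accuracy(counts):
    """Plain accuracy: (TP + TN) / total."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


def weighted_fitness(counts, positive_weight=0.5):
    """Weighted average of the recall on each class (one possible scheme)."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    tpr = counts["tp"] / positives if positives else 0.0
    tnr = counts["tn"] / negatives if negatives else 0.0
    return positive_weight * tpr + (1.0 - positive_weight) * tnr


# One positive case among ten; a classifier that always predicts 0.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
predictions = [0] * 10
counts = confusion_counts(predictions, labels)
print(accuracy(counts))          # 0.9
print(weighted_fitness(counts))  # 0.5
```

In this example the always-negative classifier looks good under plain accuracy but scores no better than chance under the weighted measure. Raising `positive_weight` above 0.5 would push the fitness further towards the difficult class.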

- Choose a non-numeric problem

Perhaps a game?

- Finalise "recommended benchmarks" and call for comments

The initial aim is to encourage discussion, criticism and refinement, and then hopefully acceptance, of the suggested benchmarks. In the longer term, the aim is to converge on a small set of existing problems rather than to propose new ones.

- Write a paper

**Done**

- Make a website for the project, and open a wiki page for editing

**Done**

- Establish a competition?

Competitions exist for several games, and there is the GECCO industrial challenge. In such competitions, people compete to do the best they can and are free to use any language, any representation, etc. The problems are hard enough that no one solves them exactly. Perhaps our symbolic regression and symbolic classification benchmarks could form the basis of a competition.