Making Hard(er) Benchmark Test Functions

Dante Niewenhuis¹ᵃ and Daan van den Berg²ᵇ

¹ Master Artificial Intelligence, Informatics Institute, University of Amsterdam, The Netherlands
² Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands
ᵃ https://orcid.org/0000-0002-9114-1364
ᵇ https://orcid.org/0000-0001-5060-3342

Keywords: Evolutionary Algorithms, Continuous Problems, Benchmarking, Instance Hardness.
Abstract: This paper is an exploration into the hardness and evolvability of default benchmark test functions. Some very well-known traditional two-dimensional continuous benchmark test functions are evolutionarily modified to challenge the performance of the plant propagation algorithm (PPA), a crossoverless evolutionary method. For each traditional benchmark function, only its scalar constant parameters are mutated, but the effect on PPA's performance is nonetheless enormous, measured both in objective deficiency and in success rate. A traditional benchmark function's hardness can thereby indeed be evolutionarily increased, and an especially interesting observation is that the evolutionary processes seem to follow one of three specific patterns: global minimum narrowing, increase in ruggedness, or concave-to-convex inversion.
1 BENCHMARK TEST FUNCTIONS
Sure there's free lunch. Depth-first branch and bound outperforms an exhaustive search on the entirety of traveling salesman problem instances, when paid in recursions. But when it comes to iterative (stochastic) algorithms, which may or may not "have drawn inspiration from optimization that occur in nature", Wolpert & Macready's No Free Lunch Theorem states that "the average performance of any pair of algorithms across all possible problems is identical" (Wolpert and Macready, 1997). This is a very strong result, and may even be somewhat disheartening to those hoping to find a general-purpose optimization algorithm (Brownlee et al., 2007).
So to what extent are Wolpert & Macready (in)directly responsible for the popularity of benchmarking practices in iterative optimization? A few years ago, no fewer than 17 leading scientists coauthored a paper on benchmarking evolutionary algorithms. "Benchmarking in Optimization: Best Practice and Open Issues" is still actively under development (currently in its third version), showing that the subject's debate is alive (Bartz-Beielstein et al., 2020). It affirms that the community's focus has shifted from 'finding the best optimization algorithm' in general to 'finding the best optimization algorithm for a certain problem (instance)' (Rice, 1976; Kerschke and Trautmann, 2019). This could be seen as a direct or indirect partial consequence of the free lunch theorem.
Benchmark sets come in many forms for various problems, and appear to be selected mostly on the basis of reputation. One such example is Reinelt's TSPLIB, a benchmark set for the traveling salesman problem (Reinelt, 1991). Not particularly suitable for complexity tests, but valuable in its practicality, the set contains 144 problem instances of Euclidean TSP, mostly real-world maps, but also 'real' drilling plans. Dating from 1991, such an early benchmark set has allowed many scientists to test their algorithms on the same problem instances for quite some time. Other examples include cutting and packing problem instances (Iori et al., 2021a; Iori et al., 2021b; Iori et al., 2021c; Braam and van den Berg, 2022), the job shop scheduling problem (Weise et al., 2021), pseudo-Boolean optimization (Doerr et al., 2020) and the W-model problem (Weise and Wu, 2018a; Weise and Wu, 2018b).
These examples are all benchmark sets for 'discrete' optimization problems, usually meaning the number of solutions is finite, even though the objective function can still be continuous. Contrarily, in the class of continuous optimization problems, the variables are instantiated from a numeric range (such as $-4 \le x_1, x_2 \le 5$), hence there is an infinite number of solutions.
Figure 1: The six continuous benchmark test functions used in this paper. From these 'initial conditions', scalar constants are evolutionarily modified to increase hardness.
Although not strictly required, continuous benchmark functions are usually defined using continuous functions such as $x^2$, $\sin(x)$ or $\sqrt{x}$, making the resulting objective landscapes continuous as well, and often derivatively continuous too. Many continuous functions were created long ago, and therefore some of these too gained a considerable reputation. Household names like the Schwefel, Rosenbrock and Rastrigin functions have been adopted for benchmarking evolutionary algorithms quite often (Pohlheim, 2007; Suganthan et al., 2005; Socha and Dorigo, 2008; Laguna and Marti, 2005).
In this paper, we will evolve from six default benchmark functions: Branin, Easom, Goldstein-Price, Martin-Gaddy, Mishra04 and Six-Hump Camel, five of which are present in all earlier benchmark studies on PPA. The sixth, Mishra04, was added for spatial reasons: there was simply room for one more. Evolving in this sense means: tweak their numerical parameters, in a hillClimberly fashion, to make them as hard as possible for the plant propagation algorithm, a crossoverless evolutionary method. The title of this paper is therefore actually somewhat questionable; a better description would be that we are "evolving the objective function landscapes from well-known benchmark functions by modifying their scalar constants, resulting in harder landscapes with new functional descriptions". Since

"benchmark test functions behave notoriously fickle across the body of literature. Functional descriptions, domain ranges, vertical scaling, initialization values and even for the exact spelling of a function's name, a multitude of alternatives can be found. It is therefore wise to use explicit definitions." (Vrielink and van den Berg, 2019)

we will stick to Vrielink's recommendation to explicitly list definitions for functions, which for this study can be found in Table 1 (also see Figure 1). For all other resources, including source code, parameterized functions, full results and extras, the reader is suggested to visit our public online repository (Anonymous, 2022).
2 RELATED WORK
Abiding by the line of thought following the No Free Lunch Theorem, one could say we are simply evolving hard problem instances for a solving algorithm, a concept that is not entirely new. A similar endeavour in the discrete domain has been conducted by Joeri Sleegers, who evolved very hard instances of the Hamiltonian cycle problem (Sleegers and van den Berg, 2020a; Sleegers and van den Berg, 2020b), finding them in very unusual places (van Horn et al., 2018; Sleegers and Berg, 2021). For the Hamiltonian cycle problem, the instance space is discrete: there is a finite (but very large) number of (V, E) problem instances for graphs of V vertices and E edges. The objective space is also discrete, as any one graph can require between 0 and at most V! recursions when checking for Hamiltonicity.
Jano van Hemert has evolved hard instances for the Euclidean traveling salesman problem (van Hemert, 2006; Smith-Miles et al., 2010). In Van Hemert's setting, cities lie on a grid, and therefore the instance space is similar to the Hamiltonian cycle problem's: discrete, finite, but very large. In principle however, it could have been continuous too, had Van Hemert opted to use a continuous Euclidean plane instead of a grid. The difference to our work is not the solving algorithm, which is also a heuristic, but that Van Hemert knows the global minimum of evolved instances from running an exact algorithm: Applegate's Concorde (Applegate et al., 1999). Unlike us, Van Hemert uses the success rate as his primary assessment measure. But particularly interesting is Van Hemert's viewpoint on his own experiment, which he describes as 'stress testing the solving algorithms'. Van Hemert and Sleegers both underline that the number of evolutionary generations is severely limited, because the problem instances are maximized for their hardness, and therefore evaluations take more and more time as they push toward the problem's NP-complete bound.
Through a different entry point, Gallagher & Yuan did some facilitating work on tuning fully continuous fitness landscapes. Their tuneable landscape generator takes few parameter settings, and can generate multimodal landscapes (especially see Figures 1 and 2 in their paper) (Gallagher and Yuan, 2006). Interestingly, these authors also suggest its application for gaining new insights into the algorithms that navigate these landscapes. These (or similar) efforts might have been taken to experiment by Lou et al., who propose a framework that evolves uniquely hard or uniquely easy problem instances using a tuneable benchmark generator (Lou et al., 2018). Although they supply results, they do not supply the exact definition of the objective functions.
We haven't found any efforts that attempt to directly manipulate the formulae for objective landscapes¹, but those initiatives might very well exist.

¹ Often called 'fitness landscapes', the term is slightly misleading because not all algorithms take the function's objective value directly as fitness; PPA is in fact one such example. For this reason, we will stick to 'objective landscapes'.
For PPA, evolving benchmark functions might make an interesting case study, because so many of its previous studies were thoughtfully run on the same benchmark suite, enabling us to draw longer lines on the canvas of scientific progress. Salhi and Fraga first showed the effectiveness of PPA on the suite, while a followup by Wouter Vrielink demonstrated that the algorithm is insensitive to the position of the global minimum on the (hyper)plane (Vrielink and van den Berg, 2019). Later, Vrielink² demonstrated that on the exact same suite, parameter control of PPA's fitness function can sharply improve its performance (Vrielink and van den Berg, 2021b; Vrielink and van den Berg, 2021a). Marleen de Jonge demonstrated PPA's parameter insensitivity through offspring numbers and population size on the same suite, but also found some remarkable dimensional effects. For some functions, the parameter sensitivity increased sublinearly, for others superlinearly. But for a third category, the sensitivity initially increased, but plummeted for dimensionalities over 11 (de Jonge and van den Berg, 2020; De Jonge and van den Berg, 2020). For the Euclidean traveling salesman problem however, performance is not parameter independent in PPA (Koppenhol et al., 2022). A comparison of survivor selection methods on the seminal benchmark suite was done by Nielis Brouwer, who showed that elitist tournament selection is best for these functions (Brouwer et al., 2022).
An open question however, is whether this default benchmark suite is actually hard to begin with. Are harder benchmark functions possible for PPA? How much harder can they be? What do they look like? In this study, we'll make a first attempt at evolving such functions, and extend the already long lines of scientific progress for this algorithm. Whether these lines actually sketch the refined image of an algorithm, or rather the detailed portraits of the benchmark function suite used in the experiments, remains to be seen. Maybe both.
3 PLANT PROPAGATION ALGORITHM
Just over a decade old, the plant propagation algorithm (PPA) is still a relative newcomer in the realm of metaheuristic optimization methods. Its central principle is that fitter individuals in its population produce more offspring with small mutations, whereas unfitter individuals produce fewer offspring with larger mutations.
² Notoriously fickle.
Table 1: The 'Default' column shows the benchmark formulas as they appear in literature. In the 'Evolved' column are the benchmark functions with mutated scalars, which are significantly harder for PPA, the evolutionary algorithm of choice.

Branin
  Default: $(x_2 - \frac{5.1}{4\pi^2}x_1^2 + \frac{5}{\pi}x_1 - 6)^2 + 10(1 - \frac{1}{8\pi})\cos(x_1) + 10$
  Evolved: $0.084(x_2 - 0.114(x_1 + 1.049)^2 + 1.261x_1 - 6)^2 + 10\cos(8.916x_1 + 0.123) + 70.495$

Easom
  Default: $-\cos(x_1)\cos(x_2)\,e^{-(x_1 - \pi)^2 - (x_2 - \pi)^2}$
  Evolved: $-\cos(x_1)\cos(x_2)\exp\left[-\left|10(x_1 + 0.282)^2 + 10(x_2 + 10)^2\right|\right] + 10$

GoldsteinPrice
  Default: $\left[1 + (x_1 + x_2 + 1)^2(19 - 14x_1 - 14x_2 + 3x_1^2 + 3x_2^2 + 6x_1x_2)\right] \times \left[30 + (2x_1 - 3x_2)^2(18 - 32x_1 + 48x_2 + 27x_2^2 + 12x_1^2 - 36x_1x_2)\right]$
  Evolved: $\left[5.545\left(6.191 + (9.47x_1 + x_2 - 1.106)^2(19 - 14x_1 - 14x_2 - 6.48x_1^2 + 3x_2^2 + 6x_1x_2)\right)\right] \times \left[59.504 + (2x_1 - 3x_2)^2(18 - 15.341x_1 + 48x_2 - 14.88x_2^2 + 12(x_1 + 0.748)^2 - 100x_1x_2)\right]$

MartinGaddy
  Default: $(x_1 - x_2)^2 + \left(\frac{x_1 + x_2 - 10}{3}\right)^2$
  Evolved: $1.132(x_1 - 0.476x_2 - 4.036)^2 - 4.506(x_1 - 0.913x_2 - 0.8)^2 + 9.644$

Mishra4
  Default: $\sqrt{\left|\sin\left(\sqrt{|x_1^2 + x_2^2|}\right)\right|} + 0.01(x_1 + x_2)$
  Evolved: $\sqrt{\left|\sin\left(\sqrt{|4.789(x_1 + 5.968)^2 + 10x_2^2 - 0.598|}\right)\right|} + 0.005(x_1 + x_2) - 10$

SixHump
  Default: $(4 - 2.1x_1^2 + \frac{x_1^4}{3})x_1^2 + x_1x_2 + (4x_2^2 - 4)x_2^2$
  Evolved: $0.33x_1^6 - 2.45x_1^4 + 4.133(x_1 - 0.114)^2 + 8.8x_2^4 - 7.686(x_2 + 0.019)^2 + 1.505x_1x_2 - 10$
Since its introduction by Abdellah Salhi and Eric Fraga (Salhi and Fraga, 2011), the paradigm has seen a number of applications (Sulaiman et al., 2018; Sleegers and van den Berg, 2020a; Sleegers and van den Berg, 2020b; Vrielink and van den Berg, 2019; Fraga, 2019; Rodman et al., 2018), as well as some spinoffs (Sulaiman et al., 2016; Paauw and Van den Berg, 2019; Dijkzeul et al., 2022; Selamoğlu and Salhi, 2016; Haddadi, 2020; Geleijn et al., 2019). In this paper, we'll use its seminal form, which iterates through the following routine (a minimal code sketch follows the list):
1. Initialize popSize individuals on the problem's domain with a uniform distribution between the bounds.

2. Normalize each individual's objective value $f(x_i)$ to the interval $[0, 1]$ as $z(x_i) = \frac{f(x_{max}) - f(x_i)}{f(x_{max}) - f(x_{min})}$. Here, $f(x_{max})$ and $f(x_{min})$ are the largest and smallest objective values in the population.

3. Assign fitnesses to individuals $x_i$ as $F(x_i) = \frac{1}{2}(\tanh(4 \cdot z(x_i) - 2) + 1)$.

4. Assign the number of offspring for each individual $x_i$ as $n(x_i) = \lceil n_{max} F(x_i) r \rceil$, where $r$ is a random value in $[0, 1)$ and $n_{max}$ is a parameter determining the number of offspring a population produces.

5. The mutation on dimension $j$ is $(b_j - a_j)\,d_j(x_i)$, with $d_j(x_i) = 2(r - 0.5)(1 - F(x_i))$, in which $r$ is a random number in $[0, 1)$ and $b_j$ and $a_j$ are the upper and lower bounds of the $j$th dimension. Any individual exceeding a dimension's maximum or minimum bound is corrected back to that bound.

6. The popSize best individuals are selected for the next generation. If the predetermined number of evaluations is not met, go back to step 2.
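As a concrete reading of the six steps above, the following minimal Python sketch implements the routine. All names (`ppa`, `objective`, `bounds`) are ours, and details such as re-evaluating survivors each generation are simplifications under our assumptions, not the authors' exact implementation.

```python
import math
import random

def ppa(objective, bounds, pop_size=30, n_max=5, max_evals=2500):
    """Minimal PPA sketch; `objective` maps a coordinate list to a value,
    `bounds` is a list of (a_j, b_j) pairs, one per dimension."""
    # Step 1: uniform random initialization between the bounds.
    pop = [[random.uniform(a, b) for a, b in bounds] for _ in range(pop_size)]
    best_val, best_x, evals = float("inf"), None, 0
    while evals < max_evals:
        scored = [(objective(x), x) for x in pop]  # simplified: no caching
        evals += len(pop)
        for v, x in scored:
            if v < best_val:
                best_val, best_x = v, x
        f_min = min(v for v, _ in scored)
        f_max = max(v for v, _ in scored)
        children = []
        for f_x, x in scored:
            # Step 2: normalize the objective value to [0, 1].
            z = (f_max - f_x) / (f_max - f_min) if f_max > f_min else 0.5
            # Step 3: squash through a tanh to obtain fitness.
            F = 0.5 * (math.tanh(4.0 * z - 2.0) + 1.0)
            # Step 4: fitter individuals produce more offspring.
            for _ in range(math.ceil(n_max * F * random.random())):
                child = []
                for j, (a, b) in enumerate(bounds):
                    # Step 5: fitter individuals mutate less far.
                    d = 2.0 * (random.random() - 0.5) * (1.0 - F)
                    child.append(min(max(x[j] + (b - a) * d, a), b))
                children.append(child)
        # Step 6: survivor selection, keep the pop_size best.
        merged = scored + [(objective(c), c) for c in children]
        evals += len(children)
        merged.sort(key=lambda t: t[0])
        pop = [x for _, x in merged[:pop_size]]
    return best_val, best_x
```

The defaults (popSize 30, maximally 5 offspring, 2500 evaluations) mirror the experimental settings given in Section 5.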
There is no particular reason to use the plant propagation algorithm for this study other than its simplicity and elegance. We, the authors, happen to be very familiar with its behaviour, but in later stages, expedience dictates that evolved benchmark functions should surely be tested with genetic algorithms, simulated annealing and other evolutionary methods.
4 ASSESSING PERFORMANCE
Before getting into the evolutionary process, it is necessary to tackle the problem of assessing the performance of PPA (or any heuristic algorithm for that matter) on an unknown continuous objective landscape. For default benchmark test functions, the global minimum is typically known from literature, and the maximum can be easily found. For the evolved functions however, this is no longer the case. Given the fickleness of the fitness landscapes through the evolutionary process, we deploy random sampling as a practical way forward. After each generation of the function evolver (Section 5), one million objective values are sampled from randomly chosen $(x_1, x_2)$ coordinates. The minimal and maximal values of the samples are retained, serving as a 'surrogate global minimum' and 'surrogate global maximum' in min(Func) and max(Func).
The first and primary performance measure is the objective deficiency, which is given as

$$\frac{min(PPA) - min(Func)}{max(Func) - min(Func)} \times 100\% \qquad (1)$$
in which min(PPA) is the best found objective value after a run of PPA, Func is the evolved function at hand, max(Func) is its surrogate global maximum and min(Func) its surrogate global minimum in the objective landscape. In other words: the objective deficiency is the percentual difference between PPA's performance and the benchmark function's global minimum, relative to its entire range; an objective deficiency of 0% means PPA has found the (surrogate) global minimum.
Note that even if this percentage is really low, it does not necessarily mean the run actually got close to the global minimum's $x_1$ and $x_2$ coordinates; it could have discovered a very deep local minimum that is nonetheless spatially remote from the global minimum's location. For this reason, we also keep track of a secondary measurement: the success rate, defined as the percentage of successful runs. Closely resembling Salhi and Fraga's seminal method (Salhi and Fraga, 2011), we qualify a run as 'successful' iff its best performing individual's $x_1$ and $x_2$ coordinates are close to the (surrogate) global minimum's coordinates. 'Close' in this sense means that the Euclidean distance is smaller than 1% of the maximal possible distance, which is the length of the domain's diagonal. It should be noted that this method is slightly different from Salhi & Fraga's seminal method, which uses a 1%-box around the global optimum, whereas our Euclidean distance defines a circle.
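Both measures are straightforward to compute; the sketch below shows one possible reading of the surrogate-extrema sampling, Equation (1), and the 1%-of-diagonal success criterion. The helper names are ours, not the authors'.

```python
import math
import random

def surrogate_extrema(func, bounds, n_samples=10**6):
    """Estimate surrogate global min/max (values and locations) by sampling."""
    best = (float("inf"), None)
    worst = (float("-inf"), None)
    for _ in range(n_samples):
        x = [random.uniform(a, b) for a, b in bounds]
        v = func(x)
        if v < best[0]:
            best = (v, x)
        if v > worst[0]:
            worst = (v, x)
    return best, worst  # ((min_val, min_loc), (max_val, max_loc))

def objective_deficiency(ppa_best, f_min, f_max):
    """Equation (1): PPA's gap to the surrogate minimum, as % of the range."""
    return (ppa_best - f_min) / (f_max - f_min) * 100.0

def is_success(x_found, x_min, bounds):
    """A run succeeds iff it lands within 1% of the domain diagonal."""
    diag = math.sqrt(sum((b - a) ** 2 for a, b in bounds))
    dist = math.sqrt(sum((u - v) ** 2 for u, v in zip(x_found, x_min)))
    return dist < 0.01 * diag
```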
5 (EVOLVING) BENCHMARK FUNCTIONS
For evolving the objective landscapes, we need the flexibility to depart from the benchmark functions' default formulation, and allow their numerical parameters to be changed. Note that in everyday mathematics, many numerical parameters are 'hidden': a simple function such as $f(x) = \cos(x)$ can equivalently be seen as $f(x) = 1 \cdot \cos(1 \cdot x + 0) + 0$, leading to several extra parameters, all defining its shape. But even though this study only mutates these constants, it still leads to some arbitrary formulaic decisions to be made. As one example, the default Easom function

$$-\cos(x_1)\cos(x_2)\,e^{-(x_1 - \pi)^2 - (x_2 - \pi)^2} \qquad (2)$$
transforms to a parameterizable form as

$$p_0\cos(p_1x_1 + p_2)\cos(p_3x_2 + p_4)\,e^{-p_5(x_1 - p_6)^2 - p_7(x_2 - p_8)^2 + p_9} + p_{10} \qquad (3)$$

in which $p_0$ through $p_{10}$ are its mutable parameters.
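To make the parameterization concrete, here is a sketch of Equation (3) in Python. The parameter ordering and the default vector that recovers Equation (2) are our reading of the reconstructed formula, and should be treated as assumptions rather than the authors' exact encoding.

```python
import math

def easom_param(x1, x2, p):
    """Parameterizable Easom, Eq. (3): p[0]..p[10] are the mutable scalars."""
    return (p[0] * math.cos(p[1] * x1 + p[2]) * math.cos(p[3] * x2 + p[4])
            * math.exp(-p[5] * (x1 - p[6]) ** 2
                       - p[7] * (x2 - p[8]) ** 2
                       + p[9])
            + p[10])

# Default values that recover the seminal Easom of Eq. (2).
EASOM_DEFAULTS = [-1.0, 1.0, 0.0, 1.0, 0.0, 1.0, math.pi, 1.0, math.pi, 0.0, 0.0]
```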
The benchmark evolver itself is conceptually fairly simple: it starts from a parameterizable benchmark function with all parameters assigned their default values, and picks a randomly chosen parameter to mutate on each evolutionary step. To avoid extreme values, a 'magnitude parameter' mag is assigned to each parameter, equal to its closest enveloping power of 10. Branin's third parameter's default value is 5.1, giving mag = 10 for this parameter, so a mutation is a change to a value between $-10$ and $10$. For Goldstein-Price's 22nd parameter, whose default value is 48, a mutation assigns a random number in $[-100, 100]$.
After a mutation is made to the function, $10^6$ random samples are taken to find the new function's surrogate global minimum and maximum, after which it is subjected to 100 PPA runs of 2500 function evaluations with popSize = 30 and maxOffspring = 5. From each of the 100 runs, the individual's objective deficiency is taken, and the mean is calculated, resulting in the mean objective deficiency ($MOD_{100}$) for that function and its objective landscape. If the mutated function's objective landscape has a higher $MOD_{100}$ ('is harder'), the mutation is kept, otherwise it is reverted. The evolutionary process continues for 2000 generations of accepting and rejecting mutations, increasing the mean objective deficiency ('hardness'), as can be seen in Figure 3. A sketch of this hill-climbing loop is given below.
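The following sketch ties the pieces together under our assumptions: it reuses the hypothetical `ppa()` and `surrogate_extrema()` sketches from earlier sections, and `make_func` and `evolve_function` are our own names, not the authors'.

```python
import math
import random

def magnitude(p):
    """Closest enveloping power of 10: 5.1 -> 10, 48 -> 100.
    The fallback of 1.0 for a zero default is our assumption."""
    return 10.0 ** math.ceil(math.log10(abs(p))) if p else 1.0

def mean_objective_deficiency(func, bounds, runs=100):
    """MOD_100: mean Eq.-(1) deficiency over `runs` PPA runs, built on the
    ppa() and surrogate_extrema() sketches given earlier."""
    (f_min, _), (f_max, _) = surrogate_extrema(func, bounds)
    total = 0.0
    for _ in range(runs):
        best_val, _ = ppa(func, bounds)
        total += (best_val - f_min) / (f_max - f_min) * 100.0
    return total / runs

def evolve_function(make_func, params, bounds, generations=2000):
    """Stochastic hill climber over a benchmark's scalar parameters;
    make_func(params) must return a callable objective."""
    mags = [magnitude(p) for p in params]  # fixed per-parameter mutation caps
    best_mod = mean_objective_deficiency(make_func(params), bounds)
    for _ in range(generations):
        i = random.randrange(len(params))
        old = params[i]
        params[i] = random.uniform(-mags[i], mags[i])  # mutate one scalar
        mod = mean_objective_deficiency(make_func(params), bounds)
        if mod > best_mod:   # harder landscape: keep the mutation
            best_mod = mod
        else:                # otherwise revert it
            params[i] = old
    return params, best_mod
```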
6 RESULTS
After completing 2000 cycles in the function evolver, all six evolved objective landscapes were significantly harder to solve for PPA than their default benchmark function counterparts.
Table 2: Hardness of the default and evolved benchmark functions, measured in mean objective deficiency and success rate when tested with the plant propagation algorithm.

Benchmark      | MOD (Default) | MOD (Evolved) | Success (Default) | Success (Evolved)
Branin         | 2.297e-04     | 4.125         | 100%              | 71%
Easom          | 1.426e+01     | 99.738        | 100%              | 63%
GoldsteinPrice | 1.831e-05     | 17.932        | 93%               | 82%
MartinGaddy    | 3.077e-04     | 0.876         | 98%               | 96%
Mishra4        | 9.249e-01     | 7.745         | 7%                | 0%
SixHump        | 7.149e-04     | 1.120         | 100%              | 95%
The largest hardness increase is found in the evolved Goldstein-Price, whose mean objective deficiency surged by a factor of 979,356. Whether this enormous increase is due to the high number of 27 parameters and the low initial mean objective deficiency remains an open question; possibly the default Goldstein-Price is simply a very easy benchmark function for PPA. The smallest hardness increase is found in evolved Easom, with a $MOD_{100}$ increase factor of 'only' 7, and funnily enough the opposite might be true here: the Easom is already a hard benchmark function to begin with, the hardest of this suite, and after evolving, it still is. The default Mishra04, which is the second hardest function, also increases just slightly, with a factor 8, becoming the third hardest evolved benchmark function after evolved Easom and Goldstein-Price.

An interesting result is that the success rate seems not very closely related to the $MOD_{100}$ in the evolved functions: the evolved Mishra04 has the lowest success rate, but ranks only 3rd out of 6 in mean objective deficiency, which might be due to the many deep local minima, preventing PPA to some degree from exploring. Evolved Goldstein-Price ranks 2nd in $MOD_{100}$ hardness, but 4th out of 6 in success rate (Table 2).
As a generic qualification, the Goldstein-Price, Martin-Gaddy and Six-Hump Camel functions show similar evolutionary patterns, their objective landscapes going through a concave-to-convex inversion during evolution (see Figure 2). We suspect that generally speaking, convex shapes make for harder continuous benchmark functions, because they permit the existence of widely separated minima in the domain's corners and on its edges. Given the right parameters, one of these separated minima can be deeper than the others, but nonetheless be very narrow. On such surfaces, PPA is prone to getting stuck in one of the many traps, with a very small (but nonzero!) chance of accidentally jumping over to the global minimum. In terms of Malan & Engelbrecht, one could say such objective landscapes are deceptive to the plant propagation algorithm (Malan and Engelbrecht, 2013).
A second evolutionary pattern was witnessed in the Easom function, going through a global minimum narrowing as its already narrow minimum gets even narrower through the generations (Fig. 2). The third and final evolutionary pattern, that of Mishra04, is a ruggedness increase, adding more local minima to its objective landscape. Likely, the factors 4.789 and 10 are to blame, increasing the frequency of the sin() function in both directions, and adding 13 local minima to the landscape in its evolved form, totalling 21 (Table 1). Although evolved Branin inverts its landscape, the increase in ruggedness might be slightly more pronounced, by which we classify its evolutionary pattern as a ruggedness increase, but an argument for concave-to-convex inversion could also be made (again in Fig. 2, top line).
The number of accepted mutations during the evolutionary process was relatively low for all six functions. Over the entire 2000-generation run, Martin-Gaddy accepted 28 mutations, Six-Hump Camel accepted 23, Easom 18, Branin 17, Mishra04 16 and Goldstein-Price accepted only 12 mutations. As is typical for evolutionary processes in other domains, most hardnesses increase sharply at first and then flatten out a bit (Figure 3). Although it is hard to exactly attribute hardness increase to specific mutation (type)s, a general pattern seems to be that 'critical moments' in the evolutionary process are those mutations that relate to the typicality of the evolutionary patterns, either being a concave-to-convex inversion, a global minimum narrowing or a ruggedness increase. Unambiguously pinpointing them is difficult though; there are several parameter (setting)s that can contribute to a landscape feature like convexity or ruggedness.
7 DISCUSSION
The title of this paper is slightly misleading; we are not making harder benchmark functions yet. To actually use the end products as new benchmark functions, the evolved functions should be analyzed for their minima and maxima instead of taking their randomly sampled surrogate substitutes. Indeed, it would be ideal to also analytically find zero-derivatives during the evaluation of each evolution step, but to what extent this is possible remains to be seen.
Figure 2: The three evolutionary patterns for the objective landscapes in this study are global minimum narrowing, which is seen in Easom only; ruggedness increase (or increase in the number of local minima), which is seen in Mishra04 only; and concave-to-convex inversion, which is seen in Goldstein-Price and Six-Hump Camel (and in Branin and Martin-Gaddy, which are not depicted).
Figure 3: Mean objective deficiency ($MOD_{100}$), or 'hardness', of the six evolved objective landscapes.
The derivative function then needs to be continuous under all possible parameterizations, and calculated in every function evaluation, which might be computationally expensive. The alternative, analyzing the hardness after evolution only, comes with the risk of not being feasible or representative. In any case, for evolved objective landscapes to be actually usable as benchmarks, we need exact values for minima, maxima, and their coordinates.
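For polynomial parameterizations at least, such analytic zero-derivatives are within reach of symbolic tooling. A small sketch with SymPy, using the evolved Martin-Gaddy as reconstructed in Table 1 (the signs there are our reading of the formula, so treat the coefficients as assumptions):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
# Evolved Martin-Gaddy as reconstructed in Table 1.
f = (1.132 * (x1 - 0.476 * x2 - 4.036) ** 2
     - 4.506 * (x1 - 0.913 * x2 - 0.8) ** 2
     + 9.644)

# Stationary points: solve grad f = 0 symbolically.
grad = [sp.diff(f, v) for v in (x1, x2)]
critical = sp.solve(grad, [x1, x2], dict=True)
print(critical)
```

For quadratics this terminates immediately; for the trigonometric and absolute-value functions in the suite, a closed-form solution may not exist, which is exactly the feasibility question raised above.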
A second noteworthy point is the manipulation of scalar constants as we chose them. We tried to incorporate every logically mutable factor and addition in the process, but also made some arbitrary decisions along the way, like not including powers. The example of Section 5, $f(x) = \cos(x)$ and its equivalent $f(x) = 1 \cdot \cos(1 \cdot x + 0) + 0$, can also be seen as $f(x) = 1 \cdot \cos(1 \cdot x^1 + 0)^1 + 0$ (notice the change from $x$ to $x^1$). Leaving these out as we did surely excludes a wide range of possible formulae and their accompanying objective landscapes. It might take some time, thinking and discussion to determine the most generic yet feasible way of navigating the state space of possible functions. Apart from this, raising the caps might also help, but that also allows for wilder functions and extreme ranges, unsuitable for testing evolutionary algorithms.
Third, it is important to discuss the possibility of local maxima in the evolutionary process itself. The benchmark evolver used is basically a stochastic hillClimber. While enjoying the advantage of being easy to use and requiring no hyperparameters, hillClimbers are known to get stuck in local maxima. So in a slightly paradoxical way, the algorithm that makes landscapes harder by creating local minima is prone to getting trapped in local maxima itself. Possible solutions might incorporate the use of variable mutation strength, simulated annealing (Dahmani et al., 2020) or maybe even a population-based algorithm as a function evolver.
Fourth and finally, the most obvious improvement might be widening the scope of functionality and dimensionality. Performing more or longer runs is a good idea because, as we stand, we have no idea whether these hits are incidental or more structural. In a more ambitious fashion, we could try to evolve the entire function structure instead of the scalars only. This will result in new developmental problems, such as division by zero, or negative numbers in logarithms and square roots. But new problems will have new solutions, and we must find them to ensure continuity in navigating the evolutionary landscape. An intermediate solution would be to explore the possibility of tuning the benchmark generators mentioned in the introduction with an evolutionary algorithm. In any case, upscaling this experiment in any of these directions likely leads to new results.

Contemplating these results, the echoes of the no free lunch theorem still resound through the valleys of objective landscapes, reminding us that at this point, the evolved hard objective landscapes are necessarily hard for the plant propagation algorithm only. No free lunch means we are fitting problem (instance)s to optimization algorithms, and as such, we could ultimately just be uncovering the intricate interplay between both.
REFERENCES
Anonymous (2022). Repository containing source material for this study.
Applegate, D., Bixby, R., Chvatal, V., and Cook, W. (1999).
Finding tours in the tsp. Technical report.
Bartz-Beielstein, T., Doerr, C., Bossek, J., Chandrasekaran, S., Eftimov, T., Fischbach, A., Kerschke, P., López-Ibáñez, M., Malan, K. M., Moore, J. H., Naujoks, B., Orzechowski, P., Volz, V., Wagner, M., and Weise, T. (2020). Benchmarking in Optimization: Best Practice and Open Issues. arXiv, pages 1–50.
Braam, F. and van den Berg, D. (2022). Which rectangle
sets have perfect packings? Operations Research Per-
spectives, page 100211.
Brouwer, N., Dijkzeul, D., Koppenhol, L., Pijning, I., and
Van den Berg, D. (2022). Survivor selection in a
crossoverless evolutionary algorithm. In Proceedings
of the 2022 Genetic and Evolutionary Computation
Conference Companion. (in press).
Brownlee, J. et al. (2007). A note on research methodol-
ogy and benchmarking optimization algorithms. Com-
plex Intelligent Systems Laboratory (CIS), Centre for
Information Technology Research (CITR), Faculty of
Information and Communication Technologies (ICT),
Swinburne University of Technology, Victoria, Aus-
tralia, Technical Report ID, 70125.
Dahmani, R., Boogmans, S., Meijs, A., and Van den Berg,
D. (2020). Paintings-from-polygons: Simulated an-
nealing. In ICCC.
De Jonge, M. and van den Berg, D. (2020). Parameter sen-
sitivity patterns in the plant propagation algorithm. In
IJCCI, page 92–99.
de Jonge, M. and van den Berg, D. (2020). Plant propa-
gation parameterization: Offspring & population size.
Evo* 2020, page 19.
Dijkzeul, D., Brouwer, N., Pijning, I., Koppenhol, L., and
Van den Berg, D. (2022). Painting with evolutionary
algorithms. In International Conference on Compu-
tational Intelligence in Music, Sound, Art and Design
(Part of EvoStar), pages 52–67. Springer.
Doerr, C., Ye, F., Horesh, N., Wang, H., Shir, O. M., and Bäck, T. (2020). Benchmarking discrete optimization heuristics with iohprofiler. Applied Soft Computing, 88:106027.
Fraga, E. S. (2019). An example of multi-objective opti-
mization for dynamic processes. Chemical Engineer-
ing Transactions, 74:601–606.
Gallagher, M. and Yuan, B. (2006). A general-purpose tun-
able landscape generator. IEEE transactions on evo-
lutionary computation, 10(5):590–603.
Geleijn, R., van der Meer, M., van der Post, Q., and van den
Berg, D. (2019). The plant propagation algorithm on
timetables: First results. EVO* 2019, page 2.
Haddadi, S. (2020). Plant propagation algorithm for nurse
rostering. International Journal of Innovative Com-
puting and Applications, 11(4):204–215.
Iori, M., De Lima, V., Martello, S., and Monaci, M. (2021a). 2dpacklib: a two-dimensional cutting and packing library (in press). Optimization Letters.
Iori, M., de Lima, V. L., Martello, S., Miyazawa, F. K., and
Monaci, M. (2021b). Exact solution techniques for
two-dimensional cutting and packing. European Jour-
nal of Operational Research, 289(2):399–415.
Iori, M., de Lima, V. L., Martello, S., and Monaci, M.
(2021c). 2dpacklib.
Kerschke, P. and Trautmann, H. (2019). Comprehen-
sive feature-based landscape analysis of continuous
and constrained optimization problems using the r-
package flacco. In Applications in Statistical Com-
puting, pages 93–123. Springer.
Koppenhol, L., Brouwer, N., Dijkzeul, D., Pijning, I.,
Sleegers, J., and Van den Berg, D. (2022). Survivor
selection in a crossoverless evolutionary algorithm.
In Proceedings of the 2022 Genetic and Evolutionary
Computation Conference Companion. (in press).
Laguna, M. and Marti, R. (2005). Experimental testing of
advanced scatter search designs for global optimiza-
tion of multimodal functions. Journal of Global Opti-
mization, 33(2):235–255.
Lou, Y., Yuen, S. Y., and Chen, G. (2018). Evolving bench-
mark functions using kruskal-wallis test. In Proceed-
ings of the Genetic and Evolutionary Computation
Conference Companion, pages 1337–1341.
Malan, K. M. and Engelbrecht, A. P. (2013). A survey of
techniques for characterising fitness landscapes and
some possible ways forward. Information Sciences,
241:148–163.
Paauw, M. and Van den Berg, D. (2019). Paintings,
polygons and plant propagation. In International
Conference on Computational Intelligence in Music,
Sound, Art and Design (Part of EvoStar), pages 84–
97. Springer.
Pohlheim, H. (2007). Examples of objective functions. Re-
trieved, 4(10):2012.
Reinelt, G. (1991). Tsplib—a traveling salesman problem
library. ORSA journal on computing, 3(4):376–384.
Rice, J. R. (1976). The algorithm selection problem. In Ad-
vances in computers, volume 15, pages 65–118. Else-
vier.
Rodman, A. D., Fraga, E. S., and Gerogiorgis, D. (2018).
On the application of a nature-inspired stochastic
evolutionary algorithm to constrained multi-objective
beer fermentation optimisation. Computers & Chemi-
cal Engineering, 108:448–459.
Salhi, A. and Fraga, E. S. (2011). Nature-inspired optimi-
sation approaches and the new plant propagation algo-
rithm.
Selamoğlu, B. İ. and Salhi, A. (2016). The plant propagation algorithm for discrete optimisation: The case of the travelling salesman problem. In Nature-inspired computation in engineering, pages 43–61. Springer.
Sleegers, J. and Berg, D. v. d. (2021). Backtracking (the)
algorithms on the hamiltonian cycle problem. arXiv
preprint arXiv:2107.00314.
Sleegers, J. and van den Berg, D. (2020a). Looking for
the hardest hamiltonian cycle problem instances. In
IJCCI, pages 40–48.
Sleegers, J. and van den Berg, D. (2020b). Plant propaga-
tion & hard hamiltonian graphs. Evo* 2020, page 10.
Smith-Miles, K., Hemert, J. v., and Lim, X. Y. (2010). Un-
derstanding tsp difficulty by learning from evolved in-
stances. In International conference on learning and
intelligent optimization, pages 266–280. Springer.
Socha, K. and Dorigo, M. (2008). Ant colony optimization
for continuous domains. European journal of opera-
tional research, 185(3):1155–1173.
Suganthan, P. N., Hansen, N., Liang, J. J., Deb, K., Chen,
Y.-P., Auger, A., and Tiwari, S. (2005). Problem defi-
nitions and evaluation criteria for the cec 2005 special
session on real-parameter optimization. KanGAL re-
port, 2005005(2005):2005.
Sulaiman, M., Salhi, A., Fraga, E. S., Mashwani, W. K., and
Rashidi, M. M. (2016). A novel plant propagation al-
gorithm: modifications and implementation. Science
International, 28(1):201–209.
Sulaiman, M., Salhi, A., Khan, A., Muhammad, S., and
Khan, W. (2018). On the theoretical analysis of the
plant propagation algorithms. Mathematical Problems
in Engineering, 2018.
van Hemert, J. I. (2006). Evolving combinatorial problem
instances that are difficult to solve. Evolutionary Com-
putation, 14(4):433–462.
van Horn, G., Olij, R., Sleegers, J., and van den Berg, D.
(2018). A predictive data analytic for the hardness of
hamiltonian cycle problem instances. Data Analytics,
2018:101.
Vrielink, W. and van den Berg, D. (2019). Fireworks algo-
rithm versus plant propagation algorithm. In IJCCI,
pages 101–112.
Vrielink, W. and van den Berg, D. (2021a). A dynamic
parameter for the plant propagation algorithm. Evo*
2021, pages 5–9.
Vrielink, W. and van den Berg, D. (2021b). Parameter con-
trol for the plant propagation algorithm. Evo* 2021,
pages 1–4.
Weise, T., Li, X., Chen, Y., and Wu, Z. (2021). Solving
job shop scheduling problems without using a bias
for good solutions. In Proceedings of the Genetic
and Evolutionary Computation Conference Compan-
ion, pages 1459–1466.
Weise, T. and Wu, Z. (2018a). Difficult features of com-
binatorial optimization problems and the tunable w-
model benchmark problem for simulating them. In
Proceedings of the Genetic and Evolutionary Compu-
tation Conference Companion, pages 1769–1776.
Weise, T. and Wu, Z. (2018b). The tunable w-model bench-
mark problem.
Wolpert, D. H. and Macready, W. G. (1997). No free lunch
theorems for optimization. IEEE transactions on evo-
lutionary computation, 1(1):67–82.