The usefulness of constructional selection for genetic algorithms should be measured by the number of points in the search space that need to be evaluated to reach a certain fitness. There are several strategies for spending computational resources: One can take an a priori representation, run numerous adaptive walks from different initial genotypes, and take the fittest peak attained after expenditure of the set amount of computations. Alternately, one can partition one's computations sampling among different genotype-phenotype maps to find that with the fittest peak. Lastly, the approach examined here is to progressively modify simple genotype-phenotype maps to more complex ones with the genome growth algorithm described. The payoff in the level of optimization obtained through constructional selection shows it to be a much more efficient.
As a simple example, consider the class of NK landscapes with the highest expected fitness peaks, the k=1 landscapes (which is K=0 in Kauffman's original definition), in which there is a one-to-one map from each gene to each fitness component, and f=n. Each gene can be optimized individually, so it takes the evaluation of 2n genotypes to find the global peak for the landscape. Each fitness component is i.i.d., where the optimal
is distributed as the maximum of two independent uniform random variables on [0,1]. The probability density of each maximum
is
.
So E
and Var
.
With n=f=31 genes and fitness components, one obtains:
The ability to select on the genotype-phenotype map as it is constructed is the key to finding higher fitness values. To carry out constructional selection in the k=1 case, one could add one gene at a time and for each choice of the fitness component it maps to, examine the fitnesses for both alleles, and keep the allele and map that give the fittest value. The first gene would be evaluated for the f possible genotype-phenotype maps, for a total of 2f evaluations. The second gene would be sampled with the remaining f-1 unmapped fitness components, and so forth, giving a total of f(f+1) evaluations. Each resulting fitness component would be the maximum of 2(f-i+1) uniform i.i.d. values, for the ith gene/gene map pair. So the ith fitness component would be distributed as
.
The fitness components would have expectation