[draft]

Genetic Algorithms (GA) and Genetic Programming (GP) are stochastic problem solving techniques used
for operations research, planning, and variety of other optimization problems. GA/GP are
loosely inspired by the notions of natural selection and evolution. The basic primer on the topic is
Koza's *On the Programming of Computers by the Means of Natural Selection*.

Essentially, a GA or GP system takes a difficult problem, generates random solutions, evaluates the "fitness" of each solution and then creates new, hopefully better solutions by "breeding" the best of old ones. Precisely what these terms mean will become more clear as we proceed. First we shall conisider a GA system.

Suppose you have a problem with many "possible solutions", such as determining the best way to connect a collection of sites with telephone wires--we wish to minimize the overall length of wire and hence the cost of building the network. The following diagram shows two possible configurations of a ten site network.

One way of solving this problem is to simply try all possible combinations and then calculate the
total distance required for each (for now we are ignoring other considerations such as reliablity
and routing bottle-necks). This method is guarenteed to find a minimal cost solution.
Unfortuntely, the number of such combinations is unreasonably large, even for small networks. If we
just look at "paths"--connection schemes that linearly connect all sites so that each site is connected
to at most two others--there are `n!` (`n` factorial, i.e. `n*(n-1)*(n-2)*...*(3)*(2)*(1)`)
unique networks with `n` sites. Note that

and

If we consider "trees"--networks such that there is exactly one way to get from one site to the others, like those shown above--there are

Although it is difficult to determine the "best solution" to this problem, it is easy to evaluate
the cost of a given configuration and hence to compare the fesablity of two or more networks. Moreover,
it is easy to generate an arbitary network if we use the proper encoding of the problem. For this
example, we will use an **adjacency matrix** representation of the network.

An adjacency matrix is just a table, with each row and column corresponding to sites in the network.
Every cell (or entry) in the table contains either a `0` or a `1`. A `1`
in row X, column Y indicates that there is a connection between the sites X and Y (a `0` indicates
that there is no direct connection between them).

The following table is the adjacency matrix corresponding to the right-most network shown above. (To make the table easier to read we have ommitted the zeros.)

1 2 3 4 5 6 7 8 9 10 +---+---+---+---+---+---+---+---+---+---+ 1 | | | 1 | | | | | | | | +---+---+---+---+---+---+---+---+---+---+ 2 | | | | | 1 | | | | | | +---+---+---+---+---+---+---+---+---+---+ 3 | 1 | | | | 1 | | | | | | +---+---+---+---+---+---+---+---+---+---+ 4 | | | | | 1 | | | | | | +---+---+---+---+---+---+---+---+---+---+ 5 | | 1 | 1 | 1 | | 1 | | | | | +---+---+---+---+---+---+---+---+---+---+ 6 | | | | | 1 | | | 1 | | | +---+---+---+---+---+---+---+---+---+---+ 7 | | | | | | | | 1 | | | +---+---+---+---+---+---+---+---+---+---+ 8 | | | | | | 1 | 1 | | 1 | 1 | +---+---+---+---+---+---+---+---+---+---+ 9 | | | | | | | | 1 | | | +---+---+---+---+---+---+---+---+---+---+ 10 | | | | | | | | 1 | | | +---+---+---+---+---+---+---+---+---+---+Note that the adjacency matrix will always be symmetric across the main diagonal since a connection between sites

To create a "random" network in this representation we simply fill in some of the cells with `1`s,
taking care to maintain the diagnal symmetry. There are also fast and easy algorithms for evaluating the
cost of the network and determing if the network is "connected", i.e. that every site can reach all
the others.

For clarity, let us now define precisely what we mean by the "feasiblity" or **fitness** of a given
solution. In general fitness is a measure, typically numerical, that allows us to compare two potential
solutions. By combining aspects of "well fit" solutions, the system will "evolve" better fit solutions.
Hence fitness measures should be chosen carefully, for whatever problem you are trying to solve, maximizing
fitness is what a GA system actually tries to achieve. It is often useful to have a fitness measure whose minimum value is `0`
(or sometimes `1` and for which large values represent high fitness, as we shall see.

For our telephone network problem, we will define `f(N)`, the fitness of the network `N`
to be `M`, the maximum possible length of a network (obtained by summing the length of all
possible connections), minus `d(N)`, the length of all connections in the network `N`. Hence

Since we want to ensure that the network is connected, let us modify the function slightly so that

if

We have now fulfilled the first three conditions for implementing a GA system. Namely,

- 1. A Difficult Problem
- We have a problem for which many possible solutions exist and it is difficult to determine directly which solution is best.
- 2. A Meaningful Fitness Measure
- We have a numerical function that allows us to compare the feasablity of a given solution--the higher the fitness value, the better the solution.
- 3. A Suitable Representation
- We have a representation that makes it easy to generate potential solutions and to evaluate the fitness function.

For algorithmic simplicty, let us now make a few modifications to our problem representation.
Consider the small adjacency matrix defined below, we will call this network `N1`:

A B C D +---+---+---+---+ A | | | 1 | | +---+---+---+---+ B | | | | 1 | +---+---+---+---+ C | 1 | | | 1 | +---+---+---+---+ D | | 1 | 1 | | +---+---+---+---+As stated before, the matrix is always symmetric across the main diagonal, so we need only consider one half of the table to encapuslate all the information:

A B C D +---+---+---+---+ A | | | 1 | | +---+---+---+---+ B | | | 1 | +---+---+---+ C | | 1 | +---+---+ D | | +---+Also, there is no need to consider a connection between a site and itself, so we can look at the table like this:

A B C D +---+---+---+ A | | 1 | | +---+---+---+ B | | 1 | +---+---+ C | 1 | +---+ D(In general there will be

where the commas represent the "breaks" between rows of the table, or simply

without them. We will call these representations

To make our example concrete, consider this collection of sites (all possible connections are indicated, along with their lengths).

For this collection of sites `M`, the maximum length of a network, is `23.61`. Then for the
network `N1`, we have

Now suppose we have randomly generated a **population** of **indivuals** (a collection of networks)
whose fitness values are summarized in the following table:

Name Chromosome Fitness N1 010011 12.61 N2 100111 8.00 N3 001011 8.61 N4 110101 6.00 N5 000111 0.00(Note that

Now, with a population in hand, we get to the fun part, evolution. We wish to take this population of "solutions" and generate better ones by taking parts of the well fit ones and combining them into new "solutions". Evolution requires two primary steps, selection and breeding.

let FIT be an array of fitness values. let TOT be the total (sum) fitness of the population. let TARGET be a random value between 1 and TOT. let INDEX be the minimum FIT array index repeat while TARGET > 0 TARGET = TARGET - FIT[INDEX] INDEX = end repeat(We are assuming, of course, that the total fitness is never zero, otherwise the routine would go into an endless loop.

Footnote 1:

A binary representation of the network makes it easy to count the number of unique networks
for a given number of sites. Using one binary bit (a `0` or a `1`) to represnent each
potential connection (where `1` indicates that the connection is made), we need
`(n-1) + (n-2) +...+ 2 + 1 = n(n-1)/2` bits to represent all possible connections
between `n` nodes. (This is a well known "counting" sum. We'll need `n-1`
bits to represent all possible links from the "first" site to the remaining `n-1`, and
`n-2` bits to represent all possible links from the "second" site to the remaining `n-2`--
we've already represented a link between sites one and two, and so on.) Each bit can independently be
`1` or `0` and each unique `n(n-1)/2` bit binary sequence represents a different
network. Hence we have `2^^( n(n-1)/2 )` possible networks with `n` nodes. (Where "`^^`" means
exponentiation--`2` raised to the `n(n-1)/2` power.) The following table illustrates how quickly this
value grows:

# of Sites (n) # of Networks ( 2^^n(n-1)/2 ) -------------- ----------------------------- 1 1 = 2^^0 2 2 = 2^^2 3 8 = 2^^3 4 64 = 2^^6 5 1,024 = 2^^10 10 ~ 3.52 * 10^^13 = 2^^45 20 ~ 1.57 * 10^^57 = 2^^190 30 ~ 8.87 * 10^^130 = 2^^435 40 ~ 6.36 * 10^^234 = 2^^780 50 ~ 5.78 * 10^^368 = 2^^1,225 100 ~ 1.25 * 10^^1490 = 2^^4,950 500 ~ 2^^124,750 1000 ~ 2^^499,500

It is easy to see why this problem is in NP. To return to the citation for this footnote, follow this link.