back to previous section up to contents on to next section

A Brief Introduction to Genetic Algorithms/Genetic Programming

Genetic Algorithms (GA) and Genetic Programming (GP) are stochastic problem solving techniques used for operations research, planning, and variety of other optimization problems. GA/GP are loosely inspired by the notions of natural selection and evolution. The basic primer on the topic is Koza's On the Programming of Computers by the Means of Natural Selection.

Essentially, a GA or GP system takes a difficult problem, generates random solutions, evaluates the "fitness" of each solution and then creates new, hopefully better solutions by "breeding" the best of old ones. Precisely what these terms mean will become more clear as we proceed. First we shall conisider a GA system.

A Problem Representation

Suppose you have a problem with many "possible solutions", such as determining the best way to connect a collection of sites with telephone wires--we wish to minimize the overall length of wire and hence the cost of building the network. The following diagram shows two possible configurations of a ten site network.

One way of solving this problem is to simply try all possible combinations and then calculate the total distance required for each (for now we are ignoring other considerations such as reliablity and routing bottle-necks). This method is guarenteed to find a minimal cost solution. Unfortuntely, the number of such combinations is unreasonably large, even for small networks. If we just look at "paths"--connection schemes that linearly connect all sites so that each site is connected to at most two others--there are n! (n factorial, i.e. n*(n-1)*(n-2)*...*(3)*(2)*(1)) unique networks with n sites. Note that

10! = 10*9*8*7*6*5*4*3*2 = 3,628,800

and
20! is roughly 2,430,000,000,000,000,000.

If we consider "trees"--networks such that there is exactly one way to get from one site to the others, like those shown above--there are many more cases to consider. Even with clever heuristics that might significantly decrease the number of cases to be considered, clearly this "brute force" approach is impractical for real world applications where 100 sites might be considered a small network.

Although it is difficult to determine the "best solution" to this problem, it is easy to evaluate the cost of a given configuration and hence to compare the fesablity of two or more networks. Moreover, it is easy to generate an arbitary network if we use the proper encoding of the problem. For this example, we will use an adjacency matrix representation of the network.

An adjacency matrix is just a table, with each row and column corresponding to sites in the network. Every cell (or entry) in the table contains either a 0 or a 1. A 1 in row X, column Y indicates that there is a connection between the sites X and Y (a 0 indicates that there is no direct connection between them).

The following table is the adjacency matrix corresponding to the right-most network shown above. (To make the table easier to read we have ommitted the zeros.)

     1   2   3   4   5   6   7   8   9  10  
   +---+---+---+---+---+---+---+---+---+---+
 1 |   |   | 1 |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 2 |   |   |   |   | 1 |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 3 | 1 |   |   |   | 1 |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 4 |   |   |   |   | 1 |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 5 |   | 1 | 1 | 1 |   | 1 |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 6 |   |   |   |   | 1 |   |   | 1 |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 7 |   |   |   |   |   |   |   | 1 |   |   |
   +---+---+---+---+---+---+---+---+---+---+
 8 |   |   |   |   |   | 1 | 1 |   | 1 | 1 |
   +---+---+---+---+---+---+---+---+---+---+
 9 |   |   |   |   |   |   |   | 1 |   |   |
   +---+---+---+---+---+---+---+---+---+---+
10 |   |   |   |   |   |   |   | 1 |   |   |
   +---+---+---+---+---+---+---+---+---+---+

Note that the adjacency matrix will always be symmetric across the main diagonal since a connection between sites 5 and 2 is also a connection between sites 2 and 5. (This is not the most effcient representation of the network and is probably unsuitable for large networks with few interconnections, in practice a adjacency list structure would probably be used, but it will serve our purposes here.)

To create a "random" network in this representation we simply fill in some of the cells with 1s, taking care to maintain the diagnal symmetry. There are also fast and easy algorithms for evaluating the cost of the network and determing if the network is "connected", i.e. that every site can reach all the others.

For clarity, let us now define precisely what we mean by the "feasiblity" or fitness of a given solution. In general fitness is a measure, typically numerical, that allows us to compare two potential solutions. By combining aspects of "well fit" solutions, the system will "evolve" better fit solutions. Hence fitness measures should be chosen carefully, for whatever problem you are trying to solve, maximizing fitness is what a GA system actually tries to achieve. It is often useful to have a fitness measure whose minimum value is 0 (or sometimes 1 and for which large values represent high fitness, as we shall see.

For our telephone network problem, we will define f(N), the fitness of the network N to be M, the maximum possible length of a network (obtained by summing the length of all possible connections), minus d(N), the length of all connections in the network N. Hence

f(N) := M - d(N).

Since we want to ensure that the network is connected, let us modify the function slightly so that
f(N) := 0

if N is not connected.

We have now fulfilled the first three conditions for implementing a GA system. Namely,

1. A Difficult Problem
We have a problem for which many possible solutions exist and it is difficult to determine directly which solution is best.
2. A Meaningful Fitness Measure
We have a numerical function that allows us to compare the feasablity of a given solution--the higher the fitness value, the better the solution.
3. A Suitable Representation
We have a representation that makes it easy to generate potential solutions and to evaluate the fitness function.

The Evolutionary Engine

For algorithmic simplicty, let us now make a few modifications to our problem representation. Consider the small adjacency matrix defined below, we will call this network N1:

    A   B   C   D  
  +---+---+---+---+
A |   |   | 1 |   |
  +---+---+---+---+
B |   |   |   | 1 |
  +---+---+---+---+
C | 1 |   |   | 1 |
  +---+---+---+---+
D |   | 1 | 1 |   |
  +---+---+---+---+

As stated before, the matrix is always symmetric across the main diagonal, so we need only consider one half of the table to encapuslate all the information:
    A   B   C   D  
  +---+---+---+---+
A |   |   | 1 |   |
  +---+---+---+---+
B     |   |   | 1 |
      +---+---+---+
C         |   | 1 |
          +---+---+
D             |   |
              +---+

Also, there is no need to consider a connection between a site and itself, so we can look at the table like this:
    A   B   C   D  
      +---+---+---+
A     |   | 1 |   |
      +---+---+---+
B         |   | 1 |
          +---+---+
C             | 1 |
              +---+
D                  
   
(In general there will be (n-1) + (n-2) +...+ 2 + 1 = n(n-1)/2 cells in a table representing n sites.) For simplicity, we will represent the tables in one line by simply taking each row in turn. The above table would then look like
N1 = 010,01,1

where the commas represent the "breaks" between rows of the table, or simply
N1 = 010011

without them. We will call these representations chromosomes. (By the way, this representation makes it easy to count how many unique networks there are for a given number of sites, a question we considered earlier, see footnote one.)

To make our example concrete, consider this collection of sites (all possible connections are indicated, along with their lengths).

For this collection of sites M, the maximum length of a network, is 23.61. Then for the network N1, we have

f(N1) = M - d(N1) = 23.61 - (3 + 6 + 2) = 12.61

Now suppose we have randomly generated a population of indivuals (a collection of networks) whose fitness values are summarized in the following table:

Name     Chromosome    Fitness
 N1        010011       12.61
 N2        100111        8.00
 N3        001011        8.61
 N4        110101        6.00
 N5        000111        0.00
(Note that N5 has fitness 0 since the site A is not connected to the network.)

Now, with a population in hand, we get to the fun part, evolution. We wish to take this population of "solutions" and generate better ones by taking parts of the well fit ones and combining them into new "solutions". Evolution requires two primary steps, selection and breeding.

Selection

To sucessfully implement a selection routine we must keep two goals in mind: 1) we want to choose the better fit individuals for breeding 2) we want to maintain genetic diversity so that our population does not "converge" too soon. It is easy to imagine some simple ways of doing this, but I will describe the most commonly used, roulette wheel selection. In roulette wheel selection (RWS), we imagine the population being placed on a roulette wheel, with one indivdual per "wedge" on the wheel. (If your not familiar with roulette, think of the "Wheel of Fortune" wheel.) The area of each wedge is proportional to the fitness of that individual. We then spin the wheel and select the individual whose wedge is "pointed at" by the selector. The following pseduo-code describes one implementation of the process:
let FIT be an array of fitness values.
let TOT be the total (sum) fitness of the population.
let TARGET be a random value between 1 and TOT.
let INDEX be the minimum FIT array index
repeat while TARGET > 0
  TARGET = TARGET - FIT[INDEX]
  INDEX = 
end repeat
(We are assuming, of course, that the total fitness is never zero, otherwise the routine would go into an endless loop.

Footnote 1:
A binary representation of the network makes it easy to count the number of unique networks for a given number of sites. Using one binary bit (a 0 or a 1) to represnent each potential connection (where 1 indicates that the connection is made), we need (n-1) + (n-2) +...+ 2 + 1 = n(n-1)/2 bits to represent all possible connections between n nodes. (This is a well known "counting" sum. We'll need n-1 bits to represent all possible links from the "first" site to the remaining n-1, and n-2 bits to represent all possible links from the "second" site to the remaining n-2-- we've already represented a link between sites one and two, and so on.) Each bit can independently be 1 or 0 and each unique n(n-1)/2 bit binary sequence represents a different network. Hence we have 2^^( n(n-1)/2 ) possible networks with n nodes. (Where "^^" means exponentiation--2 raised to the n(n-1)/2 power.) The following table illustrates how quickly this value grows:

# of Sites (n)   # of Networks ( 2^^n(n-1)/2 )
--------------   -----------------------------
       1                        1 = 2^^0      
       2                        2 = 2^^2      
       3                        8 = 2^^3      
       4                       64 = 2^^6      
       5                    1,024 = 2^^10     
       
      10          ~ 3.52 * 10^^13 = 2^^45     
      20          ~ 1.57 * 10^^57 = 2^^190    
      30         ~ 8.87 * 10^^130 = 2^^435    
      40         ~ 6.36 * 10^^234 = 2^^780    
      50         ~ 5.78 * 10^^368 = 2^^1,225  
      
     100        ~ 1.25 * 10^^1490 = 2^^4,950  
     500                          ~ 2^^124,750
    1000                          ~ 2^^499,500

It is easy to see why this problem is in NP. To return to the citation for this footnote, follow this link.

Back to the top of this page.


back to previous section up to contents on to next section