b!Gollem: Evolutionary Framework
Genetic Programming
Simply stated, genetic programming (GP) is the process of "evolving" an algorithm to perform
a particular task. GP is inspired by the notion of natural selection. It is similar to,
but not quite the same as genetic algorithms (GA), in which the parameters of an algorithm
are evolved in order to create a "fit" solution to given problem.
The applicability of GP for a particular domain depends on number of factors:
 First, we need a problem for which many potential solutions exist, but for which
it is difficult to determine the ideal solution directly.
 It must be relatively simple to generate potential solutions to the problem.
 It must be relatively simple to evaluate the "fitness" of a given solution.
 The functionality necessary to solve the problem must identifiable and partitionable.
This functionality must be broken down into a "closed" set of atomic functionsfunctions
whose inputs and outputs are of the same data type, so that the atoms can be arbitrarily
chained together.
Assuming these conditions are met, solutions are "evolved" as follows.
 A population of "random" programs (sequences of atomic functions) are created.
Let us call this the current population.
 The fitness of each program in the current population is evaluated, typically yielding
a numerical fitness value or a linear ordering of the individuals within the population.
 A new population is created from the current by three general techniques:
 A wellfit individual is simply copied into the new population.
 A wellfit individual is selected and "mutated" by randomly tweaking some of its
functions, and then placed in the new population.
 Two (or more) wellfit individuals are selected and "bred" by combining parts of each.
The newly generated individual is then placed in the new population.
 The new populated then becomes the current population, and we repeat steps 2 through 4
until the desired fitness level is reached.
Statistics tells us that if our functionality is sufficient to solve the problem and our
fitness measure is meaningful, the average fitness of the population will increase over
time.
Note that in general, GP/GA will not yield the ideal solution to a problem, but often
a "good" solution will suffice.
There are many online resources for information regarding GA/GP, use your favorite
search engine to find them.
Genetic Programming in b!Gollem
Gollem's genetic architecture extends traditional genetic programming in two primary ways.
 Clearly go is too complex for a single variable type to encapsulate all the necessary
information. Hence in Gollem a collection of data types is passed between the
atomic functions. This collection is described below.
 In order for GP to work we must restrict each function to a small task.
Hence for go a large number of functions are required (numbering in the thousands).
In Gollem a Central Processing Unit (CPU) model is used, which is also described
below.
DataTypes
Gollem's data types can be divided into two categories:
"globals" for which exactly one instance exists, and
"locals" which collectively form the "closed" function argument.

Board

Board is a representation of go board. It contains all the "game state" information, as well
as the functions necessary for refereeing a game, counting territory and determining the winner.
Board also acts as a "publisher" of game data to the "subscribers" ChainMap and Field. It is the
primary structure behind the Board Window.

ChainMap
 The ChainMap is representation of the board as a collection of connected, same color groups or
chains. Each location on the board is mapped to a "group structure" which contains information about
the size, shape, and neighbors of the connected stones (or empty intersections).

Field
 Field is an electrical field model representation of the board. Each stone is considered a
"point charge" which propagates charge across empty intersections. The field model can be used
to measure "influence".

Log
 The Log is simply a history of the moves made in the game. It is the source for the
History Window and is also used to implement the undo/redo stack.

Bool
 A simple Boolean (true/false) value.

Coor
 A board location, represented as x,y coordinates.

Color
 A trinary value that represents what stone (if any) might occupy a board location (either Black,
White or Empty).

Flavor
 An 8value variable that represents all possible combinations of Black, White and Empty.

Float
 A floating point (real number) value.

Int
 An integer (whole number) value.

List
 A "linked list" of coordinates (Coors). This can be used as a set of distinct locations, a bag
(multiset) of not necessarily distinct values or as an ordered list.

Rect
 A rectangular region of the board. This can be used to specify a "universe of interest" or as
a coordinate offset (vector).
These datatypes, together with the functions that manipulate them, form the foundation of Gollem's
builtin go knowledge.
Back to the top of this page.
Genetic Functions
For each of the above datatypes, a large number of "manipulating functions" have been defined.
(Currently there are over 1,000 implemented functions and nearly 2,000 planned in total.)
Each function takes two collections of local data as an argument and each has access to the global
data as well. Some example functions include:
 Set Bool to TRUE if the stone at Coor has color Color.
 Set Flavor to Flavor but not Color.
 Add all White stones in Rect to List
 Set Float to Field value of Coor
 Set Bool to TRUE if the Chain of the stone at Coor has at
least Int stones.
 Set Coor to the matrix product of Rect and Coor.
 Transpose Coor.
Back to the top of this page.
Genetic Framework
The b!Gollem genetic framework is based on a CPU model. Programs (individuals) take the
form of a sequence of atomic functions. These programs are
"run" on an engine that might be thought of as a CPU.
The Central Processing Unit
CPU Schematic
The CPU can be divided into two parts that loosely correspond to ReadOnly Memory (ROM) and
Random Access Memory (RAM). We'll describe each portion in turn.
ReadOnly Memory
The ROM portion of the CPU is composed of a function table, an array of
the atomic functions, and the global data types like the Board, et al.,
described above. The "readonly" description is somewhat inaccurate here, since the "internal
state" of global data types can be changed, but only indirectly. That is, the programs cannot
directly alter the global data types; they can only query them for information. The global data
types change, however, for each move made in the game.
Random Access Memory
The RAM portion of the CPU is entirely composed of local data types stored in registers.
Three registers, designated alpha, beta and gamma are individual
instances of the collection of local data types. A special stack register is an arbitarliy
large list of local data instances, stored in a standard "pushdown" stack. Only the topmost
collection on the stack is available for reading or writing.
There are several "meta" functions within the function table that manipulate the registers,
copying or exchanging values and pushing/popping values from the stack.
The Programs
An individual in the population is simply a sequence of integers that correspond to entries in the
function table. A program is run by calling each function in turn. In particular, each "atom" in a
program is composed of two parts. The first part specifies which registers are to be acted upon. For
instance, the same function can have dramatically different results when run upon registers alpha and
beta than it would when run with two "copies" of the gamma register. In this way each
entry in the function table corresponds to 16 different functionsone for each ordered pair of registers.
The remaining part of the atom specifies which entry in the function table to use.
The Genetic Operators
Breeding within the Gollem system is a fairly straightforward implementation of GP. Since
the order of the functions within the table is arbitary, functions are bred as piece. The
"operator" (function) and "operand" (register) parts of each atom may be changed indepenedently.
A "chromosome" within the Gollem system is the sequence of atoms and it is these sequences that
are bred.
The following list describes the genetic operators currently being used. The frequency of use
is determined by the breeding parameters dialog box described on
a previous page.
 The Copy operator simply copies programs "as is" from the old population.
Parent: 1 2 3 4 5 6 7 8
The Child: 1 2 3 4 5 6 7 8
 The One Point Crossover operator picks out one point in each of two programs in the
old population and creates a new program by taking the sequence of functions up to the
point in the first parent and after the point in the second, as illustrated below:
First Parent: 1 2 3 4 5 6 7 8

+the point picked out
Second Parent: A B C D E F G H I J

+the point picked out
The Child: 1 2 3 G H I J
 The Two Point Crossover operator is similar to the one point, except two points are
chosen and the genetic material between the points is exchanged.
First Parent: 1 2 3 4 5 6 7 8
 
 +second point

+first point
Second Parent: A B C D E F G H I J
 
 +second point

+first point
The Child: 1 2 3 E F G 7 8
 The Uniform Crossover operator randomly takes a function from one of two parents
for each function in the longer parent. An example is shown below.
First Parent: 1 2 3 4 5 6 7 8
Second Parent: A B C D E F G H I J
The Child: A 2 3 D 5 F G 8 J
 The Permute Mutation picks out one parent from the old population and creates
a new program by randomly permuting (changing the order of) the sequence of functions.
Parent: 1 2 3 4 5 6 7 8
The Child: 7 4 1 6 5 8 2 3
 The Permute Substring mutation picks out two points in a program from the old population and creates
a new program by randomly permuting the sequence of functions between the two points.
Parent: 1 2 3 4 5 6 7 8
 
 +second point

+first point
The Child: 1 2 5 6 4 3 7 8
 The Point Mutation randomly changes some of the functions within one program.
Parent: 1 2 3 4 5 6 7 8
The Child: 1 2 A 4 5 B C 8
Back to the top of this page.