Information
This document contains general information about GATSS.

Contents

o Background
o Genetic Algorithms
o The Traveling Salesman problem
o GATSS Algorithms
o System Overview
o References

A not complete test analysis of the behavior of GATTS is also available.
If you want to see how things are done in more detail, feel free to take a look on the source files.


Background

GATSS is a Genetic Algorithm based solver of the Traveling Salesman problem written in GNU C++ linked to a user interface in HTML via CGI-script. It does nothing revolutional to the science of GA, it is just a standard algorithm enclosed in a user friendly interface. Maybe it can be used as a tool in teaching the concepts of Genetic Algorithms.

This program and its documentation is the result of an exercise in writing some kind of application related to the science of Artificial Life. The exercise is an examination part of the course "Artificial Intelligence II" given by the Department of Computing Science at Umeå University, Sweden.

The Story
When I first heard about Genetic Algorithms, I was fascinated by the fact that for once, the nature was the role model. That something as formal and abstract as Computer Science could learn from nature.
I searched the web for information on GA:s and found several interesting applications, general GA problem solvers, as well as GA software engineering environments.

One of the applications i found was 'DOUGAL' written by Brett S. Parker in 1993. It is a GA demonstration program written in Pascal for MS-DOS solving the Traveling Salesman problem (hereafter called TSP). I thought that the TSP was a good example in showing how GA works. It was a well known graph-theoretical problem and the results could, obviously, be nice visualised.
GATSS is very similar to DOUGAL. The differences are mainly that GATSS is implemented in C++ and has a HTML user interface.
So, thank you Brett, I hope you don't mind!.

When I had decided what kind of problem to solve (TSP) and in what language (C++), I searched for software engineering tools to speed up development. My first stop was not beautiful. I won't mention the name of the system here but it was ugly. Bad written code and bad written documentation.

Later on I ran into a C++/perl GA software engineering environment called 'GAGS' written by professor J.J. Merelo, GeNeura Team at the Department of Electronics and Computer Technology, University of Granada, Spain. I had difficulties in compiling GAGS for two reasons, one: I got the latest beta-version (with link bugs) at the time, two: The system had never been compiled at the UNIX-platform I used. Mr. Merelo was very helpful, and after some email-correspondance, the system was up and running.

Since I'm no C++-guru, and Merelo really uses the language at it's full extent, together with the fact that I'm new to the field of GA, I had difficulties in understanding the code. I finally decided to create the code I needed from scratch instead. No shadow on the GAGS-system though. It is a well written general C++-library combined with perl scripts, with program examples and a nice user interface. The documentation is at the time a little poor (at least the programmers manual) but Merelo says he's working on it.
Thank you J.J. for the help in compiling GAGS! Even if I couldn't use the system itself, it has inspired my own code writing.

When the actual coding was finished, phase two of this project was to be done: Analysis of the different parameters in the GA, i.e. what effect they have on the evolution process. This is greatly simplified with a good user interface. To execute the original program, the command line would look like this:

resulting in

The difficulties in keeping track of which of the parameters that was corresponding to which part of the GA made me dizzy. I decided to create a better user interface using HTML.
Here, WWW-guru Daniel Silén, helped me out. He wrote the CGI-script that links the user interface to the program. Thank you Daniel!


Genetic Algorithms

History
Genetic Algorithms were invented to mimic some of the processes observed in natural evolution. Many people, biologists included, are astonished that life at the level of complexity that we observe could have evolved in the relatively short time suggested by the fossil record. The idea with GA is to use this power of evolution to solve optimization problems. The father of the original Genetic Algorithm was John Holland who invented it in the early 1970's.

The Basic Algorithm
  1. Initialize a population of chromosomes.
  2. Evaluate each chromosome in the population.
  3. Create new chromosomes by mating current chromosomes; apply mutation and crossover as the parent chromosomes mate.
  4. Delete members of the population to make room for the new chromosomes.
  5. Evaluate the new chromosomes and insert them into the population.
  6. If time is up, stop and return the best chromosome; if not, go to 3.
If all goes well throughout this process of simulated evolution, an initial population of unexceptional chromosomes will improve as parents are replaced by better and better children. The best individual in the final population produced can be a highly evolved solution to the problem.

The most important parts of a GA is crossover, mutation, population deletion, and chromosome fitness evaluation.

Data Structures
The data structures used in GA are usually rough simplifications of their natural prototype. Genes, chromosomes and populations are designed hierarchically somewhat like this:

The genes are usually numerals like integers, floating point numerals or simple bits.

More information about Genetic Algorithms is available at The Genetic Algorithms Archive or in the comp.ai.genetic FAQ's.


The Traveling Salesman problem

The Traveling Salesman problem (TS hereafter) is a classical graph-theoretical problem. It involves a traveling salesman who has to visit each of the cities in a given network before returning to his starting point, at which time his trip is complete. The algorithmic problem asks for the cheapest route, namely for a closed tour that passes through each of the nodes in the graph and whose total cost (i.e. the sum of the distances labeling the edges) is minimal.

Its variants arise in the design of telephone networks and integrating circuits, in planning construction lines, and in the programming of industrial robots, to mention but a few applications. In all of these, the ability to find inexpensive tours of the graphs in question can be quite crucial.

TS is a NPC (Nondeterministic Polynomial-time Complete) problem which in short means that it is unsolvable with a standard linear algorithm if the problems complexity is large enough. TS can be solved with a very fast computer when the number of nodes are up to, say 50. If the graph is larger, a non-linear algorithm is needed. GA is one alternative.


GATSS Algorithms

In this section I will discuss the GATSS specific algorithms. Since GATSS is a true GA, the execution algorithm outlined in the general "Genetic Algorithms" section above applies to the execution cycle of GATSS.
The algorithms that is specific for GATSS are found within the larger steps of the overview algorithm and are implied by the problem at hand, the Traveling Salesman problem.


Crossover
The crossover types used in GATSS are called Fixed Position and Fixed Position Sequential:



Mutation
One at random selected part (consisting of consecutive genes) of the chromosome is permuted. In practice, this is done by switching location of the genes within the selected part.



Fitness Evaluation
The fitness is calculated in two steps. First, all chromosomes in the population is evaluated. Then, one of the three fitness algorithms are applied.

  1. Evaluation
    The genes in this GA application are nodes in a graph, and a chromosome can be regarded as a walk through this graph. To evaluate how good the solution is, the total length of the walk is calculated by summing the distances between each node (gene).


  2. Fitness
    The fitness calculation is based on the evaluation described above. Since the system uses a parent selection technique called 'Roulette Wheel Selection', all fitness calculation techniques starts with assigning the initial fitness value by inverting the evaluation value. By doing this, the chromosome with the shortest solution gets the highest initial fitness value, and a higher probability in becoming parent to new chromosomes.



Population Deletion
In this system, two different kinds are implemented:



System Overview

GATSS consists of a C++ program, a HTML user interface and a CGI-script that links the two former parts together. I this section I will discuss the design of the C++ program in short.

The program consists of a main part and three classes: Gene, Chromo and Pop.


Gene
This class defines the 'atomic' parts of the GA.

A Gene consists of a name, x- and y-coordinates, a distance vector describing the distances to other genes (only calculated once at every program execution) and a gene index.

Possible operations on a Gene can be seen in the class declaration above.


Chromo
This class defines what a chromosome looks like.

A Chromo consists mainly of a Gene vector and two evaluation values (one original and one modified by one of the fitness algorithms).

Possible operations, including crossover and mutation, can be seen in the class declaration above.


Pop
This class defines the population.

A Pop mainly consists of a Chromo vector.

Possible operations can be seen in the class declaration above.


Main
The major object for this part is to assign the GA the user parameters and to execute the GA. The GA execution loop looks like this:


References:

Parker, B. (1993). DOUGAL. Found somewhere in the web.
Davis, L. (19xx). What is a Genetic Algorithm?
Harel, D. (1987). Algorithmics: the spirit of computing. Addison-Wesley Publishing Company.
1 May 1995, Thomas Pederson.