Test Analysis
This document contains an analysis of the behavior of GATSS, i.e. how the GA responds to different parameter settings.

Notes

This is by no means a complete test; I've just tried things that came to my mind while fooling around. For a more reliable study, the different parameter settings would have to be repeated many times and put into some kind of statistical model.

I have used a fixed seed (799500870) in all tests. This gives all executions exactly the same random series, which improves the reliability of the comparisons between the tests. All tests discussed can of course be repeated exactly (using the same seed) by the user.

The nodes used are the same 20 cities that exist initially in the HTML page.
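
As a small illustration (in Python, purely for this document; GATSS itself is not a Python program), a fixed seed like the one above is what makes every run draw exactly the same random series:

    import random

    random.seed(799500870)               # the fixed seed used in all tests
    print(random.sample(range(20), 20))  # the same 'random' ordering of 20 cities every run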


Contents

o Test 1, Initial
o Test 2, Elitism
o Test 3, Population Size
o Test 4, Crossover
o Test 5, Mutation
o Test 6, Windowing
o Test 7, Windowing with extra mutation
o Test 8, Linear Normalization
o Test 9, Linear Normalization with low decrement
o Test 10, Steady Delete


Test 1

Starting with a basic parameter setup: no mutation, no crossover, and the Delete All deletion technique with an elitism value of 0.

Result



Distance: 205.251

Analysis

It is interesting to note that the evaluation of the best chromosome is actually better in the initial population than in the converged population. This chromosome has obviously been replaced by a weaker chromosome (note the small negative peak in the red curve in the graph). Maybe this can be avoided by setting the elitism value > 0?

Test 2

Using the same parameters as in test 1, except for the elitism value which is changed to 1.

Result



Distance: 197.244

Analysis

Using elitism, the initially best chromosome is saved from being replaced. This results in a better converged solution (distance = 197.244 instead of 205.251). This solution is, however, far from optimal. By using a larger population, maybe a better best chromosome will exist initially?
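
As a sketch of the idea (in Python, for illustration only; the names and details below are my assumptions, not GATSS's actual code), Delete All replacement with an elitism count copies the best chromosomes unchanged into the next generation, so the best tour can never be lost:

    import random

    def next_generation(population, evaluate, breed, elitism=1):
        """Delete All replacement: the whole population is replaced each
        generation, except for the 'elitism' best chromosomes, which are
        copied over unchanged (lower evaluation = shorter tour = better)."""
        ranked = sorted(population, key=evaluate)
        elite = ranked[:elitism]
        children = [breed(population) for _ in range(len(population) - elitism)]
        return elite + children

    # Toy usage: tours are permutations of city indices, the tour length is a
    # toy metric, and 'breeding' just copies a random parent.  With elitism = 0
    # the best tour can be lost; with elitism = 1 it is always kept.
    random.seed(799500870)
    tour_length = lambda t: sum(abs(a - b) for a, b in zip(t, t[1:] + t[:1]))
    clone = lambda pop: list(random.choice(pop))
    population = [random.sample(range(20), 20) for _ in range(10)]
    for _ in range(5):
        population = next_generation(population, tour_length, clone, elitism=1)
    print(min(tour_length(t) for t in population))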

Test 3

Using the same parameters as in test 2, except for the population size which is increased to 300.

Result



Distance: 187.015

Analysis

I was lucky! A larger population resulted in a better best chromosome initially. This improved the converged solution to 187.015, which is still far from optimal. It seems like the convergence is too stable. Let's bring some irregularity into the population by introducing crossover!

Test 4

Using the same parameters as in test 3, except: the population size is set back to 100, elitism is set to 10, and the crossover probability is assigned the value 0.4.

Result



Distance: 131.562

Analysis

The use of crossover together with a higher elitism value really made a big difference. The best solution is down to 131.562. Maybe this can be improved even further? Let's introduce some chaos - in GA language: mutation.
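
This report doesn't say which crossover operator GATSS applies to tours, so as a hedged sketch, here is one common permutation-preserving operator, order crossover (OX); it guarantees that a child still visits every city exactly once:

    import random

    def order_crossover(parent_a, parent_b):
        """Order crossover (OX): copy a random slice from parent_a, then fill
        the remaining positions with the missing cities in parent_b's order."""
        n = len(parent_a)
        i, j = sorted(random.sample(range(n), 2))
        child = [None] * n
        child[i:j] = parent_a[i:j]                        # slice inherited as-is
        fill = [c for c in parent_b if c not in child]    # cities not placed yet
        for k in range(n):
            if child[k] is None:
                child[k] = fill.pop(0)
        return child

    random.seed(799500870)
    print(order_crossover(list(range(8)), random.sample(range(8), 8)))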

Test 5

Using the same parameters as in test 4, but assigning the mutation probability the value 0.005.

Result



Distance: 113.336

Analysis

The introduction of mutation made a small but significant improvement. You can clearly see the effect of mutation on the curve displaying the average evaluation: it becomes unstable. This means that there is always a chance that a better solution will be produced, but the search for this better solution is done 'in blindness'. Thus, the convergence is very, very slow.

I've tried the above parameter setting, letting the evolution proceed for 750 generations. It resulted in a very small improvement (111.523).
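
The exact mutation operator isn't described here either; a common choice for tours is swap mutation, sketched below (an assumption, not necessarily GATSS's operator), where a chromosome is changed, with small probability, by exchanging two randomly chosen cities - a blind step that only occasionally stumbles onto something better:

    import random

    def swap_mutate(tour, probability=0.005):
        """With the given probability, swap two randomly chosen cities in the
        tour; otherwise return it unchanged.  The change is blind: it is just
        as likely to make the tour longer as shorter."""
        tour = list(tour)
        if random.random() < probability:
            i, j = random.sample(range(len(tour)), 2)
            tour[i], tour[j] = tour[j], tour[i]
        return tour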


Test 6

Let's try the fitness evaluation technique called windowing. The theory is that a too low guard makes the few best chromosomes dominate the population, while a too high guard makes all the chromosomes look the same regarding their fitness. In the first case, the GA finds a 'best' solution quickly, but in many cases this solution is a local maximum. In the second case, the population is homogeneous and therefore converges very slowly.

Sad to say, my GA doesn't act the way it should in the second case. It does converge, and at pretty much the same rate as in the first case! I will take a closer look at my code someday to check the windowing algorithm. There's probably a bug somewhere...
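
The exact windowing formula in GATSS isn't given in this document, but one plausible reading of the guard described above (assuming tour distances are minimized and fitness is maximized) is sketched below: fitness is the distance below the worst tour in the population, plus the guard.

    def windowing_fitness(evaluations, guard=45.0):
        """Windowing, as one plausible reading (lower evaluation = better):
        a small guard exaggerates the relative differences (the best few
        dominate); a large guard makes all fitness values look the same."""
        worst = max(evaluations)
        return [(worst - e) + guard for e in evaluations]

    print(windowing_fitness([100.0, 150.0, 200.0], guard=45.0))    # [145.0, 95.0, 45.0]
    print(windowing_fitness([100.0, 150.0, 200.0], guard=1000.0))  # ratios close to 1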

Anyway, the windowing technique works well when the guard value is set to around 45. Every other parameter has the same value as in test 5:

Result



Distance: 132.162

Analysis

The population converges pretty fast, but not to a very good evaluation level. Maybe some more mutation would be appropriate?

Test 7

Using the same parameters as in test 6, except for the mutation probability, which is set to 0.4. This is a very high mutation probability, but it results in a good solution too:

Result



Distance: 98.7325

Analysis

You can clearly see how the high mutation rate makes the average evaluation line go up and down. The resulting best solution (98.7325) is the best this GA has ever reached.


The node plot above shows that this has to be one of the best possible solutions. The success of the windowing algorithm is probably a lucky strike, but I thought it deserved to be mentioned anyway.


Test 8

Now, let's try the Linear Normalization technique.
The mutation probability is set back to 0.005, the elitism value is set to 1, the fitness evaluation technique is set to Linear Normalization, and the linear decrement value is assigned the value 1.5.

Result



Distance: 130.482

Analysis

A very fast convergence, but to a pretty bad evaluation value. I have tried other values for the elitism, but it seems like L.N. is very sensitive to this value. A value of zero almost always means that the best chromosome gets replaced, and a larger value results in very bad performance.

Back to the fast convergence. Perhaps the linear decrement factor is too large? A smaller value would result in a slower convergence but, hopefully, a better solution.


Test 9

All parameters are set to the same values as in test 8, except the linear decrement factor, which is decreased to 0.15. Also, because of the probably slower convergence, the number of generations is doubled to 150.

Result



Distance: 98.7325

Analysis

An improvement. With this parameter setting, the L.N. technique manages to find the best possible solution, just like the windowing technique did. But it needed more time (or rather: more generational replacements).

The decrement factor plays a major role in the results given in tests 8 and 9. In test 8, there is a large spread of fitness values between the chromosomes, which gives the best chromosomes a big advantage. In test 9, the differences between the chromosomes are much smaller, which gives the other chromosomes a better chance to live and mate.
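
As a sketch of the mechanism (the starting fitness and floor below are my assumptions; only the decrement comes from the tests), linear normalization ranks the chromosomes by their evaluation and hands out fitness values along a straight line, where the decrement sets the slope:

    def linear_normalization(evaluations, start=100.0, decrement=1.5, floor=1.0):
        """Rank chromosomes by evaluation (lower = shorter tour = better) and
        assign fitness start, start - decrement, start - 2*decrement, ...,
        never going below 'floor'.  A large decrement spreads the fitness
        values out (as in test 8); a small one keeps them close (test 9)."""
        order = sorted(range(len(evaluations)), key=lambda i: evaluations[i])
        fitness = [0.0] * len(evaluations)
        for rank, i in enumerate(order):
            fitness[i] = max(start - rank * decrement, floor)
        return fitness

    print(linear_normalization([120.0, 98.7, 205.3], decrement=1.5))   # [98.5, 100.0, 97.0]
    print(linear_normalization([120.0, 98.7, 205.3], decrement=0.15))  # [99.85, 100.0, 99.7]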


Test 10

All the above tests were done with the Delete All deletion technique. It's time to find out how good the Steady Delete technique is in comparison. This technique is considerably slower at each generational step because of the sorting needed at each insertion of a new population member. The advantage should be a faster convergence in terms of how many generational steps are needed.

I won't do the above tests all over again; instead I'll display the best result and discuss how the Steady Delete algorithm performs in general compared to Delete All.

The parameter values are set as below:

Result



Distance: 98.7325

Analysis

This deletion algorithm is capable of finding the best possible solution, and with the above parameter setting, it does so very fast (speaking in generational terms, that is). It reaches the best solution after only 27 generational steps.

How is this possible? The most important feature is that if a good chromosome is generated, even at the beginning of a generational replacement, it can be used as a parent within that same generational replacement. This makes the algorithm much more 'intense' than Delete All, where such a good chromosome has to wait until the next generation before it can be used as material for new chromosomes.
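
A sketch of that idea (in Python, with toy operators; the details are my assumptions, not GATSS's actual code): the population is kept sorted by tour length, each new child immediately replaces the worst member, and it can be picked as a parent in the very next step.

    import random

    def tour_length(tour):
        """Toy evaluation used only for this sketch."""
        return sum(abs(a - b) for a, b in zip(tour, tour[1:] + tour[:1]))

    def steady_delete_step(population, crossover, mutate):
        """One Steady Delete step (a steady-state GA): breed a single child,
        insert it into the population sorted by tour length and drop the
        worst member.  The sort at every insertion is what makes each step
        slower than Delete All, but a good child spreads its genes at once."""
        parent_a, parent_b = random.sample(population, 2)
        population.append(mutate(crossover(parent_a, parent_b)))
        population.sort(key=tour_length)
        population.pop()                      # delete the (now) worst chromosome
        return population

    # toy operators, only to make the sketch runnable
    def crossover(a, b):
        return list(a)                        # placeholder: copy the first parent

    def mutate(tour):
        tour = list(tour)
        i, j = random.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]   # always swap two cities (high mutation)
        return tour

    random.seed(799500870)
    population = [random.sample(range(20), 20) for _ in range(10)]
    for _ in range(200):
        population = steady_delete_step(population, crossover, mutate)
    print(tour_length(population[0]))         # best tour found so far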

I've done some testing with this algorithm, and it almost always converges fast, and almost always to a fairly good solution when used together with a high mutation probability. This is very natural, since it really needs an unsteady environment to avoid local maxima.

In general, it seems like this algorithm is a little more likely to find a fair solution, when used properly (that is, with a high mutation probability), than the Delete All algorithm. This goes for all combinations of fitness evaluation techniques, etc. More testing is required to establish this statement, though.


1 May 1995, Thomas Pederson.