Genetic programming, though still very much in use today, was at its most popular in the nineties and early two-thousands, right before neural networks became hip.
Recently at Decisive AI we have been exploring the possible advantages of developing a game-playing agent (an intelligent artificial player for complex video games) through genetic programming, rather than through our usual and quite successful approach of training artificial neural networks with reinforcement learning.
To better understand why we are trying out this particular technique here at Decisive AI, I had a brief but thoroughly informative Q&A session with one of our main AI Analysts, Sacha G., and I’m happy to share some of the answers with you:
JH: For us at Decisive AI, what is the main difference between genetic programming and training an artificial neural network?
SG: With reinforcement learning on a neural network, there is a training phase in which the network is modified through positive and negative rewards so that it plays better, and a validation phase in which the modified network is tested to determine how well it has learned; this cycle repeats until some point of convergence. Genetic programming, by contrast, consists only of a validation phase: a population of randomized syntax trees is tested against one another to determine which perform best, and then the best trees reproduce with one another and evolve. This process likewise repeats until some point of convergence.
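The loop Sacha describes can be sketched in a few dozen lines. This is a minimal, illustrative genetic program, not Decisive AI's actual system: the function set (`add`, `mul`), the toy fitness target (fitting x² + 1), and all parameters are hypothetical stand-ins.

```python
import random

# Programs are nested tuples: ('add', left, right), or a terminal 'x' / int constant.
FUNCS = {'add': lambda a, b: a + b, 'mul': lambda a, b: a * b}

def random_tree(depth=3):
    """Build a randomized syntax tree."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(['x', random.randint(-2, 2)])
    f = random.choice(list(FUNCS))
    return (f, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, int):
        return tree
    f, left, right = tree
    return FUNCS[f](evaluate(left, x), evaluate(right, x))

def fitness(tree):
    # Lower is better: squared error against a toy target, x^2 + 1.
    return sum((evaluate(tree, x) - (x * x + 1)) ** 2 for x in range(-5, 6))

def size(tree):
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1

def random_subtree(t):
    if isinstance(t, tuple) and random.random() < 0.5:
        return random_subtree(random.choice(t[1:]))
    return t

def crossover(a, b):
    # Reproduction: splice a random subtree of `b` somewhere into `a`.
    if isinstance(a, tuple) and random.random() < 0.7:
        f, left, right = a
        if random.random() < 0.5:
            return (f, crossover(left, b), right)
        return (f, left, crossover(right, b))
    return random_subtree(b)

def evolve(pop_size=200, generations=30):
    pop = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                 # "validation phase": test everyone
        best = pop[: pop_size // 4]           # keep the fittest quarter
        children = []
        while len(best) + len(children) < pop_size:
            child = crossover(random.choice(best), random.choice(best))
            # Crude bloat control: oversized children are replaced.
            children.append(child if size(child) <= 50 else random_tree())
        pop = best + children
    return min(pop, key=fitness)

best = evolve()
```

There is no separate training phase here: each generation just tests the whole population and lets the best trees reproduce, exactly the test-then-evolve cycle described above.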
You could say that in the end they both try to approximate a function, but they just go about doing it in two different ways.
We hope that the advantage of genetic programming is that it won’t require long periods of training, which reinforcement learning certainly does; think of the roughly 200 years of gameplay it took to train AlphaStar on StarCraft 2.
In reinforcement learning it is hard to inject domain knowledge, since you can’t single out particular inputs going into the neural network and tell them to interact in a specific way. With genetic programming you still can’t tell the machine which input to use, but you can define functions that take a specific number of inputs, and then hope that somewhere in your population of syntax trees the right combination of inputs ends up inside the right function. As an example: say the state contains the enemy’s position (row and column), your own position (row and column), and your bullet count. A function computing the distance between yourself and the enemy needs only the two positions; it has no use for the bullet count. But if, in one of the syntax trees, the bullet count accidentally gets wired into a row or column slot, the computed distance will be faulty. There is no guarantee that the correct inputs will go into the correct function. You just have to hope that with enough randomness and evolution it works out correctly.
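The input-wiring problem is easy to see concretely. In this sketch the game state, the `distance` function, and all the values are hypothetical; the point is only that a fixed-arity function node accepts whatever inputs evolution happens to wire in.

```python
import math

# A hypothetical slice of game state.
state = {'self_row': 2, 'self_col': 3,
         'enemy_row': 7, 'enemy_col': 9,
         'bullets': 30}

def distance(r1, c1, r2, c2):
    # A fixed-arity function node: it takes four inputs, but it cannot
    # know whether evolution wired in the *right* four inputs.
    return math.hypot(r1 - r2, c1 - c2)

# A syntax tree that happened to wire in the intended inputs:
good = distance(state['self_row'], state['self_col'],
                state['enemy_row'], state['enemy_col'])

# A tree that accidentally wired the bullet count into a position slot --
# syntactically valid, semantically faulty; selection has to weed it out.
bad = distance(state['bullets'], state['self_col'],
               state['enemy_row'], state['enemy_col'])
```

Both calls type-check and run; only fitness pressure over generations can tell the second wiring is wrong.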
JH: Why are we trying genetic programming?
SG: The main reason is to find an alternative to the costliness of reinforcement learning in both time and resources. We simply want a faster, smarter way of actually solving the problem at hand without having to resort to brute force.
To summarize the advantages identified so far: genetic programming allows us to (i) inject domain knowledge, (ii) skip the lengthy training phase, and (iii) have visibility into the functions being used. It is very promising: in recent observations and studies it generally seems to work well, at least experimentally.
JH: For us, what would be the key advantages in using genetic programming vs reinforcement learning in neural networks, or in conjunction with it?
SG: We could use one or the other, but the best approach so far seems to be to use them in conjunction. Not at the same time: rather, we would use genetic algorithms to evolve the parameters used to tune the neural network, and/or we could potentially use neural networks inside the genetic program, as functions that we have already trained and hence fine-tuned.
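The first combination Sacha mentions, evolving the parameters that tune a network, can be sketched as a simple genetic algorithm over hyperparameters. Everything here is hypothetical: `proxy_score` is a stand-in for an actual training-and-validation run, and the parameter ranges are invented for illustration.

```python
import random

random.seed(0)  # for a reproducible sketch

def proxy_score(lr, hidden):
    # Hypothetical surrogate for "train the network and measure it";
    # it peaks near lr=0.01 and hidden=64 by construction.
    return -((lr - 0.01) ** 2 * 1e4 + ((hidden - 64) / 64) ** 2)

def mutate(ind):
    lr, hidden = ind
    return (max(1e-5, lr * random.uniform(0.5, 2.0)),
            max(4, int(hidden * random.uniform(0.5, 2.0))))

def evolve(pop_size=20, generations=25):
    # Each individual is a (learning_rate, hidden_units) pair.
    pop = [(10 ** random.uniform(-4, -1), random.choice([16, 32, 128, 256]))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: proxy_score(*ind), reverse=True)
        parents = pop[: pop_size // 2]        # keep the better half
        pop = parents + [mutate(random.choice(parents))
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=lambda ind: proxy_score(*ind))

best_lr, best_hidden = evolve()
```

In a real pipeline the expensive part is the fitness function itself, since each evaluation is a full training run; the genetic algorithm just decides which configurations are worth paying for.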
JH: What other techniques do you foresee Decisive AI trying out in the near future?
SG: Some of the techniques we could potentially look at focus on supervised learning, perhaps using human play as examples.
JH: Anything else you’d like to add about this subject?
SG: Genetic programming is cool because you give it a set of functions you think are useful, and then over time you can watch the structure of the interacting functions change and improve; the whole time it is essentially a piece of code that you can fully understand and follow, unlike reinforcement learning, where you just have a kind of black box and don’t really understand what’s going on inside.