If I had to pick the hardest video game that I ever played in my childhood, I would probably say Flappy Bird. Now, I honestly didn’t play a lot of video games as a kid, but Flappy Bird was something I just never clicked with. Now, I wasn’t one to throw my phone across the room because I couldn’t get above 30, but it was still a hard game that I spent hours playing yet still couldn’t get past 15.
So, after learning about AI and reinforcement learning, I decided to make a program that could play Flappy Bird for me. In this article, I will break down how the code works and how I used the NEAT algorithm to teach my computer how to play one of the hardest games on the planet.
For starters, the NEAT algorithm stands for NeuroEvolution of Augmenting Topologies, and it is an evolutionary algorithm that creates artificial neural networks.
To break this word into smaller bite-sized bits, evolution indicates that this is a genetic algorithm. This means that it is inspired by Charles Darwin’s theory of evolution and the algorithm reflects the process of natural selection. In other words, those who succeed will live and the weaker ones will die.
The word neuroevolution indicates that it is the neural network that is evolving such as it does in the human brain. Actions in life that are rewarded will stay and continue while those that don’t result in a reward will be discarded.
Due to the fact that the larger your neural network is, the harder it is to train, the neural network actually begins with no hidden layers and only has an input and output layer in the first iteration. After that, the network will evolve and create either new nodes or connections. Yet, it cannot affect the amount of input and output nodes, those are fixed. This leads us to the next work, augmenting. As more iterations and improvements occur to the neural network, it augments and improves.
Finally, the last word is topologies. Topologies is just a fancy term to describe the connections of the neurons in a neural network. Different forms and shapes of the neural network affect the output. Down below is a list of potential forms that a neural network can take.
Now, how does the Neat algorithm actually work? Well, because this algorithm is based on natural selection and evolution, we needed a set of flappy birds with their own neural network and therefore their own unique set of weights and biases. Luckily, we don’t have to figure out any of those numbers and the algorithm does that for us.
However, we do need to figure out what we want the inputs and outputs of the algorithm to be. When playing Flappy Bird, we can use our eyes to determine how far the next pipe is and decide whether or not we should make the bird jump or not. However, the computer unfortunately cannot see, so we need to give it this information digitally.
As a result, there are 3 input nodes to the algorithm. The first is the position of the bird, and the second and third are the position of the bird relative to the top pipe and relative to the bottom pipe. From this information, the computer will determine whether it should jump or if it shouldn’t to improve it’s fitness score.
The fitness score indicates how good the algorithm is. So, in this scenario with Flappy bird, we want the bird to travel as far as possible without dying. As a result, our fitness score will increase by 1 for every frame we travel.
Next is where the NEAT algorithm does most of the heavy lifting. After the first iteration of the algorithm is created with random weights and biases, the game is played multiple times. Then, from there, the birds with the best fitness score will survive and will genetically modify to become even better.
Similar to humans, the neural network has genetic code and properties that make it unique. Such as how you may have black hair and brown eyes and your friend has blonde hair and blue eyes, the code has properties that its friends do not share. Additionally, neural networks evolve and have children as humans do. So, after each run-through of the program, the neural network finds another neural network to combine and make a new neural network ‘child’.
This child will have elements that the parent has. Such as how you may have adopted your parents hair colour, the neural network adopts certain connections and nodes from its parents. If we take a look at the example down below, we can see how the child network has elements from both his parents.
Even though this picture is fairly accurate, this doesn’t accurately how attributes are always inherited. From the image, we can see at the end how there are some elements that aren't 1 has and parent 2 doesn’t and vice versa.
Because of this, most of the time attributes that one parent has that the other doesn’t will be passed on depending on their fitness score. Say for example after the first iteration, parent 1 had a higher fitness score. That would mean that when they try to have a ‘child’, all of the attributes that are the same will be passed on but the disjoint block that parent 1 has and parent 2 doesn’t will be passed on as well. This also means that all of the disjoint blocks and excess blocks from parent 2 will be discarded.
Essentially, the NEAT algorithm will take the similarities that the neural networks have and pass them on to their child. Then, for the properties that don’t appear in both networks, the neural network will take the properties from the network with the highest fitness score.
When creating a new iteration of an existing neural network there are 2 possible changes that can be made to the neural network.
Both of these changes allow for the neural network to improve its fitness score and achieve its goal. Luckily, these decisions are made by the NEAT algorithm without the user having to program anything.
One of the major aspects of NEAT that make it stand out is its property of protection through speciation. After each iteration, the neural networks are split up into families. So, those that have very similar structure are grouped together. For example, apples are paired with apples and oranges are paired with oranges.
From there, the best neural network from each group are chosen and iterated upon. This helps to worse neural networks to innovate and improve. Instead of comparing all of the neural networks to each other and possibly erasing a neural network with a lot of potential, you only compare neural networks to others that have the same structure to it.
This leaves worse neural networks to improve and potentially beat out the weights and biases of a neural network that got lucky in the first iteration. From here, each neural network is modified and a new batch of Flappy are created from the best Flappy birds of each category.
Another reason that the NEAT algorithm is better than other evolutionary algorithms is because of its ability to explore many different topologies without compromising speed and effort.
Most neuroevolutionary algorithms unlike NEAT, begin with random topologies. Starting off with random topologies is necessary as it is hard for new topologies to be randomly created after many iterations of a successful topology. However, an array of random topologies may not be the most useful approach.
Due to the fact that these topologies at the start have never gone through any iteration of the game and or activity, there is no way of knowing if that topology is actually necessary or useful. As a result, the algorithm may end up wasting a lot of time and effort trying to fit a square in a circular hole.
In contrast, NEAT begins with a uniform population of networks with no hidden layers. This allows the structure of the neural networks to incrementally grow and improve iteration after iteration. Additionally, because of the fact that topologies are protected by speciation, it is guaranteed that structures will only grow as necessary. This way, NEAT searches through a minimal number of weight dimensions, significantly reducing the number of generations necessary to find a solution.
All in all, the NEAT algorithm is an absolutely efficient and capable algorithm for reinforcement learning. There are many perks to the algorithm so that it can iterate quickly and also make it effective for finding the right neural network for the Flappy Bird game.
P.S. This more of an introduction to the NEAT algorithm and to learn more about NEAT, you can read the research paper here!