The Data Daily

Introduction to AI and Reinforcement Learning

Introduction:
Recently, I’ve been going down an AI rabbit hole. It started out with just watching YouTube videos and reading articles, but eventually I came across a facet of AI that really, really intrigued me, not just because of how interesting the concept is, but also because of the potential uses it may have. It’s known as Reinforcement Learning.
Overview of Artificial Intelligence:
To start, we need to understand what exactly Artificial Intelligence (AI) is. AI is a pretty blanket term — it’s the science of making intelligent machines, especially computer programs. We’re going to go over a subset of AI, machine learning.
Machine learning is basically using a computer system to make accurate and reliable predictions about, or interpretations of, the data the system has been given. The first two types of machine learning were supervised learning and unsupervised learning. (This is simply an overview of each approach; I don’t go too in depth on the tech behind them.)
Supervised Learning
Supervised learning is a task-driven form of machine learning. Labelled data is used to train the system, and when new data is presented, the system will be able to make a prediction.
For example, you can feed a system labelled images of cows and chickens, and over time, it will be able to tell the two apart. The system fits a function to the data in order to predict what output a given input will lead to. Supervised learning is used for pattern recognition, spam detection, object recognition, etc.
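To make the cows-and-chickens idea a bit more concrete, here is a minimal sketch of supervised learning. It is not how an image classifier actually works: as a stand-in for images, I assume each animal is described by two made-up numbers (weight and height), and I use a very simple technique called a nearest-centroid classifier. The function names `train` and `predict` and all of the data are my own inventions for illustration.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new data point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Invented labelled data: (weight in kg, height in cm) -> animal
training_data = [
    ((600.0, 150.0), "cow"),
    ((720.0, 160.0), "cow"),
    ((550.0, 140.0), "cow"),
    ((2.0, 40.0), "chicken"),
    ((3.0, 45.0), "chicken"),
    ((2.5, 35.0), "chicken"),
]

model = train(training_data)
print(predict(model, (650.0, 155.0)))  # a heavy, tall animal
print(predict(model, (2.8, 42.0)))     # a light, short animal
```

The key point is that every training example comes with its answer attached. The system only has to generalise from those labelled examples to new, unlabelled ones.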
Unsupervised Learning
Unsupervised learning, unlike supervised learning, is data-driven. The system doesn’t set out to recognise anything in particular, but to observe. It is given raw, unlabelled data and has to figure out the patterns in that data on its own, then group the data accordingly.
For example, say there are five red data points and five green data points. The system will recognize that the data points differ in color, and will group them accordingly. Say we add five blue points, and give each of the now 15 points a random number from 1–3. The system can now group the points by color, by number, or by both.
Unsupervised learning is mainly for clustering problems, or grouping data. It is also used for anomaly detection — that is, noticing inconsistencies in datasets. This is especially useful for things such as bank transactions.
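Here is a rough sketch of what a clustering problem can look like in code. I'm assuming a well-known algorithm called k-means and a handful of invented one-dimensional values. The function name `k_means` and the way I pick the starting centers (evenly spaced values from the sorted data) are my own simplifications, not a standard recipe.

```python
def k_means(values, k, iterations=10):
    """Group 1-D values into k clusters with plain k-means."""
    # Simplifying assumption: start from k evenly spaced values in the sorted data.
    ordered = sorted(values)
    spacing = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centers = [ordered[round(i * spacing)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: attach every value to its nearest center.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters


# Invented, unlabelled data: nobody tells the system there are three groups' worth of meaning here.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 8.8, 9.0]
for group in k_means(data, k=3):
    print(sorted(group))
```

Notice that no labels are passed in anywhere. The only hint the system gets is how many groups to look for.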
These were the first two types of machine learning, and they’re both well and good, and pretty useful. But, as Grand Master Yoda said, “There is another.” The third type of machine learning, and perhaps the most interesting (to me, at least), is Reinforcement Learning.
Introduction to Reinforcement Learning:
Reinforcement learning is a pretty interesting concept, and has some pretty powerful applications. It is based around a decision loop. The system, or the agent, evaluates a given environment and then takes an action. It receives feedback, or a reward, based on the effect the action had on the environment, and then repeats the loop.
Over time, the system will learn what actions yield positive rewards, and which actions do not yield a positive reward. This means that, as the agent keeps trying and failing, it will get better at navigating the environment.
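The decision loop described above can be sketched in a few lines. This is a toy example under my own assumptions: the "environment" is a corridor of five cells, the agent starts in the first cell, and it earns a reward of 1 for reaching the last one. I use tabular Q-learning, one common RL algorithm, and every name and number here (`N_CELLS`, `step`, `choose_action`, `train`, the learning rate, and so on) is made up for illustration.

```python
import random

N_CELLS = 5          # corridor cells 0..4; the goal is the last cell
ACTIONS = (-1, +1)   # index 0 = step left, index 1 = step right


def step(state, action_index):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_CELLS - 1)
    done = next_state == N_CELLS - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, epsilon, rng):
    """Pick randomly with probability epsilon (or on a tie), else pick the best-known action."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(len(ACTIONS))
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run the observe -> act -> reward loop and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]  # one value per (state, action)
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)  # preferred action in each non-goal cell
```

At the start, the agent wanders around more or less at random. As rewards trickle back through the table, it learns which action pays off in each cell.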
There are some pretty fascinating examples. Two pretty cool agents built with reinforcement learning are AlphaGo and OpenAI Five, developed by DeepMind and OpenAI, respectively. AlphaGo was the first computer program to beat a professional human player at the ancient Chinese game of Go. OpenAI Five is incredible because it was able to beat professional players at Dota 2, a video game that requires good strategy skills and teamwork in order to achieve success.
Reinforcement learning could also be the basis for AGI, or Artificial General Intelligence. AGI is the idea of an AI that can, hypothetically, learn or do anything a human could. The reasoning is that reinforcement learning is closer to how humans learn than supervised or unsupervised learning is. However, this is still a very big “maybe.”
RL can be used in vehicle automation, or self-driving cars. The U.K.-based company Wayve has trained a self-driving car with RL. RL can also be used in finance. IBM has created an RL model that can trade stocks. RL can also be used in industrial automation, robotics, and media personalization.
Pros of Reinforcement Learning:
1. Reinforcement learning is innovative. With regular supervised learning, the system learns to reproduce the answers in its labelled data. With RL, the system can try actions that haven’t previously been discovered and create new strategies to solve problems.
2. Reinforcement learning is very similar to human learning. Like supervised learning, it learns by making mistakes and correcting them. But reinforcement learning takes it one step further by adding a reward system.
3. It can outperform humans in certain tasks. See AlphaGo by DeepMind and OpenAI Five by OpenAI.
4. It balances exploration and exploitation. Exploration is when you try a different method to see if it is better than what has been done in the past. Exploitation is sticking with things that have worked well in the past. RL algorithms are designed to strike a balance between the two.
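The exploration/exploitation trade-off from point 4 can be sketched with a classic toy problem, the multi-armed bandit: imagine a few slot machines with different (hidden) win rates. Below is a minimal sketch assuming an epsilon-greedy strategy, which is one simple way to balance the two. The function name `epsilon_greedy_bandit`, the win rates, and the value of `epsilon` are all assumptions I picked for illustration.

```python
import random


def epsilon_greedy_bandit(win_rates, pulls=5000, epsilon=0.1, seed=0):
    """Play a bandit with hidden win rates; return pull counts and reward estimates."""
    rng = random.Random(seed)
    counts = [0] * len(win_rates)        # how often each option was tried
    estimates = [0.0] * len(win_rates)   # running average reward per option
    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: try a random option to see if it beats what we know.
            arm = rng.randrange(len(win_rates))
        else:
            # Exploit: pick the option that has worked best so far.
            arm = max(range(len(win_rates)), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < win_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental mean: shift the estimate toward the newest reward.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


# Invented win rates; the agent never sees these directly.
counts, estimates = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(counts)
print([round(e, 2) for e in estimates])
```

With `epsilon=0.1`, the agent spends roughly one pull in ten trying something at random and the rest on its current favourite. Set `epsilon` to 0 and it may get stuck on a mediocre option forever; set it to 1 and it never uses what it has learned.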
Cons of Reinforcement Learning:
1. Reinforcement learning requires a lot of data. In video games, there is a lot of data available, but in real life, not so much is available.
2. RL also requires a lot of computing power.
3. RL is based on the Markov Decision Process, and thus assumes the world to be Markovian. That is, what happens next depends only on the current state and the action taken, not on the full history of events that came before. The real world is often not like this.
4. Another big issue that RL presents when looking at real world applications is how expensive it can be. RL is based on trial and error. When the system first starts training, it will fail — likely miserably. So, if you wanted to test RL with a robot, be prepared for said robot to break. Continuously repairing and maintaining vessels for RL systems is very expensive.
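To show what the Markov assumption from point 3 means in practice, here is a small sketch. I'm assuming a hypothetical walking robot with just two states, and all of the probabilities in `TRANSITIONS` are invented. The thing to notice is that `next_state` only ever looks at the current state and the chosen action. It has no access to anything that happened earlier.

```python
import random

# Hypothetical transition table: P(next_state | current_state, action).
# All numbers are invented for illustration.
TRANSITIONS = {
    ("standing", "walk"): {"standing": 0.9, "fallen": 0.1},
    ("standing", "run"): {"standing": 0.6, "fallen": 0.4},
    ("fallen", "get_up"): {"standing": 0.7, "fallen": 0.3},
}


def next_state(state, action, rng):
    """Sample the next state using only the current state and action."""
    outcomes = TRANSITIONS[(state, action)]
    states = list(outcomes)
    weights = list(outcomes.values())
    return rng.choices(states, weights=weights)[0]


rng = random.Random(0)
state = "standing"
for action in ["walk", "run", "walk", "run"]:
    if state == "fallen":
        action = "get_up"  # the only action available after a fall
    state = next_state(state, action, rng)
    print(action, "->", state)
```

A real robot doesn't behave this neatly. A motor that has already overheated three times is more likely to fail again, and that kind of history is exactly what a Markovian model leaves out unless you fold it into the state.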
Conclusion:
RL is a pretty cool form of machine learning, with some astounding potential. It can be used for automobiles, as well as financial analysis. Through trial and error, RL can get better at evaluating real-world situations, and thus complete tasks and react appropriately to unexpected consequences.
Despite this, there are still many challenges that we face when it comes to RL. It’s expensive and isn’t really optimized for the real world yet. It also takes tons of computing power, and data can be hard to get, mainly because creating an entire world for an agent to train in is pretty tedious.
Nevertheless, I look forward to researching more about RL!
