Machine learning: What developers and business analysts need to know

Machine learning is undergoing a revolution because of new technologies and methods. Machine learning is the process of using a program that develops capabilities, such as the ability to tell spam from desirable email, by analyzing data instead of being programmed with exact steps, freeing the user from making every decision about how the algorithm functions. It is a powerful tool, not only because it spares people enormous amounts of tedious, repetitive programming, but also because it sometimes finds better solutions than humans do through manual effort.

Machine learning has applications in most industries, where it presents a great opportunity to improve existing processes. However, many businesses are struggling to keep up with the innovations. Finding skilled data scientists is difficult, yes, but the skills shortage does not tell the whole story, particularly for organizations that have made investments but not realized their potential. The most significant obstacle is the gap between data scientists with the skills to implement the methods and business leaders who can drive the necessary organizational changes.

Making machine learning successful in an organization requires a holistic strategy that involves specialists and non-specialists alike. It requires focusing the organization, analyzing business cases to determine where machine learning can add value, and managing the risks of a new methodology. For example, a data science team may be interested in using machine learning on an existing project but hold back because of time constraints, risk aversion, or lack of familiarity. In these situations, a better approach may be to create a separate project focused on laying a foundation for future work. Once the organization has working examples of machine learning, the bar for future implementations is significantly lower.

The implication is that non-specialists in the organization need to participate in the machine learning vision to make it a success, and this starts with a common understanding. Learning the analysis and math behind data science takes years, but it is important for business leaders, analysts, and developers to at least understand where to apply the technology, how it is applied, and its basic concepts.

Using machine learning requires a different way of approaching a problem: rather than coding the solution step by step, you let the algorithm find it. This is a shift in mindset for people who are used to thinking through functional steps. It takes some trust that the machine learning program will produce results, and an understanding that patience may be required.

Why is machine learning so powerful? There are many different processes (facilitated by algorithms) for making machine learning work, which I will discuss in more detail below, but the ones at the leading edge use neural networks, whose structure is loosely modeled on that of a biological brain. Neural networks are built from layers of connected nodes, and a network with many layers is called a deep neural network.
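To make the idea of layers concrete, here is a minimal sketch, in plain NumPy with random (untrained) weights, of an input flowing through two layers of neurons. A deep network simply stacks many more such layers.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=4)          # input vector with 4 features
W1 = rng.normal(size=(8, 4))    # layer 1: 8 neurons, each connected to all 4 inputs
W2 = rng.normal(size=(2, 8))    # layer 2: 2 output neurons connected to all 8 hidden neurons

hidden = np.maximum(0, W1 @ x)  # each neuron: a weighted sum followed by a ReLU activation
output = W2 @ hidden            # the output layer combines the hidden activations
print(output)                   # e.g. two class scores
```

Training a network means repeatedly adjusting the weight matrices (W1 and W2 here) until the outputs match known examples.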

Deep neural networks had only limited success until recently, when scientists took advantage of the GPUs commonly used for rendering 3D graphics. They realized that GPUs offer a massive amount of parallel computing power and used them to train neural networks. The results were so effective that incumbents were caught off guard. The process of training a deep neural network is known as deep learning.

Deep learning came of age in 2012 when a team from the University of Toronto entered a GPU-trained deep neural network in the ImageNet image recognition contest and beat the competition by a large margin. The next year, 60 percent of the entries used deep learning, and the following year (2014), almost every entry used it.

Since then, we have seen some remarkable success stories come out of Silicon Valley, giving companies like Google, Amazon, PayPal, and Microsoft new capabilities to serve their customers and understand their markets. For example, Google used technology from its DeepMind subsidiary to reduce the energy needed for cooling its data centers by 40 percent. At PayPal, deep learning is used to detect fraud and money laundering.

Outside this center of gravity there have been other success stories. For example, the Icahn School of Medicine at Mount Sinai leveraged Nvidia GPUs to build a tool called Deep Patient that can analyze a patient's medical history to predict nearly 80 diseases up to one year prior to onset. And the Japanese arm of the insurance company AXA increased its prediction rate of serious auto accidents from 40 percent to 78 percent by applying a deep learning model.

At a basic level there are two types of machine learning: supervised and unsupervised learning. Sometimes these types are broken down further (e.g. semi-supervised and reinforcement learning) but this article will focus on the basics.

In the case of supervised learning, you train a model to make predictions by passing it examples with known inputs and outputs. Once the model has seen enough examples, it can predict a probable output from similar inputs.

For example, if you want a model that can predict the probability that someone will suffer a medical condition, then you would need historical records of a random population of people where the records indicate risk factors and whether they suffered from the condition. The results of the prediction can’t be better than the quality of the data used for training. A data scientist will often withhold some of the data from the training and use it to test the accuracy of the predictions.
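The following sketch shows that train-and-test pattern with scikit-learn. The data here is synthetic, standing in for real medical records, and the feature count is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for historical records: rows are people, columns are
# risk factors, and y records whether the condition occurred.
X, y = make_classification(n_samples=1000, n_features=5, random_state=42)

# Withhold 20 percent of the data to test the accuracy of the predictions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
print("predicted risk for one person:", model.predict_proba(X_test[:1])[0, 1])
```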

With unsupervised learning, you want an algorithm to find patterns in the data, and you don't have examples to give it. In the case of clustering, the algorithm categorizes the data into groups. For example, if you are running a marketing campaign, a clustering algorithm could segment customers into groups that need different marketing messages, including specialized groups you may not have known about.
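Here is a minimal clustering sketch using scikit-learn's k-means (an algorithm discussed further below). The customer features are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer features: [annual spend, visits per month].
rng = np.random.default_rng(7)
customers = np.vstack([
    rng.normal([200, 2], [40, 1], size=(50, 2)),   # occasional shoppers
    rng.normal([900, 12], [60, 2], size=(50, 2)),  # frequent high spenders
])

# k-means finds two groups without ever being given labeled examples.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.cluster_centers_)  # one row per discovered customer segment
```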

In the case of association, you want the algorithm to find rules that describe the data. For example, the algorithm may find that people who purchase beer on Mondays also buy diapers. With this knowledge you could remind beer customers on Mondays to buy diapers and try to upsell specific brands.
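Association algorithms such as Apriori rank candidate rules by measures like support and confidence. This toy calculation, on made-up basket data, shows what those measures mean for the beer-and-diapers rule.

```python
# Each basket is the set of items from one Monday transaction (made up).
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "bread"},
    {"milk", "diapers"},
]

beer = [b for b in baskets if "beer" in b]
both = [b for b in beer if "diapers" in b]

support = len(both) / len(baskets)  # how often the pair appears overall
confidence = len(both) / len(beer)  # how often beer buyers also buy diapers
print(f"support={support:.2f} confidence={confidence:.2f}")
```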

As I noted above, machine learning applications take some vision beyond an understanding of math and algorithms. They require a joint effort between people who understand the business, people who understand the algorithms, and leaders who can focus the organization.

The implementation of a machine learning model involves a number of steps beyond simply executing the algorithm. For the process to work at the scale of an organization, business analysts and developers should be involved in some of the steps. The workflow is often referred to as a lifecycle and can be summarized in five steps: preparing the training data, training the model, testing its predictions, deploying it to production, and feeding new data back in to improve it. Note that some steps don't apply to unsupervised learning.

Consider a workflow for a supervised learning model. A big data store on Kinetica, a GPU-accelerated database, contains the training data, which a model accesses through the machine learning features of the database during the learning step. The model is then deployed to a production system where an application requests low-latency responses. The data from the application is added to the set of training data to improve the model.
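The feedback loop at the end of that lifecycle can be sketched with scikit-learn and synthetic data. Here the "production" records are simply a held-back slice of the same synthetic data set, used to retrain the model once their outcomes are known.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Treat the first 500 rows as historical records and the last 200 as
# observations that arrive later from the production application.
X, y = make_classification(n_samples=700, n_features=5, random_state=0)
X_hist, y_hist, X_prod, y_prod = X[:500], y[:500], X[500:], y[500:]

# Train and evaluate an initial model on the historical data.
X_train, X_test, y_train, y_test = train_test_split(X_hist, y_hist, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print("accuracy before feedback:", model.score(X_test, y_test))

# Fold production data (with observed outcomes) back into the training
# set and retrain, completing the loop.
model = LogisticRegression().fit(np.vstack([X_train, X_prod]),
                                 np.concatenate([y_train, y_prod]))
print("accuracy after retraining:", model.score(X_test, y_test))
```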

Using the right platform for analytics is also important, because some machine learning workflows can create bottlenecks between business users and data science teams. For example, platforms like Spark and Hadoop might need to move large amounts of data into GPU processing nodes before they can begin work, and this can take minutes or hours, while restricting accessibility for business users. A high-performance GPU-powered database like Kinetica can accelerate machine learning workloads by eliminating the data movement and bringing the processing directly to the data. In this scenario, results can be returned in seconds, which enables an interactive process.

Before GPUs supercharged the training of deep neural networks, the implementations were dominated by a variety of algorithms, some of which have been around longer than computers. They still have their place in many use cases because of their simplicity and speed. Many introductory data science courses start by teaching linear regression for the prediction of continuous variables and logistic regression for the prediction of categories. K-means clustering is also a commonly used algorithm for unsupervised learning.
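Linear regression illustrates why the traditional algorithms endure: the entire model can be fit in closed form in a few lines, as in this NumPy sketch on synthetic data.

```python
import numpy as np

# Noisy samples of the line y = 3x + 5.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 5.0 + rng.normal(0, 1, size=100)

# Ordinary least squares: solve for the slope and intercept directly.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"fit: y = {slope:.2f}x + {intercept:.2f}")  # close to the true 3 and 5
```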

Deep neural networks, the algorithms behind deep learning, have many of the same applications as most of the traditional machine learning algorithms, but can scale to much more sophisticated and complex use cases. Inference is relatively fast, but training is compute-intensive, often requiring many hours of GPU time.

Picture a deep learning model for image recognition as a graph: the input is an image, and the nodes are neurons that progressively pick out more complex features until they output a code indicating the result.

This image recognition example is called a convolutional neural network (CNN) because each neuron applies a small filter, or mask, to its patch of the image data using an operation called convolution. There are other types of deep neural networks, such as recurrent neural networks (RNNs) that can work with time series data to make financial forecasts, and generic multi-layer networks that work with simple variables.
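A minimal CNN of this shape can be expressed in a few lines with Keras, TensorFlow's high-level API. The input size (small grayscale images) and the 10 output categories are assumptions made for the sketch.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),          # small grayscale images
    tf.keras.layers.Conv2D(16, 3, activation="relu"),  # low-level features, e.g. edges
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),  # higher-level features, e.g. shapes
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),   # one score per category
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # training would call model.fit on labeled images
```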

An important thing to consider is that, unlike many traditional machine learning algorithms, deep neural networks are difficult or impossible to reverse engineer. More to the point, you can’t always determine how an inference is made. This is because the algorithm might populate weights in many thousands of neurons, and find solutions that can’t always be understood by humans. Credit scoring is an example where deep neural networks should not be applied if you want to understand how the score is determined.

Writing machine learning models from scratch can be tedious. To make implementations easier, a number of popular frameworks are available that hide the complexities and lower the hurdles for data scientists and developers.

Google, for example, offers a popular framework called TensorFlow that is well known for its support of image and speech recognition, and it provides a suite of model visualization tools in TensorBoard.

TensorFlow was designed to make it easy to train deep neural networks in parallel and on multiple GPUs, but it also supports traditional algorithms. It can work in combination with big data platforms like Hadoop and Spark for massively parallel workloads. In situations where data movement can be a bottleneck, the Kinetica platform uses native TensorFlow integration to bring GPU-accelerated workloads directly to large data sets.

TensorFlow separates the model (called an estimator) from the training algorithm (called an optimizer), allowing a user to select from multiple algorithms when training a model. For example, a specialist could write a supervised learning model using simple linear regression as the algorithm, and then compare its accuracy against a deep neural network.
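The Estimator API described above has since been superseded by Keras in TensorFlow 2, but the same comparison is easy to sketch there: the optimizer is chosen independently of the model, and a linear model and a deep network can be trained on the same data. The nonlinear synthetic data below is an assumption for the sketch.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression problem with a nonlinear relationship.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(1000, 1)).astype("float32")
y = np.sin(x) + 0.1 * rng.normal(size=(1000, 1)).astype("float32")

def fit_and_score(model):
    # The optimizer (training algorithm) is specified separately from the model.
    model.compile(optimizer="adam", loss="mse")
    model.fit(x, y, epochs=50, verbose=0)
    return model.evaluate(x, y, verbose=0)

linear = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # simple linear regression
deep = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])                                                        # deep neural network

print("linear model loss:", fit_and_score(linear))
print("deep model loss:  ", fit_and_score(deep))          # should fit the curve better
```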
