Should you really use machine learning for that?

Disclaimer: The following is based on my observations of machine learning teams, not an academic survey of the industry. For context, I’m a contributor to Cortex, an open source platform for deploying models in production.

Machine learning is in an awkward phase.

Its viability has been thoroughly proven (all of the most popular mobile apps use it in some way), but the ecosystem hasn’t quite matured to the point where the uninitiated can quickly ramp up.

It’s difficult for teams to decide when to introduce machine learning, particularly if no one on the team is a data scientist. Software engineers, while they typically have some high-level understanding of machine learning, often lack the domain expertise to know whether their problem is suited to ML.

The goal of this article is to present a series of informative questions for teams curious about using production machine learning. It isn’t to discuss what problems are theoretically solvable via machine learning, but rather to help teams that don’t have in-house data scientists understand whether or not applied machine learning could be effective.

Without any experienced machine learning engineers or data scientists, the hardest question for you to answer will be “Is it even possible to solve this with machine learning?”

You have three choices for answering this question:

The first two will cost you significant time or money. The last one might take a day of Googling.

Looking around the field also has the added benefit of showing you where to begin. Given that you don’t have a data scientist on your team, it is unlikely that you will be designing your own model architecture for this problem. If you want to build a support agent, look at how other companies build theirs (you’ll inevitably come across Rasa and Google’s Meena).

Getting a feel for what models and approaches have been used to solve problems like yours will give you a sense of where you should start. For example, Robert Lucian is an engineer who built a popular DIY license plate reader. His solution relied on a few pre-existing models for object detection and text extraction.

According to his write-up, he began the machine learning portion of his project simply by looking around at what other people were using in similar domains. He eventually found a model that had been fine-tuned specifically for license plates, as well as an effective model for text extraction. He was able to put both into production fairly quickly.
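The two-stage shape of that kind of solution (find the plate, then read its text) can be sketched roughly as follows. This is not Robert Lucian's code. The names `read_plates`, `detect_plates`, and `extract_text`, and the `(top, left, bottom, right)` box layout, are hypothetical stand-ins for whichever pre-trained detection and text extraction models you end up finding.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration: an image is a grid of pixel values,
# and a box is (top, left, bottom, right) with bottom/right exclusive.
Image = Sequence[Sequence[int]]
Box = Tuple[int, int, int, int]


def crop(image: Image, box: Box) -> List[List[int]]:
    """Return the sub-image that falls inside the box."""
    top, left, bottom, right = box
    return [list(row[left:right]) for row in image[top:bottom]]


def read_plates(
    image: Image,
    detect_plates: Callable[[Image], List[Box]],
    extract_text: Callable[[Image], str],
) -> List[str]:
    """Run the detector, then the text extractor on each detected region."""
    texts = []
    for box in detect_plates(image):
        text = extract_text(crop(image, box)).strip()
        if text:  # skip regions where no text was recognized
            texts.append(text)
    return texts
```

Because the two models are passed in as plain callables, you can swap in a better detector or text extractor later without touching the rest of the pipeline.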

Unless your problem is solvable using vanilla pre-trained models, you will need some relevant data to train your model.

If you are building a recommendation engine, you will need data on user profile attributes as well as viewing habits. If you are building a customer support agent, you will need docs for it to train on. In order for a model to be tailored to your domain, you need data from said domain to train it on.

However, this data doesn’t necessarily need to be proprietary. Even if you haven’t collected sophisticated data from your users, there still may be publicly available data sources you can leverage.

For example, AI Dungeon is an ML-powered choose-your-own-adventure text game that went viral a few months ago.

The game generates results on par with state-of-the-art models, despite the engineer behind it (Nick Walton) fine-tuning his model with only 50 MB of text scraped from chooseyourstory.com. This approach worked thanks to transfer learning, a technique in which the “knowledge” of a model (in this case, OpenAI’s GPT-2) is transferred to a new model, which is then fine-tuned to a more specific domain (like dungeon crawler fiction) using a smaller data set.

In many situations, machine learning is a tool for the job, but not necessarily the best one. If machine learning doesn’t provide tangible performance benefits over other solutions, it is not worth the extra overhead.

You can analyze this pretty simply by asking a few questions.

First, are there any solutions other than machine learning?

For many problems, like speech recognition or some applications of computer vision, machine learning is currently the only viable solution.

Second, can you replicate the quality of ML prediction with other solutions?

Let’s say you’re building a recommendation system, for example. If you don’t collect much user information, and you only have 100 blog posts to recommend, you can probably use a basic tagging system (e.g. if a user likes JavaScript, show them articles tagged “JavaScript”).
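To give a sense of how little code the non-ML option needs, here is a minimal sketch of such a tagging system. The function name `recommend_by_tags`, the ranking rule (count of shared tags, ties broken alphabetically), and the sample post titles are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def recommend_by_tags(
    liked_tags: Set[str],
    posts: Dict[str, Set[str]],
    limit: int = 5,
) -> List[str]:
    """Rank posts by how many tags they share with the user's liked tags."""
    scored = []
    for title, tags in posts.items():
        overlap = len(liked_tags & tags)
        if overlap > 0:
            scored.append((overlap, title))
    # Most shared tags first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical sample data.
posts = {
    "Intro to closures": {"JavaScript"},
    "Typing a React app": {"JavaScript", "TypeScript"},
    "Goroutines explained": {"Go"},
}
print(recommend_by_tags({"JavaScript"}, posts))
# ['Intro to closures', 'Typing a React app']
```

If a lookup like this already gives users what they want, a trained model would add operational overhead without a clear gain in quality.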

However, if you are curating a massive library of content and you have robust user data, machine learning is uniquely powerful in its ability to provide personalized recommendations.

Finally, can your other solution scale as well as machine learning?

One of the central promises of ML is that it is dynamic enough to remove the “human in the loop” from processes that traditionally require some manual intervention. For example, inventory management is a messy process. Products often have incomplete information, listed in inconsistent ways. As a result, manual processing is often required.

For small quantities of products, manual processing is a fine alternative to machine learning, but at scale, it can’t compare. Processing one million products would require many humans working for many hours each, whereas a product like Glisten, which uses machine learning to parse product data, can do it rapidly.

The problem is that oftentimes, machine learning’s cool factor leads to it being applied in situations that don’t entirely make sense, contributing to the general skepticism many have of machine learning as “just another hype cycle.”

The reality is that just like any other popular technology, there are use cases where machine learning is the ideal solution, and others where it is not. Knowing whether or not your project calls for machine learning can be the hardest part of getting started—particularly if you aren’t experienced in the field—but hopefully, this is a helpful start.

If you want to see some examples of production machine learning projects, check out the Cortex examples repo.
