With our 2018 machine learning predictions, we’re taking another shot at machine learning clairvoyance with some brand new calls while also upping the ante to serious “double dog dare you” territory by reiterating some of our previous calls.
We’d like to stress that the predictions made here are shared through a lens of “machine learning in the enterprise.” As such, we’re less concerned with predicting the twists and turns in the heady world of machine learning research and more concerned with the experience of the typical enterprise looking to leverage the technology to reach its quarterly, annual, or longer-term strategic business goals.
With that out of the way, let’s start by setting the tone with some recent market research findings on the state of the industry as it stands. It’s a well-known fact that the tech giants have put their dollars where their mouths are when it comes to acquiring machine learning/AI talent. In fact, McKinsey Global Institute (MGI) estimates that up to $27 billion of the $39 billion poured into the category in 2016 came in the form of R&D and M&A investments by the top 35 high tech and advanced manufacturing companies, dwarfing investments from VCs and private equity firms. Their collective impact on ML research is undeniable. Browsing the accepted NIPS 2017 conference papers by author organization affiliation shows the likes of Google and Microsoft placing among the leading universities: Google/DeepMind/Brain (210), Carnegie Mellon (108), MIT (93), Stanford (81), Berkeley (81), and Microsoft (70).
MGI’s report also reveals that the lion’s share of VC, private equity, and M&A activity has gone towards core machine learning technologies ($7B), with computer vision ($3.5B) a distant second, and other niche AI areas such as natural language ($0.9B), autonomous vehicles ($0.5B), smart robotics ($0.5B), and virtual agents ($0.2B) taking in much more modest sums.
With that level of investment, it’s fair to say hopes are high for the future integration of machine learning into our economy. But here’s the stinker: MGI reports that adoption rates are still far below that potential.
To make matters more complicated, adoption rates across industries are very unbalanced, skewed toward large enterprises in the software/internet, telecom, and fintech sectors. The study also finds significant profit margin discrepancies between proactive AI-adopter firms and the rest. This may very well be correlation without causation, but another survey, by the Economist Intelligence Unit, had already found that most strategy-minded executives aren’t willing to wait around and risk falling prey to more agile upstarts.
Against this complex backdrop, here are our Top 10 predictions for 2018. Some are, for all practical purposes, continuations of our 2017 predictions, while others are brand new calls:
In 2018, ML (“machine learning”) maturity will be the main theme for the thousands of companies that have been dabbling in ML with limited pilot projects. According to McKinsey’s survey, more than half of the enterprises that invested in AI/ML haven’t seen their investments pay off yet. As wonderment about the technology gives way to a more objective outlook informed by its pros and cons, business leaders will collaborate more closely with their technical counterparts to double down on existing efforts.
That ML isn’t necessarily plug-and-play is a reality most practitioners have come to learn through first-hand experience, but it is still lost on many executives in the real economy. In particular, moonshot-type initiatives almost always take considerably longer than hoped. Whether business executives will be patient enough to see real returns from such projects remains anybody’s guess. Meanwhile, more down-to-earth efforts targeting low-hanging fruit will thrive.
Despite what you may have heard about new techniques rendering feature engineering obsolete, experienced ML practitioners know better. Having realized that a big part of failed ML/AI initiatives comes down to expensive data lakes of dubious value-add, CIOs and Chief Data Officers will pull the plug and instead accelerate data engineering efforts to create feature engineering repositories that support high-value predictive use cases. Some companies will go further and augment internal efforts by licensing from, or outsourcing to, domain-specific third-party feature repository companies. Easier-to-use shared feature repositories will be great success stories, as they eliminate the need to reinvent the wheel for each bespoke application and instead foster cross-departmental collaboration. Add ML platforms with more mature self-serve data wrangling capabilities on top, and a much wider analytical audience in organizations will feel empowered to explore new use cases. This scenario still implies additional investment, but it will be economical enough to give rise to well-thought-out machine learning green shoots that ultimately justify the spend.
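To make the shared-repository idea concrete, here is a minimal sketch in Python of what such a registry of reusable feature definitions could look like. All names here (FEATURE_REGISTRY, register_feature, the column names) are hypothetical and merely illustrate the pattern, not any particular product.

```python
import pandas as pd

# Hypothetical sketch of a shared feature repository: transformations are
# registered once under stable names and reused across projects instead of
# being re-implemented inside every bespoke application.
FEATURE_REGISTRY = {}

def register_feature(name):
    """Store a feature transformation under a stable, shareable key."""
    def wrapper(fn):
        FEATURE_REGISTRY[name] = fn
        return fn
    return wrapper

@register_feature("days_since_last_order")
def days_since_last_order(df: pd.DataFrame) -> pd.Series:
    # Assumes the raw table carries a 'last_order_date' timestamp column.
    return (pd.Timestamp.today() - pd.to_datetime(df["last_order_date"])).dt.days

@register_feature("orders_per_month")
def orders_per_month(df: pd.DataFrame) -> pd.Series:
    # Assumes 'order_count' and 'tenure_months' columns; guards against division by zero.
    return df["order_count"] / df["tenure_months"].clip(lower=1)

def build_features(df: pd.DataFrame, wanted):
    """Assemble a feature matrix for a predictive use case from shared definitions."""
    return pd.DataFrame({name: FEATURE_REGISTRY[name](df) for name in wanted})
```

The point of the pattern is that the next churn or upsell project calls build_features with the names it needs rather than rewriting the same transformations from scratch.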
Recent years saw a Black Friday-style rush for academic talent and research scientists with highly cited publications. Businesses with access to serious R&D dollars are certainly still looking to fill their ranks with qualified candidates for what they consider industry-disrupting projects. In many cases, this means deep learning specialists, e.g., reinforcement learning experts for robotics applications. As captivating as deep learning is, it’s still difficult to use, and experts are few.
Therefore, in 2018, most CIOs and Chief Data Officers will look to re-skill their existing workforce to achieve broader ML literacy. The average data scientist will be hard-pressed to justify complex models as easily consumable core-ML platforms deliver higher-quality baselines with much less effort. Data scientists will not disappear, but theirs may not be the sexiest job come year-end 2018.
To repeat our 2017 prediction, skilled humans will still be central to decision making despite further Machine Learning adoption. Fortune Global 2000 technologists will realize that neither expensive consultants nor top academic hires can replace subject matter expertise in the form of a detailed understanding of both the business context and the value chain dynamics of their industries. The simple economics of machine learning suggest that as the cost of predictions nears zero, human judgment will be in ever more demand.
The tools that promise fully automated end-to-end ML overreach by limiting the ways experts can intervene in the ML process, trying to fit every problem into the straitjacket of basic classification or regression modeling with hyperparameter tuning. Impactful Machine Learning does not boil down to comparing a bunch of similar algorithms on garden-variety performance metrics, but some companies will be finding that out the hard way.
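For concreteness, the snippet below is a sketch of exactly the kind of metric-only comparison this paragraph warns against: a few off-the-shelf scikit-learn classifiers ranked by a single cross-validated accuracy number on a toy dataset, with no domain framing, feature work, or error analysis behind it.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Rank a handful of similar algorithms by one garden-variety metric.
X, y = load_breast_cancer(return_X_y=True)
candidates = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
for name, clf in candidates.items():
    score = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: mean accuracy = {score:.3f}")
```

A leaderboard like this answers which of three similar models scores highest, not whether the problem framing, the features, or the metric itself were right to begin with.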
MLaaS platform adoption will accelerate, starting in “true private clouds” inside larger companies and in multi-tenant public cloud environments for medium-sized businesses and startups. The advantageous cost structure of such platforms compared to expensive consultancy and custom applications, combined with their right level of abstraction (i.e., ML building blocks and primitives at the right level of atomization to achieve the Lego effect), will make them well suited for developers and ML engineers to design and deploy point applications at scale and much faster. Cloud machine learning platforms, in particular, will do the most to democratize machine learning.
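As an illustration of those building blocks in practice, here is a sketch of a source-to-prediction workflow using the BigML Python bindings; the CSV file, field names, and input values are placeholders, and credentials are assumed to be set via the BIGML_USERNAME and BIGML_API_KEY environment variables.

```python
from bigml.api import BigML

# Each call below creates one cloud resource (a "Lego block") and the next
# call builds on it; api.ok() waits until a resource has finished building.
api = BigML()                                    # credentials read from the environment

source = api.create_source("churn.csv")          # raw file -> source
api.ok(source)
dataset = api.create_dataset(source)             # source -> dataset
api.ok(dataset)
model = api.create_model(dataset)                # dataset -> decision tree model
api.ok(model)

new_customer = {"plan": "premium", "support_calls": 4}   # placeholder input fields
prediction = api.create_prediction(model, new_customer)
api.ok(prediction)
print(prediction["object"]["output"])            # the predicted class for this input
```

The same handful of composable primitives can be rearranged into ensembles, anomaly detectors, or batch scoring jobs, which is what makes the abstraction level attractive to developers.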
Developers will have a wealth of tools to leverage, yet little in the way of meaningful benchmarks, which will create some confusion and interoperability issues, causing tensions with ML specialists, if and when they are available in the organization. There is no winner in this argument as the necessary learning process runs its course. As the dust settles, machine learning and software engineering best practices will start fusing together, avoiding technical debt and resulting in more precisely engineered and predictable end-user experiences.
The trend will be further accelerated by the availability of a growing number of specialized toolkits and SDKs optimized for vertical solutions (e.g., IoT meets ML, with lightweight local predictions favoring simpler models such as anomaly detection and reinforcement learning) that will get developers closer to an end-to-end smart app deployment experience with less and less handholding.
The number of possible ML techniques, plus the variation and length of commercial ML pipelines, together yield an exponential number of possible combinations of algorithms. Even with massive computational power (e.g., thousands of servers), one will only ever be able to try, and make work, a tiny fraction of them. The truth is that computational power will never truly replace cleverness (whether the algorithm’s or the expert’s) when searching through this space of possibilities. The current fashion of packaging many disparate open source libraries and coding paradigms into a loosely integrated “machine learning suite,” as promoted by major cloud service providers, will certainly please some data scientists. They will feel at home with some of those “checkboxes,” as these are simply cloud versions of the exact desktop or on-prem artifacts they are accustomed to. Unfortunately, this myopic stance will fail to usher in the era of truly collaborative and inclusive enterprise Machine Learning due to its inherent complexity.
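A back-of-the-envelope sketch (with entirely made-up stage options) shows how quickly the space of candidate pipelines blows up, even before hyperparameter values enter the picture:

```python
from itertools import product

# Hypothetical choices at each pipeline stage; real pipelines have more stages
# and far more options per stage.
stages = {
    "imputation": ["mean", "median", "knn", "none"],
    "scaling": ["standard", "minmax", "robust", "none"],
    "feature_selection": ["variance", "mutual_info", "l1", "pca", "none"],
    "algorithm": ["logistic", "tree", "random_forest", "boosting", "svm", "deepnet"],
    "evaluation": ["holdout", "5-fold", "10-fold"],
}

pipelines = list(product(*stages.values()))
print(len(pipelines))   # 4 * 4 * 5 * 6 * 3 = 1,440 candidate pipelines already
```

Every extra stage or option multiplies that count, which is why brute-force enumeration alone cannot substitute for cleverness in either the algorithm or the expert.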
Some in the machine learning community treat interpretability as a nice-to-have in the effort to maximize other metrics related to model accuracy. However, this viewpoint carries serious risks in a business context, as interpretability is the best debugging tool there is. Algorithmic bias can easily creep in if we blindly trust the black boxes we build. For example, a recent news story about Amazon’s same-day service in the U.S. revealed that even seemingly anonymized data can generate predictions that contain bias (in this case racial) in subtle ways through proxy variables.
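To make the proxy-variable mechanism concrete, here is a small synthetic sketch (all numbers fabricated for illustration): the protected attribute is never given to the model, yet a correlated proxy carries its signal through anyway.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, n)                                   # protected attribute, later dropped
zip_prefix = np.where(rng.random(n) < 0.9, group, 1 - group)    # proxy, 90% aligned with the group
label = np.where(rng.random(n) < 0.3 + 0.4 * group, 1, 0)       # historical outcome, already skewed

df = pd.DataFrame({"zip_prefix": zip_prefix, "label": label})   # the "anonymized" training table
print(df.groupby("zip_prefix")["label"].mean())
# The positive rate differs sharply by proxy value (roughly 0.34 vs 0.66), so any
# model trained on zip_prefix reproduces the group disparity even though the
# protected attribute itself was removed.
```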
In 2018, we expect more of these issues to make headlines as the European Union’s GDPR (General Data Protection Regulation) goes into effect on May 25, 2018. GDPR is expected to have a major effect on current “data science” practices, with strict requirements that include the right to explanation (i.e., can your deep learning model explain why this customer was denied credit?) as well as the prevention of bias and discrimination. Model transparency will only become more important, both for users’ peace of mind and for legal and ethical reasons.
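As a hedged example of what explanation-friendly modeling can look like, the sketch below trains a shallow decision tree on made-up credit data; the printed rules double as a human-readable explanation for why an individual application was approved or denied. The features, thresholds, and labels are illustrative, not a real credit policy.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Fabricated applicant data: age, months at current job, debt-to-income ratio.
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.integers(18, 75, 500),      # age
    rng.integers(0, 120, 500),      # months_at_job
    rng.uniform(0, 1, 500),         # debt_to_income
])
y = ((X[:, 2] < 0.45) & (X[:, 1] > 12)).astype(int)   # synthetic "approved" label

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(clf, feature_names=["age", "months_at_job", "debt_to_income"]))
# The printed if/then rules make it possible to trace exactly which thresholds
# led to an approval or a denial for a given applicant.
```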
Depending on the type of data you work with and the specific predictive use case, Deepnets may be the only game in town or an unnecessary and costly detour. At BigML, we are of the opinion that Deepnet models should be part of the machine learning arsenal, hence their support in the platform. Nevertheless, in 2018, the undeniable hype around DL research will likely do enterprise early adopters a disservice by pulling their attention away from more efficient and cost-effective baseline models and causing them to pour resources into specialized hardware and complex and/or unproven neural network architectures that are hard to operationalize and difficult to maintain, even when access to rare DL expertise is secured.
It’s no secret that countries like the U.S., Canada, Australia, and China have significant machine learning chops. But we predict that a more diverse group of machine learning technology and service providers, long overshadowed by the big tech nerve centers such as Silicon Valley, NYC, Boston, and the Chinese megacities, will heat up the global competition against the likes of IBM and Accenture with more straightforward approaches able to deliver ROI faster in their respective geographies. This, in turn, will raise the global Machine Learning awakening for all types of organizations in Asia, Europe, and Latin America. One resulting effect will be to slow down, and partially reverse, the brain drain toward tech nerve centers that are suffering from affordability crises of their own.
We hope you enjoyed our mere mortal attempt to describe what might unfold in our industry later this year. Do you agree? What other trends are out there that we may have missed? Let us know your thoughts and experiences, either in support of or countering the undercurrents we have summarized above, and we’ll gladly learn from them.