Machine learning has uncertainty. Design for it.
We can productize and ship more data science insights — even imperfect, probabilistic ones — with the right designs.
Mar 16
Credit: whiteMocca. Used with permission.
We live in the age of machine learning. That means fewer and fewer of the products we build deal in facts as we know them: instead, they rely more and more on probabilistic things like inferences, predictions, and recommendations. By definition, these things have uncertainty. Inevitably, they will be wrong.
But that doesn’t mean they have no product value. After all, you’d probably rather know there is a 50% chance of rain than have no forecast at all. How can we unlock user value from algorithms that are bound to be wrong? We can do what forecasts do: design our products to be upfront about uncertainty.
In the age of machine learning, designing products that communicate their degree of certainty can be a huge competitive advantage:
It can unlock new value. We can productize and ship more data science insights — even imperfect, probabilistic ones — by empowering users to make their own judgments about how to use them, rather than deciding for them and shipping nothing.
It can reduce risk. Communicating uncertainty is a disclaimer: users can weigh evidence and draw conclusions at their own risk, instead of having to take a product’s claims at face value and holding it responsible for opaque, incorrect conclusions.
It can improve usability. Good design lets users see what a product is doing: visibility of system status is the first Nielsen Norman heuristic for user interface design. Visibility of uncertainty saves users the pain of figuring out for themselves how reliable something is.
The problem with all of this? Uncertainty is hard to design for. Machine learning expresses uncertainty in probabilities, but probabilities aren’t products: normal people don’t want to pore over p-values and confidence intervals, and designers don’t want to create complicated monstrosities riddled with asterisks and technicalities. Besides, non-experts are not very good at interpreting raw probabilities, sometimes turning them into terrible strategic decisions.
We should communicate uncertainty in our products, but we need effective, user-centric design solutions to do it. In this article, I’ll describe three design patterns that do it well:
Show your work
Reveal individual data points
Let users finish the puzzle
To make things concrete, I’ll mainly draw examples from Context, our legal data analytics product that extracts useful insights from the language of millions of judicial opinions. It’s a great proving ground for productizing machine learning, since our attorney user base can be highly skeptical of probability and totally unforgiving of error. We’ve learned a lot in designing a product that works for them!
Design pattern #1: Show your work
The formula here is simple: provide your probabilistic insight (The airport is a little busy right now…), and let users know how your algorithm arrived at it (…based on visits to this place). Google Maps does this:
This can make your insight both more defensible and more useful: users can adjust for any particular biases or limitations they see in the methodology, and decide for themselves how much confidence to have in the conclusion.
In Context, our insights are based on the written language of judicial documents. Below, our language algorithm has discovered that an expert witness, Dr. Giles, and a law firm, Thompson & Knight, have a previous connection: they seem to have opposed each other in the past (a good thing to know when deciding whether to hire Dr. Giles). Our design shows the exact language that led the algorithm to that conclusion. That helps users verify the conclusion, and it adds depth to exactly how this expert and law firm know each other:
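As a minimal sketch of this pattern, imagine an extractor that never surfaces an insight without the sentence that produced it. The names, text, and matching logic below are invented for illustration; Context's real system uses a far richer language model than simple co-mention matching.

```python
# Toy "show your work" extractor: return the insight together with the
# exact language that led to it, so users can verify the conclusion.
def find_connection(text, entity_a, entity_b):
    """Return the first sentence mentioning both entities, as evidence."""
    for sentence in text.split(". "):
        if entity_a in sentence and entity_b in sentence:
            return sentence.strip().rstrip(".")
    return None  # no evidence means no insight gets shown at all

# Invented opinion text for illustration.
opinion = (
    "The plaintiff retained expert witness Giles on damages. "
    "Giles opposed the methodology advanced by Thompson & Knight. "
    "The motion was denied."
)

evidence = find_connection(opinion, "Giles", "Thompson & Knight")
insight = None
if evidence:
    insight = "Giles and Thompson & Knight have opposed each other before."
```

The design point is the pairing: the insight and its evidence travel together, so the user can always check one against the other.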
Showing your work is also handy in the world of recommendation engines. The design below does it in the form of a complete sentence:
This design tells the user very literally how the algorithm works: in essence, the algorithm just enlists past customers to be the recommenders, a clever technique known as collaborative filtering. By being totally transparent, the design spares the algorithm from expectations it is not certain to meet, like recommending deeply similar, relevant, or enticing products.
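In its simplest item-based form, collaborative filtering really is just counting what past customers bought together. The purchase histories below are invented for illustration; production recommenders use more sophisticated similarity measures, but the transparent sentence maps directly onto this kind of computation.

```python
from collections import Counter
from itertools import combinations

# Invented purchase baskets for illustration.
purchases = [
    {"tent", "sleeping bag", "headlamp"},
    {"tent", "sleeping bag"},
    {"tent", "camp stove"},
    {"headlamp", "batteries"},
]

# Count how often each pair of products appears in the same basket.
co_counts = Counter()
for basket in purchases:
    for a, b in combinations(sorted(basket), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def also_bought(product, k=2):
    """'Customers who bought X also bought ...' ranked by co-occurrence."""
    scores = Counter({b: n for (a, b), n in co_counts.items() if a == product})
    return [item for item, _ in scores.most_common(k)]
```

Here `also_bought("tent")` ranks "sleeping bag" first, because it co-occurs with "tent" in two baskets.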
Lastly, sometimes showing your work can wind up being the main attraction. At Ravel Law, we built an experimental motion outcome forecaster, which computes the percentage chance that a motion will be granted by the court, and shows how different factors (such as type of asserted defense) empirically make a grant more or less likely. We exposed the factor weights with a simple visualization:
The attorneys we showed the forecaster to were much more interested in understanding the workings of these individual factors — the actionable things that they could control when drafting a motion — than the precise topline prediction of the outcome. Forecasting is hard and uncertain, but showing our (model’s) work gave our attorneys a useful reference for their work.
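One common way to build this kind of forecaster is logistic regression, where each factor's fitted weight translates directly into a "more or less likely" display: the exponential of a weight is its multiplier on the odds of a grant. The sketch below assumes that setup; the factor names and coefficient values are invented, not Ravel's actual model.

```python
import math

# Hypothetical fitted coefficients from a logistic-regression model of
# motion outcomes (values invented for illustration).
INTERCEPT = -0.4
WEIGHTS = {
    "defense: statute of limitations": 0.9,
    "defense: failure to state a claim": 0.3,
    "opposed by counter-motion": -0.6,
}

def grant_probability(factors):
    """Topline forecast: P(motion granted) given the active factors."""
    z = INTERCEPT + sum(WEIGHTS[f] for f in factors)
    return 1.0 / (1.0 + math.exp(-z))

def factor_effects():
    """Per-factor view: exp(weight) is the multiplier each factor applies
    to the odds of a grant -- the number a factor visualization can show."""
    return {f: math.exp(w) for f, w in WEIGHTS.items()}
```

A multiplier above 1 makes a grant more likely, below 1 less likely; that per-factor breakdown is exactly the actionable layer our attorneys gravitated toward.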
Design pattern #2: Reveal individual data points
The lowly individual data point may seem like a relic these days, with lots of design focus on how to abstract massive datasets down to human-digestible generalizations, often in the form of data dashboards. But generalizations can get us into trouble, especially when the data are sparse. Paradoxically, big data means we have more possible small datasets now than ever, the results of slicing and dicing big ones down to what we’re interested in. We should design for the uncertainty inherent in small data.
Furthermore, in datasets big and small, the rush to conclude can cause trouble when the data actually follow a different distribution than we assume. The underlying distribution could be skewed, bimodal, or even reversed, as in this example of Simpson’s paradox:
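The paradox is easy to reproduce numerically. In the invented counts below, treatment A beats treatment B inside every subgroup, yet loses in the pooled data, simply because A was tried mostly on the hard cases:

```python
# Invented (successes, trials) counts illustrating Simpson's paradox.
groups = {
    "easy cases": {"A": (8, 10),  "B": (70, 90)},
    "hard cases": {"A": (20, 90), "B": (2, 10)},
}

def rate(successes, trials):
    return successes / trials

# Within each subgroup, A's success rate beats B's...
for arms in groups.values():
    assert rate(*arms["A"]) > rate(*arms["B"])

# ...but pooled across subgroups, the comparison reverses:
pooled = {
    arm: (sum(groups[g][arm][0] for g in groups),
          sum(groups[g][arm][1] for g in groups))
    for arm in ("A", "B")
}
assert rate(*pooled["A"]) < rate(*pooled["B"])  # 0.28 vs 0.72
```

A dashboard that only showed the pooled rates would confidently report the wrong winner, which is precisely the danger of rolling up without showing the underlying data.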
It can be safer — and often much more useful — for a product to reveal individual, raw data points, thus communicating its uncertainty about how reliable any conclusions would be. Consider this before-and-after of one of our Context designs, showing an expert witness’s track record of successfully getting their testimony admitted in court:
The original design called for lots of “roll-up stats,” as we called them — multiple bar graphs asserting the expert’s testimony admittance rate under various circumstances. The problem was that many experts only had one or two outcomes, which wasn’t enough to credibly power those graphs. We would have ended up clumsily proclaiming lots of 0% and 100% admittance rates backed up by one data point each — not technically wrong, but implying far too much certainty about how experts testify. Instead, our final design visualizes every single challenge to an expert’s testimony, letting users spot patterns and judge for themselves how certain they should be about them.
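The logic behind that design decision can be sketched in a few lines: only roll up to a percentage when there are enough outcomes to back it, and otherwise surface the raw data points. The expert names, outcomes, and the threshold of five are all invented for illustration.

```python
# Invented challenge records for illustration.
challenges = {
    "Dr. Giles":  ["admitted", "excluded", "admitted", "admitted",
                   "admitted", "excluded", "admitted"],
    "Dr. Okafor": ["admitted"],  # a single outcome
}

MIN_N = 5  # below this, a percentage implies far too much certainty

def summarize(expert):
    """Roll up to a rate only when the data can credibly power it."""
    outcomes = challenges[expert]
    n = len(outcomes)
    if n < MIN_N:
        # Too sparse: show the individual data points, not a 0%/100% bar.
        return {"expert": expert, "outcomes": outcomes}
    return {
        "expert": expert,
        "admittance_rate": outcomes.count("admitted") / n,
        "n": n,  # always show how many outcomes back the rate
    }
```

Under these assumptions, Dr. Giles gets a rate backed by seven outcomes, while Dr. Okafor's single outcome is shown as-is rather than as a 100% admittance rate.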
Design pattern #3: Let users finish the puzzle
Machine learning can do amazing things — but it still can’t solve everything, and some parts of the puzzle are best handed off to humans. An algorithm might be good at finding pieces but remain uncertain of the overall solution. Our designs should arrange those pieces for users — and set them up to complete the puzzle on their own.
This idea is most interesting when it comes to higher-order problems that artificial intelligence has not really solved, like explaining cause and effect. Our Context users see spikes of lawsuits affecting particular companies over time and wonder what caused those spikes. Our design arranges legal cases and clusters of news stories on parallel timelines, inviting users to infer cause and effect on their own. Here, a user might hypothesize that the spike of legal cases involving Chipotle could be explained by the slightly-earlier cluster of news headlines about Chipotle and E. coli:
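The layout itself requires no causal inference from the machine: it just buckets both event streams by month and renders them side by side. The case dates and headlines below are invented for illustration of that arrangement.

```python
from collections import defaultdict

# Invented case filing months and news-cluster headlines.
cases = ["2015-12", "2016-01", "2016-01", "2016-02", "2016-02", "2016-02"]
news = {
    "2015-11": "Chipotle E. coli outbreak reported",
    "2015-12": "Chipotle closes restaurants amid E. coli fears",
}

# Bucket both streams by month onto one shared timeline.
timeline = defaultdict(lambda: {"cases": 0, "news": None})
for month in cases:
    timeline[month]["cases"] += 1
for month, headline in news.items():
    timeline[month]["news"] = headline

# Render the parallel timelines: case volume next to news clusters.
for month in sorted(timeline):
    row = timeline[month]
    bar = "#" * row["cases"]
    print(f"{month}  {bar:<4}  {row['news'] or ''}")
```

The machine only finds and aligns the pieces; the user, scanning the news column against the rising case bars, supplies the causal hypothesis.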
Problems like this — why something happened, what will happen next, what you should do about it — are hard to solve and risky to answer definitively. But we don’t have to do either, at least not completely. By suggestively laying out the puzzle pieces that algorithms do find, product designs can spark ideas for users to ponder, flesh out, and assess — putting humans in the loop to extract value from imperfect machine learning.
Designing for uncertainty lets us unlock entirely new products where machine learning might not be accurate enough for traditional designs. I’ve offered three ways product designs can convey uncertainty: showing their work, revealing individual data points, and letting users complete the puzzle. But as machine learning expands, as users grow more sophisticated about uncertainty, and as ethics and regulations call for transparency, our range of design solutions for revealing uncertainty will surely grow. I’m looking forward to seeing what we come up with.