12 reflections on data (and its representation) that we don’t want to forget in the “next-normal”

Data is everywhere: in every headline, and at the center of every conversation. In only a few months since the pandemic began, data of all types has become an essential language for understanding and making sense of a rapidly changing world. But how fluent in this language is the general public? And how can we, as practitioners and researchers, design data visualizations that fully represent the nuances and implications of a very complex situation?

Due to the media coverage of COVID-19 data, we have been exposed to (very) bad and (very) good examples of visualizations. We’ve seen visualizations used both to clarify and to hide arguments, both to support and to deny research evidence, and even as tools for propaganda. We want to make sure we always contribute to clarity and consistency, fully aware of the potential risks and pitfalls and, in the end, of the responsibility of shaping data and making it available for public discourse.

This is a summary of what we’ve observed and “pinned,” a checklist to stay on track for a more transparent and open future of data visualization.

***

These reflections originated in response to the way data has been displayed over the course of the COVID-19 pandemic. While most examples are COVID-19 specific, we believe they are valid in a broad sense.
Data isn’t neutral. How it’s created and calculated determines its meaning. The methods by which data has been derived must be accounted for in its visualization. Have these two numbers of cases been calculated using the same methodology? A death can have different “meanings” because countries have measured it differently (e.g., people who died and had also tested positive versus people without underlying conditions who died directly because of COVID-19).

The ratio between data points is often more relevant than absolute numbers. Is it right to compare positive cases in cities with different numbers of residents? Do two datasets share the same denominator or baseline?

Context is key. To fully understand a phenomenon, the circumstances surrounding a number must be treated with the same importance as the number itself. What policy changes could have caused that peak in the curve? What regulations and norms were in place? When were strict isolation measures introduced in this timeline?

Data representing a specific moment in time may not tell the full story. Extending the frame of reference to account for events before and/or after a phenomenon can give a fuller picture of the forces at play. Could information from the past help bring clarity to this issue? What was the situation before the data collection started? Is our current perspective too narrow?

Data visualization design prioritizes certainty, and calcifies it as an absolute fact. Yet uncertainty is a fact of life, and inherent in any phenomenon. Uncertainty is often just as important to visualize as certainty. How (un)reliable is the process? How big is that margin of error? How can we render fuzziness, ambiguity and reliability?

The data we don’t have (i.e., gaps and holes in datasets) could be as important as the data points we do have. On which days were cases not recorded? For how long were tests unavailable? Is this a zero, or is there no data?
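Two of the questions above, whether counts share a denominator and whether a value is a zero or a gap, can be made concrete with a small sketch. It is only an illustration under stated assumptions: the city names, the figures, and the helper names `per_100k` and `summarize_daily` are hypothetical and do not come from any real dataset or library.

```python
from typing import Dict, List, Optional


def per_100k(count: int, population: int) -> float:
    """Express a raw count as a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical cities with made-up figures, for illustration only.
cities = {
    "City A": {"cases": 5_000, "population": 8_000_000},
    "City B": {"cases": 1_200, "population": 300_000},
}
rates = {
    name: per_100k(c["cases"], c["population"]) for name, c in cities.items()
}
# City A has more cases in absolute terms, but City B has the higher rate.


def summarize_daily(counts: List[Optional[int]]) -> Dict[str, int]:
    """Summarize a daily series in which None marks a day with no record.

    A recorded 0 is a real observation; None is a gap, so it is reported
    separately instead of being silently counted as zero.
    """
    recorded = [c for c in counts if c is not None]
    return {
        "total_recorded": sum(recorded),
        "days_with_zero": sum(1 for c in recorded if c == 0),
        "days_missing": len(counts) - len(recorded),
    }


# Hypothetical week fragment: two days were never reported.
daily = [12, 0, None, None, 7]
summary = summarize_daily(daily)
```

Keeping gaps as `None` rather than `0` lets a chart draw a break in the line, or annotate the missing days, instead of implying that nothing happened on them.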
Show the origin of data and how it is processed to help readers fully understand it. Is it clear what the source of this data is? How have the sources been combined? Which processes of analysis and aggregation did they go through?

One size does not fit all. The design of a visualization should respond to the specificities of the dataset in question, what it stands for, and its communicative purpose. Should we represent numbers of hospitalizations with the same chart we use for numbers of deaths in the same presentation? Is the chart appropriate for the specific purpose of that visualization?

Most observed phenomena are complex in nature. We aim for visualizations that preserve as many dimensions of this complexity as possible and represent its richness, within the limits of readability and understanding. Can I integrate another variable into the visualization without making it too complicated? What would be useful to add to understand the big picture?

Attention and interest cannot be taken for granted, especially with a general audience. Aesthetics and rhetoric are powerful triggers to spark people’s curiosity and guide it through the data. Is this chart able to attract readers and guide them through the data? Is it rich enough, yet simple enough to read?

Every data visualization lives and performs on a “stage” that shapes the reader’s relationship to the data. Does the visualization exploit the specific features of the medium? Is it appropriate for the digital device the reader will be using? Would interaction and animation add value, and are they possible in this format?

To truly connect numbers to what they stand for, data visualization should acknowledge, and even accentuate, the inherent human component. Can people relate what they read in the form of charts to what they are experiencing in daily life? How can we preserve individuality when aggregating data? Can we make readers “feel” the phenomenon behind the data?
An architect and communication designer, Paolo Ciuccarelli is a Professor of Design at Northeastern University’s College of Arts, Media and Design, following twenty years at Politecnico di Milano in Italy. At Politecnico he coordinated the Communication Design program (BSc and MSc), served on the board of the PhD in Design, and founded the DensityDesign Research Lab, an award-winning laboratory for data visualization and information design. At Northeastern University he is the founding director of the Center for Design, an interdisciplinary hub that fosters design research. Paolo’s research focuses on the design transformations that help make sense of data and information to improve decision-making processes, especially with non-expert stakeholders and for controversial, complex social issues, where he is also experimenting with the role of rhetoric and aesthetics in deeper engagement. He is the author of publications that have received best-paper awards, has lectured at several international institutions including the Royal College of Art, ENSCI Les Ateliers, the Glasgow School of Art, King’s College, MIT Media Lab and the Stanford Humanities Center, and has been invited to speak at conferences such as Eyeo, TEDx, Visualized, NetSci and Congreso Futuro.

Giorgia Lupi is an information designer and a Partner at Pentagram in New York. After receiving her master’s degree in Architecture, she earned her PhD in Design at Politecnico di Milano. In 2011, she co-founded Accurat, an internationally acclaimed data-driven design firm with offices in Milan and New York. She is co-author of Dear Data and of the new interactive book Observe, Collect, Draw: A Visual Journal. Giorgia is also a public speaker; her TED Talk on her humanistic approach to data has over one million views. She was named one of Fast Company’s 100 Most Creative People in Business in 2018, and she recently joined MIT Media Lab as a Director’s Fellow. She is also a member of the World Economic Forum’s Global Future Council on New Metrics. Her work is part of the permanent collection of the Museum of Modern Art, where in 2017 she was also commissioned to create an original site-specific piece, and of the permanent collection of the Cooper Hewitt, Smithsonian Design Museum.