Based on a forthcoming paper in Proceedings of Privacy Enhancing Technologies by Priyanka Nanayakkara, Johes Bater, Xi He, Jessica Hullman, and Jennie Rogers. Pre-print available here.
TL;DR: Differential privacy (DP) is a definition of privacy that allows for precise accounting of the privacy loss resulting from analyses over a dataset. Differentially private analyses, however, require setting a “privacy budget,” which controls a trade-off between the privacy and accuracy of results. There is little guidance on how to set the privacy budget, in part because doing so requires considering the impacts of multiple probabilistic outcomes in a specific application context. Visualizing Privacy (ViP) is an interactive visualization tool that makes these trade-offs more concrete, to help practitioners without DP expertise make privacy budget decisions.
Imagine you’re a medical researcher publicly releasing the rate of a certain disease in a database of patient records. Although you only want to release an aggregate statistic, you face a confidentiality challenge: publishing any statistic computed over a dataset “leaks” information about the data and could increase an attacker’s ability to discover which individuals were diagnosed with the disease.
To release the statistic while maintaining confidentiality, you might turn to differential privacy (DP), a mathematical definition of privacy which says that an analysis conducted with or without any individual’s data should produce similar outputs. DP protects individual-level privacy and has gained traction over the last few years, with adoption by the U.S. Census Bureau, Google, Apple, and Facebook, among others. But for DP to be successful in a given setting, a data curator must reason carefully about how to balance privacy and accuracy.
How can we design interfaces for using DP that lead to good budget decisions? First, let’s review the formal definition of DP.
Formally, the definition of DP is as follows:
Say that D and D’ are databases that differ by one record. Then a randomized mechanism M satisfies ε-DP if the following holds for every output o of M:

Pr[M(D) = o] ≤ e^ε · Pr[M(D’) = o]
To satisfy DP, mechanisms often add some random noise (drawn from a specified probability distribution) to a result. More added noise implies stronger privacy protection, as each individual’s contribution to the data is further obscured. But this also means lowered accuracy — in other words, the release will be farther off from the result with no noise added. ε, or the “privacy budget,” from the definition above controls the shape of the noise distribution. Higher ε means less noise.
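To make this concrete, here is a minimal sketch of the Laplace mechanism, one common way of satisfying DP, assuming a proportion query over n records so that changing one record shifts the result by at most 1/n (the query’s sensitivity). The function name and the values used are illustrative, not taken from the paper.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with Laplace noise of scale sensitivity / epsilon.

    Smaller epsilon -> larger noise scale -> stronger privacy, lower accuracy.
    """
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Illustrative values: a disease rate over n = 1,000 patients; changing one
# record shifts the rate by at most 1/n, so the sensitivity is 1/n.
n = 1_000
true_rate = 0.27
print(laplace_mechanism(true_rate, sensitivity=1 / n, epsilon=0.1))  # noisier
print(laplace_mechanism(true_rate, sensitivity=1 / n, epsilon=1.0))  # closer to 0.27
```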
In many cases, the goal of statistics is to generalize from a sample to a larger population. Calculating a confidence interval (CI) under DP can support extrapolation in the face of confidentiality concerns. Similar relationships between ε, accuracy, and privacy exist with CIs under DP. However, there are now multiple sources of uncertainty that must be accounted for — for example, the uncertainty in the parameter estimate that the data support and the probability distribution that dictates how much DP noise will be added. These sources of error are distinct but both combine and propagate in the released CI.
In general, there is no formal guidance for setting ε. The task requires reasoning about the accuracy-privacy trade-off and multiple sources of uncertainty, as described above. This is where an interface for budget setting can matter. Ideally, an interface should support a practitioner in 1) understanding the ε-accuracy trade-off, 2) understanding the ε-privacy trade-off, 3) understanding statistical inference under DP, and 4) splitting ε across queries.
With this in mind, we designed and built Visualizing Privacy (ViP) (demo), an interactive visualization interface aimed at helping practitioners — such as medical researchers who must be aware of confidentiality concerns — weigh trade-offs with the goal of setting a contextually-appropriate value for ε.
The primary point of interaction between a user and ViP is through a set of privacy budget sliders. Updating ε allocated to each query in turn updates the query’s visualizations showing accuracy and disclosure risk. The user can experiment with different values of ε before settling on an appropriate value. Consider a user who wants to make public the rate of hypertension in a patient cohort.
The user will need to consider the expected accuracy of a release from the differentially private mechanism. ViP uses frequency-framed approaches to visualizing uncertainty of potential releases.
First, quantile dotplots display DP output distributions, allowing the user to directly assess the expected accuracy of a privacy-preserving release under a given ε. Quantile dotplots allow for quick calculations of the cumulative distribution function, since each dot represents some percent chance that the release will fall into a given range.
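As an illustration of where the dots come from, the sketch below computes 25 evenly spaced quantiles of a Laplace output distribution for a given ε, so each dot stands for a 4% chance. This is a simplified reconstruction under assumed query values, not ViP’s implementation.

```python
import numpy as np
from scipy import stats

def quantile_dots(true_value, sensitivity, epsilon, n_dots=25):
    """Return n_dots values whose positions summarize the DP output distribution.

    Each dot covers 1/n_dots of the probability mass (4% for 25 dots), so the
    chance that the release lands in a range can be read off by counting dots.
    """
    scale = sensitivity / epsilon
    # Quantiles at the midpoints of n_dots equal-probability bins.
    probs = (np.arange(n_dots) + 0.5) / n_dots
    return stats.laplace.ppf(probs, loc=true_value, scale=scale)

dots = quantile_dots(true_value=0.27, sensitivity=1 / 1_000, epsilon=0.5)
# e.g., P(release > 0.3) is roughly (number of dots above 0.3) * 4%
print((dots > 0.3).mean())
```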
Second, ViP uses hypothetical outcome plots (HOPs) to display DP output distributions via animation. HOPs animate sample draws from a distribution and in the DP case, emphasize that only one release will be made per output distribution.
HOPs are overlaid on the quantile dotplots, so each quantile dotplot serves as a persistent summary of its HOP.
As the user adjusts the query’s allocated ε, the quantile dotplots/HOPs update accordingly.
Alongside accuracy, the user will also need to consider privacy guarantees when setting ε. One way of thinking about privacy outcomes is through disclosure risk under a particular attack model. ViP shows disclosure risk under an attack model that assumes an adversary has access to all records in a dataset and is trying to guess whether a record was included in a computation based on a sensitive attribute. The interface includes a plot showing an upper bound on the probability that the adversary guesses inclusion in a computation correctly, for each query. As the user adjusts ε for a query, the corresponding dot on the risk curve updates accordingly, thus showing how ε and disclosure risk trade off.
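For intuition about this kind of curve, one commonly used bound in membership-inference-style analyses caps the adversary’s success probability at e^ε / (1 + e^ε) under a 50/50 prior; the sketch below evaluates it for a few ε values. This is a simplified stand-in, not necessarily the exact risk formula ViP uses.

```python
import numpy as np

def disclosure_risk_upper_bound(epsilon):
    """Upper bound on the probability that an adversary with a 50/50 prior
    correctly guesses whether a record was included in the computation.

    At epsilon = 0 the bound is 0.5 (a coin flip); it approaches 1 as epsilon grows.
    """
    return np.exp(epsilon) / (1.0 + np.exp(epsilon))

for eps in [0.1, 0.5, 1.0, 2.0]:
    print(f"epsilon = {eps}: risk <= {disclosure_risk_upper_bound(eps):.2f}")
```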
While releasing a point estimate, the user might also want to release a CI. They’ll now need to consider not only measurement (e.g., sampling) error, but also DP noise when choosing ε. ViP visualizes 50, 80, and 95% CIs constructed under DP using HOPs (potential sets of CIs animate over time) with static binomial 50, 80, and 95% CIs as reference. Seeing both sets of CIs can help give the user a sense of how both forms of error combine, impacting the released CI.
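The following rough Monte Carlo sketch shows how the two sources of error combine in a released proportion, in the spirit of the draws animated by HOPs. It is an illustration under assumed values of n, p, and ε, not the CI construction method used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, epsilon = 1_000, 0.27, 0.5
sens = 1 / n  # sensitivity of a proportion over n records

draws = 10_000
# Sampling error alone: the proportion estimated from a fresh sample of n patients.
sample_props = rng.binomial(n, p, size=draws) / n
# Sampling error plus DP noise: the privacy-preserving release adds Laplace noise.
dp_releases = sample_props + rng.laplace(scale=sens / epsilon, size=draws)

def central_95_width(x):
    lo, hi = np.percentile(x, [2.5, 97.5])
    return hi - lo

print("95% range, sampling error only:  ", round(central_95_width(sample_props), 4))
print("95% range, sampling + DP noise:  ", round(central_95_width(dp_releases), 4))
```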
While we’ve considered a scenario in which the user only wants to release results from one query, it’s likely that the user might also want to release results from several other queries. For instance, they might have a total privacy budget to spend across queries and must decide how to allocate budget to each query without overspending. ViP allows the user to set a total privacy budget and displays the remaining budget based on what’s set on each query’s ε sliders.
The interface supports reasoning about splitting a total privacy budget across queries largely through the accuracy and risk visualizations described above, which allow the user to weigh the impacts of allocating budget to one query versus others. Each query has its own panel showing the accuracy of releases and a point on the risk curve that corresponds to the disclosure risk if only that query’s results were released. ViP also includes a “responsive” mode, where privacy budget sliders automatically update to help keep the privacy budget spent across queries under the set total budget.
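A minimal sketch of the bookkeeping behind the sliders, assuming basic sequential composition so that the ε values spent on individual queries simply add up against the total budget; the class, query names, and values are hypothetical.

```python
class PrivacyBudget:
    """Track per-query epsilon allocations against a total budget,
    assuming basic sequential composition (total loss = sum of epsilons)."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = {}  # query name -> allocated epsilon

    def allocate(self, query, epsilon):
        proposed = sum(self.spent.values()) - self.spent.get(query, 0.0) + epsilon
        if proposed > self.total:
            raise ValueError(f"Allocating {epsilon} to '{query}' exceeds the total budget")
        self.spent[query] = epsilon

    def remaining(self):
        return self.total - sum(self.spent.values())

budget = PrivacyBudget(total_epsilon=1.0)
budget.allocate("hypertension by age group", 0.4)
budget.allocate("hypertension by ethnicity", 0.3)
print(budget.remaining())  # 0.3 left to spend on the remaining queries
```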
A demo of ViP is available here, where queries refer to the proportion of patients in a database who were diagnosed with hypertension, by ethnicity, age group, race, and zip code.
DP is complex to reason about, especially in the presence of sampling error, so our primary question after designing ViP was: Will data practitioners who are new to DP benefit from using it? To this end, we compared how well practitioners could make judgments and decisions related to privacy and accuracy using ViP versus a spreadsheet-style interface that gave error estimates but didn’t enable the same kind of interactive visual reasoning. For example, the spreadsheet allowed a practitioner to set ε and then reported the maximum distance the release would be from the true result 95% of the time. Similar error values were given for the lower and upper bounds of the 95% CI calculated under DP. We designed eight tasks that reflect questions a practitioner might need to answer while setting or splitting a privacy budget. Each participant completed each task using both ViP and the spreadsheet, and we counterbalanced the order of tasks and of the two interfaces.
We recruited 16 U.S.-based participants with experience analyzing private or sensitive data (e.g., health data) and little to no DP background.
At ε = x for the X query, which subgroup in the X query do we expect to have the most accurate privacy-preserving release?
At ε = x for the X query, what is the probability that the privacy-preserving release for the X1 subgroup will be greater than y?
Set ε for the X query such that its corresponding disclosure risk is x%.
Set ε for the X query to x. For the X1 subgroup, estimate how many times wider we expect the privacy-preserving 95% CI to be compared to the traditional 95% CI.
Find the smallest ε values for each query (W, X, Y, Z) where the privacy-preserving releases for the subgroups W1, X1, Y1, Z1 are within x of their query results (i.e., query result − x ≤ release ≤ query result + x).
Suppose that you have a total budget of x that you want to allocate across queries. The risk corresponding to each query should be no more than y%, and the release should be guaranteed to be within z of the query result for the W1, X1, Y1, and Z1 subgroups with roughly 90% probability.
Estimate the probability that the release for the X1 subgroup will be greater than the release for the X2 subgroup when the X query’s ε = x.
We measured accuracy in answers based on absolute error (|ground truth − participant’s response|). We found that participants performed better using ViP for the CDF Judgment questions and CI Comparison questions. When answering CDF Judgment questions, participants could quickly count up dots and multiply by 4% (since there were 25 total dots, each represented 100/25 = 4%). When comparing CIs, participants were able to visually assess the width of the CI constructed under DP and compare it with the width of the CI calculated in the traditional way.
All participants answered the Compare Accuracy questions correctly using both ViP and the spreadsheet (hence we don’t include these results in the figure above). Participants performed roughly the same on Risk Requirement questions, and we have some evidence that ViP may help in completing Equalize Accuracy tasks.
Participants reported feeling an average of 2.3 points (on a scale from 0 to 10) more confident using ViP. When describing why ViP was helpful, seven participants described how it helped them keep track of DP trade-offs and relationships.
Our work is part of a growing effort to provide interfaces for DP that allow for wider adoption. We highlight three points of discussion that can inform future work on DP interfaces.
We found evidence that ViP helps users keep track of key DP relationships. To further help users bring domain knowledge into the privacy budget setting task, it may be useful to provide mappings between previously used methods of confidentiality protection and DP. One such method is k-anonymity, which requires that each individual’s data in a release be indistinguishable from the data of at least k−1 other individuals in the release. k-anonymity is widely used in the health domain, and incorporating ways of mapping between k and ε into an interface could help practitioners, particularly those in health, transition from previous methods to DP.
We might additionally integrate ε anchor points into interfaces, specifically on privacy budget sliders. For example, we might point out ε values that correspond to maximum organizational requirements around disclosure risk or previously chosen ε values for similar queries on similar datasets.
We visualize disclosure risk under only one attack model, which may not be appropriate for all scenarios. We envision future work incorporating additional attack models into the interface to better support reasoning about privacy considerations. One option might be to draw on the hypothesis testing interpretation of DP mechanisms, which quantifies risk in terms of an adversary rejecting or failing to reject a null hypothesis about whether an individual’s record is present in the database.
The effectiveness of DP is only as good as the appropriateness of the privacy budget for a given context. Through the development of ViP, we aim to show the importance of DP interfaces that can help in facilitating reasoning around privacy budgets and the trade-offs they control. Interactive visualizations can serve as an effective bridge between potentially abstract mathematical concepts and their real-world implications.