The Data Daily

Researchers Use Big Data And AI To Remove Legal Confidentiality

"Legal confidentiality is a shield for citizens." These are the words of Shami Chakrabarti, the one-time director of the U.K.-based human rights group Liberty, speaking in 2018.

Well, it seems that this shield has just been broken, because researchers at the University of Zurich in Switzerland have published a study in which they were able to identify the participants in confidential legal cases, even though such participants had been anonymized.

How did they do it? By using a combination of artificial intelligence and big data. By harnessing these technologies in tandem, the study's authors could mine over 120,000 public legal records and then use an algorithm to identify connections between them. Described as "linkage," this process enabled the researchers to identify anonymous parties mentioned in public records of Swiss Supreme Court decisions, simply by linking anonymous records to those in which various pieces of information were given.
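To give a rough sense of what "linkage" can mean in practice, here is a minimal Python sketch. It is not the Zurich team's algorithm, whose details are not described here. It simply assumes that an anonymized record and a named public record share a few fields (the hypothetical `decision_date` and `substance`), and it counts a party as re-identified only when exactly one public record matches. All names and data below are invented for illustration.

```python
from collections import defaultdict


def link_records(anonymized, public, keys):
    """Match anonymized records to named public records that share all key fields."""
    # Index the public records by the values of the shared fields.
    index = defaultdict(list)
    for rec in public:
        index[tuple(rec[k] for k in keys)].append(rec["name"])

    matches = {}
    for rec in anonymized:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        # Treat a record as re-identified only if the match is unambiguous.
        if len(candidates) == 1:
            matches[rec["case_id"]] = candidates[0]
    return matches


# Entirely made-up toy data (hypothetical field names and values).
anonymized = [
    {"case_id": "case-1", "decision_date": "2017-03-02", "substance": "alpha"},
    {"case_id": "case-2", "decision_date": "2018-06-11", "substance": "beta"},
    {"case_id": "case-3", "decision_date": "2019-01-20", "substance": "gamma"},
]
public = [
    {"name": "Company A", "decision_date": "2017-03-02", "substance": "alpha"},
    {"name": "Company B", "decision_date": "2018-06-11", "substance": "beta"},
    {"name": "Company C", "decision_date": "2018-06-11", "substance": "beta"},
]

matches = link_records(anonymized, public, keys=("decision_date", "substance"))
rate = len(matches) / len(anonymized)  # share of cases linked to a single name
print(matches, rate)
```

In this toy example only the first case is linked: the second has two equally plausible candidates and the third has none. The more fields the two sources share, the more often a match becomes unique, which is why combining several seemingly harmless details can undo anonymization.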

More ominously, the researchers succeeded in de-anonymizing participants in 84% of the judgments they mined, doing so in less than one hour. The ease and speed with which they achieved this sets disconcerting precedents for the future, and more specifically for privacy and the rule of law. "With today's technological possibilities, anonymisation is no longer guaranteed in certain areas," explains co-author Kerstin Noëlle Vokinger, speaking to swissinfo.ch.

The legal profession is no stranger to data sweeping. To take one example from patent law, the software company Wirebox reported in 2018 that it had been hired by an "international legal firm" to build a system that would analyze the data of 90 million publicly registered patents, so as to determine the value of such patents held by the firm's clients. However, while this kind of data mining isn't unheard of in the legal sector, the research published by the Zurich team represents a new step into the unknown, because it appears to undermine the functioning of the law rather than support it.

Fundamentally, the research shows that, thanks to the combination of big data and AI, NSA-style signals intelligence can be applied quite effortlessly to public legal cases, and in a way that makes a mockery of the law's claim to protect the privacy of plaintiffs and defendants.

And just as the privacy of individuals is jeopardized by AI and big data, so too is that of businesses. The study specifically focused on identifying pharmaceutical companies involved in legal disputes in Switzerland, as well as the medicines sold by such companies. By de-anonymizing these companies and medicines, the researchers showed that firms could potentially use similar methods to find out what their rivals have been doing. And this applies not just to the pharmaceutical industry, but to any other sector.

"This procedure can in principle be applied to any publicly available database," co-author Urs Jakob Mühlematter told Swiss broadcaster SRF, implying that no public data was safe from his team's methods.

The ramifications are huge: data mining of this kind might require a significant overhaul not only of how legal systems produce and organize records, but also of the definition of what constitutes "personally identifiable information."

"I see myself confirmed in my admonitions that factual data based on personal behavior may potentially have to be treated as personal data," said Adrian Lobsiger, the Swiss Federal Data Protection Commissioner, also speaking to SRF.

Still, a balance may need to be struck when approaching any potential reform. While the powerful combination of big data and AI may threaten individual privacy, the Swiss researchers have shown that it could also be used to unmask overly secretive corporations. In other words, it may end up becoming a powerful boost to transparency, one that helps the public hold potentially nefarious companies to account.