The Data Daily

Using AI and Big Data to Identify a Potential COVID-19 Peptide Treatment

By Tyrone Burke
On a molecular level, humans are susceptible to COVID-19 because a virus protein is able to bind to a human protein. If that interaction can be stopped, so can the disease. Researchers at Carleton University have synthesized a new protein that is able to block this interaction.
Using artificial intelligence and Canada’s most powerful supercomputer, Ashkan Golshani and Frank Dehne analyzed millions of possible protein interactions. They have been developing algorithms that predict protein communications and potential drug treatments since 2003. Using the IBM Blue Gene/Q supercomputer, they predicted that a new type of protein could stop the SARS-CoV-2 virus from infecting human cells. The researchers then designed and synthesized this new protein. In a lab setting, it prevented coronavirus infection with an efficacy of 75 per cent.
“We are trying to design a treatment for the disease, and these results show that this peptide should be an option,” says Golshani, a professor in the Department of Biology.
Peptide drugs are small proteins that can interrupt interactions between other key proteins. Golshani and Dehne’s approach zeroed in on the interaction of two such proteins: the SARS-CoV-2 spike protein and the human receptor angiotensin-converting enzyme 2, more commonly called ACE2.
“The spike protein interacts with the ACE2 receptor in human cells, and that’s how COVID-19 infection starts,” says Golshani.
“Using artificial intelligence, we have been working to design new peptides that can interfere with the communication between these two proteins.”
The novel peptide could help treat people with severe COVID-19 symptoms, or prevent the progression of mild symptoms to more severe ones.
The Need for Artificial Intelligence
Without artificial intelligence, this type of task would be virtually impossible. Humans have more than 20,000 proteins, and there are more than 200 million potential interactions between them. Working by trial and error over a period of decades, biochemists have identified about 100,000 protein interactions, each of which can take days, weeks or months to study. The overwhelming majority of interactions are not well understood. Artificial intelligence enables the analysis of millions more potential protein interactions, and even the prediction of how theoretical proteins would interact with those that already exist.
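The scale of that search space can be checked with simple counting. The sketch below assumes that every unordered pair of distinct proteins counts as one potential interaction; the article does not say how its figure was derived, so this is an illustration rather than the researchers' calculation. The function name `potential_pairs` is made up for the example.

```python
from math import comb


def potential_pairs(n_proteins: int) -> int:
    """Count unordered pairs of distinct proteins (n choose 2)."""
    return comb(n_proteins, 2)


# Figures quoted in the article; the pairing rule itself is an assumption.
n_proteins = 20_000
known_interactions = 100_000

pairs = potential_pairs(n_proteins)
print(f"{pairs:,} potential pairs")
print(f"{known_interactions / pairs:.3%} identified experimentally")
```

With exactly 20,000 proteins the count is 199,990,000, and it passes 200 million as soon as there are 20,001. Under this assumption, the roughly 100,000 interactions identified so far are a small fraction of one per cent of the total.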
That’s where Dehne’s expertise came into the project. The professor in Carleton’s Institute of Data Science specializes in large-scale data analytics, and has for decades been programming supercomputers to identify proteins that could be used as medical treatments.
“When you look at the human genome, the genes in our bodies are like text. They are generated as a sequence of amino acids and, from a computer science perspective, they are a sequence of characters that is like a text string,” says Dehne.
“They fold into a certain shape, and that shape really determines what they do. Proteins interact with each other, and depending on their two shapes, they can match like a lock and key. These protein dockings are what run most processes in our bodies, and in other organisms too. The spike protein and the ACE2 receptor also do this.”
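Dehne's text-string analogy can be made concrete. The sketch below treats a protein as a string over the 20 standard one-letter amino acid codes and lists every short window of a given length, the way a program might enumerate candidate peptide-sized fragments. It is not the researchers' algorithm; the helper names and the sequence `toy` are invented for illustration, and `toy` is not a real protein.

```python
# The 20 standard amino acids, written as one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid_sequence(seq: str) -> bool:
    """True if seq is non-empty and uses only standard one-letter codes."""
    return bool(seq) and set(seq) <= AMINO_ACIDS


def windows(seq: str, k: int) -> list[str]:
    """Return every contiguous substring of length k, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


toy = "MKTAYIAKQR"  # arbitrary toy string, NOT a real protein sequence
if is_valid_sequence(toy):
    for fragment in windows(toy, 4):
        print(fragment)
```

Treating sequences as plain strings means ordinary text-processing tools apply. What this sketch cannot capture is the folded shape, which the quote identifies as the thing that determines what a protein does.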
The two scientists wanted to identify a peptide that would bind to the human ACE2 receptor and prevent the spike protein from docking—a peptide that would make the lock no longer suitable for the key.
The problem was that no such protein was known to exist. Dehne programmed the IBM Blue Gene/Q supercomputer to identify proteins that might be able to interfere with the interaction. Located at the University of Toronto, it is Canada’s fastest supercomputer, with 10 petaflops of processing power and 40,000 processors.
It took three days of computation for the Blue Gene/Q to predict possible candidates, and one of these candidates worked in a lab setting.
Adaptable Research for an Evolving Virus
The peptide is a type of short linear motif, a string of amino acids that mediates protein interaction. It is entirely novel, but creating it is relatively straightforward. As with other proteins, its amino acids can be placed next to each other to cause a reaction that produces it. The peptide can be produced on campus using genetic engineering techniques, and it can also be purchased as a custom-synthesized protein from commercial labs.
In addition to possible effectiveness against COVID-19, peptide treatments have several advantages over traditional pharmaceuticals.
“Drug companies usually take a trial and error approach. For example, they might find a protein that exists in the Amazon and see if it will work as a treatment for cancer,” says Dehne.
“But this often comes with unwanted side effects. Our computational method aims to make proteins more specific, so they only attach to the target protein. Protein-based cancer drugs are called biologics, and side effects can be a problem with that type of treatment. Not very many biologics can actually make it to market, because they have horrible side effects. And that was the origin of our research into protein-protein interactions at the Institute of Data Science. We wanted to make cancer drugs with fewer side effects.”
That research is also adaptable, which is an additional advantage of this approach. As the virus evolves, the peptide that disrupts it could evolve too.
“Our system can design a new treatment quickly,” says Golshani.
“We use artificial intelligence to do a lot of our design, so we will be able to modify our peptides very quickly, and have treatment options for mutated or emerging viruses.”