#437673 Can AI and Automation Deliver a COVID-19 ...
Illustration: Marysia Machulska
Within moments of meeting each other at a conference last year, Nathan Collins and Yann Gaston-Mathé began devising a plan to work together. Gaston-Mathé runs a startup that applies automated software to the design of new drug candidates. Collins leads a team that uses an automated chemistry platform to synthesize new drug candidates.
“There was an obvious synergy between their technology and ours,” recalls Gaston-Mathé, CEO and cofounder of Paris-based Iktos.
In late 2019, the pair launched a project to create a brand-new antiviral drug that would block a specific protein exploited by influenza viruses. Then the COVID-19 pandemic erupted across the world stage, and Gaston-Mathé and Collins learned that the viral culprit, SARS-CoV-2, relied on a protein that was 97 percent similar to their influenza protein. The partners pivoted.
Their companies are just two of hundreds of biotech firms eager to overhaul the drug-discovery process, often with the aid of artificial intelligence (AI) tools. The first set of antiviral drugs to treat COVID-19 will likely come from sifting through existing drugs. Remdesivir, for example, was originally developed to treat Ebola, and it has been shown to speed the recovery of hospitalized COVID-19 patients. But a drug made for one condition often has side effects and limited potency when applied to another. If researchers can produce an antiviral that specifically targets SARS-CoV-2, the drug would likely be safer and more effective than a repurposed drug.
There’s one big problem: Traditional drug discovery is far too slow to react to a pandemic. Designing a drug from scratch typically takes three to five years—and that’s before human clinical trials. “Our goal, with the combination of AI and automation, is to reduce that down to six months or less,” says Collins, who is chief strategy officer at SRI Biosciences, a division of the Silicon Valley research nonprofit SRI International. “We want to get this to be very, very fast.”
That sentiment is shared by small biotech firms and big pharmaceutical companies alike, many of which are now ramping up automated technologies backed by supercomputing power to predict, design, and test new antivirals—for this pandemic as well as the next—with unprecedented speed and scope.
“The entire industry is embracing these tools,” says Kara Carter, president of the International Society for Antiviral Research and executive vice president of infectious disease at Evotec, a drug-discovery company in Hamburg. “Not only do we need [new antivirals] to treat the SARS-CoV-2 infection in the population, which is probably here to stay, but we’ll also need them to treat future agents that arrive.”
There are currently about 200 known viruses that infect humans. Although viruses represent less than 14 percent of all known human pathogens, they make up two-thirds of all new human pathogens discovered since 1980.
Antiviral drugs are fundamentally different from vaccines, which teach a person’s immune system to mount a defense against a viral invader, and antibody treatments, which enhance the body’s immune response. By contrast, antivirals are chemical compounds that directly block a virus after a person has become infected. They do this by binding to specific proteins and preventing them from functioning, so that the virus cannot copy itself or enter or exit a cell.
The SARS-CoV-2 virus has an estimated 25 to 29 proteins, but not all of them are suitable drug targets. Researchers are investigating, among other targets, the virus’s exterior spike protein, which binds to a receptor on a human cell; two scissorlike enzymes, called proteases, that cut up long strings of viral proteins into functional pieces inside the cell; and a polymerase complex that makes the cell churn out copies of the virus’s genetic material, in the form of single-stranded RNA.
But it’s not enough for a drug candidate to simply attach to a target protein. Chemists also consider how tightly the compound binds to its target, whether it binds to other things as well, how quickly it metabolizes in the body, and so on. A drug candidate may have 10 to 20 such objectives. “Very often those objectives can appear to be anticorrelated or contradictory with each other,” says Gaston-Mathé.
Compared with antibiotics, antiviral drug discovery has proceeded at a snail’s pace. Scientists advanced from isolating the first antibacterial molecules in 1910 to developing an arsenal of powerful antibiotics by 1944. By contrast, it took until 1951 for researchers to be able to routinely grow large amounts of virus particles in cells in a dish, a breakthrough that earned the inventors a Nobel Prize in Medicine in 1954.
And the lag between the discovery of a virus and the creation of a treatment can be heartbreaking. According to the World Health Organization, 71 million people worldwide have chronic hepatitis C, a major cause of liver cancer. The virus that causes the infection was discovered in 1989, but effective antiviral drugs didn’t hit the market until 2014.
While many antibiotics work on a range of microbes, most antivirals are highly specific to a single virus—what those in the business call “one bug, one drug.” It takes a detailed understanding of a virus to develop an antiviral against it, says Che Colpitts, a virologist at Queen’s University, in Canada, who works on antivirals against RNA viruses. “When a new virus emerges, like SARS-CoV-2, we’re at a big disadvantage.”
Making drugs to stop viruses is hard for three main reasons. First, viruses are the Spartans of the pathogen world: They’re frugal, brutal, and expert at evading the human immune system. About 20 to 250 nanometers in diameter, viruses rely on just a few parts to operate, hijacking host cells to reproduce and often destroying those cells upon departure. They employ tricks to camouflage their presence from the host’s immune system, including preventing infected cells from sending out molecular distress beacons. “Viruses are really small, so they only have a few components, so there’s not that many drug targets available to start with,” says Colpitts.
Second, viruses replicate quickly, typically doubling in number in hours or days. This constant copying of their genetic material enables viruses to evolve quickly, producing mutations able to sidestep drug effects. The virus that causes AIDS soon develops resistance when exposed to a single drug. That’s why a cocktail of antiviral drugs is used to treat HIV infection.
Finally, unlike bacteria, which can exist independently outside human cells, viruses invade human cells to propagate, so any drug designed to eliminate a virus needs to spare the host cell. A drug that fails to distinguish between a virus and a cell can cause serious side effects. “Discriminating between the two is really quite difficult,” says Evotec’s Carter, who has worked in antiviral drug discovery for over three decades.
And then there’s the money barrier. Developing antivirals is rarely profitable. Health-policy researchers at the London School of Economics recently estimated that the average cost of developing a new drug is US $1 billion, and up to $2.8 billion for cancer and other specialty drugs. Because antivirals are usually taken for only short periods of time or during short outbreaks of disease, companies rarely recoup what they spent developing the drug, much less turn a profit, says Carter.
To change the status quo, drug discovery needs fresh approaches that leverage new technologies, rather than incremental improvements, says Christian Tidona, managing director of BioMed X, an independent research institute in Heidelberg, Germany. “We need breakthroughs.”
Putting Drug Development on Autopilot
Earlier this year, SRI Biosciences and Iktos began collaborating on a way to use artificial intelligence and automated chemistry to rapidly identify new drugs to target the COVID-19 virus. Within four months, they had designed and synthesized a first round of antiviral candidates. Here’s how they’re doing it.
STEP 1: Iktos’s AI platform uses deep-learning algorithms in an iterative process to come up with new molecular structures likely to bind to and disable a specific coronavirus protein. Illustrations: Chris Philpot
STEP 2: SRI Biosciences’s SynFini system is a three-part automated chemistry suite for producing new compounds. Starting with a target compound from Iktos, SynRoute uses machine learning to analyze and optimize routes for creating that compound, with results in about 10 seconds. It prioritizes routes based on cost, likelihood of success, and ease of implementation.
STEP 3: SynJet, an automated inkjet printer platform, tests the routes by printing out tiny quantities of chemical ingredients to see how they react. If the right compound is produced, the platform tests it.
STEP 4: AutoSyn, an automated tabletop chemical plant, synthesizes milligrams to grams of the desired compound for further testing. Computer-selected “maps” dictate paths through the plant’s modular components.
STEP 5: The most promising compounds are tested against live virus samples.
Iktos’s AI platform was created by a medicinal chemist and an AI expert. To tackle SARS-CoV-2, the company used generative models—deep-learning algorithms that generate new data—to “imagine” molecular structures with a good chance of disabling a key coronavirus protein.
For a new drug target, the software proposes and evaluates roughly 1 million compounds, says Gaston-Mathé. It’s an iterative process: At each step, the system generates 100 virtual compounds, which are tested in silico with predictive models to see how closely they meet the objectives. The test results are then used to design the next batch of compounds. “It’s like we have a very, very fast chemist who is designing compounds, testing compounds, getting back the data, then designing another batch of compounds,” he says.
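The design-test-redesign cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not Iktos's method: the candidates are plain lists of numbers rather than molecules, `score` is a hypothetical stand-in for the predictive models, `propose` is a hypothetical stand-in for the generative model, and the many objectives are collapsed into a single distance from an assumed ideal profile (`TARGET`). Only the batch size of 100 comes from the article.

```python
import random

BATCH_SIZE = 100             # from the article: 100 virtual compounds per step
TARGET = [0.8, 0.2, 0.5]     # hypothetical ideal property profile (assumption)


def score(candidate):
    """Stand-in for the predictive models: higher is better, 0 is a perfect match."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))


def propose(parents, rng):
    """Stand-in for the generative model: slightly perturb a well-scoring parent."""
    parent = rng.choice(parents)
    return [min(1.0, max(0.0, p + rng.gauss(0, 0.05))) for p in parent]


def design_loop(steps=20, keep=10, seed=0):
    """Run the generate, test, redesign cycle and return the best candidate found."""
    rng = random.Random(seed)
    # Start from a random batch of candidates.
    batch = [[rng.random() for _ in TARGET] for _ in range(BATCH_SIZE)]
    history = []
    for _ in range(steps):
        # "Test" the batch in silico and keep the top performers.
        best = sorted(batch, key=score, reverse=True)[:keep]
        history.append(score(best[0]))
        # Use the results to design the next batch around the best candidates.
        batch = best + [propose(best, rng) for _ in range(BATCH_SIZE - keep)]
    return max(batch, key=score), history


best_candidate, history = design_loop()
print("best score per step:", [round(s, 4) for s in history])
```

Because the top candidates are carried over into each new batch, the best score can never get worse from one step to the next. The sketch evaluates only 2,000 candidates in total; the article's figure of roughly 1 million would correspond to about 10,000 such steps.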
The computer isn’t as smart as a human chemist, Gaston-Mathé notes, but it’s much faster, so it can explore far more of what people in the field call “chemical space,” the set of all possible organic compounds. Unexplored chemical space is huge: Biochemists estimate that there are at least 10^63 possible druglike molecules, and that 99.9 percent of all possible small molecules or compounds have never been synthesized.
Still, designing a chemical compound isn’t the hardest part of creating a new drug. After a drug candidate is designed, it must be synthesized, and the highly manual process for synthesizing a new chemical hasn’t changed much in 200 years. It can take days to plan a synthesis process and then months to years to optimize it for manufacture.
That’s why Gaston-Mathé was eager to send Iktos’s AI-generated designs to Collins’s team at SRI Biosciences. With $13.8 million from the Defense Advanced Research Projects Agency, SRI Biosciences spent the last four years automating the synthesis process. The company’s automated suite of three technologies, called SynFini, can produce new chemical compounds in just hours or days, says Collins.
First, machine-learning software devises possible routes for making a desired molecule. Next, an inkjet printer platform tests the routes by printing out and mixing tiny quantities of chemical ingredients to see how they react with one another; if the right compound is produced, the platform runs tests on it. Finally, a tabletop chemical plant synthesizes milligrams to grams of the desired compound.
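SynRoute is described as prioritizing candidate routes by cost, likelihood of success, and ease of implementation. How it actually combines those criteria is not stated, so the following is only a sketch under one simple assumption: a weighted sum. The `Route` class, its fields, the weights, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    cost: float      # hypothetical: relative reagent cost, lower is better
    success: float   # hypothetical: predicted probability of success, 0 to 1
    ease: float      # hypothetical: ease of implementation, 0 to 1


def rank_routes(routes, w_cost=1.0, w_success=1.0, w_ease=1.0):
    """Sort routes from most to least attractive using an assumed weighted sum."""
    max_cost = max(r.cost for r in routes)  # assumes at least one positive cost

    def priority(route):
        # Cost is normalized to 0..1 and counts against the route.
        return (w_success * route.success
                + w_ease * route.ease
                - w_cost * route.cost / max_cost)

    return sorted(routes, key=priority, reverse=True)


routes = [
    Route("A", cost=120.0, success=0.90, ease=0.4),
    Route("B", cost=40.0, success=0.70, ease=0.8),
    Route("C", cost=200.0, success=0.95, ease=0.2),
]
print([r.name for r in rank_routes(routes)])  # ['B', 'A', 'C']
```

With equal weights, the cheap and easy route B outranks the more reliable but expensive routes. Raising `w_success` would shift the ordering toward A and C.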
Less than four months after Iktos and SRI Biosciences announced their collaboration, they had designed and synthesized a first round of antiviral candidates for SARS-CoV-2. Now they’re testing how well the compounds work on actual samples of the virus.
Theirs isn’t the only collaboration applying new tools to drug discovery. In late March, Alex Zhavoronkov, CEO of Hong Kong–based Insilico Medicine, came across a YouTube video showing three virtual-reality avatars positioning colorful, sticklike fragments in the side of a bulbous blue protein. The three researchers were using VR to explore how compounds might bind to a SARS-CoV-2 enzyme. Zhavoronkov contacted Nanome, the San Diego startup that created the simulation, and invited it to examine Insilico’s AI-generated molecules in virtual reality.
Insilico runs an AI platform that uses biological data to train deep-learning algorithms, then uses those algorithms to identify molecules with druglike features that will likely bind to a protein target. A four-day training sprint in late January yielded 100 molecules that appear to bind to an important SARS-CoV-2 protease. The company recently began synthesizing some of those molecules for laboratory testing.
Nanome’s VR software, meanwhile, allows researchers to import a molecular structure, then view and manipulate it on the scale of individual atoms. Like human chess players who use computer programs to explore potential moves, chemists can use VR to predict how to make molecules more druglike, says Nanome CEO Steve McCloskey. “The tighter the interface between the human and the computer, the more information goes both ways,” he says.
Zhavoronkov sent data about several of Insilico’s compounds to Nanome, which re-created them in VR. Nanome’s chemist demonstrated chemical tweaks to potentially improve each compound. “It was a very good experience,” says Zhavoronkov.
Meanwhile, in March, Takeda Pharmaceutical Co., of Japan, invited Schrödinger, a New York–based company that develops chemical-simulation software, to join an alliance working on antivirals. Schrödinger’s AI focuses on the physics of how proteins interact with small molecules and one another.
The software sifts through billions of molecules per week to predict a compound’s properties, and it optimizes for multiple desired properties simultaneously, says Karen Akinsanya, chief biomedical scientist and head of discovery R&D at Schrödinger. “There’s a huge sense of urgency here to come up with a potent molecule, but also to come up with molecules that are going to be well tolerated” by the body, she says. Drug developers are seeking compounds that can be broadly used and easily administered, such as an oral drug rather than an intravenous drug, she adds.
Schrödinger evaluated four protein targets and performed virtual screens for two of them, a computing-intensive process. In June, Google Cloud donated the equivalent of 16 million hours of Nvidia GPU time for the company’s calculations. Next, the alliance’s drug companies will synthesize and test the most promising compounds identified by the virtual screens.
Other companies, including Amazon Web Services, IBM, and Intel, as well as several U.S. national labs, are also donating time and resources to the COVID-19 High Performance Computing Consortium. The consortium is supporting 87 projects, which now have access to 6.8 million CPU cores, 50,000 GPUs, and 600 petaflops of computational resources.
While advanced technologies could transform early drug discovery, any new drug candidate still has a long road after that. It must be tested in animals, manufactured in large batches for clinical trials, then tested in a series of trials that, for antivirals, lasts an average of seven years.
In May, the BioMed X Institute in Germany launched a five-year project to build a Rapid Antiviral Response Platform, which would speed drug discovery all the way through manufacturing for clinical trials. The €40 million ($47 million) project, backed by drug companies, will identify outside-the-box proposals from young scientists, then provide space and funding to develop their ideas.
“We’ll focus on technologies that allow us to go from identification of a new virus to 10,000 doses of a novel potential therapeutic ready for trials in less than six months,” says BioMed X’s Tidona, who leads the project.
While a vaccine will likely arrive long before a bespoke antiviral does, experts expect COVID-19 to be with us for a long time, so the effort to develop a direct-acting, potent antiviral continues. Plus, having new antivirals—and tools to rapidly create more—can only help us prepare for the next pandemic, whether it comes next month or in another 102 years.
“We’ve got to start thinking differently about how to be more responsive to these kinds of threats,” says Collins. “It’s pushing us out of our comfort zones.”
This article appears in the October 2020 print issue as “Automating Antivirals.”
#437466 How Future AI Could Recognize a Kangaroo ...
AI is continuously taking on new challenges, from detecting deepfakes (which, incidentally, are also made using AI) to winning at poker to giving synthetic biology experiments a boost. These impressive feats result partly from the huge datasets the systems are trained on. That training is costly and time-consuming, and it yields AIs that can really only do one thing well.
For example, to train an AI to differentiate between a picture of a dog and one of a cat, it’s fed thousands—if not millions—of labeled images of dogs and cats. A child, on the other hand, can see a dog or cat just once or twice and remember which is which. How can we make AIs learn more like children do?
A team at the University of Waterloo in Ontario has an answer: change the way AIs are trained.
Here’s the thing about the datasets normally used to train AI: besides being huge, they’re highly specific. A picture of a dog can only be a picture of a dog, right? But what about a really small dog with a long-ish tail? That sort of dog, while still being a dog, looks more like a cat than, say, a fully grown Golden Retriever.
It’s this concept that the Waterloo team’s methodology is based on. They described their work in a paper published on the pre-print (or non-peer-reviewed) server arXiv last month. Teaching an AI system to identify a new class of objects using just one example is what they call “one-shot learning.” But they take it a step further, focusing on “less than one shot learning,” or LO-shot learning for short.
LO-shot learning consists of a system learning to classify various categories based on a number of examples that’s smaller than the number of categories. That’s not the most straightforward concept to wrap your head around, so let’s go back to the dogs and cats example. Say you want to teach an AI to identify dogs, cats, and kangaroos. How could that possibly be done without several clear examples of each animal?
The key, the Waterloo team says, is in what they call soft labels. Unlike hard labels, which label a data point as belonging to one specific class, soft labels tease out the relationship or degree of similarity between that data point and multiple classes. In the case of an AI trained on only dogs and cats, a third class of objects, say, kangaroos, might be described as 60 percent like a dog and 40 percent like a cat (I know—kangaroos probably aren’t the best animal to have thrown in as a third category).
“Soft labels can be used to represent training sets using fewer prototypes than there are classes, achieving large increases in sample efficiency over regular (hard-label) prototypes,” the paper says. Translation? Tell an AI a kangaroo is some fraction cat and some fraction dog—both of which it’s seen and knows well—and it’ll be able to identify a kangaroo without ever having seen one.
If the soft labels are nuanced enough, you could theoretically teach an AI to identify a large number of categories based on a much smaller number of training examples.
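In code, the difference between the two kinds of label is small. A hard label puts all of its weight on one class, while a soft label spreads the weight across several. Here is a minimal sketch using the article's dog-and-cat example; the function names are made up for illustration and do not come from the paper.

```python
CLASSES = ["dog", "cat"]


def hard_label(name):
    """One-hot vector: the data point belongs to exactly one class."""
    return [1.0 if c == name else 0.0 for c in CLASSES]


def is_valid_soft_label(vec, tol=1e-9):
    """A soft label is a list of non-negative shares, one per class, summing to 1."""
    return (len(vec) == len(CLASSES)
            and all(v >= 0 for v in vec)
            and abs(sum(vec) - 1.0) < tol)


labrador = hard_label("dog")   # [1.0, 0.0]: all dog
kangaroo = [0.6, 0.4]          # 60 percent like a dog, 40 percent like a cat

print(labrador, is_valid_soft_label(labrador))
print(kangaroo, is_valid_soft_label(kangaroo))
```

Seen this way, a hard label is just the special case of a soft label in which one class gets a share of 1.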
The paper’s authors use a simple machine learning algorithm called k-nearest neighbors (kNN) to explore this idea more in depth. The algorithm operates under the assumption that similar things are most likely to exist near each other; if you go to a dog park, there will be lots of dogs but no cats or kangaroos. Go to the Australian grasslands and there’ll be kangaroos but no cats or dogs. And so on.
To train a kNN algorithm to differentiate between categories, you choose specific features to represent each category (for animals, you could use weight and size). With one feature on the x-axis and the other on the y-axis, the algorithm creates a graph where data points that are similar to each other are clustered near each other. A boundary between the clusters divides the categories, and the algorithm assigns each new data point to the category of the points nearest to it.
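For readers who want to see the mechanics, here is a minimal kNN classifier with ordinary hard labels. The animal measurements are invented for illustration (weight in kilograms, height in centimeters); a new point is assigned the label that is most common among its k closest training points.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((feature_1, feature_2), label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (weight in kg, height in cm)
train = [
    ((4.0, 25.0), "cat"), ((5.0, 23.0), "cat"), ((3.5, 24.0), "cat"),
    ((30.0, 60.0), "dog"), ((25.0, 55.0), "dog"), ((35.0, 65.0), "dog"),
]

print(knn_predict(train, (4.5, 26.0)))   # cat
print(knn_predict(train, (28.0, 58.0)))  # dog
```

Note that this version needs at least one labeled example of every category it is expected to recognize, which is exactly the limitation LO-shot learning tries to remove.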
The Waterloo team kept it simple and used plots of color on a 2D graph. Using the colors and their locations on the graphs, the team created synthetic data sets and accompanying soft labels. One of the simpler graphs is pictured below, along with soft labels in the form of pie charts.
Image Credit: Ilia Sucholutsky & Matthias Schonlau
When the team had the algorithm plot the boundary lines of the different colors based on these soft labels, it was able to split the plot up into more colors than the number of data points it was given in the soft labels.
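A one-dimensional toy version shows how two data points can carve out three classes. In this sketch, each prototype's soft label is weighted by the inverse of its distance to the query point, the weighted labels are summed, and the class with the largest total wins. The prototype positions, the soft-label values, and the inverse-distance weighting are assumptions chosen for illustration; the paper's exact procedure may differ.

```python
CLASSES = ["red", "green", "blue"]

# Two prototypes on a line, each with a soft label over three classes.
# Positions and shares are hypothetical values picked for this example.
PROTOTYPES = [
    (0.0, [0.6, 0.4, 0.0]),   # mostly red, partly green
    (1.0, [0.0, 0.4, 0.6]),   # partly green, mostly blue
]


def classify(x, eps=1e-9):
    """Sum the soft labels, weighted by inverse distance, and pick the top class."""
    totals = [0.0] * len(CLASSES)
    for position, soft_label in PROTOTYPES:
        weight = 1.0 / (abs(x - position) + eps)  # eps avoids division by zero
        for i, share in enumerate(soft_label):
            totals[i] += weight * share
    return CLASSES[totals.index(max(totals))]


for x in (0.1, 0.5, 0.9):
    print(x, classify(x))
# 0.1 red
# 0.5 green
# 0.9 blue
```

Neither prototype is mostly green, yet green wins in the middle of the line because both prototypes contribute to it there. With these particular values the boundaries fall at one-third and two-thirds of the way between the prototypes.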
While the results are encouraging, the team acknowledges that they’re just the first step, and there’s much more exploration of this concept yet to be done. The kNN algorithm is one of the least complex models out there; what might happen when LO-shot learning is applied to a far more complex algorithm? Also, to apply it, you still need to distill a larger dataset down into soft labels.
One idea the team is already working on is having other algorithms generate the soft labels for the algorithm that’s going to be trained using LO-shot; manually designing soft labels won’t always be as easy as splitting up some pie charts into different colors.
LO-shot’s potential for reducing the amount of training data needed to yield working AI systems is promising. Besides reducing the cost and the time required to train new models, the method could also make AI more accessible to industries, companies, or individuals who don’t have access to large datasets—an important step for democratization of AI.
Image Credit: pen_ash from Pixabay