Tag Archives: programs
#437269 DeepMind’s Newest AI Programs Itself ...
When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.
Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would take a decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.
Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.
Key to deep learning’s success is the fact the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it.
Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks which have been, to date, the sole domain of the world’s top computer scientists (and take them years to write).
In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.
Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.
DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.
Pavlov’s Digital Dog
First, a little background.
Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.
The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?
While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.
In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.
Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.
Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.
LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
Only in LPG’s case, it had no value function to guide that learning.
Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.
Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.
This is where there’s room for improvement and more research.
The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.
At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.
Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.
Image credit: Mike Szczepanski / Unsplash Continue reading
#437171 Scientists Tap the World’s Most ...
In The Hitchhiker’s Guide to the Galaxy by Douglas Adams, the haughty supercomputer Deep Thought is asked whether it can find the answer to the ultimate question concerning life, the universe, and everything. It replies that, yes, it can do it, but it’s tricky and it’ll have to think about it. When asked how long it will take it replies, “Seven-and-a-half million years. I told you I’d have to think about it.”
Real-life supercomputers are being asked somewhat less expansive questions but tricky ones nonetheless: how to tackle the Covid-19 pandemic. They’re being used in many facets of responding to the disease, including to predict the spread of the virus, to optimize contact tracing, to allocate resources and provide decisions for physicians, to design vaccines and rapid testing tools, and to understand sneezes. And the answers are needed in a rather shorter time frame than Deep Thought was proposing.
The largest number of Covid-19 supercomputing projects involves designing drugs. It’s likely to take several effective drugs to treat the disease. Supercomputers allow researchers to take a rational approach and aim to selectively muzzle proteins that SARS-CoV-2, the virus that causes Covid-19, needs for its life cycle.
The viral genome encodes proteins needed by the virus to infect humans and to replicate. Among these are the infamous spike protein that sniffs out and penetrates its human cellular target, but there are also enzymes and molecular machines that the virus forces its human subjects to produce for it. Finding drugs that can bind to these proteins and stop them from working is a logical way to go.
The Summit supercomputer at Oak Ridge National Laboratory has a peak performance of 200,000 trillion calculations per second—equivalent to about a million laptops. Image credit: Oak Ridge National Laboratory, U.S. Dept. of Energy, CC BY
I am a molecular biophysicist. My lab, at the Center for Molecular Biophysics at the University of Tennessee and Oak Ridge National Laboratory, uses a supercomputer to discover drugs. We build three-dimensional virtual models of biological molecules like the proteins used by cells and viruses, and simulate how various chemical compounds interact with those proteins. We test thousands of compounds to find the ones that “dock” with a target protein. Those compounds that fit, lock-and-key style, with the protein are potential therapies.
The top-ranked candidates are then tested experimentally to see if they indeed do bind to their targets and, in the case of Covid-19, stop the virus from infecting human cells. The compounds are first tested in cells, then animals, and finally humans. Computational drug discovery with high-performance computing has been important in finding antiviral drugs in the past, such as the anti-HIV drugs that revolutionized AIDS treatment in the 1990s.
World’s Most Powerful Computer
Since the 1990s the power of supercomputers has increased by a factor of a million or so. Summit at Oak Ridge National Laboratory is presently the world’s most powerful supercomputer, and has the combined power of roughly a million laptops. A laptop today has roughly the same power as a supercomputer had 20-30 years ago.
However, in order to gin up speed, supercomputer architectures have become more complicated. They used to consist of single, very powerful chips on which programs would simply run faster. Now they consist of thousands of processors performing massively parallel processing in which many calculations, such as testing the potential of drugs to dock with a pathogen or cell’s proteins, are performed at the same time. Persuading those processors to work together harmoniously is a pain in the neck but means we can quickly try out a lot of chemicals virtually.
Further, researchers use supercomputers to figure out by simulation the different shapes formed by the target binding sites and then virtually dock compounds to each shape. In my lab, that procedure has produced experimentally validated hits—chemicals that work—for each of 16 protein targets that physician-scientists and biochemists have discovered over the past few years. These targets were selected because finding compounds that dock with them could result in drugs for treating different diseases, including chronic kidney disease, prostate cancer, osteoporosis, diabetes, thrombosis and bacterial infections.
Scientists are using supercomputers to find ways to disable the various proteins—including the infamous spike protein (green protrusions)—produced by SARS-CoV-2, the virus responsible for Covid-19. Image credit: Thomas Splettstoesser scistyle.com, CC BY-ND
Billions of Possibilities
So which chemicals are being tested for Covid-19? A first approach is trying out drugs that already exist for other indications and that we have a pretty good idea are reasonably safe. That’s called “repurposing,” and if it works, regulatory approval will be quick.
But repurposing isn’t necessarily being done in the most rational way. One idea researchers are considering is that drugs that work against protein targets of some other virus, such as the flu, hepatitis or Ebola, will automatically work against Covid-19, even when the SARS-CoV-2 protein targets don’t have the same shape.
Our own work has now expanded to about 10 targets on SARS-CoV-2, and we’re also looking at human protein targets for disrupting the virus’s attack on human cells. Top-ranked compounds from our calculations are being tested experimentally for activity against the live virus. Several of these have already been found to be active.The best approach is to check if repurposed compounds will actually bind to their intended target. To that end, my lab published a preliminary report of a supercomputer-driven docking study of a repurposing compound database in mid-February. The study ranked 8,000 compounds in order of how well they bind to the viral spike protein. This paper triggered the establishment of a high-performance computing consortium against our viral enemy, announced by President Trump in March. Several of our top-ranked compounds are now in clinical trials.
Also, we and others are venturing out into the wild world of new drug discovery for Covid-19—looking for compounds that have never been tried as drugs before. Databases of billions of these compounds exist, all of which could probably be synthesized in principle but most of which have never been made. Billion-compound docking is a tailor-made task for massively parallel supercomputing.
Dawn of the Exascale Era
Work will be helped by the arrival of the next big machine at Oak Ridge, called Frontier, planned for next year. Frontier should be about 10 times more powerful than Summit. Frontier will herald the “exascale” supercomputing era, meaning machines capable of 1,000,000,000,000,000,000 calculations per second.
Although some fear supercomputers will take over the world, for the time being, at least, they are humanity’s servants, which means that they do what we tell them to. Different scientists have different ideas about how to calculate which drugs work best—some prefer artificial intelligence, for example—so there’s quite a lot of arguing going on.
Hopefully, scientists armed with the most powerful computers in the world will, sooner rather than later, find the drugs needed to tackle Covid-19. If they do, then their answers will be of more immediate benefit, if less philosophically tantalizing, than the answer to the ultimate question provided by Deep Thought, which was, maddeningly, simply 42.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Image credit: NIH/NIAID Continue reading