Tag Archives: engineers
When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.
Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would take a decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.
Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.
Key to deep learning’s success is the fact the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it.
Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks which have been, to date, the sole domain of the world’s top computer scientists (and take them years to write).
In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.
Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.
DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.
Pavlov’s Digital Dog
First, a little background.
Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.
The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?
While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.
In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.
Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.
Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.
LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
Only in LPG’s case, it had no value function to guide that learning.
Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.
Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.
This is where there’s room for improvement and more research.
The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.
At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.
Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.
Image credit: Mike Szczepanski / Unsplash Continue reading
Need a robot with a soft touch? A team of Michigan State University engineers has designed and developed a novel humanoid hand that may be able to help. Continue reading
The human brain operates on roughly 20 watts of power (a third of a 60-watt light bulb) in a space the size of, well, a human head. The biggest machine learning algorithms use closer to a nuclear power plant’s worth of electricity and racks of chips to learn.
That’s not to slander machine learning, but nature may have a tip or two to improve the situation. Luckily, there’s a branch of computer chip design heeding that call. By mimicking the brain, super-efficient neuromorphic chips aim to take AI off the cloud and put it in your pocket.
The latest such chip is smaller than a piece of confetti and has tens of thousands of artificial synapses made out of memristors—chip components that can mimic their natural counterparts in the brain.
In a recent paper in Nature Nanotechnology, a team of MIT scientists say their tiny new neuromorphic chip was used to store, retrieve, and manipulate images of Captain America’s Shield and MIT’s Killian Court. Whereas images stored with existing methods tended to lose fidelity over time, the new chip’s images remained crystal clear.
“So far, artificial synapse networks exist as software. We’re trying to build real neural network hardware for portable artificial intelligence systems,” Jeehwan Kim, associate professor of mechanical engineering at MIT said in a press release. “Imagine connecting a neuromorphic device to a camera on your car, and having it recognize lights and objects and make a decision immediately, without having to connect to the internet. We hope to use energy-efficient memristors to do those tasks on-site, in real-time.”
A Brain in Your Pocket
Whereas the computers in our phones and laptops use separate digital components for processing and memory—and therefore need to shuttle information between the two—the MIT chip uses analog components called memristors that process and store information in the same place. This is similar to the way the brain works and makes memristors far more efficient. To date, however, they’ve struggled with reliability and scalability.
To overcome these challenges, the MIT team designed a new kind of silicon-based, alloyed memristor. Ions flowing in memristors made from unalloyed materials tend to scatter as the components get smaller, meaning the signal loses fidelity and the resulting computations are less reliable. The team found an alloy of silver and copper helped stabilize the flow of silver ions between electrodes, allowing them to scale the number of memristors on the chip without sacrificing functionality.
While MIT’s new chip is promising, there’s likely a ways to go before memristor-based neuromorphic chips go mainstream. Between now and then, engineers like Kim have their work cut out for them to further scale and demonstrate their designs. But if successful, they could make for smarter smartphones and other even smaller devices.
“We would like to develop this technology further to have larger-scale arrays to do image recognition tasks,” Kim said. “And some day, you might be able to carry around artificial brains to do these kinds of tasks, without connecting to supercomputers, the internet, or the cloud.”
Special Chips for AI
The MIT work is part of a larger trend in computing and machine learning. As progress in classical chips has flagged in recent years, there’s been an increasing focus on more efficient software and specialized chips to continue pushing the pace.
Neuromorphic chips, for example, aren’t new. IBM and Intel are developing their own designs. So far, their chips have been based on groups of standard computing components, such as transistors (as opposed to memristors), arranged to imitate neurons in the brain. These chips are, however, still in the research phase.
Graphics processing units (GPUs)—chips originally developed for graphics-heavy work like video games—are the best practical example of specialized hardware for AI and were heavily used in this generation of machine learning early on. In the years since, Google, NVIDIA, and others have developed even more specialized chips that cater more specifically to machine learning.
The gains from such specialized chips are already being felt.
In a recent cost analysis of machine learning, research and investment firm ARK Invest said cost declines have far outpaced Moore’s Law. In a particular example, they found the cost to train an image recognition algorithm (ResNet-50) went from around $1,000 in 2017 to roughly $10 in 2019. The fall in cost to actually run such an algorithm was even more dramatic. It took $10,000 to classify a billion images in 2017 and just $0.03 in 2019.
Some of these declines can be traced to better software, but according to ARK, specialized chips have improved performance by nearly 16 times in the last three years.
As neuromorphic chips—and other tailored designs—advance further in the years to come, these trends in cost and performance may continue. Eventually, if all goes to plan, we might all carry a pocket brain that can do the work of today’s best AI.
Image credit: Peng Lin Continue reading