Tag Archives: toy
#437269 DeepMind’s Newest AI Programs Itself ...
When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.
Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would take a decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.
Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.
Key to deep learning’s success is the fact the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it.
Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks which have been, to date, the sole domain of the world’s top computer scientists (and take them years to write).
In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.
Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.
DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.
Pavlov’s Digital Dog
First, a little background.
Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.
The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?
While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.
In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.
Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.
Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.
LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
Only in LPG’s case, it had no value function to guide that learning.
Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.
Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.
This is where there’s room for improvement and more research.
The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.
At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.
Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.
Image credit: Mike Szczepanski / Unsplash Continue reading
#436911 Scientists Linked Artificial and ...
Scientists have linked up two silicon-based artificial neurons with a biological one across multiple countries into a fully-functional network. Using standard internet protocols, they established a chain of communication whereby an artificial neuron controls a living, biological one, and passes on the info to another artificial one.
Whoa.
We’ve talked plenty about brain-computer interfaces and novel computer chips that resemble the brain. We’ve covered how those “neuromorphic” chips could link up into tremendously powerful computing entities, using engineered communication nodes called artificial synapses.
As Moore’s law is dying, we even said that neuromorphic computing is one path towards the future of extremely powerful, low energy consumption artificial neural network-based computing—in hardware—that could in theory better link up with the brain. Because the chips “speak” the brain’s language, in theory they could become neuroprosthesis hubs far more advanced and “natural” than anything currently possible.
This month, an international team put all of those ingredients together, turning theory into reality.
The three labs, scattered across Padova, Italy, Zurich, Switzerland, and Southampton, England, collaborated to create a fully self-controlled, hybrid artificial-biological neural network that communicated using biological principles, but over the internet.
The three-neuron network, linked through artificial synapses that emulate the real thing, was able to reproduce a classic neuroscience experiment that’s considered the basis of learning and memory in the brain. In other words, artificial neuron and synapse “chips” have progressed to the point where they can actually use a biological neuron intermediary to form a circuit that, at least partially, behaves like the real thing.
That’s not to say cyborg brains are coming soon. The simulation only recreated a small network that supports excitatory transmission in the hippocampus—a critical region that supports memory—and most brain functions require enormous cross-talk between numerous neurons and circuits. Nevertheless, the study is a jaw-dropping demonstration of how far we’ve come in recreating biological neurons and synapses in artificial hardware.
And perhaps one day, the currently “experimental” neuromorphic hardware will be integrated into broken biological neural circuits as bridges to restore movement, memory, personality, and even a sense of self.
The Artificial Brain Boom
One important thing: this study relies heavily on a decade of research into neuromorphic computing, or the implementation of brain functions inside computer chips.
The best-known example is perhaps IBM’s TrueNorth, which leveraged the brain’s computational principles to build a completely different computer than what we have today. Today’s computers run on a von Neumann architecture, in which memory and processing modules are physically separate. In contrast, the brain’s computing and memory are simultaneously achieved at synapses, small “hubs” on individual neurons that talk to adjacent ones.
Because memory and processing occur on the same site, biological neurons don’t have to shuttle data back and forth between processing and storage compartments, massively reducing processing time and energy use. What’s more, a neuron’s history will also influence how it behaves in the future, increasing flexibility and adaptability compared to computers. With the rise of deep learning, which loosely mimics neural processing as the prima donna of AI, the need to reduce power while boosting speed and flexible learning is becoming ever more tantamount in the AI community.
Neuromorphic computing was partially born out of this need. Most chips utilize special ingredients that change their resistance (or other physical characteristics) to mimic how a neuron might adapt to stimulation. Some chips emulate a whole neuron, that is, how it responds to a history of stimulation—does it get easier or harder to fire? Others imitate synapses themselves, that is, how easily they will pass on the information to another neuron.
Although single neuromorphic chips have proven to be far more efficient and powerful than current computer chips running machine learning algorithms in toy problems, so far few people have tried putting the artificial components together with biological ones in the ultimate test.
That’s what this study did.
A Hybrid Network
Still with me? Let’s talk network.
It’s gonna sound complicated, but remember: learning is the formation of neural networks, and neurons that fire together wire together. To rephrase: when learning, neurons will spontaneously organize into networks so that future instances will re-trigger the entire network. To “wire” together, downstream neurons will become more responsive to their upstream neural partners, so that even a whisper will cause them to activate. In contrast, some types of stimulation will cause the downstream neuron to “chill out” so that only an upstream “shout” will trigger downstream activation.
Both these properties—easier or harder to activate downstream neurons—are essentially how the brain forms connections. The “amping up,” in neuroscience jargon, is long-term potentiation (LTP), whereas the down-tuning is LTD (long-term depression). These two phenomena were first discovered in the rodent hippocampus more than half a century ago, and ever since have been considered as the biological basis of how the brain learns and remembers, and implicated in neurological problems such as addition (seriously, you can’t pass Neuro 101 without learning about LTP and LTD!).
So it’s perhaps especially salient that one of the first artificial-brain hybrid networks recapitulated this classic result.
To visualize: the three-neuron network began in Switzerland, with an artificial neuron with the badass name of “silicon spiking neuron.” That neuron is linked to an artificial synapse, a “memristor” located in the UK, which is then linked to a biological rat neuron cultured in Italy. The rat neuron has a “smart” microelectrode, controlled by the artificial synapse, to stimulate it. This is the artificial-to-biological pathway.
Meanwhile, the rat neuron in Italy also has electrodes that listen in on its electrical signaling. This signaling is passed back to another artificial synapse in the UK, which is then used to control a second artificial neuron back in Switzerland. This is the biological-to-artificial pathway back. As a testimony in how far we’ve come in digitizing neural signaling, all of the biological neural responses are digitized and sent over the internet to control its far-out artificial partner.
Here’s the crux: to demonstrate a functional neural network, just having the biological neuron passively “pass on” electrical stimulation isn’t enough. It has to show the capacity to learn, that is, to be able to mimic the amping up and down-tuning that are LTP and LTD, respectively.
You’ve probably guessed the results: certain stimulation patterns to the first artificial neuron in Switzerland changed how the artificial synapse in the UK operated. This, in turn, changed the stimulation to the biological neuron, so that it either amped up or toned down depending on the input.
Similarly, the response of the biological neuron altered the second artificial synapse, which then controlled the output of the second artificial neuron. Altogether, the biological and artificial components seamlessly linked up, over thousands of miles, into a functional neural circuit.
Cyborg Mind-Meld
So…I’m still picking my jaw up off the floor.
It’s utterly insane seeing a classic neuroscience learning experiment repeated with an integrated network with artificial components. That said, a three-neuron network is far from the thousands of synapses (if not more) needed to truly re-establish a broken neural circuit in the hippocampus, which DARPA has been aiming to do. And LTP/LTD has come under fire recently as the de facto brain mechanism for learning, though so far they remain cemented as neuroscience dogma.
However, this is one of the few studies where you see fields coming together. As Richard Feynman famously said, “What I cannot recreate, I cannot understand.” Even though neuromorphic chips were built on a high-level rather than molecular-level understanding of how neurons work, the study shows that artificial versions can still synapse with their biological counterparts. We’re not just on the right path towards understanding the brain, we’re recreating it, in hardware—if just a little.
While the study doesn’t have immediate use cases, practically it does boost both the neuromorphic computing and neuroprosthetic fields.
“We are very excited with this new development,” said study author Dr. Themis Prodromakis at the University of Southampton. “On one side it sets the basis for a novel scenario that was never encountered during natural evolution, where biological and artificial neurons are linked together and communicate across global networks; laying the foundations for the Internet of Neuro-electronics. On the other hand, it brings new prospects to neuroprosthetic technologies, paving the way towards research into replacing dysfunctional parts of the brain with AI chips.”
Image Credit: Gerd Altmann from Pixabay Continue reading