#437269 DeepMind’s Newest AI Programs Itself ...
When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.
Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would not arrive for another decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence that had been waiting for the world to catch up.
Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.
Key to deep learning’s success is the fact the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it out.
Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks that have, to date, been the sole domain of the world’s top computer scientists (and that take them years to complete).
In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.
Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.
DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.
Pavlov’s Digital Dog
First, a little background.
Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.
The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?
While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.
In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.
Move by move, game by game, the algorithm combines experience with its value function to learn which actions bring greater rewards, improving its play until it eventually becomes an uncanny Breakout player.
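To make the idea of a value function concrete, here is a minimal sketch of tabular Q-learning, a classic hand-designed reinforcement learning rule. It is not DeepMind's method and not a Breakout agent: the five-square corridor, the reward of 1.0 at the right end, and every name and parameter below are assumptions chosen for illustration. The table `q` plays the role of the value function, holding an estimate of the total reward the agent can expect after moving left or right from each square.

```python
import random

N_STATES = 5          # hypothetical corridor: positions 0..4
ACTIONS = (-1, +1)    # move left or move right
GOAL = N_STATES - 1   # reaching the right end ends the episode

def step(state, action):
    """Move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train_q_table(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a value estimate for each (position, action) pair from experience."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Mostly take the action with the higher estimate, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Target: immediate reward plus discounted best estimate afterward.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Best-looking action at each non-goal position."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:GOAL]]

if __name__ == "__main__":
    q_table = train_q_table()
    print(greedy_policy(q_table))
```

After training, the greedy policy should point right at every position, and the estimates should grow as the agent gets closer to the goal. The update rule inside the loop is exactly the kind of hand-written component that researchers normally design themselves.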
Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.
LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
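A gridworld of the kind described above can be sketched in a few lines. This is an illustrative toy, not the environments used in the paper: the class name, the reward values of 1.0 and -1.0, the fixed start square, and the rule that an object disappears once collected are all assumptions.

```python
import random

class GridWorld:
    """A toy grid: the agent moves one square at a time and collects object rewards."""

    MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, height, width, objects=None, n_random_objects=0, seed=None):
        self.height, self.width = height, width
        self.fixed_objects = dict(objects or {})   # {(row, col): reward}, set layout
        self.n_random_objects = n_random_objects   # extra objects placed at random
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Put the agent at the top-left square and lay out the objects."""
        self.agent = (0, 0)
        self.objects = dict(self.fixed_objects)
        free = [(r, c) for r in range(self.height) for c in range(self.width)
                if (r, c) != self.agent and (r, c) not in self.objects]
        # Raises ValueError if more random objects are requested than free squares.
        for cell in self.rng.sample(free, self.n_random_objects):
            self.objects[cell] = self.rng.choice([1.0, -1.0])
        return self.agent

    def step(self, move):
        """Move one square (walls block movement) and collect any object there."""
        dr, dc = self.MOVES[move]
        r = min(max(self.agent[0] + dr, 0), self.height - 1)
        c = min(max(self.agent[1] + dc, 0), self.width - 1)
        self.agent = (r, c)
        reward = self.objects.pop(self.agent, 0.0)  # objects vanish once collected
        return self.agent, reward

if __name__ == "__main__":
    env = GridWorld(3, 3, objects={(0, 1): 1.0, (1, 1): -1.0})
    print(env.step("right"))
    print(env.step("down"))
```

Varying `height`, `width`, and the mix of fixed and random objects gives a family of related tasks, which is the property the article attributes to LPG's training environments.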
In LPG’s case, however, there was no hand-written value function to guide that learning.
Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
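The "algorithm within an algorithm" structure can be illustrated with a deliberately crude stand-in. In the sketch below, an inner loop plays a slot-machine-style task using an update rule with one free parameter, and an outer loop picks the parameter value that earns the most reward across several training tasks. This is not LPG: as described above, LPG also discovers what to predict, while this toy only tunes a single number in a fixed rule. The task probabilities, candidate values, and function names are all hypothetical.

```python
import random

def run_bandit(step_size, arm_probs, steps=200, epsilon=0.1, seed=0):
    """Inner loop: play one task with the rule est += step_size * (reward - est)."""
    rng = random.Random(seed)
    estimates = [0.0] * len(arm_probs)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arm_probs))        # explore
        else:
            arm = max(range(len(arm_probs)), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        estimates[arm] += step_size * (reward - estimates[arm])
        total += reward
    return total / steps                               # average reward per step

def score_rule(step_size, environments, seeds=range(5)):
    """Average performance of one candidate rule over tasks and random seeds."""
    runs = [run_bandit(step_size, probs, seed=s)
            for probs in environments for s in seeds]
    return sum(runs) / len(runs)

def meta_train(candidates, environments):
    """Outer loop: keep the update-rule parameter that earns the most reward."""
    return max(candidates, key=lambda w: score_rule(w, environments))

if __name__ == "__main__":
    train_envs = [(0.2, 0.8), (0.7, 0.3), (0.1, 0.9)]  # payout odds per arm
    best = meta_train([0.0, 0.05, 0.2, 0.5], train_envs)
    unseen = [(0.05, 0.95)]                            # not in the training set
    print(best, score_rule(best, unseen), score_rule(0.0, unseen))
```

A step size of 0.0 means the inner agent never learns, so it serves as a baseline: the selected rule should beat it both on the training tasks and on the unseen one.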
Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.
Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even in some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.
This is where there’s room for improvement and more research.
The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.
At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.
Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.
Image credit: Mike Szczepanski / Unsplash
#436944 Is Digital Learning Still Second Best?
As Covid-19 continues to spread, the world has gone digital on an unprecedented scale. Tens of thousands of employees are working from home, and huge conferences, like the Google I/O and Apple WWDC software extravaganzas, plan to experiment with digital events.
Universities too are sending students home. This might have meant an extended break from school not too long ago. But no more. As lecture halls go empty, an experiment into digital learning at scale is ramping up. In the US alone, over 100 universities, from Harvard to Duke, are offering online classes to students to keep the semester going.
While digital learning has been improving for some time, Covid-19 may not only tip us further into a more digitally connected reality, but also help us better appreciate its benefits. This is important because historically, digital learning has been viewed as inferior to traditional learning. But that may be changing.
The Inversion
We often think about digital technologies as ways to reach people without access to traditional services—online learning for children who don’t have schools nearby or telemedicine for patients with no access to doctors. And while these solutions have helped millions of people, they’re often viewed as “second best” and “better than nothing.” Even in more resource-rich environments, there’s an assumption one should pay more to attend an event in person—a concert, a football game, an exercise class—while digital equivalents are extremely cheap or free. Why is this? And is the situation about to change?
Take the case of Dr. Sanjeev Arora, a professor of medicine at the University of New Mexico. Arora started Project Echo because he was frustrated by how many late-stage cases of hepatitis C he encountered in rural New Mexico. He realized that if he had reached patients sooner, he could have prevented needless deaths. The solution? Digital learning for local health workers.
Project Echo connects rural healthcare practitioners to specialists at top health centers by video. The approach is collaborative: Specialists share best practices and work through cases with participants to apply them in the real world and learn from edge cases. In addition to expert presentations, there are plenty of opportunities to ask questions and interact with specialists.
The method forms a digital loop of learning, practice, assessment, and adjustment.
Since 2003, Project Echo has scaled to 800 locations in 39 countries and trained over 90,000 healthcare providers. Most notably, a study in The New England Journal of Medicine found that the outcomes of hepatitis C treatment given by Project Echo-trained healthcare workers in rural and underserved areas were similar to outcomes at university medical centers. That is, digital learning in this context was equivalent to high-quality in-person learning.
If that is possible today, with simple tools, will such programs surpass traditional medical centers and schools in the future? Can digital learning more generally follow suit and have the same success? Perhaps. Going digital brings its own special toolset to the table too.
The Benefits of Digital
If you’re training people online, you can record the session to better understand their engagement levels—or even add artificial intelligence to analyze it in real time. Ahura AI, for example, founded by Bryan Talebi, aims to upskill workers through online training. Early study of their method suggests they can significantly speed up learning by analyzing users’ real-time emotions—like frustration or distraction—and adjusting the lesson plan or difficulty on the fly.
Other benefits of digital learning include the near-instantaneous download of course materials—rather than printing and shipping books—and being able to more easily report grades and other results, a requirement for many schools and social services organizations. And of course, as other digitized industries show, digital learning can grow and scale further at much lower costs.
To that last point, 360ed, a digital learning startup founded in 2016 by Hla Hla Win, now serves millions of children in Myanmar with augmented reality lesson plans. And Global Startup Ecosystem, founded by Christine Souffrant Ntim and Einstein Kofi Ntim in 2015, is the world’s first and largest digital accelerator program. Their entirely online programs support over 1,000 companies in 90 countries. It’s astonishing how fast both of these organizations have grown.
Notably, both examples include offline experiences too. Many of the 360ed lesson plans come with paper flashcards children use with their smartphones because the online-offline interaction improves learning. The Global Startup Ecosystem also hosts about 10 additional in-person tech summits around the world on various topics through a related initiative.
Looking further ahead, probably the most important benefit of online learning will be its potential to integrate with other digital systems in the workplace.
Imagine a medical center that has perfect information about every patient and treatment in real time and that this information is (anonymously and privately) centralized, analyzed, and shared with medical centers, research labs, pharmaceutical companies, clinical trials, policy makers, and medical students around the world. Just as self-driving cars can learn to drive better by having access to the experiences of other self-driving cars, so too can any group working to solve complex, time-sensitive challenges learn from and build on each other’s experiences.
Why This Matters
While in the long term the world will likely end up combining the best aspects of traditional and digital learning, it’s important in the near term to be more aware of the assumptions we make about digital technologies. Some of the most pioneering work in education, healthcare, and other industries may not be highly visible right now because it is in a virtual setting. Most people are unaware, for example, that the busiest emergency room in rural America is already virtual.
Once they start converging with other digital technologies, these innovations will likely become the mainstream system for all of us. Which raises more questions: What is the best business model for these virtual services? If they start delivering better healthcare and educational outcomes than traditional institutions, should they charge more? Hopefully, we will see an even bigger shift occurring, in which technology allows us to provide high quality education, healthcare, and other services to everyone at more affordable prices than today.
These are some of the topics we can consider as Covid-19 forces us into uncharted territory.
Image Credit: Andras Vas / Unsplash