Tag Archives: trip
#439105 This Robot Taught Itself to Walk in a ...
Recently, in a Berkeley lab, a robot called Cassie taught itself to walk, a little like a toddler might. Through trial and error, it learned to move in a simulated world. Then its handlers sent it strolling through a minefield of real-world tests to see how it’d fare.
And, as it turns out, it fared pretty damn well. With no further fine-tuning, the robot—which is basically just a pair of legs—was able to walk in all directions, squat down while walking, right itself when pushed off balance, and adjust to different kinds of surfaces.
It’s the first time a machine learning approach known as reinforcement learning has been so successfully applied in two-legged robots.
This likely isn’t the first robot video you’ve seen, nor the most polished.
For years, the internet has been enthralled by videos of robots doing far more than walking and regaining their balance. All that is table stakes these days. Boston Dynamics, the heavyweight champ of robot videos, regularly releases mind-blowing footage of robots doing parkour, back flips, and complex dance routines. At times, it can seem the world of iRobot is just around the corner.
This sense of awe is well-earned. Boston Dynamics is one of the world’s top makers of advanced robots.
But they still have to meticulously hand program and choreograph the movements of the robots in their videos. This is a powerful approach, and the Boston Dynamics team has done incredible things with it.
In real-world situations, however, robots need to be robust and resilient. They need to regularly deal with the unexpected, and no amount of choreography will do. Which is how, it’s hoped, machine learning can help.
Reinforcement learning has been most famously exploited by Alphabet’s DeepMind to train algorithms that thrash humans at some the most difficult games. Simplistically, it’s modeled on the way we learn. Touch the stove, get burned, don’t touch the damn thing again; say please, get a jelly bean, politely ask for another.
In Cassie’s case, the Berkeley team used reinforcement learning to train an algorithm to walk in a simulation. It’s not the first AI to learn to walk in this manner. But going from simulation to the real world doesn’t always translate.
Subtle differences between the two can (literally) trip up a fledgling robot as it tries out its sim skills for the first time.
To overcome this challenge, the researchers used two simulations instead of one. The first simulation, an open source training environment called MuJoCo, was where the algorithm drew upon a large library of possible movements and, through trial and error, learned to apply them. The second simulation, called Matlab SimMechanics, served as a low-stakes testing ground that more precisely matched real-world conditions.
Once the algorithm was good enough, it graduated to Cassie.
And amazingly, it didn’t need further polishing. Said another way, when it was born into the physical world—it knew how to walk just fine. In addition, it was also quite robust. The researchers write that two motors in Cassie’s knee malfunctioned during the experiment, but the robot was able to adjust and keep on trucking.
Other labs have been hard at work applying machine learning to robotics.
Last year Google used reinforcement learning to train a (simpler) four-legged robot. And OpenAI has used it with robotic arms. Boston Dynamics, too, will likely explore ways to augment their robots with machine learning. New approaches—like this one aimed at training multi-skilled robots or this one offering continuous learning beyond training—may also move the dial. It’s early yet, however, and there’s no telling when machine learning will exceed more traditional methods.
And in the meantime, Boston Dynamics bots are testing the commercial waters.
Still, robotics researchers, who were not part of the Berkeley team, think the approach is promising. Edward Johns, head of Imperial College London’s Robot Learning Lab, told MIT Technology Review, “This is one of the most successful examples I have seen.”
The Berkeley team hopes to build on that success by trying out “more dynamic and agile behaviors.” So, might a self-taught parkour-Cassie be headed our way? We’ll see.
Image Credit: University of California Berkeley Hybrid Robotics via YouTube Continue reading
#438779 Meet Catfish Charlie, the CIA’s ...
Photo: CIA Museum
CIA roboticists designed Catfish Charlie to take water samples undetected. Why they wanted a spy fish for such a purpose remains classified.
In 1961, Tom Rogers of the Leo Burnett Agency created Charlie the Tuna, a jive-talking cartoon mascot and spokesfish for the StarKist brand. The popular ad campaign ran for several decades, and its catchphrase “Sorry, Charlie” quickly hooked itself in the American lexicon.
When the CIA’s Office of Advanced Technologies and Programs started conducting some fish-focused research in the 1990s, Charlie must have seemed like the perfect code name. Except that the CIA’s Charlie was a catfish. And it was a robot.
More precisely, Charlie was an unmanned underwater vehicle (UUV) designed to surreptitiously collect water samples. Its handler controlled the fish via a line-of-sight radio handset. Not much has been revealed about the fish’s construction except that its body contained a pressure hull, ballast system, and communications system, while its tail housed the propulsion. At 61 centimeters long, Charlie wouldn’t set any biggest-fish records. (Some species of catfish can grow to 2 meters.) Whether Charlie reeled in any useful intel is unknown, as details of its missions are still classified.
For exploring watery environments, nothing beats a robot
The CIA was far from alone in its pursuit of UUVs nor was it the first agency to do so. In the United States, such research began in earnest in the 1950s, with the U.S. Navy’s funding of technology for deep-sea rescue and salvage operations. Other projects looked at sea drones for surveillance and scientific data collection.
Aaron Marburg, a principal electrical and computer engineer who works on UUVs at the University of Washington’s Applied Physics Laboratory, notes that the world’s oceans are largely off-limits to crewed vessels. “The nature of the oceans is that we can only go there with robots,” he told me in a recent Zoom call. To explore those uncharted regions, he said, “we are forced to solve the technical problems and make the robots work.”
Image: Thomas Wells/Applied Physics Laboratory/University of Washington
An oil painting commemorates SPURV, a series of underwater research robots built by the University of Washington’s Applied Physics Lab. In nearly 400 deployments, no SPURVs were lost.
One of the earliest UUVs happens to sit in the hall outside Marburg’s office: the Self-Propelled Underwater Research Vehicle, or SPURV, developed at the applied physics lab beginning in the late ’50s. SPURV’s original purpose was to gather data on the physical properties of the sea, in particular temperature and sound velocity. Unlike Charlie, with its fishy exterior, SPURV had a utilitarian torpedo shape that was more in line with its mission. Just over 3 meters long, it could dive to 3,600 meters, had a top speed of 2.5 m/s, and operated for 5.5 hours on a battery pack. Data was recorded to magnetic tape and later transferred to a photosensitive paper strip recorder or other computer-compatible media and then plotted using an IBM 1130.
Over time, SPURV’s instrumentation grew more capable, and the scope of the project expanded. In one study, for example, SPURV carried a fluorometer to measure the dispersion of dye in the water, to support wake studies. The project was so successful that additional SPURVs were developed, eventually completing nearly 400 missions by the time it ended in 1979.
Working on underwater robots, Marburg says, means balancing technical risks and mission objectives against constraints on funding and other resources. Support for purely speculative research in this area is rare. The goal, then, is to build UUVs that are simple, effective, and reliable. “No one wants to write a report to their funders saying, ‘Sorry, the batteries died, and we lost our million-dollar robot fish in a current,’ ” Marburg says.
A robot fish called SoFi
Since SPURV, there have been many other unmanned underwater vehicles, of various shapes and sizes and for various missions, developed in the United States and elsewhere. UUVs and their autonomous cousins, AUVs, are now routinely used for scientific research, education, and surveillance.
At least a few of these robots have been fish-inspired. In the mid-1990s, for instance, engineers at MIT worked on a RoboTuna, also nicknamed Charlie. Modeled loosely on a blue-fin tuna, it had a propulsion system that mimicked the tail fin of a real fish. This was a big departure from the screws or propellers used on UUVs like SPURV. But this Charlie never swam on its own; it was always tethered to a bank of instruments. The MIT group’s next effort, a RoboPike called Wanda, overcame this limitation and swam freely, but never learned to avoid running into the sides of its tank.
Fast-forward 25 years, and a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) unveiled SoFi, a decidedly more fishy robot designed to swim next to real fish without disturbing them. Controlled by a retrofitted Super Nintendo handset, SoFi could dive more than 15 meters, control its own buoyancy, and swim around for up to 40 minutes between battery charges. Noting that SoFi’s creators tested their robot fish in the gorgeous waters off Fiji, IEEE Spectrum’s Evan Ackerman noted, “Part of me is convinced that roboticists take on projects like these…because it’s a great way to justify a trip somewhere exotic.”
SoFi, Wanda, and both Charlies are all examples of biomimetics, a term coined in 1974 to describe the study of biological mechanisms, processes, structures, and substances. Biomimetics looks to nature to inspire design.
Sometimes, the resulting technology proves to be more efficient than its natural counterpart, as Richard James Clapham discovered while researching robotic fish for his Ph.D. at the University of Essex, in England. Under the supervision of robotics expert Huosheng Hu, Clapham studied the swimming motion of Cyprinus carpio, the common carp. He then developed four robots that incorporated carplike swimming, the most capable of which was iSplash-II. When tested under ideal conditions—that is, a tank 5 meters long, 2 meters wide, and 1.5 meters deep—iSpash-II obtained a maximum velocity of 11.6 body lengths per second (or about 3.7 m/s). That’s faster than a real carp, which averages a top velocity of 10 body lengths per second. But iSplash-II fell short of the peak performance of a fish darting quickly to avoid a predator.
Of course, swimming in a test pool or placid lake is one thing; surviving the rough and tumble of a breaking wave is another matter. The latter is something that roboticist Kathryn Daltorio has explored in depth.
Daltorio, an assistant professor at Case Western Reserve University and codirector of the Center for Biologically Inspired Robotics Research there, has studied the movements of cockroaches, earthworms, and crabs for clues on how to build better robots. After watching a crab navigate from the sandy beach to shallow water without being thrown off course by a wave, she was inspired to create an amphibious robot with tapered, curved feet that could dig into the sand. This design allowed her robot to withstand forces up to 138 percent of its body weight.
Photo: Nicole Graf
This robotic crab created by Case Western’s Kathryn Daltorio imitates how real crabs grab the sand to avoid being toppled by waves.
In her designs, Daltorio is following architect Louis Sullivan’s famous maxim: Form follows function. She isn’t trying to imitate the aesthetics of nature—her robot bears only a passing resemblance to a crab—but rather the best functionality. She looks at how animals interact with their environments and steals evolution’s best ideas.
And yet, Daltorio admits, there is also a place for realistic-looking robotic fish, because they can capture the imagination and spark interest in robotics as well as nature. And unlike a hyperrealistic humanoid, a robotic fish is unlikely to fall into the creepiness of the uncanny valley.
In writing this column, I was delighted to come across plenty of recent examples of such robotic fish. Ryomei Engineering, a subsidiary of Mitsubishi Heavy Industries, has developed several: a robo-coelacanth, a robotic gold koi, and a robotic carp. The coelacanth was designed as an educational tool for aquariums, to present a lifelike specimen of a rarely seen fish that is often only known by its fossil record. Meanwhile, engineers at the University of Kitakyushu in Japan created Tai-robot-kun, a credible-looking sea bream. And a team at Evologics, based in Berlin, came up with the BOSS manta ray.
Whatever their official purpose, these nature-inspired robocreatures can inspire us in return. UUVs that open up new and wondrous vistas on the world’s oceans can extend humankind’s ability to explore. We create them, and they enhance us, and that strikes me as a very fair and worthy exchange.
This article appears in the March 2021 print issue as “Catfish, Robot, Swimmer, Spy.”
About the Author
Allison Marsh is an associate professor of history at the University of South Carolina and codirector of the university’s Ann Johnson Institute for Science, Technology & Society. Continue reading
#437269 DeepMind’s Newest AI Programs Itself ...
When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.
Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would take a decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.
Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.
Key to deep learning’s success is the fact the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it.
Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks which have been, to date, the sole domain of the world’s top computer scientists (and take them years to write).
In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.
Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.
DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.
Pavlov’s Digital Dog
First, a little background.
Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.
The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?
While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.
In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.
Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.
Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.
LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
Only in LPG’s case, it had no value function to guide that learning.
Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.
Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.
This is where there’s room for improvement and more research.
The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.
At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.
Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.
Image credit: Mike Szczepanski / Unsplash Continue reading