After 25 million games, the AI agents playing hide-and-seek with each other had mastered four basic game strategies. The researchers expected that part.
After a total of 380 million games, the AI players developed strategies that the researchers didn’t know were possible in the game environment—which the researchers had themselves created. That was the part that surprised the team at OpenAI, a research company based in San Francisco.
The AI players learned everything via a machine learning technique known as reinforcement learning. In this learning method, AI agents start out by taking random actions. Sometimes those random actions produce desired results, which earn them rewards. Via trial-and-error on a massive scale, they can learn sophisticated strategies.
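The trial-and-error loop described above can be sketched in a few lines of Python. This is only a toy illustration, not the training setup used in the hide-and-seek experiment: the three-action task, the reward probabilities, the epsilon-greedy rule, and the name train_bandit are all assumptions made for the example.

```python
import random

def train_bandit(reward_probs, episodes=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (hypothetical example, epsilon-greedy).

    reward_probs[a] is the assumed chance that action a earns a reward of 1.
    The agent does not see these numbers; it only sees the rewards.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    values = [0.0] * n  # running estimate of each action's average reward
    counts = [0] * n    # how many times each action has been tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: take a random action.
            action = rng.randrange(n)
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(n), key=lambda a: values[a])
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average: nudge the estimate toward the new reward.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

values, counts = train_bandit([0.2, 0.5, 0.8])
best = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", [round(v, 2) for v in values])
print("times each action was tried:", counts)
print("preferred action:", best)
```

With enough episodes, the agent ends up spending most of its tries on the action that pays off most often, even though it began with no knowledge of which one that was.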
In the context of games, this process can be abetted by having the AI play against another version of itself, ensuring that the opponents will be evenly matched. It also locks the AI into a process of one-upmanship, where any new strategy that emerges forces the opponent to search for a countermeasure. Over time, this “self-play” amounted to what the researchers call an “auto-curriculum.”
According to OpenAI researcher Igor Mordatch, this experiment shows that self-play “is enough for the agents to learn surprising behaviors on their own—it’s like children playing with each other.”
Reinforcement learning is a hot field of AI research right now. OpenAI’s researchers used the technique when they trained a team of bots to play the video game Dota 2; the bots squashed a world-champion human team last April. The Alphabet subsidiary DeepMind has used it to triumph in the ancient board game Go and the video game StarCraft.
Aniruddha Kembhavi, a researcher at the Allen Institute for Artificial Intelligence (AI2) in Seattle, says games such as hide-and-seek offer a good way for AI agents to learn “foundational skills.” He worked on a team that taught their AllenAI to play Pictionary with humans, viewing the gameplay as a way for the AI to work on common sense reasoning and communication. “We are, however, quite far away from being able to translate these preliminary findings in highly simplified environments into the real world,” says Kembhavi.
AI agents construct a fort during a hide-and-seek game developed by OpenAI.
In OpenAI’s game of hide-and-seek, both the hiders and the seekers received a reward only if they won the game, leaving the AI players to develop their own strategies. Within a simple 3D environment containing walls, blocks, and ramps, the players first learned to run around and chase each other (strategy 1). The hiders next learned to move the blocks around to build forts (2), and then the seekers learned to move the ramps (3), enabling them to jump inside the forts. Then the hiders learned to move all the ramps into their forts before the seekers could use them (4).
The two strategies that surprised the researchers came next. First the seekers learned that they could jump onto a box and “surf” it over to a fort (5), allowing them to jump in—a maneuver that the researchers hadn’t realized was physically possible in the game environment. So as a final countermeasure, the hiders learned to lock all the boxes into place (6) so they weren’t available for use as surfboards.
An AI agent uses a nearby box to surf its way into a competitor’s fort.
In this circumstance, having AI agents behave in an unexpected way wasn’t a problem: They found different paths to their rewards, but didn’t cause any trouble. However, you can imagine situations in which the outcome would be rather serious. Robots acting in the real world could do real damage. And then there’s Nick Bostrom’s famous example of a paper clip factory run by an AI, whose goal is to make as many paper clips as possible. As Bostrom told IEEE Spectrum back in 2014, the AI might realize that “human bodies consist of atoms, and those atoms could be used to make some very nice paper clips.”
Bowen Baker, another member of the OpenAI research team, notes that it’s hard to predict all the ways an AI agent will act inside an environment—even a simple one. “Building these environments is hard,” he says. “The agents will come up with these unexpected behaviors, which will be a safety problem down the road when you put them in more complex environments.”
AI researcher Katja Hofmann at Microsoft Research Cambridge, in England, has seen a lot of gameplay by AI agents: She started a competition that uses Minecraft as the playing field. She says the emergent behavior seen in this game, and in prior experiments by other researchers, shows that games can be useful for studies of safe and responsible AI.
“I find demonstrations like this, in games and game-like settings, a great way to explore the capabilities and limitations of existing approaches in a safe environment,” says Hofmann. “Results like these will help us develop a better understanding on how to validate and debug reinforcement learning systems–a crucial step on the path towards real-world applications.”
Baker says there’s also a hopeful takeaway from the surprises in the hide-and-seek experiment. “If you put these agents into a rich enough environment they will find strategies that we never knew were possible,” he says. “Maybe they can solve problems that we can’t imagine solutions to.”
Soft robots are getting more and more popular for some very good reasons. Their relative simplicity is one. Their relative low cost is another. And for their simplicity and low cost, they’re generally able to perform very impressively, leveraging the unique features inherent to their design and construction to move themselves and interact with their environment. The other significant reason why soft robots are so appealing is that they’re durable. Without the constraints of rigid parts, they can withstand the sort of abuse that would make any roboticist cringe.
In the current issue of Science Robotics, a group of researchers from Tsinghua University in China and University of California, Berkeley, present a new kind of soft robot that’s both higher performance and much more robust than just about anything we’ve seen before. The deceptively simple robot looks like a bent strip of paper, but it’s able to move at 20 body lengths per second and survive being stomped on by a human wearing tennis shoes. Take that, cockroaches.
This prototype robot measures just 3 centimeters by 1.5 cm. It takes a scanning electron microscope to actually see what the robot is made of—a thermoplastic layer is sandwiched by palladium-gold electrodes, bonded with adhesive silicone to a structural plastic at the bottom. When an AC voltage (as low as 8 volts but typically about 60 volts) is run through the electrodes, the thermoplastic extends and contracts, causing the robot’s back to flex and the little “foot” to shuffle. A complete step cycle takes just 50 milliseconds, yielding a 200 hertz gait. And technically, the robot “runs,” since it does have a brief aerial phase.
Image: Science Robotics
Photos from a high-speed camera show the robot’s gait (A to D) as it contracts and expands its body.
To put the robot’s top speed of 20 body lengths per second in perspective, have a look at this nifty chart, which shows the relative running speeds of some animals and robots versus body mass:
Image: Science Robotics
This chart shows the relative running speeds of some mammals (purple area), arthropods (orange area), and soft robots (blue area) versus body mass. For both mammals and arthropods, relative speeds show a strong negative scaling law with respect to body mass: speeds increase as body masses decrease. However, for soft robots, the relationship appears to be the opposite: speeds decrease as the body mass decreases. For the little soft robots created by the researchers from Tsinghua University and UC Berkeley (red stars), the scaling law is similar to that of living animals: Higher speed was attained as the body mass decreased.
If you were wondering, like we were, just what that number 39 is on that chart (top left corner), it’s a species of tiny mite that was discovered underneath a rock in California in 1916. The mite is just under 1 mm in size, but it can run at 0.8 kilometers per hour, which is 322 body lengths per second, making it by far (like, by a factor of two at least) the fastest land animal on Earth relative to size. If a human were to run that fast relative to our size, we’d be traveling at a little bit over 2,000 kilometers per hour. It’s not a coincidence that pretty much everything in the upper left of the chart is an arthropod: speed scales favorably with decreasing mass, since actuators have a proportionally larger effect.
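The relative-speed arithmetic above is easy to check. The sketch below uses the article's figures for the mite (0.8 kilometers per hour, 322 body lengths per second) and for the robot (3 centimeters long, 20 body lengths per second); the 1.75-meter human height and the function names are assumptions chosen for the example.

```python
def implied_body_length_m(speed_kmh, body_lengths_per_s):
    """Body length implied by an absolute speed and a relative speed."""
    speed_m_per_s = speed_kmh * 1000 / 3600
    return speed_m_per_s / body_lengths_per_s

def scaled_speed_kmh(body_lengths_per_s, body_length_m):
    """Absolute speed of a body of the given length at a given relative speed."""
    return body_lengths_per_s * body_length_m * 3.6

# Mite: 0.8 km/h at 322 body lengths per second (figures from the article).
mite_length_mm = implied_body_length_m(0.8, 322) * 1000
print(f"implied mite length: {mite_length_mm:.2f} mm")

# Human at the mite's relative speed, assuming a height of 1.75 m.
human_kmh = scaled_speed_kmh(322, 1.75)
print(f"human at 322 body lengths/s: {human_kmh:.0f} km/h")

# Robot: 3 cm long at 20 body lengths per second, if both figures
# refer to the same prototype.
robot_kmh = scaled_speed_kmh(20, 0.03)
print(f"robot at 20 body lengths/s: {robot_kmh:.2f} km/h")
```

The implied mite length comes out at roughly 0.7 mm, consistent with "just under 1 mm," and the human figure lands a little above 2,000 kilometers per hour for the assumed height.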
Other notable robots on the chart with impressive speed-to-mass ratios are number 27, a magnetically driven quadruped robot from UMD, and number 86, UC Berkeley’s X2-VelociRoACH.
Anyway, back to this robot. Some other cool things about it:
You can step on it, squishing it flat with a load about 1 million times its own body weight, and it’ll keep on crawling, albeit only half as fast.
Even climbing a slope of 15 degrees, it can still manage to move at 1 body length per second.
It carries peanuts! With a payload of six times its own weight, it moves a sixth as fast, but still, it’s not like you need your peanuts delivered all that quickly anyway, do you?
Image: Science Robotics
The researchers also put together a prototype with two legs instead of one, which was able to demonstrate a potentially faster galloping gait by spending more time in the air. They suggest that robots like these could be used for “environmental exploration, structural inspection, information reconnaissance, and disaster relief,” which are the sorts of things that you suggest that your robot could be used for when you really have no idea what it could be used for. But this work is certainly impressive, with speed and robustness that are largely unmatched by other soft robots. An untethered version seems possible due to the relatively low voltages required to drive the robot, and if they can put some peanut-sized sensors on there as well, practical applications might actually be forthcoming sometime soon.
“Insect-scale Fast Moving and Ultrarobust Soft Robot,” by Yichuan Wu, Justin K. Yim, Jiaming Liang, Zhichun Shao, Mingjing Qi, Junwen Zhong, Zihao Luo, Xiaojun Yan, Min Zhang, Xiaohao Wang, Ronald S. Fearing, Robert J. Full, and Liwei Lin from Tsinghua University and UC Berkeley, is published in Science Robotics.