Tag Archives: learn
#435707 AI Agents Startle Researchers With ...
After 25 million games, the AI agents playing hide-and-seek with each other had mastered four basic game strategies. The researchers expected that part.
After a total of 380 million games, the AI players developed strategies that the researchers didn’t know were possible in the game environment—which the researchers had themselves created. That was the part that surprised the team at OpenAI, a research company based in San Francisco.
The AI players learned everything via a machine learning technique known as reinforcement learning. In this learning method, AI agents start out by taking random actions. Sometimes those random actions produce desired results, which earn them rewards. Via trial-and-error on a massive scale, they can learn sophisticated strategies.
In the context of games, this process can be abetted by having the AI play against another version of itself, ensuring that the opponents will be evenly matched. It also locks the AI into a process of one-upmanship, where any new strategy that emerges forces the opponent to search for a countermeasure. Over time, this “self-play” amounted to what the researchers call an “auto-curriculum.”
According to OpenAI researcher Igor Mordatch, this experiment shows that self-play “is enough for the agents to learn surprising behaviors on their own—it’s like children playing with each other.”
Reinforcement is a hot field of AI research right now. OpenAI’s researchers used the technique when they trained a team of bots to play the video game Dota 2, which squashed a world-champion human team last April. The Alphabet subsidiary DeepMind has used it to triumph in the ancient board game Go and the video game StarCraft.
Aniruddha Kembhavi, a researcher at the Allen Institute for Artificial Intelligence (AI2) in Seattle, says games such as hide-and-seek offer a good way for AI agents to learn “foundational skills.” He worked on a team that taught their AllenAI to play Pictionary with humans, viewing the gameplay as a way for the AI to work on common sense reasoning and communication. “We are, however, quite far away from being able to translate these preliminary findings in highly simplified environments into the real world,” says Kembhavi.
Illustration: OpenAI
AI agents construct a fort during a hide-and-seek game developed by OpenAI.
In OpenAI’s game of hide-and-seek, both the hiders and the seekers received a reward only if they won the game, leaving the AI players to develop their own strategies. Within a simple 3D environment containing walls, blocks, and ramps, the players first learned to run around and chase each other (strategy 1). The hiders next learned to move the blocks around to build forts (2), and then the seekers learned to move the ramps (3), enabling them to jump inside the forts. Then the hiders learned to move all the ramps into their forts before the seekers could use them (4).
The two strategies that surprised the researchers came next. First the seekers learned that they could jump onto a box and “surf” it over to a fort (5), allowing them to jump in—a maneuver that the researchers hadn’t realized was physically possible in the game environment. So as a final countermeasure, the hiders learned to lock all the boxes into place (6) so they weren’t available for use as surfboards.
Illustration: OpenAI
An AI agent uses a nearby box to surf its way into a competitor’s fort.
In this circumstance, having AI agents behave in an unexpected way wasn’t a problem: They found different paths to their rewards, but didn’t cause any trouble. However, you can imagine situations in which the outcome would be rather serious. Robots acting in the real world could do real damage. And then there’s Nick Bostrom’s famous example of a paper clip factory run by an AI, whose goal is to make as many paper clips as possible. As Bostrom told IEEE Spectrum back in 2014, the AI might realize that “human bodies consist of atoms, and those atoms could be used to make some very nice paper clips.”
Bowen Baker, another member of the OpenAI research team, notes that it’s hard to predict all the ways an AI agent will act inside an environment—even a simple one. “Building these environments is hard,” he says. “The agents will come up with these unexpected behaviors, which will be a safety problem down the road when you put them in more complex environments.”
AI researcher Katja Hofmann at Microsoft Research Cambridge, in England, has seen a lot of gameplay by AI agents: She started a competition that uses Minecraft as the playing field. She says the emergent behavior seen in this game, and in prior experiments by other researchers, shows that games can be a useful for studies of safe and responsible AI.
“I find demonstrations like this, in games and game-like settings, a great way to explore the capabilities and limitations of existing approaches in a safe environment,” says Hofmann. “Results like these will help us develop a better understanding on how to validate and debug reinforcement learning systems–a crucial step on the path towards real-world applications.”
Baker says there’s also a hopeful takeaway from the surprises in the hide-and-seek experiment. “If you put these agents into a rich enough environment they will find strategies that we never knew were possible,” he says. “Maybe they can solve problems that we can’t imagine solutions to.” Continue reading
#435619 Video Friday: Watch This Robot Dog ...
Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here’s what we have so far (send us your events!):
IEEE Africon 2019 – September 25-27, 2019 – Accra, Ghana
RoboBusiness 2019 – October 1-3, 2019 – Santa Clara, CA, USA
ISRR 2019 – October 6-10, 2019 – Hanoi, Vietnam
Ro-Man 2019 – October 14-18, 2019 – New Delhi, India
Humanoids 2019 – October 15-17, 2019 – Toronto, Canada
ARSO 2019 – October 31-1, 2019 – Beijing, China
ROSCon 2019 – October 31-1, 2019 – Macau
IROS 2019 – November 4-8, 2019 – Macau
Let us know if you have suggestions for next week, and enjoy today’s videos.
Team PLUTO (University of Pennsylvania, Ghost Robotics, and Exyn Technologies) put together this video giving us a robot’s-eye-view (or whatever they happen to be using for eyes) of the DARPA Subterranean Challenge tunnel circuits.
[ PLUTO ]
Zhifeng Huang has been improving his jet-stepping humanoid robot, which features new hardware and the ability to take larger and more complex steps.
This video reported the last progress of an ongoing project utilizing ducted-fan propulsion system to improve humanoid robot’s ability in stepping over large ditches. The landing point of the robot’s swing foot can be not only forward but also side direction. With keeping quasi-static balance, the robot was able to step over a ditch with 450mm in width (up to 97% of the robot’s leg’s length) in 3D stepping.
[ Paper ]
Thanks Zhifeng!
These underacuated hands from Matei Ciocarlie’s lab at Columbia are magically able to reconfigure themselves to grasp different object types with just one or two motors.
[ Paper ] via [ ROAM Lab ]
This is one reason we should pursue not “autonomous cars” but “fully autonomous cars” that never require humans to take over. We can’t be trusted.
During our early days as the Google self-driving car project, we invited some employees to test our vehicles on their commutes and weekend trips. What we were testing at the time was similar to the highway driver assist features that are now available on cars today, where the car takes over the boring parts of the driving, but if something outside its ability occurs, the driver has to take over immediately.
What we saw was that our testers put too much trust in that technology. They were doing things like texting, applying makeup, and even falling asleep that made it clear they would not be ready to take over driving if the vehicle asked them to. This is why we believe that nothing short of full autonomy will do.
[ Waymo ]
Buddy is a DIY and fetchingly minimalist social robot (of sorts) that will be coming to Kickstarter this month.
We have created a new arduino kit. His name is Buddy. He is a DIY social robot to serve as a replacement for Jibo, Cozmo, or any of the other bots that are no longer available. Fully 3D printed and supported he adds much more to our series of Arduino STEM robotics kits.
Buddy is able to look around and map his surroundings and react to changes within them. He can be surprised and he will always have a unique reaction to changes. The kit can be built very easily in less than an hour. It is even robust enough to take the abuse that kids can give it in a classroom.
[ Littlebots ]
The android Mindar, based on the Buddhist deity of mercy, preaches sermons at Kodaiji temple in Kyoto, and its human colleagues predict that with artificial intelligence it could one day acquire unlimited wisdom. Developed at a cost of almost $1 million (¥106 million) in a joint project between the Zen temple and robotics professor Hiroshi Ishiguro, the robot teaches about compassion and the dangers of desire, anger and ego.
[ Japan Times ]
I’m not sure whether it’s the sound or what, but this thing scares me for some reason.
[ BIRL ]
This gripper uses magnets as a sort of adjustable spring for dynamic stiffness control, which seems pretty clever.
[ Buffalo ]
What a package of medicine sees while being flown by drone from a hospital to a remote clinic in the Dominican Republic. The drone flew 11 km horizontally and 800 meters vertically, and I can’t even imagine what it would take to make that drive.
[ WeRobotics ]
My first ride in a fully autonomous car was at Stanford in 2009. I vividly remember getting in the back seat of a descendant of Junior, and watching the steering wheel turn by itself as the car executed a perfect parking maneuver. Ten years later, it’s still fun to watch other people have that experience.
[ Waymo ]
Flirtey, the pioneer of the commercial drone delivery industry, has unveiled the much-anticipated first video of its next-generation delivery drone, the Flirtey Eagle. The aircraft designer and manufacturer also unveiled the Flirtey Portal, a sophisticated take off and landing platform that enables scalable store-to-door operations; and an autonomous software platform that enables drones to deliver safely to homes.
[ Flirtey ]
EPFL scientists are developing new approaches for improved control of robotic hands – in particular for amputees – that combines individual finger control and automation for improved grasping and manipulation. This interdisciplinary proof-of-concept between neuroengineering and robotics was successfully tested on three amputees and seven healthy subjects.
[ EPFL ]
This video is a few years old, but we’ll take any excuse to watch the majestic sage-grouse be majestic in all their majesticness.
[ UC Davis ]
I like the idea of a game of soccer (or, football to you weirdos in the rest of the world) where the ball has a mind of its own.
[ Sphero ]
Looks like the whole delivery glider idea is really taking off! Or, you know, not taking off.
Weird that they didn’t show the landing, because it sure looked like it was going to plow into the side of the hill at full speed.
[ Yates ] via [ sUAS News ]
This video is from a 2018 paper, but it’s not like we ever get tired of seeing quadrupeds do stuff, right?
[ MIT ]
Founder and Head of Product, Ian Bernstein, and Head of Engineering, Morgan Bell, have been involved in the Misty project for years and they have learned a thing or two about building robots. Hear how and why Misty evolved into a robot development platform, learn what some of the earliest prototypes did (and why they didn’t work for what we envision), and take a deep dive into the technology decisions that form the Misty II platform.
[ Misty Robotics ]
Lex Fridman interviews Vijay Kumar on the Artifiical Intelligence Podcast.
[ AI Podcast ]
This week’s CMU RI Seminar is from Ross Knepper at Cornell, on Formalizing Teamwork in Human-Robot Interaction.
Robots out in the world today work for people but not with people. Before robots can work closely with ordinary people as part of a human-robot team in a home or office setting, robots need the ability to acquire a new mix of functional and social skills. Working with people requires a shared understanding of the task, capabilities, intentions, and background knowledge. For robots to act jointly as part of a team with people, they must engage in collaborative planning, which involves forming a consensus through an exchange of information about goals, capabilities, and partial plans. Often, much of this information is conveyed through implicit communication. In this talk, I formalize components of teamwork involving collaboration, communication, and representation. I illustrate how these concepts interact in the application of social navigation, which I argue is a first-class example of teamwork. In this setting, participants must avoid collision by legibly conveying intended passing sides via nonverbal cues like path shape. A topological representation using the braid groups enables the robot to reason about a small enumerable set of passing outcomes. I show how implicit communication of topological group plans achieves rapid covergence to a group consensus, and how a robot in the group can deliberately influence the ultimate outcome to maximize joint performance, yielding pedestrian comfort with the robot.
[ CMU RI ]
In this week’s episode of Robots in Depth, Per speaks with Julien Bourgeois about Claytronics, a project from Carnegie Mellon and Intel to develop “programmable matter.”
Julien started out as a computer scientist. He was always interested in robotics privately but then had the opportunity to get into micro robots when his lab was merged into the FEMTO-ST Institute. He later worked with Seth Copen Goldstein at Carnegie Mellon on the Claytronics project.
Julien shows an enlarged mock-up of the small robots that make up programmable matter, catoms, and speaks about how they are designed. Currently he is working on a unit that is one centimeter in diameter and he shows us the very small CPU that goes into that model.
[ Robots in Depth ] Continue reading