Tag Archives: ieee

#435738 Boing Goes the Trampoline Robot

There are a handful of quadrupedal robots out there that are highly dynamic, with the ability to run and jump, but those robots tend to be rather expensive and complicated, requiring powerful actuators and legs with elasticity. Boxing Wang, a Ph.D. student in the College of Control Science and Engineering at Zhejiang University in China, contacted us to share a project he’s been working on to investigate quadruped jumping with simple, affordable hardware.

“The motivation for this project is quite simple,” Boxing says. “I wanted to study quadrupedal jumping control, but I didn’t have custom-made powerful actuators, and I didn’t want to have to design elastic legs. So I decided to use a trampoline to make a normal servo-driven quadruped robot jump.”

Boxing and his colleagues had wanted to study quadrupedal running and jumping, so they built this robot with the most powerful servos they had access to: Kondo KRS6003RHV actuators, which have a maximum torque of 6 Nm. After some simple testing, it became clear that the servos were simply not fast or powerful enough to get the robot to jump, and that an elastic element was necessary to store energy to help the robot get off the ground.

“Normally, people would choose elastic legs,” says Boxing. “But nobody in my lab knew for sure how to design them. If we tried making elastic legs and we failed to make the robot jump, we couldn’t be sure whether the problem was the legs or the control algorithms. For hardware, we decided that it’s better to start with something reliable, something that definitely won’t be the source of the problem.”

As it turns out, all you need is a trampoline, an inertial measurement unit (IMU), and little tactile switches on the end of each foot to detect touch-down and lift-off events, and you can do some useful jumping research without a jumping robot. And the trampoline has other benefits as well—because it’s stiffer at the edges than at the center, for example, the robot will tend to center itself on the trampoline, and you get some warning before things go wrong.
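
To give a sense of how simple that sensing loop can be, here is a minimal sketch (our own illustration, not code from Wang's project) of detecting touch-down and lift-off events from the four foot switches; attitude control from the IMU would then be scheduled around these two events.

from dataclasses import dataclass

@dataclass
class PhaseDetector:
    in_stance: bool = False  # True while the robot is in contact with the trampoline

    def update(self, foot_switches):
        """Return 'touch_down', 'lift_off', or None given four switch readings."""
        contact = any(foot_switches)        # any pressed foot counts as stance
        event = None
        if contact and not self.in_stance:
            event = "touch_down"            # start extending the legs
        elif not contact and self.in_stance:
            event = "lift_off"              # hold a landing posture during flight
        self.in_stance = contact
        return event

# Example: a short stream of switch readings as the robot bounces
detector = PhaseDetector()
for reading in [(0, 0, 0, 0), (1, 1, 0, 1), (1, 1, 1, 1), (0, 0, 0, 0)]:
    print(detector.update([bool(s) for s in reading]))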

“I can’t say that it’s a breakthrough to make a quadruped robot jump on a trampoline,” Boxing tells us. “But I believe this is useful for prototype testing, especially for people who are interested in quadrupedal jumping control but without a suitable robot at hand.”

To learn more about the project, we emailed him some additional questions.

IEEE Spectrum: Where did this idea come from?

Boxing Wang: The idea of the trampoline came while we were drinking milk tea. I don’t know why it came up; maybe someone saw a trampoline in a gym recently. I don’t remember who proposed it exactly, someone just mentioned it in passing. But I realized that a trampoline would be a perfect choice. It’s reliable, easy to buy, and should have a dynamic model similar to that of jumping with springy legs (we have briefly analyzed this in a paper). So I decided to try the trampoline.

How much do you think you can learn using a quadruped on a trampoline, instead of using a jumping quadruped?

Generally speaking, no contact surfaces are strictly rigid. They all have elasticity. So there are no essential differences between jumping on a trampoline and jumping on a rigid surface. However, using a quadruped on a trampoline can give you more information on how to make use of elasticity to make jumping easier and more efficient. You can use quadruped robots with springy legs to address the same problem, but that usually requires much more time on hardware design.
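
To make that equivalence concrete, the stance phase in both cases can be approximated by the same one-dimensional spring-mass hopper; the sketch below (our illustration with made-up parameters, not the model from the paper) treats the stiffness k as coming either from springy legs on hard ground or from rigid legs on a trampoline.

m, k, g = 5.0, 2000.0, 9.81   # mass [kg], effective stiffness [N/m], gravity [m/s^2]
dt = 1e-4                     # integration step [s]

def simulate_hop(z0=0.3, t_end=1.5):
    """Drop the mass from height z0 (spring rest length at z = 0) and return apex heights."""
    z, v, t, apexes, prev_v = z0, 0.0, 0.0, [], 0.0
    while t < t_end:
        spring = -k * z if z < 0 else 0.0   # spring force only while compressed
        v += (spring / m - g) * dt
        z += v * dt
        if prev_v > 0.0 >= v and z > 0.0:   # upward-to-downward velocity flip in flight = apex
            apexes.append(round(z, 3))
        prev_v, t = v, t + dt
    return apexes

print(simulate_hop())   # passive bounces return to roughly the drop height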

We prefer to treat the trampoline experiment as a kind of early test for further real jumping quadruped design. Unless you’re interested in designing an acrobatic robot on a trampoline, a real jumping quadruped is probably a more useful application, and that is our ultimate goal. The point of the trampoline tests is to develop the control algorithms first, and to examine the stability of the general hardware structure. Due to the similarity between jumping on a trampoline with rigid legs and jumping on hard surfaces with springy legs, the control algorithms you develop could be transferred to hard-surface jumping robots.

“Unless you’re interested in designing an acrobatic robot on a trampoline, a real jumping quadruped is probably a more useful application, and that is our ultimate goal. The point of the trampoline tests is to develop the control algorithms first, and to examine the stability of the general hardware structure”

Do you think that this idea can be beneficial for other kinds of robotics research?

Yes. For jumping quadrupeds with springy legs, the control algorithms could be first designed through trampoline tests using simple rigid legs. And the hardware design for elastic legs could be accelerated with the help of the control algorithms you design. In addition, we believe our work could be a good example of using a position-control robot to realize dynamic motions such as jumping, or even running.

Unlike other dynamic robots, every active joint in our robot is controlled through commercial position-control servos and not custom torque control motors. Most people don’t think that a position-control robot could perform highly dynamic motions such as jumping, because position-control motors usually mean a high gear ratio and slow response. However, our work indicates that, with the help of elasticity, stable jumping could be realized through position-control servos. So those who already have a position-control robot at hand could explore the potential of their robot through trampoline tests.

Why is teaching a robot to jump important?

There are many scenarios where a jumping robot is needed. For example, a real jumping quadruped could be used to design a running quadruped. Both experience moments when all four legs are in the air, and it is easier to start from jumping and then move to running. Specifically, hopping or pronking can easily transform to bounding if the pitch angle is not strictly controlled. A bounding quadruped is similar to a running rabbit, so for now it can already be called a running quadruped.

To the best of our knowledge, a practical use of jumping quadrupeds could be planet exploration, just like what SpaceBok was designed for. In a low-gravity environment, jumping is more efficient than walking, and it’s easier to jump over obstacles. But if I had a jumping quadruped on Earth, I would teach it to catch a ball that I throw at it by jumping. It would be fantastic!

That would be fantastic.

Since the whole point of the trampoline was to get jumping software up and running with a minimum of hardware, the next step is to add some springy legs to the robot so that the control system the researchers developed can be tested on hard surfaces. They have a journal paper currently under revision; Boxing Wang is the first author, joined by his adviser Chunlin Zhou, undergraduates Ziheng Duan and Qichao Zhu, and researchers Jun Wu and Rong Xiong.


#435726 This Is the Most Powerful Robot Arm Ever ...

Last month, engineers at NASA’s Jet Propulsion Laboratory wrapped up the installation of the Mars 2020 rover’s 2.1-meter-long robot arm. This is the most powerful arm ever installed on a Mars rover. Even though the Mars 2020 rover shares much of its design with Curiosity, the new arm was redesigned to be able to do much more complex science, drilling into rocks to collect samples that can be stored for later recovery.

JPL is well known for developing robots that do amazing work in incredibly distant and hostile environments. The Opportunity Mars rover, to name just one example, had a 90-day planned mission but remained operational for 5,498 days in a robot-unfriendly place full of dust and wild temperature swings where even the most basic maintenance or repair is utterly impossible. (Its twin rover, Spirit, operated for 2,269 days.)

To learn more about the process behind designing robotic systems that are capable of feats like these, we talked with Matt Robinson, one of the engineers who designed the Mars 2020 rover’s new robot arm.

The Mars 2020 rover (which will be officially named through a public contest that opens this fall) is scheduled to launch in July of 2020, landing in Jezero Crater on February 18, 2021. The overall design is similar to the Mars Science Laboratory (MSL) rover, named Curiosity, which has been exploring Gale Crater on Mars since August 2012, except Mars 2020 will be a bit bigger and capable of doing even more amazing science. It will outweigh Curiosity by about 150 kilograms, but it’s otherwise about the same size, and uses the same type of radioisotope thermoelectric generator for power. Upgraded aluminum wheels will be more durable than Curiosity’s wheels, which have suffered significant wear. Mars 2020 will land on Mars in the same way that Curiosity did, with a mildly insane descent to the surface from a rocket-powered hovering “skycrane.”

Photo: NASA/JPL-Caltech

Last month, engineers at NASA's Jet Propulsion Laboratory installed the main robotic arm on the Mars 2020 rover. Measuring 2.1 meters long, the arm will allow the rover to work as a human geologist would: by holding and using science tools with its turret.

Mars 2020 really steps it up when it comes to science. The most interesting new capability (besides serving as the base station for a highly experimental autonomous helicopter) is that the rover will be able to take surface samples of rock and soil, put them into tubes, seal the tubes up, and then cache the tubes on the surface for later retrieval (and potentially return to Earth for analysis). Collecting the samples is the job of a drill on the end of the robot arm that can be equipped with a variety of interchangeable bits, but the arm holds a number of other instruments as well. A “turret” can swap between the drill, a mineral identification sensor suite called SHERLOC, and an X-ray spectrometer and camera called PIXL. Fundamentally, most of Mars 2020’s science work is going to depend on the arm and the hardware that it carries, both in terms of close-up surface investigations and collecting samples for caching.

Matt Robinson is the Deputy Delivery Manager for the Sample Caching System on the Mars 2020 rover, which covers the robotic arm itself, the drill at the end of the arm, and the sample caching system within the body of the rover that manages the samples. Robinson has been at JPL since 2001, and he’s worked on the Mars Phoenix Lander mission as the robotic arm flight software developer and robotic arm test and operations engineer, as well as on Curiosity as the robotic arm test and operations lead engineer.

We spoke with Robinson about how the Mars 2020 arm was designed, and what it’s like to be building robots for exploring other planets.

IEEE Spectrum: How’d you end up working on robots at JPL?

Matt Robinson: When I was a grad student, my focus was on vision-based robotics research, so the kinds of things they do at JPL, or that we do at JPL now, were right within my wheelhouse. One of my advisors in grad school had a former student who was out here at JPL, so that’s how I made the contact. But I was very excited to come to JPL—as a young grad student working in robotics, space robotics was where it’s at.

For a robotics engineer, working in space is kind of the gold standard. You’re working in a challenging environment and you have to be prepared for any type of eventuality that may occur. And when you send your robot out to space, there’s no getting it back.

Once the rover arrives on Mars and you receive pictures back from it operating, there’s no greater feeling. You’ve built something that is now working 200+ million miles away. It’s an awesome experience! I have to pinch myself sometimes with the job I do. Working at JPL on space robotics is the holy grail for a roboticist.

What’s different about designing an arm for a rover that will operate on Mars?

We spent over five years designing, manufacturing, assembling, and testing the arm. Scientists have defined the high-level goals for what the mission has to do—acquire core samples and process them for return, carry science instruments on the arm to help determine what rocks to sample, and so on. We, as engineers, define the next level of requirements that support those goals.

When you’re building a robotic arm for another planet, you want to design something that is robust to the environment as well as robust from a fault-protection standpoint. On Mars, we’re talking about an environment where the temperature can vary 100 degrees Celsius over the course of the day, so it’s very challenging thermally. With force sensing, for instance, that’s a major problem. Force sensors aren’t typically designed to operate or even survive in the temperature ranges that we’re talking about. So a lot of effort has to go into force sensor design and testing.

And then there’s a do-no-harm aspect—you’re sending this piece of hardware 200 million miles away, and you can’t get it back, so you want to make sure your hardware and software are robust and cannot do any harm to the system. It’s definitely a change in mindset from a terrestrial robot, where if you make a mistake, you can repair it.

“Once the rover arrives on Mars and you receive pictures back from it, there’s no greater feeling . . . I have to pinch myself sometimes with the job I do.”
—Matt Robinson, NASA JPL

How do you decide how much redundancy is enough?

That’s always a big question. It comes down to a couple of things, typically: mass and volume. You have a certain amount of mass that’s allocated to the robotic arm and we have a volume that it has to fit within, so those are often the drivers of the amount of redundancy that you can fit. We also have a lot of experience with sending arms to other planets, and at the beginning of projects, we establish a number of requirements that the design has to meet, and that’s where the redundancy is captured.

How much is the design of the arm driven by this need for redundancy, as opposed to trying to pack in all of the instrumentation that you want to have on there to do as much science as possible?

The requirements were driven by a couple of things. We knew roughly how big the instruments on the end of the arm were going to be, so the arm design is partially driven by that, because as the instruments get bigger and heavier, the arm has to get bigger and stronger. We have our coring drill at the end of the arm, and coring requires a certain level of force, so the arm has to be strong enough to do that. Those all became requirements that drove the design of the arm. On top of that, this arm also has to operate within the Martian environment, so you have things like temperature changes and thermal expansion—you have to design for that as well. It’s a combination of both, really.

You were a test engineer for the arm used on the MSL rover. What did you learn from Spirit and Opportunity that informed the design of the arm on Curiosity?

Spirit and Opportunity did not have any force sensing on the robotic arm. We had contact sensors that were good enough. Spirit and Opportunity’s arms were used to place instruments; that’s all they had to do, primarily. When you’re talking about actually acquiring samples, it’s not a matter of just placing the tool—you also have to apply forces to the environment. And once you start doing that, you really need a force sensor to protect you, and also to determine how much load to apply. So that was a big theme, a big difference between MSL and Spirit and Opportunity.

The size grew a lot too. If you look at Spirit and Opportunity, they’re the size of a riding lawnmower. Curiosity and the Mars 2020 rovers are the size of a small car. The Spirit and Opportunity arm was under a meter long, and the 2020 arm is twice that, and it has to apply forces that are much higher than the Spirit and Opportunity arm. From Curiosity to 2020, the payload of the arm grew by 50 percent, but the mass of the arm did not grow a whole lot, because our mass budget was kind of tight. We had to design an arm that was stronger, that had more capability, without adding more mass. That was a big challenge. We were fairly efficient on Curiosity, but on 2020, we sharpened the pencil even more.

Photo: NASA/JPL-Caltech

Three generations of Mars rovers developed at NASA’s Jet Propulsion Laboratory. Front and center: Sojourner rover, which landed on Mars in 1997 as part of the Mars Pathfinder Project. Left: Mars Exploration Rover Project rover (Spirit and Opportunity), which landed on Mars in 2004. Right: Mars Science Laboratory rover (Curiosity), which landed on Mars in August 2012.

MSL used its arm to drill into rocks like Mars 2020 will—how has the experience of operating MSL on Mars changed your thinking on how to make that work?

On MSL, the force sensor was used primarily for fault protection, just to protect the arm from being overloaded. [When drilling] we used a stiffness model of the arm to apply the force. The force sensor was only used in case you overloaded, and that’s very different from doing active force control, where you’re actually using the force sensor in a control loop.

On Mars 2020, we’re taking it to the next step, using the force sensor to actually actively control the level of force, both for pushing on the ground and for doing bit exchange. That’s a key point because fault protection to prevent damage usually has larger error bars. When you’re trying to actually push on the environment to apply force, and you’re doing active force control, the force sensor has to be significantly more accurate.

So a big thing that we learned on MSL—it was the first time we’d actually flown a force sensor, and we learned a lot about how to design and test force sensors to be used on the surface of Mars.
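
As a rough illustration of that distinction (ours, not JPL's flight code, and with hypothetical numbers), fault protection only compares the measured force against a limit, while active force control feeds the force error back into the commanded position:

FORCE_LIMIT_N = 120.0     # hypothetical overload threshold for fault protection
GAIN_M_PER_N = 1e-5       # hypothetical admittance gain: position change per newton of error

def fault_protection_ok(measured_force_n):
    """MSL-style use of the sensor: just make sure the arm is not being overloaded."""
    return abs(measured_force_n) < FORCE_LIMIT_N

def active_force_step(position_cmd_m, measured_force_n, desired_force_n):
    """Mars-2020-style use: nudge the commanded position to track a desired preload."""
    error = desired_force_n - measured_force_n
    return position_cmd_m + GAIN_M_PER_N * error   # push harder if under, back off if over

# Example loop: press against a simulated surface with stiffness 50 kN/m
surface_stiffness_n_per_m = 5.0e4
cmd, force = 0.0, 0.0
for _ in range(200):
    force = max(0.0, cmd * surface_stiffness_n_per_m)   # crude rigid-contact model
    if not fault_protection_ok(force):
        break
    cmd = active_force_step(cmd, force, desired_force_n=80.0)
print(round(force, 1), "N")   # settles near the 80 N target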

How do you effectively test the Mars 2020 arm on Earth?

That’s a good question. The arm was designed to operate on either Earth or Mars. It’s strong enough to do both. We also have a stiffness model of the arm, which allows us to compensate for differences in gravity. For testing, we make two copies of the robotic arm. We have the copy that we’re going to fly to Mars, which is what we call our flight model, and we have our engineering model. They’re effectively duplicates of each other. The engineering arm stays on Earth, so even once we’ve sent the flight model to Mars, we can continue to test. And if something were to happen, say a drill bit got stuck in the ground on Mars, we could try to replicate those conditions on Earth with our engineering model arm, and use that to test out different scenarios to overcome the problem.
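
A stiffness model of that kind boils down to predicting how much the arm droops under load so the droop can be subtracted out of the commanded pose. The sketch below is only an illustration of the idea, with invented numbers rather than the real arm's properties.

G_EARTH, G_MARS = 9.81, 3.71          # surface gravity [m/s^2]
TIP_STIFFNESS_N_PER_M = 4.0e4         # hypothetical effective vertical stiffness at the tool
SUPPORTED_MASS_KG = 30.0              # hypothetical mass carried at the tool

def tip_droop_m(gravity):
    """Vertical deflection of the tool under the weight it supports."""
    return SUPPORTED_MASS_KG * gravity / TIP_STIFFNESS_N_PER_M

def commanded_height_m(target_height_m, gravity):
    """Aim high by the predicted droop so the sagged tool ends up on target."""
    return target_height_m + tip_droop_m(gravity)

print(round(tip_droop_m(G_EARTH) * 1000, 2), "mm droop on Earth")
print(round(tip_droop_m(G_MARS) * 1000, 2), "mm droop on Mars")
print(round(commanded_height_m(0.500, G_MARS), 4), "m commanded for a 0.500 m target on Mars")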

How much autonomy will the arm have?

We have different levels of autonomy. We have pretty high-level flight software; for instance, we have a command that just says “dock,” which moves the arm and does all the force control to dock the arm with the carousel. For surface interaction, we have stereo cameras on the rover, and those cameras allow us to generate 3D terrain models. Using those 3D terrain models, scientists can select a target on that surface, and then we can position the arm on the target.

Scientists like to select the particular sample targets, because they have very specific types of rocks they’re looking to sample. On 2020, we’re providing the next level of autonomy: the rover can drive up to an area and at least do the initial surveying of that area, so the scientists can select the specific target. The way that would happen is, if there’s an area off in the distance that the scientists find potentially interesting, the rover will autonomously drive up to it, deploy the arm, and take all the pictures so that we can generate those 3D terrain models, and then the next day the scientists can pick the specific target they want. It’s really cool.
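
In outline, that workflow is simple: the terrain model is a set of 3D points in the rover frame, the scientists pick one, and the arm is commanded only if the point sits inside its reachable envelope. The sketch below is our simplification of that idea, not JPL's planning software.

import math

ARM_REACH_M = 2.1                    # stated length of the Mars 2020 arm
ARM_BASE = (0.0, 0.0, 0.0)           # placeholder arm-shoulder position in the rover frame

def reachable(point):
    dx, dy, dz = (p - b for p, b in zip(point, ARM_BASE))
    return math.sqrt(dx * dx + dy * dy + dz * dz) <= ARM_REACH_M

def plan_placement(terrain_points, selected_index, standoff_m=0.05):
    """Return an approach point hovering just above the chosen terrain point, or None."""
    x, y, z = terrain_points[selected_index]
    target = (x, y, z + standoff_m)
    return target if reachable(target) else None

terrain = [(1.2, 0.3, -0.6), (1.9, -0.4, -0.7), (3.5, 0.0, -0.5)]
print(plan_placement(terrain, 1))    # within reach: returns an approach point
print(plan_placement(terrain, 2))    # too far: returns None, so the rover must drive closer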

JPL is famous for making robots that operate for far longer than NASA necessarily plans for. What’s it like designing hardware and software for a system that will (hopefully) become part of that legacy?

The way that I look at it is, when you’re building an arm that’s going to go to another planet, you have to build something that’s robust and that can survive all the things that could go wrong. It’s not that we’re trying to overdesign arms so that they’ll end up lasting much, much longer; it’s that, given all the things you can encounter within a fairly unknown environment, and the level of robustness of the design you have to apply, it just so happens we end up with designs that last a lot longer than they need to. Which is great, but we’re not held to that, although we’re very excited when we see them last that long. Without any calibration, without any maintenance, it’s amazing. They show their wear over time, but they still operate. It’s super exciting; it’s very inspirational to see.

[ Mars 2020 Rover ]


#435716 Watch This Drone Explode Into Maple Seed ...

As useful as conventional fixed-wing and quadrotor drones have become, they still tend to be relatively complicated, expensive machines that you really want to be able to use more than once. When a one-way trip is all that you have in mind, you want something simple, reliable, and cheap, and we’ve seen a bunch of different designs for drone gliders that more or less fulfill those criteria.

For an even simpler gliding design, you want to minimize both airframe mass and control surfaces, and the maple tree provides some inspiration in the form of samaras, those distinctive seed pods that whirl to the ground in the fall. A samara is essentially just an unbalanced wing that spins, and while the natural ones don’t steer, adding an actuated flap to the robotic version and moving it at just the right time results in enough controllability to aim for a specific point on the ground.

Roboticists at the Singapore University of Technology and Design (SUTD) have been experimenting with samara-inspired drones, and in a new paper in IEEE Robotics and Automation Letters they explore what happens if you attach five of the drones together and then separate them in midair.

Image: Singapore University of Technology and Design

The drone with all five wings attached (top left), and details of the individual wings: (a) smaller 44.9-gram wing for semi-indoor testing; (b) larger 83.4-gram wing able to carry a Pixracer, GPS, and magnetometer for directional control experiments.

Fundamentally, a samara design acts as a decelerator for an aerial payload. You can think of it like a parachute: It makes sure that whatever you toss out of an airplane gets to the ground intact rather than just smashing itself to bits on impact. Steering is possible, but you don’t get a lot of stability or precision control. The RA-L paper describes one solution to this, which is to collaboratively use five drones at once in a configuration that looks a bit like a helicopter rotor.

And once the multi-drone is right where you want it, the five individual samara drones can split off all at once, heading out on their own missions. It's quite a sight:

The concept features a collaborative autorotation in the initial stage of drop whereby several wings are attached to each other to form a rotor hub. The combined form achieves higher rotational energy and a collaborative control strategy is possible. Once closer to the ground, they can exit the collaborative form and continue to descend to unique destinations. A section of each wing forms a flap and a small actuator changes its pitch cyclically. Since all wing-flaps can actuate simultaneously in collaborative mode, better maneuverability is possible, hence higher resistance against environmental conditions. The vertical and horizontal speeds can be controlled to a certain extent, allowing it to navigate towards a target location and land softly.
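
To unpack what "cyclically" means here: the flap is deflected as a function of the wing's azimuth, so extra lift or drag appears on the same side of the rotation every revolution and the spinning wing drifts toward a chosen heading. The snippet below is our own illustration of that idea; the actual control law is in the RA-L paper.

import math

FLAP_AMPLITUDE_DEG = 15.0   # hypothetical maximum flap deflection

def cyclic_flap_command(azimuth_rad, desired_heading_rad):
    """Flap deflection for the current wing azimuth, peaking along the desired heading."""
    return FLAP_AMPLITUDE_DEG * math.cos(azimuth_rad - desired_heading_rad)

# One revolution sampled every 45 degrees, steering toward an azimuth of 90 degrees
for step in range(8):
    azimuth = step * math.pi / 4
    print(round(math.degrees(azimuth)), "deg ->",
          round(cyclic_flap_command(azimuth, math.pi / 2), 1), "deg flap")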

The samara autorotating wing drones themselves could conceivably carry small payloads like sensors or emergency medical supplies, with the small-scale versions in the video able to handle an extra 30 grams of payload. While they might not have as much capacity as a traditional fixed-wing glider, they have the advantage of being able to descend vertically, and can perform better than a parachute due to their ability to steer. The researchers plan on improving the design of their little drones, with the goal of increasing the rotation speed and improving the control performance of both the individual drones and the multi-wing collaborative version.

“Dynamics and Control of a Collaborative and Separating Descent of Samara Autorotating Wings,” by Shane Kyi Hla Win, Luke Soe Thura Win, Danial Sufiyan, Gim Song Soh, and Shaohui Foong from Singapore University of Technology and Design, appears in the current issue of IEEE Robotics and Automation Letters.
[ SUTD ]



#435714 Universal Robots Introduces Its ...

Universal Robots, already the dominant force in collaborative robots, is flexing its muscles in an effort to further expand its reach in the cobots market. The Danish company is introducing today the UR16e, its strongest robotic arm yet, with a payload capability of 16 kilograms (35.3 lbs), reach of 900 millimeters, and repeatability of +/- 0.05 mm.

Universal says the new “heavy duty payload cobot” will allow customers to automate a broader range of processes, including packaging and palletizing, nut and screw driving, and high-payload and CNC machine tending.

In early 2015, Universal introduced the UR3, its smallest robot, which joined the UR5 and the flagship UR10, offering a payload capability of 3, 5, and 10 kg, respectively. Now the company is going in the other direction, announcing a bigger, stronger arm.

“With Universal joining its competitors in extending the reach and payload capacity of its cobots, a new standard of capability is forming,” Rian Whitton, a senior analyst at ABI Research, in London, tweeted.

Like its predecessors, the UR16e is part of Universal’s e-Series platform, which features 6 degrees of freedom and force/torque sensing on the tool flange. The UR family of cobots has stood out from the competition by being versatile in a variety of applications and, most important, easy to deploy and program. Universal didn’t release the UR16e’s price, saying only that it is about 10 percent higher than that of the UR10e, which is about $50,000, depending on the configuration.

Jürgen von Hollen, president of Universal Robots, says the company decided to launch the UR16e after studying the market and talking to customers about their needs. “What came out of that process is we understood payload was a true barrier for a lot of customers,” he tells IEEE Spectrum. The 16 kg payload will be particularly useful for applications that require mounting specialized tools on the arm to perform tasks like screw driving and machine tending, he explains. Customers that could benefit from such applications include manufacturing, material handling, and automotive companies.

“We’ve added the payload, and that will open up that market for us,” von Hollen says.

The difference between Universal and Rethink

Universal has grown by leaps and bounds since its founding in 2008. By 2015, it had sold more than 5,000 robots; that number was close to 40,000 as of last year. During the same period, revenue more than doubled from about $100 million to $234 million. At a time when a string of robot makers have shuttered, including most notably Rethink Robotics, a cobots pioneer and Universal’s biggest rival, Universal finds itself in an enviable position, having amassed a commanding market share, estimated at between 50 and 60 percent.

About Rethink, von Hollen says the Boston-based company was a “good competitor,” helping disseminate the advantages and possibilities of cobots. “When Rethink basically ended it was more of a negative than a positive, from my perspective,” he says. In his view, a major difference between the two companies is that Rethink focused on delivering full-fledged applications to customers, whereas Universal focused on delivering a product to the market and letting the system integrators and sales partners deploy the robots to the customer base.

“We’ve always been very focused on delivering the product, whereas I think Rethink was much more focused on applications, very early on, and they added a level of complexity to their company that made it become very de-focused,” he says.

The collaborative robots market: massive growth

And yet, despite its success, Universal is still tiny when you compare it to the giants of industrial automation, which include companies like ABB, Fanuc, Yaskawa, and Kuka, with revenue in the billions of dollars. Although some of these companies have added cobots to their product portfolios—ABB’s YuMi, for example—that market represents a drop in the bucket when you consider global robot sales: The size of the cobots market was estimated at $700 million in 2018, whereas the global market for industrial robot systems (including software, peripherals, and system engineering) is close to $50 billion.

Von Hollen notes that cobots are expected to go through an impressive growth curve—nearly 50 percent year after year until 2025, when sales will reach between $9 billion and $12 billion. If Universal can maintain its dominance and capture a big slice of that market, it’ll add up to a nice sum. To get there, Universal is not alone: It is backed by U.S. electronics testing equipment maker Teradyne, which acquired Universal in 2015 for $285 million.

“The amount of resources we invest year over year matches the growth we had on sales,” von Hollen says. Universal currently has more than 650 employees, most based at its headquarters in Odense, Denmark, and the rest scattered in 27 offices in 18 countries. “No other company [in the cobots segment] is so focused on one product.”

[ Universal Robots ]


#435707 AI Agents Startle Researchers With ...

After 25 million games, the AI agents playing hide-and-seek with each other had mastered four basic game strategies. The researchers expected that part.

After a total of 380 million games, the AI players developed strategies that the researchers didn’t know were possible in the game environment—which the researchers had themselves created. That was the part that surprised the team at OpenAI, a research company based in San Francisco.

The AI players learned everything via a machine learning technique known as reinforcement learning. In this learning method, AI agents start out by taking random actions. Sometimes those random actions produce desired results, which earn them rewards. Via trial-and-error on a massive scale, they can learn sophisticated strategies.

In the context of games, this process can be abetted by having the AI play against another version of itself, ensuring that the opponents will be evenly matched. It also locks the AI into a process of one-upmanship, where any new strategy that emerges forces the opponent to search for a countermeasure. Over time, this “self-play” amounted to what the researchers call an “auto-curriculum.”
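
A toy version of that loop looks something like the sketch below (nothing like OpenAI's actual hide-and-seek setup, just an illustration of self-play): two agents repeatedly play a tiny hider/seeker guessing game, each learning only from its own reward, so any improvement by one side changes the problem the other side has to solve.

import random

ACTIONS = [0, 1, 2]                    # e.g. three hiding spots / three spots to search

def make_agent():
    return {a: 0.0 for a in ACTIONS}   # value estimate per action

def choose(agent, epsilon=0.1):
    if random.random() < epsilon:      # explore with a random action
        return random.choice(ACTIONS)
    return max(agent, key=agent.get)   # otherwise exploit the current estimates

def update(agent, action, reward, lr=0.05):
    agent[action] += lr * (reward - agent[action])   # running-average value update

hider, seeker = make_agent(), make_agent()
for episode in range(20000):
    h, s = choose(hider), choose(seeker)
    seeker_reward = 1.0 if h == s else 0.0           # the seeker wins by finding the hider
    update(seeker, s, seeker_reward)
    update(hider, h, 1.0 - seeker_reward)            # the hider wins by staying hidden

print("hider values: ", {a: round(v, 2) for a, v in hider.items()})
print("seeker values:", {a: round(v, 2) for a, v in seeker.items()})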

According to OpenAI researcher Igor Mordatch, this experiment shows that self-play “is enough for the agents to learn surprising behaviors on their own—it’s like children playing with each other.”

Reinforcement learning is a hot field of AI research right now. OpenAI’s researchers used the technique when they trained a team of bots to play the video game Dota 2, which squashed a world-champion human team last April. The Alphabet subsidiary DeepMind has used it to triumph in the ancient board game Go and the video game StarCraft.

Aniruddha Kembhavi, a researcher at the Allen Institute for Artificial Intelligence (AI2) in Seattle, says games such as hide-and-seek offer a good way for AI agents to learn “foundational skills.” He worked on a team that taught their AllenAI to play Pictionary with humans, viewing the gameplay as a way for the AI to work on common sense reasoning and communication. “We are, however, quite far away from being able to translate these preliminary findings in highly simplified environments into the real world,” says Kembhavi.

Illustration: OpenAI

AI agents construct a fort during a hide-and-seek game developed by OpenAI.

In OpenAI’s game of hide-and-seek, both the hiders and the seekers received a reward only if they won the game, leaving the AI players to develop their own strategies. Within a simple 3D environment containing walls, blocks, and ramps, the players first learned to run around and chase each other (strategy 1). The hiders next learned to move the blocks around to build forts (2), and then the seekers learned to move the ramps (3), enabling them to jump inside the forts. Then the hiders learned to move all the ramps into their forts before the seekers could use them (4).

The two strategies that surprised the researchers came next. First the seekers learned that they could jump onto a box and “surf” it over to a fort (5), allowing them to jump in—a maneuver that the researchers hadn’t realized was physically possible in the game environment. So as a final countermeasure, the hiders learned to lock all the boxes into place (6) so they weren’t available for use as surfboards.

Illustration: OpenAI

An AI agent uses a nearby box to surf its way into a competitor’s fort.

In this circumstance, having AI agents behave in an unexpected way wasn’t a problem: They found different paths to their rewards, but didn’t cause any trouble. However, you can imagine situations in which the outcome would be rather serious. Robots acting in the real world could do real damage. And then there’s Nick Bostrom’s famous example of a paper clip factory run by an AI, whose goal is to make as many paper clips as possible. As Bostrom told IEEE Spectrum back in 2014, the AI might realize that “human bodies consist of atoms, and those atoms could be used to make some very nice paper clips.”

Bowen Baker, another member of the OpenAI research team, notes that it’s hard to predict all the ways an AI agent will act inside an environment—even a simple one. “Building these environments is hard,” he says. “The agents will come up with these unexpected behaviors, which will be a safety problem down the road when you put them in more complex environments.”

AI researcher Katja Hofmann at Microsoft Research Cambridge, in England, has seen a lot of gameplay by AI agents: She started a competition that uses Minecraft as the playing field. She says the emergent behavior seen in this game, and in prior experiments by other researchers, shows that games can be useful for studies of safe and responsible AI.

“I find demonstrations like this, in games and game-like settings, a great way to explore the capabilities and limitations of existing approaches in a safe environment,” says Hofmann. “Results like these will help us develop a better understanding of how to validate and debug reinforcement learning systems–a crucial step on the path towards real-world applications.”

Baker says there’s also a hopeful takeaway from the surprises in the hide-and-seek experiment. “If you put these agents into a rich enough environment they will find strategies that we never knew were possible,” he says. “Maybe they can solve problems that we can’t imagine solutions to.”
