
#439012 Video Friday: Man-Machine Synergy ...

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):

RoboSoft 2021 – April 12-16, 2021 – [Online Conference]
ICRA 2021 – May 30–June 5, 2021 – Xi'an, China
DARPA SubT Finals – September 21-23, 2021 – Louisville, KY, USA
WeRobot 2021 – September 23-25, 2021 – Coral Gables, FL, USA
Let us know if you have suggestions for next week, and enjoy today's videos.

Man-Machine Synergy Effectors, Inc. is a Japanese company working on an absolutely massive “human machine synergistic effect device,” which is a huge robot controlled by a nearby human using a haptic rig.

From the look of things, the next generation will be able to move around. Whoa.

[ MMSE ]

This method of loading and unloading AMRs without having them ever stop moving is so obvious that there must be some equally obvious reason why I've never seen it done in practice.

The LoadRunner is able to transport and sort parcels weighing up to 30 kilograms. This makes it the perfect luggage carrier for airports. These AI-driven go-carts can also work in concert as larger collectives to carry large, heavy, and bulky objects. Every LoadRunner can also haul up to four passive trailers. Powered by four electric motors, the LoadRunner brakes sharply at just the right moment in front of its destination, and the payload slides from the robot onto the delivery platform.
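
The unloading trick is ordinary Newtonian mechanics: the parcel slides forward when the robot decelerates harder than friction can hold it back. As a rough back-of-the-envelope check (the friction coefficient below is an assumption for illustration, not a Fraunhofer spec):

```latex
a_{\text{brake}} > \mu g,
\qquad
\mu \approx 0.3 \;\Rightarrow\; a_{\text{brake}} > 0.3 \times 9.81\ \mathrm{m/s^2} \approx 2.9\ \mathrm{m/s^2}
```

Any braking deceleration above that threshold, timed so the parcel's slide ends at the edge of the delivery platform, does the job.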

[ Fraunhofer ] via [ Gizmodo ]

Ayato Kanada at Kyushu University wrote in to share this clever “dislocatable joint,” a way of combining continuum and rigid robots.

[ Paper ]

Thanks Ayato!

The DodgeDrone challenge revisits the popular dodgeball game in the context of autonomous drones. Specifically, participants will have to code navigation policies to fly drones between waypoints while avoiding dynamic obstacles. Drones are fast but fragile systems: as soon as something hits them, they will crash! Since objects will move towards the drone with different speeds and acceleration, smart algorithms are required to avoid them!

This could totally happen in real life, and we need to be prepared for it!

[ DodgeDrone Challenge ]

In addition to winning the Best Student Design Competition CREATIVITY Award at HRI 2021, this paper would also have won the Best Paper Title award, if that award existed.

[ Paper ]

Robots are traditionally bound by a fixed morphology during their operational lifetime, limiting them to adapting only their control strategies. Here we present the first quadrupedal robot that can morphologically adapt to different environmental conditions in outdoor, unstructured environments.

We show that the robot exploits its training to effectively transition between different morphological configurations, exhibiting substantial performance improvements over a non-adaptive approach. The benefits of real-world morphological adaptation demonstrated here point to the potential for a new embodied way of incorporating adaptation into future robotic designs.

[ Nature ]

A drone video shot in a Minneapolis bowling alley was hailed as an instant classic. One Hollywood veteran said it “adds to the language and vocabulary of cinema.” One IEEE Spectrum editor said “hey that's pretty cool.”

[ Bryant Lake Bowl ]

It doesn't take a robot to convince me to buy candy, but I think if I buy candy from Relay it's a business expense, right?

[ RIS ]

DARPA is making progress on its AI dogfighting program, with physical flight tests expected this year.

[ DARPA ACE ]

Unitree Robotics has realized that the Empire needs to be overthrown!

[ Unitree ]

Windhover Labs, an emerging leader in open and reliable flight software and hardware, announces the upcoming availability of its first hardware product, a low-cost modular flight computer for commercial drones and small satellites.

[ Windhover ]

As robots and autonomous systems are poised to become part of our everyday lives, the University of Michigan and Ford are opening a one-of-a-kind facility where they’ll develop robots and roboticists that help make lives better, keep people safer and build a more equitable society.

[ U Michigan ]

The adaptive robot Rizon, combined with a new hybrid electrostatic and gecko-inspired gripping pad developed by Stanford BDML, can manipulate bulky, non-smooth items in the most effort-saving way, which broadens its applications in retail and household environments.

[ Flexiv ]

Thanks Yunfan!

I don't know why anyone would want things to get MORE icy, but if you do for some reason, you can make it happen with a Husky.

Is winter over yet?

[ Clearpath ]

Skip ahead to about 1:20 to see a pair of Gita robots following a Spot following a human, like a chain of lil’ robot ducklings.

[ PFF ]

Here are a couple of retro robotics videos, one showing teleoperated humanoids from 2000, and the other showing a robotic guide dog from 1976 (!).

[ Tachi Lab ]

Thanks Fan!

If you missed Chad Jenkins' talk “That Ain’t Right: AI Mistakes and Black Lives” last time, here's another opportunity to watch it, courtesy of Robotics Today, and it includes a top-notch panel discussion at the end.

[ Robotics Today ]

Since its founding in 1979, the Robotics Institute (RI) at Carnegie Mellon University has been leading the world in robotics research and education. In the mid-1990s, RI created NREC as the applied R&D center within the Institute, with a specific mission to apply robotics technology to real-world applications in an impactful way. In this talk, I will go over numerous R&D programs that I have led at NREC in the past 25 years.

[ CMU ]


#438738 This Week’s Awesome Tech Stories From ...

ARTIFICIAL INTELLIGENCE
A New Artificial Intelligence Makes Mistakes—on Purpose
Will Knight | Wired
“It took about 50 years for computers to eviscerate humans in the venerable game of chess. A standard smartphone can now play the kind of moves that make a grandmaster’s head spin. But one artificial intelligence program is taking a few steps backward, to appreciate how average humans play—blunders and all.”

CRYPTOCURRENCY
Bitcoin’s Price Rises to $50,000 as Mainstream Institutions Hop On
Timothy B. Lee | Ars Technica
“Bitcoin’s price is now far above the previous peak of $19,500 reached in December 2017. Bitcoin’s value has risen by almost 70 percent since the start of 2021. No single factor seems to be driving the cryptocurrency’s rise. Instead, the price is rising as more and more mainstream organizations are deciding to treat it as an ordinary investment asset.”

SCIENCE
Million-Year-Old Mammoth Teeth Contain Oldest DNA Ever Found
Jeanne Timmons | Gizmodo
“An international team of scientists has sequenced DNA from mammoth teeth that is at least a million years old, if not older. This research, published today in Nature, not only provides exciting new insight into mammoth evolutionary history, it reveals an entirely unknown lineage of ancient mammoth.”

SCIENCE
Scientists Accidentally Discover Strange Creatures Under a Half Mile of Ice
Matt Simon | Wired
“‘It’s like, bloody hell!’ Smith says. ‘It’s just one big boulder in the middle of a relatively flat seafloor. It’s not as if the seafloor is littered with these things.’ Just his luck to drill in the only wrong place. Wrong place for collecting seafloor muck, but the absolute right place for a one-in-a-million shot at finding life in an environment that scientists didn’t reckon could support much of it.”

BIOTECH
Highest-Resolution Images of DNA Reveal It’s Surprisingly Jiggly
George Dvorsky | Gizmodo
“Scientists have captured the highest-resolution images ever taken of DNA, revealing previously unseen twisting and squirming behaviors. …These hidden movements were revealed by computer simulations fed with the highest-resolution images ever taken of a single molecule of DNA. The new study is exposing previously unseen behaviors in the self-replicating molecule, and this research could eventually lead to the development of powerful new genetic therapies.”

TRANSPORTATION
The First Battery-Powered Tanker Is Coming to Tokyo
Maria Gallucci | IEEE Spectrum
“The Japanese tanker is Corvus’s first fully-electric coastal freighter project; the company hopes the e5 will be the first of hundreds more just like it. ‘We see it [as] a beachhead for the coastal shipping market globally,’ Puchalski said. ‘There are many other coastal freighter types that are similar in size and energy demand.’ The number of battery-powered ships has ballooned from virtually zero a decade ago to hundreds worldwide.”

SPACE
Report: NASA’s Only Realistic Path for Humans on Mars Is Nuclear Propulsion
Eric Berger | Ars Technica
“Conducted at the request of NASA, a broad-based committee of experts assessed the viability of two means of propulsion—nuclear thermal and nuclear electric—for a human mission launching to Mars in 2039. ‘One of the primary takeaways of the report is that if we want to send humans to Mars, and we want to do so repeatedly and in a sustainable way, nuclear space propulsion is on the path,’ said [JPL’s] Bobby Braun.”

NASA’s Perseverance Rover Successfully Lands on Mars
Joey Roulette | The Verge
“Perseverance hit Mars’ atmosphere on time at 3:48PM ET at speeds of about 12,100 miles per hour, diving toward the surface in an infamously challenging sequence engineers call the ‘seven minutes of terror.’ With an 11-minute comms delay between Mars and Earth, the spacecraft had to carry out its seven-minute plunge all by itself, with a wickedly complex set of pre-programmed instructions.”

ENVIRONMENT
A First-of-Its-Kind Geoengineering Experiment Is About to Take Its First Step
James Temple | MIT Technology Review
“When I visited Frank Keutsch in the fall of 2019, he walked me down to the lab, where the tube, wrapped in gray insulation, ran the length of a bench in the back corner. By filling it with the right combination of gases, at particular temperatures and pressures, Keutsch and his colleagues had simulated the conditions some 20 kilometers above Earth’s surface. In testing how various chemicals react in this rarefied air, the team hoped to conduct a crude test of a controversial scheme known as solar geoengineering.”

Image Credit: Garcia / Unsplash


#437809 Q&A: The Masterminds Behind ...

Illustration: iStockphoto

Getting a car to drive itself is undoubtedly the most ambitious commercial application of artificial intelligence (AI). The research project was kicked into life by the 2004 DARPA Grand Challenge and then taken up as a business proposition, first by Alphabet, and later by the big automakers.

The industry-wide effort vacuumed up many of the world’s best roboticists and set rival companies on a multibillion-dollar acquisitions spree. It also launched a cycle of hype that paraded ever more ambitious deadlines—the most famous of which, made by Google cofounder Sergey Brin in 2012, was that full self-driving technology would be ready by 2017. Those deadlines have all been missed.

Much of the exhilaration was inspired by the seeming miracles that a new kind of AI—deep learning—was achieving in playing games, recognizing faces, and transcribing speech. Deep learning excels at tasks involving pattern recognition—a particular challenge for older, rule-based AI techniques. However, it now seems that deep learning will not soon master the other intellectual challenges of driving, such as anticipating what human beings might do.

Among the roboticists who have been involved from the start are Gill Pratt, the chief executive officer of Toyota Research Institute (TRI), formerly a program manager at the Defense Advanced Research Projects Agency (DARPA); and Wolfram Burgard, vice president of automated driving technology for TRI and president of the IEEE Robotics and Automation Society. The duo spoke with IEEE Spectrum’s Philip Ross at TRI’s offices in Palo Alto, Calif.

This interview has been condensed and edited for clarity.

IEEE Spectrum: How does AI handle the various parts of the self-driving problem?

Photo: Toyota

Gill Pratt

Gill Pratt: There are three different systems that you need in a self-driving car: It starts with perception, then goes to prediction, and then goes to planning.

The one that by far is the most problematic is prediction. It’s not prediction of other automated cars, because if all cars were automated, this problem would be much simpler. How do you predict what a human being is going to do? That’s difficult for deep learning to learn right now.
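
Pratt’s three-stage decomposition maps naturally onto a software pipeline. Here is a minimal sketch of that structure in Python (the classes, numbers, and the constant-velocity prediction model are illustrative placeholders, not any company’s actual stack; the naive rollout in `predict` is exactly the kind of model that fails to capture human intent):

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class TrackedObject:
    x: float; y: float      # meters, ego frame (x forward)
    vx: float; vy: float    # m/s
    kind: str               # "pedestrian", "vehicle", ...

def perceive(raw_frame) -> List[TrackedObject]:
    # Stand-in for a learned detector + tracker; here we just pass through.
    return raw_frame

def predict(objs: List[TrackedObject], horizon=3.0, dt=0.5) -> Dict[int, List[Tuple[float, float]]]:
    # Naive constant-velocity rollout -- will the teenager jaywalk? This
    # model has no idea; it only extrapolates what it has already seen.
    steps = int(horizon / dt)
    return {i: [(o.x + o.vx * k * dt, o.y + o.vy * k * dt) for k in range(1, steps + 1)]
            for i, o in enumerate(objs)}

def plan(ego_speed: float, predictions) -> float:
    # Brake to zero if any predicted position enters a 2 m-wide corridor ahead.
    danger = any(abs(y) < 1.0 and 0.0 < x < 30.0
                 for path in predictions.values() for (x, y) in path)
    return 0.0 if danger else ego_speed

frame = [TrackedObject(x=15.0, y=4.0, vx=0.0, vy=-1.5, kind="pedestrian")]
print(plan(10.0, predict(perceive(frame))))   # 0.0 -> pedestrian will cross
```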

Spectrum: Can you offset the weakness in prediction with stupendous perception?

Photo: Toyota Research Institute

Wolfram Burgard

Wolfram Burgard: Yes, that is what car companies basically do. A camera provides semantics, lidar provides distance, radar provides velocities. But all this comes with problems, because sometimes you look at the world from different positions—that’s called parallax. Sometimes you don’t know which range estimate that pixel belongs to. That might make the decision complicated as to whether that is a person painted onto the side of a truck or whether this is an actual person.

With deep learning there is this promise that if you throw enough data at these networks, it’s going to work—finally. But it turns out that the amount of data that you need for self-driving cars is far larger than we expected.

Spectrum: When do deep learning’s limitations become apparent?

Pratt: The way to think about deep learning is that it’s really high-performance pattern matching. You have input and output as training pairs; you say this image should lead to that result; and you just do that again and again, hundreds of thousands or millions of times.

Here’s the logical fallacy that I think most people have fallen prey to with deep learning. A lot of what we do with our brains can be thought of as pattern matching: “Oh, I see this stop sign, so I should stop.” But it doesn’t mean all of intelligence can be done through pattern matching.

“I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur?”
—Gill Pratt, Toyota Research Institute

For instance, when I’m driving and I see a mother holding the hand of a child on a corner and trying to cross the street, I am pretty sure she’s not going to cross at a red light and jaywalk. I know from my experience being a human being that mothers and children don’t act that way. On the other hand, say there are two teenagers—with blue hair, skateboards, and a disaffected look. Are they going to jaywalk? I look at that, you look at that, and instantly the probability in your mind that they’ll jaywalk is much higher than for the mother holding the hand of the child. It’s not that you’ve seen 100,000 cases of young kids—it’s that you understand what it is to be either a teenager or a mother holding a child’s hand.

You can try to fake that kind of intelligence. If you specifically train a neural network on data like that, you could pattern-match that. But you’d have to know to do it.
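
Concretely, “train a neural network on data like that” means assembling labeled input-output pairs and fitting a model to them. A toy sketch of that training-pair idea, with invented features and labels (everything here is hypothetical; the point is the shape of the data, not the model):

```python
import numpy as np

# Hypothetical training pairs: [is_teenager, holding_child_hand] -> jaywalked?
X = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 0], [0, 1]], dtype=float)
y = np.array([1, 1, 0, 0, 0, 0], dtype=float)   # invented labels

w, b = np.zeros(2), 0.0
for _ in range(2000):                            # plain gradient descent
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))       # sigmoid probabilities
    grad_w, grad_b = X.T @ (p - y) / len(y), np.mean(p - y)
    w -= 0.5 * grad_w; b -= 0.5 * grad_b

teen, mother = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(1 / (1 + np.exp(-(teen @ w + b))), 1 / (1 + np.exp(-(mother @ w + b))))
# Higher jaywalk probability for the teenager -- but only because the
# (made-up) data said so; the model has no notion of *why*.
```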

Spectrum: So you’re saying that when you substitute pattern recognition for reasoning, the marginal return on the investment falls off pretty fast?

Pratt: That’s absolutely right. Unfortunately, we don’t have the ability to make an AI that thinks yet, so we don’t know what to do. We keep trying to use the deep-learning hammer to hammer more nails—we say, well, let’s just pour more data in, and more data.

Spectrum: Couldn’t you train the deep-learning system to recognize teenagers and to assign the category a high propensity for jaywalking?

Burgard: People have been doing that. But it turns out that these heuristics you come up with are extremely hard to tweak. Also, sometimes the heuristics are contradictory, which makes it extremely hard to design these expert systems based on rules. This is where the strength of the deep-learning methods lies, because somehow they encode a way to see a pattern where, for example, here’s a feature and over there is another feature; it’s about the sheer number of parameters you have available.

Our separation of the components of a self-driving AI eases the development and even the learning of the AI systems. Some companies even think about using deep learning to do the job fully, from end to end, not having any structure at all—basically, directly mapping perceptions to actions.

Pratt: There are companies that have tried it; Nvidia certainly tried it. In general, it’s been found not to work very well. So people divide the problem into blocks, where we understand what each block does, and we try to make each block work well. Some of the blocks end up more like the expert system we talked about, where we actually code things, and other blocks end up more like machine learning.

Spectrum: So, what’s next—what new technique is in the offing?

Pratt: If I knew the answer, we’d do it. [Laughter]

Spectrum: You said that if all cars on the road were automated, the problem would be easy. Why not “geofence” the heck out of the self-driving problem, and have areas where only self-driving cars are allowed?

Pratt: That means putting in constraints on the operational design domain. This includes the geography—where the car should be automated; it includes the weather, it includes the level of traffic, it includes speed. If the car is going slow enough to avoid colliding without risking a rear-end collision, that makes the problem much easier. Street trolleys still operate in mixed traffic in some parts of the world, and that seems to work out just fine. People learn that this vehicle may stop at unexpected times. My suspicion is that this is where we’ll see Level 4 autonomy in cities. It’s going to be at the lower speeds.

“We are now in the age of deep learning, and we don’t know what will come after.”
—Wolfram Burgard, Toyota Research Institute

That’s a sweet spot in the operational design domain, without a doubt. There’s another one at high speed on a highway, because access to highways is so limited. But unfortunately there is still the occasional debris that suddenly crosses the road, and the weather gets bad. The classic example is when somebody irresponsibly ties a mattress to the top of a car and it falls off; what are you going to do? And the answer is that terrible things happen—even for humans.

Spectrum: Learning by doing worked for the first cars, the first planes, the first steam boilers, and even the first nuclear reactors. We ran risks then; why not now?

Pratt: It has to do with the times. During the era where cars took off, all kinds of accidents happened, women died in childbirth, all sorts of diseases ran rampant; the expected characteristic of life was that bad things happened. Expectations have changed. Now the chance of dying in some freak accident is quite low because of all the learning that’s gone on, the OSHA [Occupational Safety and Health Administration] rules, UL code for electrical appliances, all the building standards, medicine.

Furthermore—and we think this is very important—we believe that empathy for a human being at the wheel is a significant factor in public acceptance when there is a crash. We don’t know this for sure—it’s a speculation on our part. I’ve driven, I’ve had close calls; that could have been me that made that mistake and had that wreck. I think people are more tolerant when somebody else makes mistakes, and there’s an awful crash. In the case of an automated car, we worry that that empathy won’t be there.

Photo: Toyota

Toyota is using this Platform 4 automated driving test vehicle, based on the Lexus LS, to develop Level 4 self-driving capabilities for its “Chauffeur” project.

Spectrum: Toyota is building a system called Guardian to back up the driver, and a more futuristic system called Chauffeur, to replace the driver. How can Chauffeur ever succeed? It has to be better than a human plus Guardian!

Pratt: In the discussions we’ve had with others in this field, we’ve talked about that a lot. What is the standard? Is it a person in a basic car? Or is it a person with a car that has active safety systems in it? And what will people think is good enough?

These systems will never be perfect—there will always be some accidents, and no matter how hard we try there will still be occasions where there will be some fatalities. At what threshold are people willing to say that’s okay?

Spectrum: You were among the first top researchers to warn against hyping self-driving technology. What did you see that so many other players did not?

Pratt: First, in my own case, during my time at DARPA I worked on robotics, not cars. So I was somewhat of an outsider. I was looking at it from a fresh perspective, and that helps a lot.

Second, [when I joined Toyota in 2015] I was joining a company that is very careful, even though we have made some giant leaps—the Prius hybrid drive system being one example. Even so, in general, the philosophy at Toyota is kaizen—making the cars incrementally better every single day. That care meant that I was tasked with thinking very deeply about this thing before making prognostications.

And the final part: It was a new job for me. The first night after I signed the contract I felt this incredible responsibility. I couldn’t sleep that whole night, so I started to multiply out the numbers, all in powers of 10. How many cars do we have on the road? Cars on average last 10 years, though ours last 20, but let’s call it 10. They travel on the order of 10,000 miles per year. Multiply all that out and you get 10 to the 10th miles per year for our fleet on Planet Earth, a really big number. I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur? And the answer was so incredibly good that I knew it would take a long time. That was five years ago.
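
It’s worth making the last step of that arithmetic explicit. If the fleet logs M miles per year and the public would tolerate at most F serious crashes per year from it, the automated driver’s failure rate has to satisfy the bound below (the tolerance figure is a hypothetical plug-in, not a number Pratt stated):

```latex
\text{rate} \;\le\; \frac{F}{M}
\;=\; \frac{10^{2}\ \text{crashes/yr}}{10^{10}\ \text{miles/yr}}
\;=\; 10^{-8}\ \text{crashes per mile}
```

One failure per hundred million miles is roughly the ballpark of human fatal-crash rates, which is why the required answer looked “so incredibly good.”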

Burgard: We are now in the age of deep learning, and we don’t know what will come after. We are still making progress with existing techniques, and they look very promising. But the gradient is not as steep as it was a few years ago.

Pratt: There isn’t anything that’s telling us that it can’t be done; I should be very clear on that. Just because we don’t know how to do it doesn’t mean it can’t be done.


#437630 How Toyota Research Envisions the Future ...

Yesterday, the Toyota Research Institute (TRI) showed off some of the projects that it’s been working on recently, including a ceiling-mounted robot that could one day help us with household chores. That system is just one example of how TRI envisions the future of robotics and artificial intelligence. As TRI CEO Gill Pratt told us, the company is focusing on robotics and AI technology for “amplifying, rather than replacing, human beings.” In other words, Toyota wants to develop robots not for convenience or to do our jobs for us, but rather to allow people to continue to live and work independently even as we age.

To better understand Toyota’s vision of robotics 15 to 20 years from now, it’s worth watching the 20-minute video below, which depicts various scenarios “where the application of robotic capabilities is enabling members of an aging society to live full and independent lives in spite of the challenges that getting older brings.” It’s a long video, but it helps explain TRI’s perspective on how robots will collaborate with humans in our daily lives over the next couple of decades.

Those are some interesting conceptual telepresence-controlled bipeds they’ve got running around in that video, right?

For more details, we sent TRI some questions on how it plans to go from concepts like the ones shown in the video to real products that can be deployed in human environments. Below are answers from TRI CEO Gill Pratt, who is also chief scientist for Toyota Motor Corp.; Steffi Paepcke, senior UX designer at TRI; and Max Bajracharya, VP of robotics at TRI.

IEEE Spectrum: TRI seems to have a more explicit focus on eventual commercialization than most of the robotics research that we cover. At what point does TRI start to think about things like reliability and cost?

Photo: TRI

Toyota is exploring robots capable of manipulating dishes in a sink and a dishwasher, performing experiments and simulations to make sure that the robots can handle a wide range of conditions.

Gill Pratt: It’s a really interesting question, because the normal way to think about this would be to say, well, both reliability and cost are product development tasks. But actually, we need to think about it at the earliest possible stage with research as well. The hardware that we use in the laboratory for doing experiments, we don’t worry about cost there, or not nearly as much as you’d worry about for a product. However, in terms of what research we do, we very much have to think about, is it possible (if the research is successful) for it to end up in a product that has a reasonable cost. Because if a customer can’t afford what we come up with, maybe it has some academic value but it’s not actually going to make a difference in their quality of life in the real world. So we think about cost very much from the beginning.

The same is true with reliability. Right now, we’re working very hard to make our control techniques robust to wide variations in the environment. For instance, in work that Russ Tedrake is doing with manipulating dishes in a sink and a dishwasher, both in physical testing and in simulation, we’re doing thousands and now millions of different experiments to make sure that we can handle the edge cases and it works over a very wide range of conditions.
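
The value of those millions of runs comes from randomizing the simulated conditions on every trial and mining the failures. A minimal sketch of that kind of robustness sweep (the parameters and failure model are hypothetical stand-ins, not TRI’s or Tedrake’s actual tooling):

```python
import random

def run_dish_trial(friction: float, dish_mass: float, grasp_offset: float) -> bool:
    """Stand-in for one simulated pick; returns True on success.
    A real run would call the physics simulator here."""
    # Hypothetical failure model: slippery, heavy, badly-grasped dishes fail.
    return friction * 2.0 - dish_mass * 0.3 - abs(grasp_offset) * 5.0 > 0.1

def robustness_sweep(n_trials: int = 100_000) -> float:
    failures = []
    for _ in range(n_trials):
        params = dict(friction=random.uniform(0.1, 0.9),     # wet vs. dry dish
                      dish_mass=random.uniform(0.1, 1.5),    # kg
                      grasp_offset=random.gauss(0.0, 0.02))  # m, perception error
        if not run_dish_trial(**params):
            failures.append(params)                          # keep the edge cases
    return 1.0 - len(failures) / n_trials

print(f"success rate: {robustness_sweep():.3f}")
```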

A tremendous amount of work that we do is trying to bring robotics out of the age of doing demonstrations. There’s been a history of robotics where for some time, things have not been reliable, so we’d catch the robot succeeding just once and then show that video to the world, and people would get the mis-impression that it worked all of the time. Some researchers have been very good about showing the blooper reel too, to show that some of the time, robots don’t work.

“A tremendous amount of work that we do is trying to bring robotics out of the age of doing demonstrations. There’s been a history of robotics where for some time, things have not been reliable, so we’d catch the robot succeeding just once and then show that video to the world, and people would get the mis-impression that it worked all of the time.”
—Gill Pratt, TRI

In the spirit of sharing things that didn’t work, can you tell us a bit about some of the robots that TRI has had under development that didn’t make it into the demo yesterday because they were abandoned along the way?

Steffi Paepcke: We’re really looking at how we can connect people; it can be hard to stay in touch and see our loved ones as much as we would like to. There have been a few prototypes that we’ve worked on that had to be put on the shelf, at least for the time being. We were exploring how to use light so that people could be ambiently aware of one another across distances. I was very excited about that—the internal name was “glowing orb.” For a variety of reasons, it didn’t work out, but it was really fascinating to investigate different modalities for keeping in touch.

Another prototype we worked on—we found through our research that grocery shopping is obviously an important part of life, and for a lot of older adults, it’s not necessarily the right answer to always have groceries delivered. Getting up and getting out of the house keeps you physically active, and a lot of people prefer to continue doing it themselves. But it can be challenging, especially if you’re purchasing heavy items that you need to transport. We had a prototype that assisted with grocery shopping, but when we pivoted our focus to Japan, we found that the inside of a Japanese home really needs to stay inside, and the outside needs to stay outside, so a robot that traverses both domains is probably not the right fit for a Japanese audience, and those were some really valuable lessons for us.

Photo: TRI

Toyota recently demonstrated a gantry robot that would hang from the ceiling to perform tasks like wiping surfaces and clearing clutter.

I love that TRI is exploring things like the gantry robot both in terms of near-term research and as part of its long-term vision, but is a robot like this actually worth pursuing? Or more generally, what’s the right way to compromise between making an environment robot friendly, and asking humans to make changes to their homes?

Max Bajracharya: We think a lot about the problems that we’re trying to address in a holistic way. We don’t want to just give people a robot, and assume that they’re not going to change anything about their lifestyle. We have a lot of evidence from people who use automated vacuum cleaners that people will adapt to the tools you give them, and they’ll change their lifestyle. So we want to think about what is that trade between changing the environment, and giving people robotic assistance and tools.

We certainly think that there are ways to make the gantry system plausible. The one you saw today is obviously a prototype and does require significant infrastructure. If we’re going to retrofit a home, that isn’t going to be the way to do it. But we still feel like we’re very much in the prototype phase, where we’re trying to understand whether this is worth it to be able to bypass navigation challenges, and coming up with the pros and cons of the gantry system. We’re evaluating whether we think this is the right approach to solving the problem.

To what extent do you think humans should be either directly or indirectly in the loop with home and service robots?

Bajracharya: Our goal is to amplify people, so achieving this is going to require robots to be in a loop with people in some form. One thing we have learned is that using people in a slow loop with robots, such as teaching them or helping them when they make mistakes, gives a robot an important advantage over one that has to do everything perfectly 100 percent of the time. In unstructured human environments, robots are going to encounter corner cases, and are going to need to learn to adapt. People will likely play an important role in helping the robots learn.


#437624 AI-Powered Drone Learns Extreme ...

Quadrotors are among the most agile and dynamic machines ever created. In the hands of a skilled human pilot, they can pull off astonishing sequences of maneuvers. And while autonomous flying robots have been getting better at flying dynamically in real-world environments, they still haven’t demonstrated the same level of agility as manually piloted ones.

Now researchers from the Robotics and Perception Group at the University of Zurich and ETH Zurich, in collaboration with Intel, have developed a neural network training method that “enables an autonomous quadrotor to fly extreme acrobatic maneuvers with only onboard sensing and computation.” Extreme.

There are two notable things here: First, the quadrotor can do these extreme acrobatics outdoors without any kind of external camera or motion-tracking system to help it out (all sensing and computing is onboard). Second, all of the AI training is done in simulation, without the need for an additional simulation-to-real-world (what researchers call “sim-to-real”) transfer step. Usually, a sim-to-real transfer step means putting your quadrotor into one of those aforementioned external tracking systems, so that it doesn’t completely bork itself while trying to reconcile the differences between the simulated world and the real world, where, as the researchers wrote in a paper describing their system, “even tiny mistakes can result in catastrophic outcomes.”

To enable “zero-shot” sim-to-real transfer, the neural net training in simulation uses an expert controller that knows exactly what’s going on to teach a “student controller” that has much less perfect knowledge. That is, the simulated sensory input that the student ends up using as it learns to follow the expert has been abstracted to present the kind of imperfect, imprecise data it’s going to encounter in the real world. This can involve things like abstracting away the image part of the simulation until you’d have no way of telling the difference between abstracted simulation and abstracted reality, which is what allows the system to make that sim-to-real leap.
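
In outline, this is privileged imitation learning: an expert controller with access to exact simulator state produces target actions, and the student network learns to reproduce them from abstracted observations alone. A condensed PyTorch sketch under those assumptions (the expert, the abstraction, and all dimensions are placeholders, not the authors’ implementation):

```python
import torch
import torch.nn as nn

student = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 4))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def expert_action(true_state):               # privileged: sees exact sim state
    return torch.tanh(true_state[..., :4])   # placeholder "optimal" controller

def abstract_obs(true_state):                # same function sim & real would use
    return true_state + 0.05 * torch.randn_like(true_state)  # imperfect view

for step in range(1000):                     # distillation loop
    true_state = torch.randn(32, 64)         # placeholder simulator states
    with torch.no_grad():
        target = expert_action(true_state)
    loss = nn.functional.mse_loss(student(abstract_obs(true_state)), target)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the student never sees anything but abstracted observations, the same network can be handed abstracted real-world measurements at deployment time.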

The simulation environment that the researchers used was Gazebo, slightly modified to better simulate quadrotor physics. Meanwhile, over in reality, a custom 1.5-kilogram quadrotor with a 4:1 thrust-to-weight ratio performed the physical experiments, using only an Nvidia Jetson TX2 computing board and an Intel RealSense T265, a dual fisheye camera module optimized for V-SLAM. To challenge the learning system, it was trained to perform three acrobatic maneuvers plus a combo of all of them:

Image: University of Zurich/ETH Zurich/Intel

Reference trajectories for acrobatic maneuvers. Top row, from left: Power Loop, Barrel Roll, and Matty Flip. Bottom row: Combo.

All of these maneuvers require high accelerations of up to 3 g’s and careful control, and the Matty Flip is particularly challenging, at least for humans, because the whole thing is done while the drone is flying backwards. Still, after just a few hours of training in simulation, the drone was totally real-world competent at these tricks, and could even extrapolate a little bit to perform maneuvers that it was not explicitly trained on, like doing multiple loops in a row. Where humans still have the advantage over drones (as you might expect, since we’re talking about robots) is in quickly reacting to novel or unexpected situations. And when you’re doing this sort of thing outdoors, novel and unexpected situations are everywhere, from a gust of wind to a jealous bird.
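
Those numbers hang together: a 4:1 thrust-to-weight ratio on a 1.5-kilogram vehicle leaves exactly 3 g of acceleration beyond what is needed to fight gravity (taking thrust aligned against gravity for simplicity):

```latex
T_{\max} = 4\,mg = 4 \times 1.5\ \text{kg} \times 9.81\ \mathrm{m/s^2} \approx 58.9\ \text{N},
\qquad
a_{\max} = \frac{T_{\max} - mg}{m} = (4 - 1)\,g = 3\,g
```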

For more details, we spoke with Antonio Loquercio from the University of Zurich’s Robotics and Perception Group.

IEEE Spectrum: Can you explain how the abstraction layer interfaces with the simulated sensors to enable effective sim-to-real transfer?

Antonio Loquercio: The abstraction layer applies a specific function to the raw sensor information. Exactly the same function is applied to the real and simulated sensors. The result of the function, which is “abstracted sensor measurements,” makes simulated and real observations of the same scene similar. For example, suppose we have a sequence of simulated and real images. We can very easily tell apart the real ones from the simulated ones, given the difference in rendering. But if we apply the abstraction function of “feature tracks,” which are point correspondences in time, it becomes very difficult to tell which feature tracks are simulated and which are real, since point correspondences are independent of the rendering. This applies to humans as well as to neural networks: Training policies on raw images gives low sim-to-real transfer (since images are too different between domains), while training on the abstracted images gives high transfer abilities.
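
A concrete way to picture the feature-track abstraction: run the same corner detector and tracker over consecutive frames and keep only the point correspondences. A rough sketch using OpenCV’s KLT tracker (the authors’ actual front end may differ; this is just one standard way to compute such tracks):

```python
import cv2
import numpy as np

def feature_tracks(prev_gray: np.ndarray, next_gray: np.ndarray) -> np.ndarray:
    """Abstract two frames into (x0, y0, x1, y1) point correspondences.
    Applied identically to simulated and real images, the rendering
    differences that betray 'sim vs. real' mostly disappear."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                 qualityLevel=0.01, minDistance=8)
    if p0 is None:
        return np.empty((0, 4), dtype=np.float32)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
    ok = status.ravel() == 1
    return np.hstack([p0[ok].reshape(-1, 2), p1[ok].reshape(-1, 2)])

# Works the same whether the frames came from Gazebo or the RealSense.
a = np.random.randint(0, 255, (240, 320), dtype=np.uint8)
b = np.roll(a, 2, axis=1)           # fake 2-pixel camera motion
print(feature_tracks(a, b).shape)   # (N, 4) tracks
```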

How useful is visual input from a camera like the Intel RealSense T265 for state estimation during such aggressive maneuvers? Would using an event camera substantially improve state estimation?

Our end-to-end controller does not require a state estimation module. It shares, however, some components with traditional state estimation pipelines, specifically the feature extractor and the inertial measurement unit (IMU) pre-processing and integration function. The inputs of the neural network are feature tracks and integrated IMU measurements. When looking at images with few features (for example, when the camera points to the sky), the neural net will mainly rely on the IMU. When more features are available, the network uses them to correct the accumulated drift from the IMU. Overall, we noticed that for very short maneuvers, IMU measurements were sufficient for the task. However, for longer ones, visual information was necessary to successfully address the IMU drift and complete the maneuver. Indeed, visual information reduces the odds of a crash by up to 30 percent in the longest maneuvers. We definitely think that event cameras could improve the current approach even further, since they could provide valuable visual information during high-speed flight.
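
The drift argument is easy to reproduce in a toy dead-reckoning loop: double-integrating a biased, noisy accelerometer gives position error that grows roughly quadratically with maneuver length, which is why vision matters most on the longest maneuvers. A minimal 1-D illustration (the bias and noise figures are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
dt, bias, noise = 0.002, 0.05, 0.5        # 500 Hz IMU; m/s^2 bias and noise

for duration in (5.0, 20.0):              # short vs. long maneuver, seconds
    n = int(duration / dt)
    true_acc = np.zeros(n)                # hover: true acceleration is zero
    meas = true_acc + bias + noise * rng.standard_normal(n)
    vel = np.cumsum(meas) * dt            # integrate once -> velocity
    pos = np.cumsum(vel) * dt             # integrate twice -> position
    print(f"{duration:>4.0f} s maneuver: position drift = {abs(pos[-1]):6.2f} m")

# Drift grows ~quadratically with time from the bias alone:
# x(t) ~ 0.5 * bias * t**2, so 4x the duration gives ~16x the error.
```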

“The Matty Flip is probably one of the maneuvers that our approach can do very well … It is super challenging for humans, since they don’t see where they’re going and have problems in estimating their speed. For our approach the maneuver is no problem at all, since we can estimate forward velocities as well as backward velocities.”
—Antonio Loquercio, University of Zurich

You describe being able to train on “maneuvers that stretch the abilities of even expert human pilots.” What are some examples of acrobatics that your drones might be able to do that most human pilots would not be capable of?

The Matty Flip is probably one of the maneuvers that our approach can do very well, but human pilots find very challenging. It basically entails doing a high speed power loop by always looking backward. It is super challenging for humans, since they don’t see where they’re going and have problems in estimating their speed. For our approach the maneuver is no problem at all, since we can estimate forward velocities as well as backward velocities.

What are the limits to the performance of this system?

At the moment the main limitation is the maneuver duration. We never trained a controller that could perform maneuvers longer than 20 seconds. In the future, we plan to address this limitation and train general controllers which can fly in that agile way for significantly longer with relatively small drift. In this way, we could start being competitive against human pilots in drone racing competitions.

Can you talk about how the techniques developed here could be applied beyond drone acrobatics?

The current approach allows us to do acrobatics and agile flight in free space. We are now working to perform agile flight in cluttered environments, which requires a higher degree of understanding of the surroundings than this project. Drone acrobatics is of course only an example application. We selected it because it is a stress test of the controller’s performance. However, several other applications that require fast and agile flight can benefit from our approach. Examples are delivery (we want our Amazon packages always faster, don’t we?), search and rescue, or inspection. Going faster allows us to cover more space in less time, saving battery costs. Indeed, agile flight has battery consumption very similar to slow hovering for an autonomous drone.

“Deep Drone Acrobatics,” by Elia Kaufmann, Antonio Loquercio, René Ranftl, Matthias Müller, Vladlen Koltun, and Davide Scaramuzza from the Robotics and Perception Group at the University of Zurich and ETH Zurich, and Intel’s Intelligent Systems Lab, was presented at RSS 2020.
