
#437624 AI-Powered Drone Learns Extreme ...

Quadrotors are among the most agile and dynamic machines ever created. In the hands of a skilled human pilot, they can pull off astonishing sequences of maneuvers. And while autonomous flying robots have been getting better at flying dynamically in real-world environments, they still haven’t demonstrated the same level of agility as manually piloted ones.

Now researchers from the Robotics and Perception Group at the University of Zurich and ETH Zurich, in collaboration with Intel, have developed a neural network training method that “enables an autonomous quadrotor to fly extreme acrobatic maneuvers with only onboard sensing and computation.” Extreme.

There are two notable things here: First, the quadrotor can do these extreme acrobatics outdoors without any kind of external camera or motion-tracking system to help it out (all sensing and computing is onboard). Second, all of the AI training is done in simulation, without the need for an additional simulation-to-real-world (what researchers call “sim-to-real”) transfer step. Usually, a sim-to-real transfer step means putting your quadrotor into one of those aforementioned external tracking systems, so that it doesn’t completely bork itself while trying to reconcile the differences between the simulated world and the real world, where, as the researchers wrote in a paper describing their system, “even tiny mistakes can result in catastrophic outcomes.”

To enable “zero-shot” sim-to-real transfer, the neural net training in simulation uses an expert controller that knows exactly what’s going on to teach a “student controller” that has much less perfect knowledge. That is, the simulated sensory input that the student ends up using as it learns to follow the expert has been abstracted to present the kind of imperfect, imprecise data it’s going to encounter in the real world. This can involve things like abstracting away the image part of the simulation until you’d have no way of telling the difference between abstracted simulation and abstracted reality, which is what allows the system to make that sim-to-real leap.
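As a rough illustration of this expert-to-student scheme (not the authors’ actual implementation; the toy policy, the noise model, and the least-squares “student” are all invented here), a privileged expert that sees the exact simulated state can supervise a student that only ever sees degraded, abstracted observations:

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_controller(true_state):
    # Privileged expert: acts on the exact simulated state.
    return -0.5 * true_state  # toy stabilizing policy

def abstract(true_state):
    # Abstraction: degrade the perfect state into the kind of noisy,
    # imprecise observation real sensors would produce.
    return true_state + rng.normal(scale=0.1, size=true_state.shape)

# Collect supervision: expert actions on sampled simulator states.
states = rng.normal(size=(1000, 4))
targets = np.array([expert_controller(s) for s in states])
observations = np.array([abstract(s) for s in states])

# "Student": a linear policy fit by imitation (least squares here),
# trained only on the abstracted observations.
W, *_ = np.linalg.lstsq(observations, targets, rcond=None)

# At deployment the student acts from abstracted observations alone,
# which look the same whether they came from simulation or reality.
student_action = abstract(states[0]) @ W
```

Because the student never sees anything only a simulator could produce, there is no residual sim-to-real gap for it to trip over at deployment time.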

The simulation environment that the researchers used was Gazebo, slightly modified to better simulate quadrotor physics. Meanwhile, over in reality, a custom 1.5-kilogram quadrotor with a 4:1 thrust-to-weight ratio performed the physical experiments, using only an Nvidia Jetson TX2 computing board and an Intel RealSense T265, a dual fisheye camera module optimized for V-SLAM. To challenge the learning system, the drone was trained to perform three acrobatic maneuvers plus a combo of all of them:

Image: University of Zurich/ETH Zurich/Intel

Reference trajectories for acrobatic maneuvers. Top row, from left: Power Loop, Barrel Roll, and Matty Flip. Bottom row: Combo.

All of these maneuvers require high accelerations of up to 3 g’s and careful control, and the Matty Flip is particularly challenging, at least for humans, because the whole thing is done while the drone is flying backwards. Still, after just a few hours of training in simulation, the drone was totally real-world competent at these tricks, and could even extrapolate a little bit to perform maneuvers that it was not explicitly trained on, like doing multiple loops in a row. Where humans still have the advantage over drones (as you might expect, since we’re talking about robots) is in quickly reacting to novel or unexpected situations. And when you’re doing this sort of thing outdoors, novel and unexpected situations are everywhere, from a gust of wind to a jealous bird.

For more details, we spoke with Antonio Loquercio from the University of Zurich’s Robotics and Perception Group.

IEEE Spectrum: Can you explain how the abstraction layer interfaces with the simulated sensors to enable effective sim-to-real transfer?

Antonio Loquercio: The abstraction layer applies a specific function to the raw sensor information. Exactly the same function is applied to the real and simulated sensors. The result of the function, the “abstracted sensor measurements,” makes simulated and real observations of the same scene similar. For example, suppose we have a sequence of simulated and real images. We can very easily tell apart the real from the simulated ones given the difference in rendering. But if we apply the abstraction function of “feature tracks,” which are point correspondences in time, it becomes very difficult to tell which are the simulated and which are the real feature tracks, since point correspondences are independent of the rendering. This applies to humans as well as to neural networks: Training policies on raw images gives low sim-to-real transfer (since images are too different between domains), while training on the abstracted images has high transfer abilities.
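A toy sketch of why feature tracks are rendering-independent (the pinhole model and the numbers are illustrative, not from the paper): the track of a scene point is just its image-plane trajectory across frames, which depends only on geometry and camera motion, so the same point yields the same track whether the frames were rendered by a simulator or captured by a camera:

```python
def project(point3d, cam_x, f=100.0):
    # Toy pinhole camera at (cam_x, 0, 0) looking down +z.
    x, y, z = point3d
    return (f * (x - cam_x) / z, f * y / z)

def feature_track(point3d, cam_positions):
    # A feature track: the same scene point, followed over frames.
    return [project(point3d, cx) for cx in cam_positions]

landmark = (1.0, 0.5, 5.0)          # same physical point in both worlds
camera_path = [0.0, 0.1, 0.2, 0.3]  # camera translating sideways

# However differently the two sets of frames were rendered, the
# abstracted observation (the track) comes out identical:
track_sim = feature_track(landmark, camera_path)
track_real = feature_track(landmark, camera_path)
assert track_sim == track_real
```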

How useful is visual input from a camera like the Intel RealSense T265 for state estimation during such aggressive maneuvers? Would using an event camera substantially improve state estimation?

Our end-to-end controller does not require a state estimation module. It shares, however, some components with traditional state estimation pipelines, specifically the feature extractor and the inertial measurement unit (IMU) pre-processing and integration function. The inputs to the neural network are feature tracks and integrated IMU measurements. When looking at images with few features (for example, when the camera points to the sky), the neural net will mainly rely on the IMU. When more features are available, the network uses them to correct the accumulated drift from the IMU. Overall, we noticed that for very short maneuvers IMU measurements were sufficient for the task. However, for longer ones, visual information was necessary to successfully address the IMU drift and complete the maneuver. Indeed, visual information reduces the odds of a crash by up to 30 percent in the longest maneuvers. We definitely think that event cameras can improve the current approach even more, since they could provide valuable visual information during high-speed flight.
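The drift Loquercio describes is easy to see in a back-of-envelope integration (the bias magnitude and rates below are made up for illustration): a small constant accelerometer bias integrates into a velocity error that grows linearly with time, and a position error that grows quadratically, which is why short maneuvers survive on IMU alone while long ones need visual correction:

```python
# Dead-reckoning on IMU alone for a hovering drone with a small
# accelerometer bias (values are illustrative assumptions).
dt, bias = 0.01, 0.05          # 100 Hz IMU, 0.05 m/s^2 bias
true_accel = 0.0               # drone actually hovering

vel = pos = 0.0
positions = []
for step in range(1000):       # simulate 10 seconds
    measured = true_accel + bias
    vel += measured * dt       # integrate acceleration -> velocity
    pos += vel * dt            # integrate velocity -> position
    positions.append(pos)

drift_1s = positions[99]       # position error after 1 s (~ 0.5 * bias * t^2)
drift_10s = positions[999]     # after 10 s: roughly 100x worse
```

Doubling the maneuver length roughly quadruples the accumulated position error, so the crossover point where vision becomes indispensable arrives quickly.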

“The Matty Flip is probably one of the maneuvers that our approach can do very well … It is super challenging for humans, since they don’t see where they’re going and have problems in estimating their speed. For our approach the maneuver is no problem at all, since we can estimate forward velocities as well as backward velocities.”
—Antonio Loquercio, University of Zurich

You describe being able to train on “maneuvers that stretch the abilities of even expert human pilots.” What are some examples of acrobatics that your drones might be able to do that most human pilots would not be capable of?

The Matty Flip is probably one of the maneuvers that our approach can do very well but that human pilots find very challenging. It basically entails doing a high-speed power loop while always looking backward. It is super challenging for humans, since they don’t see where they’re going and have problems estimating their speed. For our approach the maneuver is no problem at all, since we can estimate forward velocities as well as backward velocities.

What are the limits to the performance of this system?

At the moment the main limitation is the maneuver duration. We never trained a controller that could perform maneuvers longer than 20 seconds. In the future, we plan to address this limitation and train general controllers which can fly in that agile way for significantly longer with relatively small drift. In this way, we could start being competitive against human pilots in drone racing competitions.

Can you talk about how the techniques developed here could be applied beyond drone acrobatics?

The current approach allows us to do acrobatics and agile flight in free space. We are now working to perform agile flight in cluttered environments, which requires a higher degree of understanding of the surroundings than this project did. Drone acrobatics is of course only an example application. We selected it because it serves as a stress test of the controller’s performance. However, several other applications which require fast and agile flight can benefit from our approach. Examples are delivery (we want our Amazon packages delivered ever faster, don’t we?), search and rescue, or inspection. Going faster allows us to cover more space in less time, saving battery costs. Indeed, agile flight has very similar battery consumption to slow hovering for an autonomous drone.

“Deep Drone Acrobatics,” by Elia Kaufmann, Antonio Loquercio, René Ranftl, Matthias Müller, Vladlen Koltun, and Davide Scaramuzza from the Robotics and Perception Group at the University of Zurich and ETH Zurich, and Intel’s Intelligent Systems Lab, was presented at RSS 2020.

Posted in Human Robots

#437575 AI-Directed Robotic Hand Learns How to ...

Reaching for a nearby object seems like a mindless task, but the action requires a sophisticated neural network that took humans millions of years to evolve. Now, robots are acquiring that same ability using artificial neural networks. In a recent study, a robotic hand “learns” to pick up objects of different shapes and hardness using three different grasping motions.

The key to this development is something called a spiking neuron. Like real neurons in the brain, artificial neurons in a spiking neural network (SNN) fire together to encode and process temporal information. Researchers study SNNs because this approach may yield insights into how biological neural networks function, including our own.
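A minimal sketch of the spiking idea (the parameters are illustrative, not the study’s network): a leaky integrate-and-fire neuron’s membrane potential leaks toward rest, integrates input current over time, and emits a spike and resets when it crosses a threshold, so information lives in the timing and rate of spikes rather than in a static activation value:

```python
def lif(current, steps=200, dt=1.0, tau=20.0, v_rest=0.0,
        v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron; returns spike times."""
    v = v_rest
    spikes = []
    for t in range(steps):
        dv = (-(v - v_rest) + current) * dt / tau   # leaky integration
        v += dv
        if v >= v_thresh:                           # threshold crossed:
            spikes.append(t)                        # emit a spike...
            v = v_reset                             # ...and reset
    return spikes

# Stronger input drives a higher spike rate; sub-threshold input
# never spikes at all. This is the temporal code an SNN computes with.
weak, strong = lif(1.2), lif(3.0)
```

Libraries such as Nengo or Brian 2 provide production-grade versions of this neuron model, but the temporal dynamics are the same handful of lines.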

“The programming of humanoid or bio-inspired robots is complex,” says Juan Camilo Vasquez Tieck, a research scientist at FZI Forschungszentrum Informatik in Karlsruhe, Germany. “And classical robotics programming methods are not always suitable to take advantage of their capabilities.”

Conventional robotic systems must perform extensive calculations, Tieck says, to track trajectories and grasp objects. But a robotic system like Tieck’s, which relies on an SNN, first trains its neural net to better model system and object motions, after which it grasps items more autonomously by adapting to the motion in real time.

The new robotic system by Tieck and his colleagues uses an existing robotic hand, called a Schunk SVH 5-finger hand, which has the same number of fingers and joints as a human hand.

The researchers incorporated an SNN into their system, which is divided into several sub-networks. One sub-network controls each finger individually, either flexing or extending the finger. Another handles each type of grasping movement, for example whether the robotic hand will need to make a pinching, spherical, or cylindrical movement.

For each finger, a neural circuit detects contact with an object using the currents of the motors and the velocity of the joints. When contact with an object is detected, a controller is activated to regulate how much force the finger exerts.
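The detect-contact-then-regulate-force loop could be sketched as follows (the thresholds, gains, and crude current-to-force proxy are all invented for illustration, not taken from the study): contact is inferred when motor current rises while the joint velocity stalls, after which a simple proportional controller trims the commanded force toward a target instead of closing the finger further:

```python
def finger_step(current_amps, joint_velocity, command,
                current_thresh=0.8, vel_thresh=0.05,
                target=1.0, gain=0.3):
    """One control tick for one finger; returns (in_contact, new command)."""
    # Contact heuristic: high motor current with a stalled joint.
    in_contact = current_amps > current_thresh and abs(joint_velocity) < vel_thresh
    if in_contact:
        measured_force = current_amps          # crude current-to-force proxy
        # Proportional regulation toward the target grip force.
        command += gain * (target - measured_force)
    else:
        command += 0.1                          # keep flexing until contact
    return in_contact, command

# Stalled joint + high current: contact detected, force backed off.
contact, cmd = finger_step(current_amps=1.2, joint_velocity=0.01, command=0.5)
# Free motion: no contact, finger keeps closing.
free, cmd2 = finger_step(current_amps=0.2, joint_velocity=0.4, command=0.5)
```

Because the loop re-evaluates contact and force every tick, the same logic naturally re-adapts if the object shifts or deforms mid-grasp, which matches the behavior Tieck describes.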

“This way, the movements of generic grasping motions are adapted to objects with different shapes, stiffness and sizes,” says Tieck. The system can also adapt its grasping motion quickly if the object moves or deforms.

The robotic grasping system is described in a study published October 24 in IEEE Robotics and Automation Letters. The researchers’ robotic hand used its three different grasping motions on objects without knowing their properties. Target objects included a plastic bottle, a soft ball, a tennis ball, a sponge, a rubber duck, different balloons, a pen, and a tissue pack. The researchers found, for one, that pinching motions required more precision than cylindrical or spherical grasping motions.

“For this approach, the next step is to incorporate visual information from event-based cameras and integrate arm motion with SNNs,” says Tieck. “Additionally, we would like to extend the hand with haptic sensors.”

The long-term goal, he says, is to develop “a system that can perform grasping similar to humans, without intensive planning for contact points or intense stability analysis, and [that is] able to adapt to different objects using visual and haptic feedback.”


#437535 Unravelling the secrets of spider limb ...

Spider webs are engineering marvels constructed by eight-legged experts with 400 million years of accumulated know-how. Much can be learned from the building of the spider's gossamer net and the operation of its sticky trap. Amazingly, garden cross spiders can regenerate lost legs and use them immediately to build a web that is pitch-perfect, even though the new limb is much shorter than the one it replaced. This phenomenon has allowed scientists to probe the rules the animal uses to build its web and how it uses its legs as measuring sticks.


#437109 This Week’s Awesome Tech Stories From ...

FUTURE
Why the Coronavirus Is So Confusing
Ed Yong | The Atlantic
“…beyond its vast scope and sui generis nature, there are other reasons the pandemic continues to be so befuddling—a slew of forces scientific and societal, epidemiological and epistemological. What follows is an analysis of those forces, and a guide to making sense of a problem that is now too big for any one person to fully comprehend.”

ARTIFICIAL INTELLIGENCE
Common Sense Comes Closer to Computers
John Pavlus | Quanta Magazine
“The problem of common-sense reasoning has plagued the field of artificial intelligence for over 50 years. Now a new approach, borrowing from two disparate lines of thinking, has made important progress.”

BIOTECH
Scientists Create Glowing Plants Using Bioluminescent Mushroom DNA
George Dvorsky | Gizmodo
“New research published today in Nature Biotechnology describes a new technique, in which the DNA from bioluminescent mushrooms was used to create plants that glow 10 times brighter than their bacteria-powered precursors. Botanists could eventually use this technique to study the inner workings of plants, but it also introduces the possibility of glowing ornamental plants for our homes.”

HEALTH
Old Drugs May Find a New Purpose: Fighting the Coronavirus
Carl Zimmer | The New York Times
“Driven by the pandemic’s spread, research teams have been screening thousands of drugs to see if they have this unexpected potential to fight the coronavirus. They’ve tested the drugs on dishes of cells, and a few dozen candidates have made the first cut.”

MACHINE LEARNING
OpenAI’s New Experiments in Music Generation Create an Uncanny Valley Elvis
Devin Coldewey | TechCrunch
“AI-generated music is a fascinating new field, and deep-pocketed research outfit OpenAI has hit new heights in it, creating recreations of songs in the style of Elvis, 2Pac and others. The results are convincing, but fall squarely in the unnerving ‘uncanny valley’ of audio, sounding rather like good, but drunk, karaoke heard through a haze of drugs.”

CULTURE
Neural Net-Generated Memes Are One of the Best Uses of AI on the Internet
Jay Peters | The Verge
“I’ve spent a good chunk of my workday so far creating memes thanks to this amazing website from Imgflip that automatically generates captions for memes using a neural network. …You can pick from 48 classic meme templates, including distracted boyfriend, Drake in ‘Hotline Bling,’ mocking Spongebob, surprised Pikachu, and Oprah giving things away.”

GENETICS
Can Genetic Engineering Bring Back the American Chestnut?
Gabriel Popkin | The New York Times Magazine
“The geneticists’ research forces conservationists to confront, in a new and sometimes discomfiting way, the prospect that repairing the natural world does not necessarily mean returning to an unblemished Eden. It may instead mean embracing a role that we’ve already assumed: engineers of everything, including nature.”

Image credit: Dan Gold / Unsplash


#436559 This Is What an AI Said When Asked to ...

“What’s past is prologue.” So says the famed quote from Shakespeare’s The Tempest, alleging that we can look to what has already happened as an indication of what will happen next.

This idea could be interpreted as being rather bleak; are we doomed to repeat the errors of the past until we correct them? We certainly do need to learn and re-learn life lessons—whether in our work, relationships, finances, health, or other areas—in order to grow as people.

Zooming out, the same phenomenon exists on a much bigger scale—that of our collective human history. We like to think we’re improving as a species, but haven’t yet come close to doing away with the conflicts and injustices that plagued our ancestors.

Zooming back in (and lightening up) a little, what about the short-term future? What might happen over the course of this year, and what information would we use to make educated guesses about it?

The editorial team at The Economist took a unique approach to answering these questions. On top of their own projections for 2020, including possible scenarios in politics, economics, and the continued development of technologies like artificial intelligence, they looked to an AI to make predictions of its own. What it came up with is intriguing, and a little bit uncanny.

[For the full list of the questions and answers, read The Economist article].

An AI That Reads—Then Writes
Almost exactly a year ago, non-profit OpenAI announced it had built a neural network for natural language processing called GPT-2. The announcement was met with some controversy, as it included the caveat that the tool would not be immediately released to the public due to its potential for misuse. It was then released in phases over the course of several months.

GPT-2’s creators raised the bar on quality when training the neural net; rather than haphazardly feeding it low-quality text, they only used articles that got more than three upvotes on Reddit (admittedly, this doesn’t guarantee high quality across the board—but it’s something).

The training dataset consisted of 40GB of text. For context, 1GB of text is about 900,000 ASCII pages or 130,000 double-spaced Microsoft Word pages.
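Those figures check out as a quick back-of-envelope calculation (the characters-per-page value is an assumption, since page counts depend on formatting):

```python
# At roughly 1,100 ASCII characters (bytes) per printed page,
# 1 GB of plain text works out to about 900,000 pages, so the
# 40 GB training corpus is on the order of 36 million pages.
bytes_per_page = 1_100                       # assumed chars per page
pages_per_gb = 1_000_000_000 // bytes_per_page
corpus_pages = 40 * pages_per_gb
```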

The tool has no understanding of the text it’s generating, of course. It uses language patterns and word sequences to draw statistical associations between words and phrases, building a sort of guidebook for itself (not unlike the grammar rules and vocabulary words you might study when trying to learn a foreign language). It then uses that guidebook to answer questions or predict what will come after a particular sequence of words.
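The “statistical associations” idea can be illustrated with the simplest possible language model, a bigram counter; GPT-2 replaces the counting with a large neural network conditioned on much longer contexts, but the predict-the-next-token objective is analogous:

```python
from collections import Counter, defaultdict

# Tiny toy corpus; a real model trains on billions of words.
corpus = "the drone flies fast and the drone lands and the pilot rests".split()

# Count which word follows which (the "guidebook" of associations).
successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def predict_next(word):
    # Predict the statistically most frequent successor.
    return successors[word].most_common(1)[0][0]

print(predict_next("the"))   # 'drone' follows 'the' most often here
```

No understanding is involved: the model only knows that “drone” followed “the” more often than “pilot” did, which is exactly the kind of pattern matching, scaled up enormously, that produces GPT-2’s fluent answers.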

GPT-2’s creators did note that when the model is trained on specifically-selected datasets for narrower applications, its replies become more convincing.

Some Light Editing
Though the Economist article detailing GPT-2’s predictions describes the text the model generated as “unedited,” this isn’t wholly accurate. In a Medium post, deputy editor Tom Standage describes the methodology he used, and it does involve a degree of tweaking.

Standage wrote an introductory paragraph, followed by his opening question (“Greetings, GPT-2. Have you done an interview before?”). He configured the tool to generate five responses to each subsequent question, then selected the best answer, deleting any questions GPT-2 came up with as part of its response.

AI’s View of the World—and Itself
The questions Standage asked ranged from tech-related (the future of AI, technological unemployment, misuse of technology) to political and economic (re-election of Donald Trump, the US-China trade war). Here are a couple of his questions and GPT-2’s answers:

Q: Do fake news stories, generated using AI, pose a threat to democracy? Are we likely to see this tactic being used in the 2020 American presidential elections?
A: Yes, we’ll probably see them. It’s just a matter of when. Fake news stories are generally generated by political campaigns, and have the potential to have a huge impact on the outcome of an election. This is because they spread misinformation, often with no basis in fact.

Q: How do you see the economic outlook for 2020?
A: I think it is very unclear. The outlook for the UK economy is uncertain, but the American economy is very strong. I think we are going to have a lot of turbulence in the world economy.

While some of its answers seemed insightful—when asked about the future of AI, GPT-2 advised that we treat the technology like a tool and “put as much effort into developing [it] as necessary, rather than worrying that it’s going to harm us and destroy our lives”—many of the responses were quite vague. One answer, for example, mentions that there will “probably be major changes in China.” Are we talking political changes? Social? Economic? All of the above?

However, while nothing the AI wrote was particularly illuminating or new, it’s impressive that it was able to expound on the correct topic for each question, and in fully coherent English.

GPT-2 named itself—AI—as the most important disruptive technology to watch in 2020. This is telling; it’s coming from a synthesis of millions of pages of published articles. If GPT-2 thinks AI is more important than, say, gene editing or quantum computing, it could be right. Or it could simply be that AI is at peak hype, and as such more is being written about it than about other technologies.

Equally intriguing was GPT-2’s response when asked whether Donald Trump will win a second term: “I think he will not win a second term. I think he will be defeated in the general election.” Some deeper insight there would be great, but hey—we’ll take it.

Predicting Predictions
Since an AI can read and synthesize vast data sets much faster than we can, it’s being used to predict all kinds of things, from virus outbreaks to crime. But asking it to philosophize on the future based on the (Reddit-curated) past is new, and if you think about it, a pretty fascinating undertaking.

As GPT-2 and tools like it continually improve, we’ll likely see them making more—and better—predictions of the future. In the meantime, let’s hope that the new data these models are trained on—news of what’s happening this week, this month, this year—add to an already-present sense of optimism.

When asked if it had any advice for readers, GPT-2 replied, “The big projects that you think are impossible today are actually possible in the near future.”

Image Credit: Alexas_Fotos from Pixabay
