Tag Archives: turning
#439773 How the U.S. Army Is Turning Robots Into ...
This article is part of our special report on AI, “The Great AI Reckoning.”
“I should probably not be standing this close,” I think to myself, as the robot slowly approaches a large tree branch on the floor in front of me. It's not the size of the branch that makes me nervous—it's that the robot is operating autonomously, and that while I know what it's supposed to do, I'm not entirely sure what it will do. If everything works the way the roboticists at the U.S. Army Research Laboratory (ARL) in Adelphi, Md., expect, the robot will identify the branch, grasp it, and drag it out of the way. These folks know what they're doing, but I've spent enough time around robots that I take a small step backwards anyway.
The robot, named
RoMan, for Robotic Manipulator, is about the size of a large lawn mower, with a tracked base that helps it handle most kinds of terrain. At the front, it has a squat torso equipped with cameras and depth sensors, as well as a pair of arms that were harvested from a prototype disaster-response robot originally developed at NASA's Jet Propulsion Laboratory for a DARPA robotics competition. RoMan's job today is roadway clearing, a multistep task that ARL wants the robot to complete as autonomously as possible. Instead of instructing the robot to grasp specific objects in specific ways and move them to specific places, the operators tell RoMan to “go clear a path.” It's then up to the robot to make all the decisions necessary to achieve that objective.
The ability to make decisions autonomously is not just what makes robots useful, it's what makes robots
robots. We value robots for their ability to sense what's going on around them, make decisions based on that information, and then take useful actions without our input. In the past, robotic decision making followed highly structured rules—if you sense this, then do that. In structured environments like factories, this works well enough. But in chaotic, unfamiliar, or poorly defined settings, reliance on rules makes robots notoriously bad at dealing with anything that could not be precisely predicted and planned for in advance.
RoMan, along with many other robots including home vacuums, drones, and autonomous cars, handles the challenges of semistructured environments through artificial neural networks—a computing approach that loosely mimics the structure of neurons in biological brains. About a decade ago, artificial neural networks began to be applied to a wide variety of semistructured data that had previously been very difficult for computers running rules-based programming (generally referred to as symbolic reasoning) to interpret. Rather than recognizing specific data structures, an artificial neural network is able to recognize data patterns, identifying novel data that are similar (but not identical) to data that the network has encountered before. Indeed, part of the appeal of artificial neural networks is that they are trained by example, by letting the network ingest annotated data and learn its own system of pattern recognition. For neural networks with multiple layers of abstraction, this technique is called deep learning.
Even though humans are typically involved in the training process, and even though artificial neural networks were inspired by the neural networks in human brains, the kind of pattern recognition a deep learning system does is fundamentally different from the way humans see the world. It's often nearly impossible to understand the relationship between the data input into the system and the interpretation of the data that the system outputs. And that difference—the “black box” opacity of deep learning—poses a potential problem for robots like RoMan and for the Army Research Lab.
In chaotic, unfamiliar, or poorly defined settings, reliance on rules makes robots notoriously bad at dealing with anything that could not be precisely predicted and planned for in advance.
This opacity means that robots that rely on deep learning have to be used carefully. A deep-learning system is good at recognizing patterns, but lacks the world understanding that a human typically uses to make decisions, which is why such systems do best when their applications are well defined and narrow in scope. “When you have well-structured inputs and outputs, and you can encapsulate your problem in that kind of relationship, I think deep learning does very well,” says
Tom Howard, who directs the University of Rochester's Robotics and Artificial Intelligence Laboratory and has developed natural-language interaction algorithms for RoMan and other ground robots. “The question when programming an intelligent robot is, at what practical size do those deep-learning building blocks exist?” Howard explains that when you apply deep learning to higher-level problems, the number of possible inputs becomes very large, and solving problems at that scale can be challenging. And the potential consequences of unexpected or unexplainable behavior are much more significant when that behavior is manifested through a 170-kilogram two-armed military robot.
After a couple of minutes, RoMan hasn't moved—it's still sitting there, pondering the tree branch, arms poised like a praying mantis. For the last 10 years, the Army Research Lab's Robotics Collaborative Technology Alliance (RCTA) has been working with roboticists from Carnegie Mellon University, Florida State University, General Dynamics Land Systems, JPL, MIT, QinetiQ North America, University of Central Florida, the University of Pennsylvania, and other top research institutions to develop robot autonomy for use in future ground-combat vehicles. RoMan is one part of that process.
The “go clear a path” task that RoMan is slowly thinking through is difficult for a robot because the task is so abstract. RoMan needs to identify objects that might be blocking the path, reason about the physical properties of those objects, figure out how to grasp them and what kind of manipulation technique might be best to apply (like pushing, pulling, or lifting), and then make it happen. That's a lot of steps and a lot of unknowns for a robot with a limited understanding of the world.
This limited understanding is where the ARL robots begin to differ from other robots that rely on deep learning, says Ethan Stump, chief scientist of the AI for Maneuver and Mobility program at ARL. “The Army can be called upon to operate basically anywhere in the world. We do not have a mechanism for collecting data in all the different domains in which we might be operating. We may be deployed to some unknown forest on the other side of the world, but we'll be expected to perform just as well as we would in our own backyard,” he says. Most deep-learning systems function reliably only within the domains and environments in which they've been trained. Even if the domain is something like “every drivable road in San Francisco,” the robot will do fine, because that's a data set that has already been collected. But, Stump says, that's not an option for the military. If an Army deep-learning system doesn't perform well, they can't simply solve the problem by collecting more data.
ARL's robots also need to have a broad awareness of what they're doing. “In a standard operations order for a mission, you have goals, constraints, a paragraph on the commander's intent—basically a narrative of the purpose of the mission—which provides contextual info that humans can interpret and gives them the structure for when they need to make decisions and when they need to improvise,” Stump explains. In other words, RoMan may need to clear a path quickly, or it may need to clear a path quietly, depending on the mission's broader objectives. That's a big ask for even the most advanced robot. “I can't think of a deep-learning approach that can deal with this kind of information,” Stump says.
Robots at the Army Research Lab test autonomous navigation techniques in rough terrain [top, middle] with the goal of being able to keep up with their human teammates. ARL is also developing robots with manipulation capabilities [bottom] that can interact with objects so that humans don't have to.Evan Ackerman
While I watch, RoMan is reset for a second try at branch removal. ARL's approach to autonomy is modular, where deep learning is combined with other techniques, and the robot is helping ARL figure out which tasks are appropriate for which techniques. At the moment, RoMan is testing two different ways of identifying objects from 3D sensor data: UPenn's approach is deep-learning-based, while Carnegie Mellon is using a method called perception through search, which relies on a more traditional database of 3D models. Perception through search works only if you know exactly which objects you're looking for in advance, but training is much faster since you need only a single model per object. It can also be more accurate when perception of the object is difficult—if the object is partially hidden or upside-down, for example. ARL is testing these strategies to determine which is the most versatile and effective, letting them run simultaneously and compete against each other.
Perception is one of the things that deep learning tends to excel at. “The computer vision community has made crazy progress using deep learning for this stuff,” says Maggie Wigness, a computer scientist at ARL. “We've had good success with some of these models that were trained in one environment generalizing to a new environment, and we intend to keep using deep learning for these sorts of tasks, because it's the state of the art.”
ARL's modular approach might combine several techniques in ways that leverage their particular strengths. For example, a perception system that uses deep-learning-based vision to classify terrain could work alongside an autonomous driving system based on an approach called inverse reinforcement learning, where the model can rapidly be created or refined by observations from human soldiers. Traditional reinforcement learning optimizes a solution based on established reward functions, and is often applied when you're not necessarily sure what optimal behavior looks like. This is less of a concern for the Army, which can generally assume that well-trained humans will be nearby to show a robot the right way to do things. “When we deploy these robots, things can change very quickly,” Wigness says. “So we wanted a technique where we could have a soldier intervene, and with just a few examples from a user in the field, we can update the system if we need a new behavior.” A deep-learning technique would require “a lot more data and time,” she says.
It's not just data-sparse problems and fast adaptation that deep learning struggles with. There are also questions of robustness, explainability, and safety. “These questions aren't unique to the military,” says Stump, “but it's especially important when we're talking about systems that may incorporate lethality.” To be clear, ARL is not currently working on lethal autonomous weapons systems, but the lab is helping to lay the groundwork for autonomous systems in the U.S. military more broadly, which means considering ways in which such systems may be used in the future.
The requirements of a deep network are to a large extent misaligned with the requirements of an Army mission, and that's a problem.
Safety is an obvious priority, and yet there isn't a clear way of making a deep-learning system verifiably safe, according to Stump. “Doing deep learning with safety constraints is a major research effort. It's hard to add those constraints into the system, because you don't know where the constraints already in the system came from. So when the mission changes, or the context changes, it's hard to deal with that. It's not even a data question; it's an architecture question.” ARL's modular architecture, whether it's a perception module that uses deep learning or an autonomous driving module that uses inverse reinforcement learning or something else, can form parts of a broader autonomous system that incorporates the kinds of safety and adaptability that the military requires. Other modules in the system can operate at a higher level, using different techniques that are more verifiable or explainable and that can step in to protect the overall system from adverse unpredictable behaviors. “If other information comes in and changes what we need to do, there's a hierarchy there,” Stump says. “It all happens in a rational way.”
Nicholas Roy, who leads the Robust Robotics Group at MIT and describes himself as “somewhat of a rabble-rouser” due to his skepticism of some of the claims made about the power of deep learning, agrees with the ARL roboticists that deep-learning approaches often can't handle the kinds of challenges that the Army has to be prepared for. “The Army is always entering new environments, and the adversary is always going to be trying to change the environment so that the training process the robots went through simply won't match what they're seeing,” Roy says. “So the requirements of a deep network are to a large extent misaligned with the requirements of an Army mission, and that's a problem.”
Roy, who has worked on abstract reasoning for ground robots as part of the RCTA, emphasizes that deep learning is a useful technology when applied to problems with clear functional relationships, but when you start looking at abstract concepts, it's not clear whether deep learning is a viable approach. “I'm very interested in finding how neural networks and deep learning could be assembled in a way that supports higher-level reasoning,” Roy says. “I think it comes down to the notion of combining multiple low-level neural networks to express higher level concepts, and I do not believe that we understand how to do that yet.” Roy gives the example of using two separate neural networks, one to detect objects that are cars and the other to detect objects that are red. It's harder to combine those two networks into one larger network that detects red cars than it would be if you were using a symbolic reasoning system based on structured rules with logical relationships. “Lots of people are working on this, but I haven't seen a real success that drives abstract reasoning of this kind.”
For the foreseeable future, ARL is making sure that its autonomous systems are safe and robust by keeping humans around for both higher-level reasoning and occasional low-level advice. Humans might not be directly in the loop at all times, but the idea is that humans and robots are more effective when working together as a team. When the most recent phase of the Robotics Collaborative Technology Alliance program began in 2009, Stump says, “we'd already had many years of being in Iraq and Afghanistan, where robots were often used as tools. We've been trying to figure out what we can do to transition robots from tools to acting more as teammates within the squad.”
RoMan gets a little bit of help when a human supervisor points out a region of the branch where grasping might be most effective. The robot doesn't have any fundamental knowledge about what a tree branch actually is, and this lack of world knowledge (what we think of as common sense) is a fundamental problem with autonomous systems of all kinds. Having a human leverage our vast experience into a small amount of guidance can make RoMan's job much easier. And indeed, this time RoMan manages to successfully grasp the branch and noisily haul it across the room.
Turning a robot into a good teammate can be difficult, because it can be tricky to find the right amount of autonomy. Too little and it would take most or all of the focus of one human to manage one robot, which may be appropriate in special situations like explosive-ordnance disposal but is otherwise not efficient. Too much autonomy and you'd start to have issues with trust, safety, and explainability.
“I think the level that we're looking for here is for robots to operate on the level of working dogs,” explains Stump. “They understand exactly what we need them to do in limited circumstances, they have a small amount of flexibility and creativity if they are faced with novel circumstances, but we don't expect them to do creative problem-solving. And if they need help, they fall back on us.”
RoMan is not likely to find itself out in the field on a mission anytime soon, even as part of a team with humans. It's very much a research platform. But the software being developed for RoMan and other robots at ARL, called Adaptive Planner Parameter Learning (APPL), will likely be used first in autonomous driving, and later in more complex robotic systems that could include mobile manipulators like RoMan. APPL combines different machine-learning techniques (including inverse reinforcement learning and deep learning) arranged hierarchically underneath classical autonomous navigation systems. That allows high-level goals and constraints to be applied on top of lower-level programming. Humans can use teleoperated demonstrations, corrective interventions, and evaluative feedback to help robots adjust to new environments, while the robots can use unsupervised reinforcement learning to adjust their behavior parameters on the fly. The result is an autonomy system that can enjoy many of the benefits of machine learning, while also providing the kind of safety and explainability that the Army needs. With APPL, a learning-based system like RoMan can operate in predictable ways even under uncertainty, falling back on human tuning or human demonstration if it ends up in an environment that's too different from what it trained on.
It's tempting to look at the rapid progress of commercial and industrial autonomous systems (autonomous cars being just one example) and wonder why the Army seems to be somewhat behind the state of the art. But as Stump finds himself having to explain to Army generals, when it comes to autonomous systems, “there are lots of hard problems, but industry's hard problems are different from the Army's hard problems.” The Army doesn't have the luxury of operating its robots in structured environments with lots of data, which is why ARL has put so much effort into APPL, and into maintaining a place for humans. Going forward, humans are likely to remain a key part of the autonomous framework that ARL is developing. “That's what we're trying to build with our robotics systems,” Stump says. “That's our bumper sticker: 'From tools to teammates.' ”
This article appears in the October 2021 print issue as “Deep Learning Goes to Boot Camp.”
Special Report: The Great AI Reckoning
READ NEXT:
7 Revealing Ways AIs Fail
Or see the full report for more articles on the future of AI. Continue reading
#438982 Quantum Computing and Reinforcement ...
Deep reinforcement learning is having a superstar moment.
Powering smarter robots. Simulating human neural networks. Trouncing physicians at medical diagnoses and crushing humanity’s best gamers at Go and Atari. While far from achieving the flexible, quick thinking that comes naturally to humans, this powerful machine learning idea seems unstoppable as a harbinger of better thinking machines.
Except there’s a massive roadblock: they take forever to run. Because the concept behind these algorithms is based on trial and error, a reinforcement learning AI “agent” only learns after being rewarded for its correct decisions. For complex problems, the time it takes an AI agent to try and fail to learn a solution can quickly become untenable.
But what if you could try multiple solutions at once?
This week, an international collaboration led by Dr. Philip Walther at the University of Vienna took the “classic” concept of reinforcement learning and gave it a quantum spin. They designed a hybrid AI that relies on both quantum and run-of-the-mill classic computing, and showed that—thanks to quantum quirkiness—it could simultaneously screen a handful of different ways to solve a problem.
The result is a reinforcement learning AI that learned over 60 percent faster than its non-quantum-enabled peers. This is one of the first tests that shows adding quantum computing can speed up the actual learning process of an AI agent, the authors explained.
Although only challenged with a “toy problem” in the study, the hybrid AI, once scaled, could impact real-world problems such as building an efficient quantum internet. The setup “could readily be integrated within future large-scale quantum communication networks,” the authors wrote.
The Bottleneck
Learning from trial and error comes intuitively to our brains.
Say you’re trying to navigate a new convoluted campground without a map. The goal is to get from the communal bathroom back to your campsite. Dead ends and confusing loops abound. We tackle the problem by deciding to turn either left or right at every branch in the road. One will get us closer to the goal; the other leads to a half hour of walking in circles. Eventually, our brain chemistry rewards correct decisions, so we gradually learn the correct route. (If you’re wondering…yeah, true story.)
Reinforcement learning AI agents operate in a similar trial-and-error way. As a problem becomes more complex, the number—and time—of each trial also skyrockets.
“Even in a moderately realistic environment, it may simply take too long to rationally respond to a given situation,” explained study author Dr. Hans Briegel at the Universität Innsbruck in Austria, who previously led efforts to speed up AI decision-making using quantum mechanics. If there’s pressure that allows “only a certain time for a response, an agent may then be unable to cope with the situation and to learn at all,” he wrote.
Many attempts have tried speeding up reinforcement learning. Giving the AI agent a short-term “memory.” Tapping into neuromorphic computing, which better resembles the brain. In 2014, Briegel and colleagues showed that a “quantum brain” of sorts can help propel an AI agent’s decision-making process after learning. But speeding up the learning process itself has eluded our best attempts.
The Hybrid AI
The new study went straight for that previously untenable jugular.
The team’s key insight was to tap into the best of both worlds—quantum and classical computing. Rather than building an entire reinforcement learning system using quantum mechanics, they turned to a hybrid approach that could prove to be more practical. Here, the AI agent uses quantum weirdness as it’s trying out new approaches—the “trial” in trial and error. The system then passes the baton to a classical computer to give the AI its reward—or not—based on its performance.
At the heart of the quantum “trial” process is a quirk called superposition. Stay with me. Our computers are powered by electrons, which can represent only two states—0 or 1. Quantum mechanics is far weirder, in that photons (particles of light) can simultaneously be both 0 and 1, with a slightly different probability of “leaning towards” one or the other.
This noncommittal oddity is part of what makes quantum computing so powerful. Take our reinforcement learning example of navigating a new campsite. In our classic world, we—and our AI—need to decide between turning left or right at an intersection. In a quantum setup, however, the AI can (in a sense) turn left and right at the same time. So when searching for the correct path back to home base, the quantum system has a leg up in that it can simultaneously explore multiple routes, making it far faster than conventional, consecutive trail and error.
“As a consequence, an agent that can explore its environment in superposition will learn significantly faster than its classical counterpart,” said Briegel.
It’s not all theory. To test out their idea, the team turned to a programmable chip called a nanophotonic processor. Think of it as a CPU-like computer chip, but it processes particles of light—photons—rather than electricity. These light-powered chips have been a long time in the making. Back in 2017, for example, a team from MIT built a fully optical neural network into an optical chip to bolster deep learning.
The chips aren’t all that exotic. Nanophotonic processors act kind of like our eyeglasses, which can carry out complex calculations that transform light that passes through them. In the glasses case, they let people see better. For a light-based computer chip, it allows computation. Rather than using electrical cables, the chips use “wave guides” to shuttle photons and perform calculations based on their interactions.
The “error” or “reward” part of the new hardware comes from a classical computer. The nanophotonic processor is coupled to a traditional computer, where the latter provides the quantum circuit with feedback—that is, whether to reward a solution or not. This setup, the team explains, allows them to more objectively judge any speed-ups in learning in real time.
In this way, a hybrid reinforcement learning agent alternates between quantum and classical computing, trying out ideas in wibbly-wobbly “multiverse” land while obtaining feedback in grounded, classic physics “normality.”
A Quantum Boost
In simulations using 10,000 AI agents and actual experimental data from 165 trials, the hybrid approach, when challenged with a more complex problem, showed a clear leg up.
The key word is “complex.” The team found that if an AI agent has a high chance of figuring out the solution anyway—as for a simple problem—then classical computing works pretty well. The quantum advantage blossoms when the task becomes more complex or difficult, allowing quantum mechanics to fully flex its superposition muscles. For these problems, the hybrid AI was 63 percent faster at learning a solution compared to traditional reinforcement learning, decreasing its learning effort from 270 guesses to 100.
Now that scientists have shown a quantum boost for reinforcement learning speeds, the race for next-generation computing is even more lit. Photonics hardware required for long-range light-based communications is rapidly shrinking, while improving signal quality. The partial-quantum setup could “aid specifically in problems where frequent search is needed, for example, network routing problems” that’s prevalent for a smooth-running internet, the authors wrote. With a quantum boost, reinforcement learning may be able to tackle far more complex problems—those in the real world—than currently possible.
“We are just at the beginning of understanding the possibilities of quantum artificial intelligence,” said lead author Walther.
Image Credit: Oleg Gamulinskiy from Pixabay Continue reading
#437964 How Explainable Artificial Intelligence ...
The field of artificial intelligence has created computers that can drive cars, synthesize chemical compounds, fold proteins, and detect high-energy particles at a superhuman level.
However, these AI algorithms cannot explain the thought processes behind their decisions. A computer that masters protein folding and also tells researchers more about the rules of biology is much more useful than a computer that folds proteins without explanation.
Therefore, AI researchers like me are now turning our efforts toward developing AI algorithms that can explain themselves in a manner that humans can understand. If we can do this, I believe that AI will be able to uncover and teach people new facts about the world that have not yet been discovered, leading to new innovations.
Learning From Experience
One field of AI, called reinforcement learning, studies how computers can learn from their own experiences. In reinforcement learning, an AI explores the world, receiving positive or negative feedback based on its actions.
This approach has led to algorithms that have independently learned to play chess at a superhuman level and prove mathematical theorems without any human guidance. In my work as an AI researcher, I use reinforcement learning to create AI algorithms that learn how to solve puzzles such as the Rubik’s Cube.
Through reinforcement learning, AIs are independently learning to solve problems that even humans struggle to figure out. This has got me and many other researchers thinking less about what AI can learn and more about what humans can learn from AI. A computer that can solve the Rubik’s Cube should be able to teach people how to solve it, too.
Peering Into the Black Box
Unfortunately, the minds of superhuman AIs are currently out of reach to us humans. AIs make terrible teachers and are what we in the computer science world call “black boxes.”
AI simply spits out solutions without giving reasons for its solutions. Computer scientists have been trying for decades to open this black box, and recent research has shown that many AI algorithms actually do think in ways that are similar to humans. For example, a computer trained to recognize animals will learn about different types of eyes and ears and will put this information together to correctly identify the animal.
The effort to open up the black box is called explainable AI. My research group at the AI Institute at the University of South Carolina is interested in developing explainable AI. To accomplish this, we work heavily with the Rubik’s Cube.
The Rubik’s Cube is basically a pathfinding problem: Find a path from point A—a scrambled Rubik’s Cube—to point B—a solved Rubik’s Cube. Other pathfinding problems include navigation, theorem proving and chemical synthesis.
My lab has set up a website where anyone can see how our AI algorithm solves the Rubik’s Cube; however, a person would be hard-pressed to learn how to solve the cube from this website. This is because the computer cannot tell you the logic behind its solutions.
Solutions to the Rubik’s Cube can be broken down into a few generalized steps—the first step, for example, could be to form a cross while the second step could be to put the corner pieces in place. While the Rubik’s Cube itself has over 10 to the 19th power possible combinations, a generalized step-by-step guide is very easy to remember and is applicable in many different scenarios.
Approaching a problem by breaking it down into steps is often the default manner in which people explain things to one another. The Rubik’s Cube naturally fits into this step-by-step framework, which gives us the opportunity to open the black box of our algorithm more easily. Creating AI algorithms that have this ability could allow people to collaborate with AI and break down a wide variety of complex problems into easy-to-understand steps.
A step-by-step refinement approach can make it easier for humans to understand why AIs do the things they do. Forest Agostinelli, CC BY-ND
Collaboration Leads to Innovation
Our process starts with using one’s own intuition to define a step-by-step plan thought to potentially solve a complex problem. The algorithm then looks at each individual step and gives feedback about which steps are possible, which are impossible and ways the plan could be improved. The human then refines the initial plan using the advice from the AI, and the process repeats until the problem is solved. The hope is that the person and the AI will eventually converge to a kind of mutual understanding.
Currently, our algorithm is able to consider a human plan for solving the Rubik’s Cube, suggest improvements to the plan, recognize plans that do not work and find alternatives that do. In doing so, it gives feedback that leads to a step-by-step plan for solving the Rubik’s Cube that a person can understand. Our team’s next step is to build an intuitive interface that will allow our algorithm to teach people how to solve the Rubik’s Cube. Our hope is to generalize this approach to a wide range of pathfinding problems.
People are intuitive in a way unmatched by any AI, but machines are far better in their computational power and algorithmic rigor. This back and forth between man and machine utilizes the strengths from both. I believe this type of collaboration will shed light on previously unsolved problems in everything from chemistry to mathematics, leading to new solutions, intuitions and innovations that may have, otherwise, been out of reach.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Image Credit: Serg Antonov / Unsplash Continue reading