Tag Archives: Handle
#437227 DIGIT: A high-resolution tactile sensor ...
To assist humans in completing manual chores or tasks, robots must efficiently grasp and manipulate objects in their surroundings. While in recent years robotics researchers have developed a growing number of techniques that allow robots to pick up and handle objects, most of these only proved to be effective when tackling very basic tasks, such as picking up an object or moving it from one place to another. Continue reading
#436176 We’re Making Progress in Explainable ...
Machine learning algorithms are starting to exceed human performance in many narrow and specific domains, such as image recognition and certain types of medical diagnoses. They’re also rapidly improving in more complex domains such as generating eerily human-like text. We increasingly rely on machine learning algorithms to make decisions on a wide range of topics, from what we collectively spend billions of hours watching to who gets the job.
But machine learning algorithms cannot explain the decisions they make.
How can we justify putting these systems in charge of decisions that affect people’s lives if we don’t understand how they’re arriving at those decisions?
This desire to get more than raw numbers from machine learning algorithms has led to a renewed focus on explainable AI: algorithms that can make a decision or take an action, and tell you the reasons behind it.
What Makes You Say That?
In some circumstances, you can see a road to explainable AI already. Take OpenAI’s GTP-2 model, or IBM’s Project Debater. Both of these generate text based on a large corpus of training data, and try to make it as relevant as possible to the prompt that’s given. If these models were also able to provide a quick run-down of the top few sources in that corpus of training data they were drawing information from, it may be easier to understand where the “argument” (or poetic essay about unicorns) was coming from.
This is similar to the approach Google is now looking at for its image classifiers. Many algorithms are more sensitive to textures and the relationship between adjacent pixels in an image, rather than recognizing objects by their outlines as humans do. This leads to strange results: some algorithms can happily identify a totally scrambled image of a polar bear, but not a polar bear silhouette.
Previous attempts to make image classifiers explainable relied on significance mapping. In this method, the algorithm would highlight the areas of the image that contributed the most statistical weight to making the decision. This is usually determined by changing groups of pixels in the image and seeing which contribute to the biggest change in the algorithm’s impression of what the image is. For example, if the algorithm is trying to recognize a stop sign, changing the background is unlikely to be as important as changing the sign.
Google’s new approach changes the way that its algorithm recognizes objects, by examining them at several different resolutions and searching for matches to different “sub-objects” within the main object. You or I might recognize an ambulance from its flashing lights, its tires, and its logo; we might zoom in on the basketball held by an NBA player to deduce their occupation, and so on. By linking the overall categorization of an image to these “concepts,” the algorithm can explain its decision: I categorized this as a cat because of its tail and whiskers.
Even in this experiment, though, the “psychology” of the algorithm in decision-making is counter-intuitive. For example, in the basketball case, the most important factor in making the decision was actually the player’s jerseys rather than the basketball.
Can You Explain What You Don’t Understand?
While it may seem trivial, the conflict here is a fundamental one in approaches to artificial intelligence. Namely, how far can you get with mere statistical associations between huge sets of data, and how much do you need to introduce abstract concepts for real intelligence to arise?
At one end of the spectrum, Good Old-Fashioned AI or GOFAI dreamed up machines that would be entirely based on symbolic logic. The machine would be hard-coded with the concept of a dog, a flower, cars, and so forth, alongside all of the symbolic “rules” which we internalize, allowing us to distinguish between dogs, flowers, and cars. (You can imagine a similar approach to a conversational AI would teach it words and strict grammatical structures from the top down, rather than “learning” languages from statistical associations between letters and words in training data, as GPT-2 broadly does.)
Such a system would be able to explain itself, because it would deal in high-level, human-understandable concepts. The equation is closer to: “ball” + “stitches” + “white” = “baseball”, rather than a set of millions of numbers linking various pathways together. There are elements of GOFAI in Google’s new approach to explaining its image recognition: the new algorithm can recognize objects based on the sub-objects they contain. To do this, it requires at least a rudimentary understanding of what those sub-objects look like, and the rules that link objects to sub-objects, such as “cats have whiskers.”
The issue, of course, is the—maybe impossible—labor-intensive task of defining all these symbolic concepts and every conceivable rule that could possibly link them together by hand. The difficulty of creating systems like this, which could handle the “combinatorial explosion” present in reality, helped to lead to the first AI winter.
Meanwhile, neural networks rely on training themselves on vast sets of data. Without the “labeling” of supervised learning, this process might bear no relation to any concepts a human could understand (and therefore be utterly inexplicable).
Somewhere between these two, hope explainable AI enthusiasts, is a happy medium that can crunch colossal amounts of data, giving us all of the benefits that recent, neural-network AI has bestowed, while showing its working in terms that humans can understand.
Image Credit: Image by Seanbatty from Pixabay Continue reading
#436140 Let’s Build Robots That Are as Smart ...
Illustration: Nicholas Little
Let’s face it: Robots are dumb. At best they are idiot savants, capable of doing one thing really well. In general, even those robots require specialized environments in which to do their one thing really well. This is why autonomous cars or robots for home health care are so difficult to build. They’ll need to react to an uncountable number of situations, and they’ll need a generalized understanding of the world in order to navigate them all.
Babies as young as two months already understand that an unsupported object will fall, while five-month-old babies know materials like sand and water will pour from a container rather than plop out as a single chunk. Robots lack these understandings, which hinders them as they try to navigate the world without a prescribed task and movement.
But we could see robots with a generalized understanding of the world (and the processing power required to wield it) thanks to the video-game industry. Researchers are bringing physics engines—the software that provides real-time physical interactions in complex video-game worlds—to robotics. The goal is to develop robots’ understanding in order to learn about the world in the same way babies do.
Giving robots a baby’s sense of physics helps them navigate the real world and can even save on computing power, according to Lochlainn Wilson, the CEO of SE4, a Japanese company building robots that could operate on Mars. SE4 plans to avoid the problems of latency caused by distance from Earth to Mars by building robots that can operate independently for a few hours before receiving more instructions from Earth.
Wilson says that his company uses simple physics engines such as PhysX to help build more-independent robots. He adds that if you can tie a physics engine to a coprocessor on the robot, the real-time basic physics intuitions won’t take compute cycles away from the robot’s primary processor, which will often be focused on a more complicated task.
Wilson’s firm occasionally still turns to a traditional graphics engine, such as Unity or the Unreal Engine, to handle the demands of a robot’s movement. In certain cases, however, such as a robot accounting for friction or understanding force, you really need a robust physics engine, Wilson says, not a graphics engine that simply simulates a virtual environment. For his projects, he often turns to the open-source Bullet Physics engine built by Erwin Coumans, who is now an employee at Google.
Bullet is a popular physics-engine option, but it isn’t the only one out there. Nvidia Corp., for example, has realized that its gaming and physics engines are well-placed to handle the computing demands required by robots. In a lab in Seattle, Nvidia is working with teams from the University of Washington to build kitchen robots, fully articulated robot hands and more, all equipped with Nvidia’s tech.
When I visited the lab, I watched a robot arm move boxes of food from counters to cabinets. That’s fairly straightforward, but that same robot arm could avoid my body if I got in its way, and it could adapt if I moved a box of food or dropped it onto the floor.
The robot could also understand that less pressure is needed to grasp something like a cardboard box of Cheez-It crackers versus something more durable like an aluminum can of tomato soup.
Nvidia’s silicon has already helped advance the fields of artificial intelligence and computer vision by making it possible to process multiple decisions in parallel. It’s possible that the company’s new focus on virtual worlds will help advance the field of robotics and teach robots to think like babies.
This article appears in the November 2019 print issue as “Robots as Smart as Babies.” Continue reading