Tag Archives: box
Robots have been masters of manufacturing at speed and precision for decades, but give them a seemingly simple task like stacking shelves, and they quickly get stuck. That’s changing, though, as engineers build systems that can take on the deceptively tricky tasks most humans can do with their eyes closed.
Boston Dynamics is famous for dramatic reveals of robots performing mind-blowing feats that also leave you scratching your head as to what the market is—think the bipedal Atlas doing backflips or Spot the galloping robot dog.
Last week, the company released a video of a robot called Handle that looks like an ostrich on wheels carrying out the seemingly mundane task of stacking boxes in a warehouse.
It might seem like a step backward, but this is exactly the kind of practical task robots have long struggled with. While the speed and precision of industrial robots has seen them take over many functions in modern factories, they’re generally limited to highly prescribed tasks carried out in meticulously-controlled environments.
That’s because despite their mechanical sophistication, most are still surprisingly dumb. They can carry out precision welding on a car or rapidly assemble electronics, but only by rigidly following a prescribed set of motions. Moving cardboard boxes around a warehouse might seem simple to a human, but it actually involves a variety of tasks machines still find pretty difficult—perceiving your surroundings, navigating, and interacting with objects in a dynamic environment.
But the release of this video suggests Boston Dynamics thinks these kinds of applications are close to prime time. Last week the company doubled down by announcing the acquisition of start-up Kinema Systems, which builds computer vision systems for robots working in warehouses.
It’s not the only company making strides in this area. On the same day the video went live, Google unveiled a robot arm called TossingBot that can pick random objects from a box and quickly toss them into another container beyond its reach, which could prove very useful for sorting items in a warehouse. The machine can train on new objects in just an hour or two, and can pick and toss up to 500 items an hour with better accuracy than any of the humans who tried the task.
And an apple-picking robot built by Abundant Robotics is currently on New Zealand farms navigating between rows of apple trees using LIDAR and computer vision to single out ripe apples before using a vacuum tube to suck them off the tree.
In most cases, advances in machine learning and computer vision brought about by the recent AI boom are the keys to these rapidly improving capabilities. Robots have historically had to be painstakingly programmed by humans to solve each new task, but deep learning is making it possible for them to quickly train themselves on a variety of perception, navigation, and dexterity tasks.
It’s not been simple, though, and the application of deep learning in robotics has lagged behind other areas. A major limitation is that the process typically requires huge amounts of training data. That’s fine when you’re dealing with image classification, but when that data needs to be generated by real-world robots it can make the approach impractical. Simulations offer the possibility to run this training faster than real time, but it’s proved difficult to translate policies learned in virtual environments into the real world.
Recent years have seen significant progress on these fronts, though, and the increasing integration of modern machine learning with robotics. In October, OpenAI imbued a robotic hand with human-level dexterity by training an algorithm in a simulation using reinforcement learning before transferring it to the real-world device. The key to ensuring the translation went smoothly was injecting random noise into the simulation to mimic some of the unpredictability of the real world.
And just a couple of weeks ago, MIT researchers demonstrated a new technique that let a robot arm learn to manipulate new objects with far less training data than is usually required. By getting the algorithm to focus on a few key points on the object necessary for picking it up, the system could learn to pick up a previously unseen object after seeing only a few dozen examples (rather than the hundreds or thousands typically required).
How quickly these innovations will trickle down to practical applications remains to be seen, but a number of startups as well as logistics behemoth Amazon are developing robots designed to flexibly pick and place the wide variety of items found in your average warehouse.
Whether the economics of using robots to replace humans at these kinds of menial tasks makes sense yet is still unclear. The collapse of collaborative robotics pioneer Rethink Robotics last year suggests there are still plenty of challenges.
But at the same time, the number of robotic warehouses is expected to leap from 4,000 today to 50,000 by 2025. It may not be long until robots are muscling in on tasks we’ve long assumed only humans could do.
Image Credit: Visual Generation / Shutterstock.com Continue reading
Dr. Been Kim wants to rip open the black box of deep learning.
A senior researcher at Google Brain, Kim specializes in a sort of AI psychology. Like cognitive psychologists before her, she develops various ways to probe the alien minds of artificial neural networks (ANNs), digging into their gory details to better understand the models and their responses to inputs.
The more interpretable ANNs are, the reasoning goes, the easier it is to reveal potential flaws in their reasoning. And if we understand when or why our systems choke, we’ll know when not to use them—a foundation for building responsible AI.
There are already several ways to tap into ANN reasoning, but Kim’s inspiration for unraveling the AI black box came from an entirely different field: cognitive psychology. The field aims to discover fundamental rules of how the human mind—essentially also a tantalizing black box—operates, Kim wrote with her colleagues.
In a new paper uploaded to the pre-publication server arXiv, the team described a way to essentially perform a human cognitive test on ANNs. The test probes how we automatically complete gaps in what we see, so that they form entire objects—for example, perceiving a circle from a bunch of loose dots arranged along a clock face. Psychologist dub this the “law of completion,” a highly influential idea that led to explanations of how our minds generalize data into concepts.
Because deep neural networks in machine vision loosely mimic the structure and connections of the visual cortex, the authors naturally asked: do ANNs also exhibit the law of completion? And what does that tell us about how an AI thinks?
Enter the Germans
The law of completion is part of a series of ideas from Gestalt psychology. Back in the 1920s, long before the advent of modern neuroscience, a group of German experimental psychologists asked: in this chaotic, flashy, unpredictable world, how do we piece together input in a way that leads to meaningful perceptions?
The result is a group of principles known together as the Gestalt effect: that the mind self-organizes to form a global whole. In the more famous words of Gestalt psychologist Kurt Koffka, our perception forms a whole that’s “something else than the sum of its parts.” Not greater than; just different.
Although the theory has its critics, subsequent studies in humans and animals suggest that the law of completion happens on both the cognitive and neuroanatomical level.
Take a look at the drawing below. You immediately “see” a shape that’s actually the negative: a triangle or a square (A and B). Or you further perceive a 3D ball (C), or a snake-like squiggle (D). Your mind fills in blank spots, so that the final perception is more than just the black shapes you’re explicitly given.
Image Credit: Wikimedia Commons contributors, the free media repository.
Neuroscientists now think that the effect comes from how our visual system processes information. Arranged in multiple layers and columns, lower-level neurons—those first to wrangle the data—tend to extract simpler features such as lines or angles. In Gestalt speak, they “see” the parts.
Then, layer by layer, perception becomes more abstract, until higher levels of the visual system directly interpret faces or objects—or things that don’t really exist. That is, the “whole” emerges.
The Experiment Setup
Inspired by these classical experiments, Kim and team developed a protocol to test the Gestalt effect on feed-forward ANNs: one simple, the other, dubbed the “Inception V3,” far more complex and widely used in the machine vision community.
The main idea is similar to the triangle drawings above. First, the team generated three datasets: one set shows complete, ordinary triangles. The second—the “Illusory” set, shows triangles with the edges removed but the corners intact. Thanks to the Gestalt effect, to us humans these generally still look like triangles. The third set also only shows incomplete triangle corners. But here, the corners are randomly rotated so that we can no longer imagine a line connecting them—hence, no more triangle.
To generate a dataset large enough to tease out small effects, the authors changed the background color, image rotation, and other aspects of the dataset. In all, they produced nearly 1,000 images to test their ANNs on.
“At a high level, we compare an ANN’s activation similarities between the three sets of stimuli,” the authors explained. The process is two steps: first, train the AI on complete triangles. Second, test them on the datasets. If the response is more similar between the illusory set and the complete triangle—rather than the randomly rotated set—it should suggest a sort of Gestalt closure effect in the network.
Right off the bat, the team got their answer: yes, ANNs do seem to exhibit the law of closure.
When trained on natural images, the networks better classified the illusory set as triangles than those with randomized connection weights or networks trained on white noise.
When the team dug into the “why,” things got more interesting. The ability to complete an image correlated with the network’s ability to generalize.
Humans subconsciously do this constantly: anything with a handle made out of ceramic, regardless of shape, could easily be a mug. ANNs still struggle to grasp common features—clues that immediately tells us “hey, that’s a mug!” But when they do, it sometimes allows the networks to better generalize.
“What we observe here is that a network that is able to generalize exhibits…more of the closure effect [emphasis theirs], hinting that the closure effect reflects something beyond simply learning features,” the team wrote.
What’s more, remarkably similar to the visual cortex, “higher” levels of the ANNs showed more of the closure effect than lower layers, and—perhaps unsurprisingly—the more layers a network had, the more it exhibited the closure effect.
As the networks learned, their ability to map out objects from fragments also improved. When the team messed around with the brightness and contrast of the images, the AI still learned to see the forest from the trees.
“Our findings suggest that neural networks trained with natural images do exhibit closure,” the team concluded.
That’s not to say that ANNs recapitulate the human brain. As Google’s Deep Dream, an effort to coax AIs into spilling what they’re perceiving, clearly demonstrates, machine vision sees some truly weird stuff.
In contrast, because they’re modeled after the human visual cortex, perhaps it’s not all that surprising that these networks also exhibit higher-level properties inherent to how we process information.
But to Kim and her colleagues, that’s exactly the point.
“The field of psychology has developed useful tools and insights to study human brains– tools that we may be able to borrow to analyze artificial neural networks,” they wrote.
By tweaking these tools to better analyze machine minds, the authors were able to gain insight on how similarly or differently they see the world from us. And that’s the crux: the point isn’t to say that ANNs perceive the world sort of, kind of, maybe similar to humans. It’s to tap into a wealth of cognitive psychology tools, established over decades using human minds, to probe that of ANNs.
“The work here is just one step along a much longer path,” the authors conclude.
“Understanding where humans and neural networks differ will be helpful for research on interpretability by enlightening the fundamental differences between the two interesting species.”
Image Credit: Popova Alena / Shutterstock.com Continue reading