Robots have been masters of manufacturing at speed and precision for decades, but give them a seemingly simple task like stacking shelves, and they quickly get stuck. That’s changing, though, as engineers build systems that can take on the deceptively tricky tasks most humans can do with their eyes closed.
Boston Dynamics is famous for dramatic reveals of robots performing mind-blowing feats that also leave you scratching your head as to what the market is—think the bipedal Atlas doing backflips or Spot the galloping robot dog.
Last week, the company released a video of a robot called Handle, which looks like an ostrich on wheels, carrying out the seemingly mundane task of stacking boxes in a warehouse.
It might seem like a step backward, but this is exactly the kind of practical task robots have long struggled with. While the speed and precision of industrial robots have seen them take over many functions in modern factories, they’re generally limited to highly prescribed tasks carried out in meticulously controlled environments.
That’s because despite their mechanical sophistication, most are still surprisingly dumb. They can carry out precision welding on a car or rapidly assemble electronics, but only by rigidly following a prescribed set of motions. Moving cardboard boxes around a warehouse might seem simple to a human, but it actually involves a variety of tasks machines still find pretty difficult—perceiving your surroundings, navigating, and interacting with objects in a dynamic environment.
But the release of this video suggests Boston Dynamics thinks these kinds of applications are close to prime time. Last week the company doubled down by announcing the acquisition of start-up Kinema Systems, which builds computer vision systems for robots working in warehouses.
It’s not the only company making strides in this area. On the same day the video went live, Google unveiled a robot arm called TossingBot that can pick random objects from a box and quickly toss them into another container beyond its reach, which could prove very useful for sorting items in a warehouse. The machine can train on new objects in just an hour or two, and can pick and toss up to 500 items an hour with better accuracy than any of the humans who tried the task.
And an apple-picking robot built by Abundant Robotics is currently at work on New Zealand farms, navigating between rows of apple trees using LIDAR and computer vision to single out ripe apples, then using a vacuum tube to suck them off the tree.
In most cases, advances in machine learning and computer vision brought about by the recent AI boom are the keys to these rapidly improving capabilities. Robots have historically had to be painstakingly programmed by humans to solve each new task, but deep learning is making it possible for them to quickly train themselves on a variety of perception, navigation, and dexterity tasks.
It’s not been simple, though, and the application of deep learning in robotics has lagged behind other areas. A major limitation is that the process typically requires huge amounts of training data. That’s fine when you’re dealing with image classification, but when that data needs to be generated by real-world robots it can make the approach impractical. Simulations offer the possibility to run this training faster than real time, but it’s proved difficult to translate policies learned in virtual environments into the real world.
Recent years have seen significant progress on these fronts, though, along with the increasing integration of modern machine learning with robotics. In October, OpenAI imbued a robotic hand with human-level dexterity by training an algorithm in a simulation using reinforcement learning before transferring it to the real-world device. The key to ensuring the translation went smoothly was injecting random noise into the simulation to mimic some of the unpredictability of the real world.
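To make the idea of injecting randomness into a simulation more concrete, here is a minimal sketch in Python. It is not OpenAI's code: the parameter names (`friction`, `object_mass`, `sensor_noise_std`), their nominal values, and the 30 percent spread are all hypothetical, chosen only to illustrate how each training episode can be given slightly different physics and sensor noise.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Hypothetical physical settings for one simulated episode."""
    friction: float
    object_mass: float
    sensor_noise_std: float


# Assumed "best guess" values for the real world (illustrative only).
NOMINAL = SimParams(friction=0.8, object_mass=0.1, sensor_noise_std=0.0)


def randomize(nominal, rng, spread=0.3):
    """Sample perturbed parameters around the nominal ones."""
    def jitter(value):
        return value * rng.uniform(1.0 - spread, 1.0 + spread)

    return SimParams(
        friction=jitter(nominal.friction),
        object_mass=jitter(nominal.object_mass),
        sensor_noise_std=rng.uniform(0.0, 0.02),
    )


def noisy_observation(true_value, params, rng):
    """Corrupt a simulated sensor reading with Gaussian noise."""
    return true_value + rng.gauss(0.0, params.sensor_noise_std)


def training_episodes(n, seed=0):
    """Yield a freshly randomized set of parameters for each episode."""
    rng = random.Random(seed)
    for _ in range(n):
        yield randomize(NOMINAL, rng)
```

A policy trained across many such episodes never sees exactly the same simulated world twice, so it has less opportunity to rely on quirks of any single one.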
And just a couple of weeks ago, MIT researchers demonstrated a new technique that let a robot arm learn to manipulate new objects with far less training data than is usually required. By getting the algorithm to focus on a few key points on the object necessary for picking it up, the system could learn to pick up a previously unseen object after seeing only a few dozen examples (rather than the hundreds or thousands typically required).
How quickly these innovations will trickle down to practical applications remains to be seen, but a number of startups as well as logistics behemoth Amazon are developing robots designed to flexibly pick and place the wide variety of items found in your average warehouse.
Whether the economics of using robots to replace humans at these kinds of menial tasks makes sense yet is still unclear. The collapse of collaborative robotics pioneer Rethink Robotics last year suggests there are still plenty of challenges.
But at the same time, the number of robotic warehouses is expected to leap from 4,000 today to 50,000 by 2025. It may not be long until robots are muscling in on tasks we’ve long assumed only humans could do.
Image Credit: Visual Generation / Shutterstock.com
If a recent project using Google’s DeepMind were a recipe, you would take a pair of AI systems, images of animals, and a whole lot of computing power. Mix it all together, and you’d get a series of imagined animals dreamed up by one of the AIs. A look through the research paper about the project—or this open Google Folder of images it produced—will likely lead you to agree that the results are a mix of impressive and downright eerie.
But the eerie factor doesn’t mean the project shouldn’t be considered a success and a step forward for future uses of AI.
From GAN To BigGAN
The team behind the project consists of Andrew Brock, a PhD student at the Edinburgh Centre for Robotics and DeepMind intern, and DeepMind researchers Jeff Donahue and Karen Simonyan.
They used a so-called Generative Adversarial Network (GAN) to generate the images. In a GAN, two AI systems compete in a game-like manner. One AI produces images of an object or creature. The human equivalent would be drawing pictures of, for example, a dog, without necessarily knowing exactly what a dog looks like. Those images are then shown to the second AI, which has already been fed images of dogs. The second AI then tells the first one how far off its efforts were. The first one uses this information to improve its images. The two go back and forth in an iterative process, and the goal is for the first AI to become so good at creating images of dogs that the second can’t tell the difference between its creations and actual pictures of dogs.
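The back-and-forth described above can be sketched with a toy example in Python. This is not BigGAN, which uses deep neural networks on images. As a simplifying assumption, each "AI" here is reduced to two numbers and the "images" to single values, so that the feedback loop is easy to follow. All function and variable names are illustrative.

```python
import math
import random


def sigmoid(t):
    """Logistic function, clamped to avoid overflow in math.exp."""
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def discriminate(d, x):
    """Second AI: estimated probability that sample x is real."""
    w, c = d
    return sigmoid(w * x + c)


def generate(g, z):
    """First AI: turn random noise z into a fake sample."""
    a, b = g
    return a * z + b


def train_step(g, d, x_real, z, lr=0.1):
    """One round of the back-and-forth; returns the updated (g, d)."""
    a, b = g
    w, c = d
    x_fake = generate(g, z)

    # Discriminator update: push D(x_real) toward 1 and D(x_fake) toward 0.
    p_real = discriminate(d, x_real)
    p_fake = discriminate(d, x_fake)
    grad_w = (p_real - 1.0) * x_real + p_fake * x_fake
    grad_c = (p_real - 1.0) + p_fake
    w, c = w - lr * grad_w, c - lr * grad_c

    # Generator update: use the discriminator's feedback to make the
    # fake sample look more real, i.e. to raise D(x_fake).
    p_fake = discriminate((w, c), x_fake)
    grad_x = (p_fake - 1.0) * w  # gradient of -log D(x_fake) w.r.t. x_fake
    a, b = a - lr * grad_x * z, b - lr * grad_x
    return (a, b), (w, c)


if __name__ == "__main__":
    rng = random.Random(0)
    g, d = (1.0, 0.0), (0.0, 0.0)
    for _ in range(2000):
        x_real = rng.gauss(4.0, 0.5)  # "real data": numbers near 4
        g, d = train_step(g, d, x_real, rng.gauss(0.0, 1.0), lr=0.05)
    print("generator (scale, offset):", g)
```

Each call to `train_step` first lets the second system learn from one real and one fake sample, then lets the first system adjust in the direction that would fool it. Toy setups like this one can oscillate rather than settle, which hints at why training GANs at scale is delicate.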
The team was able to draw on Google’s vast vaults of computational power to create images of a quality and life-like nature that were beyond almost anything seen before. In part, this was achieved by feeding the GAN with more images than is usually the case. According to IFLScience, the standard is to feed about 64 images per subject into the GAN. In this case, the research team fed about 2,000 images per subject into the system, leading to it being nicknamed BigGAN.
Their results showed that feeding the system with more images and using masses of raw computer power markedly increased the GAN’s precision and ability to create life-like renditions of the subjects it was trained to reproduce.
“The main thing these models need is not algorithmic improvements, but computational ones. […] When you increase model capacity and you increase the number of images you show at every step, you get this twofold combined effect,” Andrew Brock told Fast Company.
The Power Drain
The team used 512 of Google’s AI-focused Tensor Processing Units (TPUs) to generate 512-pixel images. Each experiment took between 24 and 48 hours to run.
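As a rough illustration of the scale, the figures above can be turned into device-hours with a few lines of Python. The calculation assumes all 512 TPUs were busy for the full duration of an experiment, which the article does not state.

```python
# Figures quoted in the article.
TPU_COUNT = 512
HOURS_MIN, HOURS_MAX = 24, 48


def tpu_hours(devices, hours):
    """Total device-hours, assuming every device runs the whole time."""
    return devices * hours


low = tpu_hours(TPU_COUNT, HOURS_MIN)
high = tpu_hours(TPU_COUNT, HOURS_MAX)
print(f"{low:,} to {high:,} TPU-hours per experiment")
```

Under that assumption, a single experiment consumed somewhere between 12,288 and 24,576 TPU-hours.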
That kind of computing power needs a lot of electricity. As Jer Thorp, artist and Innovator-in-Residence at the Library of Congress, put it tongue-in-cheek on Twitter: “The good news is that AI can now give you a more believable image of a plate of spaghetti. The bad news is that it used roughly enough energy to power Cleveland for the afternoon.”
Thorp added that a back-of-the-envelope calculation showed that the computations to produce the images would require about 27,000 square feet of solar panels to have adequate power.
BigGAN’s images have been hailed by researchers, with Oriol Vinyals, research scientist at DeepMind, rhetorically asking if these were the ‘Best GAN samples yet?’
However, they are still not perfect. The number of legs on a given creature is one example of where the BigGAN seemed to struggle. The system was good at recognizing that something like a spider has a lot of legs, but seemed unable to settle on how many ‘a lot’ was supposed to be. The same applied to dogs, especially if the images were supposed to show said dogs in motion.
Those eerie images contrast with other renditions that are so lifelike that a human mind has a hard time identifying them as fake. Spaniels with lolling tongues, ocean scenery, and butterflies were all rendered with what looks like perfection. The same goes for an image of a hamburger that was good enough to make me stop writing because I suddenly needed lunch.
The Future Use Cases
GANs were first introduced in 2014, and given their relative youth, researchers and companies are still busy trying out possible use cases.
One possible use is image correction: making pixelated images clearer. Not only does this help your future holiday snaps, but it could be applied in industries such as space exploration. A team from the University of Michigan and the Max Planck Institute has developed a method for GANs to create images from text descriptions. At Berkeley, a research group has used a GAN to create an interface that lets users change the shape, size, and design of objects, including a handbag.
For anyone who has seen a film like Wag the Dog or read 1984, the possibilities are also starkly alarming. GANs could, in other words, make fake news look more real than ever before.
For now, it seems that while not all GANs require the computational and electrical power of BigGAN, there is still some way to go before these potential use cases are realized. However, if there’s one lesson from Moore’s Law and exponential technology, it is that today’s technical roadblock quickly becomes tomorrow’s minor issue as technology progresses.
Image Credit: Ondrej Prosicky/Shutterstock