Tag Archives: best

#433799 The First Novel Written by AI Is ...

Last year, a novelist went on a road trip across the USA. The trip was an attempt to emulate Jack Kerouac—to go out on the road and find something essential to write about in the experience. There is, however, a key difference between this writer and anyone else talking your ear off in the bar. This writer is just a microphone, a GPS, and a camera hooked up to a laptop and a whole bunch of linear algebra.

People who are optimistic that artificial intelligence and machine learning won’t put us all out of a job say that human ingenuity and creativity will be difficult to imitate. The classic argument is that, just as machines freed us from repetitive manual tasks, machine learning will free us from repetitive intellectual tasks.

This leaves us free to spend more time on the rewarding aspects of our work, pursuing creative hobbies, spending time with loved ones, and generally being human.

In this worldview, creative works like a great novel or symphony, and the emotions they evoke, cannot be reduced to lines of code. Humans retain a dimension of superiority over algorithms.

But is creativity a fundamentally human phenomenon? Or can it be learned by machines?

And if they learn to understand us better than we understand ourselves, could the great AI novel—tailored, of course, to your own predispositions in fiction—be the best you’ll ever read?

Maybe Not a Beach Read
This is the futurist’s view, of course. The reality, as the jury-rigged contraption in Ross Goodwin’s Cadillac for that road trip can attest, is some way off.

“This is very much an imperfect document, a rapid prototyping project. The output isn’t perfect. I don’t think it’s a human novel, or anywhere near it,” Goodwin said of the novel that his machine created. 1 The Road is currently marketed as the first novel written by AI.

Once the neural network has been trained, it can generate any length of text that the author desires, either at random or working from a specific seed word or phrase. Goodwin used the sights and sounds of the road trip to provide these seeds: the novel is written one sentence at a time, based on images, locations, dialogue from the microphone, and even the computer’s own internal clock.

The results are… mixed.

The novel begins suitably enough, quoting the time: “It was nine seventeen in the morning, and the house was heavy.” Descriptions of locations begin according to the Foursquare dataset fed into the algorithm, but rapidly veer off into the weeds, becoming surreal. While experimentation in literature is a wonderful thing, repeatedly quoting longitude and latitude coordinates verbatim is unlikely to win anyone the Booker Prize.

Data In, Art Out?
Neural networks as creative agents have some advantages. They excel at being trained on large datasets, identifying the patterns in those datasets, and producing output that follows those same rules. Music inspired by or written by AI has become a growing subgenre—there’s even a pop album by human-machine collaborators called the Songularity.

A neural network can “listen to” all of Bach and Mozart in hours, and train itself on the works of Shakespeare to produce passable pseudo-Bard. The idea of artificial creativity has become so widespread that there’s even a meme format about forcibly training neural network ‘bots’ on human writing samples, with hilarious consequences—although the best joke was undoubtedly human in origin.

The AI that roamed from New York to New Orleans was an LSTM (long short-term memory) neural net. By default, information contained in individual neurons is preserved, and only small parts can be “forgotten” or “learned” in an individual timestep, rather than neurons being entirely overwritten.

The LSTM architecture performs better than previous recurrent neural networks at tasks such as handwriting and speech recognition. The neural net—and its programmer—looked further in search of literary influences, ingesting 60 million words (360 MB) of raw literature according to Goodwin’s recipe: one third poetry, one third science fiction, and one third “bleak” literature.

In this way, Goodwin has some creative control over the project; the source material influences the machine’s vocabulary and sentence structuring, and hence the tone of the piece.

The Thoughts Beneath the Words
The problem with artificially intelligent novelists is the same problem with conversational artificial intelligence that computer scientists have been trying to solve from Turing’s day. The machines can understand and reproduce complex patterns increasingly better than humans can, but they have no understanding of what these patterns mean.

Goodwin’s neural network spits out sentences one letter at a time, on a tiny printer hooked up to the laptop. Statistical associations such as those tracked by neural nets can form words from letters, and sentences from words, but they know nothing of character or plot.

When talking to a chatbot, the code has no real understanding of what’s been said before, and there is no dataset large enough to train it through all of the billions of possible conversations.

Unless restricted to a predetermined set of options, it loses the thread of the conversation after a reply or two. In a similar way, the creative neural nets have no real grasp of what they’re writing, and no way to produce anything with any overarching coherence or narrative.

Goodwin’s experiment is an attempt to add some coherent backbone to the AI “novel” by repeatedly grounding it with stimuli from the cameras or microphones—the thematic links and narrative provided by the American landscape the neural network drives through.

Goodwin feels that this approach (the car itself moving through the landscape, as if a character) borrows some continuity and coherence from the journey itself. “Coherent prose is the holy grail of natural-language generation—feeling that I had somehow solved a small part of the problem was exhilarating. And I do think it makes a point about language in time that’s unexpected and interesting.”

AI Is Still No Kerouac
A coherent tone and semantic “style” might be enough to produce some vaguely-convincing teenage poetry, as Google did, and experimental fiction that uses neural networks can have intriguing results. But wading through the surreal AI prose of this era, searching for some meaning or motif beyond novelty value, can be a frustrating experience.

Maybe machines can learn the complexities of the human heart and brain, or how to write evocative or entertaining prose. But they’re a long way off, and somehow “more layers!” or a bigger corpus of data doesn’t feel like enough to bridge that gulf.

Real attempts by machines to write fiction have so far been broadly incoherent, but with flashes of poetry—dreamlike, hallucinatory ramblings.

Neural networks might not be capable of writing intricately-plotted works with charm and wit, like Dickens or Dostoevsky, but there’s still an eeriness to trying to decipher the surreal, Finnegans’ Wake mish-mash.

You might see, in the odd line, the flickering ghost of something like consciousness, a deeper understanding. Or you might just see fragments of meaning thrown into a neural network blender, full of hype and fury, obeying rules in an occasionally striking way, but ultimately signifying nothing. In that sense, at least, the RNN’s grappling with metaphor feels like a metaphor for the hype surrounding the latest AI summer as a whole.

Or, as the human author of On The Road put it: “You guys are going somewhere or just going?”

Image Credit: eurobanks / Shutterstock.com Continue reading

Posted in Human Robots

#433785 DeepMind’s Eerie Reimagination of the ...

If a recent project using Google’s DeepMind were a recipe, you would take a pair of AI systems, images of animals, and a whole lot of computing power. Mix it all together, and you’d get a series of imagined animals dreamed up by one of the AIs. A look through the research paper about the project—or this open Google Folder of images it produced—will likely lead you to agree that the results are a mix of impressive and downright eerie.

But the eerie factor doesn’t mean the project shouldn’t be considered a success and a step forward for future uses of AI.

From GAN To BigGAN
The team behind the project consists of Andrew Brock, a PhD student at Edinburgh Center for Robotics, and DeepMind intern and researcher Jeff Donahue and Karen Simonyan.

They used a so-called Generative Adversarial Network (GAN) to generate the images. In a GAN, two AI systems collaborate in a game-like manner. One AI produces images of an object or creature. The human equivalent would be drawing pictures of, for example, a dog—without necessarily knowing what a dog exactly looks like. Those images are then shown to the second AI, which has already been fed images of dogs. The second AI then tells the first one how far off its efforts were. The first one uses this information to improve its images. The two go back and forth in an iterative process, and the goal is for the first AI to become so good at creating images of dogs that the second can’t tell the difference between its creations and actual pictures of dogs.

The team was able to draw on Google’s vast vaults of computational power to create images of a quality and life-like nature that were beyond almost anything seen before. In part, this was achieved by feeding the GAN with more images than is usually the case. According to IFLScience, the standard is to feed about 64 images per subject into the GAN. In this case, the research team fed about 2,000 images per subject into the system, leading to it being nicknamed BigGAN.

Their results showed that feeding the system with more images and using masses of raw computer power markedly increased the GAN’s precision and ability to create life-like renditions of the subjects it was trained to reproduce.

“The main thing these models need is not algorithmic improvements, but computational ones. […] When you increase model capacity and you increase the number of images you show at every step, you get this twofold combined effect,” Andrew Brock told Fast Company.

The Power Drain
The team used 512 of Google’s AI-focused Tensor Processing Units (TPU) to generate 512-pixel images. Each experiment took between 24 and 48 hours to run.

That kind of computing power needs a lot of electricity. As artist and Innovator-In-Residence at the Library of Congress Jer Thorp tongue-in-cheek put it on Twitter: “The good news is that AI can now give you a more believable image of a plate of spaghetti. The bad news is that it used roughly enough energy to power Cleveland for the afternoon.”

Thorp added that a back-of-the-envelope calculation showed that the computations to produce the images would require about 27,000 square feet of solar panels to have adequate power.

BigGAN’s images have been hailed by researchers, with Oriol Vinyals, research scientist at DeepMind, rhetorically asking if these were the ‘Best GAN samples yet?’

However, they are still not perfect. The number of legs on a given creature is one example of where the BigGAN seemed to struggle. The system was good at recognizing that something like a spider has a lot of legs, but seemed unable to settle on how many ‘a lot’ was supposed to be. The same applied to dogs, especially if the images were supposed to show said dogs in motion.

Those eerie images are contrasted by other renditions that show such lifelike qualities that a human mind has a hard time identifying them as fake. Spaniels with lolling tongues, ocean scenery, and butterflies were all rendered with what looks like perfection. The same goes for an image of a hamburger that was good enough to make me stop writing because I suddenly needed lunch.

The Future Use Cases
GAN networks were first introduced in 2014, and given their relative youth, researchers and companies are still busy trying out possible use cases.

One possible use is image correction—making pixillated images clearer. Not only does this help your future holiday snaps, but it could be applied in industries such as space exploration. A team from the University of Michigan and the Max Planck Institute have developed a method for GAN networks to create images from text descriptions. At Berkeley, a research group has used GAN to create an interface that lets users change the shape, size, and design of objects, including a handbag.

For anyone who has seen a film like Wag the Dog or read 1984, the possibilities are also starkly alarming. GANs could, in other words, make fake news look more real than ever before.

For now, it seems that while not all GANs require the computational and electrical power of the BigGAN, there is still some way to reach these potential use cases. However, if there’s one lesson from Moore’s Law and exponential technology, it is that today’s technical roadblock quickly becomes tomorrow’s minor issue as technology progresses.

Image Credit: Ondrej Prosicky/Shutterstock Continue reading

Posted in Human Robots

#433655 First-Ever Grad Program in Space Mining ...

Maybe they could call it the School of Space Rock: A new program being offered at the Colorado School of Mines (CSM) will educate post-graduate students on the nuts and bolts of extracting and using valuable materials such as rare metals and frozen water from space rocks like asteroids or the moon.

Officially called Space Resources, the graduate-level program is reputedly the first of its kind in the world to offer a course in the emerging field of space mining. Heading the program is Angel Abbud-Madrid, director of the Center for Space Resources at Mines, a well-known engineering school located in Golden, Colorado, where Molson Coors taps Rocky Mountain spring water for its earthly brews.

The first semester for the new discipline began last month. While Abbud-Madrid didn’t immediately respond to an interview request, Singularity Hub did talk to Chris Lewicki, president and CEO of Planetary Resources, a space mining company whose founders include Peter Diamandis, Singularity University co-founder.

A former NASA engineer who worked on multiple Mars missions, Lewicki says the Space Resources program at CSM, with its multidisciplinary focus on science, economics, and policy, will help students be light years ahead of their peers in the nascent field of space mining.

“I think it’s very significant that they’ve started this program,” he said. “Having students with that kind of background exposure just allows them to be productive on day one instead of having to kind of fill in a lot of things for them.”

Who would be attracted to apply for such a program? There are many professionals who could be served by a post-baccalaureate certificate, master’s degree, or even Ph.D. in Space Resources, according to Lewicki. Certainly aerospace engineers and planetary scientists would be among the faces in the classroom.

“I think it’s [also] people who have an interest in what I would call maybe space robotics,” he said. Lewicki is referring not only to the classic example of robotic arms like the Canadarm2, which lends a hand to astronauts aboard the International Space Station, but other types of autonomous platforms.

One example might be Planetary Resources’ own Arkyd-6, a small, autonomous satellite called a CubeSat launched earlier this year to test different technologies that might be used for deep-space exploration of resources. The proof-of-concept was as much a test for the technology—such as the first space-based use of a mid-wave infrared imager to detect water resources—as it was for being able to work in space on a shoestring budget.

“We really proved that doing one of these billion-dollar science missions to deep space can be done for a lot less if you have a very focused goal, and if you kind of cut a lot of corners and then put some commercial approaches into those things,” Lewicki said.

A Trillion-Dollar Industry
Why space mining? There are at least a trillion reasons.

Astrophysicist Neil deGrasse Tyson famously said that the first trillionaire will be the “person who exploits the natural resources on asteroids.” That’s because asteroids—rocky remnants from the formation of our solar system more than four billion years ago—harbor precious metals, ranging from platinum and gold to iron and nickel.

For instance, one future target of exploration by NASA—an asteroid dubbed 16 Psyche, orbiting the sun in the asteroid belt between Mars and Jupiter—is worth an estimated $10,000 quadrillion. It’s a number so mind-bogglingly big that it would crash the global economy, if someone ever figured out how to tow it back to Earth without literally crashing it into the planet.

Living Off the Land
Space mining isn’t just about getting rich. Many argue that humanity’s ability to extract resources in space, especially water that can be refined into rocket fuel, will be a key technology to extend our reach beyond near-Earth space.

The presence of frozen water around the frigid polar regions of the moon, for example, represents an invaluable source to power future deep-space missions. Splitting H20 into its component elements of hydrogen and oxygen would provide a nearly inexhaustible source of rocket fuel. Today, it costs $10,000 to put a pound of payload in Earth orbit, according to NASA.

Until more advanced rocket technology is developed, the moon looks to be the best bet for serving as the launching pad to Mars and beyond.

Moon Versus Asteroid
However, Lewicki notes that despite the moon’s proximity and our more intimate familiarity with its pockmarked surface, that doesn’t mean a lunar mission to extract resources is any easier than a multi-year journey to a fast-moving asteroid.

For one thing, fighting gravity to and from the moon is no easy feat, as the moon has a significantly stronger gravitational field than an asteroid. Another challenge is that the frozen water is located in permanently shadowed lunar craters, meaning space miners can’t rely on solar-powered equipment, but on some sort of external energy source.

And then there’s the fact that moon craters might just be the coldest places in the solar system. NASA’s Lunar Reconnaissance Orbiter found temperatures plummeted as low as 26 Kelvin, or more than minus 400 degrees Fahrenheit. In comparison, the coldest temperatures on Earth have been recorded near the South Pole in Antarctica—about minus 148 degrees F.

“We don’t operate machines in that kind of thermal environment,” Lewicki said of the extreme temperatures detected in the permanent dark regions of the moon. “Antarctica would be a balmy desert island compared to a lunar polar crater.”

Of course, no one knows quite what awaits us in the asteroid belt. Answers may soon be forthcoming. Last week, the Japan Aerospace Exploration Agency landed two small, hopping rovers on an asteroid called Ryugu. Meanwhile, NASA hopes to retrieve a sample from the near-Earth asteroid Bennu when its OSIRIS-REx mission makes contact at the end of this year.

No Bucks, No Buck Rogers
Visionaries like Elon Musk and Jeff Bezos talk about colonies on Mars, with millions of people living and working in space. The reality is that there’s probably a reason Buck Rogers was set in the 25th century: It’s going to take a lot of money and a lot of time to realize those sci-fi visions.

Or, as Lewicki put it: “No bucks, no Buck Rogers.”

The cost of operating in outer space can be prohibitive. Planetary Resources itself is grappling with raising additional funding, with reports this year about layoffs and even a possible auction of company assets.

Still, Lewicki is confident that despite economic and technical challenges, humanity will someday exceed even the boldest dreamers—skyscrapers on the moon, interplanetary trips to Mars—as judged against today’s engineering marvels.

“What we’re doing is going to be very hard, very painful, and almost certainly worth it,” he said. “Who would have thought that there would be a job for a space miner that you could go to school for, even just five or ten years ago. Things move quickly.”

Image Credit: M-SUR / Shutterstock.com Continue reading

Posted in Human Robots

#433620 Instilling the Best of Human Values in ...

Now that the era of artificial intelligence is unquestionably upon us, it behooves us to think and work harder to ensure that the AIs we create embody positive human values.

Science fiction is full of AIs that manifest the dark side of humanity, or are indifferent to humans altogether. Such possibilities cannot be ruled out, but nor is there any logical or empirical reason to consider them highly likely. I am among a large group of AI experts who see a strong potential for profoundly positive outcomes in the AI revolution currently underway.

We are facing a future with great uncertainty and tremendous promise, and the best we can do is to confront it with a combination of heart and mind, of common sense and rigorous science. In the realm of AI, what this means is, we need to do our best to guide the AI minds we are creating to embody the values we cherish: love, compassion, creativity, and respect.

The quest for beneficial AI has many dimensions, including its potential to reduce material scarcity and to help unlock the human capacity for love and compassion.

Reducing Scarcity
A large percentage of difficult issues in human society, many of which spill over into the AI domain, would be palliated significantly if material scarcity became less of a problem. Fortunately, AI has great potential to help here. AI is already increasing efficiency in nearly every industry.

In the next few decades, as nanotech and 3D printing continue to advance, AI-driven design will become a larger factor in the economy. Radical new tools like artificial enzymes built using Christian Schafmeister’s spiroligomer molecules, and designed using quantum physics-savvy AIs, will enable the creation of new materials and medicines.

For amazing advances like the intersection of AI and nanotech to lead toward broadly positive outcomes, however, the economic and political aspects of the AI industry may have to shift from the current status quo.

Currently, most AI development occurs under the aegis of military organizations or large corporations oriented heavily toward advertising and marketing. Put crudely, an awful lot of AI today is about “spying, brainwashing, or killing.” This is not really the ideal situation if we want our first true artificial general intelligences to be open-minded, warm-hearted, and beneficial.

Also, as the bulk of AI development now occurs in large for-profit organizations bound by law to pursue the maximization of shareholder value, we face a situation where AI tends to exacerbate global wealth inequality and class divisions. This has the potential to lead to various civilization-scale failure modes involving the intersection of geopolitics, AI, cyberterrorism, and so forth. Part of my motivation for founding the decentralized AI project SingularityNET was to create an alternative mode of dissemination and utilization of both narrow AI and AGI—one that operates in a self-organizing way, outside of the direct grip of conventional corporate and governmental structures.

In the end, though, I worry that radical material abundance and novel political and economic structures may fail to create a positive future, unless they are coupled with advances in consciousness and compassion. AGIs have the potential to be massively more ethical and compassionate than humans. But still, the odds of getting deeply beneficial AGIs seem higher if the humans creating them are fuller of compassion and positive consciousness—and can effectively pass these values on.

Transmitting Human Values
Brain-computer interfacing is another critical aspect of the quest for creating more positive AIs and more positive humans. As Elon Musk has put it, “If you can’t beat ’em, join’ em.” Joining is more fun than beating anyway. What better way to infuse AIs with human values than to connect them directly to human brains, and let them learn directly from the source (while providing humans with valuable enhancements)?

Millions of people recently heard Elon Musk discuss AI and BCI on the Joe Rogan podcast. Musk’s embrace of brain-computer interfacing is laudable, but he tends to dodge some of the tough issues—for instance, he does not emphasize the trade-off cyborgs will face between retaining human-ness and maximizing intelligence, joy, and creativity. To make this trade-off effectively, the AI portion of the cyborg will need to have a deep sense of human values.

Musk calls humanity the “biological boot loader” for AGI, but to me this colorful metaphor misses a key point—that we can seed the AGI we create with our values as an initial condition. This is one reason why it’s important that the first really powerful AGIs are created by decentralized networks, and not conventional corporate or military organizations. The decentralized software/hardware ecosystem, for all its quirks and flaws, has more potential to lead to human-computer cybernetic collective minds that are reasonable and benevolent.

Algorithmic Love
BCI is still in its infancy, but a more immediate way of connecting people with AIs to infuse both with greater love and compassion is to leverage humanoid robotics technology. Toward this end, I conceived a project called Loving AI, focused on using highly expressive humanoid robots like the Hanson robot Sophia to lead people through meditations and other exercises oriented toward unlocking the human potential for love and compassion. My goals here were to explore the potential of AI and robots to have a positive impact on human consciousness, and to use this application to study and improve the OpenCog and SingularityNET tools used to control Sophia in these interactions.

The Loving AI project has now run two small sets of human trials, both with exciting and positive results. These have been small—dozens rather than hundreds of people—but have definitively proven the point. Put a person in a quiet room with a humanoid robot that can look them in the eye, mirror their facial expressions, recognize some of their emotions, and lead them through simple meditation, listening, and consciousness-oriented exercises…and quite a lot of the time, the result is a more relaxed person who has entered into a shifted state of consciousness, at least for a period of time.

In a certain percentage of cases, the interaction with the robot consciousness guide triggered a dramatic change of consciousness in the human subject—a deep meditative trance state, for instance. In most cases, the result was not so extreme, but statistically the positive effect was quite significant across all cases. Furthermore, a similar effect was found using an avatar simulation of the robot’s face on a tablet screen (together with a webcam for facial expression mirroring and recognition), but not with a purely auditory interaction.

The Loving AI experiments are not only about AI; they are about human-robot and human-avatar interaction, with AI as one significant aspect. The facial interaction with the robot or avatar is pushing “biological buttons” that trigger emotional reactions and prime the mind for changes of consciousness. However, this sort of body-mind interaction is arguably critical to human values and what it means to be human; it’s an important thing for robots and AIs to “get.”

Halting or pausing the advance of AI is not a viable possibility at this stage. Despite the risks, the potential economic and political benefits involved are clear and massive. The convergence of narrow AI toward AGI is also a near inevitability, because there are so many important applications where greater generality of intelligence will lead to greater practical functionality. The challenge is to make the outcome of this great civilization-level adventure as positive as possible.

Image Credit: Anton Gvozdikov / Shutterstock.com Continue reading

Posted in Human Robots

#433506 MIT’s New Robot Taught Itself to Pick ...

Back in 2016, somewhere in a Google-owned warehouse, more than a dozen robotic arms sat for hours quietly grasping objects of various shapes and sizes. For hours on end, they taught themselves how to pick up and hold the items appropriately—mimicking the way a baby gradually learns to use its hands.

Now, scientists from MIT have made a new breakthrough in machine learning: their new system can not only teach itself to see and identify objects, but also understand how best to manipulate them.

This means that, armed with the new machine learning routine referred to as “dense object nets (DON),” the robot would be capable of picking up an object that it’s never seen before, or in an unfamiliar orientation, without resorting to trial and error—exactly as a human would.

The deceptively simple ability to dexterously manipulate objects with our hands is a huge part of why humans are the dominant species on the planet. We take it for granted. Hardware innovations like the Shadow Dexterous Hand have enabled robots to softly grip and manipulate delicate objects for many years, but the software required to control these precision-engineered machines in a range of circumstances has proved harder to develop.

This was not for want of trying. The Amazon Robotics Challenge offers millions of dollars in prizes (and potentially far more in contracts, as their $775m acquisition of Kiva Systems shows) for the best dexterous robot able to pick and package items in their warehouses. The lucrative dream of a fully-automated delivery system is missing this crucial ability.

Meanwhile, the Robocup@home challenge—an offshoot of the popular Robocup tournament for soccer-playing robots—aims to make everyone’s dream of having a robot butler a reality. The competition involves teams drilling their robots through simple household tasks that require social interaction or object manipulation, like helping to carry the shopping, sorting items onto a shelf, or guiding tourists around a museum.

Yet all of these endeavors have proved difficult; the tasks often have to be simplified to enable the robot to complete them at all. New or unexpected elements, such as those encountered in real life, more often than not throw the system entirely. Programming the robot’s every move in explicit detail is not a scalable solution: this can work in the highly-controlled world of the assembly line, but not in everyday life.

Computer vision is improving all the time. Neural networks, including those you train every time you prove that you’re not a robot with CAPTCHA, are getting better at sorting objects into categories, and identifying them based on sparse or incomplete data, such as when they are occluded, or in different lighting.

But many of these systems require enormous amounts of input data, which is impractical, slow to generate, and often needs to be laboriously categorized by humans. There are entirely new jobs that require people to label, categorize, and sift large bodies of data ready for supervised machine learning. This can make machine learning undemocratic. If you’re Google, you can make thousands of unwitting volunteers label your images for you with CAPTCHA. If you’re IBM, you can hire people to manually label that data. If you’re an individual or startup trying something new, however, you will struggle to access the vast troves of labeled data available to the bigger players.

This is why new systems that can potentially train themselves over time or that allow robots to deal with situations they’ve never seen before without mountains of labelled data are a holy grail in artificial intelligence. The work done by MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is part of a new wave of “self-supervised” machine learning systems—little of the data used was labeled by humans.

The robot first inspects the new object from multiple angles, building up a 3D picture of the object with its own coordinate system. This then allows the robotic arm to identify a particular feature on the object—such as a handle, or the tongue of a shoe—from various different angles, based on its relative distance to other grid points.

This is the real innovation: the new means of representing objects to grasp as mapped-out 3D objects, with grid points and subsections of their own. Rather than using a computer vision algorithm to identify a door handle, and then activating a door handle grasping subroutine, the DON system treats all objects by making these spatial maps before classifying or manipulating them, enabling it to deal with a greater range of objects than in other approaches.

“Many approaches to manipulation can’t identify specific parts of an object across the many orientations that object may encounter,” said PhD student Lucas Manuelli, who wrote a new paper about the system with lead author and fellow student Pete Florence, alongside MIT professor Russ Tedrake. “For example, existing algorithms would be unable to grasp a mug by its handle, especially if the mug could be in multiple orientations, like upright, or on its side.”

Class-specific descriptors, which can be applied to the object features, can allow the robot arm to identify a mug, find the handle, and pick the mug up appropriately. Object-specific descriptors allow the robot arm to select a particular mug from a group of similar items. I’m already dreaming of a robot butler reliably picking my favourite mug when it serves me coffee in the morning.

Google’s robot arm-y was an attempt to develop a general grasping algorithm: one that could identify, categorize, and appropriately grip as many items as possible. This requires a great deal of training time and data, which is why Google parallelized their project by having 14 robot arms feed data into a single neural network brain: even then, the algorithm may fail with highly specific tasks. Specialist grasping algorithms might require less training if they’re limited to specific objects, but then your software is useless for general tasks.

As the roboticists noted, their system, with its ability to identify parts of an object rather than just a single object, is better suited to specific tasks, such as “grasp the racquet by the handle,” than Amazon Robotics Challenge robots, which identify whole objects by segmenting an image.

This work is small-scale at present. It has been tested with a few classes of objects, including shoes, hats, and mugs. Yet the use of these dense object nets as a way for robots to represent and manipulate new objects may well be another step towards the ultimate goal of generalized automation: a robot capable of performing every task a person can. If that point is reached, the question that will remain is how to cope with being obsolete.

Image Credit: Tom Buehler/CSAIL Continue reading

Posted in Human Robots