Autonomous vehicles can follow the general rules of American roads, recognizing traffic signals and lane markings, noticing crosswalks and other regular features of the streets. But they work only on well-marked roads that are carefully scanned and mapped in advance.
Many paved roads, though, have faded paint, signs obscured behind trees and unusual intersections. In addition, 1.4 million miles of U.S. roads—one-third of the country’s public roadways—are unpaved, with no on-road signals like lane markings or stop-here lines. That doesn’t include miles of private roads, unpaved driveways or off-road trails.
What’s a rule-following autonomous car to do when the rules are unclear or nonexistent? And what are its passengers to do when they discover their vehicle can’t get them where they’re going?
Accounting for the Obscure
Most challenges in developing advanced technologies involve handling infrequent or uncommon situations, or events that require performance beyond a system’s normal capabilities. That’s definitely true for autonomous vehicles. Some on-road examples might be navigating construction zones, encountering a horse and buggy, or seeing graffiti that looks like a stop sign. Off-road, the possibilities include the full variety of the natural world, such as trees down over the road, flooding and large puddles—or even animals blocking the way.
At Mississippi State University’s Center for Advanced Vehicular Systems, we have taken up the challenge of training algorithms to respond to circumstances that almost never happen, are difficult to predict and are complex to create. We seek to put autonomous cars in the hardest possible scenario: driving in an area the car has no prior knowledge of, with no reliable infrastructure like road paint and traffic signs, and in an unknown environment where it’s just as likely to see a cactus as a polar bear.
Our work combines virtual technology and the real world. We create advanced simulations of lifelike outdoor scenes, which we use to train artificial intelligence algorithms to take a camera feed and classify what it sees, labeling trees, sky, open paths and potential obstacles. Then we transfer those algorithms to a purpose-built all-wheel-drive test vehicle and send it out on our dedicated off-road test track, where we can see how our algorithms work and collect more data to feed into our simulations.
We have developed a simulator that can create a wide range of realistic outdoor scenes for vehicles to navigate through. The system generates a range of landscapes of different climates, like forests and deserts, and can show how plants, shrubs and trees grow over time. It can also simulate weather changes, sunlight and moonlight, and the accurate locations of 9,000 stars.
The system also simulates the readings of sensors commonly used in autonomous vehicles, such as lidar and cameras. Those virtual sensors collect data that feeds into neural networks as valuable training data.
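As a rough illustration of what a virtual lidar does, the sketch below casts a handful of beams from a sensor at the origin and reports the distance to the nearest obstacle along each one. It is a minimal 2D toy, not the Mississippi State simulator: the function name, the circular "tree trunk" obstacles, and the beam count are all assumptions made for this example.

```python
import math

def simulate_lidar_scan(obstacles, num_beams=8, max_range=20.0):
    """Return one range reading per beam for a sensor at the origin.

    obstacles: list of (cx, cy, radius) circles, a stand-in for tree trunks.
    Beams are spread evenly over 360 degrees. A beam that hits nothing
    reports max_range. The sensor is assumed to sit outside every circle.
    """
    ranges = []
    for i in range(num_beams):
        angle = 2.0 * math.pi * i / num_beams
        dx, dy = math.cos(angle), math.sin(angle)
        nearest = max_range
        for cx, cy, r in obstacles:
            # Solve |t*d - c|^2 = r^2 for the distance t along the beam.
            proj = dx * cx + dy * cy  # circle centre projected onto the beam
            disc = proj * proj - (cx * cx + cy * cy - r * r)
            if disc < 0:
                continue  # the beam misses this circle
            t = proj - math.sqrt(disc)  # nearer of the two intersections
            if 0.0 < t < nearest:
                nearest = t
        ranges.append(nearest)
    return ranges

# One trunk of radius 1 m centred 5 m straight ahead, scanned with four beams.
scan = simulate_lidar_scan([(5.0, 0.0, 1.0)], num_beams=4)
```

Only the forward beam hits the trunk, at 4 m; the other three report the maximum range. Lists of readings like this, paired with the known contents of the scene, are the kind of labeled data a simulator can hand to a neural network.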
Simulated desert, meadow and forest environments generated by the Mississippi State University Autonomous Vehicle Simulator. Chris Goodin, Mississippi State University, Author provided.
Building a Test Track
Simulations are only as good as their portrayals of the real world. Mississippi State University has purchased 50 acres of land on which we are developing a test track for off-road autonomous vehicles. The property is excellent for off-road testing, with unusually steep grades for our area of Mississippi—up to 60 percent inclines—and a very diverse population of plants.
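For readers who think in degrees rather than percent grade, the conversion is a one-line arctangent, since grade is rise divided by run. The helper name below is made up for this example.

```python
import math

def grade_to_degrees(grade_percent):
    """Convert a road grade (rise over run, in percent) to an angle in degrees."""
    return math.degrees(math.atan(grade_percent / 100.0))

angle = grade_to_degrees(60)  # roughly 31 degrees
```

A 60 percent incline therefore works out to a slope of about 31 degrees.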
We have selected certain natural features of this land that we expect will be particularly challenging for self-driving vehicles, and replicated them exactly in our simulator. That allows us to directly compare results from the simulation and real-life attempts to navigate the actual land. Eventually, we’ll create similar real and virtual pairings of other types of landscapes to improve our vehicle’s capabilities.
A road washout, as seen in real life, left, and in simulation. Chris Goodin, Mississippi State University, Author provided.
Collecting More Data
We have also built a test vehicle, called the Halo Project, which has an electric motor and sensors and computers that can navigate various off-road environments. The Halo Project car has additional sensors to collect detailed data about its actual surroundings, which can help us build virtual environments to run new tests in.
The Halo Project car can collect data about driving and navigating in rugged terrain. Beth Newman Wynn, Mississippi State University, Author provided.
Two of its lidar sensors, for example, are mounted at intersecting angles on the front of the car so their beams sweep across the approaching ground. Together, they can provide information on how rough or smooth the surface is, as well as capturing readings from grass and other plants and items on the ground.
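One simple way to turn ground returns into a roughness number is to fit a straight line through the measured heights and see how far the points scatter around it. The sketch below does that for (distance ahead, height) samples. It is an assumption about how such a measure could be computed, not a description of the Halo Project's actual processing, and it needs at least two samples at different distances.

```python
import math

def surface_roughness(points):
    """RMS height deviation from a least-squares line through (x, z) samples."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in points)
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # Residuals measure bumps relative to the overall slope of the ground.
    residuals = [z - (intercept + slope * x) for x, z in points]
    return math.sqrt(sum(r * r for r in residuals) / n)

smooth_ramp = [(x, 0.1 * x) for x in range(1, 6)]           # steady incline
bumpy = [(1, 0.0), (2, 0.2), (3, 0.0), (4, 0.2), (5, 0.0)]  # alternating bumps
```

Fitting the line first means a smooth but steep ramp scores near zero, while the bumpy profile scores close to 0.1 m, so slope and roughness are not confused with each other.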
Lidar beams intersect, scanning the ground in front of the vehicle. Chris Goodin, Mississippi State University, Author provided.
We’ve seen some exciting early results from our research. For example, we have shown promising preliminary results that machine learning algorithms trained on simulated environments can be useful in the real world. As with most autonomous vehicle research, there is still a long way to go, but our hope is that the technologies we’re developing for extreme cases will also help make autonomous vehicles more functional on today’s roads.
Matthew Doude, Associate Director, Center for Advanced Vehicular Systems; Ph.D. Student in Industrial and Systems Engineering, Mississippi State University; Christopher Goodin, Assistant Research Professor, Center for Advanced Vehicular Systems, Mississippi State University, and Daniel Carruth, Assistant Research Professor and Associate Director for Human Factors and Advanced Vehicle System, Center for Advanced Vehicular Systems, Mississippi State University
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Photo provided for The Conversation by Matthew Goudin / CC BY ND
Last year, a novelist went on a road trip across the USA. The trip was an attempt to emulate Jack Kerouac—to go out on the road and find something essential to write about in the experience. There is, however, a key difference between this writer and anyone else talking your ear off in the bar. This writer is just a microphone, a GPS, and a camera hooked up to a laptop and a whole bunch of linear algebra.
People who are optimistic that artificial intelligence and machine learning won’t put us all out of a job say that human ingenuity and creativity will be difficult to imitate. The classic argument is that, just as machines freed us from repetitive manual tasks, machine learning will free us from repetitive intellectual tasks.
This leaves us free to spend more time on the rewarding aspects of our work, pursuing creative hobbies, spending time with loved ones, and generally being human.
In this worldview, creative works like a great novel or symphony, and the emotions they evoke, cannot be reduced to lines of code. Humans retain a dimension of superiority over algorithms.
But is creativity a fundamentally human phenomenon? Or can it be learned by machines?
And if they learn to understand us better than we understand ourselves, could the great AI novel—tailored, of course, to your own predispositions in fiction—be the best you’ll ever read?
Maybe Not a Beach Read
This is the futurist’s view, of course. The reality, as the jury-rigged contraption in Ross Goodwin’s Cadillac for that road trip can attest, is some way off.
“This is very much an imperfect document, a rapid prototyping project. The output isn’t perfect. I don’t think it’s a human novel, or anywhere near it,” Goodwin said of the novel that his machine created. The book, titled 1 the Road, is currently marketed as the first novel written by AI.
Once the neural network has been trained, it can generate any length of text that the author desires, either at random or working from a specific seed word or phrase. Goodwin used the sights and sounds of the road trip to provide these seeds: the novel is written one sentence at a time, based on images, locations, dialogue from the microphone, and even the computer’s own internal clock.
The results are… mixed.
The novel begins suitably enough, quoting the time: “It was nine seventeen in the morning, and the house was heavy.” Descriptions of locations begin according to the Foursquare dataset fed into the algorithm, but rapidly veer off into the weeds, becoming surreal. While experimentation in literature is a wonderful thing, repeatedly quoting longitude and latitude coordinates verbatim is unlikely to win anyone the Booker Prize.
Data In, Art Out?
Neural networks as creative agents have some advantages. They excel at being trained on large datasets, identifying the patterns in those datasets, and producing output that follows those same rules. Music inspired by or written by AI has become a growing subgenre—there’s even a pop album by human-machine collaborators called the Songularity.
A neural network can “listen to” all of Bach and Mozart in hours, and train itself on the works of Shakespeare to produce passable pseudo-Bard. The idea of artificial creativity has become so widespread that there’s even a meme format about forcibly training neural network ‘bots’ on human writing samples, with hilarious consequences—although the best joke was undoubtedly human in origin.
The AI that roamed from New York to New Orleans was an LSTM (long short-term memory) neural net. By default, information contained in individual neurons is preserved, and only small parts can be “forgotten” or “learned” in an individual timestep, rather than neurons being entirely overwritten.
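The "forget a little, learn a little" behavior comes from the gates in the standard LSTM cell update. The sketch below implements one timestep for a single-unit cell in plain Python. The weight layout and names are chosen for this example only, and nothing here reflects the specific configuration of Goodwin's network, which the article does not describe.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One timestep of a single-unit LSTM cell.

    w maps each gate name to (input weight, recurrent weight, bias).
    Returns the new hidden state h and the new cell memory c.
    """
    def pre(name):
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # fraction of old memory to keep
    i = sigmoid(pre("input"))         # fraction of the candidate to write
    g = math.tanh(pre("candidate"))   # candidate new memory content
    o = sigmoid(pre("output"))        # how much memory to expose
    c = f * c_prev + i * g            # memory is blended, not overwritten
    h = o * math.tanh(c)
    return h, c

# Illustrative weights: a strongly positive forget bias and a strongly
# negative input bias make the cell hold on to what it already stores.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
h, c = lstm_step(1.0, 0.0, 0.5, weights)
```

With these weights the memory value stays very close to its previous 0.5 despite the new input, which is the "preserved by default" behavior described above. Training adjusts the gate weights so the cell learns when to keep and when to update.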
The LSTM architecture performs better than previous recurrent neural networks at tasks such as handwriting and speech recognition. The neural net—and its programmer—looked further in search of literary influences, ingesting 60 million words (360 MB) of raw literature according to Goodwin’s recipe: one third poetry, one third science fiction, and one third “bleak” literature.
In this way, Goodwin has some creative control over the project; the source material influences the machine’s vocabulary and sentence structuring, and hence the tone of the piece.
The Thoughts Beneath the Words
The problem with artificially intelligent novelists is the same one that computer scientists working on conversational artificial intelligence have been trying to solve since Turing’s day. The machines are getting better than humans at recognizing and reproducing complex patterns, but they have no understanding of what these patterns mean.
Goodwin’s neural network spits out sentences one letter at a time, on a tiny printer hooked up to the laptop. Statistical associations such as those tracked by neural nets can form words from letters, and sentences from words, but they know nothing of character or plot.
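To see how far pure letter statistics can go, here is a toy character-level generator. It swaps the LSTM for a much simpler technique, a fixed-order Markov chain that only remembers which letter followed each three-letter context, so it is a stand-in for the idea rather than Goodwin's method. The function names and the tiny one-sentence corpus are assumptions for illustration.

```python
import random
from collections import defaultdict

def train_char_model(text, order=3):
    """Record every character that followed each `order`-letter context."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, length, rng):
    """Extend `seed` one letter at a time; the seed length sets the context size."""
    order = len(seed)
    out = seed
    for _ in range(length):
        choices = model.get(out[-order:])
        if not choices:
            break  # context never seen in training
        out += rng.choice(choices)
    return out

corpus = "it was nine seventeen in the morning, and the house was heavy. "
model = train_char_model(corpus * 2, order=3)
sample = generate(model, "it ", 40, random.Random(0))
```

Every four-letter window in the output was seen in the training text, so the result looks locally like English, yet nothing in the model represents a house, a morning, or a story.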
When you talk to a chatbot, the code has no real understanding of what’s been said before, and there is no dataset large enough to train it through all of the billions of possible conversations.
Unless restricted to a predetermined set of options, it loses the thread of the conversation after a reply or two. In a similar way, the creative neural nets have no real grasp of what they’re writing, and no way to produce anything with any overarching coherence or narrative.
Goodwin’s experiment is an attempt to add some coherent backbone to the AI “novel” by repeatedly grounding it with stimuli from the cameras or microphones—the thematic links and narrative provided by the American landscape the neural network drives through.
Goodwin feels that this approach (the car itself moving through the landscape, as if a character) borrows some continuity and coherence from the journey itself. “Coherent prose is the holy grail of natural-language generation—feeling that I had somehow solved a small part of the problem was exhilarating. And I do think it makes a point about language in time that’s unexpected and interesting.”
AI Is Still No Kerouac
A coherent tone and semantic “style” might be enough to produce some vaguely-convincing teenage poetry, as Google did, and experimental fiction that uses neural networks can have intriguing results. But wading through the surreal AI prose of this era, searching for some meaning or motif beyond novelty value, can be a frustrating experience.
Maybe machines can learn the complexities of the human heart and brain, or how to write evocative or entertaining prose. But they’re a long way off, and somehow “more layers!” or a bigger corpus of data doesn’t feel like enough to bridge that gulf.
Real attempts by machines to write fiction have so far been broadly incoherent, but with flashes of poetry—dreamlike, hallucinatory ramblings.
Neural networks might not be capable of writing intricately-plotted works with charm and wit, like Dickens or Dostoevsky, but there’s still an eeriness to trying to decipher the surreal, Finnegans Wake mish-mash.
You might see, in the odd line, the flickering ghost of something like consciousness, a deeper understanding. Or you might just see fragments of meaning thrown into a neural network blender, full of hype and fury, obeying rules in an occasionally striking way, but ultimately signifying nothing. In that sense, at least, the RNN’s grappling with metaphor feels like a metaphor for the hype surrounding the latest AI summer as a whole.
Or, as the human author of On The Road put it: “You guys are going somewhere or just going?”
Image Credit: eurobanks / Shutterstock.com
In Goethe’s poem “The Sorcerer’s Apprentice,” made world-famous by its adaptation in Disney’s Fantasia, a lazy apprentice, left to fetch water, uses magic to bewitch a broom into performing his chores for him. Now, new research from Yale has opened up the possibility of being able to animate—and automate—household objects by fitting them with a robotic skin.
Yale’s Soft Robotics lab, the Faboratory, is led by Professor Rebecca Kramer-Bottiglio, and has long investigated the possibilities associated with new kinds of manufacturing. While the typical image of a robot is hard, cold steel and rigid movements, soft robotics aims to create something more flexible and versatile. After all, the human body is made up of soft, flexible surfaces, and the world is designed for us. Soft, deformable robots could change shape to adapt to different tasks.
When designing a robot, key components are the robot’s sensors, which allow it to perceive its environment, and its actuators, the electrical or pneumatic motors that allow the robot to move and interact with its environment.
Consider your hand, which has temperature and pressure sensors, but also muscles as actuators. The omni-skins, as the Science Robotics paper dubs them, combine sensors and actuators, embedding them into an elastic sheet. The robotic skins are moved by pneumatic actuators or memory alloy that can bounce back into shape. If this is then wrapped around a soft, deformable object, moving the skin with the actuators can allow the object to crawl along a surface.
The key to the design here is flexibility: rather than adding chips, sensors, and motors into every household object to turn them into individual automatons, the same skin can be used for many purposes. “We can take the skins and wrap them around one object to perform a task—locomotion, for example—and then take them off and put them on a different object to perform a different task, such as grasping and moving an object,” said Kramer-Bottiglio. “We can then take those same skins off that object and put them on a shirt to make an active wearable device.”
The task is then to dream up applications for the omni-skins. Initially, you might imagine commanding a stuffed toy to fetch the remote control for you, or animating a sponge to wipe down kitchen surfaces, but this is just the beginning. The scientists attached the skins to a soft tube and camera, creating a worm-like robot that could compress itself and crawl into small spaces for rescue missions. The same skins could then be worn by a person to sense their posture. One could easily imagine this being adapted into a soft exoskeleton for medical or industrial purposes: for example, helping with rehabilitation after an accident or injury.
The robots were originally motivated by an environment where space and weight are at a premium, and humans are forced to improvise with whatever’s at hand: outer space. Kramer-Bottiglio began the work after NASA put out a call for soft robotics systems for use by astronauts. Instead of wasting valuable rocket payload by sending up a heavy metal droid like ATLAS to fetch items or perform repairs, soft robotic skins with modular sensors could be adapted on the spot for a range of different uses.
By reassembling components in the soft robotic skin, a crumpled ball of paper could provide the chassis for a robot that performs repairs on the spaceship, or explores the lunar surface. The dynamic compression provided by the robotic skin could be used for g-suits to protect astronauts when they rapidly accelerate or decelerate.
“One of the main things I considered was the importance of multi-functionality, especially for deep space exploration where the environment is unpredictable. The question is: How do you prepare for the unknown unknowns? … Given the design-on-the-fly nature of this approach, it’s unlikely that a robot created using robotic skins will perform any one task optimally,” Kramer-Bottiglio said. “However, the goal is not optimization, but rather diversity of applications.”
There are still problems to resolve. Many of the videos of the skins indicate that they rely on an external power supply. Creating new, smaller batteries that can power wearable devices has been a focus of cutting-edge materials science research for some time. Much of the lab’s expertise is in creating flexible, stretchable electronics that can be deformed by the actuators without breaking the circuitry. In the future, the team hopes to work on streamlining the production process; if the components could be 3D printed, then the skins could be created when needed.
In addition, robotic hardware that’s capable of performing an impressive range of precise motions is quite an advanced technology. The software to control those robots, and enable them to perform a variety of tasks, is quite another challenge. With soft robots, it can become even more complex to design that control software, because the body itself can change shape and deform as the robot moves. The same set of programmed motions, then, can produce different results depending on the environment.
“Let’s say I have a soft robot with four legs that crawls along the ground, and I make it walk up a hard slope,” Dr. David Howard, who works on robotics at CSIRO in Australia, explained to ABC.
“If I make that slope out of gravel and I give it the same control commands, the actual body is going to deform in a different way, and I’m not necessarily going to know what that is.”
Despite these and other challenges, research like that at the Faboratory still hopes to redefine how we think of robots and robotics. Instead of a robot that imitates a human and manipulates objects, the objects themselves will become programmable matter, capable of moving autonomously and carrying out a range of tasks. Futurists speculate about a world where most objects are automated to some degree and can assemble and repair themselves, or are even built entirely of tiny robots.
The tale of the Sorcerer’s Apprentice was first written in 1797, at the dawn of the industrial revolution, over a century before the word “robot” was even coined. Yet more and more roboticists aim to prove Arthur C Clarke’s maxim: any sufficiently advanced technology is indistinguishable from magic.
Image Credit: Joran Booth, The Faboratory
A new technique using artificial intelligence to manipulate video content gives new meaning to the expression “talking head.”
An international team of researchers showcased the latest advancement in synthesizing facial expressions—including mouth, eyes, eyebrows, and even head position—in video at this month’s 2018 SIGGRAPH, a conference on innovations in computer graphics, animation, virtual reality, and other forms of digital wizardry.
The project is called Deep Video Portraits. It relies on a type of AI called generative adversarial networks (GANs) to modify a “target” actor based on the facial and head movement of a “source” actor. As the name implies, GANs pit two opposing neural networks against one another to create a realistic talking head, right down to the sneer or raised eyebrow.
In this case, the adversaries are actually working together: One neural network generates content, while the other rejects or approves each effort. The back-and-forth interplay between the two eventually produces a realistic result that can easily fool the human eye, including reproducing a static scene behind the head as it bobs back and forth.
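The tug-of-war can be made concrete with the loss functions from the standard GAN formulation, where the discriminator outputs a score between 0 (fake) and 1 (real). The sketch below uses the common non-saturating generator loss. It is a generic illustration, and the objective actually used by Deep Video Portraits is not given here and may include additional terms.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss: low when real frames score near 1 and fakes near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating loss: low when the discriminator scores fakes near 1."""
    return -math.log(d_fake)

# Early in training the discriminator easily rejects the generator's output.
early = generator_loss(0.1)
# Later, the fakes are convincing enough to score highly.
late = generator_loss(0.9)
```

The generator's loss drops as its output fools the discriminator, while the discriminator's loss rises when it can no longer tell the two apart. Each network is trained to lower its own loss, which is what drives the back-and-forth.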
The researchers say the technique can be used by the film industry for a variety of purposes, from editing actors’ facial expressions to match dubbed voices to repositioning an actor’s head in post-production. According to the researchers, AI can produce results that are not only highly realistic but also much quicker to obtain than those of the manual processes used today.
“Deep Video Portraits shows how such a visual effect could be created with less effort in the future,” said Christian Richardt, from the University of Bath’s motion capture research center CAMERA, in a press release. “With our approach, even the positioning of an actor’s head and their facial expression could be easily edited to change camera angles or subtly change the framing of a scene to tell the story better.”
AI Tech Different Than So-Called “Deepfakes”
The work is far from the first to employ AI to manipulate video and audio. At last year’s SIGGRAPH conference, researchers from the University of Washington showcased their work using algorithms that inserted audio recordings from a person in one instance into a separate video of the same person in a different context.
In this case, they “faked” a video using a speech from former President Barack Obama addressing a mass shooting incident during his presidency. The AI-doctored video injects the audio into an unrelated video of the president while also blending the facial and mouth movements, doing a pretty credible job of lip syncing.
A previous paper by many of the same scientists on the Deep Video Portraits project detailed how they were first able to manipulate a video in real time of a talking head (in this case, actor and former California governor Arnold Schwarzenegger). The Face2Face system pulled off this bit of digital trickery using a depth-sensing camera that tracked the facial expressions of an Asian female source actor.
A less sophisticated method of swapping faces, using machine learning software dubbed FakeApp, emerged earlier this year. Predictably, the tech, which requires numerous photos of the source actor in order to train the neural network, was used for more juvenile pursuits, such as injecting a person’s face onto a porn star.
The application gave rise to the term “deepfakes,” which is now used somewhat ubiquitously to describe all such instances of AI-manipulated video—much to the chagrin of some of the researchers involved in more legitimate uses.
Fighting AI-Created Video Forgeries
However, the researchers are keenly aware that their work—intended for benign uses such as in the film industry or even to correct gaze and head positions for more natural interactions through video teleconferencing—could be used for nefarious purposes. Fake news is the most obvious concern.
“With ever-improving video editing technology, we must also start being more critical about the video content we consume every day, especially if there is no proof of origin,” said Michael Zollhöfer, a visiting assistant professor at Stanford University and member of the Deep Video Portraits team, in the press release.
Toward that end, the research team is training the same adversarial neural networks to spot video forgeries. They also strongly recommend that developers clearly watermark videos that are edited through AI or otherwise, and denote clearly what part and element of the scene was modified.
To catch less ethical users, the US Department of Defense, through the Defense Advanced Research Projects Agency (DARPA), is supporting a program called Media Forensics. This latest DARPA challenge enlists researchers to develop technologies to automatically assess the integrity of an image or video, as part of an end-to-end media forensics platform.
The DARPA official in charge of the program, Matthew Turek, told MIT Technology Review that so far the program has “discovered subtle cues in current GAN-manipulated images and videos that allow us to detect the presence of alterations.” In one reported example, researchers have targeted eyes, which rarely blink in the case of “deepfakes” like those created by FakeApp, because the AI is trained on still pictures. That method would seem to be less effective at spotting the sort of forgeries created by Deep Video Portraits, which appears to flawlessly match the entire facial and head movements between the source and target actors.
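A blink-based check could look something like the sketch below, which counts how often a per-frame "eye openness" score dips below a threshold and converts that to blinks per minute. Everything in it is hypothetical: the article does not say how the researchers measure blinking, and the openness score, the 0.2 threshold, and the function names are assumptions for illustration.

```python
def count_blinks(eye_openness, closed_below=0.2):
    """Count open-to-closed transitions in a per-frame eye-openness series."""
    blinks = 0
    was_closed = False
    for value in eye_openness:
        is_closed = value < closed_below
        if is_closed and not was_closed:
            blinks += 1  # a new blink starts when the eye first closes
        was_closed = is_closed
    return blinks

def blinks_per_minute(eye_openness, fps):
    """Blink rate for a clip sampled at `fps` frames per second."""
    minutes = len(eye_openness) / fps / 60.0
    return count_blinks(eye_openness) / minutes

# A made-up 3-second clip at 30 fps containing two short blinks.
frames = [0.8] * 30 + [0.1] * 3 + [0.8] * 30 + [0.1] * 3 + [0.8] * 24
rate = blinks_per_minute(frames, fps=30)
```

A clip whose rate sits far below what is typical for a person on camera would be flagged for closer inspection. As the paragraph above notes, a forgery that copies the source actor's real blinks would pass this kind of test.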
“We believe that the field of digital forensics should and will receive a lot more attention in the future to develop approaches that can automatically prove the authenticity of a video clip,” Zollhöfer said. “This will lead to ever-better approaches that can spot such modifications even if we humans might not be able to spot them with our own eyes.”
Image Credit: Tancha / Shutterstock.com