Tag Archives: challenge

#433288 The New AI Tech Turning Heads in Video ...

A new technique using artificial intelligence to manipulate video content gives new meaning to the expression “talking head.”

An international team of researchers showcased the latest advancement in synthesizing facial expressions—including mouth, eyes, eyebrows, and even head position—in video at this month’s 2018 SIGGRAPH, a conference on innovations in computer graphics, animation, virtual reality, and other forms of digital wizardry.

The project is called Deep Video Portraits. It relies on a type of AI called generative adversarial networks (GANs) to modify a “target” actor based on the facial and head movement of a “source” actor. As the name implies, GANs pit two opposing neural networks against one another to create a realistic talking head, right down to the sneer or raised eyebrow.

In this case, the adversaries are actually working together: One neural network generates content, while the other rejects or approves each effort. The back-and-forth interplay between the two eventually produces a realistic result that can easily fool the human eye, including reproducing a static scene behind the head as it bobs back and forth.
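As a rough illustration of that back-and-forth, here is a toy sketch in Python. It is not the Deep Video Portraits system: the one-dimensional "real" data, the linear generator, the logistic discriminator, and every name and number in it (train_toy_gan, the learning rate, the target distribution) are assumptions chosen only to keep the example small.

```python
import math
import random


def sigmoid(s):
    # Logistic function written so that math.exp never overflows.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def mean(values):
    return sum(values) / len(values)


def train_toy_gan(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator (assumed form): fake = a * z + b
    w, c = 0.0, 0.0  # discriminator (assumed form): D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # stand-in "real" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = mean([(d - 1.0) * x for d, x in zip(d_real, real)]) + mean(
            [d * x for d, x in zip(d_fake, fake)])
        grad_c = mean([d - 1.0 for d in d_real]) + mean(d_fake)
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: adjust a and b so the updated D scores fakes higher.
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_a = mean([(d - 1.0) * w * z for d, z in zip(d_fake, noise)])
        grad_b = mean([(d - 1.0) * w for d in d_fake])
        a -= lr * grad_a
        b -= lr * grad_b
    return {"a": a, "b": b, "w": w, "c": c}
```

Calling train_toy_gan() returns the final generator and discriminator parameters. Toy setups like this often oscillate rather than settle, so the numbers are only illustrative of the generate-then-judge loop, not of how well a real system performs.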

The researchers say the technique can be used by the film industry for a variety of purposes, from editing actors’ facial expressions to match dubbed voices to repositioning an actor’s head in post-production. According to the researchers, AI can produce results that are not only highly realistic but also much faster to obtain than with the manual processes used today. You can read the full paper of their work here.

“Deep Video Portraits shows how such a visual effect could be created with less effort in the future,” said Christian Richardt, from the University of Bath’s motion capture research center CAMERA, in a press release. “With our approach, even the positioning of an actor’s head and their facial expression could be easily edited to change camera angles or subtly change the framing of a scene to tell the story better.”

AI Tech Different Than So-Called “Deepfakes”
The work is far from the first to employ AI to manipulate video and audio. At last year’s SIGGRAPH conference, researchers from the University of Washington showcased their work using algorithms that inserted audio recordings from a person in one instance into a separate video of the same person in a different context.

In this case, they “faked” a video using a speech from former President Barack Obama addressing a mass shooting incident during his presidency. The AI-doctored video injects the audio into an unrelated video of the president while also blending the facial and mouth movements, doing a pretty credible job of lip-syncing.

A previous paper by many of the same scientists on the Deep Video Portraits project detailed how they were first able to manipulate a video of a talking head in real time (in this case, actor and former California governor Arnold Schwarzenegger). The Face2Face system pulled off this bit of digital trickery using a depth-sensing camera that tracked the facial expressions of an Asian female source actor.

A less sophisticated method of swapping faces, using machine learning software dubbed FakeApp, emerged earlier this year. Predictably, the tech, which requires numerous photos of the source actor to train the neural network, was used for more juvenile pursuits, such as grafting a person’s face onto a porn star.

The application gave rise to the term “deepfakes,” which is now widely used to describe all such instances of AI-manipulated video, much to the chagrin of some of the researchers involved in more legitimate uses.

Fighting AI-Created Video Forgeries
However, the researchers are keenly aware that their work—intended for benign uses such as in the film industry or even to correct gaze and head positions for more natural interactions through video teleconferencing—could be used for nefarious purposes. Fake news is the most obvious concern.

“With ever-improving video editing technology, we must also start being more critical about the video content we consume every day, especially if there is no proof of origin,” said Michael Zollhöfer, a visiting assistant professor at Stanford University and member of the Deep Video Portraits team, in the press release.

Toward that end, the research team is training the same adversarial neural networks to spot video forgeries. They also strongly recommend that developers clearly watermark videos that are edited through AI or otherwise, and clearly denote which parts and elements of the scene were modified.

To catch less ethical users, the US Department of Defense, through the Defense Advanced Research Projects Agency (DARPA), is supporting a program called Media Forensics. This latest DARPA challenge enlists researchers to develop technologies to automatically assess the integrity of an image or video, as part of an end-to-end media forensics platform.

The DARPA official in charge of the program, Matthew Turek, told MIT Technology Review that so far the program has “discovered subtle cues in current GAN-manipulated images and videos that allow us to detect the presence of alterations.” In one reported example, researchers have targeted eyes, which rarely blink in the case of “deepfakes” like those created by FakeApp, because the AI is trained on still pictures. That method would seem to be less effective at spotting the sort of forgeries created by Deep Video Portraits, which appears to flawlessly match the entire facial and head movements between the source and target actors.
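To make the blinking cue concrete, here is a minimal Python sketch of what such a check might look like. It is not the method used by the DARPA program or any published detector: it assumes some upstream tool has already produced a per-frame "eye openness" score between 0 and 1, and the function names and both thresholds are hypothetical values picked for illustration.

```python
def count_blinks(eye_openness, closed_below=0.2):
    # Count transitions from "open" to "closed" in a sequence of
    # per-frame openness scores (hypothetical input, 0 = shut, 1 = open).
    blinks = 0
    eyes_closed = False
    for value in eye_openness:
        if value < closed_below and not eyes_closed:
            blinks += 1
            eyes_closed = True
        elif value >= closed_below:
            eyes_closed = False
    return blinks


def blinks_per_minute(eye_openness, fps):
    # Convert the blink count into a rate using the clip's frame rate.
    minutes = len(eye_openness) / fps / 60.0
    return count_blinks(eye_openness) / minutes if minutes > 0 else 0.0


def looks_suspicious(eye_openness, fps, min_blinks_per_minute=2.0):
    # Flag clips whose subject blinks less often than an assumed floor.
    return blinks_per_minute(eye_openness, fps) < min_blinks_per_minute
```

A one-minute clip at 30 frames per second in which the eyes never close would be flagged, while the same clip with ten short blinks would not. A forgery that copies the source actor's blinks along with everything else would pass this kind of test, which is the limitation described above.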

“We believe that the field of digital forensics should and will receive a lot more attention in the future to develop approaches that can automatically prove the authenticity of a video clip,” Zollhöfer said. “This will lead to ever-better approaches that can spot such modifications even if we humans might not be able to spot them with our own eyes.”

Image Credit: Tancha / Shutterstock.com

Posted in Human Robots

#432880 Google’s Duplex Raises the Question: ...

By now, you’ve probably seen Google’s new Duplex software, which promises to call people on your behalf to book appointments for haircuts and the like. As yet, it only exists in demo form, but already it seems like Google has made a big stride towards capturing a market that plenty of companies have had their eye on for quite some time. This software is impressive, but it raises questions.

Many of you will be familiar with the stilted, robotic conversations you can have with early chatbots that are, essentially, glorified menus. Instead of pressing 1 to confirm or 2 to re-enter, some of these bots would allow for simple commands like “Yes” or “No,” replacing the buttons with limited ability to recognize a few words. Using them was often a far more frustrating experience than attempting to use a menu—there are few things more irritating than a robot saying, “Sorry, your response was not recognized.”


Even getting the response recognized is hard enough. After all, there are countless different nuances and accents to baffle voice recognition software, and endless turns of phrase that amount to saying the same thing that can confound natural language processing (NLP), especially if you like your phrasing quirky.

You may think that standard customer-service type conversations all travel the same route, using similar words and phrasing. But when there are over 80,000 ways to order coffee, and making a mistake is frowned upon, even simple tasks require high accuracy over a huge dataset.

Advances in audio processing, neural networks, and NLP, as well as raw computing power, have meant that basic recognition of what someone is trying to say is less of an issue. SoundHound’s virtual assistant prides itself on being able to process complicated requests (perhaps needlessly complicated).

The deeper issue, as with all attempts to develop conversational machines, is one of understanding context. There are so many ways a conversation can go that attempting to construct a conversation two or three layers deep quickly runs into problems. Multiply the thousands of things people might say by the thousands they might say next, and the combinatorics of the challenge runs away from most chatbots, leaving them as either glorified menus, gimmicks, or rather bizarre to talk to.
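The arithmetic behind that multiplication can be sketched in a few lines of Python. The figure of 1,000 possible utterances per turn is an assumption standing in for "thousands," and the function name is made up for this example; real conversations do not branch uniformly.

```python
def conversation_paths(options_per_turn, turns):
    # Number of distinct conversations if every turn offers the same
    # number of possible utterances (a deliberately crude assumption).
    return options_per_turn ** turns


for depth in (1, 2, 3):
    print(depth, conversation_paths(1000, depth))
```

Under that assumption, two turns already give a million possible paths and three turns give a billion, which is why scripting every branch by hand stops being practical almost immediately.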

Yet Google, which surely remembers from Glass the risk of premature debuts for technology, especially the kind that asks you to rethink how you interact with or trust in software, must have faith in Duplex to show it on the world stage. We know that startups like Semantic Machines and x.ai have received serious funding to perform very similar functions, using natural-language conversations to perform computing tasks, schedule meetings, book hotels, or purchase items.

It’s no great leap to imagine Google will soon do the same, bringing us closer to a world of onboard computing, where Lens labels the world around us and their assistant arranges it for us (all the while gathering more and more data it can convert into personalized ads). The early demos showed some clever tricks for keeping the conversation within a fairly narrow realm where the AI should be comfortable and competent, and the blog post that accompanied the release shows just how much effort has gone into the technology.

Yet given the privacy and ethics funk the tech industry finds itself in, and people’s general unease about AI, the main reaction to Duplex’s impressive demo was concern. The voice sounded too natural, bringing to mind Lyrebird and their warnings of deepfakes. You might trust “Do the Right Thing” Google with this technology, but it could usher in an era when automated robo-callers are far more convincing.

A more human-like voice may sound like a perfectly innocuous improvement, but the fact that the assistant interjects naturalistic “umm” and “mm-hm” responses to more perfectly mimic a human rubbed a lot of people the wrong way. This wasn’t just a voice assistant trying to sound less grinding and robotic; it was actively trying to deceive people into thinking they were talking to a human.

Google is running the risk of trying to get to conversational AI by going straight through the uncanny valley.

“Google’s experiments do appear to have been designed to deceive,” said Dr. Thomas King of the Oxford Internet Institute’s Digital Ethics Lab, according to TechCrunch. “Their main hypothesis was ‘can you distinguish this from a real person?’ In this case it’s unclear why their hypothesis was about deception and not the user experience… there should be some kind of mechanism there to let people know what it is they are speaking to.”

From Google’s perspective, being able to say “90 percent of callers can’t tell the difference between this and a human personal assistant” is an excellent marketing ploy, even though statistics about how many interactions are successful might be more relevant.

In fact, Duplex runs contrary to pretty much every major recommendation about ethics for the use of robotics or artificial intelligence, not to mention certain eavesdropping laws. Transparency is key to holding machines (and the people who design them) accountable, especially when it comes to decision-making.

Then there are the more subtle social issues. One prominent effect social media has had is to allow people to silo themselves; in echo chambers of like-minded individuals, it’s hard to see how other opinions exist. Technology exacerbates this by removing the evolutionary cues that go along with face-to-face interaction. Confronted with a pair of human eyes, people are more generous. Confronted with a Twitter avatar or a Facebook interface, people hurl abuse and criticism they’d never dream of using in a public setting.

Now that we can use technology to interact with ever fewer people, will it change us? Is it fair to offload the burden of dealing with a robot onto the poor human at the other end of the line, who might have to deal with dozens of such calls a day? Google has said that if the AI is in trouble, it will put you through to a human, which might help save receptionists from the hell of trying to explain a concept to dozens of dumbfounded AI assistants all day. But there’s always the risk that failures will be blamed on the person and not the machine.

As AI advances, could we end up treating the dwindling number of people in these “customer-facing” roles as the buggiest part of a fully automatic service? Will people start accusing each other of being robots on the phone, as well as on Twitter?

Google has provided plenty of reassurances about how the system will be used. They have said they will ensure that the system is identified, and it’s hardly difficult to resolve this problem; a slight change in the script from their demo would do it. For now, consumers will likely appreciate moves that make it clear whether the “intelligent agents” that make major decisions for us, that we interact with daily, and that hide behind social media avatars or phone numbers are real or artificial.

Image Credit: Besjunior / Shutterstock.com


#432671 Stuff 3.0: The Era of Programmable ...

It’s the end of a long day in your apartment in the early 2040s. You decide your work is done for the day, stand up from your desk, and yawn. “Time for a film!” you say. The house responds to your cues. The desk splits into hundreds of tiny pieces, which flow behind you and take on shape again as a couch. The computer screen you were working on flows up the wall and expands into a flat projection screen. You relax into the couch and, after a few seconds, a remote control surfaces from one of its arms.

In a few seconds flat, you’ve gone from a neatly-equipped office to a home cinema…all within the same four walls. Who needs more than one room?

This is the dream of those who work on “programmable matter.”

In his recent book about AI, Max Tegmark makes a distinction between three different levels of computational sophistication for organisms. Life 1.0 is single-celled organisms like bacteria; here, hardware is indistinguishable from software. The behavior of a bacterium is encoded in its DNA; it cannot learn new things.

Life 2.0 is where humans live on the spectrum. We are more or less stuck with our hardware, but we can change our software by choosing to learn different things, say, Spanish instead of Italian. Much like managing space on your smartphone, your brain’s hardware will allow you to download only a certain number of packages, but, at least theoretically, you can learn new behaviors without changing your underlying genetic code.

Life 3.0 marks a step-change from this: creatures that can change both their hardware and software in something like a feedback loop. This is what Tegmark views as a true artificial intelligence—one that can learn to change its own base code, leading to an explosion in intelligence. Perhaps, with CRISPR and other gene-editing techniques, we could be using our “software” to doctor our “hardware” before too long.

Programmable matter extends this analogy to the things in our world: what if your sofa could “learn” how to become a writing desk? What if, instead of a Swiss Army knife with dozens of tool attachments, you just had a single tool that “knew” how to become any other tool you could require, on command? In the crowded cities of the future, could houses be replaced by single, OmniRoom apartments? It would save space, and perhaps resources too.

Such are the dreams, anyway.

But when engineering and manufacturing individual gadgets is such a complex process, you can imagine that making stuff that can turn into many different items can be extremely complicated. Professor Skylar Tibbits at MIT referred to it as 4D printing in a TED Talk, and the website for his research group, the Self-Assembly Lab, excitedly claims, “We have also identified the key ingredients for self-assembly as a simple set of responsive building blocks, energy and interactions that can be designed within nearly every material and machining process available. Self-assembly promises to enable breakthroughs across many disciplines, from biology to material science, software, robotics, manufacturing, transportation, infrastructure, construction, the arts, and even space exploration.”

Naturally, their projects are still in the early stages, but the Self-Assembly Lab and others are genuinely exploring just the kind of science fiction applications we mooted.

For example, there’s the cell-phone self-assembly project, which brings to mind eerie, 24/7 factories where mobile phones assemble themselves from 3D printed kits without human or robotic intervention. Okay, so the phones they’re making are hardly going to fly off the shelves as fashion items, but if all you want is something that works, it could cut manufacturing costs substantially and automate even more of the process.

One of the major hurdles to overcome in making programmable matter a reality is choosing the right fundamental building blocks. There’s a very important balance to strike. To create fine details, you need building blocks that aren’t too big, so as to keep your rearranged matter from being too lumpy. Pieces that are too large might be useless for certain applications, for example if you wanted to make tools for fine manipulation, and they might make it difficult to simulate a range of textures. On the other hand, if the pieces are too small, different problems can arise.

Imagine a setup where each piece is a small robot. You have to contain the robot’s power source and its brain, or at least some kind of signal-generator and signal-processor, all in the same compact unit. Perhaps you can imagine that one might be able to simulate a range of textures and strengths by changing the strength of the “bond” between individual units—your desk might need to be a little bit more firm than your bed, which might be nicer with a little more give.

Early steps toward creating this kind of matter have been taken by those who are developing modular robots. There are plenty of different groups working on this, including MIT, Lausanne, and the University of Brussels.

In the Brussels system, one individual robot acts as a centralized decision-maker, referred to as the brain unit, but additional robots can autonomously join the brain unit as and when needed to change the shape and structure of the overall system. Although the system is only ten units at present, it’s a proof-of-concept that control can be orchestrated over a modular system of robots; perhaps in the future, smaller versions of the same thing could be the components of Stuff 3.0.

You can imagine that with machine learning algorithms, such swarms of robots might be able to negotiate obstacles and respond to a changing environment more easily than an individual robot (those of you with techno-fear may read “respond to a changing environment” and imagine a robot seamlessly rearranging itself to allow a bullet to pass straight through without harm).

Speaking of robotics, the form of an ideal robot has been a subject of much debate. In fact, one of the major recent robotics competitions—DARPA’s Robotics Challenge—was won by a robot that could adapt, beating Boston Dynamics’ infamous ATLAS humanoid with the simple addition of a wheel that allowed it to drive as well as walk.

Rather than building robots into a humanoid shape (only sometimes useful), allowing them to evolve and discover the ideal form for performing whatever you’ve tasked them to do could prove far more useful. This is particularly true in disaster response, where expensive robots can still be more valuable than humans, but conditions can be very unpredictable and adaptability is key.

Further afield, many futurists imagine “foglets” as the tiny nanobots that will be capable of constructing anything from raw materials, somewhat like the “Santa Claus machine.” But you don’t necessarily need anything quite so indistinguishable from magic to be useful. Programmable matter that can respond and adapt to its surroundings could be used in all kinds of industrial applications. How about a pipe that can strengthen or weaken at will, or divert its direction on command?

We’re some way off from being able to order our beds to turn into bicycles. As with many tech ideas, it may turn out that the traditional low-tech solution is far more practical and cost-effective, even as we can imagine alternatives. But as the march to put a chip in every conceivable object goes on, it seems certain that inanimate objects are about to get a lot more animated.

Image Credit: PeterVrabel / Shutterstock.com


#432646 How Fukushima Changed Japanese Robotics ...

In March 2011, Japan was hit by a catastrophic earthquake that triggered a terrible tsunami. Thousands were killed and billions of dollars of damage was done in one of the worst disasters of modern times. For a few perilous weeks, though, the eyes of the world were focused on the Fukushima Daiichi nuclear power plant. Its safety systems were unable to cope with the tsunami damage, and there were widespread fears of another catastrophic meltdown that could spread radiation over several countries, like the Chernobyl disaster in the 1980s. A heroic effort that included dumping seawater into the reactor core prevented an even bigger catastrophe. As it is, a hundred thousand people are still evacuated from the area, and it will likely take many years and hundreds of billions of dollars before the region is safe.

Because radiation is so dangerous to humans, the natural solution to the Fukushima disaster was to send in robots to monitor levels of radiation and attempt to begin the clean-up process. The techno-optimists in Japan had discovered a challenge, deep in the heart of that reactor core, that even their optimism could not solve. The radiation fried the circuits of the robots that were sent in, even those specifically designed and built to deal with the Fukushima catastrophe. The power plant slowly became a vast robot graveyard. While some robots initially saw success in measuring radiation levels around the plant—and, recently, a robot was able to identify the melted uranium fuel at the heart of the disaster—hopes of them playing a substantial role in the clean-up are starting to diminish.



In Tokyo’s neon Shibuya district, it can sometimes seem like it’s brighter at night than it is during the daytime. In karaoke booths on the twelfth floor—because everything is on the twelfth floor—overlooking the brightly-lit streets, businessmen unwind by blasting out pop hits. It can feel like the most artificial place on Earth; your senses are dazzled by the futuristic techno-optimism. Stock footage of the area has become symbolic of futurism and modernity.

Japan has had a reputation for being a nation of futurists for a long time. We’ve already described how tech giant Softbank, headed by visionary founder Masayoshi Son, is investing billions in a technological future, including plans for the world’s largest solar farm.

When Google sold pioneering robotics company Boston Dynamics in 2017, Softbank added it to their portfolio, alongside the famous Nao and Pepper robots. Some may think that Son is taking a gamble in pursuing a robotics project even Google couldn’t succeed in, but this is a man who lost nearly everything in the dot-com crash of 2000. The fact that even this reversal didn’t dent his optimism and faith in technology is telling. But how long can it last?

The failure of Japan’s robots to deal with the immense challenge of Fukushima has sparked something of a crisis of confidence within the industry. Disaster response is an obvious stepping-stone technology for robots. Initially, producing a humanoid robot will be very costly, and the robot will be less capable than a human; building a robot to wait tables might not be particularly economical yet. Building a robot to do jobs that are too dangerous for humans is far more viable. Yet, at Fukushima, in one of the most advanced nations in the world, many of the robots weren’t up to the task.

Nowhere was this crisis felt more keenly than at Honda; the company had developed ASIMO, which stunned the world in 2000 and continues to fascinate as an iconic humanoid robot. Despite all this technological advancement, however, Honda knew that ASIMO was still too unreliable for the real world.

It was Fukushima that triggered a sea-change in Honda’s approach to robotics. Two years after the disaster, there were rumblings that Honda was developing a disaster robot, and in October 2017, the prototype was revealed to the public for the first time. It’s not yet ready for deployment in disaster zones, however. Interestingly, the creators chose not to give it dexterous hands but instead to assume that remotely-operated tools fitted to the robot would be a better solution for the range of circumstances it might encounter.

This shift in focus for humanoid robots away from entertainment and amusement like ASIMO, and towards being practically useful, has been mirrored across the world.

In 2015, also inspired by the Fukushima disaster and the lack of disaster-ready robots, the DARPA Robotics Challenge tested humanoid robots with a range of tasks that might be needed in emergency response, such as driving cars, opening doors, and climbing stairs. The Terminator-like ATLAS robot from Boston Dynamics, alongside Korean robot HUBO, took many of the plaudits, and CHIMP also put in an impressive display by being able to right itself after falling.

Yet the DARPA Robotics Challenge showed us just how far the robots are from truly being as useful as we’d like, or maybe even as we would imagine. Many robots took hours to complete the tasks, which were highly idealized to suit them. Climbing stairs proved a particular challenge. Those who watched were more likely to see a robot that had fallen over, struggling to get up, rather than heroic superbots striding in to save the day. The “striding” proved a particular problem, with the fastest robot HUBO managing this by resorting to wheels in its knees when the legs weren’t necessary.

Fukushima may have brought a sea-change over futuristic Japan, but before robots will really begin to enter our everyday lives, they will need to prove their worth. In the interim, aerial drone robots designed to examine infrastructure damage after disasters may well see earlier deployment and more success.

It’s a considerable challenge.

Building a humanoid robot is expensive; if these multi-million-dollar machines can’t help in a crisis, people may begin to question the worth of investing in them in the first place (unless your aim is just to make viral videos). This could lead to a further crisis of confidence among the Japanese, who are starting to rely on humanoid robotics as a solution to the crisis of the aging population. The Japanese government, as part of its robot strategy, has already invested $44 million in their development.

But if they continue to fail when put to the test, that will raise serious concerns. In Tokyo’s Akihabara district, you can see all kinds of flash robotic toys for sale in the neon-lit superstores, and dancing, acting robots like Robothespian can entertain crowds all over the world. But if we want these machines to be anything more than toys—partners, helpers, even saviors—more work needs to be done.

At the same time, those who participated in the DARPA Robotics Challenge in 2015 won’t be too concerned if people were underwhelmed by the performance of their disaster relief robots. Back in 2004, nearly every participant in the DARPA Grand Challenge crashed, caught fire, or failed on the starting line. To an outside observer, the whole thing would have seemed like an unmitigated disaster, and a pointless investment. What was the task in 2004? Developing a self-driving car. A lot can change in a decade.

Image Credit: MARCUSZ2527 / Shutterstock.com


#432563 This Week’s Awesome Stories From ...

ARTIFICIAL INTELLIGENCE
Pedro Domingos on the Arms Race in Artificial Intelligence
Christoph Scheuermann and Bernhard Zand | Spiegel Online
“AI lowers the cost of knowledge by orders of magnitude. One good, effective machine learning system can do the work of a million people, whether it’s for commercial purposes or for cyberespionage. Imagine a country that produces a thousand times more knowledge than another. This is the challenge we are facing.”

BIOTECHNOLOGY
Gene Therapy Could Free Some People From a Lifetime of Blood Transfusions
Emily Mullin | MIT Technology Review
“A one-time, experimental treatment for an inherited blood disorder has shown dramatic results in a small study. …[Lead author Alexis Thompson] says the effect on patients has been remarkable. ‘They have been tied to this ongoing medical therapy that is burdensome and expensive for their whole lives,’ she says. ‘Gene therapy has allowed people to have aspirations and really pursue them.’ ”

ENVIRONMENT
The Revolutionary Giant Ocean Cleanup Machine Is About to Set Sail
Adele Peters | Fast Company
“By the end of 2018, the nonprofit says it will bring back its first harvest of ocean plastic from the North Pacific Gyre, along with concrete proof that the design works. The organization expects to bring 5,000 kilograms of plastic ashore per month with its first system. With a full fleet of systems deployed, it believes that it can collect half of the plastic trash in the Great Pacific Garbage Patch—around 40,000 metric tons—within five years.”

ROBOTICS
Autonomous Boats Will Be on the Market Sooner Than Self-Driving Cars
Tracey Lindeman | Motherboard
“Some unmanned watercraft…may be at sea commercially before 2020. That’s partly because automating all ships could generate a ridiculous amount of revenue. According to the United Nations, 90 percent of the world’s trade is carried by sea and 10.3 billion tons of products were shipped in 2016.”

DIGITAL CULTURE
Style Is an Algorithm
Kyle Chayka | Racked
“Confronting the Echo Look’s opaque statements on my fashion sense, I realize that all of these algorithmic experiences are matters of taste: the question of what we like and why we like it, and what it means that taste is increasingly dictated by black-box robots like the camera on my shelf.”

COMPUTING
How Apple Will Use AR to Reinvent the Human-Computer Interface
Tim Bajarin | Fast Company
“It’s in Apple’s DNA to continually deliver the ‘next’ major advancement to the personal computing experience. Its innovation in man-machine interfaces started with the Mac and then extended to the iPod, the iPhone, the iPad, and most recently, the Apple Watch. Now, get ready for the next chapter, as Apple tackles augmented reality, in a way that could fundamentally transform the human-computer interface.”

SCIENCE
Advanced Microscope Shows Cells at Work in Incredible Detail
Steve Dent | Engadget
“For the first time, scientists have peered into living cells and created videos showing how they function with unprecedented 3D detail. Using a special microscope and new lighting techniques, a team from Harvard and the Howard Hughes Medical Institute captured zebrafish immune cell interactions with unheard-of 3D detail and resolution.”

Image Credit: dubassy / Shutterstock.com
