Tag Archives: mouth

#432487 Can We Make a Musical Turing Test?

As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer tell machine from human? This, after all, is the idea behind the Turing Test, which has yet to be passed.

At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.

But I think above all, we view it as reductionist sacrilege: to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?

Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done so. Mozart, with his “Dice Music,” used dice rolls to decide how to order musical fragments: creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject from the 1960s.

Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
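To make the character-by-character idea concrete, here is a minimal sketch in Python. Instead of a neural net, it uses a simple character-level Markov chain over ABC notation (a plain-text music format, standing in for MIDI here); the toy corpus, the `ORDER` constant, and the `generate` function are all my own illustrative inventions, not code from the projects mentioned above.

```python
import random
from collections import defaultdict

# Toy corpus: a few bars of melody in ABC notation (a plain-text music format).
corpus = "CDEF GABc | cBAG FEDC | CEGE CEGE | GFED C4 |"

# Character-level model: for each 3-character context, record which
# characters follow that context anywhere in the corpus.
ORDER = 3
model = defaultdict(list)
for i in range(len(corpus) - ORDER):
    context = corpus[i:i + ORDER]
    model[context].append(corpus[i + ORDER])

def generate(length=40, seed=None):
    """Emit new text one character at a time, sampling from the model."""
    rng = random.Random(seed)
    context = corpus[:ORDER]
    out = list(context)
    for _ in range(length):
        choices = model.get(context)
        if not choices:  # dead end: restart from the opening context
            context = corpus[:ORDER]
            choices = model[context]
        ch = rng.choice(choices)
        out.append(ch)
        context = context[1:] + ch
    return "".join(out)

print(generate(seed=1))
```

A real char-RNN replaces the lookup table with a recurrent network that learns a probability distribution over the next character, but the generation loop, sample a character, slide the context window forward, repeat, is essentially the same.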

Indeed, generating music in the style of Bach has long been a test for AI, and you can see neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music, rather than being inspired by it but creating something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.

Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).

There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.

Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:

The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?

Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and appears to be a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial; a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from amidst many different samples.

We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake news era people have focused on the dangers of this tech, but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible, but could we create a similar experience that people would genuinely value? Even, to the untrained eye, something indistinguishable from the real thing?

And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?

Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.

Image Credit: d1sk / Shutterstock.com

Posted in Human Robots

#432467 Dungeons and Dragons, Not Chess and Go: ...

Everyone had died—not that you’d know it, from how they were laughing about their poor choices and bad rolls of the dice. As a social anthropologist, I study how people understand artificial intelligence (AI) and our efforts towards attaining it; I’m also a life-long fan of Dungeons and Dragons (D&D), the inventive fantasy roleplaying game. During a recent quest, when I was playing an elf ranger, the trainee paladin (or holy knight) acted according to his noble character, and announced our presence at the mouth of a dragon’s lair. The results were disastrous. But while success in D&D means “beating the bad guy,” the game is also a creative sandbox, where failure can count as collective triumph so long as you tell a great tale.

What does this have to do with AI? In computer science, games are frequently used as a benchmark for an algorithm’s “intelligence.” The late Robert Wilensky, a professor at the University of California, Berkeley and a leading figure in AI, offered one reason why this might be. Computer scientists “looked around at who the smartest people were, and they were themselves, of course,” he told the authors of Compulsive Technology: Computers as Culture (1985). “They were all essentially mathematicians by training, and mathematicians do two things—they prove theorems and play chess. And they said, hey, if it proves a theorem or plays chess, it must be smart.” No surprise that demonstrations of AI’s “smarts” have focused on the artificial player’s prowess.

Yet the games that get chosen—like Go, the main battlefield for Google DeepMind’s algorithms in recent years—tend to be tightly bounded, with set objectives and clear paths to victory or defeat. These experiences have none of the open-ended collaboration of D&D. Which got me thinking: do we need a new test for intelligence, where the goal is not simply about success, but storytelling? What would it mean for an AI to “pass” as human in a game of D&D? Instead of the Turing test, perhaps we need an elf ranger test?

Of course, this is just a playful thought experiment, but it does highlight the flaws in certain models of intelligence. First, it reveals how intelligence has to work across a variety of environments. D&D participants can inhabit many characters in many games, and the individual player can “switch” between roles (the fighter, the thief, the healer). Meanwhile, AI researchers know that it’s super difficult to get a well-trained algorithm to apply its insights in even slightly different domains—something that we humans manage surprisingly well.

Second, D&D reminds us that intelligence is embodied. In computer games, the bodily aspect of the experience might range from pressing buttons on a controller in order to move an icon or avatar (a ping-pong paddle; a spaceship; an anthropomorphic, eternally hungry, yellow sphere), to more recent and immersive experiences involving virtual-reality goggles and haptic gloves. Even without these add-ons, games can still produce biological responses associated with stress and fear (if you’ve ever played Alien: Isolation you’ll understand). In the original D&D, the players encounter the game while sitting around a table together, feeling the story and its impact. Recent research in cognitive science suggests that bodily interactions are crucial to how we grasp more abstract mental concepts. But we give minimal attention to the embodiment of artificial agents, and how that might affect the way they learn and process information.

Finally, intelligence is social. AI algorithms typically learn through multiple rounds of competition, in which successful strategies get reinforced with rewards. True, it appears that humans also evolved to learn through repetition, reward and reinforcement. But there’s an important collaborative dimension to human intelligence. In the 1930s, the psychologist Lev Vygotsky identified the interaction of an expert and a novice as an example of what became called “scaffolded” learning, where the teacher demonstrates and then supports the learner in acquiring a new skill. In unbounded games, this cooperation is channelled through narrative. Games of It among small children can evolve from win/lose into attacks by terrible monsters, before shifting again to more complex narratives that explain why the monsters are attacking, who is the hero, and what they can do and why—narratives that aren’t always logical or even internally compatible. An AI that could engage in social storytelling is doubtless on a surer, more multifunctional footing than one that plays chess; and there’s no guarantee that chess is even a step on the road to attaining intelligence of this sort.

In some ways, this failure to look at roleplaying as a technical hurdle for intelligence is strange. D&D was a key cultural touchstone for technologists in the 1980s and the inspiration for many early text-based computer games, as Katie Hafner and Matthew Lyon point out in Where Wizards Stay up Late: The Origins of the Internet (1996). Even today, AI researchers who play games in their free time often mention D&D specifically. So instead of beating adversaries in games, we might learn more about intelligence if we tried to teach artificial agents to play together as we do: as paladins and elf rangers.

This article was originally published at Aeon and has been republished under Creative Commons.

Image Credit: Benny Mazur / Flickr / CC BY 2.0

Posted in Human Robots

#431389 Tech Is Becoming Emotionally ...

Many people get frustrated with technology when it malfunctions or is counterintuitive. The last thing people might expect is for that same technology to pick up on their emotions and engage with them differently as a result.
All of that is now changing. Computers are increasingly able to figure out what we’re feeling—and it’s big business.
A recent report predicts that the global affective computing market will grow from $12.2 billion in 2016 to $53.98 billion by 2021. The report by research and consultancy firm MarketsandMarkets observed that enabling technologies have already been adopted in a wide range of industries and noted a rising demand for facial feature extraction software.
Affective computing is also referred to as emotion AI or artificial emotional intelligence. Although many people are still unfamiliar with the category, researchers in academia have already discovered a multitude of uses for it.
At the University of Tokyo, Professor Toshihiko Yamasaki decided to develop a machine learning system that evaluates the quality of TED Talk videos. Of course, a TED Talk is only considered to be good if it resonates with a human audience. On the surface, this would seem too qualitatively abstract for computer analysis. But Yamasaki wanted his system to watch videos of presentations and predict user impressions. Could a machine learning system accurately evaluate the emotional persuasiveness of a speaker?
Yamasaki and his colleagues came up with a method that analyzed correlations and “multimodal features including linguistic as well as acoustic features” in a dataset of 1,646 TED Talk videos. The experiment was successful. The method obtained “a statistically significant macro-average accuracy of 93.3 percent, outperforming several competitive baseline methods.”
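The “macro-average” in that accuracy figure matters: it averages accuracy per class, so rare impression categories count as much as common ones. The paper doesn’t publish its evaluation code, so the sketch below is just my illustration of the metric itself, with made-up toy labels standing in for talk ratings.

```python
from collections import defaultdict

def macro_average_accuracy(y_true, y_pred):
    """Average the per-class accuracy, so every class counts equally
    regardless of how many examples it has."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += (t == p)
    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)

# Imbalanced toy labels: "high"-rated talks outnumber "low"-rated ones.
y_true = ["high"] * 8 + ["low"] * 2
y_pred = ["high"] * 8 + ["high", "low"]  # one "low" talk misclassified

print(macro_average_accuracy(y_true, y_pred))  # 0.75: (8/8 + 1/2) / 2
```

Note that plain accuracy on the same toy data would be 0.9, flattered by the dominant class; macro-averaging exposes the weaker performance on the rare one, which is why it’s the more honest headline number for imbalanced datasets.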
A machine was able to predict whether or not a person would emotionally connect with other people. In their report, the authors noted that these findings could be used for recommendation purposes and also as feedback to the presenters, in order to improve the quality of their public presentation. However, the usefulness of affective computing goes far beyond the way people present content. It may also transform the way they learn it.
Researchers from North Carolina State University explored the connection between students’ affective states and their ability to learn. Their software was able to accurately predict the effectiveness of online tutoring sessions by analyzing the facial expressions of participating students. The software tracked fine-grained facial movements such as eyebrow raising, eyelid tightening, and mouth dimpling to determine engagement, frustration, and learning. The authors concluded that “analysis of facial expressions has great potential for educational data mining.”
This type of technology is increasingly being used within the private sector. Affectiva is a Boston-based company that makes emotion recognition software. When asked to comment on this emerging technology, Gabi Zijderveld, chief marketing officer at Affectiva, explained in an interview for this article, “Our software measures facial expressions of emotion. So basically all you need is our software running and then access to a camera so you can basically record a face and analyze it. We can do that in real time or we can do this by looking at a video and then analyzing data and sending it back to folks.”
The technology has particular relevance for the advertising industry.
Zijderveld said, “We have products that allow you to measure how consumers or viewers respond to digital content…you could have a number of people looking at an ad, you measure their emotional response so you aggregate the data and it gives you insight into how well your content is performing. And then you can adapt and adjust accordingly.”
Zijderveld explained that this is the first market where the company got traction. However, they have since packaged up their core technology in software development kits or SDKs. This allows other companies to integrate emotion detection into whatever they are building.
By licensing its technology to others, Affectiva is now rapidly expanding into a wide variety of markets, including gaming, education, robotics, and healthcare. The core technology is also used in human resources for the purposes of video recruitment. The software analyzes the emotional responses of interviewees, and that data is factored into hiring decisions.
Richard Yonck is founder and president of Intelligent Future Consulting and the author of a book about our relationship with technology. “One area I discuss in Heart of the Machine is the idea of an emotional economy that will arise as an ecosystem of emotionally aware businesses, systems, and services are developed. This will rapidly expand into a multi-billion-dollar industry, leading to an infrastructure that will be both emotionally responsive and potentially exploitive at personal, commercial, and political levels,” said Yonck, in an interview for this article.
According to Yonck, these emotionally-aware systems will “better anticipate needs, improve efficiency, and reduce stress and misunderstandings.”
Affectiva is uniquely positioned to profit from this “emotional economy.” The company has already created the world’s largest emotion database. “We’ve analyzed a little bit over 4.7 million faces in 75 countries,” said Zijderveld. “This is data first and foremost, it’s data gathered with consent. So everyone has opted in to have their faces analyzed.”
The vastness of that database is essential for deep learning approaches. The software would be inaccurate if the data was inadequate. According to Zijderveld, “If you don’t have massive amounts of data of people of all ages, genders, and ethnicities, then your algorithms are going to be pretty biased.”
This massive database has already revealed cultural insights into how people express emotion. Zijderveld explained, “Obviously everyone knows that women are more expressive than men. But our data confirms that, but not only that, it can also show that women smile longer. They tend to smile more often. There’s also regional differences.”
Yonck believes that affective computing will inspire unimaginable forms of innovation and that change will happen at a fast pace.
He explained, “As businesses, software, systems, and services develop, they’ll support and make possible all sorts of other emotionally aware technologies that couldn’t previously exist. This leads to a spiral of increasingly sophisticated products, just as happened in the early days of computing.”
Those who are curious about affective technology will soon be able to interact with it.
Hubble Connected unveiled the Hubble Hugo at multiple trade shows this year. Hugo is billed as “the world’s first smart camera,” with emotion AI video analytics powered by Affectiva. The product can identify individuals, figure out how they’re feeling, receive voice commands, video monitor your home, and act as a photographer and videographer of events. Media can then be transmitted to the cloud. The company’s website describes Hugo as “a fun pal to have in the house.”
Although he sees the potential for improved efficiencies and expanding markets, Richard Yonck cautions that AI technology is not without its pitfalls.
“It’s critical that we understand we are headed into very unknown territory as we develop these systems, creating problems unlike any we’ve faced before,” said Yonck. “We should put our focus on ensuring AI develops in a way that represents our human values and ideals.”
Image Credit: Kisan / Shutterstock.com

Posted in Human Robots

#430686 This Week’s Awesome Stories From ...

ARTIFICIAL INTELLIGENCE
DeepMind’s AI Is Teaching Itself Parkour, and the Results Are Adorable
James Vincent | The Verge
“The research explores how reinforcement learning (or RL) can be used to teach a computer to navigate unfamiliar and complex environments. It’s the sort of fundamental AI research that we’re now testing in virtual worlds, but that will one day help program robots that can navigate the stairs in your house.”
VIRTUAL REALITY
Now You Can Broadcast Facebook Live Videos From Virtual Reality
Daniel Terdiman | Fast Company
“The idea is fairly simple. Spaces allows up to four people—each of whom must have an Oculus Rift VR headset—to hang out together in VR. Together, they can talk, chat, draw, create new objects, watch 360-degree videos, share photos, and much more. And now, they can live-broadcast everything they do in Spaces, much the same way that any Facebook user can produce live video of real life and share it with the world.”
ROBOTICS
I Watched Two Robots Chat Together on Stage at a Tech Event
Jon Russell | TechCrunch
“The robots in question are Sophia and Han, and they belong to Hanson Robotics, a Hong Kong-based company that is developing and deploying artificial intelligence in humanoids. The duo took to the stage at Rise in Hong Kong with Hanson Robotics’ Chief Scientist Ben Goertzel directing the banter. The conversation, which was partially scripted, wasn’t as slick as the human-to-human panels at the show, but it was certainly a sight to behold for the packed audience.”
BIOTECH
Scientists Used CRISPR to Put a GIF Inside a Living Organism’s DNA
Emily Mullin | MIT Technology Review
“They delivered the GIF into the living bacteria in the form of five frames: images of a galloping horse and rider, taken by English photographer Eadweard Muybridge…The researchers were then able to retrieve the data by sequencing the bacterial DNA. They reconstructed the movie with 90 percent accuracy by reading the pixel nucleotide code.”
DIGITAL MEDIA
AI Creates Fake Obama
Charles Q. Choi | IEEE Spectrum
“In the new study, the neural net learned what mouth shapes were linked to various sounds. The researchers took audio clips and dubbed them over the original sound files of a video. They next took mouth shapes that matched the new audio clips and grafted and blended them onto the video. Essentially, the researchers synthesized videos where Obama lip-synched words he said up to decades beforehand.”
Stock Media provided by adam121 / Pond5

Posted in Human Robots

#430015 Open wide! Dental students get to ...

If you don’t feel like being a guinea pig for dentists in training, you’re not alone. Enter the Japanese robot with a complete set of teeth that senses pain and allows dental students to hone their skills before moving on to …

Posted in Human Robots