Tag Archives: old
#432487 Can We Make a Musical Turing Test?
As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider to be fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer distinguish between the two? This, after all, is the idea behind the Turing Test, which has yet to be passed.
At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.
But I think above all, we view it as reductionist sacrilege: to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?
Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done so. Mozart, with his “Dice Music,” used the roll of a dice to decide how to order musical fragments; creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject from the 1960s.
Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
Indeed, generating music in the style of Bach has long been a test for AI, and you can see neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music, rather than being inspired by it but creating something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.
Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).
There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.
Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:
The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?
Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and appears to be a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial; a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from amidst many different samples.
We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake news era people have focused on the dangers of this tech– but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible—but could we create a similar experience that people would genuinely value? Even, to the untrained eye, something indistinguishable from the real thing?
And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?
Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.
Image Credit: d1sk / Shutterstock.com Continue reading
#432051 What Roboticists Are Learning From Early ...
You might not have heard of Hanson Robotics, but if you’re reading this, you’ve probably seen their work. They were the company behind Sophia, the lifelike humanoid avatar that’s made dozens of high-profile media appearances. Before that, they were the company behind that strange-looking robot that seemed a bit like Asimo with Albert Einstein’s head—or maybe you saw BINA48, who was interviewed for the New York Times in 2010 and featured in Jon Ronson’s books. For the sci-fi aficionados amongst you, they even made a replica of legendary author Philip K. Dick, best remembered for having books with titles like Do Androids Dream of Electric Sheep? turned into films with titles like Blade Runner.
Hanson Robotics, in other words, with their proprietary brand of life-like humanoid robots, have been playing the same game for a while. Sometimes it can be a frustrating game to watch. Anyone who gives the robot the slightest bit of thought will realize that this is essentially a chat-bot, with all the limitations this implies. Indeed, even in that New York Times interview with BINA48, author Amy Harmon describes it as a frustrating experience—with “rare (but invariably thrilling) moments of coherence.” This sensation will be familiar to anyone who’s conversed with a chatbot that has a few clever responses.
The glossy surface belies the lack of real intelligence underneath; it seems, at first glance, like a much more advanced machine than it is. Peeling back that surface layer—at least for a Hanson robot—means you’re peeling back Frubber. This proprietary substance—short for “Flesh Rubber,” which is slightly nightmarish—is surprisingly complicated. Up to thirty motors are required just to control the face; they manipulate liquid cells in order to make the skin soft, malleable, and capable of a range of different emotional expressions.
A quick combinatorial glance at the 30+ motors suggests that there are millions of possible combinations; researchers identify 62 that they consider “human-like” in Sophia, although not everyone agrees with this assessment. Arguably, the technical expertise that went into reconstructing the range of human facial expressions far exceeds the more simplistic chat engine the robots use, although it’s the second one that allows it to inflate the punters’ expectations with a few pre-programmed questions in an interview.
Hanson Robotics’ belief is that, ultimately, a lot of how humans will eventually relate to robots is going to depend on their faces and voices, as well as on what they’re saying. “The perception of identity is so intimately bound up with the perception of the human form,” says David Hanson, company founder.
Yet anyone attempting to design a robot that won’t terrify people has to contend with the uncanny valley—that strange blend of concern and revulsion people react with when things appear to be creepily human. Between cartoonish humanoids and genuine humans lies what has often been a no-go zone in robotic aesthetics.
The uncanny valley concept originated with roboticist Masahiro Mori, who argued that roboticists should avoid trying to replicate humans exactly. Since anything that wasn’t perfect, but merely very good, would elicit an eerie feeling in humans, shirking the challenge entirely was the only way to avoid the uncanny valley. It’s probably a task made more difficult by endless streams of articles about AI taking over the world that inexplicably conflate AI with killer humanoid Terminators—which aren’t particularly likely to exist (although maybe it’s best not to push robots around too much).
The idea behind this realm of psychological horror is fairly simple, cognitively speaking.
We know how to categorize things that are unambiguously human or non-human. This is true even if they’re designed to interact with us. Consider the popularity of Aibo, Jibo, or even some robots that don’t try to resemble humans. Something that resembles a human, but isn’t quite right, is bound to evoke a fear response in the same way slightly distorted music or slightly rearranged furniture in your home will. The creature simply doesn’t fit.
You may well reject the idea of the uncanny valley entirely. David Hanson, naturally, is not a fan. In the paper Upending the Uncanny Valley, he argues that great art forms have often resembled humans, but the ultimate goal for humanoid roboticists is probably to create robots we can relate to as something closer to humans than works of art.
Meanwhile, Hanson and other scientists produce competing experiments to either demonstrate that the uncanny valley is overhyped, or to confirm it exists and probe its edges.
The classic experiment involves gradually morphing a cartoon face into a human face, via some robotic-seeming intermediaries—yet it’s in movement that the real horror of the almost-human often lies. Hanson has argued that incorporating cartoonish features may help—and, sometimes, that the uncanny valley is a generational thing which will melt away when new generations grow used to the quirks of robots. Although Hanson might dispute the severity of this effect, it’s clearly what he’s trying to avoid with each new iteration.
Hiroshi Ishiguro is the latest of the roboticists to have dived headlong into the valley.
Building on the work of pioneers like Hanson, those who study human-robot interaction are pushing at the boundaries of robotics—but also of social science. It’s usually difficult to simulate what you don’t understand, and there’s still an awful lot we don’t understand about how we interpret the constant streams of non-verbal information that flow when you interact with people in the flesh.
Ishiguro took this imitation of human forms to extreme levels. Not only did he monitor and log the physical movements people made on videotapes, but some of his robots are based on replicas of people; the Repliee series began with a ‘replicant’ of his daughter. This involved making a rubber replica—a silicone cast—of her entire body. Future experiments were focused on creating Geminoid, a replica of Ishiguro himself.
As Ishiguro aged, he realized that it would be more effective to resemble his replica through cosmetic surgery rather than by continually creating new casts of his face, each with more lines than the last. “I decided not to get old anymore,” Ishiguro said.
We love to throw around abstract concepts and ideas: humans being replaced by machines, cared for by machines, getting intimate with machines, or even merging themselves with machines. You can take an idea like that, hold it in your hand, and examine it—dispassionately, if not without interest. But there’s a gulf between thinking about it and living in a world where human-robot interaction is not a field of academic research, but a day-to-day reality.
As the scientists studying human-robot interaction develop their robots, their replicas, and their experiments, they are making some of the first forays into that world. We might all be living there someday. Understanding ourselves—decrypting the origins of empathy and love—may be the greatest challenge to face. That is, if you want to avoid the valley.
Image Credit: Anton Gvozdikov / Shutterstock.com Continue reading
#431928 How Fast Is AI Progressing? Stanford’s ...
When? This is probably the question that futurists, AI experts, and even people with a keen interest in technology dread the most. It has proved famously difficult to predict when new developments in AI will take place. The scientists at the Dartmouth Summer Research Project on Artificial Intelligence in 1956 thought that perhaps two months would be enough to make “significant advances” in a whole range of complex problems, including computers that can understand language, improve themselves, and even understand abstract concepts.
Sixty years later, and these problems are not yet solved. The AI Index, from Stanford, is an attempt to measure how much progress has been made in artificial intelligence.
The index adopts a unique approach, and tries to aggregate data across many regimes. It contains Volume of Activity metrics, which measure things like venture capital investment, attendance at academic conferences, published papers, and so on. The results are what you might expect: tenfold increases in academic activity since 1996, an explosive growth in startups focused around AI, and corresponding venture capital investment. The issue with this metric is that it measures AI hype as much as AI progress. The two might be correlated, but then again, they may not.
The index also scrapes data from the popular coding website Github, which hosts more source code than anyone in the world. They can track the amount of AI-related software people are creating, as well as the interest levels in popular machine learning packages like Tensorflow and Keras. The index also keeps track of the sentiment of news articles that mention AI: surprisingly, given concerns about the apocalypse and an employment crisis, those considered “positive” outweigh the “negative” by three to one.
But again, this could all just be a measure of AI enthusiasm in general.
No one would dispute the fact that we’re in an age of considerable AI hype, but the progress of AI is littered by booms and busts in hype, growth spurts that alternate with AI winters. So the AI Index attempts to track the progress of algorithms against a series of tasks. How well does computer vision perform at the Large Scale Visual Recognition challenge? (Superhuman at annotating images since 2015, but they still can’t answer questions about images very well, combining natural language processing and image recognition). Speech recognition on phone calls is almost at parity.
In other narrow fields, AIs are still catching up to humans. Translation might be good enough that you can usually get the gist of what’s being said, but still scores poorly on the BLEU metric for translation accuracy. The AI index even keeps track of how well the programs can do on the SAT test, so if you took it, you can compare your score to an AI’s.
Measuring the performance of state-of-the-art AI systems on narrow tasks is useful and fairly easy to do. You can define a metric that’s simple to calculate, or devise a competition with a scoring system, and compare new software with old in a standardized way. Academics can always debate about the best method of assessing translation or natural language understanding. The Loebner prize, a simplified question-and-answer Turing Test, recently adopted Winograd Schema type questions, which rely on contextual understanding. AI has more difficulty with these.
Where the assessment really becomes difficult, though, is in trying to map these narrow-task performances onto general intelligence. This is hard because of a lack of understanding of our own intelligence. Computers are superhuman at chess, and now even a more complex game like Go. The braver predictors who came up with timelines thought AlphaGo’s success was faster than expected, but does this necessarily mean we’re closer to general intelligence than they thought?
Here is where it’s harder to track progress.
We can note the specialized performance of algorithms on tasks previously reserved for humans—for example, the index cites a Nature paper that shows AI can now predict skin cancer with more accuracy than dermatologists. We could even try to track one specific approach to general AI; for example, how many regions of the brain have been successfully simulated by a computer? Alternatively, we could simply keep track of the number of professions and professional tasks that can now be performed to an acceptable standard by AI.
“We are running a race, but we don’t know how to get to the endpoint, or how far we have to go.”
Progress in AI over the next few years is far more likely to resemble a gradual rising tide—as more and more tasks can be turned into algorithms and accomplished by software—rather than the tsunami of a sudden intelligence explosion or general intelligence breakthrough. Perhaps measuring the ability of an AI system to learn and adapt to the work routines of humans in office-based tasks could be possible.
The AI index doesn’t attempt to offer a timeline for general intelligence, as this is still too nebulous and confused a concept.
Michael Woodridge, head of Computer Science at the University of Oxford, notes, “The main reason general AI is not captured in the report is that neither I nor anyone else would know how to measure progress.” He is concerned about another AI winter, and overhyped “charlatans and snake-oil salesmen” exaggerating the progress that has been made.
A key concern that all the experts bring up is the ethics of artificial intelligence.
Of course, you don’t need general intelligence to have an impact on society; algorithms are already transforming our lives and the world around us. After all, why are Amazon, Google, and Facebook worth any money? The experts agree on the need for an index to measure the benefits of AI, the interactions between humans and AIs, and our ability to program values, ethics, and oversight into these systems.
Barbra Grosz of Harvard champions this view, saying, “It is important to take on the challenge of identifying success measures for AI systems by their impact on people’s lives.”
For those concerned about the AI employment apocalypse, tracking the use of AI in the fields considered most vulnerable (say, self-driving cars replacing taxi drivers) would be a good idea. Society’s flexibility for adapting to AI trends should be measured, too; are we providing people with enough educational opportunities to retrain? How about teaching them to work alongside the algorithms, treating them as tools rather than replacements? The experts also note that the data suffers from being US-centric.
We are running a race, but we don’t know how to get to the endpoint, or how far we have to go. We are judging by the scenery, and how far we’ve run already. For this reason, measuring progress is a daunting task that starts with defining progress. But the AI index, as an annual collection of relevant information, is a good start.
Image Credit: Photobank gallery / Shutterstock.com Continue reading
#431902 Old dog, new tricks: Sony unleashes ...
As Japan celebrates the year of the dog, electronics giant Sony on Thursday unleashed its new robot canine companion, packed with artificial intelligence and internet connectivity. Continue reading