Tag Archives: hearing
As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider to be fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer distinguish between the two? This, after all, is the idea behind the Turing Test, which has yet to be passed.
At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.
But I think above all, we view it as reductionist sacrilege: to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?
Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done so. Mozart, with his “Dice Music,” used the roll of a dice to decide how to order musical fragments; creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject from the 1960s.
Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
Indeed, generating music in the style of Bach has long been a test for AI, and you can see neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music, rather than being inspired by it but creating something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.
Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).
There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.
Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:
The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?
Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and appears to be a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial; a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from amidst many different samples.
We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake news era people have focused on the dangers of this tech– but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible—but could we create a similar experience that people would genuinely value? Even, to the untrained eye, something indistinguishable from the real thing?
And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?
Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.
Image Credit: d1sk / Shutterstock.com Continue reading
Ever wished you could be in two places at the same time? The XPRIZE Foundation wants to make that a reality with a $10 million competition to build robot avatars that can be controlled from at least 100 kilometers away.
The competition was announced by XPRIZE founder Peter Diamandis at the SXSW conference in Austin last week, with an ambitious timeline of awarding the grand prize by October 2021. Teams have until October 31st to sign up, and they need to submit detailed plans to a panel of judges by the end of next January.
The prize, sponsored by Japanese airline ANA, has given contestants little guidance on how they expect them to solve the challenge other than saying their solutions need to let users see, hear, feel, and interact with the robot’s environment as well as the people in it.
XPRIZE has also not revealed details of what kind of tasks the robots will be expected to complete, though they’ve said tasks will range from “simple” to “complex,” and it should be possible for an untrained operator to use them.
That’s a hugely ambitious goal that’s likely to require teams to combine multiple emerging technologies, from humanoid robotics to virtual reality high-bandwidth communications and high-resolution haptics.
If any of the teams succeed, the technology could have myriad applications, from letting emergency responders enter areas too hazardous for humans to helping people care for relatives who live far away or even just allowing tourists to visit other parts of the world without the jet lag.
“Our ability to physically experience another geographic location, or to provide on-the-ground assistance where needed, is limited by cost and the simple availability of time,” Diamandis said in a statement.
“The ANA Avatar XPRIZE can enable creation of an audacious alternative that could bypass these limitations, allowing us to more rapidly and efficiently distribute skill and hands-on expertise to distant geographic locations where they are needed, bridging the gap between distance, time, and cultures,” he added.
Interestingly, the technology may help bypass an enduring hand break on the widespread use of robotics: autonomy. By having a human in the loop, you don’t need nearly as much artificial intelligence analyzing sensory input and making decisions.
Robotics software is doing a lot more than just high-level planning and strategizing, though. While a human moves their limbs instinctively without consciously thinking about which muscles to activate, controlling and coordinating a robot’s components requires sophisticated algorithms.
The DARPA Robotics Challenge demonstrated just how hard it was to get human-shaped robots to do tasks humans would find simple, such as opening doors, climbing steps, and even just walking. These robots were supposedly semi-autonomous, but on many tasks they were essentially tele-operated, and the results suggested autonomy isn’t the only problem.
There’s also the issue of powering these devices. You may have noticed that in a lot of the slick web videos of humanoid robots doing cool things, the machine is attached to the roof by a large cable. That’s because they suck up huge amounts of power.
Possibly the most advanced humanoid robot—Boston Dynamics’ Atlas—has a battery, but it can only run for about an hour. That might be fine for some applications, but you don’t want it running out of juice halfway through rescuing someone from a mine shaft.
When it comes to the link between the robot and its human user, some of the technology is probably not that much of a stretch. Virtual reality headsets can create immersive audio-visual environments, and a number of companies are working on advanced haptic suits that will let people “feel” virtual environments.
Motion tracking technology may be more complicated. While even consumer-grade devices can track peoples’ movements with high accuracy, you will probably need to don something more like an exoskeleton that can both pick up motion and provide mechanical resistance, so that when the robot bumps into an immovable object, the user stops dead too.
How hard all of this will be is also dependent on how the competition ultimately defines subjective terms like “feel” and “interact.” Will the user need to be able to feel a gentle breeze on the robot’s cheek or be able to paint a watercolor? Or will simply having the ability to distinguish a hard object from a soft one or shake someone’s hand be enough?
Whatever the fidelity they decide on, the approach will require huge amounts of sensory and control data to be transmitted over large distances, most likely wirelessly, in a way that’s fast and reliable enough that there’s no lag or interruptions. Fortunately 5G is launching this year, with a speed of 10 gigabits per second and very low latency, so this problem should be solved by 2021.
And it’s worth remembering there have already been some tentative attempts at building robotic avatars. Telepresence robots have solved the seeing, hearing, and some of the interacting problems, and MIT has already used virtual reality to control robots to carry out complex manipulation tasks.
South Korean company Hankook Mirae Technology has also unveiled a 13-foot-tall robotic suit straight out of a sci-fi movie that appears to have made some headway with the motion tracking problem, albeit with a human inside the robot. Toyota’s T-HR3 does the same, but with the human controlling the robot from a “Master Maneuvering System” that marries motion tracking with VR.
Combining all of these capabilities into a single machine will certainly prove challenging. But if one of the teams pulls it off, you may be able to tick off trips to the Seven Wonders of the World without ever leaving your house.
Image Credit: ANA Avatar XPRIZE Continue reading