Tag Archives: character
As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider to be fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer distinguish between the two? This, after all, is the idea behind the Turing Test, which has yet to be passed.
At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.
But I think above all, we view it as reductionist sacrilege: to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?
Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done so. Mozart, with his “Dice Music,” used the roll of a dice to decide how to order musical fragments; creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject from the 1960s.
Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
Indeed, generating music in the style of Bach has long been a test for AI, and you can see neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music, rather than being inspired by it but creating something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.
Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).
There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.
Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:
The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?
Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and appears to be a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial; a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from amidst many different samples.
We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake news era people have focused on the dangers of this tech– but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible—but could we create a similar experience that people would genuinely value? Even, to the untrained eye, something indistinguishable from the real thing?
And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?
Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.
Image Credit: d1sk / Shutterstock.com Continue reading
The best storytellers react to their audience. They look for smiles, signs of awe, or boredom; they simultaneously and skillfully read both the story and their sitters. Kevin Brooks, a seasoned storyteller working for Motorola’s Human Interface Labs, explains, “As the storyteller begins, they must tune in to… the audience’s energy. Based on this energy, the storyteller will adjust their timing, their posture, their characterizations, and sometimes even the events of the story. There is a dialog between audience and storyteller.”
Shortly after I read the script to Melita, the latest virtual reality experience from Madrid-based immersive storytelling company Future Lighthouse, CEO Nicolas Alcalá explained to me that the piece is an example of “reactive content,” a concept he’s been working on since his days at Singularity University.
For the first time in history, we have access to technology that can merge the reactive and affective elements of oral storytelling with the affordances of digital media, weaving stunning visuals, rich soundtracks, and complex meta-narratives in a story arena that has the capability to know you more intimately than any conventional storyteller could.
It’s no understatement to say that the storytelling potential here is phenomenal.
In short, we can refer to content as reactive if it reads and reacts to users based on their body rhythms, emotions, preferences, and data points. Artificial intelligence is used to analyze users’ behavior or preferences to sculpt unique storylines and narratives, essentially allowing for a story that changes in real time based on who you are and how you feel.
The development of reactive content will allow those working in the industry to go one step further than simply translating the essence of oral storytelling into VR. Rather than having a narrative experience with a digital storyteller who can read you, reactive content has the potential to create an experience with a storyteller who knows you.
This means being able to subtly insert minor personal details that have a specific meaning to the viewer. When we talk to our friends we often use experiences we’ve shared in the past or knowledge of our audience to give our story as much resonance as possible. Targeting personal memories and aspects of our lives is a highly effective way to elicit emotions and aid in visualizing narratives. When you can do this with the addition of visuals, music, and characters—all lifted from someone’s past—you have the potential for overwhelmingly engaging and emotionally-charged content.
Future Lighthouse inform me that for now, reactive content will rely primarily on biometric feedback technology such as breathing, heartbeat, and eye tracking sensors. A simple example would be a story in which parts of the environment or soundscape change in sync with the user’s heartbeat and breathing, or characters who call you out for not paying attention.
The next step would be characters and situations that react to the user’s emotions, wherein algorithms analyze biometric information to make inferences about states of emotional arousal (“why are you so nervous?” etc.). Another example would be implementing the use of “arousal parameters,” where the audience can choose what level of “fear” they want from a VR horror story before algorithms modulate the experience using information from biometric feedback devices.
The company’s long-term goal is to gather research on storytelling conventions and produce a catalogue of story “wireframes.” This entails distilling the basic formula to different genres so they can then be fleshed out with visuals, character traits, and soundtracks that are tailored for individual users based on their deep data, preferences, and biometric information.
The development of reactive content will go hand in hand with a renewed exploration of diverging, dynamic storylines, and multi-narratives, a concept that hasn’t had much impact in the movie world thus far. In theory, the idea of having a story that changes and mutates is captivating largely because of our love affair with serendipity and unpredictability, a cultural condition theorist Arthur Kroker refers to as the “hypertextual imagination.” This feeling of stepping into the unknown with the possibility of deviation from the habitual translates as a comforting reminder that our own lives can take exciting and unexpected turns at any moment.
The inception of the concept into mainstream culture dates to the classic Choose Your Own Adventure book series that launched in the late 70s, which in its literary form had great success. However, filmic takes on the theme have made somewhat less of an impression. DVDs like I’m Your Man (1998) and Switching (2003) both use scene selection tools to determine the direction of the storyline.
A more recent example comes from Kino Industries, who claim to have developed the technology to allow filmmakers to produce interactive films in which viewers can use smartphones to quickly vote on which direction the narrative takes at numerous decision points throughout the film.
The main problem with diverging narrative films has been the stop-start nature of the interactive element: when I’m immersed in a story I don’t want to have to pick up a controller or remote to select what’s going to happen next. Every time the audience is given the option to take a new path (“press this button”, “vote on X, Y, Z”) the narrative— and immersion within that narrative—is temporarily halted, and it takes the mind a while to get back into this state of immersion.
Reactive content has the potential to resolve these issues by enabling passive interactivity—that is, input and output without having to pause and actively make decisions or engage with the hardware. This will result in diverging, dynamic narratives that will unfold seamlessly while being dependent on and unique to the specific user and their emotions. Passive interactivity will also remove the game feel that can often be a symptom of interactive experiences and put a viewer somewhere in the middle: still firmly ensconced in an interactive dynamic narrative, but in a much subtler way.
While reading the Melita script I was particularly struck by a scene in which the characters start to engage with the user and there’s a synchronicity between the user’s heartbeat and objects in the virtual world. As the narrative unwinds and the words of Melita’s character get more profound, parts of the landscape, which seemed to be flashing and pulsating at random, come together and start to mimic the user’s heartbeat.
In 2013, Jane Aspell of Anglia Ruskin University (UK) and Lukas Heydrich of the Swiss Federal Institute of Technology proved that a user’s sense of presence and identification with a virtual avatar could be dramatically increased by syncing the on-screen character with the heartbeat of the user. The relationship between bio-digital synchronicity, immersion, and emotional engagement is something that will surely have revolutionary narrative and storytelling potential.
Image Credit: Tithi Luadthong / Shutterstock.com Continue reading
Major websites all over the world use a system called CAPTCHA to verify that someone is indeed a human and not a bot when entering data or signing into an account. CAPTCHA stands for the “Completely Automated Public Turing test to tell Computers and Humans Apart.” The squiggly letters and numbers, often posted against photographs or textured backgrounds, have been a good way to foil hackers. They are annoying but effective.
The days of CAPTCHA as a viable line of defense may, however, be numbered.
Researchers at Vicarious, a Californian artificial intelligence firm funded by Amazon founder Jeff Bezos and Facebook’s Mark Zuckerberg, have just published a paper documenting how they were able to defeat CAPTCHA using new artificial intelligence techniques. Whereas today’s most advanced artificial intelligence (AI) technologies use neural networks that require massive amounts of data to learn from, sometimes millions of examples, the researchers said their system needed just five training steps to crack Google’s reCAPTCHA technology. With this, they achieved a 67 percent success rate per character—reasonably close to the human accuracy rate of 87 percent. In answering PayPal and Yahoo CAPTCHAs, the system achieved an accuracy rate of greater than 50 percent.
The CAPTCHA breakthrough came hard on the heels of another major milestone from Google’s DeepMind team, the people who built the world’s best Go-playing system. DeepMind built a new artificial-intelligence system called AlphaGo Zero that taught itself to play the game at a world-beating level with minimal training data, mainly using trial and error—in a fashion similar to how humans learn.
Both playing Go and deciphering CAPTCHAs are clear examples of what we call narrow AI, which is different from artificial general intelligence (AGI)—the stuff of science fiction. Remember R2-D2 of Star Wars, Ava from Ex Machina, and Samantha from Her? They could do many things and learned everything they needed on their own.
Narrow AI technologies are systems that can only perform one specific type of task. For example, if you asked AlphaGo Zero to learn to play Monopoly, it could not, even though that is a far less sophisticated game than Go. If you asked the CAPTCHA cracker to learn to understand a spoken phrase, it would not even know where to start.
To date, though, even narrow AI has been difficult to build and perfect. To perform very elementary tasks such as determining whether an image is of a cat or a dog, the system requires the development of a model that details exactly what is being analyzed and massive amounts of data with labeled examples of both. The examples are used to train the AI systems, which are modeled on the neural networks in the brain, in which the connections between layers of neurons are adjusted based on what is observed. To put it simply, you tell an AI system exactly what to learn, and the more data you give it, the more accurate it becomes.
The methods that Vicarious and Google used were different; they allowed the systems to learn on their own, albeit in a narrow field. By making their own assumptions about what the training model should be and trying different permutations until they got the right results, they were able to teach themselves how to read the letters in a CAPTCHA or to play a game.
This blurs the line between narrow AI and AGI and has broader implications in robotics and virtually any other field in which machine learning in complex environments may be relevant.
Beyond visual recognition, the Vicarious breakthrough and AlphaGo Zero success are encouraging scientists to think about how AIs can learn to do things from scratch. And this brings us one step closer to coexisting with classes of AIs and robots that can learn to perform new tasks that are slight variants on their previous tasks—and ultimately the AGI of science fiction.
So R2-D2 may be here sooner than we expected.
This article was originally published by The Washington Post. Read the original article here.
Image Credit: Zapp2Photo / Shutterstock.com Continue reading