Tag Archives: emotional
As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider to be fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer distinguish between the two? This, after all, is the idea behind the Turing Test, which has yet to be passed.
At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.
But I think above all, we view it as reductionist sacrilege: to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?
Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done so. Mozart, with his “Dice Music,” used the roll of a dice to decide how to order musical fragments; creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject from the 1960s.
Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
Indeed, generating music in the style of Bach has long been a test for AI, and you can see neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music, rather than being inspired by it but creating something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.
Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).
There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.
Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:
The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?
Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and appears to be a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial; a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from amidst many different samples.
We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake news era people have focused on the dangers of this tech– but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible—but could we create a similar experience that people would genuinely value? Even, to the untrained eye, something indistinguishable from the real thing?
And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?
Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.
Image Credit: d1sk / Shutterstock.com Continue reading
The best storytellers react to their audience. They look for smiles, signs of awe, or boredom; they simultaneously and skillfully read both the story and their sitters. Kevin Brooks, a seasoned storyteller working for Motorola’s Human Interface Labs, explains, “As the storyteller begins, they must tune in to… the audience’s energy. Based on this energy, the storyteller will adjust their timing, their posture, their characterizations, and sometimes even the events of the story. There is a dialog between audience and storyteller.”
Shortly after I read the script to Melita, the latest virtual reality experience from Madrid-based immersive storytelling company Future Lighthouse, CEO Nicolas Alcalá explained to me that the piece is an example of “reactive content,” a concept he’s been working on since his days at Singularity University.
For the first time in history, we have access to technology that can merge the reactive and affective elements of oral storytelling with the affordances of digital media, weaving stunning visuals, rich soundtracks, and complex meta-narratives in a story arena that has the capability to know you more intimately than any conventional storyteller could.
It’s no understatement to say that the storytelling potential here is phenomenal.
In short, we can refer to content as reactive if it reads and reacts to users based on their body rhythms, emotions, preferences, and data points. Artificial intelligence is used to analyze users’ behavior or preferences to sculpt unique storylines and narratives, essentially allowing for a story that changes in real time based on who you are and how you feel.
The development of reactive content will allow those working in the industry to go one step further than simply translating the essence of oral storytelling into VR. Rather than having a narrative experience with a digital storyteller who can read you, reactive content has the potential to create an experience with a storyteller who knows you.
This means being able to subtly insert minor personal details that have a specific meaning to the viewer. When we talk to our friends we often use experiences we’ve shared in the past or knowledge of our audience to give our story as much resonance as possible. Targeting personal memories and aspects of our lives is a highly effective way to elicit emotions and aid in visualizing narratives. When you can do this with the addition of visuals, music, and characters—all lifted from someone’s past—you have the potential for overwhelmingly engaging and emotionally-charged content.
Future Lighthouse inform me that for now, reactive content will rely primarily on biometric feedback technology such as breathing, heartbeat, and eye tracking sensors. A simple example would be a story in which parts of the environment or soundscape change in sync with the user’s heartbeat and breathing, or characters who call you out for not paying attention.
The next step would be characters and situations that react to the user’s emotions, wherein algorithms analyze biometric information to make inferences about states of emotional arousal (“why are you so nervous?” etc.). Another example would be implementing the use of “arousal parameters,” where the audience can choose what level of “fear” they want from a VR horror story before algorithms modulate the experience using information from biometric feedback devices.
The company’s long-term goal is to gather research on storytelling conventions and produce a catalogue of story “wireframes.” This entails distilling the basic formula to different genres so they can then be fleshed out with visuals, character traits, and soundtracks that are tailored for individual users based on their deep data, preferences, and biometric information.
The development of reactive content will go hand in hand with a renewed exploration of diverging, dynamic storylines, and multi-narratives, a concept that hasn’t had much impact in the movie world thus far. In theory, the idea of having a story that changes and mutates is captivating largely because of our love affair with serendipity and unpredictability, a cultural condition theorist Arthur Kroker refers to as the “hypertextual imagination.” This feeling of stepping into the unknown with the possibility of deviation from the habitual translates as a comforting reminder that our own lives can take exciting and unexpected turns at any moment.
The inception of the concept into mainstream culture dates to the classic Choose Your Own Adventure book series that launched in the late 70s, which in its literary form had great success. However, filmic takes on the theme have made somewhat less of an impression. DVDs like I’m Your Man (1998) and Switching (2003) both use scene selection tools to determine the direction of the storyline.
A more recent example comes from Kino Industries, who claim to have developed the technology to allow filmmakers to produce interactive films in which viewers can use smartphones to quickly vote on which direction the narrative takes at numerous decision points throughout the film.
The main problem with diverging narrative films has been the stop-start nature of the interactive element: when I’m immersed in a story I don’t want to have to pick up a controller or remote to select what’s going to happen next. Every time the audience is given the option to take a new path (“press this button”, “vote on X, Y, Z”) the narrative— and immersion within that narrative—is temporarily halted, and it takes the mind a while to get back into this state of immersion.
Reactive content has the potential to resolve these issues by enabling passive interactivity—that is, input and output without having to pause and actively make decisions or engage with the hardware. This will result in diverging, dynamic narratives that will unfold seamlessly while being dependent on and unique to the specific user and their emotions. Passive interactivity will also remove the game feel that can often be a symptom of interactive experiences and put a viewer somewhere in the middle: still firmly ensconced in an interactive dynamic narrative, but in a much subtler way.
While reading the Melita script I was particularly struck by a scene in which the characters start to engage with the user and there’s a synchronicity between the user’s heartbeat and objects in the virtual world. As the narrative unwinds and the words of Melita’s character get more profound, parts of the landscape, which seemed to be flashing and pulsating at random, come together and start to mimic the user’s heartbeat.
In 2013, Jane Aspell of Anglia Ruskin University (UK) and Lukas Heydrich of the Swiss Federal Institute of Technology proved that a user’s sense of presence and identification with a virtual avatar could be dramatically increased by syncing the on-screen character with the heartbeat of the user. The relationship between bio-digital synchronicity, immersion, and emotional engagement is something that will surely have revolutionary narrative and storytelling potential.
Image Credit: Tithi Luadthong / Shutterstock.com Continue reading
Many people get frustrated with technology when it malfunctions or is counterintuitive. The last thing people might expect is for that same technology to pick up on their emotions and engage with them differently as a result.
All of that is now changing. Computers are increasingly able to figure out what we’re feeling—and it’s big business.
A recent report predicts that the global affective computing market will grow from $12.2 billion in 2016 to $53.98 billion by 2021. The report by research and consultancy firm MarketsandMarkets observed that enabling technologies have already been adopted in a wide range of industries and noted a rising demand for facial feature extraction software.
Affective computing is also referred to as emotion AI or artificial emotional intelligence. Although many people are still unfamiliar with the category, researchers in academia have already discovered a multitude of uses for it.
At the University of Tokyo, Professor Toshihiko Yamasaki decided to develop a machine learning system that evaluates the quality of TED Talk videos. Of course, a TED Talk is only considered to be good if it resonates with a human audience. On the surface, this would seem too qualitatively abstract for computer analysis. But Yamasaki wanted his system to watch videos of presentations and predict user impressions. Could a machine learning system accurately evaluate the emotional persuasiveness of a speaker?
Yamasaki and his colleagues came up with a method that analyzed correlations and “multimodal features including linguistic as well as acoustic features” in a dataset of 1,646 TED Talk videos. The experiment was successful. The method obtained “a statistically significant macro-average accuracy of 93.3 percent, outperforming several competitive baseline methods.”
A machine was able to predict whether or not a person would emotionally connect with other people. In their report, the authors noted that these findings could be used for recommendation purposes and also as feedback to the presenters, in order to improve the quality of their public presentation. However, the usefulness of affective computing goes far beyond the way people present content. It may also transform the way they learn it.
Researchers from North Carolina State University explored the connection between students’ affective states and their ability to learn. Their software was able to accurately predict the effectiveness of online tutoring sessions by analyzing the facial expressions of participating students. The software tracked fine-grained facial movements such as eyebrow raising, eyelid tightening, and mouth dimpling to determine engagement, frustration, and learning. The authors concluded that “analysis of facial expressions has great potential for educational data mining.”
This type of technology is increasingly being used within the private sector. Affectiva is a Boston-based company that makes emotion recognition software. When asked to comment on this emerging technology, Gabi Zijderveld, chief marketing officer at Affectiva, explained in an interview for this article, “Our software measures facial expressions of emotion. So basically all you need is our software running and then access to a camera so you can basically record a face and analyze it. We can do that in real time or we can do this by looking at a video and then analyzing data and sending it back to folks.”
The technology has particular relevance for the advertising industry.
Zijderveld said, “We have products that allow you to measure how consumers or viewers respond to digital content…you could have a number of people looking at an ad, you measure their emotional response so you aggregate the data and it gives you insight into how well your content is performing. And then you can adapt and adjust accordingly.”
Zijderveld explained that this is the first market where the company got traction. However, they have since packaged up their core technology in software development kits or SDKs. This allows other companies to integrate emotion detection into whatever they are building.
By licensing its technology to others, Affectiva is now rapidly expanding into a wide variety of markets, including gaming, education, robotics, and healthcare. The core technology is also used in human resources for the purposes of video recruitment. The software analyzes the emotional responses of interviewees, and that data is factored into hiring decisions.
Richard Yonck is founder and president of Intelligent Future Consulting and the author of a book about our relationship with technology. “One area I discuss in Heart of the Machine is the idea of an emotional economy that will arise as an ecosystem of emotionally aware businesses, systems, and services are developed. This will rapidly expand into a multi-billion-dollar industry, leading to an infrastructure that will be both emotionally responsive and potentially exploitive at personal, commercial, and political levels,” said Yonck, in an interview for this article.
According to Yonck, these emotionally-aware systems will “better anticipate needs, improve efficiency, and reduce stress and misunderstandings.”
Affectiva is uniquely positioned to profit from this “emotional economy.” The company has already created the world’s largest emotion database. “We’ve analyzed a little bit over 4.7 million faces in 75 countries,” said Zijderveld. “This is data first and foremost, it’s data gathered with consent. So everyone has opted in to have their faces analyzed.”
The vastness of that database is essential for deep learning approaches. The software would be inaccurate if the data was inadequate. According to Zijderveld, “If you don’t have massive amounts of data of people of all ages, genders, and ethnicities, then your algorithms are going to be pretty biased.”
This massive database has already revealed cultural insights into how people express emotion. Zijderveld explained, “Obviously everyone knows that women are more expressive than men. But our data confirms that, but not only that, it can also show that women smile longer. They tend to smile more often. There’s also regional differences.”
Yonck believes that affective computing will inspire unimaginable forms of innovation and that change will happen at a fast pace.
He explained, “As businesses, software, systems, and services develop, they’ll support and make possible all sorts of other emotionally aware technologies that couldn’t previously exist. This leads to a spiral of increasingly sophisticated products, just as happened in the early days of computing.”
Those who are curious about affective technology will soon be able to interact with it.
Hubble Connected unveiled the Hubble Hugo at multiple trade shows this year. Hugo is billed as “the world’s first smart camera,” with emotion AI video analytics powered by Affectiva. The product can identify individuals, figure out how they’re feeling, receive voice commands, video monitor your home, and act as a photographer and videographer of events. Media can then be transmitted to the cloud. The company’s website describes Hugo as “a fun pal to have in the house.”
Although he sees the potential for improved efficiencies and expanding markets, Richard Yonck cautions that AI technology is not without its pitfalls.
“It’s critical that we understand we are headed into very unknown territory as we develop these systems, creating problems unlike any we’ve faced before,” said Yonck. “We should put our focus on ensuring AI develops in a way that represents our human values and ideals.”
Image Credit: Kisan / Shutterstock.com Continue reading