Tag Archives: voice
Many people get frustrated with technology when it malfunctions or is counterintuitive. The last thing people might expect is for that same technology to pick up on their emotions and engage with them differently as a result.
All of that is now changing. Computers are increasingly able to figure out what we’re feeling—and it’s big business.
A recent report predicts that the global affective computing market will grow from $12.2 billion in 2016 to $53.98 billion by 2021. The report by research and consultancy firm MarketsandMarkets observed that enabling technologies have already been adopted in a wide range of industries and noted a rising demand for facial feature extraction software.
Affective computing is also referred to as emotion AI or artificial emotional intelligence. Although many people are still unfamiliar with the category, researchers in academia have already discovered a multitude of uses for it.
At the University of Tokyo, Professor Toshihiko Yamasaki decided to develop a machine learning system that evaluates the quality of TED Talk videos. Of course, a TED Talk is only considered to be good if it resonates with a human audience. On the surface, this would seem too qualitatively abstract for computer analysis. But Yamasaki wanted his system to watch videos of presentations and predict user impressions. Could a machine learning system accurately evaluate the emotional persuasiveness of a speaker?
Yamasaki and his colleagues came up with a method that analyzed correlations and “multimodal features including linguistic as well as acoustic features” in a dataset of 1,646 TED Talk videos. The experiment was successful. The method obtained “a statistically significant macro-average accuracy of 93.3 percent, outperforming several competitive baseline methods.”
A machine was able to predict whether or not a person would emotionally connect with other people. In their report, the authors noted that these findings could be used for recommendation purposes and also as feedback to the presenters, in order to improve the quality of their public presentation. However, the usefulness of affective computing goes far beyond the way people present content. It may also transform the way they learn it.
Researchers from North Carolina State University explored the connection between students’ affective states and their ability to learn. Their software was able to accurately predict the effectiveness of online tutoring sessions by analyzing the facial expressions of participating students. The software tracked fine-grained facial movements such as eyebrow raising, eyelid tightening, and mouth dimpling to determine engagement, frustration, and learning. The authors concluded that “analysis of facial expressions has great potential for educational data mining.”
This type of technology is increasingly being used within the private sector. Affectiva is a Boston-based company that makes emotion recognition software. When asked to comment on this emerging technology, Gabi Zijderveld, chief marketing officer at Affectiva, explained in an interview for this article, “Our software measures facial expressions of emotion. So basically all you need is our software running and then access to a camera so you can basically record a face and analyze it. We can do that in real time or we can do this by looking at a video and then analyzing data and sending it back to folks.”
The technology has particular relevance for the advertising industry.
Zijderveld said, “We have products that allow you to measure how consumers or viewers respond to digital content…you could have a number of people looking at an ad, you measure their emotional response so you aggregate the data and it gives you insight into how well your content is performing. And then you can adapt and adjust accordingly.”
Zijderveld explained that this is the first market where the company got traction. However, they have since packaged up their core technology in software development kits or SDKs. This allows other companies to integrate emotion detection into whatever they are building.
By licensing its technology to others, Affectiva is now rapidly expanding into a wide variety of markets, including gaming, education, robotics, and healthcare. The core technology is also used in human resources for the purposes of video recruitment. The software analyzes the emotional responses of interviewees, and that data is factored into hiring decisions.
Richard Yonck is founder and president of Intelligent Future Consulting and the author of a book about our relationship with technology. “One area I discuss in Heart of the Machine is the idea of an emotional economy that will arise as an ecosystem of emotionally aware businesses, systems, and services are developed. This will rapidly expand into a multi-billion-dollar industry, leading to an infrastructure that will be both emotionally responsive and potentially exploitive at personal, commercial, and political levels,” said Yonck, in an interview for this article.
According to Yonck, these emotionally-aware systems will “better anticipate needs, improve efficiency, and reduce stress and misunderstandings.”
Affectiva is uniquely positioned to profit from this “emotional economy.” The company has already created the world’s largest emotion database. “We’ve analyzed a little bit over 4.7 million faces in 75 countries,” said Zijderveld. “This is data first and foremost, it’s data gathered with consent. So everyone has opted in to have their faces analyzed.”
The vastness of that database is essential for deep learning approaches. The software would be inaccurate if the data was inadequate. According to Zijderveld, “If you don’t have massive amounts of data of people of all ages, genders, and ethnicities, then your algorithms are going to be pretty biased.”
This massive database has already revealed cultural insights into how people express emotion. Zijderveld explained, “Obviously everyone knows that women are more expressive than men. But our data confirms that, but not only that, it can also show that women smile longer. They tend to smile more often. There’s also regional differences.”
Yonck believes that affective computing will inspire unimaginable forms of innovation and that change will happen at a fast pace.
He explained, “As businesses, software, systems, and services develop, they’ll support and make possible all sorts of other emotionally aware technologies that couldn’t previously exist. This leads to a spiral of increasingly sophisticated products, just as happened in the early days of computing.”
Those who are curious about affective technology will soon be able to interact with it.
Hubble Connected unveiled the Hubble Hugo at multiple trade shows this year. Hugo is billed as “the world’s first smart camera,” with emotion AI video analytics powered by Affectiva. The product can identify individuals, figure out how they’re feeling, receive voice commands, video monitor your home, and act as a photographer and videographer of events. Media can then be transmitted to the cloud. The company’s website describes Hugo as “a fun pal to have in the house.”
Although he sees the potential for improved efficiencies and expanding markets, Richard Yonck cautions that AI technology is not without its pitfalls.
“It’s critical that we understand we are headed into very unknown territory as we develop these systems, creating problems unlike any we’ve faced before,” said Yonck. “We should put our focus on ensuring AI develops in a way that represents our human values and ideals.”
Image Credit: Kisan / Shutterstock.com Continue reading
Con artistry is one of the world’s oldest and most innovative professions, and it may soon have a new target. Research suggests artificial intelligence may be uniquely susceptible to tricksters, and as its influence in the modern world grows, attacks against it are likely to become more common.
The root of the problem lies in the fact that artificial intelligence algorithms learn about the world in very different ways than people do, and so slight tweaks to the data fed into these algorithms can throw them off completely while remaining imperceptible to humans.
Much of the research into this area has been conducted on image recognition systems, in particular those relying on deep learning neural networks. These systems are trained by showing them thousands of examples of images of a particular object until they can extract common features that allow them to accurately spot the object in new images.
But the features they extract are not necessarily the same high-level features a human would be looking for, like the word STOP on a sign or a tail on a dog. These systems analyze images at the individual pixel level to detect patterns shared between examples. These patterns can be obscure combinations of pixel values, in small pockets or spread across the image, that would be impossible to discern for a human, but highly accurate at predicting a particular object.
“An attacker can trick the object recognition algorithm into seeing something that isn’t there, without these alterations being obvious to a human.”
What this means is that by identifying these patterns and overlaying them over a different image, an attacker can trick the object recognition algorithm into seeing something that isn’t there, without these alterations being obvious to a human. This kind of manipulation is known as an “adversarial attack.”
Early attempts to trick image recognition systems this way required access to the algorithm’s inner workings to decipher these patterns. But in 2016 researchers demonstrated a “black box” attack that enabled them to trick such a system without knowing its inner workings.
By feeding the system doctored images and seeing how it classified them, they were able to work out what it was focusing on and therefore generate images they knew would fool it. Importantly, the doctored images were not obviously different to human eyes.
These approaches were tested by feeding doctored image data directly into the algorithm, but more recently, similar approaches have been applied in the real world. Last year it was shown that printouts of doctored images that were then photographed on a smartphone successfully tricked an image classification system.
Another group showed that wearing specially designed, psychedelically-colored spectacles could trick a facial recognition system into thinking people were celebrities. In August scientists showed that adding stickers to stop signs in particular configurations could cause a neural net designed to spot them to misclassify the signs.
These last two examples highlight some of the potential nefarious applications for this technology. Getting a self-driving car to miss a stop sign could cause an accident, either for insurance fraud or to do someone harm. If facial recognition becomes increasingly popular for biometric security applications, being able to pose as someone else could be very useful to a con artist.
Unsurprisingly, there are already efforts to counteract the threat of adversarial attacks. In particular, it has been shown that deep neural networks can be trained to detect adversarial images. One study from the Bosch Center for AI demonstrated such a detector, an adversarial attack that fools the detector, and a training regime for the detector that nullifies the attack, hinting at the kind of arms race we are likely to see in the future.
While image recognition systems provide an easy-to-visualize demonstration, they’re not the only machine learning systems at risk. The techniques used to perturb pixel data can be applied to other kinds of data too.
“Bypassing cybersecurity defenses is one of the more worrying and probable near-term applications for this approach.”
Chinese researchers showed that adding specific words to a sentence or misspelling a word can completely throw off machine learning systems designed to analyze what a passage of text is about. Another group demonstrated that garbled sounds played over speakers could make a smartphone running the Google Now voice command system visit a particular web address, which could be used to download malware.
This last example points toward one of the more worrying and probable near-term applications for this approach: bypassing cybersecurity defenses. The industry is increasingly using machine learning and data analytics to identify malware and detect intrusions, but these systems are also highly susceptible to trickery.
At this summer’s DEF CON hacking convention, a security firm demonstrated they could bypass anti-malware AI using a similar approach to the earlier black box attack on the image classifier, but super-powered with an AI of their own.
Their system fed malicious code to the antivirus software and then noted the score it was given. It then used genetic algorithms to iteratively tweak the code until it was able to bypass the defenses while maintaining its function.
All the approaches noted so far are focused on tricking pre-trained machine learning systems, but another approach of major concern to the cybersecurity industry is that of “data poisoning.” This is the idea that introducing false data into a machine learning system’s training set will cause it to start misclassifying things.
This could be particularly challenging for things like anti-malware systems that are constantly being updated to take into account new viruses. A related approach bombards systems with data designed to generate false positives so the defenders recalibrate their systems in a way that then allows the attackers to sneak in.
How likely it is that these approaches will be used in the wild will depend on the potential reward and the sophistication of the attackers. Most of the techniques described above require high levels of domain expertise, but it’s becoming ever easier to access training materials and tools for machine learning.
Simpler versions of machine learning have been at the heart of email spam filters for years, and spammers have developed a host of innovative workarounds to circumvent them. As machine learning and AI increasingly embed themselves in our lives, the rewards for learning how to trick them will likely outweigh the costs.
Image Credit: Nejron Photo / Shutterstock.com Continue reading
Does creativity make human intelligence special?
It may appear so at first glance. Though machines can calculate, analyze, and even perceive, creativity may seem far out of reach. Perhaps this is because we find it mysterious, even in ourselves. How can the output of a machine be anything more than that which is determined by its programmers?
Increasingly, however, artificial intelligence is moving into creativity’s hallowed domain, from art to industry. And though much is already possible, the future is sure to bring ever more creative machines.
What Is Machine Creativity?
Robotic art is just one example of machine creativity, a rapidly growing sub-field that sits somewhere between the study of artificial intelligence and human psychology.
The winning paintings from the 2017 Robot Art Competition are strikingly reminiscent of those showcased each spring at university exhibitions for graduating art students. Like the works produced by skilled artists, the compositions dreamed up by the competition’s robotic painters are aesthetically ambitious. One robot-made painting features a man’s bearded face gazing intently out from the canvas, his eyes locking with the viewer’s. Another abstract painting, “inspired” by data from EEG signals, visually depicts the human emotion of misery with jagged, gloomy stripes of black and purple.
More broadly, a creative machine is software (sometimes encased in a robotic body) that synthesizes inputs to generate new and valuable ideas, solutions to complex scientific problems, or original works of art. In a process similar to that followed by a human artist or scientist, a creative machine begins its work by framing a problem. Next, its software specifies the requirements the solution should have before generating “answers” in the form of original designs, patterns, or some other form of output.
Although the notion of machine creativity sounds a bit like science fiction, the basic concept is one that has been slowly developing for decades.
Nearly 50 years ago while a high school student, inventor and futurist Ray Kurzweil created software that could analyze the patterns in musical compositions and then compose new melodies in a similar style. Aaron, one of the world’s most famous painting robots, has been hard at work since the 1970s.
Industrial designers have used an automated, algorithm-driven process for decades to design computer chips (or machine parts) whose layout (or form) is optimized for a particular function or environment. Emily Howell, a computer program created by David Cope, writes original works in the style of classical composers, some of which have been performed by human orchestras to live audiences.
What’s different about today’s new and emerging generation of robotic artists, scientists, composers, authors, and product designers is their ubiquity and power.
“The recent explosion of artificial creativity has been enabled by the rapid maturation of the same exponential technologies that have already re-drawn our daily lives.”
I’ve already mentioned the rapidly advancing fields of robotic art and music. In the realm of scientific research, so-called “robotic scientists” such as Eureqa and Adam and Eve develop new scientific hypotheses; their “insights” have contributed to breakthroughs that are cited by hundreds of academic research papers. In the medical industry, creative machines are hard at work creating chemical compounds for new pharmaceuticals. After it read over seven million words of 20th century English poetry, a neural network developed by researcher Jack Hopkins learned to write passable poetry in a number of different styles and meters.
The recent explosion of artificial creativity has been enabled by the rapid maturation of the same exponential technologies that have already re-drawn our daily lives, including faster processors, ubiquitous sensors and wireless networks, and better algorithms.
As they continue to improve, creative machines—like humans—will perform a broad range of creative activities, ranging from everyday problem solving (sometimes known as “Little C” creativity) to producing once-in-a-century masterpieces (“Big C” creativity). A creative machine’s outputs could range from a design for a cast for a marble sculpture to a schematic blueprint for a clever new gadget for opening bottles of wine.
In the coming decades, by automating the process of solving complex problems, creative machines will again transform our world. Creative machines will serve as a versatile source of on-demand talent.
In the battle to recruit a workforce that can solve complex problems, creative machines will put small businesses on equal footing with large corporations. Art and music lovers will enjoy fresh creative works that re-interpret the style of ancient disciplines. People with a health condition will benefit from individualized medical treatments, and low-income people will receive top-notch legal advice, to name but a few potentially beneficial applications.
How Can We Make Creative Machines, Unless We Understand Our Own Creativity?
One of the most intriguing—yet unsettling—aspects of watching robotic arms skillfully oil paint is that we humans still do not understand our own creative process. Over the centuries, several different civilizations have devised a variety of models to explain creativity.
The ancient Greeks believed that poets drew inspiration from a transcendent realm parallel to the material world where ideas could take root and flourish. In the Middle Ages, philosophers and poets attributed our peculiarly human ability to “make something of nothing” to an external source, namely divine inspiration. Modern academic study of human creativity has generated vast reams of scholarship, but despite the value of these insights, the human imagination remains a great mystery, second only to that of consciousness.
Today, the rise of machine creativity demonstrates (once again), that we do not have to fully understand a biological process in order to emulate it with advanced technology.
Past experience has shown that jet planes can fly higher and faster than birds by using the forward thrust of an engine rather than wings. Submarines propel themselves forward underwater without fins or a tail. Deep learning neural networks identify objects in randomly-selected photographs with super-human accuracy. Similarly, using a fairly straightforward software architecture, creative software (sometimes paired with a robotic body) can paint, write, hypothesize, or design with impressive originality, skill, and boldness.
At the heart of machine creativity is simple iteration. No matter what sort of output they produce, creative machines fall into one of three categories depending on their internal architecture.
Briefly, the first group consists of software programs that use traditional rule-based, or symbolic AI, the second group uses evolutionary algorithms, and the third group uses a variation of a form of machine learning called deep learning that has already revolutionized voice and facial recognition software.
1) Symbolic creative machines are the oldest artificial artists and musicians. In this approach—also known as “good old-fashioned AI (GOFAI) or symbolic AI—the human programmer plays a key role by writing a set of step-by-step instructions to guide the computer through a task. Despite the fact that symbolic AI is limited in its ability to adapt to environmental changes, it’s still possible for a robotic artist programmed this way to create an impressively wide variety of different outputs.
2) Evolutionary algorithms (EA) have been in use for several decades and remain powerful tools for design. In this approach, potential solutions “compete” in a software simulator in a Darwinian process reminiscent of biological evolution. The human programmer specifies a “fitness criterion” that will be used to score and rank the solutions generated by the software. The software then generates a “first generation” population of random solutions (which typically are pretty poor in quality), scores this first generation of solutions, and selects the top 50% (those random solutions deemed to be the best “fit”). The software then takes another pass and recombines the “winning” solutions to create the next generation and repeats this process for thousands (and sometimes millions) of generations.
3) Generative deep learning (DL) neural networks represent the newest software architecture of the three, since DL is data-dependent and resource-intensive. First, a human programmer “trains” a DL neural network to recognize a particular feature in a dataset, for example, an image of a dog in a stream of digital images. Next, the standard “feed forward” process is reversed and the DL neural network begins to generate the feature, for example, eventually producing new and sometimes original images of (or poetry about) dogs. Generative DL networks have tremendous and unexplored creative potential and are able to produce a broad range of original outputs, from paintings to music to poetry.
The Coming Explosion of Machine Creativity
In the near future as Moore’s Law continues its work, we will see sophisticated combinations of these three basic architectures. Since the 1950s, artificial intelligence has steadily mastered one human ability after another, and in the process of doing so, has reduced the cost of calculation, analysis, and most recently, perception. When creative software becomes as inexpensive and ubiquitous as analytical software is today, humans will no longer be the only intelligent beings capable of creative work.
This is why I have to bite my tongue when I hear the well-intended (but shortsighted) advice frequently dispensed to young people that they should pursue work that demands creativity to help them “AI-proof” their futures.
Instead, students should gain skills to harness the power of creative machines.
There are two skills in which humans excel that will enable us to remain useful in a world of ever-advancing artificial intelligence. One, the ability to frame and define a complex problem so that it can be handed off to a creative machine to solve. And two, the ability to communicate the value of both the framework and the proposed solution to the other humans involved.
What will happen to people when creative machines begin to capably tread on intellectual ground that was once considered the sole domain of the human mind, and before that, the product of divine inspiration? While machines engaging in Big C creativity—e.g., oil painting and composing new symphonies—tend to garner controversy and make the headlines, I suspect the real world-changing application of machine creativity will be in the realm of everyday problem solving, or Little C. The mainstream emergence of powerful problem-solving tools will help people create abundance where there was once scarcity.
Image Credit: adike / Shutterstock.com Continue reading