Tag Archives: labs
In the myth about the Tower of Babel, people conspired to build a city and tower that would reach heaven. Their creator observed, “And now nothing will be restrained from them, which they have imagined to do.” According to the myth, God thwarted this effort by creating diverse languages so that they could no longer collaborate.
In our modern times, we’re experiencing a state of unprecedented connectivity thanks to technology. However, we’re still living under the shadow of the Tower of Babel. Language remains a barrier in business and marketing. Even though technological devices can quickly and easily connect, humans from different parts of the world often can’t.
Translation agencies step in, making presentations, contracts, outsourcing instructions, and advertisements comprehensible to all intended recipients. Some agencies also offer “localization” expertise. For instance, if a company is marketing in Quebec, the advertisements need to be in Québécois French, not European French. Risk-averse companies may be reluctant to invest in these translations. Consequently, these ventures haven’t achieved full market penetration.
Global markets are waiting, but AI-powered language translation isn’t ready yet, despite recent advancements in natural language processing and sentiment analysis. AI still has difficulties processing requests in one language, without the additional complications of translation. In November 2016, Google added a neural network to its translation tool. However, some of its translations are still socially and grammatically odd. I spoke to technologists and a language professor to find out why.
“To Google’s credit, they made a pretty massive improvement that appeared almost overnight. You know, I don’t use it as much. I will say this. Language is hard,” said Michael Housman, chief data science officer at RapportBoost.AI and faculty member of Singularity University.
He explained that the ideal scenario for machine learning and artificial intelligence is something with fixed rules and a clear-cut measure of success or failure. He named chess as an obvious example, and noted machines were able to beat the best human Go player. This happened faster than anyone anticipated because of the game’s very clear rules and limited set of moves.
Housman elaborated, “Language is almost the opposite of that. There aren’t as clearly-cut and defined rules. The conversation can go in an infinite number of different directions. And then of course, you need labeled data. You need to tell the machine to do it right or wrong.”
Housman noted that it’s inherently difficult to assign these informative labels. “Two translators won’t even agree on whether it was translated properly or not,” he said. “Language is kind of the wild west, in terms of data.”
Google’s technology is now able to consider the entirety of a sentence, as opposed to merely translating individual words. Still, the glitches linger. I asked Dr. Jorge Majfud, Associate Professor of Spanish, Latin American Literature, and International Studies at Jacksonville University, to explain why consistently accurate language translation has thus far eluded AI.
He replied, “The problem is that considering the ‘entire’ sentence is still not enough. The same way the meaning of a word depends on the rest of the sentence (more in English than in Spanish), the meaning of a sentence depends on the rest of the paragraph and the rest of the text, as the meaning of a text depends on a larger context called culture, speaker intentions, etc.”
He noted that sarcasm and irony only make sense within this widened context. Similarly, idioms can be problematic for automated translations.
“Google translation is a good tool if you use it as a tool, that is, not to substitute human learning or understanding,” he said, before offering examples of mistranslations that could occur.
“Months ago, I went to buy a drill at Home Depot and I read a sign under a machine: ‘Saw machine.’ Right below it, the Spanish translation: ‘La máquina vió,’ which means, ‘The machine did see it.’ Saw, not as a noun but as a verb in the preterit form,” he explained.
Dr. Majfud warned, “We should be aware of the fragility of their ‘interpretation.’ Because to translate is basically to interpret, not just an idea but a feeling. Human feelings and ideas that only humans can understand—and sometimes not even we, humans, understand other humans.”
He noted that cultures, gender, and even age can pose barriers to this understanding and also contended that an over-reliance on technology is leading to our cultural and political decline. Dr. Majfud mentioned that Argentinean writer Julio Cortázar used to refer to dictionaries as “cemeteries.” He suggested that automatic translators could be called “zombies.”
Erik Cambria is an academic AI researcher and assistant professor at Nanyang Technological University in Singapore. He mostly focuses on natural language processing, which is at the core of AI-powered language translation. Like Dr. Majfud, he sees the complexity and associated risks. “There are so many things that we unconsciously do when we read a piece of text,” he told me. Reading comprehension requires multiple interrelated tasks, which haven’t been accounted for in past attempts to automate translation.
Cambria continued, “The biggest issue with machine translation today is that we tend to go from the syntactic form of a sentence in the input language to the syntactic form of that sentence in the target language. That’s not what we humans do. We first decode the meaning of the sentence in the input language and then we encode that meaning into the target language.”
Additionally, there are cultural risks involved with these translations. Dr. Ramesh Srinivasan, Director of UCLA’s Digital Cultures Lab, said that new technological tools sometimes reflect underlying biases.
“There tend to be two parameters that shape how we design ‘intelligent systems.’ One is the values and you might say biases of those that create the systems. And the second is the world if you will that they learn from,” he told me. “If you build AI systems that reflect the biases of their creators and of the world more largely, you get some, occasionally, spectacular failures.”
Dr. Srinivasan said translation tools should be transparent about their capabilities and limitations. He said, “You know, the idea that a single system can take languages that I believe are very diverse semantically and syntactically from one another and claim to unite them or universalize them, or essentially make them sort of a singular entity, it’s a misnomer, right?”
Mary Cochran, co-founder of Launching Labs Marketing, sees the commercial upside. She mentioned that listings in online marketplaces such as Amazon could potentially be auto-translated and optimized for buyers in other countries.
She said, “I believe that we’re just at the tip of the iceberg, so to speak, with what AI can do with marketing. And with better translation, and more globalization around the world, AI can’t help but lead to exploding markets.”
Image Credit: igor kisselev / Shutterstock.com Continue reading
The tech industry touts its ability to automate tasks and remove slow and expensive humans from the equation. But in the background, a lot of the legwork training machine learning systems, solving problems software can’t, and cleaning up its mistakes is still done by people.
This was highlighted recently when Expensify, which promises to automatically scan photos of receipts to extract data for expense reports, was criticized for sending customers’ personally identifiable receipts to workers on Amazon’s Mechanical Turk (MTurk) crowdsourcing platform.
The company uses text analysis software to read the receipts, but if the automated system falls down then the images are passed to a human for review. While entrusting this job to random workers on MTurk was maybe not so wise—and the company quickly stopped after the furor—the incident brought to light that this kind of human safety net behind AI-powered services is actually very common.
As Wired notes, similar services like Ibotta and Receipt Hog that collect receipt information for marketing purposes also use crowdsourced workers. In a similar vein, while most users might assume their Facebook newsfeed is governed by faceless algorithms, the company has been ramping up the number of human moderators it employs to catch objectionable content that slips through the net, as has YouTube. Twitter also has thousands of human overseers.
Humans aren’t always witting contributors either. The old text-based reCAPTCHA problems Google used to use to distinguish humans from machines was actually simultaneously helping the company digitize books by getting humans to interpret hard-to-read text.
“Every product that uses AI also uses people,” Jeffrey Bigham, a crowdsourcing expert at Carnegie Mellon University, told Wired. “I wouldn’t even say it’s a backstop so much as a core part of the process.”
Some companies are not shy about their use of crowdsourced workers. Startup Eloquent Labs wants to insert them between customer service chatbots and human agents who step in when the machines fail. Many times the AI is pretty certain what particular work means, and an MTurk worker can step in and quickly classify them faster and cheaper than a service agent.
Fashion retailer Gilt provides “pre-emptive shipping,” which uses data analytics to predict what people will buy to get products to them faster. The company uses MTurk workers to provide subjective critiques of clothing that feed into their models.
MTurk isn’t the only player. Companies like Cloudfactory and Crowdflower provide crowdsourced human manpower tailored to particular niches, and some companies prefer to maintain their own communities of workers. Unlabel uses an army of 50,000 humans to check and edit the translations its artificial intelligence system produces for customers.
Most of the time these human workers aren’t just filling in the gaps, they’re also helping to train the machine learning component of these companies’ services by providing new examples of how to solve problems. Other times humans aren’t used “in-the-loop” with AI systems, but to prepare data sets they can learn from by labeling images, text, or audio.
It’s even possible to use crowdsourced workers to carry out tasks typically tackled by machine learning, such as large-scale image analysis and forecasting.
Zooniverse gets citizen scientists to classify images of distant galaxies or videos of animals to help academics analyze large data sets too complex for computers. Almanis creates forecasts on everything from economics to politics with impressive accuracy by giving those who sign up to the website incentives for backing the correct answer to a question. Researchers have used MTurkers to power a chatbot, and there’s even a toolkit for building algorithms to control this human intelligence called TurKit.
So what does this prominent role for humans in AI services mean? Firstly, it suggests that many tools people assume are powered by AI may in fact be relying on humans. This has obvious privacy implications, as the Expensify story highlighted, but should also raise concerns about whether customers are really getting what they pay for.
One example of this is IBM’s Watson for oncology, which is marketed as a data-driven AI system for providing cancer treatment recommendations. But an investigation by STAT highlighted that it’s actually largely driven by recommendations from a handful of (admittedly highly skilled) doctors at Memorial Sloan Kettering Cancer Center in New York.
Secondly, humans intervening in AI-run processes also suggests AI is still largely helpless without us, which is somewhat comforting to know among all the doomsday predictions of AI destroying jobs. At the same time, though, much of this crowdsourced work is monotonous, poorly paid, and isolating.
As machines trained by human workers get better at all kinds of tasks, this kind of piecemeal work filling in the increasingly small gaps in their capabilities may get more common. While tech companies often talk about AI augmenting human intelligence, for many it may actually end up being the other way around.
Image Credit: kentoh / Shutterstock.com Continue reading
The best storytellers react to their audience. They look for smiles, signs of awe, or boredom; they simultaneously and skillfully read both the story and their sitters. Kevin Brooks, a seasoned storyteller working for Motorola’s Human Interface Labs, explains, “As the storyteller begins, they must tune in to… the audience’s energy. Based on this energy, the storyteller will adjust their timing, their posture, their characterizations, and sometimes even the events of the story. There is a dialog between audience and storyteller.”
Shortly after I read the script to Melita, the latest virtual reality experience from Madrid-based immersive storytelling company Future Lighthouse, CEO Nicolas Alcalá explained to me that the piece is an example of “reactive content,” a concept he’s been working on since his days at Singularity University.
For the first time in history, we have access to technology that can merge the reactive and affective elements of oral storytelling with the affordances of digital media, weaving stunning visuals, rich soundtracks, and complex meta-narratives in a story arena that has the capability to know you more intimately than any conventional storyteller could.
It’s no understatement to say that the storytelling potential here is phenomenal.
In short, we can refer to content as reactive if it reads and reacts to users based on their body rhythms, emotions, preferences, and data points. Artificial intelligence is used to analyze users’ behavior or preferences to sculpt unique storylines and narratives, essentially allowing for a story that changes in real time based on who you are and how you feel.
The development of reactive content will allow those working in the industry to go one step further than simply translating the essence of oral storytelling into VR. Rather than having a narrative experience with a digital storyteller who can read you, reactive content has the potential to create an experience with a storyteller who knows you.
This means being able to subtly insert minor personal details that have a specific meaning to the viewer. When we talk to our friends we often use experiences we’ve shared in the past or knowledge of our audience to give our story as much resonance as possible. Targeting personal memories and aspects of our lives is a highly effective way to elicit emotions and aid in visualizing narratives. When you can do this with the addition of visuals, music, and characters—all lifted from someone’s past—you have the potential for overwhelmingly engaging and emotionally-charged content.
Future Lighthouse inform me that for now, reactive content will rely primarily on biometric feedback technology such as breathing, heartbeat, and eye tracking sensors. A simple example would be a story in which parts of the environment or soundscape change in sync with the user’s heartbeat and breathing, or characters who call you out for not paying attention.
The next step would be characters and situations that react to the user’s emotions, wherein algorithms analyze biometric information to make inferences about states of emotional arousal (“why are you so nervous?” etc.). Another example would be implementing the use of “arousal parameters,” where the audience can choose what level of “fear” they want from a VR horror story before algorithms modulate the experience using information from biometric feedback devices.
The company’s long-term goal is to gather research on storytelling conventions and produce a catalogue of story “wireframes.” This entails distilling the basic formula to different genres so they can then be fleshed out with visuals, character traits, and soundtracks that are tailored for individual users based on their deep data, preferences, and biometric information.
The development of reactive content will go hand in hand with a renewed exploration of diverging, dynamic storylines, and multi-narratives, a concept that hasn’t had much impact in the movie world thus far. In theory, the idea of having a story that changes and mutates is captivating largely because of our love affair with serendipity and unpredictability, a cultural condition theorist Arthur Kroker refers to as the “hypertextual imagination.” This feeling of stepping into the unknown with the possibility of deviation from the habitual translates as a comforting reminder that our own lives can take exciting and unexpected turns at any moment.
The inception of the concept into mainstream culture dates to the classic Choose Your Own Adventure book series that launched in the late 70s, which in its literary form had great success. However, filmic takes on the theme have made somewhat less of an impression. DVDs like I’m Your Man (1998) and Switching (2003) both use scene selection tools to determine the direction of the storyline.
A more recent example comes from Kino Industries, who claim to have developed the technology to allow filmmakers to produce interactive films in which viewers can use smartphones to quickly vote on which direction the narrative takes at numerous decision points throughout the film.
The main problem with diverging narrative films has been the stop-start nature of the interactive element: when I’m immersed in a story I don’t want to have to pick up a controller or remote to select what’s going to happen next. Every time the audience is given the option to take a new path (“press this button”, “vote on X, Y, Z”) the narrative— and immersion within that narrative—is temporarily halted, and it takes the mind a while to get back into this state of immersion.
Reactive content has the potential to resolve these issues by enabling passive interactivity—that is, input and output without having to pause and actively make decisions or engage with the hardware. This will result in diverging, dynamic narratives that will unfold seamlessly while being dependent on and unique to the specific user and their emotions. Passive interactivity will also remove the game feel that can often be a symptom of interactive experiences and put a viewer somewhere in the middle: still firmly ensconced in an interactive dynamic narrative, but in a much subtler way.
While reading the Melita script I was particularly struck by a scene in which the characters start to engage with the user and there’s a synchronicity between the user’s heartbeat and objects in the virtual world. As the narrative unwinds and the words of Melita’s character get more profound, parts of the landscape, which seemed to be flashing and pulsating at random, come together and start to mimic the user’s heartbeat.
In 2013, Jane Aspell of Anglia Ruskin University (UK) and Lukas Heydrich of the Swiss Federal Institute of Technology proved that a user’s sense of presence and identification with a virtual avatar could be dramatically increased by syncing the on-screen character with the heartbeat of the user. The relationship between bio-digital synchronicity, immersion, and emotional engagement is something that will surely have revolutionary narrative and storytelling potential.
Image Credit: Tithi Luadthong / Shutterstock.com Continue reading