Tag Archives: literature
#432487 Can We Make a Musical Turing Test?
As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider to be fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer distinguish between the two? This, after all, is the idea behind the Turing Test, which has yet to be passed.
At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.
But I think above all, we view it as reductionist sacrilege: to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?
Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done so. Mozart, with his “Dice Music,” used the roll of a dice to decide how to order musical fragments; creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject from the 1960s.
Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
Indeed, generating music in the style of Bach has long been a test for AI, and you can see neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music, rather than being inspired by it but creating something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.
Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).
There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.
Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:
The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?
Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and appears to be a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial; a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from amidst many different samples.
We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake news era people have focused on the dangers of this tech– but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible—but could we create a similar experience that people would genuinely value? Even, to the untrained eye, something indistinguishable from the real thing?
And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?
Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.
Image Credit: d1sk / Shutterstock.com Continue reading
#432236 Why Hasn’t AI Mastered Language ...
In the myth about the Tower of Babel, people conspired to build a city and tower that would reach heaven. Their creator observed, “And now nothing will be restrained from them, which they have imagined to do.” According to the myth, God thwarted this effort by creating diverse languages so that they could no longer collaborate.
In our modern times, we’re experiencing a state of unprecedented connectivity thanks to technology. However, we’re still living under the shadow of the Tower of Babel. Language remains a barrier in business and marketing. Even though technological devices can quickly and easily connect, humans from different parts of the world often can’t.
Translation agencies step in, making presentations, contracts, outsourcing instructions, and advertisements comprehensible to all intended recipients. Some agencies also offer “localization” expertise. For instance, if a company is marketing in Quebec, the advertisements need to be in Québécois French, not European French. Risk-averse companies may be reluctant to invest in these translations. Consequently, these ventures haven’t achieved full market penetration.
Global markets are waiting, but AI-powered language translation isn’t ready yet, despite recent advancements in natural language processing and sentiment analysis. AI still has difficulties processing requests in one language, without the additional complications of translation. In November 2016, Google added a neural network to its translation tool. However, some of its translations are still socially and grammatically odd. I spoke to technologists and a language professor to find out why.
“To Google’s credit, they made a pretty massive improvement that appeared almost overnight. You know, I don’t use it as much. I will say this. Language is hard,” said Michael Housman, chief data science officer at RapportBoost.AI and faculty member of Singularity University.
He explained that the ideal scenario for machine learning and artificial intelligence is something with fixed rules and a clear-cut measure of success or failure. He named chess as an obvious example, and noted machines were able to beat the best human Go player. This happened faster than anyone anticipated because of the game’s very clear rules and limited set of moves.
Housman elaborated, “Language is almost the opposite of that. There aren’t as clearly-cut and defined rules. The conversation can go in an infinite number of different directions. And then of course, you need labeled data. You need to tell the machine to do it right or wrong.”
Housman noted that it’s inherently difficult to assign these informative labels. “Two translators won’t even agree on whether it was translated properly or not,” he said. “Language is kind of the wild west, in terms of data.”
Google’s technology is now able to consider the entirety of a sentence, as opposed to merely translating individual words. Still, the glitches linger. I asked Dr. Jorge Majfud, Associate Professor of Spanish, Latin American Literature, and International Studies at Jacksonville University, to explain why consistently accurate language translation has thus far eluded AI.
He replied, “The problem is that considering the ‘entire’ sentence is still not enough. The same way the meaning of a word depends on the rest of the sentence (more in English than in Spanish), the meaning of a sentence depends on the rest of the paragraph and the rest of the text, as the meaning of a text depends on a larger context called culture, speaker intentions, etc.”
He noted that sarcasm and irony only make sense within this widened context. Similarly, idioms can be problematic for automated translations.
“Google translation is a good tool if you use it as a tool, that is, not to substitute human learning or understanding,” he said, before offering examples of mistranslations that could occur.
“Months ago, I went to buy a drill at Home Depot and I read a sign under a machine: ‘Saw machine.’ Right below it, the Spanish translation: ‘La máquina vió,’ which means, ‘The machine did see it.’ Saw, not as a noun but as a verb in the preterit form,” he explained.
Dr. Majfud warned, “We should be aware of the fragility of their ‘interpretation.’ Because to translate is basically to interpret, not just an idea but a feeling. Human feelings and ideas that only humans can understand—and sometimes not even we, humans, understand other humans.”
He noted that cultures, gender, and even age can pose barriers to this understanding and also contended that an over-reliance on technology is leading to our cultural and political decline. Dr. Majfud mentioned that Argentinean writer Julio Cortázar used to refer to dictionaries as “cemeteries.” He suggested that automatic translators could be called “zombies.”
Erik Cambria is an academic AI researcher and assistant professor at Nanyang Technological University in Singapore. He mostly focuses on natural language processing, which is at the core of AI-powered language translation. Like Dr. Majfud, he sees the complexity and associated risks. “There are so many things that we unconsciously do when we read a piece of text,” he told me. Reading comprehension requires multiple interrelated tasks, which haven’t been accounted for in past attempts to automate translation.
Cambria continued, “The biggest issue with machine translation today is that we tend to go from the syntactic form of a sentence in the input language to the syntactic form of that sentence in the target language. That’s not what we humans do. We first decode the meaning of the sentence in the input language and then we encode that meaning into the target language.”
Additionally, there are cultural risks involved with these translations. Dr. Ramesh Srinivasan, Director of UCLA’s Digital Cultures Lab, said that new technological tools sometimes reflect underlying biases.
“There tend to be two parameters that shape how we design ‘intelligent systems.’ One is the values and you might say biases of those that create the systems. And the second is the world if you will that they learn from,” he told me. “If you build AI systems that reflect the biases of their creators and of the world more largely, you get some, occasionally, spectacular failures.”
Dr. Srinivasan said translation tools should be transparent about their capabilities and limitations. He said, “You know, the idea that a single system can take languages that I believe are very diverse semantically and syntactically from one another and claim to unite them or universalize them, or essentially make them sort of a singular entity, it’s a misnomer, right?”
Mary Cochran, co-founder of Launching Labs Marketing, sees the commercial upside. She mentioned that listings in online marketplaces such as Amazon could potentially be auto-translated and optimized for buyers in other countries.
She said, “I believe that we’re just at the tip of the iceberg, so to speak, with what AI can do with marketing. And with better translation, and more globalization around the world, AI can’t help but lead to exploding markets.”
Image Credit: igor kisselev / Shutterstock.com Continue reading
#431599 8 Ways AI Will Transform Our Cities by ...
How will AI shape the average North American city by 2030? A panel of experts assembled as part of a century-long study into the impact of AI thinks its effects will be profound.
The One Hundred Year Study on Artificial Intelligence is the brainchild of Eric Horvitz, technical fellow and a managing director at Microsoft Research.
Every five years a panel of experts will assess the current state of AI and its future directions. The first panel, comprised of experts in AI, law, political science, policy, and economics, was launched last fall and decided to frame their report around the impact AI will have on the average American city. Here’s how they think it will affect eight key domains of city life in the next fifteen years.
1. Transportation
The speed of the transition to AI-guided transport may catch the public by surprise. Self-driving vehicles will be widely adopted by 2020, and it won’t just be cars — driverless delivery trucks, autonomous delivery drones, and personal robots will also be commonplace.
Uber-style “cars as a service” are likely to replace car ownership, which may displace public transport or see it transition towards similar on-demand approaches. Commutes will become a time to relax or work productively, encouraging people to live further from home, which could combine with reduced need for parking to drastically change the face of modern cities.
Mountains of data from increasing numbers of sensors will allow administrators to model individuals’ movements, preferences, and goals, which could have major impact on the design city infrastructure.
Humans won’t be out of the loop, though. Algorithms that allow machines to learn from human input and coordinate with them will be crucial to ensuring autonomous transport operates smoothly. Getting this right will be key as this will be the public’s first experience with physically embodied AI systems and will strongly influence public perception.
2. Home and Service Robots
Robots that do things like deliver packages and clean offices will become much more common in the next 15 years. Mobile chipmakers are already squeezing the power of last century’s supercomputers into systems-on-a-chip, drastically boosting robots’ on-board computing capacity.
Cloud-connected robots will be able to share data to accelerate learning. Low-cost 3D sensors like Microsoft’s Kinect will speed the development of perceptual technology, while advances in speech comprehension will enhance robots’ interactions with humans. Robot arms in research labs today are likely to evolve into consumer devices around 2025.
But the cost and complexity of reliable hardware and the difficulty of implementing perceptual algorithms in the real world mean general-purpose robots are still some way off. Robots are likely to remain constrained to narrow commercial applications for the foreseeable future.
3. Healthcare
AI’s impact on healthcare in the next 15 years will depend more on regulation than technology. The most transformative possibilities of AI in healthcare require access to data, but the FDA has failed to find solutions to the difficult problem of balancing privacy and access to data. Implementation of electronic health records has also been poor.
If these hurdles can be cleared, AI could automate the legwork of diagnostics by mining patient records and the scientific literature. This kind of digital assistant could allow doctors to focus on the human dimensions of care while using their intuition and experience to guide the process.
At the population level, data from patient records, wearables, mobile apps, and personal genome sequencing will make personalized medicine a reality. While fully automated radiology is unlikely, access to huge datasets of medical imaging will enable training of machine learning algorithms that can “triage” or check scans, reducing the workload of doctors.
Intelligent walkers, wheelchairs, and exoskeletons will help keep the elderly active while smart home technology will be able to support and monitor them to keep them independent. Robots may begin to enter hospitals carrying out simple tasks like delivering goods to the right room or doing sutures once the needle is correctly placed, but these tasks will only be semi-automated and will require collaboration between humans and robots.
4. Education
The line between the classroom and individual learning will be blurred by 2030. Massive open online courses (MOOCs) will interact with intelligent tutors and other AI technologies to allow personalized education at scale. Computer-based learning won’t replace the classroom, but online tools will help students learn at their own pace using techniques that work for them.
AI-enabled education systems will learn individuals’ preferences, but by aggregating this data they’ll also accelerate education research and the development of new tools. Online teaching will increasingly widen educational access, making learning lifelong, enabling people to retrain, and increasing access to top-quality education in developing countries.
Sophisticated virtual reality will allow students to immerse themselves in historical and fictional worlds or explore environments and scientific objects difficult to engage with in the real world. Digital reading devices will become much smarter too, linking to supplementary information and translating between languages.
5. Low-Resource Communities
In contrast to the dystopian visions of sci-fi, by 2030 AI will help improve life for the poorest members of society. Predictive analytics will let government agencies better allocate limited resources by helping them forecast environmental hazards or building code violations. AI planning could help distribute excess food from restaurants to food banks and shelters before it spoils.
Investment in these areas is under-funded though, so how quickly these capabilities will appear is uncertain. There are fears valueless machine learning could inadvertently discriminate by correlating things with race or gender, or surrogate factors like zip codes. But AI programs are easier to hold accountable than humans, so they’re more likely to help weed out discrimination.
6. Public Safety and Security
By 2030 cities are likely to rely heavily on AI technologies to detect and predict crime. Automatic processing of CCTV and drone footage will make it possible to rapidly spot anomalous behavior. This will not only allow law enforcement to react quickly but also forecast when and where crimes will be committed. Fears that bias and error could lead to people being unduly targeted are justified, but well-thought-out systems could actually counteract human bias and highlight police malpractice.
Techniques like speech and gait analysis could help interrogators and security guards detect suspicious behavior. Contrary to concerns about overly pervasive law enforcement, AI is likely to make policing more targeted and therefore less overbearing.
7. Employment and Workplace
The effects of AI will be felt most profoundly in the workplace. By 2030 AI will be encroaching on skilled professionals like lawyers, financial advisers, and radiologists. As it becomes capable of taking on more roles, organizations will be able to scale rapidly with relatively small workforces.
AI is more likely to replace tasks rather than jobs in the near term, and it will also create new jobs and markets, even if it’s hard to imagine what those will be right now. While it may reduce incomes and job prospects, increasing automation will also lower the cost of goods and services, effectively making everyone richer.
These structural shifts in the economy will require political rather than purely economic responses to ensure these riches are shared. In the short run, this may include resources being pumped into education and re-training, but longer term may require a far more comprehensive social safety net or radical approaches like a guaranteed basic income.
8. Entertainment
Entertainment in 2030 will be interactive, personalized, and immeasurably more engaging than today. Breakthroughs in sensors and hardware will see virtual reality, haptics and companion robots increasingly enter the home. Users will be able to interact with entertainment systems conversationally, and they will show emotion, empathy, and the ability to adapt to environmental cues like the time of day.
Social networks already allow personalized entertainment channels, but the reams of data being collected on usage patterns and preferences will allow media providers to personalize entertainment to unprecedented levels. There are concerns this could endow media conglomerates with unprecedented control over people’s online experiences and the ideas to which they are exposed.
But advances in AI will also make creating your own entertainment far easier and more engaging, whether by helping to compose music or choreograph dances using an avatar. Democratizing the production of high-quality entertainment makes it nearly impossible to predict how highly fluid human tastes for entertainment will develop.
Image Credit: Asgord / Shutterstock.com Continue reading