Tag Archives: general
#439073 There’s a ‘New’ Nirvana Song Out, ...
One of the primary capabilities separating human intelligence from artificial intelligence is our ability to be creative—to use nothing but the world around us, our experiences, and our brains to create art. At present, AI needs to be extensively trained on human-made works of art in order to produce new work, so we’ve still got a leg up. That said, neural networks like OpenAI’s GPT-3 and Russian designer Nikolay Ironov have been able to create content indistinguishable from human-made work.
Now there’s another example of AI artistry that’s hard to tell apart from the real thing, and it’s sure to excite 90s alternative rock fans the world over: a brand-new, never-heard-before Nirvana song. Or, more accurately, a song written by a neural network that was trained on Nirvana’s music.
The song is called “Drowned in the Sun,” and it does have a pretty Nirvana-esque ring to it. The neural network that wrote it is Magenta, which was launched by Google in 2016 with the goal of training machines to create art—or as the tool’s website puts it, exploring the role of machine learning as a tool in the creative process. Magenta was built using TensorFlow, Google’s massive open-source software library focused on deep learning applications.
The song was written as part of an album called Lost Tapes of the 27 Club, a project carried out by a Toronto-based organization called Over the Bridge focused on mental health in the music industry.
Here’s how a computer was able to write a song in the unique style of a deceased musician. Music, 20 to 30 tracks, was fed into Magenta’s neural network in the form of MIDI files. MIDI stands for Musical Instrument Digital Interface, and the format contains the details of a song written in code that represents musical parameters like pitch and tempo. Components of each song, like vocal melody or rhythm guitar, were fed in one at a time.
The neural network found patterns in these different components, and got enough of a handle on them that when given a few notes to start from, it could use those patterns to predict what would come next; in this case, chords and melodies that sound like they could’ve been written by Kurt Cobain.
To be clear, Magenta didn’t spit out a ready-to-go song complete with lyrics. The AI wrote the music, but a different neural network wrote the lyrics (using essentially the same process as Magenta), and the team then sifted through “pages and pages” of output to find lyrics that fit the melodies Magenta created.
Eric Hogan, a singer for a Nirvana tribute band who the Over the Bridge team hired to sing “Drowned in the Sun,” felt that the lyrics were spot-on. “The song is saying, ‘I’m a weirdo, but I like it,’” he said. “That is total Kurt Cobain right there. The sentiment is exactly what he would have said.”
Cobain isn’t the only musician the Lost Tapes project tried to emulate; songs in the styles of Jimi Hendrix, Jim Morrison, and Amy Winehouse were also included. What all these artists have in common is that they died by suicide at the age of 27.
The project is meant to raise awareness around mental health, particularly among music industry professionals. It’s not hard to think of great artists of all persuasions—musicians, painters, writers, actors—whose lives are cut short due to severe depression and other mental health issues for which it can be hard to get help. These issues are sometimes romanticized, as suffering does tend to create art that’s meaningful, relatable, and timeless. But according to the Lost Tapes website, suicide attempts among music industry workers are more than double that of the general population.
How many more hit songs would these artists have written if they were still alive? We’ll never know, but hopefully Lost Tapes of the 27 Club and projects like it will raise awareness of mental health issues, both in the music industry and in general, and help people in need find the right resources. Because no matter how good computers eventually get at creating music, writing, or other art, as Lost Tapes’ website pointedly says, “Even AI will never replace the real thing.”
Image Credit: Edward Xu on Unsplash Continue reading
#438807 Visible Touch: How Cameras Can Help ...
The dawn of the robot revolution is already here, and it is not the dystopian nightmare we imagined. Instead, it comes in the form of social robots: Autonomous robots in homes and schools, offices and public spaces, able to interact with humans and other robots in a socially acceptable, human-perceptible way to resolve tasks related to core human needs.
To design social robots that “understand” humans, robotics scientists are delving into the psychology of human communication. Researchers from Cornell University posit that embedding the sense of touch in social robots could teach them to detect physical interactions and gestures. They describe a way of doing so by relying not on touch but on vision.
A USB camera inside the robot captures shadows of hand gestures on the robot’s surface and classifies them with machine-learning software. They call this method ShadowSense, which they define as a modality between vision and touch, bringing “the high resolution and low cost of vision-sensing to the close-up sensory experience of touch.”
Touch-sensing in social or interactive robots is usually achieved with force sensors or capacitive sensors, says study co-author Guy Hoffman of the Sibley School of Mechanical and Aerospace Engineering at Cornell University. The drawback to his group’s approach has been that, even to achieve coarse spatial resolution, many sensors are needed in a small area.
However, working with non-rigid, inflatable robots, Hoffman and his co-researchers installed a consumer-grade USB camera to which they attached a fisheye lens for a wider field of vision.
“Given that the robot is already hollow, and has a soft and translucent skin, we could do touch interaction by looking at the shadows created by people touching the robot,” says Hoffman. They used deep neural networks to interpret the shadows. “And we were able to do it with very high accuracy,” he says. The robot was able to interpret six different gestures, including one- or two-handed touch, pointing, hugging and punching, with an accuracy of 87.5 to 96 percent, depending on the lighting.
This is not the first time that computer vision has been used for tactile sensing, though the scale and application of ShadowSense is unique. “Photography has been used for touch mainly in robotic grasping,” says Hoffman. By contrast, Hoffman and collaborators wanted to develop a sense that could be “felt” across the whole of the device.
The potential applications for ShadowSense include mobile robot guidance using touch, and interactive screens on soft robots. A third concerns privacy, especially in home-based social robots. “We have another paper currently under review that looks specifically at the ability to detect gestures that are further away [from the robot’s skin],” says Hoffman. This way, users would be able to cover their robot’s camera with a translucent material and still allow it to interpret actions and gestures from shadows. Thus, even though it’s prevented from capturing a high-resolution image of the user or their surrounding environment, using the right kind of training datasets, the robot can continue to monitor some kinds of non-tactile activities.
In its current iteration, Hoffman says, ShadowSense doesn’t do well in low-light conditions. Environmental noise, or shadows from surrounding objects, also interfere with image classification. Relying on one camera also means a single point of failure. “I think if this were to become a commercial product, we would probably [have to] work a little bit better on image detection,” says Hoffman.
As it was, the researchers used transfer learning—reusing a pre-trained deep-learning model in a new problem—for image analysis. “One of the problems with multi-layered neural networks is that you need a lot of training data to make accurate predictions,” says Hoffman. “Obviously, we don’t have millions of examples of people touching a hollow, inflatable robot. But we can use pre-trained networks trained on general images, which we have billions of, and we only retrain the last layers of the network using our own dataset.” Continue reading