Tag Archives: Companies

#433282 The 4 Waves of AI: Who Will Own the ...

Recently, I picked up Kai-Fu Lee’s newest book, AI Superpowers.

Kai-Fu Lee is one of the most plugged-in AI investors on the planet, managing over $2 billion across six funds and more than 300 portfolio companies in the US and China.

Drawing from his pioneering work in AI, executive leadership at Microsoft, Apple, and Google (where he served as founding president of Google China), and his founding of VC fund Sinovation Ventures, Lee shares invaluable insights about:

The four factors driving today’s AI ecosystems;
China’s extraordinary inroads in AI implementation;
Where autonomous systems are headed;
How we’ll need to adapt.

With a foothold in both Beijing and Silicon Valley, Lee looks at the power balance between Chinese and US tech behemoths—each turbocharging new applications of deep learning and sweeping up global markets in the process.

In this post, I’ll be discussing Lee’s “Four Waves of AI,” an excellent framework for understanding where AI is today and where it’s headed. I’ll also feature some of the hottest Chinese tech companies leading the charge, all of them worth watching right now.

I’m super excited that this Tuesday I’ll have the opportunity to sit down with Kai-Fu Lee and discuss his book in detail via a webinar.

With Sino-US competition heating up, who will own the future of technology?

Let’s dive in.

The First Wave: Internet AI
In this first stage of AI deployment, we’re dealing primarily with recommendation engines—algorithmic systems that learn from masses of user data to curate online content personalized to each one of us.

Think Amazon’s spot-on product recommendations, or that “Up Next” YouTube video you just have to watch before getting back to work, or Facebook ads that seem to know what you’ll buy before you do.

Powered by the data flowing through our networks, internet AI leverages the fact that users automatically label data as they browse: clicking versus not clicking; lingering on one web page longer than another; hovering over a Facebook video to see what happens at the end.

These cascades of labeled data build a detailed picture of our personalities, habits, demands, and desires: the perfect recipe for more tailored content to keep us on a given platform.
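
As a concrete sketch of how this works, here’s a minimal Python example (the event schema and the 30-second dwell threshold are hypothetical; every platform logs engagement differently) that turns raw browsing behavior into labeled training examples for a click-prediction model:

    from dataclasses import dataclass

    @dataclass
    class Event:
        user_id: str
        item_id: str
        clicked: bool
        dwell_seconds: float

    def label(event: Event) -> int:
        # A click, or a long dwell, counts as implicit "interest."
        return int(event.clicked or event.dwell_seconds > 30.0)

    events = [
        Event("u1", "video_42", clicked=True, dwell_seconds=5.0),
        Event("u1", "video_99", clicked=False, dwell_seconds=45.0),
        Event("u1", "video_07", clicked=False, dwell_seconds=2.0),
    ]

    # (user, item) pairs labeled 1/1/0, ready to feed a recommendation model.
    training_data = [((e.user_id, e.item_id), label(e)) for e in events]
    print(training_data)

No explicit ratings are required: the labels fall out of ordinary browsing, which is exactly why engagement-hungry platforms accumulate training data so quickly.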

Currently, Lee estimates that Chinese and American companies stand head-to-head when it comes to deployment of internet AI. But given China’s data advantage, he predicts that Chinese tech giants will have a slight lead (60-40) over their US counterparts in the next five years.

While you’ve most definitely heard of Alibaba and Baidu, you’ve probably never stumbled upon Toutiao.

Starting out as a copycat of America’s wildly popular BuzzFeed, Toutiao reached a valuation of $20 billion by 2017, dwarfing BuzzFeed’s valuation by more than a factor of 10. But with almost 120 million daily active users, Toutiao doesn’t stop at creating viral content.

Equipped with natural-language processing and computer vision, Toutiao’s AI engines survey a vast network of different sites and contributors, rewriting headlines to optimize for user engagement, and processing each user’s online behavior—clicks, comments, engagement time—to curate individualized news feeds for millions of consumers.

And as users grow more engaged with Toutiao’s content, the company’s algorithms get better and better at recommending content, optimizing headlines, and delivering a truly personalized feed.

It’s this kind of positive feedback loop that fuels today’s AI giants surfing the wave of internet AI.

The Second Wave: Business AI
While internet AI takes advantage of the fact that netizens are constantly labeling data via clicks and other engagement metrics, business AI jumps on the data that traditional companies have already labeled in the past.

Think banks issuing loans and recording repayment rates; hospitals archiving diagnoses, imaging data, and subsequent health outcomes; or courts noting conviction history, recidivism, and flight risk.

While we humans make predictions based on obvious root causes (strong features), AI algorithms can process thousands of weakly correlated variables (weak features) that may have much more to do with a given outcome than the usual suspects.

By scouting out hidden correlations that escape our linear cause-and-effect logic, business AI leverages labeled data to train algorithms that outperform even the most veteran of experts.

Apply these data-trained AI engines to banking, insurance, and legal sentencing, and you get minimized default rates, optimized premiums, and plummeting recidivism rates.

While Lee confidently places America in the lead (90-10) for business AI, China’s substantial lag in structured industry data could actually work in its favor going forward.

In industries where Chinese startups can leapfrog over legacy systems, China has a major advantage.

Take Chinese app Smart Finance, for instance.

While Americans embraced credit and debit cards in the 1970s, China was still in the throes of its Cultural Revolution, largely missing the bus on this technology.

Fast forward to 2017, and China’s mobile payment spending exceeded that of Americans by a ratio of 50 to 1. Without the competition of deeply entrenched credit cards, mobile payments were an obvious upgrade to China’s cash-heavy economy, embraced by 70 percent of China’s 753 million smartphone users by the end of 2017.

But by leapfrogging over credit cards and into mobile payments, China largely left behind the notion of credit.

And here’s where Smart Finance comes in.

An AI-powered app for microfinance, Smart Finance depends almost exclusively on its algorithms to make millions of microloans. For each potential borrower, the app simply requests access to a portion of the user’s phone data.

On the basis of variables as subtle as your typing speed and battery percentage, Smart Finance can predict with astounding accuracy your likelihood of repaying a $300 loan.
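
Smart Finance’s actual model and inputs are proprietary, but the core statistical idea, that many individually weak signals can combine into a strong predictor, is easy to sketch. Here, 300 synthetic features stand in for signals like typing speed and battery level:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000

    # 300 "weak features": each barely correlates with repayment on its own.
    X = rng.normal(size=(n, 300))
    # Repayment depends on a faint combination of many features, plus noise.
    y = (X @ rng.normal(scale=0.1, size=300) + rng.normal(size=n) > 0).astype(int)

    model = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
    print("held-out accuracy:", model.score(X[800:], y[800:]))

No single column would tell a loan officer much, yet the model recovers a usable signal from the combination, which is the whole premise of weak-feature lending.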

Such deployments of business AI and internet AI are already revolutionizing our industries and individual lifestyles. But still on the horizon lie two even more monumental waves: perception AI and autonomous AI.

The Third Wave: Perception AI
In this wave, AI gets an upgrade with eyes, ears, and myriad other senses, merging the digital world with our physical environments.

As sensors and smart devices proliferate through our homes and cities, we are on the verge of entering a trillion-sensor economy.

Companies like China’s Xiaomi are putting out millions of IoT-connected devices, and teams of researchers have already begun prototyping smart dust—solar cell- and sensor-geared particulates that can store and communicate troves of data anywhere, anytime.

As Kai-Fu explains, perception AI “will bring the convenience and abundance of the online world into our offline reality.” Sensor-enabled hardware devices will turn everything from hospitals to cars to schools into online-merge-offline (OMO) environments.

Imagine walking into a grocery store, scanning your face to pull up your most common purchases, and then picking up a virtual assistant (VA) shopping cart. Pre-loaded with your data, the cart adjusts your usual grocery list via voice input, reminds you to get your spouse’s favorite wine for an upcoming anniversary, and guides you through a personalized store route.

While we haven’t yet leveraged the full potential of perception AI, China and the US are already making incredible strides. Given China’s hardware advantage, Lee predicts China currently has a 60-40 edge over its American tech counterparts.

Now the go-to city for startups building robots, drones, wearable technology, and IoT infrastructure, Shenzhen has turned into a powerhouse for intelligent hardware, as I discussed last week. With thousands of factories turbocharging the output of sensors and electronic parts, Shenzhen’s skilled engineers can prototype and iterate new products at unprecedented scale and speed.

With the added fuel of Chinese government support and a relaxed Chinese attitude toward data privacy, China’s lead may even reach 80-20 in the next five years.

Jumping on this wave are companies like Xiaomi, which aims to turn bathrooms, kitchens, and living rooms into smart OMO environments. Having invested in 220 companies and incubated 29 startups that produce its products, Xiaomi had connected more than 85 million smart home devices by the end of 2017, making its network of connected products the world’s largest.

One KFC restaurant in China has even teamed up with Alipay (Alibaba’s mobile payments platform) to pioneer a ‘pay-with-your-face’ feature. Forget cash, cards, and cell phones, and let OMO do the work.

The Fourth Wave: Autonomous AI
But the most monumental—and unpredictable—wave is the fourth and final: autonomous AI.

Integrating all previous waves, autonomous AI gives machines the ability to sense and respond to the world around them, enabling AI to move and act productively.

While today’s machines can outperform us on repetitive tasks in structured and even unstructured environments (think Boston Dynamics’ humanoid Atlas or oncoming autonomous vehicles), machines with the power to see, hear, touch, and optimize the data around them will be a whole new ballgame.

Think: swarms of drones that can selectively spray and harvest entire farms with computer vision and remarkable dexterity, heat-resistant drones that can put out forest fires 100X more efficiently, or Level 5 autonomous vehicles that navigate smart roads and traffic systems all on their own.

While autonomous AI will first involve robots that create direct economic value—automating tasks on a one-to-one replacement basis—these intelligent machines will ultimately revamp entire industries from the ground up.

Kai-Fu Lee currently puts America in a commanding lead of 90-10 in autonomous AI, especially when it comes to self-driving vehicles. But Chinese government efforts are quickly ramping up the competition.

Already in China’s Zhejiang province, highway regulators and government officials have plans to build China’s first intelligent superhighway, outfitted with sensors, road-embedded solar panels and wireless communication between cars, roads and drivers.

Aimed at increasing transit efficiency by up to 30 percent while minimizing fatalities, the project may one day allow autonomous electric vehicles to continuously charge as they drive.

A similar government-fueled project involves Beijing’s new neighbor Xiong’an. Projected to take in over $580 billion in infrastructure spending over the next 20 years, Xiong’an New Area could one day become the world’s first city built around autonomous vehicles.

Baidu is already working with Xiong’an’s local government to build out this AI city with an environmental focus. Possibilities include sensor-geared cement, computer vision-enabled traffic lights, intersections with facial recognition, and parking lots turned parks.

Lastly, Lee predicts China will almost certainly lead the charge in autonomous drones. Already, Shenzhen is home to premier drone maker DJI—a company I’ll be visiting with 24 top executives later this month as part of my annual China Platinum Trip.

Named “the best company I have ever encountered” by Chris Anderson, DJI owns an estimated 50 percent of the North American drone market, supercharged by Shenzhen’s extraordinary maker movement.

While the long-term Sino-US competitive balance in fourth wave AI remains to be seen, one thing is certain: in a matter of decades, we will witness the rise of AI-embedded cityscapes and autonomous machines that can interact with the real world and help solve today’s most pressing grand challenges.

Join Me
Webinar with Dr. Kai-Fu Lee: Dr. Kai-Fu Lee — one of the world’s most respected experts on AI — and I will discuss his latest book AI Superpowers: China, Silicon Valley, and the New World Order. Artificial Intelligence is reshaping the world as we know it. With U.S.-Sino competition heating up, who will own the future of technology? Register here for the free webinar on September 4th, 2018 from 11:00am–12:30pm PST.

Image Credit: Elena11 / Shutterstock.com

#433278 Outdated Evolution: Updating Our ...

What happens when evolution shapes an animal for tribes of 150 primitive individuals living in a chaotic jungle, and then suddenly that animal finds itself living with millions of others in an engineered metropolis, their pockets all bulging with devices of godlike power?

The result, it seems, is a modern era of tension where archaic forms of governance struggle to keep up with the technological advances of their citizenry, where governmental policies act like constraining bottlenecks rather than spearheads of progress.

Simply put, our governments have failed to adapt to disruptive technologies. And if we are to regain our stability moving forward into a future of even greater disruption, it’s imperative that we understand the issues that got us into this situation and what kind of solutions we can engineer to overcome our governmental weaknesses.

Hierarchy vs. Technological Decentralization
Many of the greatest issues our governments face today come from humanity’s biologically hardwired desire for centralized hierarchies. This innate proclivity toward building and navigating systems of status and rank was an evolutionary gift handed down to us by our ape ancestors, among whom each member of a community carried a mental map of their social hierarchy. Their nervous systems behaved differently depending on their rank in this hierarchy, influencing their interactions in a way that ensured only the most competent ape would rise to the top to gain access to the best food and mates.

As humanity emerged and discovered the power of language, we continued this practice by ensuring that those at the top of the hierarchies, those with the greatest education and access to information, were the dominant decision-makers for our communities.

However, this kind of structured chain of power is only necessary when we’re operating under conditions of scarcity. And resources, including information, are no longer scarce.

It’s estimated that more than two-thirds of adults in the world now own a smartphone, giving the average citizen the same access to the world’s information as the leaders of our governments. And with global poverty falling from 35.5 percent to 10.9 percent over the last 25 years, our younger generations are growing up seeing automation and abundance as a likely default, where innovations like solar energy, lab-grown meat, and 3D printing are expected to become commonplace.

It’s awareness of this paradigm shift that has empowered the recent rise of decentralization. As information and access to resources become ubiquitous, there is noticeably less need for our inefficient and bureaucratic hierarchies.

For example, if blockchain can prove its feasibility for large-scale systems, it can be used to update and upgrade numerous applications to a decentralized model, including currency and voting. Such innovations would lower the risk of failing banks collapsing the economy like they did in 2008, as well as prevent corrupt politicians from using gerrymandering and long queues at polling stations to deter voter participation.
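
The trust argument here rests on tamper evidence: in a public ledger, every entry commits to the entries before it, so history can’t be quietly rewritten. A toy hash chain in Python (omitting the consensus, signatures, and networking a real blockchain needs) illustrates the mechanism:

    import hashlib, json

    def block_hash(data, prev):
        payload = json.dumps({"data": data, "prev": prev}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def add_block(chain, data):
        prev = chain[-1]["hash"] if chain else "0" * 64
        chain.append({"data": data, "prev": prev, "hash": block_hash(data, prev)})

    def verify(chain):
        prev = "0" * 64
        for block in chain:
            # Each block must point at its predecessor and match its own hash.
            if block["prev"] != prev or block["hash"] != block_hash(block["data"], prev):
                return False
            prev = block["hash"]
        return True

    ledger = []
    add_block(ledger, {"vote": "candidate_a"})
    add_block(ledger, {"vote": "candidate_b"})
    print(verify(ledger))               # True
    ledger[0]["data"]["vote"] = "candidate_b"
    print(verify(ledger))               # False: tampering breaks every later link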

Of course, technology isn’t a magic wand that should be implemented carelessly. Facebook’s “move fast and break things” approach may well have broken American democracy in 2016, as social media played on some of the worst tendencies humans fall back on during an election: fear and hostility.

But if decentralized technology, like blockchain’s public ledgers, can continue to spread a sense of security and transparency throughout society, perhaps we can begin to quiet that paranoia and hyper-vigilance our brains evolved to cope with living as apes in dangerous jungles. By decentralizing our power structures, we take away the channels our outdated biological behaviors might use to enact social dominance and manipulation.

The peace of mind this creates helps to reestablish trust in our communities and in our governments. And with trust in the government increased, it’s likely we’ll see our next issue corrected.

From Business and Law to Science and Technology
A study found that 59 percent of US presidents, 68 percent of vice presidents, and 78 percent of secretaries of state were lawyers by education and occupation. That’s more than one out of every two people in the most powerful positions in the American government trained in a field dedicated to convincing other people (judges) that their perspective is true, even in the absence of evidence.

And so the scientific method became less important than semantics to our leaders.

Similarly, of the 535 members of the American Congress, only 24 hold a PhD, and only two of those are in a STEM field. And so far, it’s not getting better: Trump is the first president since WWII not to name a science advisor.

But if we can use technologies like blockchain to increase transparency, efficiency, and trust in the government, then the upcoming generations who understand decentralization, abundance, and exponential technologies might feel inspired enough to run for government positions. This helps solve that common problem where the smartest and most altruistic people tend to avoid government positions because they don’t want to play the semantic and deceitful game of politics.

By changing this narrative, our governments can begin to fill with techno-progressive individuals who actually understand the technologies that are rapidly reshaping our reality. And this influx of expertise is going to be crucial as our governments are forced to restructure and create new policies to accommodate the incoming disruption.

Clearing Regulations to Begin Safe Experimentation
As exponential technologies become more ubiquitous, we’re likely going to see young kids and garage tinkerers creating powerful AIs and altering genetics thanks to tools like CRISPR and free virtual reality tutorials.

Such easy access to powerful technology means unexpected and rapid progress can occur almost overnight, quickly overwhelming our governments’ regulatory systems.

Uber and Airbnb are two of the best examples of our governments’ inability to keep up with such technology, both companies achieving market dominance before regulators could even consider how to handle them. And when a government has ruled against them, they often continue to operate anyway, because people simply keep using the apps.

Luckily, this kind of disruption hasn’t yet posed a major existential threat. But this will change when we see companies begin developing cyborg body parts, brain-computer interfaces, nanobot health injectors, and at-home genetic engineering kits.

For this reason, it’s crucial that we have experts who understand how to make our regulations flexible enough that we don’t create black-market conditions like those we’ve created with drugs. It’s better to have safe and monitored experimentation than to force individuals into seedy communities using unsafe products.

Survival of the Most Adaptable
If we hope to be an animal that survives our changing environment, we have to adapt. We cannot cling to the behaviors and systems formed thousands of years ago. We must instead acknowledge that we now exist in an ecosystem of disruptive technology, and we must evolve and update our governments if they’re going to be capable of navigating these transformative impacts.

Image Credit: mmatee / Shutterstock.com

#432880 Google’s Duplex Raises the Question: ...

By now, you’ve probably seen Google’s new Duplex software, which promises to call people on your behalf to book appointments for haircuts and the like. As yet, it only exists in demo form, but already it seems like Google has made a big stride towards capturing a market that plenty of companies have had their eye on for quite some time. This software is impressive, but it raises questions.

Many of you will be familiar with the stilted, robotic conversations you could have with early chatbots, which were, essentially, glorified menus. Instead of pressing 1 to confirm or 2 to re-enter, some of these bots would accept simple commands like “Yes” or “No,” replacing the buttons with a limited ability to recognize a few words. Using them was often far more frustrating than attempting to use a menu—there are few things more irritating than a robot saying, “Sorry, your response was not recognized.”

[Video demo: Google Duplex scheduling a hair salon appointment]

[Video demo: Google Duplex calling a restaurant]

Even getting the response recognized is hard enough. After all, there are countless different nuances and accents to baffle voice recognition software, and endless turns of phrase that amount to saying the same thing that can confound natural language processing (NLP), especially if you like your phrasing quirky.

You may think that standard customer-service type conversations all travel the same route, using similar words and phrasing. But when there are over 80,000 ways to order coffee, and making a mistake is frowned upon, even simple tasks require high accuracy over a huge dataset.

Advances in audio processing, neural networks, and NLP, as well as raw computing power, have meant that basic recognition of what someone is trying to say is less of an issue. SoundHound’s virtual assistant prides itself on being able to process complicated requests (perhaps needlessly complicated ones).

The deeper issue, as with all attempts to develop conversational machines, is one of understanding context. There are so many ways a conversation can go that attempting to construct a conversation two or three layers deep quickly runs into problems. Multiply the thousands of things people might say by the thousands they might say next, and the combinatorics of the challenge runs away from most chatbots, leaving them as either glorified menus, gimmicks, or rather bizarre to talk to.
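
The arithmetic behind that combinatorial explosion is easy to check:

    # If each turn of a conversation has ~1,000 plausible utterances,
    # the number of distinct dialogue paths grows as 1,000^depth.
    utterances_per_turn = 1_000
    for depth in (1, 2, 3):
        print(f"{depth} turn(s): {utterances_per_turn ** depth:,} paths")
    # 1 turn: 1,000 / 2 turns: 1,000,000 / 3 turns: 1,000,000,000

A scripted tree can’t enumerate a billion three-turn paths, which is why conversational systems have to generalize rather than branch.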

Yet Google, which surely remembers from Glass the risk of premature debuts for technology, especially the kind that asks you to rethink how you interact with or trust in software, must have faith in Duplex to show it on the world stage. We know that startups like Semantic Machines and x.ai have received serious funding to perform very similar functions, using natural-language conversations to perform computing tasks, schedule meetings, book hotels, or purchase items.

It’s no great leap to imagine Google will soon do the same, bringing us closer to a world of onboard computing, where Lens labels the world around us and its assistant arranges it for us (all the while gathering more and more data it can convert into personalized ads). The early demos showed some clever tricks for keeping the conversation within a fairly narrow realm where the AI should be comfortable and competent, and the blog post that accompanied the release shows just how much effort has gone into the technology.

Yet given the privacy and ethics funk the tech industry finds itself in, and people’s general unease about AI, the main reaction to Duplex’s impressive demo was concern. The voice sounded too natural, bringing to mind Lyrebird and their warnings of deepfakes. You might trust “Do the Right Thing” Google with this technology, but it could usher in an era when automated robo-callers are far more convincing.

A more human-like voice may sound like a perfectly innocuous improvement, but the fact that the assistant interjects naturalistic “umm” and “mm-hm” responses to more perfectly mimic a human rubbed a lot of people the wrong way. This wasn’t just a voice assistant trying to sound less grinding and robotic; it was actively trying to deceive people into thinking they were talking to a human.

Google is running the risk of trying to get to conversational AI by going straight through the uncanny valley.

“Google’s experiments do appear to have been designed to deceive,” said Dr. Thomas King of the Oxford Internet Institute’s Digital Ethics Lab, according to TechCrunch. “Their main hypothesis was ‘can you distinguish this from a real person?’ In this case it’s unclear why their hypothesis was about deception and not the user experience… there should be some kind of mechanism there to let people know what it is they are speaking to.”

From Google’s perspective, being able to say “90 percent of callers can’t tell the difference between this and a human personal assistant” is an excellent marketing ploy, even though statistics about how many interactions are successful might be more relevant.

In fact, Duplex runs contrary to pretty much every major recommendation about ethics for the use of robotics or artificial intelligence, not to mention certain eavesdropping laws. Transparency is key to holding machines (and the people who design them) accountable, especially when it comes to decision-making.

Then there are the more subtle social issues. One prominent effect social media has had is to allow people to silo themselves; in echo chambers of like-minded individuals, it’s hard to see how other opinions exist. Technology exacerbates this by removing the evolutionary cues that go along with face-to-face interaction. Confronted with a pair of human eyes, people are more generous. Confronted with a Twitter avatar or a Facebook interface, people hurl abuse and criticism they’d never dream of using in a public setting.

Now that we can use technology to interact with ever fewer people, will it change us? Is it fair to offload the burden of dealing with a robot onto the poor human at the other end of the line, who might have to deal with dozens of such calls a day? Google has said that if the AI is in trouble, it will put you through to a human, which might help save receptionists from the hell of trying to explain a concept to dozens of dumbfounded AI assistants all day. But there’s always the risk that failures will be blamed on the person and not the machine.

As AI advances, could we end up treating the dwindling number of people in these “customer-facing” roles as the buggiest part of a fully automatic service? Will people start accusing each other of being robots on the phone, as well as on Twitter?

Google has provided plenty of reassurances about how the system will be used. They have said they will ensure that the system is identified, and it’s hardly difficult to resolve this problem; a slight change in the script from their demo would do it. For now, consumers will likely appreciate moves that make it clear whether the “intelligent agents” that make major decisions for us, that we interact with daily, and that hide behind social media avatars or phone numbers are real or artificial.

Image Credit: Besjunior / Shutterstock.com

#432878 Chinese Port Goes Full Robot With ...

By the end of 2018, something will be very different about the harbor area in the northern Chinese city of Caofeidian. If you were to visit, the whirring cranes and tractors driving containers to and fro would be the only things in sight.

Caofeidian is set to become the world’s first fully autonomous harbor by the end of the year. The US-Chinese startup TuSimple, a specialist in developing self-driving trucks, will replace human-driven terminal tractor-trucks with 20 self-driving models. A separate company handles crane automation, and a central control system will coordinate the movements of both.

According to Robert Brown, Director of Public Affairs at TuSimple, the project could quickly transform into a much wider trend. “The potential for automating systems in harbors and ports is staggering when considering the number of deep-water and inland ports around the world. At the same time, the closed, controlled nature of a port environment makes it a perfect proving ground for autonomous truck technology,” he said.

Going Global
The autonomous cranes and trucks have a big task ahead of them. Caofeidian currently processes around 300,000 TEU containers a year. Even if you were dealing with Lego bricks, that number of units would get you a decent-sized cathedral or a 22-foot-long aircraft carrier. For any maritime fans—or people who enjoy the moving of heavy objects—TEU stands for twenty-foot equivalent unit. It is the industry standard for containers. A TEU equals an 8-foot (2.43 meter) wide, 8.5-foot (2.59 meter) high, and 20-foot (6.06 meter) long container.
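
For a sense of scale, here’s a quick back-of-the-envelope calculation using those dimensions:

    # TEU dimensions from the paragraph above; the rest is multiplication.
    width_m, height_m, length_m = 2.43, 2.59, 6.06
    teu_volume_m3 = width_m * height_m * length_m      # ~38 cubic meters
    annual_teu = 300_000
    print(f"one TEU is about {teu_volume_m3:.1f} cubic meters")
    print(f"annual volume: {teu_volume_m3 * annual_teu:,.0f} cubic meters")
    # Roughly 11.4 million cubic meters of containerized volume a year.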

While impressive, the Caofeidian number pales in comparison with the biggest global ports like Shanghai, Singapore, Busan, or Rotterdam. For example, 2017 saw more than 40 million TEU moved through Shanghai port facilities.

Self-driving container vehicles have been trialed elsewhere, including in Yangshan, close to Shanghai, and in Rotterdam. Qingdao New Qianwan Container Terminal in China recently laid claim to being the first fully automated terminal in Asia.

The potential for efficiencies has many ports interested in automation. Qingdao said its systems allow the terminal to operate in complete darkness and have reduced labor costs by 70 percent while increasing efficiency by 30 percent. In some cases, the number of workers needed to unload a cargo ship has gone from 60 to 9.

TuSimple says it is in negotiations with several other ports and also sees potential in related logistics-heavy fields.

Stable Testing Ground
For autonomous vehicles, ports seem like a perfect testing ground. They are restricted, confined areas with few to no pedestrians where operating speeds are limited. The predictability makes it unlike, say, city driving.

Robert Brown describes it as an ideal setting for the first adaptation of TuSimple’s technology. The company, which is backed by chipmaker Nvidia, among others, has been retrofitting existing vehicles from Shaanxi Automobile Group with sensors and technology.

At the same time, it is running open road tests in Arizona and China of its Class 8 Level 4 autonomous trucks.

The Camera Approach
Dozens of autonomous truck startups are reported to have launched in China over the past two years. In other countries the situation is much the same, as the race for the future of goods transportation heats up. Startup companies like Embark, Einride, Starsky Robotics, and Drive.ai are just a few of the names in the space. They are facing competition from the likes of Tesla, Daimler, VW, Uber’s Otto subsidiary, and in March, Waymo announced it too was getting into the truck race.

Compared to many of its competitors, TuSimple’s autonomous driving system is based on a different approach. Instead of laser-based LIDAR, TuSimple primarily uses cameras to gather data about its surroundings. Currently, the company uses ten cameras, including forward-facing, backward-facing, and wide-lens units. Together, they produce a 360-degree “God View” of the vehicle’s surroundings, which is interpreted by the onboard autonomous driving systems.

Each camera gathers information at 30 frames a second. Millimeter-wave radar is used as a secondary sensor. In total, the vehicles generate what Robert Brown describes with a laugh as “almost too much” data about their surroundings, and the system is accurate beyond 300 meters in locating and identifying objects. This includes objects that have given LIDAR problems, such as black vehicles.
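
A rough estimate suggests why. The camera count and frame rate come from the text; the resolution and color depth below are assumptions for illustration only:

    cameras, fps = 10, 30
    width, height, bytes_per_pixel = 1920, 1080, 3     # assumed 1080p RGB
    raw_bytes_per_second = cameras * fps * width * height * bytes_per_pixel
    print(f"raw throughput: {raw_bytes_per_second / 1e9:.2f} GB/s uncompressed")
    # About 1.87 GB/s before compression, and before any radar data.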

Another advantage is price. Companies are often loath to reveal exact amounts, but Tesla has gone as far as to say that the ‘expected’ price of its autonomous truck will be from $150,000 and upwards. While unconfirmed, TuSimple’s retrofitted, camera-based solution is thought to cost around $20,000.

Image Credit: chinahbzyg / Shutterstock.com

#432487 Can We Make a Musical Turing Test?

As artificial intelligence advances, we’re encountering the same old questions. How much of what we consider to be fundamentally human can be reduced to an algorithm? Can we create something sufficiently advanced that people can no longer distinguish between the two? This, after all, is the idea behind the Turing Test, which has yet to be passed.

At first glance, you might think music is beyond the realm of algorithms. Birds can sing, and people can compose symphonies. Music is evocative; it makes us feel. Very often, our intense personal and emotional attachments to music are because it reminds us of our shared humanity. We are told that creative jobs are the least likely to be automated. Creativity seems fundamentally human.

But above all, I think we view it as reductionist sacrilege to dissect beautiful things. “If you try to strangle a skylark / to cut it up, see how it works / you will stop its heart from beating / you will stop its mouth from singing.” A human musician wrote that; a machine might be able to string words together that are happy or sad; it might even be able to conjure up a decent metaphor from the depths of some neural network—but could it understand humanity enough to produce art that speaks to humans?

Then, of course, there’s the other side of the debate. Music, after all, has a deeply mathematical structure; you can train a machine to produce harmonics. “In the teachings of Pythagoras and his followers, music was inseparable from numbers, which were thought to be the key to the whole spiritual and physical universe,” according to Grout in A History of Western Music. You might argue that the process of musical composition cannot be reduced to a simple algorithm, yet musicians have often done exactly that. Mozart, with his “Dice Music,” used the roll of dice to decide how to order musical fragments: creativity through an 18th-century random number generator. Algorithmic music goes back a very long way, with the first papers on the subject dating from the 1960s.

Then there’s the techno-enthusiast side of the argument. iTunes has 26 million songs, easily more than a century of music. A human could never listen to and learn from them all, but a machine could. It could also memorize every note of Beethoven. Music can be converted into MIDI files, a nice chewable data format that allows even a character-by-character neural net you can run on your computer to generate music. (Seriously, even I could get this thing working.)
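
For a flavor of how simple symbol-by-symbol generation is, here’s a toy character-level Markov chain over an invented note string. A real char-RNN replaces the frequency table with a learned network, but the generate-one-character-at-a-time loop is the same:

    import random
    from collections import defaultdict

    corpus = "CDECDEGFEDCDE " * 20    # toy "melody" text, one character per note

    # Record which characters tend to follow each character.
    transitions = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        transitions[a].append(b)

    random.seed(1)
    note, melody = "C", []
    for _ in range(16):
        melody.append(note)
        note = random.choice(transitions[note])
    print("".join(melody))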

Indeed, generating music in the style of Bach has long been a test for AI, and you can watch neural networks gradually learn to imitate classical composers while trying to avoid overfitting. When an algorithm overfits, it essentially starts copying the existing music rather than taking inspiration from it to create something similar: a tightrope the best human artists learn to walk. Creativity doesn’t spring from nowhere; even maverick musical geniuses have their influences.
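
One crude way to catch that copying failure mode is to measure how much of a generated piece appears verbatim in the training data; a high score suggests memorization rather than imitation. This is a heuristic sketch, not a standard metric:

    def ngram_overlap(generated: str, training: str, n: int = 8) -> float:
        # Fraction of the generated piece's n-grams found verbatim in training.
        grams = {generated[i:i + n] for i in range(len(generated) - n + 1)}
        if not grams:
            return 0.0
        return sum(g in training for g in grams) / len(grams)

    training = "CDECDEGFEDCDE" * 20
    print(ngram_overlap("CDECDEGFEDCDE", training))   # verbatim copy -> 1.0
    print(ngram_overlap("CEGBDFACEGBDF", training))   # novel line -> 0.0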

Does a machine have to be truly ‘creative’ to produce something that someone would find valuable? To what extent would listeners’ attitudes change if they thought they were hearing a human vs. an AI composition? This all suggests a musical Turing Test. Of course, it already exists. In fact, it’s run out of Dartmouth, the school that hosted that first, seminal AI summer conference. This year, the contest is bigger than ever: alongside the PoetiX, LimeriX and LyriX competitions for poetry and lyrics, there’s a DigiKidLit competition for children’s literature (although you may have reservations about exposing your children to neural-net generated content… it can get a bit surreal).

There’s also a pair of musical competitions, including one for original compositions in different genres. Key genres and styles are represented by Charlie Parker for Jazz and the Bach chorales for classical music. There’s also a free composition, and a contest where a human and an AI try to improvise together—the AI must respond to a human spontaneously, in real time, and in a musically pleasing way. Quite a challenge! In all cases, if any of the generated work is indistinguishable from human performers, the neural net has passed the Turing Test.

Did they? Here’s part of 2017’s winning sonnet from Charese Smiley and Hiroko Bretz:

The large cabin was in total darkness.
Come marching up the eastern hill afar.
When is the clock on the stairs dangerous?
Everything seemed so near and yet so far.
Behind the wall silence alone replied.
Was, then, even the staircase occupied?

Generating the rhymes is easy enough, the sentence structure a little trickier, but what’s impressive about this sonnet is that it sticks to a single topic and reads as a more coherent whole. I’d guess they used associated “lexical fields” of similar words to help generate something coherent. In a similar way, most of the more famous examples of AI-generated music still involve some amount of human control, even if it’s editorial: a human will build a song around an AI-generated riff, or select the most convincing Bach chorale from among many different samples.
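
If that “lexical fields” guess is right, the mechanism might look something like this sketch, where invented word lists bias each line toward a single topic:

    import random

    # Hypothetical word lists; one "field" per topic keeps the vocabulary coherent.
    lexical_fields = {
        "darkness": ["darkness", "silence", "staircase", "wall", "clock", "shadow"],
        "sea": ["tide", "gull", "salt", "harbor", "foam", "wave"],
    }

    def topic_line(topic: str, template: str) -> str:
        w1, w2 = random.sample(lexical_fields[topic], 2)
        return template.format(w1, w2)

    random.seed(3)
    print(topic_line("darkness", "Behind the {} only the {} replied."))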

We are seeing strides forward in the ability of AI to generate human voices and human likenesses. As the latter example shows, in the fake-news era people have focused on the dangers of this tech, but might it also be possible to create a virtual performer, trained on a dataset of their original music? Did you ever want to hear another Beatles album, or jam with Miles Davis? Of course, these things are impossible—but could we create a similar experience that people would genuinely value? Even, to the untrained ear, something indistinguishable from the real thing?

And if it did measure up to the real thing, what would this mean? Jaron Lanier is a fascinating technology writer, a critic of strong AI, and a believer in the power of virtual reality to change the world and provide truly meaningful experiences. He’s also a composer and a musical aficionado. He pointed out in a recent interview that translation algorithms, by reducing the amount of work translators are commissioned to do, have, in some sense, profited from stolen expertise. They were trained on huge datasets purloined from human linguists and translators. If you can train an AI on someone’s creative output and it produces new music, who “owns” it?

Although companies that offer AI music tools are starting to proliferate, and some groups will argue that the musical Turing test has been passed already, AI-generated music is hardly racing to the top of the pop charts just yet. Even as the line between human-composed and AI-generated music starts to blur, there’s still a gulf between the average human and musical genius. In the next few years, we’ll see how far the current techniques can take us. It may be the case that there’s something in the skylark’s song that can’t be generated by machines. But maybe not, and then this song might need an extra verse.

Image Credit: d1sk / Shutterstock.com
