Tag Archives: thinking
#433758 DeepMind’s New Research Plan to Make ...
Making sure artificial intelligence does what we want and behaves in predictable ways will be crucial as the technology becomes increasingly ubiquitous. It’s an area frequently neglected in the race to develop products, but DeepMind has now outlined its research agenda to tackle the problem.
AI safety, as the field is known, has been gaining prominence in recent years. That’s probably at least partly down to the overzealous warnings of a coming AI apocalypse from well-meaning but underqualified pundits like Elon Musk and Stephen Hawking. But it’s also recognition of the fact that AI technology is quickly pervading all aspects of our lives, making decisions on everything from what movies we watch to whether we get a mortgage.
That’s why, back in 2016, DeepMind hired a bevy of researchers who specialize in foreseeing the unforeseen consequences of the way we build AI. And now the team has spelled out the three key domains they think require research if we’re going to build autonomous machines that do what we want.
In a new blog designed to provide updates on the team’s work, they introduce the ideas of specification, robustness, and assurance, which they say will act as the cornerstones of their future research. Specification involves making sure AI systems do what their operator intends; robustness means a system can cope with changes to its environment and attempts to throw it off course; and assurance involves our ability to understand what systems are doing and how to control them.
A classic thought experiment about how we could lose control of an AI system helps illustrate the problem of specification. Philosopher Nick Bostrom posited a hypothetical machine charged with making as many paperclips as possible. Because the creators fail to add what they might assume are obvious additional goals like not harming people, the AI wipes out humanity so we can’t switch it off, then turns all matter in the universe into paperclips.
Obviously the example is extreme, but it shows how a poorly specified goal can lead to unexpected and disastrous outcomes. Properly codifying the desires of the designer is no easy feat, though; often there are no neat ways to encompass both the explicit and implicit goals in ways that are understandable to the machine and don’t leave room for ambiguities, meaning we often rely on incomplete approximations.
The researchers note recent research by OpenAI in which an AI was trained to play a boat-racing game called CoastRunners. The game rewards players for hitting targets laid out along the race route. The AI worked out that it could get a higher score by repeatedly knocking over regenerating targets rather than actually completing the course. The blog post includes a link to a spreadsheet detailing scores of such examples.
Another key concern for AI designers is making their creation robust to the unpredictability of the real world. Despite their superhuman abilities on certain tasks, most cutting-edge AI systems are remarkably brittle. They tend to be trained on highly curated datasets and so can fail when faced with unfamiliar input. This can happen by accident or by design: researchers have come up with numerous ways to trick image recognition algorithms into misclassifying things, including thinking a 3D-printed tortoise was actually a gun.
Building systems that can deal with every possible encounter may not be feasible, so a big part of making AIs more robust may be getting them to avoid risks and ensuring they can recover from errors, or that they have failsafes to ensure errors don’t lead to catastrophic failure.
And finally, we need to have ways to make sure we can tell whether an AI is performing the way we expect it to. A key part of assurance is being able to effectively monitor systems and interpret what they’re doing—if we’re basing medical treatments or sentencing decisions on the output of an AI, we’d like to see the reasoning. That’s a major outstanding problem for popular deep learning approaches, which are largely indecipherable black boxes.
The other half of assurance is the ability to intervene if a machine isn’t behaving the way we’d like. But designing a reliable off switch is tough, because most learning systems have a strong incentive to prevent anyone from interfering with their goals.
The authors don’t pretend to have all the answers, but they hope the framework they’ve come up with can help guide others working on AI safety. While it may be some time before AI is truly in a position to do us harm, hopefully early efforts like these will mean it’s built on a solid foundation that ensures it is aligned with our goals.
Image Credit: cono0430 / Shutterstock.com
#433731 From cyborgs to sex robots, U of M ...
Francis Shen spends a lot of time thinking about transhuman cyborgs, brain-wave lie detectors, sex robots and terrorists hacking into devices implanted in our heads.
#432880 Google’s Duplex Raises the Question: ...
By now, you’ve probably seen Google’s new Duplex software, which promises to call people on your behalf to book appointments for haircuts and the like. As yet, it only exists in demo form, but already it seems like Google has made a big stride towards capturing a market that plenty of companies have had their eye on for quite some time. This software is impressive, but it raises questions.
Many of you will be familiar with the stilted, robotic conversations you can have with early chatbots that are, essentially, glorified menus. Instead of pressing 1 to confirm or 2 to re-enter, some of these bots would allow for simple commands like “Yes” or “No,” replacing the buttons with limited ability to recognize a few words. Using them was often a far more frustrating experience than attempting to use a menu—there are few things more irritating than a robot saying, “Sorry, your response was not recognized.”
Google Duplex scheduling a hair salon appointment:
Google Duplex calling a restaurant:
Even getting the response recognized is hard enough. After all, there are countless different nuances and accents to baffle voice recognition software, and endless turns of phrase that amount to saying the same thing that can confound natural language processing (NLP), especially if you like your phrasing quirky.
You may think that standard customer-service type conversations all travel the same route, using similar words and phrasing. But when there are over 80,000 ways to order coffee, and making a mistake is frowned upon, even simple tasks require high accuracy over a huge dataset.
Advances in audio processing, neural networks, and NLP, as well as raw computing power, have meant that basic recognition of what someone is trying to say is less of an issue. SoundHound’s virtual assistant prides itself on being able to process complicated requests (perhaps needlessly complicated).
The deeper issue, as with all attempts to develop conversational machines, is one of understanding context. There are so many ways a conversation can go that attempting to construct a conversation two or three layers deep quickly runs into problems. Multiply the thousands of things people might say by the thousands they might say next, and the combinatorics of the challenge runs away from most chatbots, leaving them as either glorified menus, gimmicks, or rather bizarre to talk to.
Yet Google, which surely remembers from Glass the risk of premature debuts for technology, especially the kind that asks you to rethink how you interact with or trust in software, must have faith in Duplex to show it on the world stage. We know that startups like Semantic Machines and x.ai have received serious funding to perform very similar functions, using natural-language conversations to perform computing tasks, schedule meetings, book hotels, or purchase items.
It’s no great leap to imagine Google will soon do the same, bringing us closer to a world of onboard computing, where Lens labels the world around us and their assistant arranges it for us (all the while gathering more and more data it can convert into personalized ads). The early demos showed some clever tricks for keeping the conversation within a fairly narrow realm where the AI should be comfortable and competent, and the blog post that accompanied the release shows just how much effort has gone into the technology.
Yet given the privacy and ethics funk the tech industry finds itself in, and people’s general unease about AI, the main reaction to Duplex’s impressive demo was concern. The voice sounded too natural, bringing to mind Lyrebird and their warnings of deepfakes. You might trust “Do the Right Thing” Google with this technology, but it could usher in an era when automated robo-callers are far more convincing.
A more human-like voice may sound like a perfectly innocuous improvement, but the fact that the assistant interjects naturalistic “umm” and “mm-hm” responses to more perfectly mimic a human rubbed a lot of people the wrong way. This wasn’t just a voice assistant trying to sound less grinding and robotic; it was actively trying to deceive people into thinking they were talking to a human.
Google is running the risk of trying to get to conversational AI by going straight through the uncanny valley.
“Google’s experiments do appear to have been designed to deceive,” said Dr. Thomas King of the Oxford Internet Institute’s Digital Ethics Lab, according to TechCrunch. “Their main hypothesis was ‘can you distinguish this from a real person?’ In this case it’s unclear why their hypothesis was about deception and not the user experience… there should be some kind of mechanism there to let people know what it is they are speaking to.”
From Google’s perspective, being able to say “90 percent of callers can’t tell the difference between this and a human personal assistant” is an excellent marketing ploy, even though statistics about how many interactions are successful might be more relevant.
In fact, Duplex runs contrary to pretty much every major recommendation about ethics for the use of robotics or artificial intelligence, not to mention certain eavesdropping laws. Transparency is key to holding machines (and the people who design them) accountable, especially when it comes to decision-making.
Then there are the more subtle social issues. One prominent effect social media has had is to allow people to silo themselves; in echo chambers of like-minded individuals, it’s hard to see how other opinions exist. Technology exacerbates this by removing the evolutionary cues that go along with face-to-face interaction. Confronted with a pair of human eyes, people are more generous. Confronted with a Twitter avatar or a Facebook interface, people hurl abuse and criticism they’d never dream of using in a public setting.
Now that we can use technology to interact with ever fewer people, will it change us? Is it fair to offload the burden of dealing with a robot onto the poor human at the other end of the line, who might have to deal with dozens of such calls a day? Google has said that if the AI is in trouble, it will put you through to a human, which might help save receptionists from the hell of trying to explain a concept to dozens of dumbfounded AI assistants all day. But there’s always the risk that failures will be blamed on the person and not the machine.
As AI advances, could we end up treating the dwindling number of people in these “customer-facing” roles as the buggiest part of a fully automatic service? Will people start accusing each other of being robots on the phone, as well as on Twitter?
Google has provided plenty of reassurances about how the system will be used. They have said they will ensure that the system is identified, and it’s hardly difficult to resolve this problem; a slight change in the script from their demo would do it. For now, consumers will likely appreciate moves that make it clear whether the “intelligent agents” that make major decisions for us, that we interact with daily, and that hide behind social media avatars or phone numbers are real or artificial.
Image Credit: Besjunior / Shutterstock.com