Tag Archives: algorithms

#437373 Microsoft’s New Deepfake Detector Puts ...

The upcoming US presidential election seems set to be something of a mess—to put it lightly. Covid-19 will likely deter millions from voting in person, and mail-in voting isn’t shaping up to be much more promising. This all comes at a time when political tensions are running higher than they have in decades, issues that shouldn’t be political (like mask-wearing) have become highly politicized, and Americans are dramatically divided along party lines.

So the last thing we need right now is yet another wrench in the spokes of democracy, in the form of disinformation; we all saw how that played out in 2016, and it wasn’t pretty. For the record, disinformation purposely misleads people, while misinformation is simply inaccurate, but without malicious intent. While there’s not a ton tech can do to make people feel safe at crowded polling stations or up the Postal Service’s budget, tech can help with disinformation, and Microsoft is trying to do so.

On Tuesday the company released two new tools designed to combat disinformation, described in a blog post by VP of Customer Security and Trust Tom Burt and Chief Scientific Officer Eric Horvitz.

The first is Microsoft Video Authenticator, which is made to detect deepfakes. In case you’re not familiar with this wicked byproduct of AI progress, “deepfakes” refers to audio or visual files made using artificial intelligence that can manipulate people’s voices or likenesses to make it look like they said things they didn’t. Editing a video to string together words and form a sentence someone didn’t say doesn’t count as a deepfake; though there’s manipulation involved, you don’t need a neural network and you’re not generating any original content or footage.

The Authenticator analyzes videos or images and tells users the percentage chance that they’ve been artificially manipulated. For videos, the tool can even analyze individual frames in real time.

Deepfake videos are made by feeding hundreds of hours of video of someone into a neural network, “teaching” the network the minutiae of the person’s voice, pronunciation, mannerisms, gestures, etc. It’s like when you do an imitation of your annoying coworker from accounting, complete with mimicking the way he makes every sentence sound like a question and his eyes widen when he talks about complex spreadsheets. You’ve spent hours—no, months—in his presence and have his personality quirks down pat. An AI algorithm that produces deepfakes needs to learn those same quirks, and more, about whoever the creator’s target is.

Given enough real information and examples, the algorithm can then generate its own fake footage, with deepfake creators using computer graphics and manually tweaking the output to make it as realistic as possible.

The scariest part? To make a deepfake, you don’t need a fancy computer or even a ton of knowledge about software. There are open-source programs people can access for free online, and as far as finding video footage of famous people—well, we’ve got YouTube to thank for how easy that is.

Microsoft’s Video Authenticator can detect the blending boundary of a deepfake and subtle fading or greyscale elements that the human eye may not be able to see.

In the blog post, Burt and Horvitz point out that as time goes by, deepfakes are only going to get better and become harder to detect; after all, they’re generated by neural networks that are continuously learning from and improving themselves.

Microsoft’s counter-tactic is to come in from the opposite angle, that is, being able to confirm beyond doubt that a video, image, or piece of news is real (I mean, can McDonald’s fries cure baldness? Did a seal slap a kayaker in the face with an octopus? Never has it been so imperative that the world know the truth).

A tool built into Microsoft Azure, the company’s cloud computing service, lets content producers add digital hashes and certificates to their content, and a reader (which can be used as a browser extension) checks the certificates and matches the hashes to indicate the content is authentic.
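The hash-and-certificate check described above can be sketched in a few lines. This is only an illustration of the general idea, not Microsoft's implementation: the function names and the key are hypothetical, and an HMAC with a shared secret stands in for the certificate, where a real system would use a public-key signature.

```python
import hashlib
import hmac

# Hypothetical key for illustration only. A real system would rely on a
# publisher certificate (public-key signature), not a shared secret.
PUBLISHER_KEY = b"example-publisher-key"


def make_manifest(content: bytes) -> dict:
    """Producer side: hash the content, then sign the hash."""
    digest = hashlib.sha256(content).hexdigest()
    signature = hmac.new(PUBLISHER_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return {"sha256": digest, "signature": signature}


def verify(content: bytes, manifest: dict) -> bool:
    """Reader side: check the signature first, then match the hash."""
    expected = hmac.new(
        PUBLISHER_KEY, manifest["sha256"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False  # the manifest itself was forged or altered
    return hashlib.sha256(content).hexdigest() == manifest["sha256"]


manifest = make_manifest(b"original video bytes")
print(verify(b"original video bytes", manifest))   # content matches its manifest
print(verify(b"tampered video bytes", manifest))   # content no longer matches
```

Changing even one byte of the content changes its hash, so the reader can flag content that no longer matches what the producer published.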

Finally, Microsoft also launched an interactive “Spot the Deepfake” quiz it developed in collaboration with the University of Washington’s Center for an Informed Public, deepfake detection company Sensity, and USA Today. The quiz is intended to help people “learn about synthetic media, develop critical media literacy skills, and gain awareness of the impact of synthetic media on democracy.”

The impact Microsoft’s new tools will have remains to be seen—but hey, we’re glad they’re trying. And they’re not alone; Facebook, Twitter, and YouTube have all taken steps to ban and remove deepfakes from their sites. The AI Foundation’s Reality Defender uses synthetic media detection algorithms to identify fake content. There’s even a coalition of big tech companies teaming up to try to fight election interference.

One thing is for sure: between a global pandemic, widespread protests and riots, mass unemployment, a hobbled economy, and the disinformation that’s remained rife through it all, we’re going to need all the help we can get to make it through not just the election, but the rest of the conga-line-of-catastrophes year that is 2020.

Image Credit: Darius Bashar on Unsplash

Posted in Human Robots

#437357 Algorithms Workers Can’t See Are ...

“I’m sorry, Dave. I’m afraid I can’t do that.” HAL’s cold, if polite, refusal to open the pod bay doors in 2001: A Space Odyssey has become a defining warning about putting too much trust in artificial intelligence, particularly if you work in space.

In the movies, when a machine decides to be the boss (or humans let it) things go wrong. Yet despite myriad dystopian warnings, control by machines is fast becoming our reality.

Algorithms—sets of instructions to solve a problem or complete a task—now drive everything from browser search results to better medical care.

They are helping design buildings. They are speeding up trading on financial markets, making and losing fortunes in micro-seconds. They are calculating the most efficient routes for delivery drivers.

In the workplace, self-learning algorithmic computer systems are being introduced by companies to assist in areas such as hiring, setting tasks, measuring productivity, evaluating performance, and even terminating employment: “I’m sorry, Dave. I’m afraid you are being made redundant.”

Giving self‐learning algorithms the responsibility to make and execute decisions affecting workers is called “algorithmic management.” It carries a host of risks in depersonalizing management systems and entrenching pre-existing biases.

At an even deeper level, perhaps, algorithmic management entrenches a power imbalance between management and worker. Algorithms are closely guarded secrets. Their decision-making processes are hidden. It’s a black-box: perhaps you have some understanding of the data that went in, and you see the result that comes out, but you have no idea of what goes on in between.

Algorithms at Work
Here are a few examples of algorithms already at work.

At Amazon’s fulfillment center in south-east Melbourne, they set the pace for “pickers,” who have timers on their scanners showing how long they have to find the next item. As soon as they scan that item, the timer resets for the next. All at a “not quite walking, not quite running” speed.

Or how about AI determining your success in a job interview? More than 700 companies have trialed such technology. US developer HireVue says its software speeds up the hiring process by 90 percent by having applicants answer identical questions and then scoring them according to language, tone, and facial expressions.

Granted, human assessments during job interviews are notoriously flawed. Algorithms, however, can also be biased. The classic example is the COMPAS software used by US judges, probation, and parole officers to rate a person’s risk of re-offending. In 2016 a ProPublica investigation showed the algorithm was heavily discriminatory, incorrectly classifying black subjects as higher risk 45 percent of the time, compared with 23 percent for white subjects.
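One way to read those two percentages is as false positive rates: of the people who did not go on to re-offend, the share who were still rated higher risk. The sketch below shows the arithmetic under that assumption. The counts are invented so that the rates come out to the reported 45 and 23 percent; they are not ProPublica's data.

```python
def false_positive_rate(flagged_high_risk: int, did_not_reoffend: int) -> float:
    """Share of people who did NOT re-offend but were still rated higher risk."""
    if did_not_reoffend <= 0:
        raise ValueError("need at least one person who did not re-offend")
    return flagged_high_risk / did_not_reoffend


# Hypothetical counts: (wrongly flagged as higher risk, total who did not re-offend).
groups = {"black": (450, 1000), "white": (230, 1000)}

rates = {name: false_positive_rate(*counts) for name, counts in groups.items()}
disparity = rates["black"] / rates["white"]

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} wrongly rated higher risk")
print(f"ratio between groups: {disparity:.2f}x")
```

Comparing error rates group by group like this is only possible when the scores and the outcomes are both available for inspection.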

How Gig Workers Cope
Algorithms do what their code tells them to do. The problem is this code is rarely available. This makes them difficult to scrutinize, or even understand.

Nowhere is this more evident than in the gig economy. Uber, Lyft, Deliveroo, and other platforms could not exist without algorithms allocating, monitoring, evaluating, and rewarding work.

Over the past year Uber Eats’ bicycle couriers and drivers, for instance, have blamed unexplained changes to the algorithm for slashing their jobs and incomes.

Riders can’t be 100 percent sure it was all down to the algorithm. But that’s part of the problem. The fact those who depend on the algorithm don’t know one way or the other has a powerful influence on them.

This is a key result from our interviews with 58 food-delivery couriers. Most knew their jobs were allocated by an algorithm (via an app). They knew the app collected data. What they didn’t know was how data was used to award them work.

In response, they developed a range of strategies (or guessed how) to “win” more jobs, such as accepting gigs as quickly as possible and waiting in “magic” locations. Ironically, these attempts to please the algorithm often meant losing the very flexibility that was one of the attractions of gig work.

The information asymmetry created by algorithmic management has two profound effects. First, it threatens to entrench systemic biases, the type of discrimination hidden within the COMPAS algorithm for years. Second, it compounds the power imbalance between management and worker.

Our data also confirmed others’ findings that it is almost impossible to complain about the decisions of the algorithm. Workers often do not know the exact basis of those decisions, and there’s no one to complain to anyway. When Uber Eats bicycle couriers asked why their income was plummeting, for example, responses from the company advised them “we have no manual control over how many deliveries you receive.”

Broader Lessons
When algorithmic management operates as a “black box,” one of the consequences is that it can become an indirect control mechanism. Thus far under-appreciated by Australian regulators, this control mechanism has enabled platforms to mobilize a reliable and scalable workforce while avoiding employer responsibilities.

“The absence of concrete evidence about how the algorithms operate”, the Victorian government’s inquiry into the “on-demand” workforce notes in its report, “makes it hard for a driver or rider to complain if they feel disadvantaged by one.”

The report, published in June, also found it is “hard to confirm if concern over algorithm transparency is real.”

But it is precisely the fact it is hard to confirm that’s the problem. How can we start to even identify, let alone resolve, issues like algorithmic management?

Fair conduct standards to ensure transparency and accountability are a start. One example is the Fair Work initiative, led by the Oxford Internet Institute. The initiative is bringing together researchers with platforms, workers, unions, and regulators to develop global principles for work in the platform economy. This includes “fair management,” which focuses on how transparent the results and outcomes of algorithms are for workers.

Understanding of the impact of algorithms on all forms of work is still in its infancy. It demands greater scrutiny and research. Without human oversight based on agreed principles we risk inviting HAL into our workplaces.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Image Credit: PickPik

Posted in Human Robots

#437303 The Deck Is Not Rigged: Poker and the ...

Tuomas Sandholm, a computer scientist at Carnegie Mellon University, is not a poker player—or much of a poker fan, in fact—but he is fascinated by the game for much the same reason as the great game theorist John von Neumann before him. Von Neumann, who died in 1957, viewed poker as the perfect model for human decision making, for finding the balance between skill and chance that accompanies our every choice. He saw poker as the ultimate strategic challenge, combining as it does not just the mathematical elements of a game like chess but the uniquely human, psychological angles that are more difficult to model precisely—a view shared years later by Sandholm in his research with artificial intelligence.

“Poker is the main benchmark and challenge program for games of imperfect information,” Sandholm told me on a warm spring afternoon in 2018, when we met in his offices in Pittsburgh. The game, it turns out, has become the gold standard for developing artificial intelligence.

Tall and thin, with wire-frame glasses and neat brown hair framing a friendly face, Sandholm is behind the creation of three computer programs designed to test their mettle against human poker players: Claudico, Libratus, and most recently, Pluribus. (When we met, Libratus was still a toddler and Pluribus didn’t yet exist.) The goal isn’t to solve poker, as such, but to create algorithms whose decision-making prowess in poker’s world of imperfect information and stochastic situations (situations that are randomly determined and unable to be predicted) can then be applied to other stochastic realms, like the military, business, government, cybersecurity, even health care.

While the first program, Claudico, was summarily beaten by human poker players—“one broke-ass robot,” an observer called it—Libratus has triumphed in a series of one-on-one, or heads-up, matches against some of the best online players in the United States.

Libratus relies on three main modules. The first involves a basic blueprint strategy for the whole game, allowing it to reach a much faster equilibrium than its predecessor. It includes an algorithm called the Monte Carlo Counterfactual Regret Minimization, which evaluates all future actions to figure out which one would cause the least amount of regret. Regret, of course, is a human emotion. Regret for a computer simply means realizing that an action that wasn’t chosen would have yielded a better outcome than one that was. “Intuitively, regret represents how much the AI regrets having not chosen that action in the past,” says Sandholm. The higher the regret, the higher the chance of choosing that action next time.
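The rule in that last sentence is commonly known as regret matching. Here is a minimal sketch of it for a single decision, using rock-paper-scissors as a stand-in. It is not Libratus's code and it is far simpler than Monte Carlo Counterfactual Regret Minimization; the function names and the example payoffs are assumptions made for illustration.

```python
def regret_matching(cumulative_regret):
    """Turn cumulative regrets into action probabilities.

    Only positive regrets count. If no action has positive regret,
    fall back to playing every action with equal probability.
    """
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(positive)] * len(positive)
    return [p / total for p in positive]


def update_regrets(cumulative_regret, payoffs, chosen):
    """Add, for each action, how much better it would have done than the chosen one."""
    return [r + (payoffs[a] - payoffs[chosen]) for a, r in enumerate(cumulative_regret)]


# Actions: 0 = rock, 1 = paper, 2 = scissors. Suppose the opponent played rock,
# so the payoffs for our three actions would have been tie, win, loss.
payoffs_vs_rock = [0.0, 1.0, -1.0]

regrets = [0.0, 0.0, 0.0]
regrets = update_regrets(regrets, payoffs_vs_rock, chosen=2)  # we played scissors
strategy = regret_matching(regrets)
print(regrets)   # regret for rock, paper, scissors
print(strategy)  # paper, the most regretted action, now gets the most weight
```

Repeating this update over many rounds of self-play is what drives the strategy toward an equilibrium; the full algorithm applies the same idea at every decision point in the game tree.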

It’s a useful way of thinking—but one that is incredibly difficult for the human mind to implement. We are notoriously bad at anticipating our future emotions. How much will we regret doing something? How much will we regret not doing something else? For us, it’s an emotionally laden calculus, and we typically fail to apply it in quite the right way. For a computer, it’s all about the computation of values. What does it regret not doing the most, the thing that would have yielded the highest possible expected value?

The second module is a sub-game solver that takes into account the mistakes the opponent has made so far and accounts for every hand she could possibly have. And finally, there is a self-improver. This is the area where data and machine learning come into play. It’s dangerous to try to exploit your opponent—it opens you up to the risk that you’ll get exploited right back, especially if you’re a computer program and your opponent is human. So instead of attempting to do that, the self-improver lets the opponent’s actions inform the areas where the program should focus. “That lets the opponent’s actions tell us where [they] think they’ve found holes in our strategy,” Sandholm explained. This allows the algorithm to develop a blueprint strategy to patch those holes.

It’s a very human-like adaptation, if you think about it. I’m not going to try to outmaneuver you head on. Instead, I’m going to see how you’re trying to outmaneuver me and respond accordingly. Sun-Tzu would surely approve. Watch how you’re perceived, not how you perceive yourself—because in the end, you’re playing against those who are doing the perceiving, and their opinion, right or not, is the only one that matters when you craft your strategy. Overnight, the algorithm patches up its overall approach according to the resulting analysis.

There’s one final thing Libratus is able to do: play in situations with unknown probabilities. There’s a concept in game theory known as the trembling hand: There are branches of the game tree that, under an optimal strategy, one should theoretically never get to; but with some probability, your all-too-human opponent’s hand trembles, they take a wrong action, and you’re suddenly in a totally unmapped part of the game. Before, that would spell disaster for the computer: An unmapped part of the tree means the program no longer knows how to respond. Now, there’s a contingency plan.

Of course, no algorithm is perfect. When Libratus is playing poker, it’s essentially working in a zero-sum environment. It wins, the opponent loses. The opponent wins, it loses. But while some real-life interactions really are zero-sum—cyber warfare comes to mind—many others are not nearly as straightforward: My win does not necessarily mean your loss. The pie is not fixed, and our interactions may be more positive-sum than not.

What’s more, real-life applications have to contend with something that a poker algorithm does not: the weights that are assigned to different elements of a decision. In poker, this is a simple value-maximizing process. But what is value in the human realm? Sandholm had to contend with this before, when he helped craft the world’s first kidney exchange. Do you want to be more efficient, giving the maximum number of kidneys as quickly as possible—or more fair, which may come at a cost to efficiency? Do you want as many lives as possible saved—or do some take priority at the cost of reaching more? Is there a preference for the length of the wait until a transplant? Do kids get preference? And on and on. It’s essential, Sandholm says, to separate means and the ends. To figure out the ends, a human has to decide what the goal is.

“The world will ultimately become a lot safer with the help of algorithms like Libratus,” Sandholm told me. I wasn’t sure what he meant. The last thing that most people would do is call poker, with its competition, its winners and losers, its quest to gain the maximum edge over your opponent, a haven of safety.

“Logic is good, and the AI is much better at strategic reasoning than humans can ever be,” he explained. “It’s taking out irrationality, emotionality. And it’s fairer. If you have an AI on your side, it can lift non-experts to the level of experts. Naïve negotiators will suddenly have a better weapon. We can start to close off the digital divide.”

It was an optimistic note to end on—a zero-sum, competitive game yielding a more ultimately fair and rational world.

I wanted to learn more, to see if it was really possible that mathematics and algorithms could ultimately be the future of more human, more psychological interactions. And so, later that day, I accompanied Nick Nystrom, the chief scientist of the Pittsburgh Supercomputing Center (the place that runs all of Sandholm’s poker-AI programs), to the actual processing center that makes undertakings like Libratus possible.

A half-hour drive found us in a parking lot by a large glass building. I’d expected something more futuristic, not the same square, corporate glass squares I’ve seen countless times before. The inside, however, was more promising. First the security checkpoint. Then the ride in the elevator — down, not up, to roughly three stories below ground, where we found ourselves in a maze of corridors with card readers at every juncture to make sure you don’t slip through undetected. A red-lit panel formed the final barrier, leading to a small sliver of space between two sets of doors. I could hear a loud hum coming from the far side.

“Let me tell you what you’re going to see before we walk in,” Nystrom told me. “Once we get inside, it will be too loud to hear.”

I was about to witness the heart of the supercomputing center: 27 large containers, in neat rows, each housing multiple processors with speeds and abilities too great for my mind to wrap around. Inside, the temperature is by turns arctic and tropic, so-called “cold” rows alternating with “hot”—fans operate around the clock to cool the processors as they churn through millions of giga, mega, tera, peta and other ever-increasing scales of data bytes. In the cool rows, robotic-looking lights blink green and blue in orderly progression. In the hot rows, a jumble of multicolored wires crisscrosses in tangled skeins.

In the corners stood machines that had outlived their heyday. There was Sherlock, an old Cray model, that warmed my heart. There was a sad nameless computer, whose anonymity was partially compensated for by the Warhol soup cans adorning its cage (an homage to Warhol’s Pittsburghian origins).

And where does Libratus live, I asked? Which of these computers is Bridges, the computer that runs the AI Sandholm and I had been discussing?

Bridges, it turned out, isn’t a single computer. It’s a system with processing power beyond comprehension. It takes over two and a half petabytes to run Libratus. A single petabyte is a million gigabytes: You could watch over 13 years of HD video, store 10 billion photos, catalog the contents of the entire Library of Congress word for word. That’s a whole lot of computing power. And that’s only to succeed at heads-up poker, in limited circumstances.

Yet despite the breathtaking computing power at its disposal, Libratus is still severely limited. Yes, it beat its opponents where Claudico failed. But the poker professionals weren’t allowed to use many of the tools of their trade, including the opponent analysis software that they depend on in actual online games. And humans tire. Libratus can churn for a two-week marathon, where the human mind falters.

But there’s still much it can’t do: play more opponents, play live, or win every time. There’s more humanity in poker than Libratus has yet conquered. “There’s this belief that it’s all about statistics and correlations. And we actually don’t believe that,” Nystrom explained as we left Bridges behind. “Once in a while correlations are good, but in general, they can also be really misleading.”

Two years later, the Sandholm lab will produce Pluribus. Pluribus will be able to play against five players—and will run on a single computer. Much of the human edge will have evaporated in a short, very short time. The algorithms have improved, as have the computers. AI, it seems, has gained by leaps and bounds.

So does that mean that, ultimately, the algorithmic can indeed beat out the human, that computation can untangle the web of human interaction by discerning “the little tactics of deception, of asking yourself what is the other man going to think I mean to do,” as von Neumann put it?

Long before I’d spoken to Sandholm, I’d met Kevin Slavin, a polymath of sorts whose past careers have included founding a game design company and an interactive art space and launching the Playful Systems group at MIT’s Media Lab. Slavin has a decidedly different view from the creators of Pluribus. “On the one hand, [von Neumann] was a genius,” Kevin Slavin reflects. “But the presumptuousness of it.”

Slavin is firmly on the side of the gambler, who recognizes uncertainty for what it is and thus is able to take calculated risks when necessary, all the while tempering confidence in the outcome. The most you can do is put yourself in the path of luck, but to think you can guess with certainty the actual outcome is a presumptuousness the true poker player forgoes. For Slavin, the wonder of computers is “that they can generate this fabulous, complex randomness.” His opinion of the algorithmic assaults on chance? “This is their moment,” he said. “But it’s the exact opposite of what’s really beautiful about a computer, which is that it can do something that’s actually unpredictable. That, to me, is the magic.”

Will they actually succeed in making the unpredictable predictable, though? That’s what I want to know. Because everything I’ve seen tells me that absolute success is impossible. The deck is not rigged.

“It’s an unbelievable amount of work to get there. What do you get at the end? Let’s say they’re successful. Then we live in a world where there’s no God, agency, or luck,” Slavin responded.

“I don’t want to live there,” he added. “I just don’t want to live there.”

Luckily, it seems that for now, he won’t have to. There are more things in life than are yet written in the algorithms. We have no reliable lie detection software—whether in the face, the skin, or the brain. In a recent test of bluffing in poker, computer face recognition failed miserably. We can get at discomfort, but we can’t get at the reasons for that discomfort: lying, fatigue, stress—they all look much the same. And humans, of course, can also mimic stress where none exists, complicating the picture even further.

Pluribus may turn out to be powerful, but von Neumann’s challenge still stands: The true nature of games, the most human of the human, remains to be conquered.

This article was originally published on Undark. Read the original article.

Image Credit: José Pablo Iglesias / Unsplash

Posted in Human Robots

#437276 Cars Will Soon Be Able to Sense and ...

Imagine you’re on your daily commute to work, driving along a crowded highway while trying to resist looking at your phone. You’re already a little stressed out because you didn’t sleep well, woke up late, and have an important meeting in a couple of hours, and you just don’t feel like your best self.

Suddenly another car cuts you off, coming way too close to your front bumper as it changes lanes. Your already-simmering emotions leap into overdrive, and you lay on the horn and shout curses no one can hear.

Except someone—or, rather, something—can hear: your car. Hearing your angry words, aggressive tone, and raised voice, and seeing your furrowed brow, the onboard computer goes into “soothe” mode, as it’s been programmed to do when it detects that you’re angry. It plays relaxing music at just the right volume, releases a puff of light lavender-scented essential oil, and maybe even says some meditative quotes to calm you down.

What do you think—creepy? Helpful? Awesome? Weird? Would you actually calm down, or get even more angry that a car is telling you what to do?

Scenarios like this (maybe without the lavender oil part) may not be imaginary for much longer, especially if companies working to integrate emotion-reading artificial intelligence into new cars have their way. And it wouldn’t just be a matter of your car soothing you when you’re upset—depending what sort of regulations are enacted, the car’s sensors, camera, and microphone could collect all kinds of data about you and sell it to third parties.

Computers and Feelings
Just as AI systems can be trained to tell the difference between a picture of a dog and one of a cat, they can learn to differentiate between an angry tone of voice or facial expression and a happy one. In fact, there’s a whole branch of machine intelligence devoted to creating systems that can recognize and react to human emotions; it’s called affective computing.

Emotion-reading AIs learn what different emotions look and sound like from large sets of labeled data; “smile = happy,” “tears = sad,” “shouting = angry,” and so on. The most sophisticated systems can likely even pick up on the micro-expressions that flash across our faces before we consciously have a chance to control them, as detailed by Daniel Goleman in his groundbreaking book Emotional Intelligence.
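To make "learning from labeled data" concrete, here is a toy sketch: a nearest-centroid classifier that averages the examples for each label and assigns a new example to the closest average. It is not how any commercial emotion-recognition system works. The two features (smile intensity and voice loudness, each scaled from 0 to 1) and all the numbers are made up for illustration.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of the training examples for each label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose average example is closest to the new one."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical labeled data: (smile_intensity, voice_loudness) -> emotion.
labeled = [
    ((0.9, 0.3), "happy"),
    ((0.8, 0.4), "happy"),
    ((0.1, 0.9), "angry"),
    ((0.2, 0.8), "angry"),
]

centroids = train_centroids(labeled)
print(predict(centroids, (0.7, 0.2)))   # big smile, quiet voice
print(predict(centroids, (0.1, 0.95)))  # no smile, loud voice
```

Real systems replace the two hand-picked features with millions of pixels and audio samples and the averaging with a neural network, but the dependence on labeled examples is the same.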

Affective computing company Affectiva, a spinoff from MIT Media Lab, says its algorithms are trained on 5,313,751 face videos (videos of people’s faces as they do an activity, have a conversation, or react to stimuli) representing about 2 billion facial frames. Fascinatingly, Affectiva claims its software can even account for cultural differences in emotional expression (for example, it’s more normalized in Western cultures to be very emotionally expressive, whereas Asian cultures tend to favor stoicism and politeness), as well as gender differences.

But Why?
As reported in Motherboard, companies like Affectiva, Cerence, Xperi, and Eyeris have plans in the works to partner with automakers and install emotion-reading AI systems in new cars. Regulations passed last year in Europe and a bill just introduced this month in the US Senate are helping make the idea of “driver monitoring” less weird, mainly by emphasizing the safety benefits of preemptive warning systems for tired or distracted drivers (remember that part in the beginning about sneaking glances at your phone? Yeah, that).

Drowsiness and distraction can’t really be called emotions, though—so why are they being lumped under an umbrella that has a lot of other implications, including what many may consider an eerily Big Brother-esque violation of privacy?

Our emotions, in fact, are among the most private things about us, since we are the only ones who know their true nature. We’ve developed the ability to hide and disguise our emotions, and this can be a useful skill at work, in relationships, and in scenarios that require negotiation or putting on a game face.

And I don’t know about you, but I’ve had more than one good cry in my car. It’s kind of the perfect place for it: private, secluded, soundproof.

Putting systems into cars that can recognize and collect data about our emotions under the guise of preventing accidents due to the state of mind of being distracted or the physical state of being sleepy, then, seems a bit like a bait and switch.

A Highway to Privacy Invasion?
European regulations will help keep driver data from being used for any purpose other than ensuring a safer ride. But the US is lagging behind on the privacy front, with car companies largely free from any enforceable laws that would keep them from using driver data as they please.

Affectiva lists the following as use cases for occupant monitoring in cars: personalizing content recommendations, providing alternate route recommendations, adapting environmental conditions like lighting and heating, and understanding user frustration with virtual assistants and designing those assistants to be emotion-aware so that they’re less frustrating.

Our phones already do the first two (though, granted, we’re not supposed to look at them while we drive, but most cars now let you use Bluetooth to display your phone’s content on the dashboard), and the third is simply a matter of reaching a hand out to turn a dial or press a button. The last seems like a solution for a problem that wouldn’t exist without said… solution.

Despite how unnecessary and unsettling it may seem, though, emotion-reading AI isn’t going away, in cars or other products and services where it might provide value.

Besides automotive AI, Affectiva also makes software for clients in the advertising space. With consent, the built-in camera on users’ laptops records them while they watch ads, gauging their emotional response, what kind of marketing is most likely to engage them, and how likely they are to buy a given product. Emotion-recognition tech is also being used or considered for use in mental health applications, call centers, fraud monitoring, and education, among others.

In a 2015 TED talk, Affectiva co-founder Rana El-Kaliouby told her audience that we’re living in a world increasingly devoid of emotion, and her goal was to bring emotions back into our digital experiences. Soon they’ll be in our cars, too; whether the benefits will outweigh the costs remains to be seen.

Image Credit: Free-Photos from Pixabay

Posted in Human Robots

#437269 DeepMind’s Newest AI Programs Itself ...

When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.

Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would come a decade and a half later, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.

Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.

Key to deep learning’s success is the fact that the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it out.

Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks that have, to date, been the sole domain of the world’s top computer scientists (and that take them years to complete).

In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.

Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.

DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.

Pavlov’s Digital Dog
First, a little background.

Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.

The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?

While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.

In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.

Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.
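To make the loop above concrete, here is a minimal sketch using tabular Q-learning, a standard hand-designed technique and not DeepMind's method. Breakout is swapped for a hypothetical five-cell corridor in which reaching the right end pays one point. The function names, step size, discount, and exploration rate are all assumptions chosen for illustration.

```python
import random

def train_q_table(n_positions=5, episodes=2000, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in the middle cell. Stepping onto the rightmost cell
    pays 1 point, stepping onto the leftmost cell pays 0, and either one
    ends the episode. Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    # q[state][action] is the current estimate of future reward.
    q = [[0.0, 0.0] for _ in range(n_positions)]
    for _ in range(episodes):
        state = n_positions // 2
        while 0 < state < n_positions - 1:
            # Explore occasionally (or when undecided), otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = state + (1 if action == 1 else -1)
            reward = 1.0 if next_state == n_positions - 1 else 0.0
            terminal = next_state in (0, n_positions - 1)
            # The value estimate: reward now plus discounted best future value.
            target = reward if terminal else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_action(q, state):
    """Return the action the learned values currently favor (ties go right)."""
    return 0 if q[state][0] > q[state][1] else 1

q_table = train_q_table()
policy = [greedy_action(q_table, s) for s in (1, 2, 3)]
```

After training, `policy` holds the preferred move in each non-terminal cell. The update line is the part researchers normally design by hand, and it is the part LPG is meant to discover for itself.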

Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.

LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
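For a rough feel of what such an environment looks like in code, here is a minimal gridworld sketch. It is not DeepMind's implementation: the class name, the action names, and the choice to remove an object once it is collected are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical action set: each action shifts the agent by (row, col).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

@dataclass
class GridWorld:
    rows: int
    cols: int
    objects: dict            # maps (row, col) to a reward or penalty
    agent: tuple = (0, 0)    # starting square, assumed to be the top-left

    def step(self, action):
        """Move one square, staying inside the grid, and collect any object."""
        d_row, d_col = MOVES[action]
        row = min(max(self.agent[0] + d_row, 0), self.rows - 1)
        col = min(max(self.agent[1] + d_col, 0), self.cols - 1)
        self.agent = (row, col)
        # An object is consumed when the agent lands on it (an assumption).
        reward = self.objects.pop(self.agent, 0.0)
        return self.agent, reward

env = GridWorld(rows=3, cols=3, objects={(0, 1): 1.0, (1, 1): -1.0})
position, reward = env.step("right")   # lands on the +1 object
```

Varying `rows`, `cols`, and the contents of `objects` gives the family of differently sized grids with fixed or random layouts that the article describes.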

In LPG’s case, however, there was no hand-designed value function to guide that learning.

Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
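The two-level idea can be caricatured in a few lines. The sketch below is not LPG. It swaps DeepMind's learned, neural-network update rule for a brute-force search over two hypothetical knobs (a step size and a baseline to compare rewards against), and swaps the training environments for a made-up two-armed bandit. It only shows the structure: an inner agent learns with whatever rule it is given, and an outer loop keeps the rule whose agent earned the most.

```python
from itertools import product

PAYOUTS = [0.2, 1.0]  # hypothetical two-armed bandit with fixed payouts

def lifetime_return(step_size, baseline, steps=20):
    """Inner loop: an agent learns with the given update rule.

    The agent always pulls the arm it currently prefers (ties go to arm 0)
    and nudges that preference by step_size * (reward - baseline).
    Returns the total reward collected over its lifetime.
    """
    prefs = [0.0, 0.0]
    total = 0.0
    for _ in range(steps):
        arm = 0 if prefs[0] >= prefs[1] else 1
        reward = PAYOUTS[arm]
        prefs[arm] += step_size * (reward - baseline)
        total += reward
    return total

def meta_search(step_sizes=(0.1, 0.5), baselines=(0.0, 0.5, 1.5)):
    """Outer loop: try each candidate rule and keep the best-performing one."""
    candidates = product(step_sizes, baselines)
    return max(candidates, key=lambda rule: lifetime_return(*rule))

best_step_size, best_baseline = meta_search()
```

With a baseline of 0.0 the agent is content with the first arm it tries and never finds the better one. With a baseline of 0.5 it abandons the weak arm after one pull. The outer loop discovers that difference purely from results, which is the spirit, though not the machinery, of a meta-learner working out "what to predict."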

Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.

Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even in some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.

This is where there’s room for improvement and more research.

The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.

At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.

Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.

Image credit: Mike Szczepanski / Unsplash

Posted in Human Robots