Tag Archives: based

#437357 Algorithms Workers Can’t See Are ...

“I’m sorry, Dave. I’m afraid I can’t do that.” HAL’s cold, if polite, refusal to open the pod bay doors in 2001: A Space Odyssey has become a defining warning about putting too much trust in artificial intelligence, particularly if you work in space.

In the movies, when a machine decides to be the boss (or humans let it) things go wrong. Yet despite myriad dystopian warnings, control by machines is fast becoming our reality.

Algorithms—sets of instructions to solve a problem or complete a task—now drive everything from browser search results to better medical care.

They are helping design buildings. They are speeding up trading on financial markets, making and losing fortunes in micro-seconds. They are calculating the most efficient routes for delivery drivers.

In the workplace, self-learning algorithmic computer systems are being introduced by companies to assist in areas such as hiring, setting tasks, measuring productivity, evaluating performance, and even terminating employment: “I’m sorry, Dave. I’m afraid you are being made redundant.”

Giving self‐learning algorithms the responsibility to make and execute decisions affecting workers is called “algorithmic management.” It carries a host of risks in depersonalizing management systems and entrenching pre-existing biases.

At an even deeper level, perhaps, algorithmic management entrenches a power imbalance between management and worker. Algorithms are closely guarded secrets. Their decision-making processes are hidden. It’s a black-box: perhaps you have some understanding of the data that went in, and you see the result that comes out, but you have no idea of what goes on in between.

Algorithms at Work
Here are a few examples of algorithms already at work.

At Amazon’s fulfillment center in south-east Melbourne, they set the pace for “pickers,” who have timers on their scanners showing how long they have to find the next item. As soon as they scan that item, the timer resets for the next. All at a “not quite walking, not quite running” speed.

Or how about AI determining your success in a job interview? More than 700 companies have trialed such technology. US developer HireVue says its software speeds up the hiring process by 90 percent by having applicants answer identical questions and then scoring them according to language, tone, and facial expressions.

Granted, human assessments during job interviews are notoriously flawed. Algorithms,however, can also be biased. The classic example is the COMPAS software used by US judges, probation, and parole officers to rate a person’s risk of re-offending. In 2016 a ProPublica investigation showed the algorithm was heavily discriminatory, incorrectly classifying black subjects as higher risk 45 percent of the time, compared with 23 percent for white subjects.

How Gig Workers Cope
Algorithms do what their code tells them to do. The problem is this code is rarely available. This makes them difficult to scrutinize, or even understand.

Nowhere is this more evident than in the gig economy. Uber, Lyft, Deliveroo, and other platforms could not exist without algorithms allocating, monitoring, evaluating, and rewarding work.

Over the past year Uber Eats’ bicycle couriers and drivers, for instance, have blamed unexplained changes to the algorithm for slashing their jobs, and incomes.

Rider’s can’t be 100 percent sure it was all down to the algorithm. But that’s part of the problem. The fact those who depend on the algorithm don’t know one way or the other has a powerful influence on them.

This is a key result from our interviews with 58 food-delivery couriers. Most knew their jobs were allocated by an algorithm (via an app). They knew the app collected data. What they didn’t know was how data was used to award them work.

In response, they developed a range of strategies (or guessed how) to “win” more jobs, such as accepting gigs as quickly as possible and waiting in “magic” locations. Ironically, these attempts to please the algorithm often meant losing the very flexibility that was one of the attractions of gig work.

The information asymmetry created by algorithmic management has two profound effects. First, it threatens to entrench systemic biases, the type of discrimination hidden within the COMPAS algorithm for years. Second, it compounds the power imbalance between management and worker.

Our data also confirmed others’ findings that it is almost impossible to complain about the decisions of the algorithm. Workers often do not know the exact basis of those decisions, and there’s no one to complain to anyway. When Uber Eats bicycle couriers asked for reasons about their plummeting income, for example, responses from the company advised them “we have no manual control over how many deliveries you receive.”

Broader Lessons
When algorithmic management operates as a “black box” one of the consequences is that it is can become an indirect control mechanism. Thus far under-appreciated by Australian regulators, this control mechanism has enabled platforms to mobilize a reliable and scalable workforce while avoiding employer responsibilities.

“The absence of concrete evidence about how the algorithms operate”, the Victorian government’s inquiry into the “on-demand” workforce notes in its report, “makes it hard for a driver or rider to complain if they feel disadvantaged by one.”

The report, published in June, also found it is “hard to confirm if concern over algorithm transparency is real.”

But it is precisely the fact it is hard to confirm that’s the problem. How can we start to even identify, let alone resolve, issues like algorithmic management?

Fair conduct standards to ensure transparency and accountability are a start. One example is the Fair Work initiative, led by the Oxford Internet Institute. The initiative is bringing together researchers with platforms, workers, unions, and regulators to develop global principles for work in the platform economy. This includes “fair management,” which focuses on how transparent the results and outcomes of algorithms are for workers.

Understandings about impact of algorithms on all forms of work is still in its infancy. It demands greater scrutiny and research. Without human oversight based on agreed principles we risk inviting HAL into our workplaces.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Image Credit: PickPik Continue reading

Posted in Human Robots

#437337 6G Will Be 100 Times Faster Than ...

Though 5G—a next-generation speed upgrade to wireless networks—is scarcely up and running (and still nonexistent in many places) researchers are already working on what comes next. It lacks an official name, but they’re calling it 6G for the sake of simplicity (and hey, it’s tradition). 6G promises to be up to 100 times faster than 5G—fast enough to download 142 hours of Netflix in a second—but researchers are still trying to figure out exactly how to make such ultra-speedy connections happen.

A new chip, described in a paper in Nature Photonics by a team from Osaka University and Nanyang Technological University in Singapore, may give us a glimpse of our 6G future. The team was able to transmit data at a rate of 11 gigabits per second, topping 5G’s theoretical maximum speed of 10 gigabits per second and fast enough to stream 4K high-def video in real time. They believe the technology has room to grow, and with more development, might hit those blistering 6G speeds.

NTU final year PhD student Abhishek Kumar, Assoc Prof Ranjan Singh and postdoc Dr Yihao Yang. Dr Singh is holding the photonic topological insulator chip made from silicon, which can transmit terahertz waves at ultrahigh speeds. Credit: NTU Singapore
But first, some details about 5G and its predecessors so we can differentiate them from 6G.

Electromagnetic waves are characterized by a wavelength and a frequency; the wavelength is the distance a cycle of the wave covers (peak to peak or trough to trough, for example), and the frequency is the number of waves that pass a given point in one second. Cellphones use miniature radios to pick up electromagnetic signals and convert those signals into the sights and sounds on your phone.

4G wireless networks run on millimeter waves on the low- and mid-band spectrum, defined as a frequency of a little less (low-band) and a little more (mid-band) than one gigahertz (or one billion cycles per second). 5G kicked that up several notches by adding even higher frequency millimeter waves of up to 300 gigahertz, or 300 billion cycles per second. Data transmitted at those higher frequencies tends to be information-dense—like video—because they’re much faster.

The 6G chip kicks 5G up several more notches. It can transmit waves at more than three times the frequency of 5G: one terahertz, or a trillion cycles per second. The team says this yields a data rate of 11 gigabits per second. While that’s faster than the fastest 5G will get, it’s only the beginning for 6G. One wireless communications expert even estimates 6G networks could handle rates up to 8,000 gigabits per second; they’ll also have much lower latency and higher bandwidth than 5G.

Terahertz waves fall between infrared waves and microwaves on the electromagnetic spectrum. Generating and transmitting them is difficult and expensive, requiring special lasers, and even then the frequency range is limited. The team used a new material to transmit terahertz waves, called photonic topological insulators (PTIs). PTIs can conduct light waves on their surface and edges rather than having them run through the material, and allow light to be redirected around corners without disturbing its flow.

The chip is made completely of silicon and has rows of triangular holes. The team’s research showed the chip was able to transmit terahertz waves error-free.

Nanyang Technological University associate professor Ranjan Singh, who led the project, said, “Terahertz technology […] can potentially boost intra-chip and inter-chip communication to support artificial intelligence and cloud-based technologies, such as interconnected self-driving cars, which will need to transmit data quickly to other nearby cars and infrastructure to navigate better and also to avoid accidents.”

Besides being used for AI and self-driving cars (and, of course, downloading hundreds of hours of video in seconds), 6G would also make a big difference for data centers, IoT devices, and long-range communications, among other applications.

Given that 5G networks are still in the process of being set up, though, 6G won’t be coming on the scene anytime soon; a recent whitepaper on 6G from Japanese company NTTDoCoMo estimates we’ll see it in 2030, pointing out that wireless connection tech generations have thus far been spaced about 10 years apart; we got 3G in the early 2000s, 4G in 2010, and 5G in 2020.

In the meantime, as 6G continues to develop, we’re still looking forward to the widespread adoption of 5G.

Image Credit: Hans Braxmeier from Pixabay Continue reading

Posted in Human Robots

#437301 The Global Work Crisis: Automation, the ...

The alarm bell rings. You open your eyes, come to your senses, and slide from dream state to consciousness. You hit the snooze button, and eventually crawl out of bed to the start of yet another working day.

This daily narrative is experienced by billions of people all over the world. We work, we eat, we sleep, and we repeat. As our lives pass day by day, the beating drums of the weekly routine take over and years pass until we reach our goal of retirement.

A Crisis of Work
We repeat the routine so that we can pay our bills, set our kids up for success, and provide for our families. And after a while, we start to forget what we would do with our lives if we didn’t have to go back to work.

In the end, we look back at our careers and reflect on what we’ve achieved. It may have been the hundreds of human interactions we’ve had; the thousands of emails read and replied to; the millions of minutes of physical labor—all to keep the global economy ticking along.

According to Gallup’s World Poll, only 15 percent of people worldwide are actually engaged with their jobs. The current state of “work” is not working for most people. In fact, it seems we as a species are trapped by a global work crisis, which condemns people to cast away their time just to get by in their day-to-day lives.

Technologies like artificial intelligence and automation may help relieve the work burdens of millions of people—but to benefit from their impact, we need to start changing our social structures and the way we think about work now.

The Specter of Automation
Automation has been ongoing since the Industrial Revolution. In recent decades it has taken on a more elegant guise, first with physical robots in production plants, and more recently with software automation entering most offices.

The driving goal behind much of this automation has always been productivity and hence, profits: technology that can act as a multiplier on what a single human can achieve in a day is of huge value to any company. Powered by this strong financial incentive, the quest for automation is growing ever more pervasive.

But if automation accelerates or even continues at its current pace and there aren’t strong social safety nets in place to catch the people who are negatively impacted (such as by losing their jobs), there could be a host of knock-on effects, including more concentrated wealth among a shrinking elite, more strain on government social support, an increase in depression and drug dependence, and even violent social unrest.

It seems as though we are rushing headlong into a major crisis, driven by the engine of accelerating automation. But what if instead of automation challenging our fragile status quo, we view it as the solution that can free us from the shackles of the Work Crisis?

The Way Out
In order to undertake this paradigm shift, we need to consider what society could potentially look like, as well as the problems associated with making this change. In the context of these crises, our primary aim should be for a system where people are not obligated to work to generate the means to survive. This removal of work should not threaten access to food, water, shelter, education, healthcare, energy, or human value. In our current system, work is the gatekeeper to these essentials: one can only access these (and even then often in a limited form), if one has a “job” that affords them.

Changing this system is thus a monumental task. This comes with two primary challenges: providing people without jobs with financial security, and ensuring they maintain a sense of their human value and worth. There are several measures that could be implemented to help meet these challenges, each with important steps for society to consider.

Universal basic income (UBI)

UBI is rapidly gaining support, and it would allow people to become shareholders in the fruits of automation, which would then be distributed more broadly.

UBI trials have been conducted in various countries around the world, including Finland, Kenya, and Spain. The findings have generally been positive on the health and well-being of the participants, and showed no evidence that UBI disincentivizes work, a common concern among the idea’s critics. The most recent popular voice for UBI has been that of former US presidential candidate Andrew Yang, who now runs a non-profit called Humanity Forward.

UBI could also remove wasteful bureaucracy in administering welfare payments (since everyone receives the same amount, there’s no need to prevent false claims), and promote the pursuit of projects aligned with peoples’ skill sets and passions, as well as quantifying the value of tasks not recognized by economic measures like Gross Domestic Product (GDP). This includes looking after children and the elderly at home.

How a UBI can be initiated with political will and social backing and paid for by governments has been hotly debated by economists and UBI enthusiasts. Variables like how much the UBI payments should be, whether to implement taxes such as Yang’s proposed valued added tax (VAT), whether to replace existing welfare payments, the impact on inflation, and the impact on “jobs” from people who would otherwise look for work require additional discussion. However, some have predicted the inevitability of UBI as a result of automation.

Universal healthcare

Another major component of any society is the healthcare of its citizens. A move away from work would further require the implementation of a universal healthcare system to decouple healthcare from jobs. Currently in the US, and indeed many other economies, healthcare is tied to employment.

Universal healthcare such as Medicare in Australia is evidence for the adage “prevention is better than cure,” when comparing the cost of healthcare in the US with Australia on a per capita basis. This has already presented itself as an advancement in the way healthcare is considered. There are further benefits of a healthier population, including less time and money spent on “sick-care.” Healthy people are more likely and more able to achieve their full potential.

Reshape the economy away from work-based value

One of the greatest challenges in a departure from work is for people to find value elsewhere in life. Many people view their identities as being inextricably tied to their jobs, and life without a job is therefore a threat to one’s sense of existence. This presents a shift that must be made at both a societal and personal level.

A person can only seek alternate value in life when afforded the time to do so. To this end, we need to start reducing “work-for-a-living” hours towards zero, which is a trend we are already seeing in Europe. This should not come at the cost of reducing wages pro rata, but rather could be complemented by UBI or additional schemes where people receive dividends for work done by automation. This transition makes even more sense when coupled with the idea of deviating from using GDP as a measure of societal growth, and instead adopting a well-being index based on universal human values like health, community, happiness, and peace.

The crux of this issue is in transitioning away from the view that work gives life meaning and life is about using work to survive, towards a view of living a life that itself is fulfilling and meaningful. This speaks directly to notions from Maslow’s hierarchy of needs, where work largely addresses psychological and safety needs such as shelter, food, and financial well-being. More people should have a chance to grow beyond the most basic needs and engage in self-actualization and transcendence.

The question is largely around what would provide people with a sense of value, and the answers would differ as much as people do; self-mastery, building relationships and contributing to community growth, fostering creativity, and even engaging in the enjoyable aspects of existing jobs could all come into play.

Universal education

With a move towards a society that promotes the values of living a good life, the education system would have to evolve as well. Researchers have long argued for a more nimble education system, but universities and even most online courses currently exist for the dominant purpose of ensuring people are adequately skilled to contribute to the economy. These “job factories” only exacerbate the Work Crisis. In fact, the response often given by educational institutions to the challenge posed by automation is to find new ways of upskilling students, such as ensuring they are all able to code. As alluded to earlier, this is a limited and unimaginative solution to the problem we are facing.

Instead, education should be centered on helping people acknowledge the current crisis of work and automation, teach them how to derive value that is decoupled from work, and enable people to embrace progress as we transition to the new economy.

Disrupting the Status Quo
While we seldom stop to think about it, much of the suffering faced by humanity is brought about by the systemic foe that is the Work Crisis. The way we think about work has brought society far and enabled tremendous developments, but at the same time it has failed many people. Now the status quo is threatened by those very developments as we progress to an era where machines are likely to take over many job functions.

This impending paradigm shift could be a threat to the stability of our fragile system, but only if it is not fully anticipated. If we prepare for it appropriately, it could instead be the key not just to our survival, but to a better future for all.

Image Credit: mostafa meraji from Pixabay Continue reading

Posted in Human Robots

#437293 These Scientists Just Completed a 3D ...

Human brain maps are a dime a dozen these days. Maps that detail neurons in a certain region. Maps that draw out functional connections between those cells. Maps that dive deeper into gene expression. Or even meta-maps that combine all of the above.

But have you ever wondered: how well do those maps represent my brain? After all, no two brains are alike. And if we’re ever going to reverse-engineer the brain as a computer simulation—as Europe’s Human Brain Project is trying to do—shouldn’t we ask whose brain they’re hoping to simulate?

Enter a new kind of map: the Julich-Brain, a probabilistic map of human brains that accounts for individual differences using a computational framework. Rather than generating a static PDF of a brain map, the Julich-Brain atlas is also dynamic, in that it continuously changes to incorporate more recent brain mapping results. So far, the map has data from over 24,000 thinly sliced sections from 23 postmortem brains covering most years of adulthood at the cellular level. But the atlas can also continuously adapt to progress in mapping technologies to aid brain modeling and simulation, and link to other atlases and alternatives.

In other words, rather than “just another” human brain map, the Julich-Brain atlas is its own neuromapping API—one that could unite previous brain-mapping efforts with more modern methods.

“It is exciting to see how far the combination of brain research and digital technologies has progressed,” said Dr. Katrin Amunts of the Institute of Neuroscience and Medicine at Research Centre Jülich in Germany, who spearheaded the study.

The Old Dogma
The Julich-Brain atlas embraces traditional brain-mapping while also yanking the field into the 21st century.

First, the new atlas includes the brain’s cytoarchitecture, or how brain cells are organized. As brain maps go, these kinds of maps are the oldest and most fundamental. Rather than exploring how neurons talk to each other functionally—which is all the rage these days with connectome maps—cytoarchitecture maps draw out the physical arrangement of neurons.

Like a census, these maps literally capture how neurons are distributed in the brain, what they look like, and how they layer within and between different brain regions.

Because neurons aren’t packed together the same way between different brain regions, this provides a way to parse the brain into areas that can be further studied. When we say the brain’s “memory center,” the hippocampus, or the emotion center, the “amygdala,” these distinctions are based on cytoarchitectural maps.

Some may call this type of mapping “boring.” But cytoarchitecture maps form the very basis of any sort of neuroscience understanding. Like hand-drawn maps from early explorers sailing to the western hemisphere, these maps provide the brain’s geographical patterns from which we try to decipher functional connections. If brain regions are cities, then cytoarchitecture maps attempt to show trading or other “functional” activities that occur in the interlinking highways.

You might’ve heard of the most common cytoarchitecture map used today: the Brodmann map from 1909 (yup, that old), which divided the brain into classical regions based on the cells’ morphology and location. The map, while impactful, wasn’t able to account for brain differences between people. More recent brain-mapping technologies have allowed us to dig deeper into neuronal differences and divide the brain into more regions—180 areas in the cortex alone, compared with 43 in the original Brodmann map.

The new study took inspiration from that age-old map and transformed it into a digital ecosystem.

A Living Atlas
Work began on the Julich-Brain atlas in the mid-1990s, with a little help from the crowd.

The preparation of human tissue and its microstructural mapping, analysis, and data processing is incredibly labor-intensive, the authors lamented, making it impossible to do for the whole brain at high resolution in just one lab. To build their “Google Earth” for the brain, the team hooked up with EBRAINS, a shared computing platform set up by the Human Brain Project to promote collaboration between neuroscience labs in the EU.

First, the team acquired MRI scans of 23 postmortem brains, sliced the brains into wafer-thin sections, and scanned and digitized them. They corrected distortions from the chopping using data from the MRI scans and then lined up neurons in consecutive sections—picture putting together a 3D puzzle—to reconstruct the whole brain. Overall, the team had to analyze 24,000 brain sections, which prompted them to build a computational management system for individual brain sections—a win, because they could now track individual donor brains too.

Their method was quite clever. They first mapped their results to a brain template from a single person, called the MNI-Colin27 template. Because the reference brain was extremely detailed, this allowed the team to better figure out the location of brain cells and regions in a particular anatomical space.

However, MNI-Colin27’s brain isn’t your or my brain—or any of the brains the team analyzed. To dilute any of Colin’s potential brain quirks, the team also mapped their dataset onto an “average brain,” dubbed the ICBM2009c (catchy, I know).

This step allowed the team to “standardize” their results with everything else from the Human Connectome Project and the UK Biobank, kind of like adding their Google Maps layer to the existing map. To highlight individual brain differences, the team overlaid their dataset on existing ones, and looked for differences in the cytoarchitecture.

The microscopic architecture of neurons change between two areas (dotted line), forming the basis of different identifiable brain regions. To account for individual differences, the team also calculated a probability map (right hemisphere). Image credit: Forschungszentrum Juelich / Katrin Amunts
Based on structure alone, the brains were both remarkably different and shockingly similar at the same time. For example, the cortexes—the outermost layer of the brain—were physically different across donor brains of different age and sex. The region especially divergent between people was Broca’s region, which is traditionally linked to speech production. In contrast, parts of the visual cortex were almost identical between the brains.

The Brain-Mapping Future
Rather than relying on the brain’s visible “landmarks,” which can still differ between people, the probabilistic map is far more precise, the authors said.

What’s more, the map could also pool yet unmapped regions in the cortex—about 30 percent or so—into “gap maps,” providing neuroscientists with a better idea of what still needs to be understood.

“New maps are continuously replacing gap maps with progress in mapping while the process is captured and documented … Consequently, the atlas is not static but rather represents a ‘living map,’” the authors said.

Thanks to its structurally-sound architecture down to individual cells, the atlas can contribute to brain modeling and simulation down the line—especially for personalized brain models for neurological disorders such as seizures. Researchers can also use the framework for other species, and they can even incorporate new data-crunching processors into the workflow, such as mapping brain regions using artificial intelligence.

Fundamentally, the goal is to build shared resources to better understand the brain. “[These atlases] help us—and more and more researchers worldwide—to better understand the complex organization of the brain and to jointly uncover how things are connected,” the authors said.

Image credit: Richard Watts, PhD, University of Vermont and Fair Neuroimaging Lab, Oregon Health and Science University Continue reading

Posted in Human Robots

#437269 DeepMind’s Newest AI Programs Itself ...

When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.

Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would take a decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.

Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.

Key to deep learning’s success is the fact the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it.

Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks which have been, to date, the sole domain of the world’s top computer scientists (and take them years to write).

In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.

Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.

DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.

Pavlov’s Digital Dog
First, a little background.

Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.

The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?

While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience—weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and trip to the dentist.

In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.

Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.

Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.

LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.

Only in LPG’s case, it had no value function to guide that learning.

Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.

Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.

Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in training and even some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.

This is where there’s room for improvement and more research.

The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.

At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.

Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.

Image credit: Mike Szczepanski / Unsplash Continue reading

Posted in Human Robots