Tag Archives: change

#439023 In ‘Klara and the Sun,’ We Glimpse ...

In a store in the center of an unnamed city, humanoid robots are displayed alongside housewares and magazines. They watch the fast-moving world outside the window, anxiously awaiting the arrival of customers who might buy them and take them home. Among them is Klara, a particularly astute robot who loves the sun and wants to learn as much as possible about humans and the world they live in.

So begins Kazuo Ishiguro’s new novel Klara and the Sun, published earlier this month. The book, told from Klara’s perspective, portrays an eerie future society in which intelligent machines and other advanced technologies have been integrated into daily life, but not everyone is happy about it.

Technological unemployment, the progress of artificial intelligence, inequality, the safety and ethics of gene editing, increasing loneliness and isolation—all of which we’re grappling with today—show up in Ishiguro’s world. It’s like he hit a fast-forward button, mirroring back to us how things might play out if we don’t approach these technologies with caution and foresight.

The wealthy genetically edit or “lift” their children to set them up for success, while the poor have to make do with the regular old brains and bodies bequeathed them by evolution. Lifted and unlifted kids generally don’t mix, and this is just one of many sinister delineations between a new breed of haves and have-nots.

There’s anger about robots’ steady infiltration into everyday life, and questions about how similar their rights should be to those of humans. “First they take the jobs. Then they take the seats at the theater?” one woman fumes.

References to “changes” and “substitutions” allude to an economy where automation has eliminated millions of jobs. While “post-employed” people squat in abandoned buildings and fringe communities arm themselves in preparation for conflict, those whose livelihoods haven’t been destroyed can afford to have live-in housekeepers and buy Artificial Friends (or AFs) for their lonely children.

“The old traditional model that we still live with now—where most of us can get some kind of paid work in exchange for our services or the goods we make—has broken down,” Ishiguro said in a podcast discussion of the novel. “We’re not talking just about the difference between rich and poor getting bigger. We’re talking about a gap appearing between people who participate in society in an obvious way and people who do not.”

He has a point; as much as techno-optimists claim that the economic changes brought by automation and AI will give us all more free time, let us work less, and devote time to our passion projects, how would that actually play out? What would millions of “post-employed” people receiving basic income actually do with their time and energy?

In the novel, we don’t get much of a glimpse of this side of the equation, but we do see how the wealthy live. After a long wait, just as the store manager seems ready to give up on selling her, Klara is chosen by a 14-year-old girl named Josie, the daughter of a woman who wears “high-rank clothes” and lives in a large, sunny home outside the city. Cheerful and kind, Josie suffers from an unspecified illness that periodically flares up and leaves her confined to her bed for days at a time.

Her life seems somewhat bleak, the need for an AF clear. In this future world, the children of the wealthy no longer go to school together, instead studying alone at home on their digital devices. “Interaction meetings” are set up for them to learn to socialize, their parents carefully eavesdropping from the next room and trying not to intervene when there’s conflict or hurt feelings.

Klara does her best to be a friend, aide, and confidante to Josie while continuing to learn about the world around her and decode the mysteries of human behavior. We surmise that she was programmed with a basic ability to understand emotions, which evolves along with her other types of intelligence. “I believe I have many feelings. The more I observe, the more feelings become available to me,” she explains to one character.

Ishiguro does an excellent job of representing Klara’s mind: a blend of pre-determined programming, observation, and continuous learning. Her narration has qualities both robotic and human; we can tell when something has been programmed in—she “Gives Privacy” to the humans around her when that’s appropriate, for example—and when she’s figured something out for herself.

But the author maintains some mystery around Klara’s inner emotional life. “Does she actually understand human emotions, or is she just observing human emotions and simulating them within herself?” he said. “I suppose the question comes back to, what are our emotions as human beings? What do they amount to?”

Klara is particularly attuned to human loneliness, since she essentially was made to help prevent it. It is, in her view, peoples’ biggest fear, and something they’ll go to great lengths to avoid, yet can never fully escape. “Perhaps all humans are lonely,” she says.

Warding off loneliness through technology isn’t a futuristic idea, it’s something we’ve been doing for a long time, with the technologies at hand growing more and more sophisticated. Products like AFs already exist. There’s XiaoIce, a chatbot that uses “sentiment analysis” to keep its 660 million users engaged, and Azuma Hikari, a character-based AI designed to “bring comfort” to users whose lives lack emotional connection with other humans.

The mere existence of these tools would be sinister if it wasn’t for their widespread adoption; when millions of people use AIs to fill a void in their lives, it raises deeper questions about our ability to connect with each other and whether technology is building it up or tearing it down.

This isn’t the only big question the novel tackles. An overarching theme is one we’ve been increasingly contemplating as computers start to acquire more complex capabilities, like the beginnings of creativity or emotional awareness: What is it that truly makes us human?

“Do you believe in the human heart?” one character asks. “I don’t mean simply the organ, obviously. I’m speaking in the poetic sense. The human heart. Do you think there is such a thing? Something that makes each of us special and individual?”

The alternative, at least in the story, is that people don’t have a unique essence, but rather we’re all a blend of traits and personalities that can be reduced to strings of code. Our understanding of the brain is still elementary, but at some level, doesn’t all human experience boil down to the firing of billions of neurons between our ears? Will we one day—in a future beyond that painted by Ishiguro, but certainly foreshadowed by it—be able to “decode” our humanity to the point that there’s nothing mysterious left about it? “A human heart is bound to be complex,” Klara says. “But it must be limited.”

Whether or not you agree, Klara and the Sun is worth the read. It’s both a marvelous, engaging story about what it means to love and be human, and a prescient warning to approach technological change with caution and nuance. We’re already living in a world where AI keeps us company, influences our behavior, and is wreaking various forms of havoc. Ishiguro’s novel is a snapshot of one of our possible futures, told through the eyes of a robot who keeps you rooting for her to the end.

Image Credit: Marion Wellmann from Pixabay Continue reading

Posted in Human Robots

#439010 Video Friday: Nanotube-Powered Insect ...

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):

HRI 2021 – March 8-11, 2021 – [Online Conference]
RoboSoft 2021 – April 12-16, 2021 – [Online Conference]
ICRA 2021 – May 30-5, 2021 – Xi'an, China
Let us know if you have suggestions for next week, and enjoy today's videos.

If you’ve ever swatted a mosquito away from your face, only to have it return again (and again and again), you know that insects can be remarkably acrobatic and resilient in flight. Those traits help them navigate the aerial world, with all of its wind gusts, obstacles, and general uncertainty. Such traits are also hard to build into flying robots, but MIT Assistant Professor Kevin Yufeng Chen has built a system that approaches insects’ agility.

Chen’s actuators can flap nearly 500 times per second, giving the drone insect-like resilience. “You can hit it when it’s flying, and it can recover,” says Chen. “It can also do aggressive maneuvers like somersaults in the air.” And it weighs in at just 0.6 grams, approximately the mass of a large bumble bee. The drone looks a bit like a tiny cassette tape with wings, though Chen is working on a new prototype shaped like a dragonfly.

[ MIT ]

National Robotics Week is April 3-11, 2021!

[ NRW ]

This is in a motion capture environment, but still, super impressive!

[ Paper ]

Thanks Fan!

Why wait for Boston Dynamics to add an arm to your Spot if you can just do it yourself?

[ ETHZ ]

This video shows the deep-sea free swimming of soft robot in the South China Sea. The soft robot was grasped by a robotic arm on ‘HAIMA’ ROV and reached the bottom of the South China Sea (depth of 3,224 m). After the releasing, the soft robot was actuated with an on-board AC voltage of 8 kV at 1 Hz and demonstrated free swimming locomotion with its flapping fins.

Um, did they bring it back?

[ Nature ]

Quadruped Yuki Mini is 12 DOF robot equipped with a Raspberry Pi that runs ROS. Also, BUNNIES!

[ Lingkang Zhang ]

Thanks Lingkang!

Deployment of drone swarms usually relies on inter-agent communication or visual markers that are mounted on the vehicles to simplify their mutual detection. The vswarm package enables decentralized vision-based control of drone swarms without relying on inter-agent communication or visual fiducial markers. The results show that the drones can safely navigate in an outdoor environment despite substantial background clutter and difficult lighting conditions.

[ Vswarm ]

A conventional adopted method for operating a waiter robot is based on the static position control, where pre-defined goal positions are marked on a map. However, this solution is not optimal in a dynamic setting, such as in a coffee shop or an outdoor catering event, because the customers often change their positions. We explore an alternative human-robot interface design where a human operator communicates the identity of the customer to the robot instead. Inspired by how [a] human communicates, we propose a framework for communicating a visual goal to the robot, through interactive two-way communications.

[ Paper ]

Thanks Poramate!

In this video, LOLA reacts to undetected ground height changes, including a drop and leg-in-hole experiment. Further tests show the robustness to vertical disturbances using a seesaw. The robot is technically blind, not using any camera-based or prior information on the terrain.

[ TUM ]

RaiSim is a cross-platform multi-body physics engine for robotics and AI. It fully supports Linux, Mac OS, and Windows.

[ RaiSim ]

Thanks Fan!

The next generation of LoCoBot is here. The LoCoBot is an ROS research rover for mapping, navigation and manipulation (optional) that enables researchers, educators and students alike to focus on high level code development instead of hardware and building out lower level code. Development on the LoCoBot is simplified with open source software, full ROS-mapping and navigation packages and modular opensource Python API that allows users to move the platform as well as (optional) manipulator in as few as 10 lines of code.

[ Trossen ]

MIT Media Lab Research Specialist Dr. Kate Darling looks at how robots are portrayed in popular film and TV shows.

Kate's book, The New Breed: What Our History with Animals Reveals about Our Future with Robots can be pre-ordered now and comes out next month.

[ Kate Darling ]

The current autonomous mobility systems for planetary exploration are wheeled rovers, limited to flat, gently-sloping terrains and agglomerate regolith. These vehicles cannot tolerate instability and operate within a low-risk envelope (i.e., low-incline driving to avoid toppling). Here, we present ‘Mars Dogs’ (MD), four-legged robotic dogs, the next evolution of extreme planetary exploration.

[ Team CoSTAR ]

In 2020, first-year PhD students at the MIT Media Lab were tasked with a special project—to reimagine the Lab and write sci-fi stories about the MIT Media Lab in the year 2050. “But, we are researchers. We don't only write fiction, we also do science! So, we did what scientists do! We used a secret time machine under the MIT dome to go to the year 2050 and see what’s going on there! Luckily, the Media Lab still exists and we met someone…really cool!” Enjoy this interview of Cyber Joe, AI Mentor for MIT Media Lab Students of 2050.

[ MIT ]

In this talk, we will give an overview of the diverse research we do at CSIRO’s Robotics and Autonomous Systems Group and delve into some specific technologies we have developed including SLAM and Legged robotics. We will also give insights into CSIRO’s participation in the current DARPA Subterranean Challenge where we are deploying a fleet of heterogeneous robots into GPS-denied unknown underground environments.

[ GRASP Seminar ]

Marco Hutter (ETH) and Hae-Won Park (KAIST) talk about “Robotics Inspired by Nature.”

[ Swiss-Korean Science Club ]

Thanks Fan!

In this keynote, Guy Hoffman Assistant Professor and the Mills Family Faculty Fellow in the Sibley School of Mechanical and Aerospace Engineering at Cornell University, discusses “The Social Uncanny of Robotic Companions.”

[ Designerly HRI ] Continue reading

Posted in Human Robots

#439000 Can AI Stop People From Believing Fake ...

Machine learning algorithms provide a way to detect misinformation based on writing style and how articles are shared.

On topics as varied as climate change and the safety of vaccines, you will find a wave of misinformation all over social media. Trust in conventional news sources may seem lower than ever, but researchers are working on ways to give people more insight on whether they can believe what they read. Researchers have been testing artificial intelligence (AI) tools that could help filter legitimate news. But how trustworthy is AI when it comes to stopping the spread of misinformation?

Researchers at the Rensselaer Polytechnic Institute (RPI) and the University of Tennessee collaborated to study the role of AI in helping people identify whether the news they’re reading is legitimate or not.

The research paper, “Tailoring Heuristics and Timing AI Interventions for Supporting News Veracity Assessments,” was published in Computers in Human Behavior Reports. It discussed how crowdsourcing marketplace Amazon Mechanical Turk (AMT) can be used to identify misinformation for fresh news and specific heuristics, which are rules of thumb used to process information and consider its veracity. In other words, heuristics are essentially “shortcuts for decisions,” explained Dorit Nevo, an associate professor at RPI’s Lally School of Management and a lead author for the paper.

The study found that AI would be successful in flagging false stories only if the reader did not already have an opinion on the topic, Nevo said. When study subjects were set in their beliefs, confirmation bias kept them from reassessing their views.

Nevo said the first part of the project focused on whether subjects could detect misinformation around climate change and vaccines like the one designed to prevent chicken pox. Then, beginning in April 2020, her team studied how people responded to news related to COVID-19.

“With COVID-19, there was a significant difference,” Nevo said. They found that about 72 percent of respondents could identify misinformation about the coronavirus without heuristic clues, and roughly 93 percent were able to be convinced by the researcher’s heuristics that the content was fake.

Examples of heuristic clues include text with too many capital letters or the use of strong language, Nevo said.

There were two types of heuristics mentioned in the team’s paper: objective heuristics and source heuristics. They put a statement at the top of each article the subjects read; it instructed them to read the article and indicate whether they believed its central thesis.

“We either put a statement that says the AI finds this article reliable and accurate based on the objective heuristics, or we said the AI finds the source reliable,” Nevo said. “So that's the source heuristic.”

In her research on heuristics, Nevo found that people’s thinking takes one of two paths: The first path is to read the article, think about it and decide if they believe it; the second is to consider the source and what others think about the news, and decide whether to believe it before reading it.

Image: Dorit Nevo/RPI/IEEE Spectrum

Researchers at RPI researched the role of heuristics and AI in detecting whether people thought news was credible

Another research paper, “Timing Matters When Correcting Fake News,” published in the Proceedings of the National Academy of Science by researchers at Harvard University, differed from the RPI researchers in its findings. While Nevo and her collaborators found that it’s easier to convince people that a story is fake news before reading it, the Harvard researchers, led by Nadia M. Brashier, a psychologist and neuroscientist, discovered that a fact-check can convince people of misinformation even after reading headlines. When study subjects read true or false labels after reading a headline, that resulted in a 25.3 percent reduction in “subsequent misclassification,” when compared to headlines with no tag, Brashier and her team found.

In the end, fighting misinformation will require both computing and human efforts such as policy changes, says Benjamin D. Horne, an assistant professor of Information Sciences at the University of Tennessee and one of Nevo’s co-authors. He says the RPI-Tennessee work was inspired by AI tools he designed previously. Horne was previously a research assistant at RPI, where he developed machine learning (ML) algorithms that can detect partial truths as well as decontextualized truths and out-of-date information.

“Our algorithms are trained on source-level behavior, both when using the textual content of an article and the network of other news sources that it draws news from,” Horne said. “We have found that these two types of features together are quite good at distinguishing between sources labeled as reliable or unreliable by external news source ratings.”

The machine learning algorithms analyze the writing style and the content-sharing behavior of news outlets, Horne said. Researchers trained a supervised ML algorithm called Random Forest, a classification algorithm that uses decision trees.

AI for Detecting Fake News

So, what’s the potential for AI to be successful in detecting misinformation?

“The tools we have developed, and other tools developed in this area, have fairly high accuracy in lab settings,” says Horne. “For example, our most recent technical work showed around 83% accuracy in predicting when the source of a news article is reliable or unreliable.”

Despite the effectiveness of algorithms, old-fashioned fact-checking by journalists will still be required to combat fake news. AI could filter the information for fact-checkers to verify, according to Horne.

“AI tools are great at dealing with high quantities of information at fast speeds but lack the nuanced analysis that a journalist or fact-checker can provide,” Horne said. “I see a future where the two work together.” Continue reading

Posted in Human Robots

#438925 Nanophotonics Could Be the ‘Dark ...

The race to build the first practical quantum computers looks like a two-horse contest between machines built from superconducting qubits and those that use trapped ions. But new research suggests a third contender—machines based on optical technology—could sneak up on the inside.

The most advanced quantum computers today are the ones built by Google and IBM, which rely on superconducting circuits to generate the qubits that form the basis of quantum calculations. They are now able to string together tens of qubits, and while controversial, Google claims its machines have achieved quantum supremacy—the ability to carry out a computation beyond normal computers.

Recently this approach has been challenged by a wave of companies looking to use trapped ion qubits, which are more stable and less error-prone than superconducting ones. While these devices are less developed, engineering giant Honeywell has already released a machine with 10 qubits, which it says is more powerful than a machine made of a greater number of superconducting qubits.

But despite this progress, both of these approaches have some major drawbacks. They require specialized fabrication methods, incredibly precise control mechanisms, and they need to be cooled to close to absolute zero to protect the qubits from any outside interference.

That’s why researchers at Canadian quantum computing hardware and software startup Xanadu are backing an alternative quantum computing approach based on optics, which was long discounted as impractical. In a paper published last week in Nature, they unveiled the first fully programmable and scalable optical chip that can run quantum algorithms. Not only does the system run at room temperature, but the company says it could scale to millions of qubits.

The idea isn’t exactly new. As Chris Lee notes in Ars Technica, people have been experimenting with optical approaches to quantum computing for decades, because encoding information in photons’ quantum states and manipulating those states is relatively easy. The biggest problem was that optical circuits were very large and not readily programmable, which meant you had to build a new computer for every new problem you wanted to solve.

That started to change thanks to the growing maturity of photonic integrated circuits. While early experiments with optical computing involved complex table-top arrangements of lasers, lenses, and detectors, today it’s possible to buy silicon chips not dissimilar to electronic ones that feature hundreds of tiny optical components.

In recent years, the reliability and performance of these devices has improved dramatically, and they’re now regularly used by the telecommunications industry. Some companies believe they could be the future of artificial intelligence too.

This allowed the Xanadu researchers to design a silicon chip that implements a complex optical network made up of beam splitters, waveguides, and devices called interferometers that cause light sources to interact with each other.

The chip can generate and manipulate up to eight qubits, but unlike conventional qubits, which can simultaneously be in two states, these qubits can be in any configuration of three states, which means they can carry more information.

Once the light has travelled through the network, it is then fed out to cutting-edge photon-counting detectors that provide the result. This is one of the potential limitations of the system, because currently these detectors need to be cryogenically cooled, although the rest of the chip does not.

But most importantly, the chip is easily re-programmable, which allows it to tackle a variety of problems. The computation can be controlled by adjusting the settings of these interferometers, but the researchers have also developed a software platform that hides the physical complexity from users and allows them to program it using fairly conventional code.

The company announced that its chips were available on the cloud in September of 2020, but the Nature paper is the first peer-reviewed test of their system. The researchers verified that the computations being done were genuinely quantum mechanical in nature, but they also implemented two more practical algorithms: one for simulating molecules and the other for judging how similar two graphs are, which has applications in a variety of pattern recognition problems.

In an accompanying opinion piece, Ulrik Andersen from the Technical University of Denmark says the quality of the qubits needs to be improved considerably and photon losses reduced if the technology is ever to scale to practical problems. But, he says, this breakthrough suggests optical approaches “could turn out to be the dark horse of quantum computing.”

Image Credit: Shahadat Rahman on Unsplash Continue reading

Posted in Human Robots

#438801 This AI Thrashes the Hardest Atari Games ...

Learning from rewards seems like the simplest thing. I make coffee, I sip coffee, I’m happy. My brain registers “brewing coffee” as an action that leads to a reward.

That’s the guiding insight behind deep reinforcement learning, a family of algorithms that famously smashed most of Atari’s gaming catalog and triumphed over humans in strategy games like Go. Here, an AI “agent” explores the game, trying out different actions and registering ones that let it win.

Except it’s not that simple. “Brewing coffee” isn’t one action; it’s a series of actions spanning several minutes, where you’re only rewarded at the very end. By just tasting the final product, how do you learn to fine-tune grind coarseness, water to coffee ratio, brewing temperature, and a gazillion other factors that result in the reward—tasty, perk-me-up coffee?

That’s the problem with “sparse rewards,” which are ironically very abundant in our messy, complex world. We don’t immediately get feedback from our actions—no video-game-style dings or points for just grinding coffee beans—yet somehow we’re able to learn and perform an entire sequence of arm and hand movements while half-asleep.

This week, researchers from UberAI and OpenAI teamed up to bestow this talent on AI.

The trick is to encourage AI agents to “return” to a previous step, one that’s promising for a winning solution. The agent then keeps a record of that state, reloads it, and branches out again to intentionally explore other solutions that may have been left behind on the first go-around. Video gamers are likely familiar with this idea: live, die, reload a saved point, try something else, repeat for a perfect run-through.

The new family of algorithms, appropriately dubbed “Go-Explore,” smashed notoriously difficult Atari games like Montezuma’s Revenge that were previously unsolvable by its AI predecessors, while trouncing human performance along the way.

It’s not just games and digital fun. In a computer simulation of a robotic arm, the team found that installing Go-Explore as its “brain” allowed it to solve a challenging series of actions when given very sparse rewards. Because the overarching idea is so simple, the authors say, it can be adapted and expanded to other real-world problems, such as drug design or language learning.

Growing Pains
How do you reward an algorithm?

Rewards are very hard to craft, the authors say. Take the problem of asking a robot to go to a fridge. A sparse reward will only give the robot “happy points” if it reaches its destination, which is similar to asking a baby, with no concept of space and danger, to crawl through a potential minefield of toys and other obstacles towards a fridge.

“In practice, reinforcement learning works very well, if you have very rich feedback, if you can tell, ‘hey, this move is good, that move is bad, this move is good, that move is bad,’” said study author Joost Huinzinga. However, in situations that offer very little feedback, “rewards can intentionally lead to a dead end. Randomly exploring the space just doesn’t cut it.”

The other extreme is providing denser rewards. In the same robot-to-fridge example, you could frequently reward the bot as it goes along its journey, essentially helping “map out” the exact recipe to success. But that’s troubling as well. Over-holding an AI’s hand could result in an extremely rigid robot that ignores new additions to its path—a pet, for example—leading to dangerous situations. It’s a deceptive AI solution that seems effective in a simple environment, but crashes in the real world.

What we need are AI agents that can tackle both problems, the team said.

Intelligent Exploration
The key is to return to the past.

For AI, motivation usually comes from “exploring new or unusual situations,” said Huizinga. It’s efficient, but comes with significant downsides. For one, the AI agent could prematurely stop going back to promising areas because it thinks it had already found a good solution. For another, it could simply forget a previous decision point because of the mechanics of how it probes the next step in a problem.

For a complex task, the end result is an AI that randomly stumbles around towards a solution while ignoring potentially better ones.

“Detaching from a place that was previously visited after collecting a reward doesn’t work in difficult games, because you might leave out important clues,” Huinzinga explained.

Go-Explore solves these problems with a simple principle: first return, then explore. In essence, the algorithm saves different approaches it previously tried and loads promising save points—once more likely to lead to victory—to explore further.

Digging a bit deeper, the AI stores screen caps from a game. It then analyzes saved points and groups images that look alike as a potential promising “save point” to return to. Rinse and repeat. The AI tries to maximize its final score in the game, and updates its save points when it achieves a new record score. Because Atari doesn’t usually allow people to revisit any random point, the team used an emulator, which is a kind of software that mimics the Atari system but with custom abilities such as saving and reloading at any time.

The trick worked like magic. When pitted against 55 Atari games in the OpenAI gym, now commonly used to benchmark reinforcement learning algorithms, Go-Explore knocked out state-of-the-art AI competitors over 85 percent of the time.

It also crushed games previously unbeatable by AI. Montezuma’s Revenge, for example, requires you to move Pedro, the blocky protagonist, through a labyrinth of underground temples while evading obstacles such as traps and enemies and gathering jewels. One bad jump could derail the path to the next level. It’s a perfect example of sparse rewards: you need a series of good actions to get to the reward—advancing onward.

Go-Explore didn’t just beat all levels of the game, a first for AI. It also scored higher than any previous record for reinforcement learning algorithms at lower levels while toppling the human world record.

Outside a gaming environment, Go-Explore was also able to boost the performance of a simulated robot arm. While it’s easy for humans to follow high-level guidance like “put the cup on this shelf in a cupboard,” robots often need explicit training—from grasping the cup to recognizing a cupboard, moving towards it while avoiding obstacles, and learning motions to not smash the cup when putting it down.

Here, similar to the real world, the digital robot arm was only rewarded when it placed the cup onto the correct shelf, out of four possible shelves. When pitted against another algorithm, Go-Explore quickly figured out the movements needed to place the cup, while its competitor struggled with even reliably picking the cup up.

Combining Forces
By itself, the “first return, then explore” idea behind Go-Explore is already powerful. The team thinks it can do even better.

One idea is to change the mechanics of save points. Rather than reloading saved states through the emulator, it’s possible to train a neural network to do the same, without needing to relaunch a saved state. It’s a potential way to make the AI even smarter, the team said, because it can “learn” to overcome one obstacle once, instead of solving the same problem again and again. The downside? It’s much more computationally intensive.

Another idea is to combine Go-Explore with an alternative form of learning, called “imitation learning.” Here, an AI observes human behavior and mimics it through a series of actions. Combined with Go-Explore, said study author Adrien Ecoffet, this could make more robust robots capable of handling all the complexity and messiness in the real world.

To the team, the implications go far beyond Go-Explore. The concept of “first return, then explore” seems to be especially powerful, suggesting “it may be a fundamental feature of learning in general.” The team said, “Harnessing these insights…may be essential…to create generally intelligent agents.”

Image Credit: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, and Jeff Clune Continue reading

Posted in Human Robots