Tag Archives: beyond

#438801 This AI Thrashes the Hardest Atari Games ...

Posted on March 4, 2021 by Android

Learning from rewards seems like the simplest thing. I make coffee, I sip coffee, I’m happy. My brain registers “brewing coffee” as an action that leads to a reward.

That’s the guiding insight behind deep reinforcement learning, a family of algorithms that famously smashed most of Atari’s gaming catalog and triumphed over humans in strategy games like Go. Here, an AI “agent” explores the game, trying out different actions and registering ones that let it win.

Except it’s not that simple. “Brewing coffee” isn’t one action; it’s a series of actions spanning several minutes, where you’re only rewarded at the very end. By just tasting the final product, how do you learn to fine-tune grind coarseness, water to coffee ratio, brewing temperature, and a gazillion other factors that result in the reward—tasty, perk-me-up coffee?

That’s the problem with “sparse rewards,” which are ironically very abundant in our messy, complex world. We don’t immediately get feedback from our actions—no video-game-style dings or points for just grinding coffee beans—yet somehow we’re able to learn and perform an entire sequence of arm and hand movements while half-asleep.

This week, researchers from UberAI and OpenAI teamed up to bestow this talent on AI.

The trick is to encourage AI agents to “return” to a previous step, one that’s promising for a winning solution. The agent then keeps a record of that state, reloads it, and branches out again to intentionally explore other solutions that may have been left behind on the first go-around. Video gamers are likely familiar with this idea: live, die, reload a saved point, try something else, repeat for a perfect run-through.

The new family of algorithms, appropriately dubbed “Go-Explore,” smashed notoriously difficult Atari games like Montezuma’s Revenge that were previously unsolvable by its AI predecessors, while trouncing human performance along the way.

It’s not just games and digital fun. In a computer simulation of a robotic arm, the team found that installing Go-Explore as its “brain” allowed it to solve a challenging series of actions when given very sparse rewards. Because the overarching idea is so simple, the authors say, it can be adapted and expanded to other real-world problems, such as drug design or language learning.

Growing Pains
How do you reward an algorithm?

Rewards are very hard to craft, the authors say. Take the problem of asking a robot to go to a fridge. A sparse reward will only give the robot “happy points” if it reaches its destination, which is similar to asking a baby, with no concept of space and danger, to crawl through a potential minefield of toys and other obstacles towards a fridge.

“In practice, reinforcement learning works very well, if you have very rich feedback, if you can tell, ‘hey, this move is good, that move is bad, this move is good, that move is bad,’” said study author Joost Huizinga. However, in situations that offer very little feedback, “rewards can intentionally lead to a dead end. Randomly exploring the space just doesn’t cut it.”

The other extreme is providing denser rewards. In the same robot-to-fridge example, you could frequently reward the bot as it goes along its journey, essentially helping “map out” the exact recipe to success. But that’s troubling as well. Over-holding an AI’s hand could result in an extremely rigid robot that ignores new additions to its path—a pet, for example—leading to dangerous situations. It’s a deceptive AI solution that seems effective in a simple environment, but crashes in the real world.
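To make the contrast concrete, here is a minimal sketch of the two reward styles for the robot-to-fridge example. The grid coordinates, the function names, and the distance-based shaping rule are all assumptions made for illustration; none of them come from the study.

```python
def sparse_reward(position, fridge):
    """Pay out only when the robot is actually at the fridge."""
    return 1.0 if position == fridge else 0.0


def dense_reward(previous, position, fridge):
    """Hypothetical shaped reward: pay for every step that reduces
    the grid (Manhattan) distance to the fridge."""
    def distance(p):
        return abs(p[0] - fridge[0]) + abs(p[1] - fridge[1])
    return float(distance(previous) - distance(position))


# An assumed path across a small grid, ending at the fridge.
fridge = (2, 2)
path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]

sparse = [sparse_reward(p, fridge) for p in path[1:]]
dense = [dense_reward(a, b, fridge) for a, b in zip(path, path[1:])]
print(sparse)  # [0.0, 0.0, 0.0, 1.0]
print(dense)   # [1.0, 1.0, 1.0, 1.0]
```

The sparse version stays silent until the last step, while the dense version rewards every step along this one route. That is exactly the kind of hand-holding that can make a robot rigid if the route later changes.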

What we need are AI agents that can tackle both problems, the team said.

Intelligent Exploration
The key is to return to the past.

For AI, motivation usually comes from “exploring new or unusual situations,” said Huizinga. It’s efficient, but comes with significant downsides. For one, the AI agent could prematurely stop going back to promising areas because it thinks it has already found a good solution. For another, it could simply forget a previous decision point because of the mechanics of how it probes the next step in a problem.

For a complex task, the end result is an AI that randomly stumbles around towards a solution while ignoring potentially better ones.

“Detaching from a place that was previously visited after collecting a reward doesn’t work in difficult games, because you might leave out important clues,” Huizinga explained.

Go-Explore solves these problems with a simple principle: first return, then explore. In essence, the algorithm saves different approaches it previously tried and loads promising save points, the ones more likely to lead to victory, to explore further.
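The “first return, then explore” loop can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors’ implementation: the corridor environment, the `cell_of` grouping, the count-based selection weights, and all parameter values are invented for the example.

```python
import random


class CorridorEnv:
    """Toy stand-in for a game: a corridor with a reward only at the far end."""

    def __init__(self, length=30):
        self.length = length
        self.pos = 0

    def save(self):
        return self.pos  # snapshot of the full state

    def load(self, snapshot):
        self.pos = snapshot

    def step(self, action):
        """Move by -1 or +1 and return (reward, done). The reward is sparse."""
        self.pos = max(0, min(self.length, self.pos + action))
        done = self.pos == self.length
        return (1.0 if done else 0.0), done


def cell_of(env):
    # Hypothetical grouping: neighbouring positions share one save point.
    return env.pos // 2


def go_explore(env, iterations=2000, explore_steps=10, seed=0):
    rng = random.Random(seed)
    archive = {cell_of(env): {"snapshot": env.save(), "score": 0.0, "visits": 0}}
    best = 0.0
    for _ in range(iterations):
        # Select a save point, favouring ones that were tried less often.
        cells = list(archive)
        weights = [1.0 / (1 + archive[c]["visits"]) ** 0.5 for c in cells]
        entry = archive[rng.choices(cells, weights=weights, k=1)[0]]
        entry["visits"] += 1
        # First return to the saved state ...
        env.load(entry["snapshot"])
        score = entry["score"]
        # ... then explore from it with random actions.
        for _ in range(explore_steps):
            reward, done = env.step(rng.choice((-1, 1)))
            score += reward
            if done:
                best = max(best, score)
                break
            cell = cell_of(env)
            # Keep a new save point, or replace one reached with a better score.
            if cell not in archive or score > archive[cell]["score"]:
                archive[cell] = {"snapshot": env.save(), "score": score, "visits": 0}
    return archive, best


archive, best = go_explore(CorridorEnv())
print(len(archive), best)
```

Each short burst of random exploration starts from a stored frontier state instead of from the beginning, so the agent makes steady progress along the corridor even though no reward is seen until the very end.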

Digging a bit deeper, the AI stores screen caps from a game. It then analyzes saved points and groups images that look alike as a potential promising “save point” to return to. Rinse and repeat. The AI tries to maximize its final score in the game, and updates its save points when it achieves a new record score. Because Atari doesn’t usually allow people to revisit any random point, the team used an emulator, which is a kind of software that mimics the Atari system but with custom abilities such as saving and reloading at any time.
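One plausible way to group screen captures that “look alike” is to shrink each frame to a coarse grid with only a few brightness levels, so that near-identical frames collapse to the same key. The sketch below assumes grayscale frames given as nested lists of 0 to 255 values; the function name, grid size, and number of levels are illustrative choices, not details taken from the article.

```python
def frame_to_cell(frame, out_h=2, out_w=2, levels=4):
    """Downscale a grayscale frame by block averaging, then quantize
    each block to a few brightness levels. Returns a hashable key."""
    height, width = len(frame), len(frame[0])
    block_h, block_w = height // out_h, width // out_w
    cell = []
    for by in range(out_h):
        for bx in range(out_w):
            block = [frame[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            mean = sum(block) / len(block)
            cell.append(int(mean * levels / 256))
    return tuple(cell)


# Two nearly identical frames (bright patch top-left) and one different frame.
frame_a = [[200, 200, 10, 10], [200, 200, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]]
frame_b = [[205, 198, 12, 9], [201, 203, 8, 11], [10, 13, 9, 10], [12, 10, 11, 10]]
frame_c = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 200, 200], [10, 10, 200, 200]]

print(frame_to_cell(frame_a))  # (3, 0, 0, 0)
print(frame_to_cell(frame_b))  # (3, 0, 0, 0)
print(frame_to_cell(frame_c))  # (0, 0, 0, 3)
```

Because the key is a plain tuple, it can be used directly as a dictionary key for an archive of save points. Coarser grids or fewer levels merge more frames into one save point; finer settings keep more of them apart.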

The trick worked like magic. When pitted against 55 Atari games in the OpenAI Gym, now commonly used to benchmark reinforcement learning algorithms, Go-Explore knocked out state-of-the-art AI competitors over 85 percent of the time.

It also crushed games previously unbeatable by AI. Montezuma’s Revenge, for example, requires you to move Pedro, the blocky protagonist, through a labyrinth of underground temples while evading obstacles such as traps and enemies and gathering jewels. One bad jump could derail the path to the next level. It’s a perfect example of sparse rewards: you need a series of good actions to get to the reward—advancing onward.

Go-Explore didn’t just beat all levels of the game, a first for AI. It also scored higher than any previous record for reinforcement learning algorithms at lower levels while toppling the human world record.

Outside a gaming environment, Go-Explore was also able to boost the performance of a simulated robot arm. While it’s easy for humans to follow high-level guidance like “put the cup on this shelf in a cupboard,” robots often need explicit training—from grasping the cup to recognizing a cupboard, moving towards it while avoiding obstacles, and learning motions to not smash the cup when putting it down.

Here, similar to the real world, the digital robot arm was only rewarded when it placed the cup onto the correct shelf, out of four possible shelves. When pitted against another algorithm, Go-Explore quickly figured out the movements needed to place the cup, while its competitor struggled with even reliably picking the cup up.

Combining Forces
By itself, the “first return, then explore” idea behind Go-Explore is already powerful. The team thinks it can do even better.

One idea is to change the mechanics of save points. Rather than reloading saved states through the emulator, it’s possible to train a neural network to do the same, without needing to relaunch a saved state. It’s a potential way to make the AI even smarter, the team said, because it can “learn” to overcome one obstacle once, instead of solving the same problem again and again. The downside? It’s much more computationally intensive.

Another idea is to combine Go-Explore with an alternative form of learning, called “imitation learning.” Here, an AI observes human behavior and mimics it through a series of actions. Combined with Go-Explore, said study author Adrien Ecoffet, this could make more robust robots capable of handling all the complexity and messiness in the real world.

To the team, the implications go far beyond Go-Explore. The concept of “first return, then explore” seems to be especially powerful, suggesting “it may be a fundamental feature of learning in general.” The team said, “Harnessing these insights…may be essential…to create generally intelligent agents.”

Image Credit: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, and Jeff Clune

Posted in Human Robots

#438769 Will Robots Make Good Friends? ...

Posted on February 26, 2021 by Android

In the 2012 film Robot and Frank, the protagonist, a retired cat burglar named Frank, is suffering the early symptoms of dementia. Concerned and guilty, his son buys him a “home robot” that can talk, do household chores like cooking and cleaning, and remind Frank to take his medicine. It’s a robot the likes of which we’re getting closer to building in the real world.

The film follows Frank, who is initially appalled by the idea of living with a robot, as he gradually begins to see the robot as both functionally useful and socially companionable. The film ends with a clear bond between man and machine, such that Frank is protective of the robot when the pair of them run into trouble.

This is, of course, a fictional story, but it challenges us to explore different kinds of human-to-robot bonds. My recent research on human-robot relationships examines this topic in detail, looking beyond sex robots and robot love affairs to examine that most profound and meaningful of relationships: friendship.

My colleague and I identified some potential risks, like the abandonment of human friends for robotic ones, but we also found several scenarios where robotic companionship can constructively augment people’s lives, leading to friendships that are directly comparable to human-to-human relationships.

Philosophy of Friendship
The robotics philosopher John Danaher sets a very high bar for what friendship means. His starting point is the “true” friendship first described by the Greek philosopher Aristotle, which saw an ideal friendship as premised on mutual good will, admiration, and shared values. In these terms, friendship is about a partnership of equals.

Building a robot that can satisfy Aristotle’s criteria is a substantial technical challenge and is some considerable way off, as Danaher himself admits. Robots that may seem to be getting close, such as Hanson Robotics’ Sophia, base their behavior on a library of pre-prepared responses: a humanoid chatbot, rather than a conversational equal. Anyone who’s had a testing back-and-forth with Alexa or Siri will know AI still has some way to go in this regard.

Aristotle also talked about other forms of “imperfect” friendship, such as “utilitarian” and “pleasure” friendships, which are considered inferior to true friendship because they don’t require symmetrical bonding and are often to one party’s unequal benefit. These forms of friendship set a relatively low bar, which some robots, like “sexbots” and robotic pets, clearly already meet.

Artificial Amigos
For some, relating to robots is just a natural extension of relating to other things in our world, like people, pets, and possessions. Psychologists have even observed how people respond naturally and socially towards media artefacts like computers and televisions. Humanoid robots, you’d have thought, are more personable than your home PC.

However, the field of “robot ethics” is far from unanimous on whether we can, or should, develop any form of friendship with robots. For an influential group of UK researchers who charted a set of “ethical principles of robotics,” human-robot “companionship” is an oxymoron, and to market robots as having social capabilities is dishonest and should be treated with caution, if not alarm. For these researchers, wasting emotional energy on entities that can only simulate emotions will always be less rewarding than forming human-to-human bonds.

But people are already developing bonds with basic robots, like vacuum-cleaning and lawn-trimming machines that can be bought for less than the price of a dishwasher. A surprisingly large number of people give these robots pet names—something they don’t do with their dishwashers. Some even take their cleaning robots on holiday.

Other evidence of emotional bonds with robots includes the Shinto blessing ceremony for Sony Aibo robot dogs that were dismantled for spare parts, and the squad of US troops who fired a 21-gun salute for, and awarded medals to, a bomb-disposal robot named “Boomer” after it was destroyed in action.

These stories, and the psychological evidence we have so far, make clear that we can extend emotional connections to things that are very different to us, even when we know they are manufactured and pre-programmed. But do those connections constitute a friendship comparable to that shared between humans?

True Friendship?
A colleague and I recently reviewed the extensive literature on human-to-human relationships to try to understand how, and if, the concepts we found could apply to bonds we might form with robots. We found evidence that many coveted human-to-human friendships do not in fact live up to Aristotle’s ideal.

We noted a wide range of human-to-human relationships, from relatives and lovers to parents, carers, service providers, and the intense (but unfortunately one-way) relationships we maintain with our celebrity heroes. Few of these relationships could be described as completely equal and, crucially, they are all destined to evolve over time.

All this means that expecting robots to form Aristotelian bonds with us is to set a standard even human relationships fail to live up to. We also observed forms of social connectedness that are rewarding and satisfying and yet are far from the ideal friendship outlined by the Greek philosopher.

We know that social interaction is rewarding in its own right, and something that, as social mammals, humans have a strong need for. It seems probable that relationships with robots could help to address the deep-seated urge we all feel for social connection—like providing physical comfort, emotional support, and enjoyable social exchanges—currently provided by other humans.

Our paper also discussed some potential risks. These arise particularly in settings where interaction with a robot could come to replace interaction with people, or where people are denied a choice as to whether they interact with a person or a robot—in a care setting, for instance.

These are important concerns, but they’re possibilities and not inevitabilities. In the literature we reviewed we actually found evidence of the opposite effect: robots acting to scaffold social interactions with others, acting as ice-breakers in groups, and helping people to improve their social skills or to boost their self-esteem.

It appears likely that, as time progresses, many of us will simply follow Frank’s path towards acceptance: scoffing at first, before settling into the idea that robots can make surprisingly good companions. Our research suggests that’s already happening—though perhaps not in a way of which Aristotle would have approved.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Image Credit: Andy Kelly on Unsplash

Posted in Human Robots

#438749 Folding Drone Can Drop Into Inaccessible ...

Posted on February 23, 2021 by Android

Inspecting old mines is a dangerous business. For humans, mines can be lethal: prone to rockfalls and filled with noxious gases. Robots can go where humans might suffocate, but even robots can only do so much when mines are inaccessible from the surface.

Now, researchers in the UK, led by Headlight AI, have developed a drone that could cast a light in the darkness. Named Prometheus, this drone can enter a mine through a borehole not much larger than a football, before unfurling its arms and flying around the void. Once down there, it can use its payload of scanning equipment to map mines where neither humans nor robots can presently go. This, the researchers hope, could make mine inspection quicker and easier. The team behind Prometheus published its design in November in the journal Robotics.

Mine inspection might seem like a peculiarly specific task to fret about, but old mines can collapse, causing the ground to sink and damaging nearby buildings. It’s a far-reaching threat: the geotechnical engineering firm Geoinvestigate, based in Northeast England, estimates that around 8 percent of all buildings in the UK are at risk from any of the thousands of abandoned coal mines near the country’s surface. It’s also a threat to transport, such as road and rail. Indeed, Prometheus is backed by Network Rail, which operates Britain’s railway infrastructure.

Such grave dangers mean that old mines need periodic check-ups. To enter depths that are forbidden to traditional wheeled robots—such as those featured in the DARPA SubT Challenge—inspectors today drill boreholes down into the mine and lower scanners into the darkness.

But that can be an arduous and often fruitless process. Inspecting the entirety of a mine can take multiple boreholes, and that still might not be enough to chart a complete picture. Mines are jagged, labyrinthine places, and much of the void might lie out of sight. Furthermore, many old mines aren’t well-mapped, so it’s hard to tell where best to enter them.

Prometheus can fly around some of those challenges. Inspectors can lower Prometheus, tethered to a docking apparatus, down a single borehole. Once inside the mine, the drone can undock and fly around, using LIDAR scanners—common in mine inspection today—to generate a 3D map of the unknown void. Prometheus can fly through the mine autonomously, using infrared data to plot out its own course.

Other drones exist that can fly underground, but they’re either too small to carry a relatively heavy payload of scanning equipment, or too large to easily fit down a borehole. What makes Prometheus unique is its ability to fold its arms, allowing it to squeeze down spaces its counterparts cannot.

It’s that ability to fold and enter a borehole that makes Prometheus remarkable, says Jason Gross, a professor of mechanical and aerospace engineering at West Virginia University. Gross calls Prometheus “an exciting idea,” but he does note that it has a relatively short flight window and few abilities beyond scanning.

The researchers have conducted a number of successful test flights, both in a basement and in an old mine near Shrewsbury, England. Not only was Prometheus able to map out its space, the drone was able to plot its own course in an unknown area.

The researchers’ next steps, according to Puneet Chhabra, co-founder of Headlight AI, will be to test Prometheus’s ability to unfold in an actual mine. Following that, researchers plan to conduct full-scale test flights by the end of 2021.

Posted in Human Robots

#438613 Video Friday: Digit Takes a Hike

Posted on February 13, 2021 by Android

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):

HRI 2021 – March 8-11, 2021 – [Online Conference]
RoboSoft 2021 – April 12-16, 2021 – [Online Conference]
ICRA 2021 – May 30-June 5, 2021 – Xi'an, China
Let us know if you have suggestions for next week, and enjoy today's videos.

It's winter in Oregon, so everything is damp, all the time. No problem for Digit!

Also the case for summer in Oregon.

[ Agility Robotics ]

While other organisms form collective flocks, schools, or swarms for such purposes as mating, predation, and protection, the Lumbriculus variegatus worms are unusual in their ability to braid themselves together to accomplish tasks that unconnected individuals cannot. A new study reported by researchers at the Georgia Institute of Technology describes how the worms self-organize to act as entangled “active matter,” creating surprising collective behaviors whose principles have been applied to help blobs of simple robots evolve their own locomotion.

No, this doesn't squick me out at all, why would it.

[ Georgia Tech ]

A few years ago, we wrote about Zhifeng Huang's jet-foot equipped bipedal robot, and he's been continuing to work on it to the point where it can now step over gaps that are an absolutely astonishing 147% of its leg length.

[ Paper ]

Thanks Zhifeng!

The Inception Drive is a novel, ultra-compact design for an Infinitely Variable Transmission (IVT) that uses nested-pulleys to adjust the gear ratio between input and output shafts. This video shows the first proof-of-concept prototype for a “Fully Balanced” design, where the spinning masses within the drive are completely balanced to reduce vibration, thereby allowing the drive to operate more efficiently and at higher speeds than achievable on an unbalanced design.

As shown in this video, the Inception Drive can change both the speed and direction of rotation of the output shaft while keeping the direction and speed of the input shaft constant. This ability to adjust speed and direction within such a compact package makes the Inception Drive a compelling choice for machine designers in a wide variety of fields, including robotics, automotive, and renewable-energy generation.

[ SRI ]

Robots with kinematic loops are known to have superior mechanical performance. However, due to these loops, their modeling and control is challenging, and prevents a more widespread use. In this paper, we describe a versatile Inverse Kinematics (IK) formulation for the retargeting of expressive motions onto mechanical systems with loops.

[ Disney Research ]

Watch Engineered Arts put together one of its Mesmer robots in a not at all uncanny way.

[ Engineered Arts ]

There's been a bunch of interesting research into vision-based tactile sensing recently; here's some from Van Ho at JAIST:

[ Paper ]

Thanks Van!

This is really more of an automated system than a robot, but these little levitating pucks are very very slick.

ACOPOS 6D is based on the principle of magnetic levitation: Shuttles with integrated permanent magnets float over the surface of electromagnetic motor segments. The modular motor segments are 240 x 240 millimeters in size and can be arranged freely in any shape. A variety of shuttle sizes carry payloads of 0.6 to 14 kilograms and reach speeds of up to 2 meters per second. They can move freely in two-dimensional space, rotate and tilt along three axes and offer precise control over the height of levitation. All together, that gives them six degrees of motion control freedom.

[ ACOPOS ]

Navigation and motion control of a robot to a destination are tasks that have historically been performed with the assumption that contact with the environment is harmful. This makes sense for rigid-bodied robots where obstacle collisions are fundamentally dangerous. However, because many soft robots have bodies that are low-inertia and compliant, obstacle contact is inherently safe. We find that a planner that takes into account and capitalizes on environmental contact produces paths that are more robust to uncertainty than a planner that avoids all obstacle contact.

[ CHARM Lab ]

The quadrotor experts at UZH have been really cranking it up recently.

Aerodynamic forces render accurate high-speed trajectory tracking with quadrotors extremely challenging. These complex aerodynamic effects become a significant disturbance at high speeds, introducing large positional tracking errors, and are extremely difficult to model. To fly at high speeds, feedback control must be able to account for these aerodynamic effects in real-time. This necessitates a modelling procedure that is both accurate and efficient to evaluate. Therefore, we present an approach to model aerodynamic effects using Gaussian Processes, which we incorporate into a Model Predictive Controller to achieve efficient and precise real-time feedback control, leading to up to 70% reduction in trajectory tracking error at high speeds. We verify our method by extensive comparison to a state-of-the-art linear drag model in synthetic and real-world experiments at speeds of up to 14m/s and accelerations beyond 4g.

[ Paper ]

I have not heard much from Harvest Automation over the last couple years and their website was last updated in 2016, but I guess they're selling robots in France, so that's good?

[ Harvest Automation ]

Last year, Clearpath Robotics introduced a ROS package for Spot which enables robotics developers to leverage ROS capabilities out-of-the-box. Here at OTTO Motors, we thought it would be a compelling test case to see just how easy it would be to integrate Spot into our test fleet of OTTO materials handling robots.

[ OTTO Motors ]

Video showcasing recent robotics activities at PRISMA Lab, coordinated by Prof. Bruno Siciliano, at Università di Napoli Federico II.

[ PRISMA Lab ]

Thanks Fan!

State estimation framework developed by team CoSTAR for the DARPA Subterranean Challenge, where the team placed 2nd and 1st in the Tunnel and Urban circuits, respectively.

[ Paper ]

Highlights from the 2020 ROS Industrial conference.

[ ROS Industrial ]

Thanks Thilo!

Not robotics, but entertaining anyway. From the CHI 1995 Technical Video Program, “The Tablet Newspaper: a Vision for the Future.”

[ CHI 1995 ]

This week's GRASP on Robotics seminar comes from Allison Okamura at Stanford, on “Wearable Haptic Devices for Ubiquitous Communication.”

Haptic devices allow touch-based information transfer between humans and intelligent systems, enabling communication in a salient but private manner that frees other sensory channels. For such devices to become ubiquitous, their physical and computational aspects must be intuitive and unobtrusive. We explore the design of a wide array of haptic feedback mechanisms, ranging from devices that can be actively touched by the fingertips to multi-modal haptic actuation mounted on the arm. We demonstrate how these devices are effective in virtual reality, human-machine communication, and human-human communication.

[ UPenn ]

Posted in Human Robots
