Tag Archives: action

#439032 To Learn To Deal With Uncertainty, This ...

Posted on March 26, 2021 by Android

AI is endowing robots, autonomous vehicles and countless other forms of tech with new abilities and levels of self-sufficiency. Yet these models faithfully “make decisions” based on whatever data is fed into them, which could have dangerous consequences. For instance, if an autonomous car is driving down a highway and the sensor picks up a confusing signal (e.g., a paint smudge that is incorrectly interpreted as a lane marking), this could cause the car to swerve into another lane unnecessarily.

But in the ever-evolving world of AI, researchers are developing new ways to address challenges like this. One group of researchers has devised a new algorithm that allows the AI model to account for uncertain data, which they describe in a study published February 15 in IEEE Transactions on Neural Networks and Learning Systems.

“While we would like robots to work seamlessly in the real world, the real world is full of uncertainty,” says Michael Everett, a post-doctoral associate at MIT who helped develop the new approach. “It's important for a system to be aware of what it knows and what it is unsure about, which has been a major challenge for modern AI.”

His team focused on a type of AI called reinforcement learning (RL), whereby the model tries to learn the “value” of taking each action in a given scenario through trial-and-error. They developed a secondary algorithm, called Certified Adversarial Robustness for deep RL (CARRL), that can be built on top of an existing RL model.

“Our key innovation is that rather than blindly trusting the measurements, as is done today [by AI models], our algorithm CARRL thinks through all possible measurements that could have been made, and makes a decision that considers the worst-case outcome,” explains Everett.
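The worst-case reasoning Everett describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the CARRL implementation: the value function `toy_q`, the helper `robust_action`, and the uncertainty radius `eps` are hypothetical names, and the sketch simply samples candidate measurements on a grid, which gives none of the certified guarantees the method's name refers to.

```python
from typing import Callable, Sequence


def robust_action(
    q_value: Callable[[float, int], float],
    measured: float,
    eps: float,
    actions: Sequence[int],
    samples: int = 11,
) -> int:
    """Pick the action whose worst-case value over [measured - eps, measured + eps] is highest."""
    if samples < 2 or eps == 0:
        # No uncertainty: trust the measurement as-is.
        candidates = [measured]
    else:
        # Evenly spaced candidate "true" measurements across the uncertainty interval.
        step = 2 * eps / (samples - 1)
        candidates = [measured - eps + i * step for i in range(samples)]

    def worst_case(action: int) -> float:
        return min(q_value(obs, action) for obs in candidates)

    return max(actions, key=worst_case)


KEEP_SPEED, BRAKE = 0, 1


def toy_q(distance_to_obstacle: float, action: int) -> float:
    """Hypothetical values: keeping speed pays off only if the obstacle is far away."""
    if action == KEEP_SPEED:
        return 10.0 if distance_to_obstacle > 10.0 else -100.0
    return 5.0  # braking is safe but slower, whatever the distance


# Trusting the measurement (eps = 0) keeps speed; allowing for error chooses to brake.
nominal = robust_action(toy_q, measured=12.0, eps=0.0, actions=[KEEP_SPEED, BRAKE])
cautious = robust_action(toy_q, measured=12.0, eps=5.0, actions=[KEEP_SPEED, BRAKE])
```

Raising `eps` makes the agent more conservative, which is the trade-off the researchers report later in the article.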

In their study, the researchers tested CARRL across several different tasks, including collision avoidance simulations and Atari Pong. For younger readers who may not be familiar with it, Pong is a classic computer game in which an electronic paddle is used to direct a ping pong ball on the screen. In the test scenario, CARRL helped move the paddle slightly higher or lower to compensate for the possibility that the ball could approach at a slightly different point than what the input data indicated. All the while, CARRL would try to ensure that the ball would make contact with at least some part of the paddle.

Gif: MIT Aerospace Controls Laboratory

In a perfect world, the information that an AI model is fed would be accurate all the time and the AI model would perform well (left). But in some cases, the AI may be given inaccurate data, causing it to miss its targets (middle). The new algorithm CARRL helps AIs account for uncertainty in their data inputs, yielding better performance when relying on poor data (right).

Across all test scenarios, the RL model was better at compensating for potentially inaccurate or “noisy” data with CARRL than without it.

But the results also show that, like with humans, too much self-doubt and uncertainty can be unhelpful. In the collision avoidance scenario, for example, indulging in too much uncertainty caused the main moving object in the simulation to avoid both the obstacle and its goal. “There is definitely a limit to how ‘skeptical’ the algorithm can be without becoming overly conservative,” Everett says.

This research was funded by Ford Motor Company, but Everett notes that it could apply to many other commercial domains requiring safety-aware AI, including aerospace, healthcare, and manufacturing.

“This work is a step toward my vision of creating ‘certifiable learning machines’: systems that can discover how to explore and perform in the real world on their own, while still having safety and robustness guarantees,” says Everett. “We'd like to bring CARRL into robotic hardware while continuing to explore the theoretical challenges at the interface of robotics and AI.”

Posted in Human Robots

#439006 Low-Cost Drones Learn Precise Control ...

Posted on March 19, 2021 by Android

I’ll admit to having been somewhat skeptical about the strategy of dangling payloads on long tethers for drone delivery. I mean, I get why Wing does it: it keeps the drone and all of its spinny bits well away from untrained users while preserving the capability of making deliveries to very specific areas that may have nearby obstacles. But it also seems like you’re adding some risk, because once your payload is out on that long tether, it’s more or less out of your control in at least two axes. And you can forget about your drone doing anything while this is going on, because who the heck knows what’s going to happen to your payload if the drone starts moving around?

NYU roboticists, that’s who.

This research is by Guanrui Li, Alex Tunchez, and Giuseppe Loianno at the Agile Robotics and Perception Lab (ARPL) at NYU. As you can see from the video, the drone makes keeping rock-solid control over that suspended payload look easy, but it’s very much not, especially considering that everything you see is running onboard the drone itself at 500 Hz. All it takes is an IMU and a downward-facing monocular camera, along with the drone’s Snapdragon processor.

To get this to work, the drone has to be thinking about two things. First, there’s state estimation, which means working out the current state of the drone itself along with that of the payload at the end of the tether. The drone figures this out by watching how the payload moves using its camera and tracking its own movement with its IMU. Second, there’s predicting what the payload is going to do next, and how that jibes (or not) with what the drone wants to do next. The researchers developed a model predictive control (MPC) system for this, with some added perception constraints to make sure that the behavior of the drone keeps the payload in view of the camera.
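One way to picture a perception constraint is as a field-of-view test on the payload's position in the camera frame. The sketch below is only an illustration under assumptions, not the researchers' controller: the function name `payload_in_view`, the 45-degree half-angle, and the convention that the downward-facing camera's optical axis is +z are all hypothetical.

```python
import math


def payload_in_view(payload_xyz, half_fov_deg=45.0):
    """Return True if the payload lies inside a cone around the camera's optical axis (+z)."""
    x, y, z = payload_xyz
    if z <= 0:
        return False  # payload is not in front of the camera at all
    # Angle between the optical axis and the line from the camera to the payload.
    off_axis_deg = math.degrees(math.atan2(math.hypot(x, y), z))
    return off_axis_deg <= half_fov_deg


# A payload hanging almost straight below is visible; one swung far out to the side is not.
hanging = payload_in_view((0.1, 0.0, 1.0))
swung_out = payload_in_view((2.0, 0.0, 1.0))
```

In a predictive controller, a condition of this kind would typically be imposed on every step of the planned trajectory, so that the drone avoids maneuvers that would swing the payload out of frame.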

At the moment, the top speed of the system is 4 m/s, but it sounds like rather than increasing the speed of a single payload-swinging drone, the next steps will be to make the overall system more complicated by somehow using multiple drones to cooperatively manage tethered payloads that are too big or heavy for one drone to handle alone.

For more on this, we spoke with Giuseppe Loianno, head of the ARPL.

IEEE Spectrum: We've seen some examples of delivery drones delivering suspended loads. How will this work improve their capabilities?

Giuseppe Loianno: For the first time, we jointly design perception-constrained model predictive control and state estimation approaches to enable the autonomy of a quadrotor with a cable-suspended payload using onboard sensing and computation. The proposed control method guarantees the visibility of the payload in the robot camera as well as respect for the system dynamics and actuator constraints. These are critical design aspects to guarantee safety and resilience for such a complex and delicate task involving transportation of objects.

The additional challenge is that we aim to solve this problem using a minimal sensor suite for autonomous navigation, made up of a single camera and an IMU. This is an ambitious goal since it concurrently involves estimating the load and the vehicle states. Previous approaches leverage GPS or motion capture systems for state estimation and do not consider the perception and physical constraints when solving the problem. We are confident that our solution will contribute to making autonomous delivery a reality in warehouses or in dense urban areas where the GPS signal is currently absent or shadowed.

Will it make a difference to delivery systems that use an actuated cable and only leave the load suspended for the delivery itself?

This is certainly an interesting question. We believe that adding an actuated cable will introduce more disadvantages than benefits. Certainly, an actuated cable can be leveraged to compensate for the cable's swinging motions in windy conditions and/or increase the delivery precision. However, the introduction of additional actuated mechanisms and components comes at the price of an increased system mass and inertia. This will reduce the overall flight time and the vehicle’s agility as well as the system resilience with respect to the transportation task. Finally, active mechanisms are also more difficult to design compared to passive ones.

What's challenging about doing all of this on-vehicle?

There are several challenges in solving this problem on board. First, it is very difficult to concurrently run perception and action on such computationally constrained platforms in real time. Second, this becomes even more challenging if we consider, as in our case, a perception-constrained receding horizon control problem that aims to guarantee the visibility of the payload during the motion, while concurrently respecting all the system's physical and sensing limitations. Finally, it has been challenging to run the entire system at a high rate to fully unleash the system’s agility. We are currently able to reach rates of 500 Hz.

Can your method adapt to loads of varying shapes, sizes, and masses? What about aerodynamics or flying in wind?

Technically, our approach can easily be adapted to varying object sizes and masses. Our previous contributions have already shown the ability to estimate online changes in the vehicle/load configuration and can potentially be used to operate the proposed system in dynamic conditions, where the load’s characteristics are unknown and/or may vary across consecutive flights. This can be useful for both package delivery and warehouse operations, where different types of objects need to be transported or manipulated.

The aerodynamics problem is a great point. Overall, our past work has investigated the aerodynamics of wind disturbances for a single robot without a load. Formulating these problems for the proposed system is challenging and is still an open research question. We have some ideas to approach this problem combining Bayesian estimation techniques with more recent machine learning approaches and we will tackle it in the near future.

What are the limitations on the performance of the system? How fast and agile can it be with a suspended payload?

The limits of the performance are established by the actuating and sensing system. Our approach intrinsically considers both physical and sensing limitations of our system. From a sensing and computation perspective, we believe we are close to the limits with speeds of up to 4 m/s. Faster speeds can potentially introduce motion blur while decreasing the load tracking precision. Moreover, faster motions will also increase the aerodynamic disturbances that we have just mentioned. In the future, modeling these phenomena and incorporating them in the proposed solution can further push the agility.

Your paper talks about extending this approach to multiple vehicles cooperatively transporting a payload. Can you tell us more about that?

We are currently working on a distributed perception and control approach for cooperative transportation. We already have some very exciting results that we will share with you very soon! Overall, we can employ a team of aerial robots to cooperatively transport a payload to increase the payload capacity and endow the system with additional resilience in case of vehicle failures. A cooperative cable-suspended payload transportation system also allows us to concurrently and independently control the load’s position and orientation. This is not possible just using rigid connections. We believe that our approach will have a strong impact in real-world settings for delivery and construction in warehouses and GPS-denied environments such as dense urban areas. Moreover, in post-disaster scenarios, a team of physically interconnected aerial robots can deliver supplies and establish communication in areas where the GPS signal is intermittent or unavailable.

PCMPC: Perception-Constrained Model Predictive Control for Quadrotors with Suspended Loads using a Single Camera and IMU, by Guanrui Li, Alex Tunchez, and Giuseppe Loianno from NYU, will be presented (virtually) at ICRA 2021.

Posted in Human Robots

#438801 This AI Thrashes the Hardest Atari Games ...

Posted on March 4, 2021 by Android

Learning from rewards seems like the simplest thing. I make coffee, I sip coffee, I’m happy. My brain registers “brewing coffee” as an action that leads to a reward.

That’s the guiding insight behind deep reinforcement learning, a family of algorithms that famously smashed most of Atari’s gaming catalog and triumphed over humans in strategy games like Go. Here, an AI “agent” explores the game, trying out different actions and registering ones that let it win.

Except it’s not that simple. “Brewing coffee” isn’t one action; it’s a series of actions spanning several minutes, where you’re only rewarded at the very end. By just tasting the final product, how do you learn to fine-tune grind coarseness, water to coffee ratio, brewing temperature, and a gazillion other factors that result in the reward—tasty, perk-me-up coffee?

That’s the problem with “sparse rewards,” which are ironically very abundant in our messy, complex world. We don’t immediately get feedback from our actions—no video-game-style dings or points for just grinding coffee beans—yet somehow we’re able to learn and perform an entire sequence of arm and hand movements while half-asleep.

This week, researchers from UberAI and OpenAI teamed up to bestow this talent on AI.

The trick is to encourage AI agents to “return” to a previous step, one that’s promising for a winning solution. The agent then keeps a record of that state, reloads it, and branches out again to intentionally explore other solutions that may have been left behind on the first go-around. Video gamers are likely familiar with this idea: live, die, reload a saved point, try something else, repeat for a perfect run-through.

The new family of algorithms, appropriately dubbed “Go-Explore,” smashed notoriously difficult Atari games like Montezuma’s Revenge that were previously unsolvable by its AI predecessors, while trouncing human performance along the way.

It’s not just games and digital fun. In a computer simulation of a robotic arm, the team found that installing Go-Explore as its “brain” allowed it to solve a challenging series of actions when given very sparse rewards. Because the overarching idea is so simple, the authors say, it can be adapted and expanded to other real-world problems, such as drug design or language learning.

Growing Pains
How do you reward an algorithm?

Rewards are very hard to craft, the authors say. Take the problem of asking a robot to go to a fridge. A sparse reward will only give the robot “happy points” if it reaches its destination, which is similar to asking a baby, with no concept of space and danger, to crawl through a potential minefield of toys and other obstacles towards a fridge.

“In practice, reinforcement learning works very well, if you have very rich feedback, if you can tell, ‘hey, this move is good, that move is bad, this move is good, that move is bad,’” said study author Joost Huizinga. However, in situations that offer very little feedback, “rewards can intentionally lead to a dead end. Randomly exploring the space just doesn’t cut it.”

The other extreme is providing denser rewards. In the same robot-to-fridge example, you could frequently reward the bot as it goes along its journey, essentially helping “map out” the exact recipe to success. But that’s troubling as well. Over-holding an AI’s hand could result in an extremely rigid robot that ignores new additions to its path—a pet, for example—leading to dangerous situations. It’s a deceptive AI solution that seems effective in a simple environment, but crashes in the real world.

What we need are AI agents that can tackle both problems, the team said.

Intelligent Exploration
The key is to return to the past.

For AI, motivation usually comes from “exploring new or unusual situations,” said Huizinga. It’s efficient, but comes with significant downsides. For one, the AI agent could prematurely stop going back to promising areas because it thinks it has already found a good solution. For another, it could simply forget a previous decision point because of the mechanics of how it probes the next step in a problem.

For a complex task, the end result is an AI that randomly stumbles around towards a solution while ignoring potentially better ones.

“Detaching from a place that was previously visited after collecting a reward doesn’t work in difficult games, because you might leave out important clues,” Huizinga explained.

Go-Explore solves these problems with a simple principle: first return, then explore. In essence, the algorithm saves different approaches it previously tried and loads promising save points (the ones more likely to lead to victory) to explore further.

Digging a bit deeper, the AI stores screen caps from a game. It then analyzes saved points and groups images that look alike as a potentially promising “save point” to return to. Rinse and repeat. The AI tries to maximize its final score in the game, and updates its save points when it achieves a new record score. Because Atari doesn’t usually allow people to revisit any random point, the team used an emulator, which is a kind of software that mimics the Atari system but with custom abilities such as saving and reloading at any time.
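The return-then-explore loop can be sketched on a toy problem. What follows is a minimal illustration under assumptions, not the authors' code: the corridor environment, the names `go_explore`, `corridor_step`, and `cell_of`, the uniform choice of which save point to reload, and the purely random exploration are all stand-ins. Integer positions play the role of emulator save states, and grouping neighboring positions into one cell stands in for grouping similar-looking screens.

```python
import random

GOAL = 12  # far end of a hypothetical corridor; the only place a reward is given


def corridor_step(pos, action):
    """Deterministic toy environment: move left or right, reward only on reaching GOAL."""
    if pos == GOAL:
        return pos, 0.0  # the goal is terminal
    new_pos = max(0, min(GOAL, pos + action))
    return new_pos, (1.0 if new_pos == GOAL else 0.0)


def go_explore(step, start, cell_of, actions, iterations=2000, explore_steps=10, seed=0):
    """Toy 'first return, then explore' loop over a restorable environment."""
    rng = random.Random(seed)
    # The archive maps each cell to the best (score, state) found in it so far.
    archive = {cell_of(start): (0.0, start)}
    for _ in range(iterations):
        # Return: reload a save point (picked uniformly here, which is an assumption).
        score, state = rng.choice(list(archive.values()))
        # Explore: take random actions starting from the restored state.
        for _ in range(explore_steps):
            state, reward = step(state, rng.choice(actions))
            score += reward
            cell = cell_of(state)
            # Keep a save point for new cells, or replace one when the score is a new record.
            if cell not in archive or score > archive[cell][0]:
                archive[cell] = (score, state)
    return archive


# Group pairs of neighboring positions into one cell.
archive = go_explore(corridor_step, start=0, cell_of=lambda pos: pos // 2, actions=[-1, 1])
best_score, best_state = archive[GOAL // 2]
```

Because every iteration restarts from a stored state instead of from the beginning, progress toward the far end of the corridor is never lost, even though no reward arrives until the goal is reached.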

The trick worked like magic. When pitted against 55 Atari games in the OpenAI Gym, now commonly used to benchmark reinforcement learning algorithms, Go-Explore knocked out state-of-the-art AI competitors over 85 percent of the time.

It also crushed games previously unbeatable by AI. Montezuma’s Revenge, for example, requires you to move Pedro, the blocky protagonist, through a labyrinth of underground temples while evading obstacles such as traps and enemies and gathering jewels. One bad jump could derail the path to the next level. It’s a perfect example of sparse rewards: you need a series of good actions to get to the reward—advancing onward.

Go-Explore didn’t just beat all levels of the game, a first for AI. It also scored higher than any previous record for reinforcement learning algorithms at lower levels while toppling the human world record.

Outside a gaming environment, Go-Explore was also able to boost the performance of a simulated robot arm. While it’s easy for humans to follow high-level guidance like “put the cup on this shelf in a cupboard,” robots often need explicit training—from grasping the cup to recognizing a cupboard, moving towards it while avoiding obstacles, and learning motions to not smash the cup when putting it down.

Here, similar to the real world, the digital robot arm was only rewarded when it placed the cup onto the correct shelf, out of four possible shelves. When pitted against another algorithm, Go-Explore quickly figured out the movements needed to place the cup, while its competitor struggled with even reliably picking the cup up.

Combining Forces
By itself, the “first return, then explore” idea behind Go-Explore is already powerful. The team thinks it can do even better.

One idea is to change the mechanics of save points. Rather than reloading saved states through the emulator, it’s possible to train a neural network to do the same, without needing to relaunch a saved state. It’s a potential way to make the AI even smarter, the team said, because it can “learn” to overcome one obstacle once, instead of solving the same problem again and again. The downside? It’s much more computationally intensive.

Another idea is to combine Go-Explore with an alternative form of learning, called “imitation learning.” Here, an AI observes human behavior and mimics it through a series of actions. Combined with Go-Explore, said study author Adrien Ecoffet, this could make more robust robots capable of handling all the complexity and messiness in the real world.

To the team, the implications go far beyond Go-Explore. The concept of “first return, then explore” seems to be especially powerful, suggesting “it may be a fundamental feature of learning in general.” The team said, “Harnessing these insights…may be essential…to create generally intelligent agents.”

Image Credit: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, and Jeff Clune

Posted in Human Robots

#438769 Will Robots Make Good Friends? ...

Posted on February 26, 2021 by Android

In the 2012 film Robot and Frank, the protagonist, a retired cat burglar named Frank, is suffering the early symptoms of dementia. Concerned and guilty, his son buys him a “home robot” that can talk, do household chores like cooking and cleaning, and remind Frank to take his medicine. It’s a robot the likes of which we’re getting closer to building in the real world.

The film follows Frank, who is initially appalled by the idea of living with a robot, as he gradually begins to see the robot as both functionally useful and socially companionable. The film ends with a clear bond between man and machine, such that Frank is protective of the robot when the pair of them run into trouble.

This is, of course, a fictional story, but it challenges us to explore different kinds of human-to-robot bonds. My recent research on human-robot relationships examines this topic in detail, looking beyond sex robots and robot love affairs to examine that most profound and meaningful of relationships: friendship.

My colleague and I identified some potential risks, like the abandonment of human friends for robotic ones, but we also found several scenarios where robotic companionship can constructively augment people’s lives, leading to friendships that are directly comparable to human-to-human relationships.

Philosophy of Friendship
The robotics philosopher John Danaher sets a very high bar for what friendship means. His starting point is the “true” friendship first described by the Greek philosopher Aristotle, who saw an ideal friendship as premised on mutual good will, admiration, and shared values. In these terms, friendship is about a partnership of equals.

Building a robot that can satisfy Aristotle’s criteria is a substantial technical challenge and is some considerable way off, as Danaher himself admits. Robots that may seem to be getting close, such as Hanson Robotics’ Sophia, base their behavior on a library of pre-prepared responses: a humanoid chatbot, rather than a conversational equal. Anyone who’s had a testing back-and-forth with Alexa or Siri will know AI still has some way to go in this regard.

Aristotle also talked about other forms of “imperfect” friendship, such as “utilitarian” and “pleasure” friendships, which are considered inferior to true friendship because they don’t require symmetrical bonding and are often to one party’s unequal benefit. This form of friendship sets a relatively low bar which some robots, like “sexbots” and robotic pets, clearly already meet.

Artificial Amigos
For some, relating to robots is just a natural extension of relating to other things in our world, like people, pets, and possessions. Psychologists have even observed how people respond naturally and socially towards media artefacts like computers and televisions. Humanoid robots, you’d have thought, are more personable than your home PC.

However, the field of “robot ethics” is far from unanimous on whether we can, or should, develop any form of friendship with robots. For an influential group of UK researchers who charted a set of “ethical principles of robotics,” human-robot “companionship” is an oxymoron, and to market robots as having social capabilities is dishonest and should be treated with caution, if not alarm. For these researchers, wasting emotional energy on entities that can only simulate emotions will always be less rewarding than forming human-to-human bonds.

But people are already developing bonds with basic robots, like vacuum-cleaning and lawn-trimming machines that can be bought for less than the price of a dishwasher. A surprisingly large number of people give these robots pet names—something they don’t do with their dishwashers. Some even take their cleaning robots on holiday.

Other evidence of emotional bonds with robots includes the Shinto blessing ceremony for Sony Aibo robot dogs that were dismantled for spare parts, and the squad of US troops who fired a 21-gun salute for, and awarded medals to, a bomb-disposal robot named “Boomer” after it was destroyed in action.

These stories, and the psychological evidence we have so far, make clear that we can extend emotional connections to things that are very different to us, even when we know they are manufactured and pre-programmed. But do those connections constitute a friendship comparable to that shared between humans?

True Friendship?
A colleague and I recently reviewed the extensive literature on human-to-human relationships to try to understand how, and if, the concepts we found could apply to bonds we might form with robots. We found evidence that many coveted human-to-human friendships do not in fact live up to Aristotle’s ideal.

We noted a wide range of human-to-human relationships, from relatives and lovers to parents, carers, service providers, and the intense (but unfortunately one-way) relationships we maintain with our celebrity heroes. Few of these relationships could be described as completely equal and, crucially, they are all destined to evolve over time.

All this means that expecting robots to form Aristotelian bonds with us is to set a standard even human relationships fail to live up to. We also observed forms of social connectedness that are rewarding and satisfying and yet are far from the ideal friendship outlined by the Greek philosopher.

We know that social interaction is rewarding in its own right, and something that, as social mammals, humans have a strong need for. It seems probable that relationships with robots could help to address the deep-seated urge we all feel for social connection—like providing physical comfort, emotional support, and enjoyable social exchanges—currently provided by other humans.

Our paper also discussed some potential risks. These arise particularly in settings where interaction with a robot could come to replace interaction with people, or where people are denied a choice as to whether they interact with a person or a robot—in a care setting, for instance.

These are important concerns, but they’re possibilities and not inevitabilities. In the literature we reviewed we actually found evidence of the opposite effect: robots acting to scaffold social interactions with others, acting as ice-breakers in groups, and helping people to improve their social skills or to boost their self-esteem.

It appears likely that, as time progresses, many of us will simply follow Frank’s path towards acceptance: scoffing at first, before settling into the idea that robots can make surprisingly good companions. Our research suggests that’s already happening—though perhaps not in a way of which Aristotle would have approved.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Image Credit: Andy Kelly on Unsplash

Posted in Human Robots
