Tag Archives: system

#438998 Foam Sword Fencing With a PR2 Is the ...

Most of what we cover in the Human Robot Interaction (HRI) space involves collaboration, because collaborative interactions tend to be productive, positive, and happy. Yay! But sometimes, collaboration is not what you want. Sometimes, you want competition.

Competition between humans and robots doesn’t have to be a bad thing, in the same way that competition between humans and humans doesn’t have to be a bad thing. There are all kinds of scenarios in which humans respond favorably to competition, and exercise is an obvious example.

Studies have shown that humans can perform significantly better when they’re exercising competitively as opposed to when they’re exercising individually. And while researchers have looked at whether robots can be effective exercise coaches (they can be), there hasn’t been a lot of exploration of physical robots actually competing directly with humans. Roboticists from the University of Washington decided to put adversarial exercise robots to the test, and they did it by giving a PR2 a giant foam sword. Awesome.

This exercise game matches a PR2 with a human in a zero-sum competitive fencing game with foam swords. Expecting the PR2 to actually be a competitive fencer isn’t realistic because, like, it’s a PR2. Instead, the objective of the game is for the human to keep their foam sword within a target area near the PR2 while also avoiding the PR2’s low-key sword-waving. A VR system allows the user to see the target area, while also giving the system a way to track the user’s location and pose.

Looks like fun, right? It’s also exercise, at least in the sense that the user’s heart rate nearly doubled over their resting heart rate during the highest scoring game. This is super preliminary research, though, and there’s still a lot of work to do. It’ll be important to figure out how skilled a competitive robot should be in order to keep providing a reasonable challenge to a human who gradually improves over time, while also being careful to avoid generating any negative reactions. For example, the robot should probably not beat you over the head with its foam sword, even if that’s a highly effective strategy for getting your heart rate up.

Competitive Physical Human-Robot Game Play, by Boling Yang, Xiangyu Xie, Golnaz Habibi, and Joshua R. Smith from the University of Washington and MIT, was presented as a late-breaking report at the ACM/IEEE International Conference on Human-Robot Interaction.

Posted in Human Robots

#438982 Quantum Computing and Reinforcement ...

Deep reinforcement learning is having a superstar moment.

Powering smarter robots. Simulating human neural networks. Trouncing physicians at medical diagnoses and crushing humanity’s best gamers at Go and Atari. While far from achieving the flexible, quick thinking that comes naturally to humans, this powerful machine learning idea seems unstoppable as a harbinger of better thinking machines.

Except there’s a massive roadblock: these algorithms take forever to run. Because they are based on trial and error, a reinforcement learning AI “agent” only learns after being rewarded for its correct decisions. For complex problems, the time it takes an AI agent to try and fail to learn a solution can quickly become untenable.

But what if you could try multiple solutions at once?

This week, an international collaboration led by Dr. Philip Walther at the University of Vienna took the “classic” concept of reinforcement learning and gave it a quantum spin. They designed a hybrid AI that relies on both quantum and run-of-the-mill classic computing, and showed that—thanks to quantum quirkiness—it could simultaneously screen a handful of different ways to solve a problem.

The result is a reinforcement learning AI that learned over 60 percent faster than its non-quantum-enabled peers. This is one of the first tests that shows adding quantum computing can speed up the actual learning process of an AI agent, the authors explained.

Although only challenged with a “toy problem” in the study, the hybrid AI, once scaled, could impact real-world problems such as building an efficient quantum internet. The setup “could readily be integrated within future large-scale quantum communication networks,” the authors wrote.

The Bottleneck
Learning from trial and error comes intuitively to our brains.

Say you’re trying to navigate a new convoluted campground without a map. The goal is to get from the communal bathroom back to your campsite. Dead ends and confusing loops abound. We tackle the problem by deciding to turn either left or right at every branch in the road. One will get us closer to the goal; the other leads to a half hour of walking in circles. Eventually, our brain chemistry rewards correct decisions, so we gradually learn the correct route. (If you’re wondering…yeah, true story.)

Reinforcement learning AI agents operate in a similar trial-and-error way. As a problem becomes more complex, the number of trials, and the time each one takes, skyrockets.
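To make the campground analogy concrete, here is a minimal sketch of trial-and-error learning, assuming a toy task of our own invention: a chain of left/right junctions with a single reward at the very end. It uses plain tabular Q-learning, and the function name `train_route_finder` and all parameter values are illustrative assumptions, not anything taken from the study.

```python
import random

def train_route_finder(num_junctions, episodes=2000, epsilon=0.2,
                       alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a chain of left/right junctions (toy example)."""
    rng = random.Random(seed)
    # Hidden correct turn (0 = left, 1 = right) at each junction.
    route = [rng.randrange(2) for _ in range(num_junctions)]
    q = [[0.0, 0.0] for _ in range(num_junctions)]
    first_success = None

    for episode in range(1, episodes + 1):
        state = 0
        while True:
            # Explore at random sometimes; otherwise take the best-known turn,
            # guessing at random when both turns look equally good.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            if action != route[state]:
                # Wrong turn: the trial ends with no reward.
                q[state][action] += alpha * (0.0 - q[state][action])
                break
            if state == num_junctions - 1:
                # Reached the campsite: the only reward in the whole task.
                q[state][action] += alpha * (1.0 - q[state][action])
                if first_success is None:
                    first_success = episode
                break
            # Correct turn but no reward yet: bootstrap from the next junction.
            target = gamma * max(q[state + 1])
            q[state][action] += alpha * (target - q[state][action])
            state += 1

    learned = [0 if left > right else 1 for left, right in q]
    return learned, route, first_success

learned, route, first_success = train_route_finder(num_junctions=4)
print("learned route matches hidden route:", learned == route)
print("first rewarded trial:", first_success)
```

Before the first reward arrives, the agent in this sketch is guessing blindly, so it needs every one of the N turns to come out right by chance. The expected wait for that first reward is therefore about 2^N trials, which is one simple way to see why harder problems get so expensive.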

“Even in a moderately realistic environment, it may simply take too long to rationally respond to a given situation,” explained study author Dr. Hans Briegel at the Universität Innsbruck in Austria, who previously led efforts to speed up AI decision-making using quantum mechanics. If there’s pressure that allows “only a certain time for a response, an agent may then be unable to cope with the situation and to learn at all,” he wrote.

Many attempts have tried speeding up reinforcement learning. Giving the AI agent a short-term “memory.” Tapping into neuromorphic computing, which better resembles the brain. In 2014, Briegel and colleagues showed that a “quantum brain” of sorts can help propel an AI agent’s decision-making process after learning. But speeding up the learning process itself has eluded our best attempts.

The Hybrid AI
The new study went straight for that previously untenable jugular.

The team’s key insight was to tap into the best of both worlds—quantum and classical computing. Rather than building an entire reinforcement learning system using quantum mechanics, they turned to a hybrid approach that could prove to be more practical. Here, the AI agent uses quantum weirdness as it’s trying out new approaches—the “trial” in trial and error. The system then passes the baton to a classical computer to give the AI its reward—or not—based on its performance.

At the heart of the quantum “trial” process is a quirk called superposition. Stay with me. Our computers store information in bits, each of which can be in only one of two states: 0 or 1. Quantum mechanics is far weirder, in that a quantum bit, here carried by a photon (a particle of light), can simultaneously be both 0 and 1, with a slightly different probability of “leaning towards” one or the other.

This noncommittal oddity is part of what makes quantum computing so powerful. Take our reinforcement learning example of navigating a new campsite. In our classic world, we, and our AI, need to decide between turning left or right at an intersection. In a quantum setup, however, the AI can (in a sense) turn left and right at the same time. So when searching for the correct path back to home base, the quantum system has a leg up in that it can simultaneously explore multiple routes, making it far faster than conventional, consecutive trial and error.

“As a consequence, an agent that can explore its environment in superposition will learn significantly faster than its classical counterpart,” said Briegel.
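The article does not spell out the quantum routine involved, so the following is only an illustration of the general idea, using a textbook technique (Grover-style amplitude amplification) simulated on an ordinary computer. The function name and the framing of "routes" as list indices are assumptions made for this sketch.

```python
import math

def amplified_success_probability(num_routes, rewarded_route, iterations):
    """Classically simulate Grover-style amplitude amplification.

    Starts from an equal superposition over all routes and returns the
    probability of measuring the rewarded route after the given number
    of amplification rounds.
    """
    amplitudes = [1 / math.sqrt(num_routes)] * num_routes
    for _ in range(iterations):
        # Oracle step: flip the sign of the rewarded route's amplitude.
        amplitudes[rewarded_route] = -amplitudes[rewarded_route]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amplitudes) / num_routes
        amplitudes = [2 * mean - a for a in amplitudes]
    return amplitudes[rewarded_route] ** 2

# Four candidate routes: a blind classical guess succeeds 1 time in 4,
# while one amplification round makes the rewarded route certain.
print(amplified_success_probability(4, rewarded_route=2, iterations=0))
print(amplified_success_probability(4, rewarded_route=2, iterations=1))
```

The point of the sketch is only that a superposition over all options, plus interference, can raise the odds of landing on a rewarded option in fewer tries than guessing one option at a time.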

It’s not all theory. To test out their idea, the team turned to a programmable chip called a nanophotonic processor. Think of it as a CPU-like computer chip, but it processes particles of light—photons—rather than electricity. These light-powered chips have been a long time in the making. Back in 2017, for example, a team from MIT built a fully optical neural network into an optical chip to bolster deep learning.

The chips aren’t all that exotic. Nanophotonic processors act kind of like our eyeglasses, which can carry out complex calculations that transform light that passes through them. In the glasses case, they let people see better. For a light-based computer chip, it allows computation. Rather than using electrical cables, the chips use “wave guides” to shuttle photons and perform calculations based on their interactions.

The “error” or “reward” part of the new hardware comes from a classical computer. The nanophotonic processor is coupled to a traditional computer, where the latter provides the quantum circuit with feedback—that is, whether to reward a solution or not. This setup, the team explains, allows them to more objectively judge any speed-ups in learning in real time.

In this way, a hybrid reinforcement learning agent alternates between quantum and classical computing, trying out ideas in wibbly-wobbly “multiverse” land while obtaining feedback in grounded, classic physics “normality.”

A Quantum Boost
In simulations using 10,000 AI agents and actual experimental data from 165 trials, the hybrid approach, when challenged with a more complex problem, showed a clear leg up.

The key word is “complex.” The team found that if an AI agent has a high chance of figuring out the solution anyway—as for a simple problem—then classical computing works pretty well. The quantum advantage blossoms when the task becomes more complex or difficult, allowing quantum mechanics to fully flex its superposition muscles. For these problems, the hybrid AI was 63 percent faster at learning a solution compared to traditional reinforcement learning, decreasing its learning effort from 270 guesses to 100.
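As a quick check on the figures quoted above, the 63 percent number matches the drop in guesses from 270 to 100. The variable names below are ours; only the two counts come from the article.

```python
classical_guesses = 270  # figure quoted in the article
hybrid_guesses = 100     # figure quoted in the article

# Fraction of guesses saved by the hybrid agent.
reduction = (classical_guesses - hybrid_guesses) / classical_guesses
# Equivalent speed-up factor.
speedup = classical_guesses / hybrid_guesses

print(f"Reduction in guesses: {reduction:.0%}")  # 63%
print(f"Speed-up factor: {speedup:.1f}x")        # 2.7x
```

Read this way, "63 percent faster" means the hybrid agent needed about 63 percent fewer guesses, or roughly 2.7 times fewer.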

Now that scientists have shown a quantum boost for reinforcement learning speeds, the race for next-generation computing is heating up further. Photonics hardware required for long-range light-based communications is rapidly shrinking, while improving signal quality. The partial-quantum setup could “aid specifically in problems where frequent search is needed, for example, network routing problems” that are prevalent in keeping the internet running smoothly, the authors wrote. With a quantum boost, reinforcement learning may be able to tackle far more complex, real-world problems than is currently possible.

“We are just at the beginning of understanding the possibilities of quantum artificial intelligence,” said lead author Walther.

Image Credit: Oleg Gamulinskiy from Pixabay

#438925 Nanophotonics Could Be the ‘Dark ...

The race to build the first practical quantum computers looks like a two-horse contest between machines built from superconducting qubits and those that use trapped ions. But new research suggests a third contender—machines based on optical technology—could sneak up on the inside.

The most advanced quantum computers today are the ones built by Google and IBM, which rely on superconducting circuits to generate the qubits that form the basis of quantum calculations. They are now able to string together tens of qubits, and Google claims, controversially, that its machines have achieved quantum supremacy: the ability to carry out a computation beyond the reach of normal computers.

Recently this approach has been challenged by a wave of companies looking to use trapped ion qubits, which are more stable and less error-prone than superconducting ones. While these devices are less developed, engineering giant Honeywell has already released a machine with 10 qubits, which it says is more powerful than a machine made of a greater number of superconducting qubits.

But despite this progress, both of these approaches have some major drawbacks. They require specialized fabrication methods and incredibly precise control mechanisms, and they need to be cooled to close to absolute zero to protect the qubits from any outside interference.

That’s why researchers at Canadian quantum computing hardware and software startup Xanadu are backing an alternative quantum computing approach based on optics, which was long discounted as impractical. In a paper published last week in Nature, they unveiled the first fully programmable and scalable optical chip that can run quantum algorithms. Not only does the system run at room temperature, but the company says it could scale to millions of qubits.

The idea isn’t exactly new. As Chris Lee notes in Ars Technica, people have been experimenting with optical approaches to quantum computing for decades, because encoding information in photons’ quantum states and manipulating those states is relatively easy. The biggest problem was that optical circuits were very large and not readily programmable, which meant you had to build a new computer for every new problem you wanted to solve.

That started to change thanks to the growing maturity of photonic integrated circuits. While early experiments with optical computing involved complex table-top arrangements of lasers, lenses, and detectors, today it’s possible to buy silicon chips not dissimilar to electronic ones that feature hundreds of tiny optical components.

In recent years, the reliability and performance of these devices has improved dramatically, and they’re now regularly used by the telecommunications industry. Some companies believe they could be the future of artificial intelligence too.

This allowed the Xanadu researchers to design a silicon chip that implements a complex optical network made up of beam splitters, waveguides, and devices called interferometers that cause light sources to interact with each other.

The chip can generate and manipulate up to eight qubits, but unlike conventional qubits, which can simultaneously be in two states, these qubits can be in any configuration of three states, which means they can carry more information.

Once the light has travelled through the network, it is then fed out to cutting-edge photon-counting detectors that provide the result. This is one of the potential limitations of the system, because currently these detectors need to be cryogenically cooled, although the rest of the chip does not.

But most importantly, the chip is easily re-programmable, which allows it to tackle a variety of problems. The computation can be controlled by adjusting the settings of these interferometers, but the researchers have also developed a software platform that hides the physical complexity from users and allows them to program it using fairly conventional code.
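To give a feel for what "adjusting the settings of these interferometers" can mean, here is a minimal sketch of a single idealized Mach-Zehnder interferometer: two 50:50 beam splitters with an adjustable phase shifter between them. It assumes lossless components and light entering one input only, and it is not Xanadu's software platform; the function name `mach_zehnder_output` is made up for this example.

```python
import cmath
import math

def mach_zehnder_output(phase):
    """Return the fraction of light leaving each output port.

    Models beam splitter -> phase shift on the top arm -> beam splitter,
    with all of the light entering through the top input.
    """
    scale = 1 / math.sqrt(2)

    def beam_splitter(top, bottom):
        # Ideal 50:50 splitter: reflection picks up a factor of i.
        return scale * (top + 1j * bottom), scale * (1j * top + bottom)

    top, bottom = beam_splitter(1 + 0j, 0j)
    top *= cmath.exp(1j * phase)  # the programmable "setting"
    top, bottom = beam_splitter(top, bottom)
    return abs(top) ** 2, abs(bottom) ** 2

for phase in (0.0, math.pi / 2, math.pi):
    top_out, bottom_out = mach_zehnder_output(phase)
    print(f"phase={phase:.2f}  top={top_out:.2f}  bottom={bottom_out:.2f}")
```

In this idealized model the output fractions work out to sin²(phase/2) and cos²(phase/2), so turning one phase dial steers the light smoothly from one output to the other. A programmable chip chains many such elements together.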

The company announced that its chips were available on the cloud in September of 2020, but the Nature paper is the first peer-reviewed test of their system. The researchers verified that the computations being done were genuinely quantum mechanical in nature, but they also implemented two more practical algorithms: one for simulating molecules and the other for judging how similar two graphs are, which has applications in a variety of pattern recognition problems.

In an accompanying opinion piece, Ulrik Andersen from the Technical University of Denmark says the quality of the qubits needs to be improved considerably and photon losses reduced if the technology is ever to scale to practical problems. But, he says, this breakthrough suggests optical approaches “could turn out to be the dark horse of quantum computing.”

Image Credit: Shahadat Rahman on Unsplash

#438801 This AI Thrashes the Hardest Atari Games ...

Learning from rewards seems like the simplest thing. I make coffee, I sip coffee, I’m happy. My brain registers “brewing coffee” as an action that leads to a reward.

That’s the guiding insight behind deep reinforcement learning, a family of algorithms that famously smashed most of Atari’s gaming catalog and triumphed over humans in strategy games like Go. Here, an AI “agent” explores the game, trying out different actions and registering ones that let it win.

Except it’s not that simple. “Brewing coffee” isn’t one action; it’s a series of actions spanning several minutes, where you’re only rewarded at the very end. By just tasting the final product, how do you learn to fine-tune grind coarseness, water to coffee ratio, brewing temperature, and a gazillion other factors that result in the reward—tasty, perk-me-up coffee?

That’s the problem with “sparse rewards,” which are ironically very abundant in our messy, complex world. We don’t immediately get feedback from our actions—no video-game-style dings or points for just grinding coffee beans—yet somehow we’re able to learn and perform an entire sequence of arm and hand movements while half-asleep.

This week, researchers from UberAI and OpenAI teamed up to bestow this talent on AI.

The trick is to encourage AI agents to “return” to a previous step, one that’s promising for a winning solution. The agent then keeps a record of that state, reloads it, and branches out again to intentionally explore other solutions that may have been left behind on the first go-around. Video gamers are likely familiar with this idea: live, die, reload a saved point, try something else, repeat for a perfect run-through.

The new family of algorithms, appropriately dubbed “Go-Explore,” smashed notoriously difficult Atari games like Montezuma’s Revenge that were previously unsolvable by its AI predecessors, while trouncing human performance along the way.

It’s not just games and digital fun. In a computer simulation of a robotic arm, the team found that installing Go-Explore as its “brain” allowed it to solve a challenging series of actions when given very sparse rewards. Because the overarching idea is so simple, the authors say, it can be adapted and expanded to other real-world problems, such as drug design or language learning.

Growing Pains
How do you reward an algorithm?

Rewards are very hard to craft, the authors say. Take the problem of asking a robot to go to a fridge. A sparse reward will only give the robot “happy points” if it reaches its destination, which is similar to asking a baby, with no concept of space and danger, to crawl through a potential minefield of toys and other obstacles towards a fridge.

“In practice, reinforcement learning works very well, if you have very rich feedback, if you can tell, ‘hey, this move is good, that move is bad, this move is good, that move is bad,’” said study author Joost Huizinga. However, in situations that offer very little feedback, “rewards can intentionally lead to a dead end. Randomly exploring the space just doesn’t cut it.”

The other extreme is providing denser rewards. In the same robot-to-fridge example, you could frequently reward the bot as it goes along its journey, essentially helping “map out” the exact recipe to success. But that’s troubling as well. Over-holding an AI’s hand could result in an extremely rigid robot that ignores new additions to its path—a pet, for example—leading to dangerous situations. It’s a deceptive AI solution that seems effective in a simple environment, but crashes in the real world.
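The two extremes described above can be written down as two reward functions for the robot-to-fridge example. This is a sketch under simple assumptions: the robot lives on a grid, positions are `(x, y)` tuples, and the dense reward is just the negative grid distance to the fridge. None of these choices come from the study.

```python
def sparse_reward(position, fridge):
    """Pay out only when the robot is actually at the fridge."""
    return 1.0 if position == fridge else 0.0

def dense_reward(position, fridge):
    """Give feedback at every step: closer to the fridge is better."""
    distance = abs(position[0] - fridge[0]) + abs(position[1] - fridge[1])
    return -distance

fridge = (3, 2)
path = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
for position in path:
    print(position, sparse_reward(position, fridge), dense_reward(position, fridge))
```

Along this path the sparse reward stays at zero until the last step, while the dense reward improves with every move. That hand-holding is exactly what makes the dense version easy to learn from and also easy to over-fit to one fixed route.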

What we need are AI agents that can tackle both problems, the team said.

Intelligent Exploration
The key is to return to the past.

For AI, motivation usually comes from “exploring new or unusual situations,” said Huizinga. It’s efficient, but comes with significant downsides. For one, the AI agent could prematurely stop going back to promising areas because it thinks it has already found a good solution. For another, it could simply forget a previous decision point because of the mechanics of how it probes the next step in a problem.

For a complex task, the end result is an AI that randomly stumbles around towards a solution while ignoring potentially better ones.

“Detaching from a place that was previously visited after collecting a reward doesn’t work in difficult games, because you might leave out important clues,” Huizinga explained.

Go-Explore solves these problems with a simple principle: first return, then explore. In essence, the algorithm saves different approaches it previously tried and loads promising save points (ones more likely to lead to victory) to explore further.

Digging a bit deeper, the AI stores screen caps from a game. It then analyzes saved points and groups images that look alike as a potential promising “save point” to return to. Rinse and repeat. The AI tries to maximize its final score in the game, and updates its save points when it achieves a new record score. Because Atari doesn’t usually allow people to revisit any random point, the team used an emulator, which is a kind of software that mimics the Atari system but with custom abilities such as saving and reloading at any time.
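The "first return, then explore" loop described above can be sketched in a few lines. This is a simplified illustration rather than the authors' implementation: the `Corridor` game, the use of the agent's position as a stand-in for "groups of similar-looking screens," and the rule of favoring rarely chosen save points are all assumptions made for the example.

```python
import random

class Corridor:
    """Toy deterministic game: walk to the far end; reward only there."""

    def __init__(self, length):
        self.length = length
        self.position = 0
        self.score = 0

    def save(self):
        return (self.position, self.score)

    def load(self, snapshot):
        self.position, self.score = snapshot

    def step(self, action):  # action is -1 (left) or +1 (right)
        self.position = min(max(self.position + action, 0), self.length - 1)
        if self.position == self.length - 1:
            self.score = 1

    def cell(self):
        # Stand-in for grouping similar screen captures into one save point.
        return self.position

def go_explore(length=30, iterations=1000, explore_steps=5, seed=0):
    rng = random.Random(seed)
    env = Corridor(length)
    archive = {env.cell(): {"snapshot": env.save(), "score": env.score, "visits": 0}}

    for _ in range(iterations):
        # 1. Select a save point, favoring ones chosen less often (assumption).
        cells = list(archive)
        weights = [1.0 / (1 + archive[c]["visits"]) for c in cells]
        chosen = rng.choices(cells, weights=weights)[0]
        archive[chosen]["visits"] += 1

        # 2. Return: reload the saved state instead of replaying from scratch.
        env.load(archive[chosen]["snapshot"])

        # 3. Explore from there with random actions.
        for _ in range(explore_steps):
            env.step(rng.choice((-1, 1)))
            cell = env.cell()
            known = archive.get(cell)
            # Keep new save points, and replace old ones on a better score.
            if known is None or env.score > known["score"]:
                archive[cell] = {"snapshot": env.save(), "score": env.score, "visits": 0}
    return archive

archive = go_explore()
print("save points discovered:", len(archive))
print("reached the goal:", (30 - 1) in archive)
```

The save and load methods play the role of the emulator's save-and-reload feature. Because every exploration burst starts from a previously reached save point, progress toward the far end is never thrown away, even though no reward appears until the final cell.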

The trick worked like magic. When pitted against 55 Atari games in the OpenAI Gym, now commonly used to benchmark reinforcement learning algorithms, Go-Explore knocked out state-of-the-art AI competitors over 85 percent of the time.

It also crushed games previously unbeatable by AI. Montezuma’s Revenge, for example, requires you to move Pedro, the blocky protagonist, through a labyrinth of underground temples while evading obstacles such as traps and enemies and gathering jewels. One bad jump could derail the path to the next level. It’s a perfect example of sparse rewards: you need a series of good actions to get to the reward—advancing onward.

Go-Explore didn’t just beat all levels of the game, a first for AI. It also scored higher than any previous record for reinforcement learning algorithms at lower levels while toppling the human world record.

Outside a gaming environment, Go-Explore was also able to boost the performance of a simulated robot arm. While it’s easy for humans to follow high-level guidance like “put the cup on this shelf in a cupboard,” robots often need explicit training—from grasping the cup to recognizing a cupboard, moving towards it while avoiding obstacles, and learning motions to not smash the cup when putting it down.

Here, similar to the real world, the digital robot arm was only rewarded when it placed the cup onto the correct shelf, out of four possible shelves. When pitted against another algorithm, Go-Explore quickly figured out the movements needed to place the cup, while its competitor struggled with even reliably picking the cup up.

Combining Forces
By itself, the “first return, then explore” idea behind Go-Explore is already powerful. The team thinks it can do even better.

One idea is to change the mechanics of save points. Rather than reloading saved states through the emulator, it’s possible to train a neural network to do the same, without needing to relaunch a saved state. It’s a potential way to make the AI even smarter, the team said, because it can “learn” to overcome one obstacle once, instead of solving the same problem again and again. The downside? It’s much more computationally intensive.

Another idea is to combine Go-Explore with an alternative form of learning, called “imitation learning.” Here, an AI observes human behavior and mimics it through a series of actions. Combined with Go-Explore, said study author Adrien Ecoffet, this could make more robust robots capable of handling all the complexity and messiness in the real world.

To the team, the implications go far beyond Go-Explore. The concept of “first return, then explore” seems to be especially powerful, suggesting “it may be a fundamental feature of learning in general.” The team said, “Harnessing these insights…may be essential…to create generally intelligent agents.”

Image Credit: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, and Jeff Clune

#438785 Video Friday: A Blimp For Your Cat

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):

HRI 2021 – March 8-11, 2021 – [Online Conference]
RoboSoft 2021 – April 12-16, 2021 – [Online Conference]
ICRA 2021 – May 30-June 5, 2021 – Xi'an, China

Let us know if you have suggestions for next week, and enjoy today's videos.

Shiny robotic cat toy blimp!

I am pretty sure this is Google Translate getting things wrong, but the About page mentions that the blimp will “take you to your destination after appearing in the death of God.”

[ NTT DoCoMo ] via [ RobotStart ]

If you have yet to see this real-time video of Perseverance landing on Mars, drop everything and watch it.

During the press conference, someone commented that this is the first time anyone on the team who designed and built this system has ever seen it in operation, since it could only be tested at the component scale on Earth. This landing system has blown my mind since Curiosity.

Here's a better look at where Percy ended up:

[ NASA ]

The fact that Digit can just walk up and down wet, slippery, muddy hills without breaking a sweat is (still) astonishing.

[ Agility Robotics ]

SkyMul wants drones to take over the task of tying rebar, which looks like just the sort of thing we'd rather robots be doing so that we don't have to:

The tech certainly looks promising, and SkyMul says that they're looking for some additional support to bring things to the pilot stage.

[ SkyMul ]

Thanks Eohan!

Flatcat is a pet-like, playful robot that reacts to touch. Flatcat feels everything exactly: Cuddle with it, romp around with it, or just watch it do weird things of its own accord. We are sure that flatcat will amaze you, like us, and caress your soul.

I don't totally understand it, but I want it anyway.

[ Flatcat ]

Thanks Oswald!

This is how I would have a romantic dinner date if I couldn't get together in person. Herman the UR3 and an OptiTrack system let me remotely make a romantic meal!

[ Dave's Armoury ]

Here, we propose a novel design of deformable propellers inspired by dragonfly wings. The structure of these propellers includes a flexible segment similar to the nodus on a dragonfly wing. This flexible segment can bend, twist and even fold upon collision, absorbing force upon impact and protecting the propeller from damage.

[ Paper ]

Thanks Van!

In the 1970s, the CIA created the world's first miniaturized unmanned aerial vehicle, or UAV, which was intended to be a clandestine listening device. The Insectothopter was never deployed operationally, but was still revolutionary for its time.

It may never have been deployed (not that they'll admit to, anyway), but it was definitely operational and could fly controllably.

[ CIA ]

Research labs are starting to get Digits, which means we're going to get a much better idea of what its limitations are.

[ Ohio State ]

This video shows the latest achievements for LOLA walking on undetected uneven terrain. The robot is technically blind, not using any camera-based or prior information on the terrain.

[ TUM ]

We define “robotic contact juggling” to be the purposeful control of the motion of a three-dimensional smooth object as it rolls freely on a motion-controlled robot manipulator, or “hand.” While specific examples of robotic contact juggling have been studied before, in this paper we provide the first general formulation and solution method for the case of an arbitrary smooth object in single-point rolling contact on an arbitrary smooth hand.

[ Paper ]

Thanks Fan!

A couple of new cobots from ABB, designed to work safely around humans.

[ ABB ]

Thanks Fan!

It's worth watching at least a little bit of Adam Savage testing Spot's new arm, because we get to see Spot try, fail, and eventually succeed at an autonomous door-opening behavior at the 10 minute mark.

[ Tested ]

SVR discusses diversity with guest speakers Dr. Michelle Johnson from the GRASP Lab at UPenn; Dr. Ariel Anders from Women in Robotics and first technical hire at Robust.ai; Alka Roy from The Responsible Innovation Project; and Kenechukwu C. Mbanesi and Kenya Andrews from Black in Robotics. The discussion here is moderated by Dr. Ken Goldberg (artist, roboticist, and Director of the CITRIS People and Robots Lab) and Andra Keay from Silicon Valley Robotics.

[ SVR ]

RAS presents a Soft Robotics Debate on Bioinspired vs. Biohybrid Design.

In this debate, we will bring together experts in Bioinspiration and Biohybrid design to discuss the necessary steps to make more competent soft robots. We will try to answer whether bioinspired research should focus more on developing new bioinspired material and structures or on the integration of living and artificial structures in biohybrid designs.

[ RAS SoRo ]

IFRR presents a Colloquium on Human Robot Interaction.

Across many application domains, robots are expected to work in human environments, side by side with people. The users will vary substantially in background, training, physical and cognitive abilities, and readiness to adopt technology. Robotic products are expected to not only be intuitive, easy to use, and responsive to the needs and states of their users, but they must also be designed with these differences in mind, making human-robot interaction (HRI) a key area of research.

[ IFRR ]

Vijay Kumar, Nemirovsky Family Dean and Professor at Penn Engineering, gives an introduction to ENIAC day and David Patterson, Pardee Professor of Computer Science, Emeritus at the University of California at Berkeley, speaks about the legacy of the ENIAC and its impact on computer architecture today. This video is comprised of lectures one and two of nine total lectures in the ENIAC Day series.

There are more interesting ENIAC videos at the link below, but we'll highlight this particular one, about the women of the ENIAC, also known as the First Programmers.

[ ENIAC Day ]
