Tag Archives: Human Behavior

#439023 In ‘Klara and the Sun,’ We Glimpse ...

In a store in the center of an unnamed city, humanoid robots are displayed alongside housewares and magazines. They watch the fast-moving world outside the window, anxiously awaiting the arrival of customers who might buy them and take them home. Among them is Klara, a particularly astute robot who loves the sun and wants to learn as much as possible about humans and the world they live in.

So begins Kazuo Ishiguro’s new novel Klara and the Sun, published earlier this month. The book, told from Klara’s perspective, portrays an eerie future society in which intelligent machines and other advanced technologies have been integrated into daily life, but not everyone is happy about it.

Technological unemployment, the progress of artificial intelligence, inequality, the safety and ethics of gene editing, increasing loneliness and isolation—all of which we’re grappling with today—show up in Ishiguro’s world. It’s like he hit a fast-forward button, mirroring back to us how things might play out if we don’t approach these technologies with caution and foresight.

The wealthy genetically edit or “lift” their children to set them up for success, while the poor have to make do with the regular old brains and bodies bequeathed them by evolution. Lifted and unlifted kids generally don’t mix, and this is just one of many sinister delineations between a new breed of haves and have-nots.

There’s anger about robots’ steady infiltration into everyday life, and questions about how similar their rights should be to those of humans. “First they take the jobs. Then they take the seats at the theater?” one woman fumes.

References to “changes” and “substitutions” allude to an economy where automation has eliminated millions of jobs. While “post-employed” people squat in abandoned buildings and fringe communities arm themselves in preparation for conflict, those whose livelihoods haven’t been destroyed can afford to have live-in housekeepers and buy Artificial Friends (or AFs) for their lonely children.

“The old traditional model that we still live with now—where most of us can get some kind of paid work in exchange for our services or the goods we make—has broken down,” Ishiguro said in a podcast discussion of the novel. “We’re not talking just about the difference between rich and poor getting bigger. We’re talking about a gap appearing between people who participate in society in an obvious way and people who do not.”

He has a point; as much as techno-optimists claim that the economic changes brought by automation and AI will give us all more free time, let us work less, and devote time to our passion projects, how would that actually play out? What would millions of “post-employed” people receiving basic income actually do with their time and energy?

In the novel, we don’t get much of a glimpse of this side of the equation, but we do see how the wealthy live. After a long wait, just as the store manager seems ready to give up on selling her, Klara is chosen by a 14-year-old girl named Josie, the daughter of a woman who wears “high-rank clothes” and lives in a large, sunny home outside the city. Cheerful and kind, Josie suffers from an unspecified illness that periodically flares up and leaves her confined to her bed for days at a time.

Her life seems somewhat bleak, the need for an AF clear. In this future world, the children of the wealthy no longer go to school together, instead studying alone at home on their digital devices. “Interaction meetings” are set up for them to learn to socialize, their parents carefully eavesdropping from the next room and trying not to intervene when there’s conflict or hurt feelings.

Klara does her best to be a friend, aide, and confidante to Josie while continuing to learn about the world around her and decode the mysteries of human behavior. We surmise that she was programmed with a basic ability to understand emotions, which evolves along with her other types of intelligence. “I believe I have many feelings. The more I observe, the more feelings become available to me,” she explains to one character.

Ishiguro does an excellent job of representing Klara’s mind: a blend of pre-determined programming, observation, and continuous learning. Her narration has qualities both robotic and human; we can tell when something has been programmed in—she “Gives Privacy” to the humans around her when that’s appropriate, for example—and when she’s figured something out for herself.

But the author maintains some mystery around Klara’s inner emotional life. “Does she actually understand human emotions, or is she just observing human emotions and simulating them within herself?” he said. “I suppose the question comes back to, what are our emotions as human beings? What do they amount to?”

Klara is particularly attuned to human loneliness, since she essentially was made to help prevent it. It is, in her view, peoples’ biggest fear, and something they’ll go to great lengths to avoid, yet can never fully escape. “Perhaps all humans are lonely,” she says.

Warding off loneliness through technology isn’t a futuristic idea, it’s something we’ve been doing for a long time, with the technologies at hand growing more and more sophisticated. Products like AFs already exist. There’s XiaoIce, a chatbot that uses “sentiment analysis” to keep its 660 million users engaged, and Azuma Hikari, a character-based AI designed to “bring comfort” to users whose lives lack emotional connection with other humans.

The mere existence of these tools would be sinister if it wasn’t for their widespread adoption; when millions of people use AIs to fill a void in their lives, it raises deeper questions about our ability to connect with each other and whether technology is building it up or tearing it down.

This isn’t the only big question the novel tackles. An overarching theme is one we’ve been increasingly contemplating as computers start to acquire more complex capabilities, like the beginnings of creativity or emotional awareness: What is it that truly makes us human?

“Do you believe in the human heart?” one character asks. “I don’t mean simply the organ, obviously. I’m speaking in the poetic sense. The human heart. Do you think there is such a thing? Something that makes each of us special and individual?”

The alternative, at least in the story, is that people don’t have a unique essence, but rather we’re all a blend of traits and personalities that can be reduced to strings of code. Our understanding of the brain is still elementary, but at some level, doesn’t all human experience boil down to the firing of billions of neurons between our ears? Will we one day—in a future beyond that painted by Ishiguro, but certainly foreshadowed by it—be able to “decode” our humanity to the point that there’s nothing mysterious left about it? “A human heart is bound to be complex,” Klara says. “But it must be limited.”

Whether or not you agree, Klara and the Sun is worth the read. It’s both a marvelous, engaging story about what it means to love and be human, and a prescient warning to approach technological change with caution and nuance. We’re already living in a world where AI keeps us company, influences our behavior, and is wreaking various forms of havoc. Ishiguro’s novel is a snapshot of one of our possible futures, told through the eyes of a robot who keeps you rooting for her to the end.

Image Credit: Marion Wellmann from Pixabay Continue reading

Posted in Human Robots

#439000 Can AI Stop People From Believing Fake ...

Machine learning algorithms provide a way to detect misinformation based on writing style and how articles are shared.

On topics as varied as climate change and the safety of vaccines, you will find a wave of misinformation all over social media. Trust in conventional news sources may seem lower than ever, but researchers are working on ways to give people more insight on whether they can believe what they read. Researchers have been testing artificial intelligence (AI) tools that could help filter legitimate news. But how trustworthy is AI when it comes to stopping the spread of misinformation?

Researchers at the Rensselaer Polytechnic Institute (RPI) and the University of Tennessee collaborated to study the role of AI in helping people identify whether the news they’re reading is legitimate or not.

The research paper, “Tailoring Heuristics and Timing AI Interventions for Supporting News Veracity Assessments,” was published in Computers in Human Behavior Reports. It discussed how crowdsourcing marketplace Amazon Mechanical Turk (AMT) can be used to identify misinformation for fresh news and specific heuristics, which are rules of thumb used to process information and consider its veracity. In other words, heuristics are essentially “shortcuts for decisions,” explained Dorit Nevo, an associate professor at RPI’s Lally School of Management and a lead author for the paper.

The study found that AI would be successful in flagging false stories only if the reader did not already have an opinion on the topic, Nevo said. When study subjects were set in their beliefs, confirmation bias kept them from reassessing their views.

Nevo said the first part of the project focused on whether subjects could detect misinformation around climate change and vaccines like the one designed to prevent chicken pox. Then, beginning in April 2020, her team studied how people responded to news related to COVID-19.

“With COVID-19, there was a significant difference,” Nevo said. They found that about 72 percent of respondents could identify misinformation about the coronavirus without heuristic clues, and roughly 93 percent were able to be convinced by the researcher’s heuristics that the content was fake.

Examples of heuristic clues include text with too many capital letters or the use of strong language, Nevo said.

There were two types of heuristics mentioned in the team’s paper: objective heuristics and source heuristics. They put a statement at the top of each article the subjects read; it instructed them to read the article and indicate whether they believed its central thesis.

“We either put a statement that says the AI finds this article reliable and accurate based on the objective heuristics, or we said the AI finds the source reliable,” Nevo said. “So that's the source heuristic.”

In her research on heuristics, Nevo found that people’s thinking takes one of two paths: The first path is to read the article, think about it and decide if they believe it; the second is to consider the source and what others think about the news, and decide whether to believe it before reading it.

Image: Dorit Nevo/RPI/IEEE Spectrum

Researchers at RPI researched the role of heuristics and AI in detecting whether people thought news was credible

Another research paper, “Timing Matters When Correcting Fake News,” published in the Proceedings of the National Academy of Science by researchers at Harvard University, differed from the RPI researchers in its findings. While Nevo and her collaborators found that it’s easier to convince people that a story is fake news before reading it, the Harvard researchers, led by Nadia M. Brashier, a psychologist and neuroscientist, discovered that a fact-check can convince people of misinformation even after reading headlines. When study subjects read true or false labels after reading a headline, that resulted in a 25.3 percent reduction in “subsequent misclassification,” when compared to headlines with no tag, Brashier and her team found.

In the end, fighting misinformation will require both computing and human efforts such as policy changes, says Benjamin D. Horne, an assistant professor of Information Sciences at the University of Tennessee and one of Nevo’s co-authors. He says the RPI-Tennessee work was inspired by AI tools he designed previously. Horne was previously a research assistant at RPI, where he developed machine learning (ML) algorithms that can detect partial truths as well as decontextualized truths and out-of-date information.

“Our algorithms are trained on source-level behavior, both when using the textual content of an article and the network of other news sources that it draws news from,” Horne said. “We have found that these two types of features together are quite good at distinguishing between sources labeled as reliable or unreliable by external news source ratings.”

The machine learning algorithms analyze the writing style and the content-sharing behavior of news outlets, Horne said. Researchers trained a supervised ML algorithm called Random Forest, a classification algorithm that uses decision trees.

AI for Detecting Fake News

So, what’s the potential for AI to be successful in detecting misinformation?

“The tools we have developed, and other tools developed in this area, have fairly high accuracy in lab settings,” says Horne. “For example, our most recent technical work showed around 83% accuracy in predicting when the source of a news article is reliable or unreliable.”

Despite the effectiveness of algorithms, old-fashioned fact-checking by journalists will still be required to combat fake news. AI could filter the information for fact-checkers to verify, according to Horne.

“AI tools are great at dealing with high quantities of information at fast speeds but lack the nuanced analysis that a journalist or fact-checker can provide,” Horne said. “I see a future where the two work together.” Continue reading

Posted in Human Robots

#438801 This AI Thrashes the Hardest Atari Games ...

Learning from rewards seems like the simplest thing. I make coffee, I sip coffee, I’m happy. My brain registers “brewing coffee” as an action that leads to a reward.

That’s the guiding insight behind deep reinforcement learning, a family of algorithms that famously smashed most of Atari’s gaming catalog and triumphed over humans in strategy games like Go. Here, an AI “agent” explores the game, trying out different actions and registering ones that let it win.

Except it’s not that simple. “Brewing coffee” isn’t one action; it’s a series of actions spanning several minutes, where you’re only rewarded at the very end. By just tasting the final product, how do you learn to fine-tune grind coarseness, water to coffee ratio, brewing temperature, and a gazillion other factors that result in the reward—tasty, perk-me-up coffee?

That’s the problem with “sparse rewards,” which are ironically very abundant in our messy, complex world. We don’t immediately get feedback from our actions—no video-game-style dings or points for just grinding coffee beans—yet somehow we’re able to learn and perform an entire sequence of arm and hand movements while half-asleep.

This week, researchers from UberAI and OpenAI teamed up to bestow this talent on AI.

The trick is to encourage AI agents to “return” to a previous step, one that’s promising for a winning solution. The agent then keeps a record of that state, reloads it, and branches out again to intentionally explore other solutions that may have been left behind on the first go-around. Video gamers are likely familiar with this idea: live, die, reload a saved point, try something else, repeat for a perfect run-through.

The new family of algorithms, appropriately dubbed “Go-Explore,” smashed notoriously difficult Atari games like Montezuma’s Revenge that were previously unsolvable by its AI predecessors, while trouncing human performance along the way.

It’s not just games and digital fun. In a computer simulation of a robotic arm, the team found that installing Go-Explore as its “brain” allowed it to solve a challenging series of actions when given very sparse rewards. Because the overarching idea is so simple, the authors say, it can be adapted and expanded to other real-world problems, such as drug design or language learning.

Growing Pains
How do you reward an algorithm?

Rewards are very hard to craft, the authors say. Take the problem of asking a robot to go to a fridge. A sparse reward will only give the robot “happy points” if it reaches its destination, which is similar to asking a baby, with no concept of space and danger, to crawl through a potential minefield of toys and other obstacles towards a fridge.

“In practice, reinforcement learning works very well, if you have very rich feedback, if you can tell, ‘hey, this move is good, that move is bad, this move is good, that move is bad,’” said study author Joost Huinzinga. However, in situations that offer very little feedback, “rewards can intentionally lead to a dead end. Randomly exploring the space just doesn’t cut it.”

The other extreme is providing denser rewards. In the same robot-to-fridge example, you could frequently reward the bot as it goes along its journey, essentially helping “map out” the exact recipe to success. But that’s troubling as well. Over-holding an AI’s hand could result in an extremely rigid robot that ignores new additions to its path—a pet, for example—leading to dangerous situations. It’s a deceptive AI solution that seems effective in a simple environment, but crashes in the real world.

What we need are AI agents that can tackle both problems, the team said.

Intelligent Exploration
The key is to return to the past.

For AI, motivation usually comes from “exploring new or unusual situations,” said Huizinga. It’s efficient, but comes with significant downsides. For one, the AI agent could prematurely stop going back to promising areas because it thinks it had already found a good solution. For another, it could simply forget a previous decision point because of the mechanics of how it probes the next step in a problem.

For a complex task, the end result is an AI that randomly stumbles around towards a solution while ignoring potentially better ones.

“Detaching from a place that was previously visited after collecting a reward doesn’t work in difficult games, because you might leave out important clues,” Huinzinga explained.

Go-Explore solves these problems with a simple principle: first return, then explore. In essence, the algorithm saves different approaches it previously tried and loads promising save points—once more likely to lead to victory—to explore further.

Digging a bit deeper, the AI stores screen caps from a game. It then analyzes saved points and groups images that look alike as a potential promising “save point” to return to. Rinse and repeat. The AI tries to maximize its final score in the game, and updates its save points when it achieves a new record score. Because Atari doesn’t usually allow people to revisit any random point, the team used an emulator, which is a kind of software that mimics the Atari system but with custom abilities such as saving and reloading at any time.

The trick worked like magic. When pitted against 55 Atari games in the OpenAI gym, now commonly used to benchmark reinforcement learning algorithms, Go-Explore knocked out state-of-the-art AI competitors over 85 percent of the time.

It also crushed games previously unbeatable by AI. Montezuma’s Revenge, for example, requires you to move Pedro, the blocky protagonist, through a labyrinth of underground temples while evading obstacles such as traps and enemies and gathering jewels. One bad jump could derail the path to the next level. It’s a perfect example of sparse rewards: you need a series of good actions to get to the reward—advancing onward.

Go-Explore didn’t just beat all levels of the game, a first for AI. It also scored higher than any previous record for reinforcement learning algorithms at lower levels while toppling the human world record.

Outside a gaming environment, Go-Explore was also able to boost the performance of a simulated robot arm. While it’s easy for humans to follow high-level guidance like “put the cup on this shelf in a cupboard,” robots often need explicit training—from grasping the cup to recognizing a cupboard, moving towards it while avoiding obstacles, and learning motions to not smash the cup when putting it down.

Here, similar to the real world, the digital robot arm was only rewarded when it placed the cup onto the correct shelf, out of four possible shelves. When pitted against another algorithm, Go-Explore quickly figured out the movements needed to place the cup, while its competitor struggled with even reliably picking the cup up.

Combining Forces
By itself, the “first return, then explore” idea behind Go-Explore is already powerful. The team thinks it can do even better.

One idea is to change the mechanics of save points. Rather than reloading saved states through the emulator, it’s possible to train a neural network to do the same, without needing to relaunch a saved state. It’s a potential way to make the AI even smarter, the team said, because it can “learn” to overcome one obstacle once, instead of solving the same problem again and again. The downside? It’s much more computationally intensive.

Another idea is to combine Go-Explore with an alternative form of learning, called “imitation learning.” Here, an AI observes human behavior and mimics it through a series of actions. Combined with Go-Explore, said study author Adrien Ecoffet, this could make more robust robots capable of handling all the complexity and messiness in the real world.

To the team, the implications go far beyond Go-Explore. The concept of “first return, then explore” seems to be especially powerful, suggesting “it may be a fundamental feature of learning in general.” The team said, “Harnessing these insights…may be essential…to create generally intelligent agents.”

Image Credit: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, and Jeff Clune Continue reading

Posted in Human Robots

#437940 How Boston Dynamics Taught Its Robots to ...

A week ago, Boston Dynamics posted a video of Atlas, Spot, and Handle dancing to “Do You Love Me.” It was, according to the video description, a way “to celebrate the start of what we hope will be a happier year.” As of today the video has been viewed nearly 24 million times, and the popularity is no surprise, considering the compelling mix of technical prowess and creativity on display.

Strictly speaking, the stuff going on in the video isn’t groundbreaking, in the sense that we’re not seeing any of the robots demonstrate fundamentally new capabilities, but that shouldn’t take away from how impressive it is—you’re seeing state-of-the-art in humanoid robotics, quadrupedal robotics, and whatever-the-heck-Handle-is robotics.

What is unique about this video from Boston Dynamics is the artistic component. We know that Atlas can do some practical tasks, and we know it can do some gymnastics and some parkour, but dancing is certainly something new. To learn more about what it took to make these dancing robots happen (and it’s much more complicated than it might seem), we spoke with Aaron Saunders, Boston Dynamics’ VP of Engineering.

Saunders started at Boston Dynamics in 2003, meaning that he’s been a fundamental part of a huge number of Boston Dynamics’ robots, even the ones you may have forgotten about. Remember LittleDog, for example? A team of two designed and built that adorable little quadruped, and Saunders was one of them.

While he’s been part of the Atlas project since the beginning (and had a hand in just about everything else that Boston Dynamics works on), Saunders has spent the last few years leading the Atlas team specifically, and he was kind enough to answer our questions about their dancing robots.

IEEE Spectrum: What’s your sense of how the Internet has been reacting to the video?

Aaron Saunders: We have different expectations for the videos that we make; this one was definitely anchored in fun for us. The response on YouTube was record-setting for us: We received hundreds of emails and calls with people expressing their enthusiasm, and also sharing their ideas for what we should do next, what about this song, what about this dance move, so that was really fun. My favorite reaction was one that I got from my 94-year-old grandma, who watched the video on YouTube and then sent a message through the family asking if I’d taught the robot those sweet moves. I think this video connected with a broader audience, because it mixed the old-school music with new technology.

We haven’t seen Atlas move like this before—can you talk about how you made it happen?

We started by working with dancers and a choreographer to create an initial concept for the dance by composing and assembling a routine. One of the challenges, and probably the core challenge for Atlas in particular, was adjusting human dance moves so that they could be performed on the robot. To do that, we used simulation to rapidly iterate through movement concepts while soliciting feedback from the choreographer to reach behaviors that Atlas had the strength and speed to execute. It was very iterative—they would literally dance out what they wanted us to do, and the engineers would look at the screen and go “that would be easy” or “that would be hard” or “that scares me.” And then we’d have a discussion, try different things in simulation, and make adjustments to find a compatible set of moves that we could execute on Atlas.

Throughout the project, the time frame for creating those new dance moves got shorter and shorter as we built tools, and as an example, eventually we were able to use that toolchain to create one of Atlas’ ballet moves in just one day, the day before we filmed, and it worked. So it’s not hand-scripted or hand-coded, it’s about having a pipeline that lets you take a diverse set of motions, that you can describe through a variety of different inputs, and push them through and onto the robot.

Image: Boston Dynamics

Were there some things that were particularly difficult to translate from human dancers to Atlas? Or, things that Atlas could do better than humans?

Some of the spinning turns in the ballet parts took more iterations to get to work, because they were the furthest from leaping and running and some of the other things that we have more experience with, so they challenged both the machine and the software in new ways. We definitely learned not to underestimate how flexible and strong dancers are—when you take elite athletes and you try to do what they do but with a robot, it’s a hard problem. It’s humbling. Fundamentally, I don’t think that Atlas has the range of motion or power that these athletes do, although we continue developing our robots towards that, because we believe that in order to broadly deploy these kinds of robots commercially, and eventually in a home, we think they need to have this level of performance.

One thing that robots are really good at is doing something over and over again the exact same way. So once we dialed in what we wanted to do, the robots could just do it again and again as we played with different camera angles.

I can understand how you could use human dancers to help you put together a routine with Atlas, but how did that work with Spot, and particularly with Handle?

I think the people we worked with actually had a lot of talent for thinking about motion, and thinking about how to express themselves through motion. And our robots do motion really well—they’re dynamic, they’re exciting, they balance. So I think what we found was that the dancers connected with the way the robots moved, and then shaped that into a story, and it didn’t matter whether there were two legs or four legs. When you don’t necessarily have a template of animal motion or human behavior, you just have to think a little harder about how to go about doing something, and that’s true for more pragmatic commercial behaviors as well.

“We used simulation to rapidly iterate through movement concepts while soliciting feedback from the choreographer to reach behaviors that Atlas had the strength and speed to execute. It was very iterative—they would literally dance out what they wanted us to do, and the engineers would look at the screen and go ‘that would be easy’ or ‘that would be hard’ or ‘that scares me.’”
—Aaron Saunders, Boston Dynamics

How does the experience that you get teaching robots to dance, or to do gymnastics or parkour, inform your approach to robotics for commercial applications?

We think that the skills inherent in dance and parkour, like agility, balance, and perception, are fundamental to a wide variety of robot applications. Maybe more importantly, finding that intersection between building a new robot capability and having fun has been Boston Dynamics’ recipe for robotics—it’s a great way to advance.

One good example is how when you push limits by asking your robots to do these dynamic motions over a period of several days, you learn a lot about the robustness of your hardware. Spot, through its productization, has become incredibly robust, and required almost no maintenance—it could just dance all day long once you taught it to. And the reason it’s so robust today is because of all those lessons we learned from previous things that may have just seemed weird and fun. You’ve got to go into uncharted territory to even know what you don’t know.

Image: Boston Dynamics

It’s often hard to tell from watching videos like these how much time it took to make things work the way you wanted them to, and how representative they are of the actual capabilities of the robots. Can you talk about that?

Let me try to answer in the context of this video, but I think the same is true for all of the videos that we post. We work hard to make something, and once it works, it works. For Atlas, most of the robot control existed from our previous work, like the work that we’ve done on parkour, which sent us down a path of using model predictive controllers that account for dynamics and balance. We used those to run on the robot a set of dance steps that we’d designed offline with the dancers and choreographer. So, a lot of time, months, we spent thinking about the dance and composing the motions and iterating in simulation.

Dancing required a lot of strength and speed, so we even upgraded some of Atlas’ hardware to give it more power. Dance might be the highest power thing we’ve done to date—even though you might think parkour looks way more explosive, the amount of motion and speed that you have in dance is incredible. That also took a lot of time over the course of months; creating the capability in the machine to go along with the capability in the algorithms.

Once we had the final sequence that you see in the video, we only filmed for two days. Much of that time was spent figuring out how to move the camera through a scene with a bunch of robots in it to capture one continuous two-minute shot, and while we ran and filmed the dance routine multiple times, we could repeat it quite reliably. There was no cutting or splicing in that opening two-minute shot.

There were definitely some failures in the hardware that required maintenance, and our robots stumbled and fell down sometimes. These behaviors are not meant to be productized and to be a 100 percent reliable, but they’re definitely repeatable. We try to be honest with showing things that we can do, not a snippet of something that we did once. I think there’s an honesty required in saying that you’ve achieved something, and that’s definitely important for us.

You mentioned that Spot is now robust enough to dance all day. How about Atlas? If you kept on replacing its batteries, could it dance all day, too?

Atlas, as a machine, is still, you know… there are only a handful of them in the world, they’re complicated, and reliability was not a main focus. We would definitely break the robot from time to time. But the robustness of the hardware, in the context of what we were trying to do, was really great. And without that robustness, we wouldn’t have been able to make the video at all. I think Atlas is a little more like a helicopter, where there’s a higher ratio between the time you spend doing maintenance and the time you spend operating. Whereas with Spot, the expectation is that it’s more like a car, where you can run it for a long time before you have to touch it.

When you’re teaching Atlas to do new things, is it using any kind of machine learning? And if not, why not?

As a company, we’ve explored a lot of things, but Atlas is not using a learning controller right now. I expect that a day will come when we will. Atlas’ current dance performance uses a mixture of what we like to call reflexive control, which is a combination of reacting to forces, online and offline trajectory optimization, and model predictive control. We leverage these techniques because they’re a reliable way of unlocking really high performance stuff, and we understand how to wield these tools really well. We haven’t found the end of the road in terms of what we can do with them.

We plan on using learning to extend and build on the foundation of software and hardware that we’ve developed, but I think that we, along with the community, are still trying to figure out where the right places to apply these tools are. I think you’ll see that as part of our natural progression.

Image: Boston Dynamics

Much of Atlas’ dynamic motion comes from its lower body at the moment, but parkour makes use of upper body strength and agility as well, and we’ve seen some recent concept images showing Atlas doing vaults and pullups. Can you tell us more?

Humans and animals do amazing things using their legs, but they do even more amazing things when they use their whole bodies. I think parkour provides a fantastic framework that allows us to progress towards whole body mobility. Walking and running was just the start of that journey. We’re progressing through more complex dynamic behaviors like jumping and spinning, that’s what we’ve been working on for the last couple of years. And the next step is to explore how using arms to push and pull on the world could extend that agility.

One of the missions that I’ve given to the Atlas team is to start working on leveraging the arms as much as we leverage the legs to enhance and extend our mobility, and I’m really excited about what we’re going to be working on over the next couple of years, because it’s going to open up a lot more opportunities for us to do exciting stuff with Atlas.

What’s your perspective on hydraulic versus electric actuators for highly dynamic robots?

Across my career at Boston Dynamics, I’ve felt passionately connected to so many different types of technology, but I’ve settled into a place where I really don’t think this is an either-or conversation anymore. I think the selection of actuator technology really depends on the size of the robot that you’re building, what you want that robot to do, where you want it to go, and many other factors. Ultimately, it’s good to have both kinds of actuators in your toolbox, and I love having access to both—and we’ve used both with great success to make really impressive dynamic machines.

I think the only delineation between hydraulic and electric actuators that appears to be distinct for me is probably in scale. It’s really challenging to make tiny hydraulic things because the industry just doesn’t do a lot of that, and the reciprocal is that the industry also doesn’t tend to make massive electrical things. So, you may find that to be a natural division between these two technologies.

Besides what you’re working on at Boston Dynamics, what recent robotics research are you most excited about?

For us as a company, we really love to follow advances in sensing, computer vision, terrain perception, these are all things where the better they get, the more we can do. For me personally, one of the things I like to follow is manipulation research, and in particular manipulation research that advances our understanding of complex, friction-based interactions like sliding and pushing, or moving compliant things like ropes.

We’re seeing a shift from just pinching things, lifting them, moving them, and dropping them, to much more meaningful interactions with the environment. Research in that type of manipulation I think is going to unlock the potential for mobile manipulators, and I think it’s really going to open up the ability for robots to interact with the world in a rich way.

Is there anything else you’d like people to take away from this video?

For me personally, and I think it’s because I spend so much of my time immersed in robotics and have a deep appreciation for what a robot is and what its capabilities and limitations are, one of my strong desires is for more people to spend more time with robots. We see a lot of opinions and ideas from people looking at our videos on YouTube, and it seems to me that if more people had opportunities to think about and learn about and spend time with robots, that new level of understanding could help them imagine new ways in which robots could be useful in our daily lives. I think the possibilities are really exciting, and I just want more people to be able to take that journey. Continue reading

Posted in Human Robots

#437579 Disney Research Makes Robotic Gaze ...

While it’s not totally clear to what extent human-like robots are better than conventional robots for most applications, one area I’m personally comfortable with them is entertainment. The folks over at Disney Research, who are all about entertainment, have been working on this sort of thing for a very long time, and some of their animatronic attractions are actually quite impressive.

The next step for Disney is to make its animatronic figures, which currently feature scripted behaviors, to perform in an interactive manner with visitors. The challenge is that this is where you start to get into potential Uncanny Valley territory, which is what happens when you try to create “the illusion of life,” which is what Disney (they explicitly say) is trying to do.

In a paper presented at IROS this month, a team from Disney Research, Caltech, University of Illinois at Urbana-Champaign, and Walt Disney Imagineering is trying to nail that illusion of life with a single, and perhaps most important, social cue: eye gaze.

Before you watch this video, keep in mind that you’re watching a specific character, as Disney describes:

The robot character plays an elderly man reading a book, perhaps in a library or on a park bench. He has difficulty hearing and his eyesight is in decline. Even so, he is constantly distracted from reading by people passing by or coming up to greet him. Most times, he glances at people moving quickly in the distance, but as people encroach into his personal space, he will stare with disapproval for the interruption, or provide those that are familiar to him with friendly acknowledgment.

What, exactly, does “lifelike” mean in the context of robotic gaze? The paper abstract describes the goal as “[seeking] to create an interaction which demonstrates the illusion of life.” I suppose you could think of it like a sort of old-fashioned Turing test focused on gaze: If the gaze of this robot cannot be distinguished from the gaze of a human, then victory, that’s lifelike. And critically, we’re talking about mutual gaze here—not just a robot gazing off into the distance, but you looking deep into the eyes of this robot and it looking right back at you just like a human would. Or, just like some humans would.

The approach that Disney is using is more animation-y than biology-y or psychology-y. In other words, they’re not trying to figure out what’s going on in our brains to make our eyes move the way that they do when we’re looking at other people and basing their control system on that, but instead, Disney just wants it to look right. This “visual appeal” approach is totally fine, and there’s been an enormous amount of human-robot interaction (HRI) research behind it already, albeit usually with less explicitly human-like platforms. And speaking of human-like platforms, the hardware is a “custom Walt Disney Imagineering Audio-Animatronics bust,” which has DoFs that include neck, eyes, eyelids, and eyebrows.

In order to decide on gaze motions, the system first identifies a person to target with its attention using an RGB-D camera. If more than one person is visible, the system calculates a curiosity score for each, currently simplified to be based on how much motion it sees. Depending on which person that the robot can see has the highest curiosity score, the system will choose from a variety of high level gaze behavior states, including:

Read: The Read state can be considered the “default” state of the character. When not executing another state, the robot character will return to the Read state. Here, the character will appear to read a book located at torso level.

Glance: A transition to the Glance state from the Read or Engage states occurs when the attention engine indicates that there is a stimuli with a curiosity score […] above a certain threshold.

Engage: The Engage state occurs when the attention engine indicates that there is a stimuli […] to meet a threshold and can be triggered from both Read and Glance states. This state causes the robot to gaze at the person-of-interest with both the eyes and head.

Acknowledge: The Acknowledge state is triggered from either Engage or Glance states when the person-of-interest is deemed to be familiar to the robot.

Running underneath these higher level behavior states are lower level motion behaviors like breathing, small head movements, eye blinking, and saccades (the quick eye movements that occur when people, or robots, look between two different focal points). The term for this hierarchical behavioral state layering is a subsumption architecture, which goes all the way back to Rodney Brooks’ work on robots like Genghis in the 1980s and Cog and Kismet in the ’90s, and it provides a way for more complex behaviors to emerge from a set of simple, decentralized low-level behaviors.

“25 years on Disney is using my subsumption architecture for humanoid eye control, better and smoother now than our 1995 implementations on Cog and Kismet.”
—Rodney Brooks, MIT emeritus professor

Brooks, an emeritus professor at MIT and, most recently, cofounder and CTO of Robust.ai, tweeted about the Disney project, saying: “People underestimate how long it takes to get from academic paper to real world robotics. 25 years on Disney is using my subsumption architecture for humanoid eye control, better and smoother now than our 1995 implementations on Cog and Kismet.”

From the paper:

Although originally intended for control of mobile robots, we find that the subsumption architecture, as presented in [17], lends itself as a framework for organizing animatronic behaviors. This is due to the analogous use of subsumption in human behavior: human psychomotor behavior can be intuitively modeled as layered behaviors with incoming sensory inputs, where higher behavioral levels are able to subsume lower behaviors. At the lowest level, we have involuntary movements such as heartbeats, breathing and blinking. However, higher behavioral responses can take over and control lower level behaviors, e.g., fight-or-flight response can induce faster heart rate and breathing. As our robot character is modeled after human morphology, mimicking biological behaviors through the use of a bottom-up approach is straightforward.

The result, as the video shows, appears to be quite good, although it’s hard to tell how it would all come together if the robot had more of, you know, a face. But it seems like you don’t necessarily need to have a lifelike humanoid robot to take advantage of this architecture in an HRI context—any robot that wants to make a gaze-based connection with a human could benefit from doing it in a more human-like way.

“Realistic and Interactive Robot Gaze,” by Matthew K.X.J. Pan, Sungjoon Choi, James Kennedy, Kyna McIntosh, Daniel Campos Zamora, Gunter Niemeyer, Joohyung Kim, Alexis Wieland, and David Christensen from Disney Research, California Institute of Technology, University of Illinois at Urbana-Champaign, and Walt Disney Imagineering, was presented at IROS 2020. You can find the full paper, along with a 13-minute video presentation, on the IROS on-demand conference website.

< Back to IEEE Journal Watch Continue reading

Posted in Human Robots