Tag Archives: california

#439105 This Robot Taught Itself to Walk in a ...

Recently, in a Berkeley lab, a robot called Cassie taught itself to walk, a little like a toddler might. Through trial and error, it learned to move in a simulated world. Then its handlers sent it strolling through a minefield of real-world tests to see how it’d fare.

And, as it turns out, it fared pretty damn well. With no further fine-tuning, the robot—which is basically just a pair of legs—was able to walk in all directions, squat down while walking, right itself when pushed off balance, and adjust to different kinds of surfaces.

It’s the first time a machine learning approach known as reinforcement learning has been so successfully applied in two-legged robots.

This likely isn’t the first robot video you’ve seen, nor the most polished.

For years, the internet has been enthralled by videos of robots doing far more than walking and regaining their balance. All that is table stakes these days. Boston Dynamics, the heavyweight champ of robot videos, regularly releases mind-blowing footage of robots doing parkour, back flips, and complex dance routines. At times, it can seem the world of iRobot is just around the corner.

This sense of awe is well-earned. Boston Dynamics is one of the world’s top makers of advanced robots.

But they still have to meticulously hand program and choreograph the movements of the robots in their videos. This is a powerful approach, and the Boston Dynamics team has done incredible things with it.

In real-world situations, however, robots need to be robust and resilient. They need to regularly deal with the unexpected, and no amount of choreography will do. Which is how, it’s hoped, machine learning can help.

Reinforcement learning has been most famously exploited by Alphabet’s DeepMind to train algorithms that thrash humans at some the most difficult games. Simplistically, it’s modeled on the way we learn. Touch the stove, get burned, don’t touch the damn thing again; say please, get a jelly bean, politely ask for another.

In Cassie’s case, the Berkeley team used reinforcement learning to train an algorithm to walk in a simulation. It’s not the first AI to learn to walk in this manner. But going from simulation to the real world doesn’t always translate.

Subtle differences between the two can (literally) trip up a fledgling robot as it tries out its sim skills for the first time.

To overcome this challenge, the researchers used two simulations instead of one. The first simulation, an open source training environment called MuJoCo, was where the algorithm drew upon a large library of possible movements and, through trial and error, learned to apply them. The second simulation, called Matlab SimMechanics, served as a low-stakes testing ground that more precisely matched real-world conditions.

Once the algorithm was good enough, it graduated to Cassie.

And amazingly, it didn’t need further polishing. Said another way, when it was born into the physical world—it knew how to walk just fine. In addition, it was also quite robust. The researchers write that two motors in Cassie’s knee malfunctioned during the experiment, but the robot was able to adjust and keep on trucking.

Other labs have been hard at work applying machine learning to robotics.

Last year Google used reinforcement learning to train a (simpler) four-legged robot. And OpenAI has used it with robotic arms. Boston Dynamics, too, will likely explore ways to augment their robots with machine learning. New approaches—like this one aimed at training multi-skilled robots or this one offering continuous learning beyond training—may also move the dial. It’s early yet, however, and there’s no telling when machine learning will exceed more traditional methods.

And in the meantime, Boston Dynamics bots are testing the commercial waters.

Still, robotics researchers, who were not part of the Berkeley team, think the approach is promising. Edward Johns, head of Imperial College London’s Robot Learning Lab, told MIT Technology Review, “This is one of the most successful examples I have seen.”

The Berkeley team hopes to build on that success by trying out “more dynamic and agile behaviors.” So, might a self-taught parkour-Cassie be headed our way? We’ll see.

Image Credit: University of California Berkeley Hybrid Robotics via YouTube Continue reading

Posted in Human Robots

#438785 Video Friday: A Blimp For Your Cat

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):

HRI 2021 – March 8-11, 2021 – [Online Conference]
RoboSoft 2021 – April 12-16, 2021 – [Online Conference]
ICRA 2021 – May 30-5, 2021 – Xi'an, China
Let us know if you have suggestions for next week, and enjoy today's videos.

Shiny robotic cat toy blimp!

I am pretty sure this is Google Translate getting things wrong, but the About page mentions that the blimp will “take you to your destination after appearing in the death of God.”

[ NTT DoCoMo ] via [ RobotStart ]

If you have yet to see this real-time video of Perseverance landing on Mars, drop everything and watch it.

During the press conference, someone commented that this is the first time anyone on the team who designed and built this system has ever seen it in operation, since it could only be tested at the component scale on Earth. This landing system has blown my mind since Curiosity.

Here's a better look at where Percy ended up:

[ NASA ]

The fact that Digit can just walk up and down wet, slippery, muddy hills without breaking a sweat is (still) astonishing.

[ Agility Robotics ]

SkyMul wants drones to take over the task of tying rebar, which looks like just the sort of thing we'd rather robots be doing so that we don't have to:

The tech certainly looks promising, and SkyMul says that they're looking for some additional support to bring things to the pilot stage.

[ SkyMul ]

Thanks Eohan!

Flatcat is a pet-like, playful robot that reacts to touch. Flatcat feels everything exactly: Cuddle with it, romp around with it, or just watch it do weird things of its own accord. We are sure that flatcat will amaze you, like us, and caress your soul.

I don't totally understand it, but I want it anyway.

[ Flatcat ]

Thanks Oswald!

This is how I would have a romantic dinner date if I couldn't get together in person. Herman the UR3 and an OptiTrack system let me remotely make a romantic meal!

[ Dave's Armoury ]

Here, we propose a novel design of deformable propellers inspired by dragonfly wings. The structure of these propellers includes a flexible segment similar to the nodus on a dragonfly wing. This flexible segment can bend, twist and even fold upon collision, absorbing force upon impact and protecting the propeller from damage.

[ Paper ]

Thanks Van!

In the 1970s, The CIA​ created the world's first miniaturized unmanned aerial vehicle, or UAV, which was intended to be a clandestine listening device. The Insectothopter was never deployed operationally, but was still revolutionary for its time.

It may never have been deployed (not that they'll admit to, anyway), but it was definitely operational and could fly controllably.

[ CIA ]

Research labs are starting to get Digits, which means we're going to get a much better idea of what its limitations are.

[ Ohio State ]

This video shows the latest achievements for LOLA walking on undetected uneven terrain. The robot is technically blind, not using any camera-based or prior information on the terrain.

[ TUM ]

We define “robotic contact juggling” to be the purposeful control of the motion of a three-dimensional smooth object as it rolls freely on a motion-controlled robot manipulator, or “hand.” While specific examples of robotic contact juggling have been studied before, in this paper we provide the first general formulation and solution method for the case of an arbitrary smooth object in single-point rolling contact on an arbitrary smooth hand.

[ Paper ]

Thanks Fan!

A couple of new cobots from ABB, designed to work safely around humans.

[ ABB ]

Thanks Fan!

It's worth watching at least a little bit of Adam Savage testing Spot's new arm, because we get to see Spot try, fail, and eventually succeed at an autonomous door-opening behavior at the 10 minute mark.

[ Tested ]

SVR discusses diversity with guest speakers Dr. Michelle Johnson from the GRASP Lab at UPenn; Dr Ariel Anders from Women in Robotics and first technical hire at Robust.ai; Alka Roy from The Responsible Innovation Project; and Kenechukwu C. Mbanesi and Kenya Andrews from Black in Robotics. The discussion here is moderated by Dr. Ken Goldberg—artist, roboticist and Director of the CITRIS People and Robots Lab—and Andra Keay from Silicon Valley Robotics.

[ SVR ]

RAS presents a Soft Robotics Debate on Bioinspired vs. Biohybrid Design.

In this debate, we will bring together experts in Bioinspiration and Biohybrid design to discuss the necessary steps to make more competent soft robots. We will try to answer whether bioinspired research should focus more on developing new bioinspired material and structures or on the integration of living and artificial structures in biohybrid designs.

[ RAS SoRo ]

IFRR presents a Colloquium on Human Robot Interaction.

Across many application domains, robots are expected to work in human environments, side by side with people. The users will vary substantially in background, training, physical and cognitive abilities, and readiness to adopt technology. Robotic products are expected to not only be intuitive, easy to use, and responsive to the needs and states of their users, but they must also be designed with these differences in mind, making human-robot interaction (HRI) a key area of research.

[ IFRR ]

Vijay Kumar, Nemirovsky Family Dean and Professor at Penn Engineering, gives an introduction to ENIAC day and David Patterson, Pardee Professor of Computer Science, Emeritus at the University of California at Berkeley, speaks about the legacy of the ENIAC and its impact on computer architecture today. This video is comprised of lectures one and two of nine total lectures in the ENIAC Day series.

There are more interesting ENIAC videos at the link below, but we'll highlight this particular one, about the women of the ENIAC, also known as the First Programmers.

[ ENIAC Day ] Continue reading

Posted in Human Robots

#437978 How Mirroring the Architecture of the ...

While AI can carry out some impressive feats when trained on millions of data points, the human brain can often learn from a tiny number of examples. New research shows that borrowing architectural principles from the brain can help AI get closer to our visual prowess.

The prevailing wisdom in deep learning research is that the more data you throw at an algorithm, the better it will learn. And in the era of Big Data, that’s easier than ever, particularly for the large data-centric tech companies carrying out a lot of the cutting-edge AI research.

Today’s largest deep learning models, like OpenAI’s GPT-3 and Google’s BERT, are trained on billions of data points, and even more modest models require large amounts of data. Collecting these datasets and investing the computational resources to crunch through them is a major bottleneck, particularly for less well-resourced academic labs.

It also means today’s AI is far less flexible than natural intelligence. While a human only needs to see a handful of examples of an animal, a tool, or some other category of object to be able pick it out again, most AI need to be trained on many examples of an object in order to be able to recognize it.

There is an active sub-discipline of AI research aimed at what is known as “one-shot” or “few-shot” learning, where algorithms are designed to be able to learn from very few examples. But these approaches are still largely experimental, and they can’t come close to matching the fastest learner we know—the human brain.

This prompted a pair of neuroscientists to see if they could design an AI that could learn from few data points by borrowing principles from how we think the brain solves this problem. In a paper in Frontiers in Computational Neuroscience, they explained that the approach significantly boosts AI’s ability to learn new visual concepts from few examples.

“Our model provides a biologically plausible way for artificial neural networks to learn new visual concepts from a small number of examples,” Maximilian Riesenhuber, from Georgetown University Medical Center, said in a press release. “We can get computers to learn much better from few examples by leveraging prior learning in a way that we think mirrors what the brain is doing.”

Several decades of neuroscience research suggest that the brain’s ability to learn so quickly depends on its ability to use prior knowledge to understand new concepts based on little data. When it comes to visual understanding, this can rely on similarities of shape, structure, or color, but the brain can also leverage abstract visual concepts thought to be encoded in a brain region called the anterior temporal lobe (ATL).

“It is like saying that a platypus looks a bit like a duck, a beaver, and a sea otter,” said paper co-author Joshua Rule, from the University of California Berkeley.

The researchers decided to try and recreate this capability by using similar high-level concepts learned by an AI to help it quickly learn previously unseen categories of images.

Deep learning algorithms work by getting layers of artificial neurons to learn increasingly complex features of an image or other data type, which are then used to categorize new data. For instance, early layers will look for simple features like edges, while later ones might look for more complex ones like noses, faces, or even more high-level characteristics.

First they trained the AI on 2.5 million images across 2,000 different categories from the popular ImageNet dataset. They then extracted features from various layers of the network, including the very last layer before the output layer. They refer to these as “conceptual features” because they are the highest-level features learned, and most similar to the abstract concepts that might be encoded in the ATL.

They then used these different sets of features to train the AI to learn new concepts based on 2, 4, 8, 16, 32, 64, and 128 examples. They found that the AI that used the conceptual features yielded much better performance than ones trained using lower-level features on lower numbers of examples, but the gap shrunk as they were fed more training examples.

While the researchers admit the challenge they set their AI was relatively simple and only covers one aspect of the complex process of visual reasoning, they said that using a biologically plausible approach to solving the few-shot problem opens up promising new avenues in both neuroscience and AI.

“Our findings not only suggest techniques that could help computers learn more quickly and efficiently, they can also lead to improved neuroscience experiments aimed at understanding how people learn so quickly, which is not yet well understood,” Riesenhuber said.

As the researchers note, the human visual system is still the gold standard when it comes to understanding the world around us. Borrowing from its design principles might turn out to be a profitable direction for future research.

Image Credit: Gerd Altmann from Pixabay Continue reading

Posted in Human Robots

#437864 Video Friday: Jet-Powered Flying ...

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here’s what we have so far (send us your events!):

ICRA 2020 – June 1-15, 2020 – [Virtual Conference]
RSS 2020 – July 12-16, 2020 – [Virtual Conference]
CLAWAR 2020 – August 24-26, 2020 – [Virtual Conference]
ICUAS 2020 – September 1-4, 2020 – Athens, Greece
ICRES 2020 – September 28-29, 2020 – Taipei, Taiwan
ICSR 2020 – November 14-16, 2020 – Golden, Colorado
Let us know if you have suggestions for next week, and enjoy today’s videos.

ICRA 2020, the world’s best, biggest, longest virtual robotics conference ever, kicked off last Sunday with an all-star panel on a critical topic: “COVID-19: How Can Roboticists Help?”

Watch other ICRA keynotes on IEEE.tv.

We’re getting closer! Well, kinda. iRonCub, the jet-powered flying humanoid, is still a simulation for now, but not only are the simulations getting better—the researchers have begun testing real jet engines!

This video shows the latest results on Aerial Humanoid Robotics obtained by the Dynamic Interaction Control Lab at the Italian Institute of Technology. The video simulates robot and jet dynamics, where the latter uses the results obtained in the paper “Modeling, Identification and Control of Model Jet Engines for Jet Powered Robotics” published in IEEE Robotics and Automation Letters.

This video presents the paper entitled “Modeling, Identification and Control of Model Jet Engines for Jet Powered Robotics” published in IEEE Robotics and Automation Letters (Volume: 5 , Issue: 2 , April 2020 ) Page(s): 2070 – 2077. Preprint at https://arxiv.org/pdf/1909.13296.pdf.​

[ IIT ]

In a new pair of papers, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) came up with new tools to let robots better perceive what they’re interacting with: the ability to see and classify items, and a softer, delicate touch.

[ MIT CSAIL ]

UBTECH’s anti-epidemic solutions greatly relieve the workload of front-line medical staff and cut the consumption of personal protective equipment (PPE).

[ UBTECH ]

We demonstrate a method to assess the concrete deterioration in sewers by performing a tactile inspection motion with a sensorized foot of a legged robot.

[ THING ] via [ ANYmal Research ]

Get a closer look at the Virtual competition of the Urban Circuit and how teams can use the simulated environments to better prepare for the physical courses of the Subterranean Challenge.

[ SubT ]

Roboticists at the University of California San Diego have developed flexible feet that can help robots walk up to 40 percent faster on uneven terrain, such as pebbles and wood chips. The work has applications for search-and-rescue missions as well as space exploration.

[ UCSD ]

Thanks Ioana!

Tsuki is a ROS-enabled, highly dynamic quadruped robot developed by Lingkang Zhang.

And as far as we know, Lingkang is still chasing it.

[ Quadruped Tsuki ]

Thanks Lingkang!

Watch this.

This video shows an impressive demo of how YuMi’s superior precision, using precise servo gripper fingers and vacuum suction tool to pick up extremely small parts inside a mechanical watch. The video is not a final application used in production, it is a demo of how such an application can be implemented.

[ ABB ]

Meet Presso, the “5-minute dry cleaning robot.” Can you really call this a robot? We’re not sure. The company says it uses “soft robotics to hold the garment correctly, then clean, sanitize, press and dry under 5 minutes.” The machine was initially designed for use in the hospitality industry, but after adding a disinfectant function for COVID-19, it is now being used on movie and TV sets.

[ Presso ]

The next Mars rover launches next month (!), and here’s a look at some of the instruments on board.

[ JPL ]

Embodied Lead Engineer, Peter Teel, describes why we chose to build Moxie’s computing system from scratch and what makes it so unique.

[ Embodied ]

I did not know that this is where Pepper’s e-stop is. Nice design!

[ Softbank Robotics ]

State of the art in the field of swarm robotics lacks systems capable of absolute decentralization and is hence unable to mimic complex biological swarm systems consisting of simple units. Our research interconnects fields of swarm robotics and computer vision, and introduces novel use of a vision-based method UVDAR for mutual localization in swarm systems, allowing for absolute decentralization found among biological swarm systems. The developed methodology allows us to deploy real-world aerial swarming systems with robots directly localizing each other instead of communicating their states via a communication network, which is a typical bottleneck of current state of the art systems.

[ CVUT ]

I’m almost positive I could not do this task.

It’s easy to pick up objects using YuMi’s integrated vacuum functionality, it also supports ABB Robot’s Conveyor Tracking and Pickmaster 3 functionality, enabling it to track a moving conveyor and pick up objects using vision. Perfect for consumer products handling applications.

[ ABB ]

Cycling safety gestures, such as hand signals and shoulder checks, are an essential part of safe manoeuvring on the road. Child cyclists, in particular, might have difficulties performing safety gestures on the road or even forget about them, given the lack of cycling experience, road distractions and differences in motor and perceptual-motor abilities compared with adults. To support them, we designed two methods to remind about safety gestures while cycling. The first method employs an icon-based reminder in heads-up display (HUD) glasses and the second combines vibration on the handlebar and ambient light in the helmet. We investigated the performance of both methods in a controlled test-track experiment with 18 children using a mid-size tricycle, augmented with a set of sensors to recognize children’s behavior in real time. We found that both systems are successful in reminding children about safety gestures and have their unique advantages and disadvantages.

[ Paper ]

Nathan Sam and Robert “Red” Jensen fabricate and fly a Prandtl-M aircraft at NASA’s Armstrong Flight Research Center in California. The aircraft is the second of three prototypes of varying sizes to provide scientists with options to fly sensors in the Martian atmosphere to collect weather and landing site information for future human exploration of Mars.

[ NASA ]

This is clever: In order to minimize time spent labeling datasets, you can use radar to identify other vehicles, not because the radar can actually recognize other vehicles, but because the radar can recognize other stuff that’s big and moving, which turns out to be almost as good.

[ ICRA Paper ]

Happy 10th birthday to the Natural Robotics Lab at the University of Sheffield.

[ NRL ] Continue reading

Posted in Human Robots

#437769 Q&A: Facebook’s CTO Is at War With ...

Photo: Patricia de Melo Moreira/AFP/Getty Images

Facebook chief technology officer Mike Schroepfer leads the company’s AI and integrity efforts.

Facebook’s challenge is huge. Billions of pieces of content—short and long posts, images, and combinations of the two—are uploaded to the site daily from around the world. And any tiny piece of that—any phrase, image, or video—could contain so-called bad content.

In its early days, Facebook relied on simple computer filters to identify potentially problematic posts by their words, such as those containing profanity. These automatically filtered posts, as well as posts flagged by users as offensive, went to humans for adjudication.

In 2015, Facebook started using artificial intelligence to cull images that contained nudity, illegal goods, and other prohibited content; those images identified as possibly problematic were sent to humans for further review.

By 2016, more offensive photos were reported by Facebook’s AI systems than by Facebook users (and that is still the case).

In 2018, Facebook CEO Mark Zuckerberg made a bold proclamation: He predicted that within five or ten years, Facebook’s AI would not only look for profanity, nudity, and other obvious violations of Facebook’s policies. The tools would also be able to spot bullying, hate speech, and other misuse of the platform, and put an immediate end to them.

Today, automated systems using algorithms developed with AI scan every piece of content between the time when a user completes a post and when it is visible to others on the site—just fractions of a second. In most cases, a violation of Facebook’s standards is clear, and the AI system automatically blocks the post. In other cases, the post goes to human reviewers for a final decision, a workforce that includes 15,000 content reviewers and another 20,000 employees focused on safety and security, operating out of more than 20 facilities around the world.

In the first quarter of this year, Facebook removed or took other action (like appending a warning label) on more than 9.6 million posts involving hate speech, 8.6 million involving child nudity or exploitation, almost 8 million posts involving the sale of drugs, 2.3 million posts involving bullying and harassment, and tens of millions of posts violating other Facebook rules.

Right now, Facebook has more than 1,000 engineers working on further developing and implementing what the company calls “integrity” tools. Using these systems to screen every post that goes up on Facebook, and doing so in milliseconds, is sucking up computing resources. Facebook chief technology officer Mike Schroepfer, who is heading up Facebook’s AI and integrity efforts, spoke with IEEE Spectrum about the team’s progress on building an AI system that detects bad content.

Since that discussion, Facebook’s policies around hate speech have come under increasing scrutiny, with particular attention on divisive posts by political figures. A group of major advertisers in June announced that they would stop advertising on the platform while reviewing the situation, and civil rights groups are putting pressure on others to follow suit until Facebook makes policy changes related to hate speech and groups that promote hate, misinformation, and conspiracies.

Facebook CEO Mark Zuckerberg responded with news that Facebook will widen the category of what it considers hateful content in ads. Now the company prohibits claims that people from a specific race, ethnicity, national origin, religious affiliation, caste, sexual orientation, gender identity, or immigration status are a threat to the physical safety, health, or survival of others. The policy change also aims to better protect immigrants, migrants, refugees, and asylum seekers from ads suggesting these groups are inferior or expressing contempt. Finally, Zuckerberg announced that the company will label some problematic posts by politicians and government officials as content that violates Facebook’s policies.

However, civil rights groups say that’s not enough. And an independent audit released in July also said that Facebook needs to go much further in addressing civil rights concerns and disinformation.

Schroepfer indicated that Facebook’s AI systems are designed to quickly adapt to changes in policy. “I don’t expect considerable technical changes are needed to adjust,” he told Spectrum.

This interview has been edited and condensed for clarity.

IEEE Spectrum: What are the stakes of content moderation? Is this an existential threat to Facebook? And is it critical that you deal well with the issue of election interference this year?

Schroepfer: It’s probably existential; it’s certainly massive. We are devoting a tremendous amount of our attention to it.

The idea that anyone could meddle in an election is deeply disturbing and offensive to all of us here, just as people and citizens of democracies. We don’t want to see that happen anywhere, and certainly not on our watch. So whether it’s important to the company or not, it’s important to us as people. And I feel a similar way on the content-moderation side.

There are not a lot of easy choices here. The only way to prevent people, with certainty, from posting bad things is to not let them post anything. We can take away all voice and just say, “Sorry, the Internet’s too dangerous. No one can use it.” That will certainly get rid of all hate speech online. But I don’t want to end up in that world. And there are variants of that world that various governments are trying to implement, where they get to decide what’s true or not, and you as a person don’t. I don’t want to get there either.

My hope is that we can build a set of tools that make it practical for us to do a good enough job, so that everyone is still excited about the idea that anyone can share what they want, and so that Facebook is a safe and reasonable place for people to operate in.

Spectrum: You joined Facebook in 2008, before AI was part of the company’s toolbox. When did that change? When did you begin to think that AI tools would be useful to Facebook?

Schroepfer: Ten years ago, AI wasn’t commercially practical; the technology just didn’t work very well. In 2012, there was one of those moments that a lot of people point to as the beginning of the current revolution in deep learning and AI. A computer-vision model—a neural network—was trained using what we call supervised training, and it turned out to be better than all the existing models.

Spectrum: How is that training done, and how did computer-vision models come to Facebook?

Image: Facebook

Just Broccoli? Facebook’s image analysis algorithms can tell the difference between marijuana [left] and tempura broccoli [right] better than some humans.

Schroepfer: Say I take a bunch of photos and I have people look at them. If they see a photo of a cat, they put a text label that says cat; if it’s one of a dog, the text label says dog. If you build a big enough data set and feed that to the neural net, it learns how to tell the difference between cats and dogs.

Prior to 2012, it didn’t work very well. And then in 2012, there was this moment where it seemed like, “Oh wow, this technique might work.” And a few years later we were deploying that form of technology to help us detect problematic imagery.

Spectrum: Do your AI systems work equally well on all types of prohibited content?

Schroepfer: Nudity was technically easiest. I don’t need to understand language or culture to understand that this is either a naked human or not. Violence is a much more nuanced problem, so it was harder technically to get it right. And with hate speech, not only do you have to understand the language, it may be very contextual, even tied to recent events. A week before the Christchurch shooting [New Zealand, 2019], saying “I wish you were in the mosque” probably doesn’t mean anything. A week after, that might be a terrible thing to say.

Spectrum: How much progress have you made on hate speech?

Schroepfer: AI, in the first quarter of 2020, proactively detected 88.8 percent of the hate-speech content we removed, up from 80.2 percent in the previous quarter. In the first quarter of 2020, we took action on 9.6 million pieces of content for violating our hate-speech policies.

Image: Facebook

Off Label: Sometimes image analysis isn’t enough to determine whether a picture posted violates the company’s policies. In considering these candy-colored vials of marijuana, for example, the algorithms can look at any accompanying text and, if necessary, comments on the post.

Spectrum: It sounds like you’ve expanded beyond tools that analyze images and are also using AI tools that analyze text.

Schroepfer: AI started off as very siloed. People worked on language, people worked on computer vision, people worked on video. We’ve put these things together—in production, not just as research—into multimodal classifiers.

[Schroepfer shows a photo of a pan of Rice Krispies treats, with text referring to it as a “potent batch”] This is a case in which you have an image, and then you have the text on the post. This looks like Rice Krispies. On its own, this image is fine. You put the text together with it in a bigger model; that can then understand what’s going on. That didn’t work five years ago.

Spectrum: Today, every post that goes up on Facebook is immediately checked by automated systems. Can you explain that process?

Image: Facebook

Bigger Picture: Identifying hate speech is often a matter of context. Either the text or the photo in this post isn’t hateful standing alone, but putting them together tells a different story.

Schroepfer: You upload an image and you write some text underneath it, and the systems look at both the image and the text to try to see which, if any, policies it violates. Those decisions are based on our Community Standards. It will also look at other signals on the posts, like the comments people make.

It happens relatively instantly, though there may be times things happen after the fact. Maybe you uploaded a post that had misinformation in it, and at the time you uploaded it, we didn’t know it was misinformation. The next day we fact-check something and scan again; we may find your post and take it down. As we learn new things, we’re going to go back through and look for violations of what we now know to be a problem. Or, as people comment on your post, we might update our understanding of it. If people are saying, “That’s terrible,” or “That’s mean,” or “That looks fake,” those comments may be an interesting signal.

Spectrum: How is Facebook applying its AI tools to the problem of election interference?

Schroepfer: I would split election interference into two categories. There are times when you’re going after the content, and there are times you’re going after the behavior or the authenticity of the person.

On content, if you’re sharing misinformation, saying, “It’s super Wednesday, not super Tuesday, come vote on Wednesday,” that’s a problem whether you’re an American sitting in California or a foreign actor.

Other times, people create a series of Facebook pages pretending they’re Americans, but they’re really a foreign entity. That is a problem on its own, even if all the content they’re sharing completely meets our Community Standards. The problem there is that you have a foreign government running an information operation.

There, you need different tools. What you’re trying to do is put pieces together, to say, “Wait a second. All of these pages—Martians for Justice, Moonlings for Justice, and Venusians for Justice”—are all run by an administrator with an IP address that’s outside the United States. So they’re all connected, even though they’re pretending to not be connected. That’s a very different problem than me sitting in my office in Menlo Park [Calif.] sharing misinformation.

I’m not going to go into lots of technical detail, because this is an area of adversarial nature. The fundamental problem you’re trying to solve is that there’s one entity coordinating the activity of a bunch of things that look like they’re not all one thing. So this is a series of Instagram accounts, or a series of Facebook pages, or a series of WhatsApp accounts, and they’re pretending to be totally different things. We’re looking for signals that these things are related in some way. And we’re looking through the graph [what Facebook calls its map of relationships between users] to understand the properties of this network.

Spectrum: What cutting-edge AI tools and methods have you been working on lately?

Schroepfer: Supervised learning, with humans setting up the instruction process for the AI systems, is amazingly effective. But it has a very obvious flaw: the speed at which you can develop these things is limited by how fast you can curate the data sets. If you’re dealing in a problem domain where things change rapidly, you have to rebuild a new data set and retrain the whole thing.

Self-supervision is inspired by the way people learn, by the way kids explore the world around them. To get computers to do it themselves, we take a bunch of raw data and build a way for the computer to construct its own tests. For language, you scan a bunch of Web pages, and the computer builds a test where it takes a sentence, eliminates one of the words, and figures out how to predict what word belongs there. And because it created the test, it actually knows the answer. I can use as much raw text as I can find and store because it’s processing everything itself and doesn’t require us to sit down and build the information set. In the last two years there has been a revolution in language understanding as a result of AI self-supervised learning.

Spectrum: What else are you excited about?

Schroepfer: What we’ve been working on over the last few years is multilingual understanding. Usually, when I’m trying to figure out, say, whether something is hate speech or not I have to go through the whole process of training the model in every language. I have to do that one time for every language. When you make a post, the first thing we have to figure out is what language your post is in. “Ah, that’s Spanish. So send it to the Spanish hate-speech model.”

We’ve started to build a multilingual model—one box where you can feed in text in 40 different languages and it determines whether it’s hate speech or not. This is way more effective and easier to deploy.

To geek out for a second, just the idea that you can build a model that understands a concept in multiple languages at once is crazy cool. And it not only works for hate speech, it works for a variety of things.

When we started working on this multilingual model years ago, it performed worse than every single individual model. Now, it not only works as well as the English model, but when you get to the languages where you don’t have enough data, it’s so much better. This rapid progress is very exciting.

Spectrum: How do you move new AI tools from your research labs into operational use?

Schroepfer: Engineers trying to make the next breakthrough will often say, “Cool, I’ve got a new thing and it achieved state-of-the-art results on machine translation.” And we say, “Great. How long does it take to run in production?” They say, “Well, it takes 10 seconds for every sentence to run on a CPU.” And we say, “It’ll eat our whole data center if we deploy that.” So we take that state-of-the-art model and we make it 10 or a hundred or a thousand times more efficient, maybe at the cost of a little bit of accuracy. So it’s not as good as the state-of-the-art version, but it’s something we can actually put into our data centers and run in production.

Spectrum: What’s the role of the humans in the loop? Is it true that Facebook currently employs 35,000 moderators?

Schroepfer: Yes. Right now our goal is not to reduce that. Our goal is to do a better job catching bad content. People often think that the end state will be a fully automated system. I don’t see that world coming anytime soon.

As automated systems get more sophisticated, they take more and more of the grunt work away, freeing up the humans to work on the really gnarly stuff where you have to spend an hour researching.

We also use AI to give our human moderators power tools. Say I spot this new meme that is telling everyone to vote on Wednesday rather than Tuesday. I have a tool in front of me that says, “Find variants of that throughout the system. Find every photo with the same text, find every video that mentions this thing and kill it in one shot.” Rather than, I found this one picture, but then a bunch of other people upload that misinformation in different forms.

Another important aspect of AI is that anything I can do to prevent a person from having to look at terrible things is time well spent. Whether it’s a person employed by us as a moderator or a user of our services, looking at these things is a terrible experience. If I can build systems that take the worst of the worst, the really graphic violence, and deal with that in an automated fashion, that’s worth a lot to me. Continue reading

Posted in Human Robots