Tag Archives: eye

#437735 Robotic Chameleon Tongue Snatches Nearby ...

Chameleons may be slow-moving lizards, but their tongues can accelerate at astounding speeds, snatching insects before they have any chance of fleeing. Inspired by this remarkable skill, researchers in South Korea have developed a robotic tongue that springs forth quickly to snatch up nearby items.

They envision the tool, called Snatcher, being used by drones and robots that need to collect items without getting too close to them. “For example, a quadrotor with this manipulator will be able to snatch distant targets, instead of hovering and picking up,” explains Gwang-Pil Jung, a researcher at Seoul National University of Science and Technology (SeoulTech) who co-designed the new device.

There has been other research into robotic chameleon tongues, but what’s unique about Snatcher is that it packs chameleon-tongue snatching speed into a portable form factor: the total size is 12 x 8.5 x 8.5 centimeters and it weighs under 120 grams. Still, it can snatch up to 30 grams from 80 centimeters away in under 600 milliseconds.

Image: SeoulTech

The fast snatching deployable arm is powered by a wind-up spring attached to a motor (a series elastic actuator) combined with an active clutch. The clutch is what allows the single spring to drive both the shooting and the retracting.

To create Snatcher, Jung and a colleague at SeoulTech, Dong-Jun Lee, set about developing a spring-like device that’s controlled by an active clutch combined with a single series elastic actuator. Powered by a wind-up spring, a steel tapeline—analogous to a chameleon’s tongue—passes through two geared feeders. The clutch is what allows the single spring unwinding in one direction to drive both the shooting and the retracting, by switching a geared wheel between driving the forward feeder or the backward feeder.

The end result is a lightweight snatching device that can retrieve an object 0.8 meters away within 600 milliseconds. Jung notes that some other, existing devices designed for retrieval are capable of accomplishing the task quicker, at about 300 milliseconds, but these designs tend to be bulky. A more detailed description of Snatcher was published July 21 in IEEE Robotics and Automation Letters.

Photo: Dong-Jun Lee and Gwang-Pil Jung/SeoulTech

Snatcher’s relatively small size means that it can be installed on a DJI Phantom drone. The researchers want to find out if their system can help make package delivery or retrieval faster and safer.

“Our final goal is to install the Snatcher to a commercial drone and achieve meaningful work, such as grasping packages,” says Jung. One of the challenges they still need to address is how to power the actuation system more efficiently. “To solve this issue, we are finding materials having high energy density.” Another improvement is designing a chameleon tongue-like gripper, replacing the simple hook that’s currently used to pick up objects. “We are planning to make a bi-stable gripper to passively grasp a target object as soon as the gripper contacts the object,” says Jung.


Posted in Human Robots

#437590 Why We Need a Robot Registry


I have a confession to make: A robot haunts my nightmares. For me, Boston Dynamics’ Spot robot is 32.5 kilograms (71.1 pounds) of pure terror. It can climb stairs. It can open doors. Seeing it in a video cannot prepare you for the moment you cross paths on a trade-show floor. Now that companies can buy a Spot robot for US $74,500, you might encounter Spot anywhere.

Spot robots now patrol public parks in Singapore to enforce social distancing during the pandemic. They meet with COVID-19 patients at Boston’s Brigham and Women’s Hospital so that doctors can conduct remote consultations. Imagine coming across Spot while walking in the park or returning to your car in a parking garage. Wouldn’t you want to know why this hunk of metal is there and who’s operating it? Or at least whom to call to report a malfunction?

Robots are becoming more prominent in daily life, which is why I think governments need to create national registries of robots. Such a registry would let citizens and law enforcement look up the owner of any roaming robot, as well as learn that robot’s purpose. It’s not a far-fetched idea: The U.S. Federal Aviation Administration already has a registry for drones.

Governments could create national databases that require any companies operating robots in public spaces to report the robot make and model, its purpose, and whom to contact if the robot breaks down or causes problems. To allow anyone to use the database, all public robots would have an easily identifiable marker or model number on their bodies. Think of it as a license plate or pet microchip, but for bots.
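To make the idea concrete, here is a minimal sketch of what one registry entry and a lookup by marker might look like. Every name in it (`RobotRecord`, `RobotRegistry`, the `RB-0001` marker format, the operator and phone number) is hypothetical and invented for illustration; no existing registry is known to use this layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RobotRecord:
    """One registry entry; all field names are assumptions for illustration."""
    marker_id: str   # the "license plate" printed on the robot's body
    make: str
    model: str
    purpose: str
    operator: str
    contact: str     # whom to call if the robot breaks down


class RobotRegistry:
    """In-memory stand-in for a national database keyed by marker ID."""

    def __init__(self) -> None:
        self._records: Dict[str, RobotRecord] = {}

    def register(self, record: RobotRecord) -> None:
        # Refuse duplicate markers so each plate maps to exactly one robot.
        if record.marker_id in self._records:
            raise ValueError(f"marker {record.marker_id} is already registered")
        self._records[record.marker_id] = record

    def lookup(self, marker_id: str) -> Optional[RobotRecord]:
        # Returns None for an unregistered marker.
        return self._records.get(marker_id)


registry = RobotRegistry()
registry.register(RobotRecord(
    marker_id="RB-0001", make="Boston Dynamics", model="Spot",
    purpose="park patrol", operator="Example Parks Dept.",
    contact="+1-555-0100",
))
```

A passerby who reads `RB-0001` off the robot could then call `registry.lookup("RB-0001")` to find its purpose and a contact number.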

There are some smaller-scale registries today. San Jose’s Department of Transportation (SJDOT), for example, is working with Kiwibot, a delivery robot manufacturer, to get real-time data from the robots as they roam the city’s streets. The Kiwibots report their location to SJDOT using the open-source Mobility Data Specification, which was originally developed by Los Angeles to track Bird scooters.

Real-time location reporting makes sense for Kiwibots and Spots wandering the streets, but it’s probably overkill for bots confined to cleaning floors or patrolling parking lots. That said, any robots that come in contact with the general public should clearly provide basic credentials and a way to hold their operators accountable. Given that many robots use cameras, people may also be interested in looking up who’s collecting and using that data.

I started thinking about robot registries after Spot became available in June for anyone to purchase. The idea gained specificity after listening to Andra Keay, founder and managing director at Silicon Valley Robotics, discuss her five rules of ethical robotics at an Arm event in October. I had already been thinking that we needed some way to track robots, but her suggestion to tie robot license plates to a formal registry made me realize that people also need a way to clearly identify individual robots.

Keay pointed out that in addition to sating public curiosity and keeping an eye on robots that could cause harm, a registry could also track robots that have been hacked. For example, robots at risk of being hacked and running amok could be required to report their movements to a database, even if they’re typically restricted to a grocery store or warehouse. While we’re at it, Spot robots should be required to have sirens, because there’s no way I want one of those sneaking up on me.

This article appears in the December 2020 print issue as “Who’s Behind That Robot?”


#437579 Disney Research Makes Robotic Gaze ...

While it’s not totally clear to what extent human-like robots are better than conventional robots for most applications, one area where I’m personally comfortable with them is entertainment. The folks over at Disney Research, who are all about entertainment, have been working on this sort of thing for a very long time, and some of their animatronic attractions are actually quite impressive.

The next step for Disney is to make its animatronic figures, which currently feature scripted behaviors, perform interactively with visitors. The challenge is that this is where you start to get into potential Uncanny Valley territory, which is what happens when you try to create “the illusion of life,” which is what Disney (they explicitly say) is trying to do.

In a paper presented at IROS this month, a team from Disney Research, Caltech, University of Illinois at Urbana-Champaign, and Walt Disney Imagineering is trying to nail that illusion of life with a single, and perhaps most important, social cue: eye gaze.

Before you watch this video, keep in mind that you’re watching a specific character, as Disney describes:

The robot character plays an elderly man reading a book, perhaps in a library or on a park bench. He has difficulty hearing and his eyesight is in decline. Even so, he is constantly distracted from reading by people passing by or coming up to greet him. Most times, he glances at people moving quickly in the distance, but as people encroach into his personal space, he will stare with disapproval for the interruption, or provide those that are familiar to him with friendly acknowledgment.

What, exactly, does “lifelike” mean in the context of robotic gaze? The paper abstract describes the goal as “[seeking] to create an interaction which demonstrates the illusion of life.” I suppose you could think of it like a sort of old-fashioned Turing test focused on gaze: If the gaze of this robot cannot be distinguished from the gaze of a human, then victory, that’s lifelike. And critically, we’re talking about mutual gaze here—not just a robot gazing off into the distance, but you looking deep into the eyes of this robot and it looking right back at you just like a human would. Or, just like some humans would.

The approach that Disney is using is more animation-y than biology-y or psychology-y. In other words, they’re not trying to figure out what’s going on in our brains to make our eyes move the way that they do when we’re looking at other people and basing their control system on that, but instead, Disney just wants it to look right. This “visual appeal” approach is totally fine, and there’s been an enormous amount of human-robot interaction (HRI) research behind it already, albeit usually with less explicitly human-like platforms. And speaking of human-like platforms, the hardware is a “custom Walt Disney Imagineering Audio-Animatronics bust,” which has DoFs that include neck, eyes, eyelids, and eyebrows.

In order to decide on gaze motions, the system first identifies a person to target with its attention using an RGB-D camera. If more than one person is visible, the system calculates a curiosity score for each, currently simplified to be based on how much motion it sees. Based on whichever visible person has the highest curiosity score, the system chooses from a variety of high-level gaze behavior states, including:

Read: The Read state can be considered the “default” state of the character. When not executing another state, the robot character will return to the Read state. Here, the character will appear to read a book located at torso level.

Glance: A transition to the Glance state from the Read or Engage states occurs when the attention engine indicates that there is a stimuli with a curiosity score […] above a certain threshold.

Engage: The Engage state occurs when the attention engine indicates that there is a stimuli […] to meet a threshold and can be triggered from both Read and Glance states. This state causes the robot to gaze at the person-of-interest with both the eyes and head.

Acknowledge: The Acknowledge state is triggered from either Engage or Glance states when the person-of-interest is deemed to be familiar to the robot.
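The four states above can be read as a small state machine. What follows is a rough sketch of that reading, not Disney’s implementation: the two thresholds, the `Person` fields, and the exact transition rules are assumptions made for illustration, since the excerpts above elide the actual conditions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class GazeState(Enum):
    READ = "read"
    GLANCE = "glance"
    ENGAGE = "engage"
    ACKNOWLEDGE = "acknowledge"


# Hypothetical thresholds; the real values are not given in the text above.
GLANCE_THRESHOLD = 0.3
ENGAGE_THRESHOLD = 0.7


@dataclass
class Person:
    motion: float           # stand-in for the motion-based curiosity score
    familiar: bool = False  # whether the character "knows" this person


def most_curious(people: List[Person]) -> Optional[Person]:
    # Pick the visible person with the highest curiosity score, if any.
    return max(people, key=lambda p: p.motion, default=None)


def next_state(state: GazeState, people: List[Person]) -> GazeState:
    target = most_curious(people)
    # Nobody interesting enough: fall back to the default Read state.
    if target is None or target.motion < GLANCE_THRESHOLD:
        return GazeState.READ
    # Acknowledge is only reachable from Glance or Engage.
    if target.familiar and state in (GazeState.GLANCE, GazeState.ENGAGE):
        return GazeState.ACKNOWLEDGE
    # Engage is only reachable from Read or Glance.
    if target.motion >= ENGAGE_THRESHOLD:
        if state in (GazeState.READ, GazeState.GLANCE):
            return GazeState.ENGAGE
        return state
    # Glance is only reachable from Read or Engage.
    if state in (GazeState.READ, GazeState.ENGAGE):
        return GazeState.GLANCE
    return state
```

Calling `next_state` once per camera frame would walk the character from reading, to a glance, to full engagement as someone approaches, under the assumed thresholds.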

Running underneath these higher level behavior states are lower level motion behaviors like breathing, small head movements, eye blinking, and saccades (the quick eye movements that occur when people, or robots, look between two different focal points). The term for this hierarchical behavioral state layering is a subsumption architecture, which goes all the way back to Rodney Brooks’ work on robots like Genghis in the 1980s and Cog and Kismet in the ’90s, and it provides a way for more complex behaviors to emerge from a set of simple, decentralized low-level behaviors.


Brooks, an emeritus professor at MIT and, most recently, cofounder and CTO of Robust.ai, tweeted about the Disney project, saying: “People underestimate how long it takes to get from academic paper to real world robotics. 25 years on Disney is using my subsumption architecture for humanoid eye control, better and smoother now than our 1995 implementations on Cog and Kismet.”

From the paper:

Although originally intended for control of mobile robots, we find that the subsumption architecture, as presented in [17], lends itself as a framework for organizing animatronic behaviors. This is due to the analogous use of subsumption in human behavior: human psychomotor behavior can be intuitively modeled as layered behaviors with incoming sensory inputs, where higher behavioral levels are able to subsume lower behaviors. At the lowest level, we have involuntary movements such as heartbeats, breathing and blinking. However, higher behavioral responses can take over and control lower level behaviors, e.g., fight-or-flight response can induce faster heart rate and breathing. As our robot character is modeled after human morphology, mimicking biological behaviors through the use of a bottom-up approach is straightforward.
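The layering described in that passage can be sketched in a few lines. This is a heavily simplified take on subsumption (Brooks’ original design wires together finite state machines with suppression and inhibition links), and the behavior names, sensor keys, and numeric values below are all made up for illustration.

```python
from typing import Callable, Dict, List, Optional

Command = Dict[str, float]   # actuator name -> target value
Sensors = Dict[str, float]
Behavior = Callable[[Sensors], Optional[Command]]


def breathing(sensors: Sensors) -> Optional[Command]:
    # Lowest layer: always active.
    return {"chest": 0.2}


def blinking(sensors: Sensors) -> Optional[Command]:
    # Close the eyelids if it has been a while since the last blink.
    if sensors.get("time_since_blink", 0.0) > 4.0:
        return {"eyelids": 1.0}
    return None


def engage(sensors: Sensors) -> Optional[Command]:
    # Highest layer: turn toward a nearby person and hold the eyes open.
    if sensors.get("person_distance", float("inf")) < 1.5:
        return {"neck_yaw": sensors.get("person_bearing", 0.0), "eyelids": 0.0}
    return None


LAYERS: List[Behavior] = [breathing, blinking, engage]  # lowest to highest


def arbitrate(sensors: Sensors) -> Command:
    command: Command = {}
    for behavior in LAYERS:
        output = behavior(sensors)
        if output:
            # A higher layer overwrites (subsumes) whatever a lower
            # layer wanted for the same actuator.
            command.update(output)
    return command
```

With nobody nearby, a pending blink goes through; once a person is within the assumed 1.5-meter range, the engage layer overrides the eyelid command while breathing continues untouched.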

The result, as the video shows, appears to be quite good, although it’s hard to tell how it would all come together if the robot had more of, you know, a face. But it seems like you don’t necessarily need to have a lifelike humanoid robot to take advantage of this architecture in an HRI context—any robot that wants to make a gaze-based connection with a human could benefit from doing it in a more human-like way.

“Realistic and Interactive Robot Gaze,” by Matthew K.X.J. Pan, Sungjoon Choi, James Kennedy, Kyna McIntosh, Daniel Campos Zamora, Gunter Niemeyer, Joohyung Kim, Alexis Wieland, and David Christensen from Disney Research, California Institute of Technology, University of Illinois at Urbana-Champaign, and Walt Disney Imagineering, was presented at IROS 2020. You can find the full paper, along with a 13-minute video presentation, on the IROS on-demand conference website.



#437491 3.2 Billion Images and 720,000 Hours of ...

Twitter over the weekend “tagged” as manipulated a video showing US Democratic presidential candidate Joe Biden supposedly forgetting which state he’s in while addressing a crowd.

Biden’s “hello Minnesota” greeting contrasted with prominent signage reading “Tampa, Florida” and “Text FL to 30330.”

The Associated Press’s fact check confirmed the signs were added digitally and the original footage was indeed from a Minnesota rally. But by the time the misleading video was removed it already had more than one million views, The Guardian reports.

A FALSE video claiming Biden forgot what state he was in was viewed more than 1 million times on Twitter in the past 24 hours

In the video, Biden says “Hello, Minnesota.”

The event did indeed happen in MN — signs on stage read MN

But false video edited signs to read Florida pic.twitter.com/LdHQVaky8v

— Donie O'Sullivan (@donie) November 1, 2020

If you use social media, the chances are you see (and forward) some of the more than 3.2 billion images and 720,000 hours of video shared daily. When faced with such a glut of content, how can we know what’s real and what’s not?

While one part of the solution is an increased use of content verification tools, it’s equally important we all boost our digital media literacy. Ultimately, one of the best lines of defense—and the only one you can control—is you.

Seeing Shouldn’t Always Be Believing
Misinformation (when you accidentally share false content) and disinformation (when you intentionally share it) in any medium can erode trust in civil institutions such as news organizations, coalitions and social movements. However, fake photos and videos are often the most potent.

For those with a vested political interest, creating, sharing and/or editing false images can distract, confuse and manipulate viewers to sow discord and uncertainty (especially in already polarized environments). Posters and platforms can also make money from the sharing of fake, sensationalist content.

Only 11-25 percent of journalists globally use social media content verification tools, according to the International Centre for Journalists.

Could You Spot a Doctored Image?
Consider this photo of Martin Luther King Jr.

Dr. Martin Luther King Jr. Giving the middle finger #DopeHistoricPics pic.twitter.com/5W38DRaLHr

— Dope Historic Pics (@dopehistoricpic) December 20, 2013

This altered image clones part of the background over King Jr’s finger, so it looks like he’s flipping off the camera. It has been shared as genuine on Twitter, Reddit, and white supremacist websites.

In the original 1964 photo, King flashed the “V for victory” sign after learning the US Senate had passed the civil rights bill.

“Those who love peace must learn to organize as effectively as those who love war.”
Dr. Martin Luther King Jr.

This photo was taken on June 19th, 1964, showing Dr King giving a peace sign after hearing that the civil rights bill had passed the senate. @snopes pic.twitter.com/LXHmwMYZS5

— Willie's Reserve (@WilliesReserve) January 21, 2019

Beyond adding or removing elements, there’s a whole category of photo manipulation in which images are fused together.

Earlier this year, a photo of an armed man was photoshopped by Fox News, which overlaid the man onto other scenes without disclosing the edits, the Seattle Times reported.

You mean this guy who’s been photoshopped into three separate photos released by Fox News? pic.twitter.com/fAXpIKu77a

— Zander Yates ザンダーイェーツ (@ZanderYates) June 13, 2020

Similarly, the image below was shared thousands of times on social media in January, during Australia’s Black Summer bushfires. The AFP’s fact check confirmed it is not authentic and is actually a combination of several separate photos.

Image is more powerful than screams of Greta. A silent girl is holding a koala. She looks straight at you from the waters of the ocean where they found a refuge. She is wearing a breathing mask. A wall of fire is behind them. I do not know the name of the photographer #Australia pic.twitter.com/CrTX3lltdh

— EVC Music (@EVCMusicUK) January 6, 2020

Fully and Partially Synthetic Content
Online, you’ll also find sophisticated “deepfake” videos showing (usually famous) people saying or doing things they never did. Less advanced versions can be created using apps such as Zao and Reface.

Or, if you don’t want to use your photo for a profile picture, you can default to one of several websites offering hundreds of thousands of AI-generated, photorealistic images of people.

These people don’t exist, they’re just images generated by artificial intelligence. Generated Photos, CC BY

Editing Pixel Values and the (not so) Simple Crop
Cropping can greatly alter the context of a photo, too.

We saw this in 2017, when a US government employee edited official pictures of Donald Trump’s inauguration to make the crowd appear bigger, according to The Guardian. The staffer cropped out the empty space “where the crowd ended” for a set of pictures for Trump.

Views of the crowds at the inaugurations of former US President Barack Obama in 2009 (left) and President Donald Trump in 2017 (right). AP

But what about edits that only alter pixel values such as color, saturation, or contrast?

One historical example illustrates the consequences of this. In 1994, Time magazine’s cover of OJ Simpson considerably “darkened” Simpson in his police mugshot. This added fuel to a case already plagued by racial tension, to which the magazine responded, “No racial implication was intended, by Time or by the artist.”

Tools for Debunking Digital Fakery
For those of us who don’t want to be duped by visual mis/disinformation, there are tools available—although each comes with its own limitations (something we discuss in our recent paper).

Invisible digital watermarking has been proposed as a solution. However, it isn’t widespread and requires buy-in from both content publishers and distributors.

Reverse image search (such as Google’s) is often free and can be helpful for identifying earlier, potentially more authentic copies of images online. That said, it’s not foolproof because it:

Relies on unedited copies of the media already being online.
Doesn’t search the entire web.
Doesn’t always allow filtering by publication time. Some reverse image search services such as TinEye support this function, but Google’s doesn’t.
Returns only exact matches or near-matches, so it’s not thorough. For instance, editing an image and then flipping its orientation can fool Google into thinking it’s an entirely different one.
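To see why a flipped image can look like a different picture to a matching system, consider a toy "average hash," a common perceptual-hashing idea. This sketch does not describe how Google or TinEye actually work; it skips the usual downscaling step and operates on a made-up 4x4 grayscale grid.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixels, values 0-255


def average_hash(image: Image) -> List[int]:
    # One bit per pixel: 1 if brighter than the image's mean, else 0.
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a: List[int], b: List[int]) -> int:
    # Number of bit positions where the two hashes disagree.
    return sum(x != y for x, y in zip(a, b))


def flip_horizontal(image: Image) -> Image:
    return [list(reversed(row)) for row in image]


# Toy image: dark on the left half, bright on the right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
brightened = [[min(255, p + 5) for p in row] for row in original]
flipped = flip_horizontal(original)

same = hamming(average_hash(original), average_hash(brightened))
different = hamming(average_hash(original), average_hash(flipped))
```

Here `same` is 0, because a slight brightness change leaves every bit intact, while `different` is 16, because mirroring this image flips every bit of the hash. A matcher that only accepts small distances would treat the mirrored copy as unrelated.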

Most Reliable Tools Are Sophisticated
Meanwhile, manual forensic detection methods for visual mis/disinformation focus mostly on edits visible to the naked eye, or rely on examining features that aren’t included in every image (such as shadows). They’re also time-consuming, expensive, and need specialized expertise.

Still, you can access work in this field by visiting sites such as Snopes.com—which has a growing repository of “fauxtography.”

Computer vision and machine learning also offer relatively advanced detection capabilities for images and videos. But they too require technical expertise to operate and understand.

Moreover, improving them involves using large volumes of “training data,” but the image repositories used for this usually don’t contain the real-world images seen in the news.

If you use an image verification tool such as the REVEAL project’s image verification assistant, you might need an expert to help interpret the results.

The good news, however, is that before turning to any of the above tools, there are some simple questions you can ask yourself to potentially figure out whether a photo or video on social media is fake. Think:

Was it originally made for social media?
How widely and for how long was it circulated?
What responses did it receive?
Who were the intended audiences?

Quite often, the logical conclusions drawn from the answers will be enough to weed out inauthentic visuals. The full list of questions was put together by Manchester Metropolitan University experts.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Image Credit: Simon Steinberger from Pixabay


#437373 Microsoft’s New Deepfake Detector Puts ...

The upcoming US presidential election seems set to be something of a mess—to put it lightly. Covid-19 will likely deter millions from voting in person, and mail-in voting isn’t shaping up to be much more promising. This all comes at a time when political tensions are running higher than they have in decades, issues that shouldn’t be political (like mask-wearing) have become highly politicized, and Americans are dramatically divided along party lines.

So the last thing we need right now is yet another wrench in the spokes of democracy, in the form of disinformation; we all saw how that played out in 2016, and it wasn’t pretty. For the record, disinformation purposely misleads people, while misinformation is simply inaccurate, but without malicious intent. While there’s not a ton tech can do to make people feel safe at crowded polling stations or up the Postal Service’s budget, tech can help with disinformation, and Microsoft is trying to do so.

On Tuesday the company released two new tools designed to combat disinformation, described in a blog post by VP of Customer Security and Trust Tom Burt and Chief Scientific Officer Eric Horvitz.

The first is Microsoft Video Authenticator, which is made to detect deepfakes. In case you’re not familiar with this wicked byproduct of AI progress, “deepfakes” refers to audio or visual files made using artificial intelligence that can manipulate people’s voices or likenesses to make it look like they said things they didn’t. Editing a video to string together words and form a sentence someone didn’t say doesn’t count as a deepfake; though there’s manipulation involved, you don’t need a neural network and you’re not generating any original content or footage.

The Authenticator analyzes videos or images and tells users the percentage chance that they’ve been artificially manipulated. For videos, the tool can even analyze individual frames in real time.

Deepfake videos are made by feeding hundreds of hours of video of someone into a neural network, “teaching” the network the minutiae of the person’s voice, pronunciation, mannerisms, gestures, etc. It’s like when you do an imitation of your annoying coworker from accounting, complete with mimicking the way he makes every sentence sound like a question and his eyes widen when he talks about complex spreadsheets. You’ve spent hours—no, months—in his presence and have his personality quirks down pat. An AI algorithm that produces deepfakes needs to learn those same quirks, and more, about whoever the creator’s target is.

Given enough real information and examples, the algorithm can then generate its own fake footage, with deepfake creators using computer graphics and manually tweaking the output to make it as realistic as possible.

The scariest part? To make a deepfake, you don’t need a fancy computer or even a ton of knowledge about software. There are open-source programs people can access for free online, and as far as finding video footage of famous people—well, we’ve got YouTube to thank for how easy that is.

Microsoft’s Video Authenticator can detect the blending boundary of a deepfake and subtle fading or greyscale elements that the human eye may not be able to see.

In the blog post, Burt and Horvitz point out that as time goes by, deepfakes are only going to get better and become harder to detect; after all, they’re generated by neural networks that are continuously learning from and improving themselves.

Microsoft’s counter-tactic is to come in from the opposite angle, that is, being able to confirm beyond doubt that a video, image, or piece of news is real (I mean, can McDonald’s fries cure baldness? Did a seal slap a kayaker in the face with an octopus? Never has it been so imperative that the world know the truth).

A tool built into Microsoft Azure, the company’s cloud computing service, lets content producers add digital hashes and certificates to their content, and a reader (which can be used as a browser extension) checks the certificates and matches the hashes to indicate the content is authentic.
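The general hash-and-check idea can be sketched with Python’s standard library. This is not Microsoft’s tool or its API: the function names and the key are hypothetical, and an HMAC with a shared secret stands in for the certificate, where a real system would use public-key signatures.

```python
import hashlib
import hmac

# Hypothetical stand-in for a publisher's signing key.
PUBLISHER_KEY = b"hypothetical-demo-key"


def publish(content: bytes) -> dict:
    # Producer side: fingerprint the content, then sign the fingerprint.
    digest = hashlib.sha256(content).hexdigest()
    signature = hmac.new(PUBLISHER_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return {"sha256": digest, "signature": signature}


def verify(content: bytes, manifest: dict) -> bool:
    # Reader side: recompute the hash and check it against the signed manifest.
    digest = hashlib.sha256(content).hexdigest()
    expected = hmac.new(PUBLISHER_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(digest, manifest["sha256"])
            and hmac.compare_digest(expected, manifest["signature"]))


video = b"original frames"
manifest = publish(video)
```

`verify(video, manifest)` succeeds for the untouched bytes, and changing even one byte of the content makes the recomputed hash differ, so the check fails.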

Finally, Microsoft also launched an interactive “Spot the Deepfake” quiz it developed in collaboration with the University of Washington’s Center for an Informed Public, deepfake detection company Sensity, and USA Today. The quiz is intended to help people “learn about synthetic media, develop critical media literacy skills, and gain awareness of the impact of synthetic media on democracy.”

The impact Microsoft’s new tools will have remains to be seen—but hey, we’re glad they’re trying. And they’re not alone; Facebook, Twitter, and YouTube have all taken steps to ban and remove deepfakes from their sites. The AI Foundation’s Reality Defender uses synthetic media detection algorithms to identify fake content. There’s even a coalition of big tech companies teaming up to try to fight election interference.

One thing is for sure: between a global pandemic, widespread protests and riots, mass unemployment, a hobbled economy, and the disinformation that’s remained rife through it all, we’re going to need all the help we can get to make it through not just the election, but the rest of the conga-line-of-catastrophes year that is 2020.

Image Credit: Darius Bashar on Unsplash
