#437491 3.2 Billion Images and 720,000 Hours of ...
Twitter over the weekend “tagged” as manipulated a video showing US Democratic presidential candidate Joe Biden supposedly forgetting which state he’s in while addressing a crowd.
Biden’s “hello Minnesota” greeting contrasted with prominent signage reading “Tampa, Florida” and “Text FL to 30330.”
The Associated Press’s fact check confirmed the signs were added digitally and the original footage was indeed from a Minnesota rally. But by the time the misleading video was removed it already had more than one million views, The Guardian reports.
A FALSE video claiming Biden forgot what state he was in was viewed more than 1 million times on Twitter in the past 24 hours
In the video, Biden says “Hello, Minnesota.”
The event did indeed happen in MN — signs on stage read MN
But false video edited signs to read Florida pic.twitter.com/LdHQVaky8v
— Donie O'Sullivan (@donie) November 1, 2020
If you use social media, the chances are you see (and forward) some of the more than 3.2 billion images and 720,000 hours of video shared daily. When faced with such a glut of content, how can we know what’s real and what’s not?
While one part of the solution is an increased use of content verification tools, it’s equally important we all boost our digital media literacy. Ultimately, one of the best lines of defense—and the only one you can control—is you.
Seeing Shouldn’t Always Be Believing
Misinformation (when you accidentally share false content) and disinformation (when you intentionally share it) in any medium can erode trust in civil institutions such as news organizations, coalitions and social movements. However, fake photos and videos are often the most potent.
For those with a vested political interest, creating, sharing and/or editing false images can distract, confuse and manipulate viewers to sow discord and uncertainty (especially in already polarized environments). Posters and platforms can also make money from the sharing of fake, sensationalist content.
Only 11-25 percent of journalists globally use social media content verification tools, according to the International Center for Journalists.
Could You Spot a Doctored Image?
Consider this photo of Martin Luther King Jr.
Dr. Martin Luther King Jr. Giving the middle finger #DopeHistoricPics pic.twitter.com/5W38DRaLHr
— Dope Historic Pics (@dopehistoricpic) December 20, 2013
This altered image clones part of the background over King Jr’s finger, so it looks like he’s flipping off the camera. It has been shared as genuine on Twitter, Reddit, and white supremacist websites.
In the original 1964 photo, King flashed the “V for victory” sign after learning the US Senate had passed the civil rights bill.
“Those who love peace must learn to organize as effectively as those who love war.”
Dr. Martin Luther King Jr.
This photo was taken on June 19th, 1964, showing Dr King giving a peace sign after hearing that the civil rights bill had passed the senate. @snopes pic.twitter.com/LXHmwMYZS5
— Willie's Reserve (@WilliesReserve) January 21, 2019
Beyond adding or removing elements, there’s a whole category of photo manipulation in which images are fused together.
Earlier this year, a photo of an armed man was photoshopped by Fox News, which overlaid the man onto other scenes without disclosing the edits, the Seattle Times reported.
You mean this guy who’s been photoshopped into three separate photos released by Fox News? pic.twitter.com/fAXpIKu77a
— Zander Yates ザンダーイェーツ (@ZanderYates) June 13, 2020
Similarly, the image below was shared thousands of times on social media in January, during Australia’s Black Summer bushfires. The AFP’s fact check confirmed it is not authentic and is actually a combination of several separate photos.
Image is more powerful than screams of Greta. A silent girl is holding a koala. She looks straight at you from the waters of the ocean where they found a refuge. She is wearing a breathing mask. A wall of fire is behind them. I do not know the name of the photographer #Australia pic.twitter.com/CrTX3lltdh
— EVC Music (@EVCMusicUK) January 6, 2020
Fully and Partially Synthetic Content
Online, you’ll also find sophisticated “deepfake” videos showing (usually famous) people saying or doing things they never did. Less advanced versions can be created using apps such as Zao and Reface.
Or, if you don’t want to use your photo for a profile picture, you can default to one of several websites offering hundreds of thousands of AI-generated, photorealistic images of people.
These people don’t exist, they’re just images generated by artificial intelligence. Generated Photos, CC BY
Editing Pixel Values and the (not so) Simple Crop
Cropping can greatly alter the context of a photo, too.
We saw this in 2017, when a US government employee edited official pictures of Donald Trump’s inauguration to make the crowd appear bigger, according to The Guardian. The staffer cropped out the empty space “where the crowd ended” for a set of pictures for Trump.
Views of the crowds at the inaugurations of former US President Barack Obama in 2009 (left) and President Donald Trump in 2017 (right). AP
But what about edits that only alter pixel values such as color, saturation, or contrast?
One historical example illustrates the consequences of this. In 1994, Time magazine’s cover of OJ Simpson considerably “darkened” Simpson in his police mugshot. This added fuel to a case already plagued by racial tension, to which the magazine responded, “No racial implication was intended, by Time or by the artist.”
Tools for Debunking Digital Fakery
For those of us who don’t want to be duped by visual mis/disinformation, there are tools available—although each comes with its own limitations (something we discuss in our recent paper).
Invisible digital watermarking has been proposed as a solution. However, it isn’t widespread and requires buy-in from both content publishers and distributors.
Reverse image search (such as Google’s) is often free and can be helpful for identifying earlier, potentially more authentic copies of images online. That said, it’s not foolproof because it:
Relies on unedited copies of the media already being online.
Doesn’t search the entire web.
Doesn’t always allow filtering by publication time. Some reverse image search services such as TinEye support this function, but Google’s doesn’t.
Returns only exact matches or near-matches, so it’s not thorough. For instance, editing an image and then flipping its orientation can fool Google into thinking it’s an entirely different one.
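To see why a flipped copy can slip past near-match search, here is a minimal sketch using a simple "average hash" fingerprint. This is an illustrative stand-in, not the method Google or TinEye actually uses, and the 4x4 pixel grid and helper names are hypothetical.

```python
def average_hash(pixels):
    """Return a bit string: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def flip_horizontal(pixels):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in pixels]


# Hypothetical 4x4 grayscale image (0 = black, 255 = white): bright on the left.
original = [
    [250, 240, 20, 10],
    [245, 235, 25, 15],
    [240, 230, 30, 20],
    [235, 225, 35, 25],
]

# A slightly brightened copy stands in for a lightly edited version.
brightened = [[min(255, value + 5) for value in row] for row in original]
flipped = flip_horizontal(original)

original_hash = average_hash(original)
print(hamming_distance(original_hash, average_hash(brightened)))  # 0
print(hamming_distance(original_hash, average_hash(flipped)))     # 16
```

Under this toy fingerprint the brightened copy still matches exactly, while the mirrored copy differs in every bit, so a search that relies on near-identical fingerprints would treat it as a different image.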
Most Reliable Tools Are Sophisticated
Meanwhile, manual forensic detection methods for visual mis/disinformation focus mostly on edits visible to the naked eye, or rely on examining features that aren’t included in every image (such as shadows). They’re also time-consuming, expensive, and need specialized expertise.
Still, you can access work in this field by visiting sites such as Snopes.com—which has a growing repository of “fauxtography.”
Computer vision and machine learning also offer relatively advanced detection capabilities for images and videos. But they too require technical expertise to operate and understand.
Moreover, improving them involves using large volumes of “training data,” but the image repositories used for this usually don’t contain the real-world images seen in the news.
If you use an image verification tool such as the REVEAL project’s image verification assistant, you might need an expert to help interpret the results.
The good news, however, is that before turning to any of the above tools, there are some simple questions you can ask yourself to potentially figure out whether a photo or video on social media is fake. Think:
Was it originally made for social media?
How widely and for how long was it circulated?
What responses did it receive?
Who were the intended audiences?
Quite often, the logical conclusions drawn from the answers will be enough to weed out inauthentic visuals. You can access the full list of questions, put together by Manchester Metropolitan University experts, here.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Image Credit: Simon Steinberger from Pixabay
#437466 How Future AI Could Recognize a Kangaroo ...
AI is continuously taking on new challenges, from detecting deepfakes (which, incidentally, are also made using AI) to winning at poker to giving synthetic biology experiments a boost. These impressive feats result partly from the huge datasets the systems are trained on. That training is costly and time-consuming, and it yields AIs that can really only do one thing well.
For example, to train an AI to differentiate between a picture of a dog and one of a cat, it’s fed thousands—if not millions—of labeled images of dogs and cats. A child, on the other hand, can see a dog or cat just once or twice and remember which is which. How can we make AIs learn more like children do?
A team at the University of Waterloo in Ontario has an answer: change the way AIs are trained.
Here’s the thing about the datasets normally used to train AI—besides being huge, they’re highly specific. A picture of a dog can only be a picture of a dog, right? But what about a really small dog with a long-ish tail? That sort of dog, while still being a dog, looks more like a cat than, say, a fully-grown Golden Retriever.
It’s this concept that the Waterloo team’s methodology is based on. They described their work in a paper published on the pre-print (or non-peer-reviewed) server arXiv last month. Teaching an AI system to identify a new class of objects using just one example is what they call “one-shot learning.” But they take it a step further, focusing on “less than one shot learning,” or LO-shot learning for short.
LO-shot learning consists of a system learning to classify various categories based on a number of examples that’s smaller than the number of categories. That’s not the most straightforward concept to wrap your head around, so let’s go back to the dogs and cats example. Say you want to teach an AI to identify dogs, cats, and kangaroos. How could that possibly be done without several clear examples of each animal?
The key, the Waterloo team says, is in what they call soft labels. Unlike hard labels, which label a data point as belonging to one specific class, soft labels tease out the relationship or degree of similarity between that data point and multiple classes. In the case of an AI trained on only dogs and cats, a third class of objects, say, kangaroos, might be described as 60 percent like a dog and 40 percent like a cat (I know—kangaroos probably aren’t the best animal to have thrown in as a third category).
“Soft labels can be used to represent training sets using fewer prototypes than there are classes, achieving large increases in sample efficiency over regular (hard-label) prototypes,” the paper says. Translation? Tell an AI a kangaroo is some fraction cat and some fraction dog—both of which it’s seen and knows well—and it’ll be able to identify a kangaroo without ever having seen one.
If the soft labels are nuanced enough, you could theoretically teach an AI to identify a large number of categories based on a much smaller number of training examples.
The paper’s authors use a simple machine learning algorithm called k-nearest neighbors (kNN) to explore this idea in more depth. The algorithm operates under the assumption that similar things are most likely to exist near each other; if you go to a dog park, there will be lots of dogs but no cats or kangaroos. Go to the Australian grasslands and there’ll be kangaroos but no cats or dogs. And so on.
To train a kNN algorithm to differentiate between categories, you choose specific features to represent each category (e.g., for animals you could use weight and size). With one feature on the x-axis and the other on the y-axis, the labeled data points form a graph in which points that are similar to each other cluster together. A new data point is assigned to the category of its nearest labeled neighbors, which in effect draws a dividing line between the categories, and it’s pretty straightforward for the algorithm to discern which side of that line a new point falls on.
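As a rough illustration of ordinary (hard-label) kNN, the sketch below classifies an animal by majority vote among its three nearest labeled neighbors. The measurements, the choice of k, and the function name are hypothetical, chosen only to make the example run.

```python
import math
from collections import Counter


def knn_classify(training_points, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        training_points,
        key=lambda item: math.dist(item[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (weight in kg, shoulder height in cm) measurements.
training_points = [
    ((4.0, 23.0), "cat"),
    ((4.5, 25.0), "cat"),
    ((3.8, 24.0), "cat"),
    ((30.0, 58.0), "dog"),
    ((28.0, 55.0), "dog"),
    ((34.0, 60.0), "dog"),
]

print(knn_classify(training_points, (5.0, 26.0)))   # cat
print(knn_classify(training_points, (29.0, 57.0)))  # dog
```

Note that every category needs at least one labeled example here; a kangaroo could never be predicted because no training point carries that label.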
The Waterloo team kept it simple and used plots of color on a 2D graph. Using the colors and their locations on the graphs, the team created synthetic data sets and accompanying soft labels. One of the simpler graphs is pictured below, along with soft labels in the form of pie charts.
Image Credit: Ilia Sucholutsky & Matthias Schonlau
When the team had the algorithm plot the boundary lines of the different colors based on these soft labels, it was able to split the plot up into more colors than the number of data points it was given in the soft labels.
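The idea of getting more classes than data points can be sketched with two prototypes that separate three classes. This is a simplified illustration, not the team's implementation: it assumes each prototype's soft label is weighted by the inverse of its distance to the query, uses a 1D axis instead of a 2D plot, and the positions and label shares are hypothetical.

```python
def classify_with_soft_labels(prototypes, x):
    """Pick the class with the largest inverse-distance-weighted soft label."""
    totals = {}
    for position, soft_label in prototypes:
        distance = abs(x - position)
        if distance == 0:
            # The query sits exactly on a prototype: use its soft label directly.
            return max(soft_label, key=soft_label.get)
        for class_name, share in soft_label.items():
            totals[class_name] = totals.get(class_name, 0.0) + share / distance
    return max(totals, key=totals.get)


# Two hypothetical prototypes on a 1D feature axis, each carrying a soft label.
prototypes = [
    (0.0, {"dog": 0.6, "cat": 0.0, "kangaroo": 0.4}),
    (1.0, {"dog": 0.0, "cat": 0.6, "kangaroo": 0.4}),
]

for x in (0.1, 0.5, 0.9):
    print(x, classify_with_soft_labels(prototypes, x))
# 0.1 dog
# 0.5 kangaroo
# 0.9 cat
```

Neither prototype is mostly "kangaroo," yet both contribute a 0.4 share, so in the middle of the axis their combined kangaroo weight outweighs either single class. With these numbers the kangaroo region spans roughly x = 1/3 to x = 2/3.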
While the results are encouraging, the team acknowledges that they’re just the first step, and there’s much more exploration of this concept yet to be done. The kNN algorithm is one of the least complex models out there; what might happen when LO-shot learning is applied to a far more complex algorithm? Also, to apply it, you still need to distill a larger dataset down into soft labels.
One idea the team is already working on is having other algorithms generate the soft labels for the algorithm that’s going to be trained using LO-shot; manually designing soft labels won’t always be as easy as splitting up some pie charts into different colors.
LO-shot’s potential for reducing the amount of training data needed to yield working AI systems is promising. Besides reducing the cost and the time required to train new models, the method could also make AI more accessible to industries, companies, or individuals who don’t have access to large datasets—an important step for democratization of AI.
Image Credit: pen_ash from Pixabay