Tag Archives: explained
#435614 3 Easy Ways to Evaluate AI Claims
When every other tech startup claims to use artificial intelligence, it can be tough to figure out if an AI service or product works as advertised. In the midst of the AI “gold rush,” how can you separate the nuggets from the fool’s gold?
There’s no shortage of cautionary tales involving overhyped AI claims. And applying AI technologies to health care, education, and law enforcement mean that getting it wrong can have real consequences for society—not just for investors who bet on the wrong unicorn.
So IEEE Spectrum asked experts to share their tips for how to identify AI hype in press releases, news articles, research papers, and IPO filings.
“It can be tricky, because I think the people who are out there selling the AI hype—selling this AI snake oil—are getting more sophisticated over time,” says Tim Hwang, director of the Harvard-MIT Ethics and Governance of AI Initiative.
The term “AI” is perhaps most frequently used to describe machine learning algorithms (and deep learning algorithms, which require even less human guidance) that analyze huge amounts of data and make predictions based on patterns that humans might miss. These popular forms of AI are mostly suited to specialized tasks, such as automatically recognizing certain objects within photos. For that reason, they are sometimes described as “weak” or “narrow” AI.
Some researchers and thought leaders like to talk about the idea of “artificial general intelligence” or “strong AI” that has human-level capacity and flexibility to handle many diverse intellectual tasks. But for now, this type of AI remains firmly in the realm of science fiction and is far from being realized in the real world.
“AI has no well-defined meaning and many so-called AI companies are simply trying to take advantage of the buzz around that term,” says Arvind Narayanan, a computer scientist at Princeton University. “Companies have even been caught claiming to use AI when, in fact, the task is done by human workers.”
Here are three ways to recognize AI hype.
Look for Buzzwords
One red flag is what Hwang calls the “hype salad.” This means stringing together the term “AI” with many other tech buzzwords such as “blockchain” or “Internet of Things.” That doesn’t automatically disqualify the technology, but spotting a high volume of buzzwords in a post, pitch, or presentation should raise questions about what exactly the company or individual has developed.
Other experts agree that strings of buzzwords can be a red flag. That’s especially true if the buzzwords are never really explained in technical detail, and are simply tossed around as vague, poorly-defined terms, says Marzyeh Ghassemi, a computer scientist and biomedical engineer at the University of Toronto in Canada.
“I think that if it looks like a Google search—picture ‘interpretable blockchain AI deep learning medicine’—it's probably not high-quality work,” Ghassemi says.
Hwang also suggests mentally replacing all mentions of “AI” in an article with the term “magical fairy dust.” It’s a way of seeing whether an individual or organization is treating the technology like magic. If so—that’s another good reason to ask more questions about what exactly the AI technology involves.
And even the visual imagery used to illustrate AI claims can indicate that an individual or organization is overselling the technology.
“I think that a lot of the people who work on machine learning on a day-to-day basis are pretty humble about the technology, because they’re largely confronted with how frequently it just breaks and doesn't work,” Hwang says. “And so I think that if you see a company or someone representing AI as a Terminator head, or a big glowing HAL eye or something like that, I think it’s also worth asking some questions.”
Interrogate the Data
It can be hard to evaluate AI claims without any relevant expertise, says Ghassemi at the University of Toronto. Even experts need to know the technical details of the AI algorithm in question and have some access to the training data that shaped the AI model’s predictions. Still, savvy readers with some basic knowledge of applied statistics can search for red flags.
To start, readers can look for possible bias in training data based on small sample sizes or a skewed population that fails to reflect the broader population, Ghassemi says. After all, an AI model trained only on health data from white men would not necessarily achieve similar results for other populations of patients.
“For me, a red flag is not demonstrating deep knowledge of how your labels are defined.”
—Marzyeh Ghassemi, University of Toronto
How machine learning and deep learning models perform also depends on how well humans labeled the sample datasets use to train these programs. This task can be straightforward when labeling photos of cats versus dogs, but gets more complicated when assigning disease diagnoses to certain patient cases.
Medical experts frequently disagree with each other on diagnoses—which is why many patients seek a second opinion. Not surprisingly, this ambiguity can also affect the diagnostic labels that experts assign in training datasets. “For me, a red flag is not demonstrating deep knowledge of how your labels are defined,” Ghassemi says.
Such training data can also reflect the cultural stereotypes and biases of the humans who labeled the data, says Narayanan at Princeton University. Like Ghassemi, he recommends taking a hard look at exactly what the AI has learned: “A good way to start critically evaluating AI claims is by asking questions about the training data.”
Another red flag is presenting an AI system’s performance through a single accuracy figure without much explanation, Narayanan says. Claiming that an AI model achieves “99 percent” accuracy doesn’t mean much without knowing the baseline for comparison—such as whether other systems have already achieved 99 percent accuracy—or how well that accuracy holds up in situations beyond the training dataset.
Narayanan also emphasized the need to ask questions about an AI model’s false positive rate—the rate of making wrong predictions about the presence of a given condition. Even if the false positive rate of a hypothetical AI service is just one percent, that could have major consequences if that service ends up screening millions of people for cancer.
Readers can also consider whether using AI in a given situation offers any meaningful improvement compared to traditional statistical methods, says Clayton Aldern, a data scientist and journalist who serves as managing director for Caldern LLC. He gave the hypothetical example of a “super-duper-fancy deep learning model” that achieves a prediction accuracy of 89 percent, compared to a “little polynomial regression model” that achieves 86 percent on the same dataset.
“We're talking about a three-percentage-point increase on something that you learned about in Algebra 1,” Aldern says. “So is it worth the hype?”
Don’t Ignore the Drawbacks
The hype surrounding AI isn’t just about the technical merits of services and products driven by machine learning. Overblown claims about the beneficial impacts of AI technology—or vague promises to address ethical issues related to deploying it—should also raise red flags.
“If a company promises to use its tech ethically, it is important to question if its business model aligns with that promise,” Narayanan says. “Even if employees have noble intentions, it is unrealistic to expect the company as a whole to resist financial imperatives.”
One example might be a company with a business model that depends on leveraging customers’ personal data. Such companies “tend to make empty promises when it comes to privacy,” Narayanan says. And, if companies hire workers to produce training data, it’s also worth asking whether the companies treat those workers ethically.
The transparency—or lack thereof—about any AI claim can also be telling. A company or research group can minimize concerns by publishing technical claims in peer-reviewed journals or allowing credible third parties to evaluate their AI without giving away big intellectual property secrets, Narayanan says. Excessive secrecy is a big red flag.
With these strategies, you don’t need to be a computer engineer or data scientist to start thinking critically about AI claims. And, Narayanan says, the world needs many people from different backgrounds for societies to fully consider the real-world implications of AI.
Editor’s Note: The original version of this story misspelled Clayton Aldern’s last name as Alderton. Continue reading
#435436 Undeclared Wars in Cyberspace Are ...
The US is at war. That’s probably not exactly news, as the country has been engaged in one type of conflict or another for most of its history. The last time we officially declared war was after Japan bombed Pearl Harbor in December 1941.
Our biggest undeclared war today is not being fought by drones in the mountains of Afghanistan or even through the less-lethal barrage of threats over the nuclear programs in North Korea and Iran. In this particular war, it is the US that is under attack and on the defensive.
This is cyberwarfare.
The definition of what constitutes a cyber attack is a broad one, according to Greg White, executive director of the Center for Infrastructure Assurance and Security (CIAS) at The University of Texas at San Antonio (UTSA).
At the level of nation-state attacks, cyberwarfare could involve “attacking systems during peacetime—such as our power grid or election systems—or it could be during war time in which case the attacks may be designed to cause destruction, damage, deception, or death,” he told Singularity Hub.
For the US, the Pearl Harbor of cyberwarfare occurred during 2016 with the Russian interference in the presidential election. However, according to White, an Air Force veteran who has been involved in computer and network security since 1986, the history of cyber war can be traced back much further, to at least the first Gulf War of the early 1990s.
“We started experimenting with cyber attacks during the first Gulf War, so this has been going on a long time,” he said. “Espionage was the prime reason before that. After the war, the possibility of expanding the types of targets utilized expanded somewhat. What is really interesting is the use of social media and things like websites for [psychological operation] purposes during a conflict.”
The 2008 conflict between Russia and the Republic of Georgia is often cited as a cyberwarfare case study due to the large scale and overt nature of the cyber attacks. Russian hackers managed to bring down more than 50 news, government, and financial websites through denial-of-service attacks. In addition, about 35 percent of Georgia’s internet networks suffered decreased functionality during the attacks, coinciding with the Russian invasion of South Ossetia.
The cyberwar also offers lessons for today on Russia’s approach to cyberspace as a tool for “holistic psychological manipulation and information warfare,” according to a 2018 report called Understanding Cyberwarfare from the Modern War Institute at West Point.
US Fights Back
News in recent years has highlighted how Russian hackers have attacked various US government entities and critical infrastructure such as energy and manufacturing. In particular, a shadowy group known as Unit 26165 within the country’s military intelligence directorate is believed to be behind the 2016 US election interference campaign.
However, the US hasn’t been standing idly by. Since at least 2012, the US has put reconnaissance probes into the control systems of the Russian electric grid, The New York Times reported. More recently, we learned that the US military has gone on the offensive, putting “crippling malware” inside the Russian power grid as the U.S. Cyber Command flexes its online muscles thanks to new authority granted to it last year.
“Access to the power grid that is obtained now could be used to shut something important down in the future when we are in a war,” White noted. “Espionage is part of the whole program. It is important to remember that cyber has just provided a new domain in which to conduct the types of activities we have been doing in the real world for years.”
The US is also beginning to pour more money into cybersecurity. The 2020 fiscal budget calls for spending $17.4 billion throughout the government on cyber-related activities, with the Department of Defense (DoD) alone earmarked for $9.6 billion.
Despite the growing emphasis on cybersecurity in the US and around the world, the demand for skilled security professionals is well outpacing the supply, with a projected shortfall of nearly three million open or unfilled positions according to the non-profit IT security organization (ISC)².
UTSA is rare among US educational institutions in that security courses and research are being conducted across three different colleges, according to White. About 10 percent of the school’s 30,000-plus students are enrolled in a cyber-related program, he added, and UTSA is one of only 21 schools that has received the Cyber Operations Center of Excellence designation from the National Security Agency.
“This track in the computer science program is specifically designed to prepare students for the type of jobs they might be involved in if they went to work for the DoD,” White said.
However, White is extremely doubtful there will ever be enough cyber security professionals to meet demand. “I’ve been preaching that we’ve got to worry about cybersecurity in the workforce, not just the cybersecurity workforce, not just cybersecurity professionals. Everybody has a responsibility for cybersecurity.”
Artificial Intelligence in Cybersecurity
Indeed, humans are often seen as the weak link in cybersecurity. That point was driven home at a cybersecurity roundtable discussion during this year’s Brainstorm Tech conference in Aspen, Colorado.
Participant Dorian Daley, general counsel at Oracle, said insider threats are at the top of the list when it comes to cybersecurity. “Sadly, I think some of the biggest challenges are people, and I mean that in a number of ways. A lot of the breaches really come from insiders. So the more that you can automate things and you can eliminate human malicious conduct, the better.”
White noted that automation is already the norm in cybersecurity. “Humans can’t react as fast as systems can launch attacks, so we need to rely on automated defenses as well,” he said. “This doesn’t mean that humans are not in the loop, but much of what is done these days is ‘scripted’.”
The use of artificial intelligence, machine learning, and other advanced automation techniques have been part of the cybersecurity conversation for quite some time, according to White, such as pattern analysis to look for specific behaviors that might indicate an attack is underway.
“What we are seeing quite a bit of today falls under the heading of big data and data analytics,” he explained.
But there are signs that AI is going off-script when it comes to cyber attacks. In the hands of threat groups, AI applications could lead to an increase in the number of cyberattacks, wrote Michelle Cantos, a strategic intelligence analyst at cybersecurity firm FireEye.
“Current AI technology used by businesses to analyze consumer behavior and find new customer bases can be appropriated to help attackers find better targets,” she said. “Adversaries can use AI to analyze datasets and generate recommendations for high-value targets they think the adversary should hit.”
In fact, security researchers have already demonstrated how a machine learning system could be used for malicious purposes. The Social Network Automated Phishing with Reconnaissance system, or SNAP_R, generated more than four times as many spear-phishing tweets on Twitter than a human—and was just as successful at targeting victims in order to steal sensitive information.
Cyber war is upon us. And like the current war on terrorism, there are many battlefields from which the enemy can attack and then disappear. While total victory is highly unlikely in the traditional sense, innovations through AI and other technologies can help keep the lights on against the next cyber attack.
Image Credit: pinkeyes / Shutterstock.com Continue reading