
#435614 3 Easy Ways to Evaluate AI Claims

When every other tech startup claims to use artificial intelligence, it can be tough to figure out if an AI service or product works as advertised. In the midst of the AI “gold rush,” how can you separate the nuggets from the fool’s gold?

There’s no shortage of cautionary tales involving overhyped AI claims. And applying AI technologies to health care, education, and law enforcement means that getting it wrong can have real consequences for society, not just for investors who bet on the wrong unicorn.

So IEEE Spectrum asked experts to share their tips for how to identify AI hype in press releases, news articles, research papers, and IPO filings.

“It can be tricky, because I think the people who are out there selling the AI hype—selling this AI snake oil—are getting more sophisticated over time,” says Tim Hwang, director of the Harvard-MIT Ethics and Governance of AI Initiative.

The term “AI” is perhaps most frequently used to describe machine learning algorithms (and deep learning algorithms, which require even less human guidance) that analyze huge amounts of data and make predictions based on patterns that humans might miss. These popular forms of AI are mostly suited to specialized tasks, such as automatically recognizing certain objects within photos. For that reason, they are sometimes described as “weak” or “narrow” AI.

Some researchers and thought leaders like to talk about the idea of “artificial general intelligence” or “strong AI” that has human-level capacity and flexibility to handle many diverse intellectual tasks. But for now, this type of AI remains firmly in the realm of science fiction and is far from being realized in the real world.

“AI has no well-defined meaning and many so-called AI companies are simply trying to take advantage of the buzz around that term,” says Arvind Narayanan, a computer scientist at Princeton University. “Companies have even been caught claiming to use AI when, in fact, the task is done by human workers.”

Here are three ways to recognize AI hype.

Look for Buzzwords
One red flag is what Hwang calls the “hype salad.” This means stringing together the term “AI” with many other tech buzzwords such as “blockchain” or “Internet of Things.” That doesn’t automatically disqualify the technology, but spotting a high volume of buzzwords in a post, pitch, or presentation should raise questions about what exactly the company or individual has developed.

Other experts agree that strings of buzzwords can be a red flag. That’s especially true if the buzzwords are never really explained in technical detail, and are simply tossed around as vague, poorly-defined terms, says Marzyeh Ghassemi, a computer scientist and biomedical engineer at the University of Toronto in Canada.

“I think that if it looks like a Google search—picture ‘interpretable blockchain AI deep learning medicine’—it's probably not high-quality work,” Ghassemi says.

Hwang also suggests mentally replacing all mentions of “AI” in an article with the term “magical fairy dust.” It’s a way of seeing whether an individual or organization is treating the technology like magic. If so—that’s another good reason to ask more questions about what exactly the AI technology involves.
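Hwang's substitution test is easy to try on any pitch or press release. The snippet below is a minimal sketch of it; the function name `fairy_dust` and the sample claim are made up for illustration and do not come from Hwang.

```python
import re

# Matches "AI" only as a standalone term, so words like "MAIN" are left alone.
_AI_PATTERN = re.compile(r"\bAI\b")


def fairy_dust(text: str, replacement: str = "magical fairy dust") -> str:
    """Replace every standalone mention of "AI" with the replacement phrase."""
    return _AI_PATTERN.sub(replacement, text)


# Hypothetical marketing sentence, invented for this example.
claim = "Our AI platform uses AI to optimize your supply chain."
print(fairy_dust(claim))
```

If the rewritten sentence reads about as informative as the original, the claim probably never explained what the technology does.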

And even the visual imagery used to illustrate AI claims can indicate that an individual or organization is overselling the technology.

“I think that a lot of the people who work on machine learning on a day-to-day basis are pretty humble about the technology, because they’re largely confronted with how frequently it just breaks and doesn't work,” Hwang says. “And so I think that if you see a company or someone representing AI as a Terminator head, or a big glowing HAL eye or something like that, I think it’s also worth asking some questions.”

Interrogate the Data

It can be hard to evaluate AI claims without any relevant expertise, says Ghassemi at the University of Toronto. Even experts need to know the technical details of the AI algorithm in question and have some access to the training data that shaped the AI model’s predictions. Still, savvy readers with some basic knowledge of applied statistics can search for red flags.

To start, readers can look for possible bias in training data based on small sample sizes or a skewed population that fails to reflect the broader population, Ghassemi says. After all, an AI model trained only on health data from white men would not necessarily achieve similar results for other populations of patients.


How machine learning and deep learning models perform also depends on how well humans labeled the sample datasets used to train these programs. This task can be straightforward when labeling photos of cats versus dogs, but gets more complicated when assigning disease diagnoses to certain patient cases.

Medical experts frequently disagree with each other on diagnoses—which is why many patients seek a second opinion. Not surprisingly, this ambiguity can also affect the diagnostic labels that experts assign in training datasets. “For me, a red flag is not demonstrating deep knowledge of how your labels are defined,” Ghassemi says.

Such training data can also reflect the cultural stereotypes and biases of the humans who labeled the data, says Narayanan at Princeton University. Like Ghassemi, he recommends taking a hard look at exactly what the AI has learned: “A good way to start critically evaluating AI claims is by asking questions about the training data.”

Another red flag is presenting an AI system’s performance through a single accuracy figure without much explanation, Narayanan says. Claiming that an AI model achieves “99 percent” accuracy doesn’t mean much without knowing the baseline for comparison—such as whether other systems have already achieved 99 percent accuracy—or how well that accuracy holds up in situations beyond the training dataset.
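To see why a lone accuracy figure needs a baseline, consider a rough sketch with an invented dataset in which only 1 percent of cases are positive. The numbers and the "always say no" model below are assumptions chosen for illustration, not figures from Narayanan.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Invented dataset: 990 healthy cases (0) and 10 sick cases (1).
labels = [0] * 990 + [1] * 10

# A "model" that has learned nothing and always predicts healthy.
always_negative = [0] * len(labels)

baseline = accuracy(always_negative, labels)
sick_cases_found = sum(p == 1 and y == 1 for p, y in zip(always_negative, labels))

print(f"Baseline accuracy: {baseline:.0%}")
print(f"Sick cases detected: {sick_cases_found} of {sum(labels)}")
```

In this made-up case the do-nothing model already scores 99 percent while catching none of the sick patients, so a real system would have to beat that figure before its accuracy said anything.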

Narayanan also emphasized the need to ask questions about an AI model’s false positive rate—the rate of making wrong predictions about the presence of a given condition. Even if the false positive rate of a hypothetical AI service is just one percent, that could have major consequences if that service ends up screening millions of people for cancer.
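A quick back-of-the-envelope calculation shows the scale involved. The one percent false positive rate comes from the example above; the population size and the disease prevalence in this sketch are assumed values picked only to make the arithmetic concrete.

```python
def expected_false_positives(people_screened, prevalence, false_positive_rate):
    """Expected number of healthy people wrongly flagged as having the condition."""
    healthy = people_screened * (1 - prevalence)
    return healthy * false_positive_rate


people = 1_000_000          # assumed screening population
prevalence = 0.005          # assumed: 0.5 percent actually have the condition
false_positive_rate = 0.01  # the one percent from the hypothetical service

false_alarms = expected_false_positives(people, prevalence, false_positive_rate)
true_cases = people * prevalence

print(f"Expected false alarms: {false_alarms:,.0f}")
print(f"Actual cases in the population: {true_cases:,.0f}")
```

Under these assumptions the service would raise roughly 9,950 false alarms against 5,000 real cases, so most positive results would be wrong even if it caught every real case.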

Readers can also consider whether using AI in a given situation offers any meaningful improvement compared to traditional statistical methods, says Clayton Aldern, a data scientist and journalist who serves as managing director for Caldern LLC. He gave the hypothetical example of a “super-duper-fancy deep learning model” that achieves a prediction accuracy of 89 percent, compared to a “little polynomial regression model” that achieves 86 percent on the same dataset.

“We're talking about a three-percentage-point increase on something that you learned about in Algebra 1,” Aldern says. “So is it worth the hype?”

Don’t Ignore the Drawbacks

The hype surrounding AI isn’t just about the technical merits of services and products driven by machine learning. Overblown claims about the beneficial impacts of AI technology—or vague promises to address ethical issues related to deploying it—should also raise red flags.

“If a company promises to use its tech ethically, it is important to question if its business model aligns with that promise,” Narayanan says. “Even if employees have noble intentions, it is unrealistic to expect the company as a whole to resist financial imperatives.”

One example might be a company with a business model that depends on leveraging customers’ personal data. Such companies “tend to make empty promises when it comes to privacy,” Narayanan says. And, if companies hire workers to produce training data, it’s also worth asking whether the companies treat those workers ethically.

The transparency—or lack thereof—about any AI claim can also be telling. A company or research group can minimize concerns by publishing technical claims in peer-reviewed journals or allowing credible third parties to evaluate their AI without giving away big intellectual property secrets, Narayanan says. Excessive secrecy is a big red flag.

With these strategies, you don’t need to be a computer engineer or data scientist to start thinking critically about AI claims. And, Narayanan says, the world needs many people from different backgrounds for societies to fully consider the real-world implications of AI.

Editor’s Note: The original version of this story misspelled Clayton Aldern’s last name as Alderton.

Posted in Human Robots

#435605 All of the Winners in the DARPA ...

The first competitive event in the DARPA Subterranean Challenge concluded last week—hopefully you were able to follow along on the livestream, on Twitter, or with some of the articles that we’ve posted about the event. We’ll have plenty more to say about how things went for the SubT teams, but while they take a bit of a (well earned) rest, we can take a look at the winning teams as well as who won DARPA’s special superlative awards for the competition.

First Place: Team Explorer (25/40 artifacts found)
With their rugged, reliable robots featuring giant wheels and the ability to drop communications nodes, Team Explorer was in the lead from day 1, scoring in double digits on every single run.

Second Place: Team CoSTAR (11/40 artifacts found)
Team CoSTAR had one of the more diverse lineups of robots, and they switched up which robots they decided to send into the mine as they learned more about the course.

Third Place: Team CTU-CRAS (10/40 artifacts found)
While many teams came to SubT with DARPA funding, Team CTU-CRAS was self-funded, making them eligible for a special $200,000 Tunnel Circuit prize.

DARPA also awarded a bunch of “superlative awards” after SubT:

Most Accurate Artifact: Team Explorer

To score a point, teams had to submit the location of an artifact that was correct to within 5 meters of the artifact itself. However, DARPA was tracking the artifact locations with much higher precision—for example, the “zero” point on the backpack artifact was the center of the label on the front, which DARPA tracked to the millimeter. Team Explorer managed to return the location of a backpack with an error of just 0.18 meter, which is kind of amazing.
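The scoring rule boils down to a distance check. Here is a minimal sketch of it, assuming straight-line distance in three dimensions and treating exactly 5 meters as a hit; the coordinates and function names are invented, and DARPA's actual scoring software is not described in this article.

```python
import math

SCORING_RADIUS_M = 5.0  # maximum allowed error, as stated in the article


def location_error(reported, actual):
    """Straight-line distance in meters between reported and true positions."""
    return math.dist(reported, actual)


def scores_point(reported, actual, radius=SCORING_RADIUS_M):
    """True if the reported location falls within the scoring radius."""
    return location_error(reported, actual) <= radius


# Invented coordinates (x, y, z in meters) for illustration only.
actual_backpack = (120.0, 45.0, -8.0)
reported_backpack = (120.1, 45.1, -7.9)

print(f"Error: {location_error(reported_backpack, actual_backpack):.2f} m")
print(f"Scores a point: {scores_point(reported_backpack, actual_backpack)}")
```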

Down to the Wire: Team CSIRO Data61

With just an hour to find as many artifacts as possible, teams had to find the right balance between sending robots off to explore and bringing them back into communication range to download artifact locations. Team CSIRO Data61 cut it pretty close, sliding their final point in with a mere 22 seconds to spare.

Most Distinctive Robots: Team Robotika

Team Robotika had some of the quirkiest and most recognizable robots, which DARPA recognized with the “Most Distinctive” award. Robotika told us that part of the reason for that distinctiveness was practical—having a robot that was effectively in two parts meant that they could disassemble it so that it would fit in the baggage compartment of an airplane, very important for a team based in the Czech Republic.

Most Robots Per Person: Team Coordinated Robotics

Kevin Knoedler, who won NASA’s Space Robotics Challenge entirely by himself, brought his own personal swarm of drones to SubT. With a ratio of seven robots to one human, Kevin was almost certainly the hardest working single human at the challenge.

Fan Favorite: Team NCTU

Photo: Evan Ackerman/IEEE Spectrum

The Fan Favorite award went to the team that was most popular on Twitter (with the #SubTChallenge hashtag), and it may or may not be the case that I personally tweeted enough about Team NCTU’s blimp to win them this award. It’s also true that whenever we asked anyone on other teams what their favorite robot was (besides their own, of course), the blimp was overwhelmingly popular. So either way, the award is well deserved.

DARPA shared this little behind-the-scenes clip of the blimp in action (sort of), showing what happened to the poor thing when the mine ventilation system was turned on between runs and DARPA staff had to chase it down and rescue it:

The thing to keep in mind about the results of the Tunnel Circuit is that unlike past DARPA robotics challenges (like the DRC), they don’t necessarily indicate how things are going to go for the Urban or Cave circuits because of how different things are going to be. Explorer did a great job with a team of rugged wheeled vehicles, which turned out to be ideal for navigating through mines, but they’re likely going to need to change things up substantially for the rest of the challenges, where the terrain will be much more complex.

DARPA hasn’t provided any details on the location of the Urban Circuit yet; all we know is that it’ll be sometime in February 2020. This gives teams just six months to take all the lessons that they learned from the Tunnel Circuit and update their hardware, software, and strategies. What were those lessons, and what do teams plan to do differently next year? Check back next week, and we’ll tell you.

[ DARPA SubT ]


#435597 Water Jet Powered Drone Takes Off With ...

At ICRA 2015, the Aerial Robotics Lab at Imperial College London presented a concept for a multimodal flying swimming robot called AquaMAV. The really difficult thing about a flying and swimming robot isn’t so much the transition from the first to the second, since you can manage that even if your robot is completely dead (thanks to gravity), but rather the other way: going from water to air, ideally in a stable and repeatable way. The AquaMAV concept solved this by basically just applying as much concentrated power as possible to the problem, using a jet thruster to hurl the robot out of the water with quite a bit of velocity to spare.

In a paper appearing in Science Robotics this week, the roboticists behind AquaMAV present a fully operational robot that uses a solid-fuel powered chemical reaction to generate an explosion that powers the robot into the air.

The 2015 version of AquaMAV, which was mostly just some very vintage-looking computer renderings and a little bit of hardware, used a small cylinder of CO2 to power its water jet thruster. This worked pretty well, but the mass and complexity of the storage and release mechanism for the compressed gas weren’t all that practical for a flying robot designed for long-term autonomy. It’s a familiar challenge, especially for pneumatically powered soft robots: how do you efficiently generate gas on demand, especially if you need a lot of pressure all at once?

An explosion propels the drone out of the water
There’s one obvious way of generating large amounts of pressurized gas all at once, and that’s explosions. We’ve seen robots use explosive thrust for mobility before, at a variety of scales, and it’s very effective as long as you can both properly harness the explosion and generate the fuel with a minimum of fuss, and this latest version of AquaMAV manages to do both:

The water jet coming out the back of this robot aircraft is being propelled by a gas explosion. The gas comes from the reaction between water and a little bit of calcium carbide powder stored inside the robot. Water is mixed with the powder one drop at a time, producing acetylene gas, which gets piped into a combustion chamber along with air and water. When ignited, the acetylene-air mixture explodes, forcing the water out of the combustion chamber and providing up to 51 N of thrust, which is enough to launch the 160-gram robot 26 meters up and over the water at 11 m/s. It takes just 50 mg of calcium carbide (mixed with 3 drops of water) to generate enough acetylene for each explosion, and both air and water are of course readily available. With 0.2 g of calcium carbide powder on board, the robot has enough fuel for multiple jumps, and the jump is powerful enough that the robot can get airborne even under fairly aggressive sea conditions.
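For a sense of scale, the figures quoted above can be plugged into some quick arithmetic. This is only a rough sketch: it uses the peak thrust, assumes standard gravity, and assumes every milligram of fuel on board is usable, none of which the paper is quoted as claiming here.

```python
G = 9.81  # standard gravity in m/s^2 (assumed)

# Figures quoted in the article.
mass_kg = 0.160            # 160-gram robot
peak_thrust_n = 51.0       # "up to 51 N of thrust"
carbide_per_jump_mg = 50   # calcium carbide needed per explosion
carbide_on_board_mg = 200  # 0.2 g carried on board

# How many times its own weight the robot's peak thrust amounts to.
thrust_to_weight = peak_thrust_n / (mass_kg * G)

# Upper bound on jumps, assuming no fuel is wasted or left unreacted.
max_jumps = carbide_on_board_mg // carbide_per_jump_mg

print(f"Peak thrust-to-weight ratio: about {thrust_to_weight:.0f} to 1")
print(f"Jumps per fuel load (upper bound): {max_jumps}")
```

By this estimate the peak thrust is more than 30 times the robot's weight, and the fuel load covers at most four jumps.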

Image: Science Robotics

The robot can transition from a floating state to an airborne jetting phase and back to floating (A). A 3D model render of the underside of the robot (B) shows the electronics capsule. The capsule contains the fuel tank (C), where calcium carbide reacts with air and water to propel the vehicle.

Next step: getting the robot to fly autonomously
Providing adequate thrust is just one problem that needs to be solved when attempting to conquer the water-air transition with a fixed-wing robot. The overall design of the robot itself is a challenge as well, because the optimal design and balance for the robot is quite different in each phase of operation, as the paper describes:

For the vehicle to fly in a stable manner during the jetting phase, the center of mass must be a significant distance in front of the center of pressure of the vehicle. However, to maintain a stable floating position on the water surface and the desired angle during jetting, the center of mass must be located behind the center of buoyancy. For the gliding phase, a fine balance between the center of mass and the center of pressure must be struck to achieve static longitudinal flight stability passively. During gliding, the center of mass should be slightly forward from the wing’s center of pressure.

The current version is mostly optimized for the jetting phase of flight, and doesn’t have any active flight control surfaces yet, but the researchers are optimistic that if they added some they’d have no problem getting the robot to fly autonomously. It’s just a glider at the moment, but a low-power propeller is the obvious step after that, and to get really fancy, a switchable gearbox could enable efficient movement on water as well as in the air. Long-term, the idea is that robots like these would be useful for tasks like autonomous water sampling over large areas, but I’d personally be satisfied with a remote controlled version that I could take to the beach.

“Consecutive aquatic jump-gliding with water-reactive fuel,” by R. Zufferey, A. Ortega Ancel, A. Farinha, R. Siddall, S. F. Armanini, M. Nasr, R. V. Brahmal, G. Kennedy, and M. Kovac from Imperial College in London, is published in the current issue of Science Robotics.


#435575 How an AI Startup Designed a Drug ...

Discovering a new drug can take decades, billions of dollars, and untold man hours from some of the smartest people on the planet. Now a startup says it’s taken a significant step towards speeding the process up using AI.

The typical drug discovery process involves carrying out physical tests on enormous libraries of molecules, and even with the help of robotics it’s an arduous process. The idea of sidestepping this by using computers to virtually screen for promising candidates has been around for decades. But progress has been underwhelming, and it’s still not a major part of commercial pipelines.

Recent advances in deep learning, however, have reignited hopes for the field, and major pharma companies have started tying up with AI-powered drug discovery startups. And now Insilico Medicine has used AI to design a molecule that effectively targets a protein involved in fibrosis—the formation of excess fibrous tissue—in mice in just 46 days.

The platform the company has developed combines two of the hottest sub-fields of AI: the generative adversarial networks, or GANs, which power deepfakes, and reinforcement learning, which is at the heart of the most impressive game-playing AI advances of recent years.

In a paper in Nature Biotechnology, the company’s researchers describe how they trained their model on all the molecules already known to target this protein as well as many other active molecules from various datasets. The model was then used to generate 30,000 candidate molecules.

Unlike most previous efforts, they went a step further and selected the most promising molecules for testing in the lab. The 30,000 candidates were whittled down to just 6 using more conventional drug discovery approaches and were then synthesized in the lab. They were put through increasingly stringent tests, and the leading candidate was found to be effective at targeting the desired protein and behaved as one would hope a drug would.

The authors are clear that the results are just a proof-of-concept, which company CEO Alex Zhavoronkov told Wired stemmed from a challenge set by a pharma partner to design a drug as quickly as possible. But they say they were able to carry out the process faster than traditional methods for a fraction of the cost.

There are some caveats. For a start, the protein being targeted is already very well known and multiple effective drugs exist for it. That gave the company a wealth of data to train their model on, something that isn’t the case for many of the diseases where we urgently need new drugs.

The company’s platform also only targets the very initial stages of the drug discovery process. The authors concede in their paper that the molecules would still take considerable optimization in the lab before they’d be true contenders for clinical trials.

“And that is where you will start to begin to commence to spend the vast piles of money that you will eventually go through in trying to get a drug to market,” writes Derek Lowe in his blog In The Pipeline. The part of the discovery process that the platform tackles represents a tiny fraction of the total cost of drug development, he says.

Nonetheless, the research is a definite advance for virtual screening technology and an important marker of the potential of AI for designing new medicines. Zhavoronkov also told Wired that this research was done more than a year ago, and they’ve since adapted the platform to go after harder drug targets with less data.

And with big pharma companies desperate to slash their ballooning development costs and find treatments for a host of intractable diseases, they can use all the help they can get.

Image Credit: freestocks.org / Unsplash


#435528 The Time for AI Is Now. Here’s Why

You hear a lot these days about the sheer transformative power of AI.

There’s pure intelligence: DeepMind’s algorithms readily beat humans at Go and StarCraft, and DeepStack triumphs over humans at no-limit hold’em poker. Often, these silicon brains generate gameplay strategies that don’t resemble anything from a human mind.

There’s astonishing speed: algorithms routinely surpass radiologists in diagnosing breast cancer, eye disease, and other ailments visible from medical imaging, essentially collapsing decades of expert training down to a few months.

Although AI’s silent touch is mainly felt today in the technological, financial, and health sectors, its impact across industries is rapidly spreading. At the Singularity University Global Summit in San Francisco this week, Neil Jacobstein, Chair of AI and Robotics, painted a picture of a better AI-powered future for humanity that is already here.

Thanks to cloud-based cognitive platforms, sophisticated AI tools like deep learning are no longer relegated to academic labs. For startups looking to tackle humanity’s grand challenges, the tools to efficiently integrate AI into their missions are readily available. The progress of AI is massively accelerating—to the point you need help from AI to track its progress, joked Jacobstein.

Now is the time to consider how AI can impact your industry, and in the process, begin to envision a beneficial relationship with our machine coworkers. As Jacobstein stressed in his talk, the future of a brain-machine mindmeld is a collaborative intelligence that augments our own. “AI is reinventing the way we invent,” he said.

AI’s Rapid Revolution
Machine learning and other AI-based methods may seem academic and abstruse. But Jacobstein pointed out that there are already plenty of real-world AI application frameworks.

Their secret? Rather than coding from scratch, smaller companies—with big visions—are tapping into cloud-based solutions such as Google’s TensorFlow, Microsoft’s Azure, or Amazon’s AWS to kick off their AI journey. These platforms act as all-in-one solutions that not only clean and organize data, but also contain built-in security and drag-and-drop coding that allow anyone to experiment with complicated machine learning algorithms.

Google Cloud’s Anthos, for example, lets anyone migrate data from other servers—IBM Watson or AWS, for example—so users can leverage different computing platforms and algorithms to transform data into insights and solutions.

Rather than coding from scratch, it’s already possible to hop onto a platform and play around with it, said Jacobstein. That’s key: this democratization of AI is how anyone can begin exploring solutions to problems we didn’t even know we had, or those long thought improbable.

The acceleration is only continuing. Much of AI’s mind-bending pace is thanks to a massive infusion of funding. Microsoft recently injected $1 billion into OpenAI, the research company co-founded by Elon Musk that aims to engineer socially responsible artificial general intelligence (AGI).

The other revolution is in hardware, and Google, IBM, and NVIDIA—among others—are racing to manufacture computing chips tailored to machine learning.

Democratizing AI is like the birth of the printing press. Mechanical printing allowed anyone to become an author; today, an iPhone lets anyone film a movie masterpiece.

However, this diffusion of AI into the fabric of our lives means tech explorers need to bring skepticism to their AI solutions, giving them a dose of empathy, nuance, and humanity.

A Path Towards Ethical AI
The democratization of AI is a double-edged sword: as more people wield the technology’s power in real-world applications, problems embedded in deep learning threaten to disrupt those very judgment calls.

Much of the press on the dangers of AI focuses on superintelligence—AI that’s more adept at learning than humans—taking over the world, said Jacobstein. But the near-term threat, and far more insidious, is in humans misusing the technology.

Deepfakes, for example, allow AI rookies to paste one person’s head on a different body or put words into a person’s mouth. As the panel said, it pays to think of AI as a cybersecurity problem, one with currently shaky accountability and complexity, and one that fails at diversity and bias.

Take bias. Thanks to progress in natural language processing, Google Translate works nearly perfectly today, so much so that many consider the translation problem solved. Not true, the panel said. One famous example is how the algorithm translates gender-neutral terms like “doctor” into “he” and “nurse” into “she.”

These biases reflect our own, and it’s not just a data problem. To truly engineer objective AI systems, ones stripped of our society’s biases, we need to ask who is developing these systems, and consult those who will be impacted by the products. In addition to gender, racial bias is also rampant. For example, one recent report found that a supposedly objective crime-predicting system was trained on falsified data, resulting in outputs that further perpetuate corrupt police practices. Another study from Google just this month found that their hate speech detector more often labeled innocuous tweets from African-Americans as “obscene” compared to tweets from people of other ethnicities.

We often think of building AI as purely an engineering job, the panelists agreed. But similar to gene drives, germ-line genome editing, and other transformative—but dangerous—tools, AI needs to grow under the consultation of policymakers and other stakeholders. It pays to start young: educating newer generations on AI biases will mold malleable minds early, alerting them to the problem of bias and potentially mitigating risks.

As panelist Tess Posner from AI4ALL said, AI is rocket fuel for ambition. If young minds set out using the tools of AI to tackle their chosen problems, while fully aware of its inherent weaknesses, we can begin to build an AI-embedded future that is widely accessible and inclusive.

The bottom line: people who will be impacted by AI need to be in the room at the conception of an AI solution. People will be displaced by the new technology, and ethical AI has to consider how to mitigate human suffering during the transition. Just because AI looks like “magic fairy dust” doesn’t mean that you’re home free, the panelists said. You, the sentient human, bear the burden of being responsible for how you decide to approach the technology.

The time for AI is now. Let’s make it ethical.

Image Credit: GrAI / Shutterstock.com
