Tag Archives: reliable

#434837 In Defense of Black Box AI

Deep learning is powering some amazing new capabilities, but we find it hard to scrutinize the workings of these algorithms. Lack of interpretability in AI is a common concern and many are trying to fix it, but is it really always necessary to know what’s going on inside these “black boxes”?

In a recent perspective piece for Science, Elizabeth Holm, a professor of materials science and engineering at Carnegie Mellon University, argued in defense of the black box algorithm. I caught up with her last week to find out more.

Edd Gent: What’s your experience with black box algorithms?

Elizabeth Holm: I got a dual PhD in materials science and engineering and scientific computing. I came to academia about six years ago and part of what I wanted to do in making this career change was to refresh and revitalize my computer science side.

I realized that computer science had changed completely. It used to be about algorithms and making codes run fast, but now it’s about data and artificial intelligence. There are the interpretable methods like random forest algorithms, where we can tell how the machine is making its decisions. And then there are the black box methods, like convolutional neural networks.

Once in a while we can find some information about their inner workings, but most of the time we have to accept their answers and kind of probe around the edges to figure out the space in which we can use them and how reliable and accurate they are.

EG: What made you feel like you had to mount a defense of these black box algorithms?

EH: When I started talking with my colleagues, I found that the black box nature of many of these algorithms was a real problem for them. I could understand that because we’re scientists, we always want to know why and how.

It got me thinking as a bit of a contrarian, “Are black boxes all bad? Must we reject them?” Surely not, because human thought processes are fairly black box. We often rely on human thought processes that the thinker can’t necessarily explain.

It’s looking like we’re going to be stuck with these methods for a while, because they’re really helpful. They do amazing things. And so there’s a very pragmatic realization that these are the best methods we’ve got to do some really important problems, and we’re not right now seeing alternatives that are interpretable. We’re going to have to use them, so we better figure out how.

EG: In what situations do you think we should be using black box algorithms?

EH: I came up with three rules. The simplest rule is: when the cost of a bad decision is small and the value of a good decision is high, it’s worth it. The example I gave in the paper is targeted advertising. If you send an ad no one wants it doesn’t cost a lot. If you’re the receiver it doesn’t cost a lot to get rid of it.

There are cases where the cost is high, and that’s then we choose the black box if it’s the best option to do the job. Things get a little trickier here because we have to ask “what are the costs of bad decisions, and do we really have them fully characterized?” We also have to be very careful knowing that our systems may have biases, they may have limitations in where you can apply them, they may be breakable.

But at the same time, there are certainly domains where we’re going to test these systems so extensively that we know their performance in virtually every situation. And if their performance is better than the other methods, we need to do it. Self driving vehicles are a significant example—it’s almost certain they’re going to have to use black box methods, and that they’re going to end up being better drivers than humans.

The third rule is the more fun one for me as a scientist, and that’s the case where the black box really enlightens us as to a new way to look at something. We have trained a black box to recognize the fracture energy of breaking a piece of metal from a picture of the broken surface. It did a really good job, and humans can’t do this and we don’t know why.

What the computer seems to be seeing is noise. There’s a signal in that noise, and finding it is very difficult, but if we do we may find something significant to the fracture process, and that would be an awesome scientific discovery.

EG: Do you think there’s been too much emphasis on interpretability?

EH: I think the interpretability problem is a fundamental, fascinating computer science grand challenge and there are significant issues where we need to have an interpretable model. But how I would frame it is not that there’s too much emphasis on interpretability, but rather that there’s too much dismissiveness of uninterpretable models.

I think that some of the current social and political issues surrounding some very bad black box outcomes have convinced people that all machine learning and AI should be interpretable because that will somehow solve those problems.

Asking humans to explain their rationale has not eliminated bias, or stereotyping, or bad decision-making in humans. Relying too much on interpreted ability perhaps puts the responsibility in the wrong place for getting better results. I can make a better black box without knowing exactly in what way the first one was bad.

EG: Looking further into the future, do you think there will be situations where humans will have to rely on black box algorithms to solve problems we can’t get our heads around?

EH: I do think so, and it’s not as much of a stretch as we think it is. For example, humans don’t design the circuit map of computer chips anymore. We haven’t for years. It’s not a black box algorithm that designs those circuit boards, but we’ve long since given up trying to understand a particular computer chip’s design.

With the billions of circuits in every computer chip, the human mind can’t encompass it, either in scope or just the pure time that it would take to trace every circuit. There are going to be cases where we want a system so complex that only the patience that computers have and their ability to work in very high-dimensional spaces is going to be able to do it.

So we can continue to argue about interpretability, but we need to acknowledge that we’re going to need to use black boxes. And this is our opportunity to do our due diligence to understand how to use them responsibly, ethically, and with benefits rather than harm. And that’s going to be a social conversation as well as as a scientific one.

*Responses have been edited for length and style

Image Credit: Chingraph / Shutterstock.com Continue reading

Posted in Human Robots

#434767 7 Non-Obvious Trends Shaping the Future

When you think of trends that might be shaping the future, the first things that come to mind probably have something to do with technology: Robots taking over jobs. Artificial intelligence advancing and proliferating. 5G making everything faster, connected cities making everything easier, data making everything more targeted.

Technology is undoubtedly changing the way we live, and will continue to do so—probably at an accelerating rate—in the near and far future. But there are other trends impacting the course of our lives and societies, too. They’re less obvious, and some have nothing to do with technology.

For the past nine years, entrepreneur and author Rohit Bhargava has read hundreds of articles across all types of publications, tagged and categorized them by topic, funneled frequent topics into broader trends, analyzed those trends, narrowed them down to the most significant ones, and published a book about them as part of his ‘Non-Obvious’ series. He defines a trend as “a unique curated observation of the accelerating present.”

In an encore session at South by Southwest last week (his initial talk couldn’t fit hundreds of people who wanted to attend, so a re-do was scheduled), Bhargava shared details of his creative process, why it’s hard to think non-obviously, the most important trends of this year, and how to make sure they don’t get the best of you.

Thinking Differently
“Non-obvious thinking is seeing the world in a way other people don’t see it,” Bhargava said. “The secret is curating your ideas.” Curation collects ideas and presents them in a meaningful way; museum curators, for example, decide which works of art to include in an exhibit and how to present them.

For his own curation process, Bhargava uses what he calls the haystack method. Rather than searching for a needle in a haystack, he gathers ‘hay’ (ideas and stories) then uses them to locate and define a ‘needle’ (a trend). “If you spend enough time gathering information, you can put the needle into the middle of the haystack,” he said.

A big part of gathering information is looking for it in places you wouldn’t normally think to look. In his case, that means that on top of reading what everyone else reads—the New York Times, the Washington Post, the Economist—he also buys publications like Modern Farmer, Teen Vogue, and Ink magazine. “It’s like stepping into someone else’s world who’s not like me,” he said. “That’s impossible to do online because everything is personalized.”

Three common barriers make non-obvious thinking hard.

The first is unquestioned assumptions, which are facts or habits we think will never change. When James Dyson first invented the bagless vacuum, he wanted to sell the license to it, but no one believed people would want to spend more money up front on a vacuum then not have to buy bags. The success of Dyson’s business today shows how mistaken that assumption—that people wouldn’t adapt to a product that, at the end of the day, was far more sensible—turned out to be. “Making the wrong basic assumptions can doom you,” Bhargava said.

The second barrier to thinking differently is constant disruption. “Everything is changing as industries blend together,” Bhargava said. “The speed of change makes everyone want everything, all the time, and people expect the impossible.” We’ve come to expect every alternative to be presented to us in every moment, but in many cases this doesn’t serve us well; we’re surrounded by noise and have trouble discerning what’s valuable and authentic.

This ties into the third barrier, which Bhargava calls the believability crisis. “Constant sensationalism makes people skeptical about everything,” he said. With the advent of fake news and technology like deepfakes, we’re in a post-truth, post-fact era, and are in a constant battle to discern what’s real from what’s not.

2019 Trends
Bhargava’s efforts to see past these barriers and curate information yielded 15 trends he believes are currently shaping the future. He shared seven of them, along with thoughts on how to stay ahead of the curve.

Retro Trust
We tend to trust things we have a history with. “People like nostalgic experiences,” Bhargava said. With tech moving as fast as it is, old things are quickly getting replaced by shinier, newer, often more complex things. But not everyone’s jumping on board—and some who’ve been on board are choosing to jump off in favor of what worked for them in the past.

“We’re turning back to vinyl records and film cameras, deliberately downgrading to phones that only text and call,” Bhargava said. In a period of too much change too fast, people are craving familiarity and dependability. To capitalize on that sentiment, entrepreneurs should seek out opportunities for collaboration—how can you build a product that’s new, but feels reliable and familiar?

Muddled Masculinity
Women have increasingly taken on more leadership roles, advanced in the workplace, now own more homes than men, and have higher college graduation rates. That’s all great for us ladies—but not so great for men or, perhaps more generally, for the concept of masculinity.

“Female empowerment is causing confusion about what it means to be a man today,” Bhargava said. “Men don’t know what to do—should they say something? Would that make them an asshole? Should they keep quiet? Would that make them an asshole?”

By encouraging the non-conforming, we can help take some weight off the traditional gender roles, and their corresponding divisions and pressures.

Innovation Envy
Innovation has become an over-used word, to the point that it’s thrown onto ideas and actions that aren’t really innovative at all. “We innovate by looking at someone else and doing the same,” Bhargava said. If an employee brings a radical idea to someone in a leadership role, in many companies the leadership will say they need a case study before implementing the radical idea—but if it’s already been done, it’s not innovative. “With most innovation what ends up happening is not spectacular failure, but irrelevance,” Bhargava said.

He suggests that rather than being on the defensive, companies should play offense with innovation, and when it doesn’t work “fail as if no one’s watching” (often, no one will be).

Artificial Influence
Thanks to social media and other technologies, there are a growing number of fabricated things that, despite not being real, influence how we think. “15 percent of all Twitter accounts may be fake, and there are 60 million fake Facebook accounts,” Bhargava said. There are virtual influencers and even virtual performers.

“Don’t hide the artificial ingredients,” Bhargava advised. “Some people are going to pretend it’s all real. We have to be ethical.” The creators of fabrications meant to influence the way people think, or the products they buy, or the decisions they make, should make it crystal-clear that there aren’t living, breathing people behind the avatars.

Enterprise Empathy
Another reaction to the fast pace of change these days—and the fast pace of life, for that matter—is that empathy is regaining value and even becoming a driver of innovation. Companies are searching for ways to give people a sense of reassurance. The Tesco grocery brand in the UK has a “relaxed lane” for those who don’t want to feel rushed as they check out. Starbucks opened a “signing store” in Washington DC, and most of its regular customers have learned some sign language.

“Use empathy as a principle to help yourself stand out,” Bhargava said. Besides being a good business strategy, “made with empathy” will ideally promote, well, more empathy, a quality there’s often a shortage of.

Robot Renaissance
From automating factory jobs to flipping burgers to cleaning our floors, robots have firmly taken their place in our day-to-day lives—and they’re not going away anytime soon. “There are more situations with robots than ever before,” Bhargava said. “They’re exploring underwater. They’re concierges at hotels.”

The robot revolution feels intimidating. But Bhargava suggests embracing robots with more curiosity than concern. While they may replace some tasks we don’t want replaced, they’ll also be hugely helpful in multiple contexts, from elderly care to dangerous manual tasks.

Back-storytelling
Similar to retro trust and enterprise empathy, organizations have started to tell their brand’s story to gain customer loyalty. “Stories give us meaning, and meaning is what we need in order to be able to put the pieces together,” Bhargava said. “Stories give us a way of understanding the world.”

Finding the story behind your business, brand, or even yourself, and sharing it openly, can help you connect with people, be they customers, coworkers, or friends.

Tech’s Ripple Effects
While it may not overtly sound like it, most of the trends Bhargava identified for 2019 are tied to technology, and are in fact a sort of backlash against it. Tech has made us question who to trust, how to innovate, what’s real and what’s fake, how to make the best decisions, and even what it is that makes us human.

By being aware of these trends, sharing them, and having conversations about them, we’ll help shape the way tech continues to be built, and thus the way it impacts us down the road.

Image Credit: Rohit Bhargava by Brian Smale Continue reading

Posted in Human Robots

#434759 To Be Ethical, AI Must Become ...

As over-hyped as artificial intelligence is—everyone’s talking about it, few fully understand it, it might leave us all unemployed but also solve all the world’s problems—its list of accomplishments is growing. AI can now write realistic-sounding text, give a debating champ a run for his money, diagnose illnesses, and generate fake human faces—among much more.

After training these systems on massive datasets, their creators essentially just let them do their thing to arrive at certain conclusions or outcomes. The problem is that more often than not, even the creators don’t know exactly why they’ve arrived at those conclusions or outcomes. There’s no easy way to trace a machine learning system’s rationale, so to speak. The further we let AI go down this opaque path, the more likely we are to end up somewhere we don’t want to be—and may not be able to come back from.

In a panel at the South by Southwest interactive festival last week titled “Ethics and AI: How to plan for the unpredictable,” experts in the field shared their thoughts on building more transparent, explainable, and accountable AI systems.

Not New, but Different
Ryan Welsh, founder and director of explainable AI startup Kyndi, pointed out that having knowledge-based systems perform advanced tasks isn’t new; he cited logistical, scheduling, and tax software as examples. What’s new is the learning component, our inability to trace how that learning occurs, and the ethical implications that could result.

“Now we have these systems that are learning from data, and we’re trying to understand why they’re arriving at certain outcomes,” Welsh said. “We’ve never actually had this broad society discussion about ethics in those scenarios.”

Rather than continuing to build AIs with opaque inner workings, engineers must start focusing on explainability, which Welsh broke down into three subcategories. Transparency and interpretability come first, and refer to being able to find the units of high influence in a machine learning network, as well as the weights of those units and how they map to specific data and outputs.

Then there’s provenance: knowing where something comes from. In an ideal scenario, for example, Open AI’s new text generator would be able to generate citations in its text that reference academic (and human-created) papers or studies.

Explainability itself is the highest and final bar and refers to a system’s ability to explain itself in natural language to the average user by being able to say, “I generated this output because x, y, z.”

“Humans are unique in our ability and our desire to ask why,” said Josh Marcuse, executive director of the Defense Innovation Board, which advises Department of Defense senior leaders on innovation. “The reason we want explanations from people is so we can understand their belief system and see if we agree with it and want to continue to work with them.”

Similarly, we need to have the ability to interrogate AIs.

Two Types of Thinking
Welsh explained that one big barrier standing in the way of explainability is the tension between the deep learning community and the symbolic AI community, which see themselves as two different paradigms and historically haven’t collaborated much.

Symbolic or classical AI focuses on concepts and rules, while deep learning is centered around perceptions. In human thought this is the difference between, for example, deciding to pass a soccer ball to a teammate who is open (you make the decision because conceptually you know that only open players can receive passes), and registering that the ball is at your feet when someone else passes it to you (you’re taking in information without making a decision about it).

“Symbolic AI has abstractions and representation based on logic that’s more humanly comprehensible,” Welsh said. To truly mimic human thinking, AI needs to be able to both perceive information and conceptualize it. An example of perception (deep learning) in an AI is recognizing numbers within an image, while conceptualization (symbolic learning) would give those numbers a hierarchical order and extract rules from the hierachy (4 is greater than 3, and 5 is greater than 4, therefore 5 is also greater than 3).

Explainability comes in when the system can say, “I saw a, b, and c, and based on that decided x, y, or z.” DeepMind and others have recently published papers emphasizing the need to fuse the two paradigms together.

Implications Across Industries
One of the most prominent fields where AI ethics will come into play, and where the transparency and accountability of AI systems will be crucial, is defense. Marcuse said, “We’re accountable beings, and we’re responsible for the choices we make. Bringing in tech or AI to a battlefield doesn’t strip away that meaning and accountability.”

In fact, he added, rather than worrying about how AI might degrade human values, people should be asking how the tech could be used to help us make better moral choices.

It’s also important not to conflate AI with autonomy—a worst-case scenario that springs to mind is an intelligent destructive machine on a rampage. But in fact, Marcuse said, in the defense space, “We have autonomous systems today that don’t rely on AI, and most of the AI systems we’re contemplating won’t be autonomous.”

The US Department of Defense released its 2018 artificial intelligence strategy last month. It includes developing a robust and transparent set of principles for defense AI, investing in research and development for AI that’s reliable and secure, continuing to fund research in explainability, advocating for a global set of military AI guidelines, and finding ways to use AI to reduce the risk of civilian casualties and other collateral damage.

Though these were designed with defense-specific aims in mind, Marcuse said, their implications extend across industries. “The defense community thinks of their problems as being unique, that no one deals with the stakes and complexity we deal with. That’s just wrong,” he said. Making high-stakes decisions with technology is widespread; safety-critical systems are key to aviation, medicine, and self-driving cars, to name a few.

Marcuse believes the Department of Defense can invest in AI safety in a way that has far-reaching benefits. “We all depend on technology to keep us alive and safe, and no one wants machines to harm us,” he said.

A Creation Superior to Its Creator
That said, we’ve come to expect technology to meet our needs in just the way we want, all the time—servers must never be down, GPS had better not take us on a longer route, Google must always produce the answer we’re looking for.

With AI, though, our expectations of perfection may be less reasonable.

“Right now we’re holding machines to superhuman standards,” Marcuse said. “We expect them to be perfect and infallible.” Take self-driving cars. They’re conceived of, built by, and programmed by people, and people as a whole generally aren’t great drivers—just look at traffic accident death rates to confirm that. But the few times self-driving cars have had fatal accidents, there’s been an ensuing uproar and backlash against the industry, as well as talk of implementing more restrictive regulations.

This can be extrapolated to ethics more generally. We as humans have the ability to explain our decisions, but many of us aren’t very good at doing so. As Marcuse put it, “People are emotional, they confabulate, they lie, they’re full of unconscious motivations. They don’t pass the explainability test.”

Why, then, should explainability be the standard for AI?

Even if humans aren’t good at explaining our choices, at least we can try, and we can answer questions that probe at our decision-making process. A deep learning system can’t do this yet, so working towards being able to identify which input data the systems are triggering on to make decisions—even if the decisions and the process aren’t perfect—is the direction we need to head.

Image Credit: a-image / Shutterstock.com Continue reading

Posted in Human Robots

#434673 The World’s Most Valuable AI ...

It recognizes our faces. It knows the videos we might like. And it can even, perhaps, recommend the best course of action to take to maximize our personal health.

Artificial intelligence and its subset of disciplines—such as machine learning, natural language processing, and computer vision—are seemingly becoming integrated into our daily lives whether we like it or not. What was once sci-fi is now ubiquitous research and development in company and university labs around the world.

Similarly, the startups working on many of these AI technologies have seen their proverbial stock rise. More than 30 of these companies are now valued at over a billion dollars, according to data research firm CB Insights, which itself employs algorithms to provide insights into the tech business world.

Private companies with a billion-dollar valuation were so uncommon not that long ago that they were dubbed unicorns. Now there are 325 of these once-rare creatures, with a combined valuation north of a trillion dollars, as CB Insights maintains a running count of this exclusive Unicorn Club.

The subset of AI startups accounts for about 10 percent of the total membership, growing rapidly in just 4 years from 0 to 32. Last year, an unprecedented 17 AI startups broke the billion-dollar barrier, with 2018 also a record year for venture capital into private US AI companies at $9.3 billion, CB Insights reported.

What exactly is all this money funding?

AI Keeps an Eye Out for You
Let’s start with the bad news first.

Facial recognition is probably one of the most ubiquitous applications of AI today. It’s actually a decades-old technology often credited to a man named Woodrow Bledsoe, who used an instrument called a RAND tablet that could semi-autonomously match faces from a database. That was in the 1960s.

Today, most of us are familiar with facial recognition as a way to unlock our smartphones. But the technology has gained notoriety as a surveillance tool of law enforcement, particularly in China.

It’s no secret that the facial recognition algorithms developed by several of the AI unicorns from China—SenseTime, CloudWalk, and Face++ (also known as Megvii)—are used to monitor the country’s 1.3 billion citizens. Police there are even equipped with AI-powered eyeglasses for such purposes.

A fourth billion-dollar Chinese startup, Yitu Technologies, also produces a platform for facial recognition in the security realm, and develops AI systems in healthcare on top of that. For example, its CARE.AITM Intelligent 4D Imaging System for Chest CT can reputedly identify in real time a variety of lesions for the possible early detection of cancer.

The AI Doctor Is In
As Peter Diamandis recently noted, AI is rapidly augmenting healthcare and longevity. He mentioned another AI unicorn from China in this regard—iCarbonX, which plans to use machines to develop personalized health plans for every individual.

A couple of AI unicorns on the hardware side of healthcare are OrCam Technologies and Butterfly. The former, an Israeli company, has developed a wearable device for the vision impaired called MyEye that attaches to one’s eyeglasses. The device can identify people and products, as well as read text, conveying the information through discrete audio.

Butterfly Network, out of Connecticut, has completely upended the healthcare market with a handheld ultrasound machine that works with a smartphone.

“Orcam and Butterfly are amazing examples of how machine learning can be integrated into solutions that provide a step-function improvement over state of the art in ultra-competitive markets,” noted Andrew Byrnes, investment director at Comet Labs, a venture capital firm focused on AI and robotics, in an email exchange with Singularity Hub.

AI in the Driver’s Seat
Comet Labs’ portfolio includes two AI unicorns, Megvii and Pony.ai.

The latter is one of three billion-dollar startups developing the AI technology behind self-driving cars, with the other two being Momenta.ai and Zoox.

Founded in 2016 near San Francisco (with another headquarters in China), Pony.ai debuted its latest self-driving system, called PonyAlpha, last year. The platform uses multiple sensors (LiDAR, cameras, and radar) to navigate its environment, but its “sensor fusion technology” makes things simple by choosing the most reliable sensor data for any given driving scenario.

Zoox is another San Francisco area startup founded a couple of years earlier. In late 2018, it got the green light from the state of California to be the first autonomous vehicle company to transport a passenger as part of a pilot program. Meanwhile, China-based Momenta.ai is testing level four autonomy for its self-driving system. Autonomous driving levels are ranked zero to five, with level five being equal to a human behind the wheel.

The hype around autonomous driving is currently in overdrive, and Byrnes thinks regulatory roadblocks will keep most self-driving cars in idle for the foreseeable future. The exception, he said, is China, which is adopting a “systems” approach to autonomy for passenger transport.

“If [autonomous mobility] solves bigger problems like traffic that can elicit government backing, then that has the potential to go big fast,” he said. “This is why we believe Pony.ai will be a winner in the space.”

AI in the Back Office
An AI-powered technology that perhaps only fans of the cult classic Office Space might appreciate has suddenly taken the business world by storm—robotic process automation (RPA).

RPA companies take the mundane back office work, such as filling out invoices or processing insurance claims, and turn it over to bots. The intelligent part comes into play because these bots can tackle unstructured data, such as text in an email or even video and pictures, in order to accomplish an increasing variety of tasks.

Both Automation Anywhere and UiPath are older companies, founded in 2003 and 2005, respectively. However, since just 2017, they have raised nearly a combined $1 billion in disclosed capital.

Cybersecurity Embraces AI
Cybersecurity is another industry where AI is driving investment into startups. Sporting imposing names like CrowdStrike, Darktrace, and Tanium, these cybersecurity companies employ different machine-learning techniques to protect computers and other IT assets beyond the latest software update or virus scan.

Darktrace, for instance, takes its inspiration from the human immune system. Its algorithms can purportedly “learn” the unique pattern of each device and user on a network, detecting emerging problems before things spin out of control.

All three companies are used by major corporations and governments around the world. CrowdStrike itself made headlines a few years ago when it linked the hacking of the Democratic National Committee email servers to the Russian government.

Looking Forward
I could go on, and introduce you to the world’s most valuable startup, a Chinese company called Bytedance that is valued at $75 billion for news curation and an app to create 15-second viral videos. But that’s probably not where VC firms like Comet Labs are generally putting their money.

Byrnes sees real value in startups that are taking “data-driven approaches to problems specific to unique industries.” Take the example of Chicago-based unicorn Uptake Technologies, which analyzes incoming data from machines, from wind turbines to tractors, to predict problems before they occur with the machinery. A not-yet unicorn called PingThings in the Comet Labs portfolio does similar predictive analytics for the energy utilities sector.

“One question we like asking is, ‘What does the state of the art look like in your industry in three to five years?’” Byrnes said. “We ask that a lot, then we go out and find the technology-focused teams building those things.”

Image Credit: Andrey Suslov / Shutterstock.com Continue reading

Posted in Human Robots

#434648 The Pediatric AI That Outperformed ...

Training a doctor takes years of grueling work in universities and hospitals. Building a doctor may be as easy as teaching an AI how to read.

Artificial intelligence has taken another step towards becoming an integral part of 21st-century medicine. New research out of Guangzhou, China, published February 11th in Nature Medicine Letters, has demonstrated a natural-language processing AI that is capable of out-performing rookie pediatricians in diagnosing common childhood ailments.

The massive study examined the electronic health records (EHR) from nearly 600,000 patients over an 18-month period at the Guangzhou Women and Children’s Medical Center and then compared AI-generated diagnoses against new assessments from physicians with a range of experience.

The verdict? On average, the AI was noticeably more accurate than junior physicians and nearly as reliable as the more senior ones. These results are the latest demonstration that artificial intelligence is on the cusp of becoming a healthcare staple on a global scale.

Less Like a Computer, More Like a Person
To outshine human doctors, the AI first had to become more human. Like IBM’s Watson, the pediatric AI leverages natural language processing, in essence “reading” written notes from EHRs not unlike how a human doctor would review those same records. But the similarities to human doctors don’t end there. The AI is a machine learning classifier (MLC), capable of placing the information learned from the EHRs into categories to improve performance.

Like traditionally-trained pediatricians, the AI broke cases down into major organ groups and infection areas (upper/lower respiratory, gastrointestinal, etc.) before breaking them down even further into subcategories. It could then develop associations between various symptoms and organ groups and use those associations to improve its diagnoses. This hierarchical approach mimics the deductive reasoning human doctors employ.

Another key strength of the AI developed for this study was the enormous size of the dataset collected to teach it: 1,362,559 outpatient visits from 567,498 patients yielded some 101.6 million data points for the MLC to devour on its quest for pediatric dominance. This allowed the AI the depth of learning needed to distinguish and accurately select from the 55 different diagnosis codes across the various organ groups and subcategories.

When comparing against the human doctors, the study used 11,926 records from an unrelated group of children, giving both the MLC and the 20 humans it was compared against an even playing field. The results were clear: while cohorts of senior pediatricians performed better than the AI, junior pediatricians (those with 3-15 years of experience) were outclassed.

Helping, Not Replacing
While the research used a competitive analysis to measure the success of the AI, the results should be seen as anything but hostile to human doctors. The near future of artificial intelligence in medicine will see these machine learning programs augment, not replace, human physicians. The authors of the study specifically call out augmentation as the key short-term application of their work. Triaging incoming patients via intake forms, performing massive metastudies using EHRs, providing rapid ‘second opinions’—the applications for an AI doctor that is better-but-not-the-best are as varied as the healthcare industry itself.

That’s only considering how artificial intelligence could make a positive impact immediately upon implementation. It’s easy to see how long-term use of a diagnostic assistant could reshape the way modern medical institutions approach their work.

Look at how the MLC results fit snugly between the junior and senior physician groups. Essentially, it took nearly 15 years before a physician could consistently out-diagnose the machine. That’s a decade and a half wherein an AI diagnostic assistant would be an invaluable partner—both as a training tool and a safety measure. Likewise, on the other side of the experience curve you have physicians whose performance could be continuously leveraged to improve the AI’s effectiveness. This is a clear opportunity for a symbiotic relationship, with humans and machines each assisting the other as they mature.

Closer to Us, But Still Dependent on Us
No matter the ultimate application, the AI doctors of the future are drawing nearer to us step by step. This latest research is a demonstration that artificial intelligence can mimic the results of human deductive reasoning even in some of the most complex and important decision-making processes. True, the MLC required input from humans to function; both the initial data points and the cases used to evaluate the AI depended on EHRs written by physicians. While every effort was made to design a test schema that removed any indication of the eventual diagnosis, some “data leakage” is bound to occur.

In other words, when AIs use human-created data, they inherit human insight to some degree. Yet the progress made in machine imaging, chatbots, sensors, and other fields all suggest that this dependence on human input is more about where we are right now than where we could be in the near future.

Data, and More Data
That near future may also have some clear winners and losers. For now, those winners seem to be the institutions that can capture and apply the largest sets of data. With a rapidly digitized society gathering incredible amounts of data, China has a clear advantage. Combined with their relatively relaxed approach to privacy, they are likely to continue as one of the driving forces behind machine learning and its applications. So too will Google/Alphabet with their massive medical studies. Data is the uranium in this AI arms race, and everyone seems to be scrambling to collect more.

In a global community that seems increasingly aware of the potential problems arising from this need for and reliance on data, it’s nice to know there’ll be an upside as well. The technology behind AI medical assistants is looking more and more mature—even if we are still struggling to find exactly where, when, and how that technology should first become universal.

Yet wherever we see the next push to make AI a standard tool in a real-world medical setting, I have little doubt it will greatly improve the lives of human patients. Today Doctor AI is performing as well as a human colleague with more than 10 years of experience. By next year or so, it may take twice as long for humans to be competitive. And in a decade, the combined medical knowledge of all human history may be a tool as common as a stethoscope in your doctor’s hands.

Image Credit: Nadia Snopek / Shutterstock.com Continue reading

Posted in Human Robots