Tag Archives: business

#435757 Robotic Animal Agility

An off-shore wind power platform, somewhere in the North Sea, on a freezing cold night, with howling winds and waves crashing against the impressive structure. An imperturbable ANYmal is quietly conducting its inspection.

ANYmal, a medium-sized dog-like quadruped robot, walks down the stairs, lifts a “paw” to open doors or to call the elevator and trots along corridors. Darkness is no problem: it knows the place perfectly, having 3D-mapped it. Its laser sensors keep it informed about its precise path, location and potential obstacles. It conducts its inspection across several rooms. Its cameras zoom in on counters, recording the measurements displayed. Its thermal sensors record the temperature of machines and equipment and its ultrasound microphone checks for potential gas leaks. The robot also inspects lever positions as well as the correct positioning of regulatory fire extinguishers. As the electronic buzz of its engines resumes, it carries on working tirelessly.

After a little over two hours of inspection, the robot returns to its docking station for recharging. It will soon head back out to conduct its next solitary patrol. ANYmal played alongside Mulder and Scully in the “X-Files” TV series*, but it is in no way a Hollywood robot. It genuinely exists and surveillance missions are part of its very near future.

Off-shore oil platforms, the first test fields and probably the first actual application of ANYmal. ©ANYbotics

This quadruped robot was designed by ANYbotics, a spinoff of the Swiss Federal Institute of Technology in Zurich (ETH Zurich). Made of carbon fibre and aluminium, it weighs about thirty kilos. It is fully ruggedised, water- and dust-proof (IP-67). A kevlar belly protects its main body, carrying its powerful brain, batteries, network device, power management system and navigational systems.

ANYmal was designed for all types of terrain, including rubble, sand or snow. It has been field tested on industrial sites and is at ease with new obstacles to overcome (and it can even get up after a fall). Depending on its mission, its batteries last 2 to 4 hours.

On its jointed legs, protected by rubber pads, it can walk (at the speed of human steps), trot, climb, curl upon itself to crawl, carry a load or even jump and dance. It is the need to move on all surfaces that has driven its designers to choose a quadruped. “Biped robots are not easy to stabilise, especially on irregular terrain” explains Dr Péter Fankhauser, co-founder and chief business development officer of ANYbotics. “Wheeled or tracked robots can carry heavy loads, but they are bulky and less agile. Flying drones are highly mobile, but cannot carry load, handle objects or operate in bad weather conditions. We believe that quadrupeds combine the optimal characteristics, both in terms of mobility and versatility.”

What served as a source of inspiration for the team behind the project, the Robotic Systems Lab of the ETH Zurich, is a champion of agility on rugged terrain: the mountain goat. “We are of course still a long way” says Fankhauser. “However, it remains our objective on the longer term.”

The first prototype, ALoF, was designed back in 2009. It was still rather slow, very rigid and clumsy – more of a proof of concept than a robot ready for application. In 2012, StarlETH, fitted with spring joints, could hop, jump and climb. It was with this robot that the team started participating in 2014 in ARGOS, a full-scale challenge launched by the Total oil group. The idea was to present a robot capable of inspecting an off-shore drilling station autonomously.

Up against dozens of competitors, the ETH Zurich team was the only team to enter the competition with such a quadrupedal robot. They didn’t win, but the multiple field tests were growing ever more convincing, especially because, during the challenge, the team designed new joints with elastic actuators made in-house. These joints, inspired by tendons and muscles, are compact, sealed and include their own custom control electronics. They can regulate joint torque, position and impedance directly. Thanks to this innovation, the team could enter the same competition with a new version of its robot, ANYmal, fitted with three joints on each leg.

The ARGOS experience confirms the relevance of the selected means of locomotion. “Our robot is lighter, takes up less space on site and it is less noisy” says Fankhauser. “It also overcomes bigger obstacles than larger wheeled or tracked robots!” As ANYmal generated public interest and its transformation into a genuine product seemed more than possible, the startup ANYbotics was launched in 2016. It sold not only its robot, but also its revolutionary joints, called ANYdrive.

Today, ANYmal is not yet ready for sale to companies. However, ANYbotics has a growing number of partnerships with several industries, testing the robot for a few days or several weeks, for all types of tasks. Last October, for example, ANYmal navigated its way through the dark sewage system of the city of Zurich in order to test its capacity to help workers in similar difficult, repetitive and even dangerous tasks.

Why such an early interest among companies? “Because many companies want to integrate robots into their maintenance tasks” answers Fankhauser. “With ANYmal, they can actually evaluate its feasibility and plan their strategy. Eventually, both the architecture and the equipment of buildings could be rethought to be adapted to these maintenance robots”.

ANYmal requires ruggedised, sealed and extremely reliable interconnection solutions, such as LEMO. ©ANYbotics

Through field demonstrations and testing, ANYbotics can gather masses of information (up to 50,000 measurements are recorded every second during each test!). “It helps us to shape the product.” In due time, the startup will be ready to deliver a commercial product which really caters for companies’ needs.

Inspection and surveillance tasks on industrial sites are not the only applications considered. The startup is also thinking of agricultural inspections – with its onboard sensors, ANYmal is capable of mapping its environment, measuring biomass and even taking soil samples. In the longer term, it could also be used for search and rescue operations. By the way, the robot can already be switched to “remote control” mode at any time and can be easily tele-operated. It is also capable of live audio and video transmission.

The transition from the prototype to the marketed product stage will involve a number of further developments. These include increasing ANYmal’s agility and speed, extending its capacity to map large-scale environments, improving safety, security and user handling, and integrating the system with the customer’s data management software. It will also be necessary to enhance the robot’s reliability “so that it can work for days, weeks, or even months without human supervision.” All required certifications will have to be obtained. The locomotion system, which triggered the whole business, is only one of a number of considerations for ANYbotics.

Designed for extreme environments, ANYmal is not troubled by smoke and can walk in the snow, through rubble or in water. ©ANYbotics

The startup is not all alone. In fact, it has sold ANYmal robots to a dozen major universities, which use them to develop their know-how in robotics. The startup has also founded ANYmal Research, a community including members such as Toyota Research Institute, the German Aerospace Center and the computer company Nvidia. Members have full access to ANYmal’s control software, simulations and documentation. Sharing has boosted both software and hardware ideas and developments (built on ROS, the open-source Robot Operating System). This applies in particular to payload variations, which provide for expandability and scalability. For instance, one of the universities uses a robotic arm which enables ANYmal to grasp or handle objects and open doors.

Among possible applications, ANYbotics mentions entertainment. It is not only about playing in more films or TV series, but rather about participating in various attractions (trade shows, museums, etc.). “ANYmal is so novel that it attracts a great amount of interest” confirms Fankhauser with a smile. “Whenever we present it somewhere, people gather around.”

Videos of these events show a fascinated and sometimes slightly fearful audience, when ANYmal gets too close to them. Is it fear of the “bad robot”? “This fear exists indeed and we are happy to be able to use ANYmal also to promote public awareness towards robotics and robots.” Reminiscent of a young dog, ANYmal is truly adapted for the purpose.

However, Péter Fankhauser softens the image of humans and sophisticated robots living together. “These coming years, robots will continue to work in the background, like they have for a long time in factories. Then, they will be used in public places in a selective and targeted way, for instance for dangerous missions. We will need to wait another ten years before animal-like robots, such as ANYmal, will share our everyday lives!”

At the Consumer Electronics Show (CES) in Las Vegas in January, Continental, the German automotive manufacturing company, used robots to demonstrate a last-mile delivery. It showed ANYmal getting out of an autonomous vehicle with a parcel, climbing onto the front porch, lifting a paw to ring the doorbell, depositing the parcel before getting back into the vehicle. This futuristic image seems very close indeed.

*X-Files, season 11, episode 7, aired in February 2018

Posted in Human Robots

#435674 MIT Future of Work Report: We ...

Robots aren’t going to take everyone’s jobs, but technology has already reshaped the world of work in ways that are creating clear winners and losers. And it will continue to do so without intervention, says the first report of MIT’s Task Force on the Work of the Future.

The supergroup of MIT academics was set up by MIT President Rafael Reif in early 2018 to investigate how emerging technologies will impact employment and devise strategies to steer developments in a positive direction. And the headline finding from their first publication is that it’s not the quantity of jobs we should be worried about, but the quality.

Widespread press reports of a looming “employment apocalypse” brought on by AI and automation are probably wide of the mark, according to the authors. Shrinking workforces as developed countries age and outstanding limitations in what machines can do mean we’re unlikely to have a shortage of jobs.

But while unemployment is historically low, recent decades have seen a polarization of the workforce as the number of both high- and low-skilled jobs has grown at the expense of the middle-skilled ones, driving growing income inequality and depriving the non-college-educated of viable careers.

This is at least partly attributable to the growth of digital technology and automation, the report notes, which are rendering obsolete many middle-skilled jobs based around routine work like assembly lines and administrative support.

That leaves workers to either pursue high-skilled jobs that require deep knowledge and creativity, or settle for low-paid jobs that rely on skills—like manual dexterity or interpersonal communication—that are still beyond machines, but generic to most humans and therefore not valued by employers. And the growth of emerging technology like AI and robotics is only likely to exacerbate the problem.

This isn’t the first report to note this trend. The World Bank’s 2016 World Development Report noted how technology is causing a “hollowing out” of labor markets. But the MIT report goes further in saying that the cause isn’t simply technology, but the institutions and policies we’ve built around it.

The motivation for introducing new technology is broadly assumed to be to increase productivity, but the authors note a rarely acknowledged fact: “Not all innovations that raise productivity displace workers, and not all innovations that displace workers substantially raise productivity.”

Examples of the former include computer-aided design software that makes engineers and architects more productive, while examples of the latter include self-service checkouts and automated customer support that replace human workers, often at the expense of a worse customer experience.

While the report notes that companies have increasingly adopted the language of technology augmenting labor, in reality this has only really benefited high-skilled workers. For lower-skilled jobs the motivation is primarily labor cost savings, which highlights the other major force shaping technology’s impact on employment: shareholder capitalism.

The authors note that up until the 1980s, increasing productivity resulted in wage growth across the economic spectrum, but since then average wage growth has failed to keep pace and gains have dramatically skewed towards the top earners.

The report shies away from directly linking this trend to the birth of Reaganomics (something others have been happy to do), but it notes that American veneration of the shareholder as the primary stakeholder in a business and tax policies that incentivize investment in capital rather than labor have exacerbated the negative impacts technology can have on employment.

That means the current focus on re-skilling workers to thrive in the new economy is a necessary, but not sufficient, solution to the disruptive impact technology is having on work, the authors say.

Alongside significant investment in education, fiscal policies need to be re-balanced away from subsidizing investment in physical capital and towards boosting investment in human capital, the authors write, and workers need to have a greater say in corporate decision-making.

The authors point to other developed economies where productivity growth, income growth, and equality haven’t become so disconnected thanks to investments in worker skills, social safety nets, and incentives to invest in human capital. Whether such a radical reshaping of US economic policy is achievable in today’s political climate remains to be seen, but the authors conclude with a call to arms.

“The failure of the US labor market to deliver broadly shared prosperity despite rising productivity is not an inevitable byproduct of current technologies or free markets,” they write. “We can and should do better.”

Image Credit: Simon Abrams / Unsplash

#435656 Will AI Be Fashion Forward—or a ...

The narrative that often accompanies most stories about artificial intelligence these days is how machines will disrupt any number of industries, from healthcare to transportation. It makes sense. After all, technology already drives many of the innovations in these sectors of the economy.

But sneakers and the red carpet? The definitively low-tech fashion industry would seem to be one of the last to turn over its creative direction to data scientists and machine learning algorithms.

However, big brands, e-commerce giants, and numerous startups are betting that AI can ingest data and spit out Chanel. Maybe it’s not surprising, given that fashion is partly about buzz and trends—and there’s nothing more buzzy and trendy in the world of tech today than AI.

In its annual survey of the $3 trillion fashion industry, consulting firm McKinsey predicted that while AI didn’t hit a “critical mass” in 2018, it would increasingly influence the business of everything from design to manufacturing.

“Fashion as an industry really has been so slow to understand its potential roles interwoven with technology. And, to be perfectly honest, the technology doesn’t take fashion seriously.” This comment comes from Zowie Broach, head of fashion at London’s Royal College of Art, who as a self-described “old fashioned” designer has embraced the disruptive nature of technology—with some caveats.

Co-founder in the late 1990s of the avant-garde fashion label Boudicca, Broach has always seen tech as a tool for designers, even setting up a website for the company circa 1998, way before an online presence became, well, fashionable.

Broach told Singularity Hub that while she is generally optimistic about the future of technology in fashion—the designer has avidly been consuming old sci-fi novels over the last few years—there are still a lot of difficult questions to answer about the interface of algorithms, art, and apparel.

For instance, can AI do what the great designers of the past have done? Fashion was “about designing, it was about a narrative, it was about meaning, it was about expression,” according to Broach.

AI that designs products based on data gleaned from human behavior can potentially tap into the Pavlovian response in consumers in order to make money, Broach noted. But is that channeling creativity, or just digitally dabbling in basic human brain chemistry?

She is concerned about people retaining control of the process, whether we’re talking about their data or their designs. But being empowered with the insights machines could provide into, for example, the geographical nuances of fashion between Dubai, Moscow, and Toronto is thrilling.

“What is it that we want the future to be from a fashion, an identity, and design perspective?” she asked.

Off on the Right Foot
Silicon Valley and some of the biggest brands in the industry offer a few answers about where AI and fashion are headed (though not at the sort of depths that address Broach’s broader questions of aesthetics and ethics).

Take what is arguably the biggest brand in fashion, at least by market cap but probably not by the measure of appearances on Oscar night: Nike. The $100 billion shoe company just gobbled up an AI startup called Celect to bolster its data analytics and optimize its inventory. In other words, Nike hopes it will be able to figure out what’s hot and what’s not in a particular location to stock its stores more efficiently.

The company is going even further with Nike Fit, a foot-scanning platform using a smartphone camera that applies AI techniques from fields like computer vision and machine learning to find the best fit for each person’s foot. The algorithms then identify and recommend the appropriately sized and shaped shoe in different styles.

No doubt the next step will be to 3D print personalized and on-demand sneakers at any store.

San Francisco-based startup ThirdLove is trying to bring a similar approach to bra sizes. Its 20-member data team, Fortune reported, has developed the Fit Finder quiz that uses machine learning algorithms to help pick just the right garment for every body type.

Data scientists are also a big part of the team at Stitch Fix, a former San Francisco startup that went public in 2017 and today sports a market cap of more than $2 billion. The online “personal styling” company uses hundreds of algorithms to not only make recommendations to customers, but to help design new styles and even manage the subscription-based supply chain.

Future of Fashion
E-commerce giant Amazon has thrown its own considerable resources into developing AI applications for retail fashion—with mixed results.

One notable attempt involved a “styling assistant” that came with the company’s Echo Look camera that helped people catalog and manage their wardrobes, even helping pick out each day’s attire. The company more recently revisited the direct consumer side of AI with an app called StyleSnap, which matches clothes and accessories uploaded to the site with the retailer’s vast inventory and recommends similar styles.

Behind the curtains, Amazon is going even further. A team of researchers in Israel have developed algorithms that can deduce whether a particular look is stylish based on a few labeled images. Another group at the company’s San Francisco research center was working on tech that could generate new designs of items based on images of a particular style the algorithms trained on.

“I will say that the accumulation of many new technologies across the industry could manifest in a highly specialized style assistant, far better than the examples we’ve seen today. However, the most likely thing is that the least sexy of the machine learning work will become the most impactful, and the public may never hear about it.”

That prediction is from an online interview with Leanne Luce, a fashion technology blogger and product manager at Google who recently wrote a book called, succinctly enough, Artificial Intelligence and Fashion.

Data Meets Design
Academics are also sticking their beakers into AI and fashion. Researchers at the University of California, San Diego, and Adobe Research have previously demonstrated that neural networks, a type of AI designed to mimic some aspects of the human brain, can be trained to generate (i.e., design) new product images to match a buyer’s preference, much like the team at Amazon.

Meanwhile, scientists at Hong Kong Polytechnic University are working with China’s answer to Amazon, Alibaba, on developing a FashionAI Dataset to help machines better understand fashion. The effort will focus on how algorithms approach certain building blocks of design, what are called “key points” such as neckline and waistline, and “fashion attributes” like collar types and skirt styles.

The man largely behind the university’s research team is Calvin Wong, a professor and associate head of Hong Kong Polytechnic University’s Institute of Textiles and Clothing. His group has also developed an “intelligent fabric defect detection system” called WiseEye for quality control, reducing the chance of producing substandard fabric by 90 percent.

Wong and company also recently inked an agreement with RCA to establish an AI-powered design laboratory, though the details of that venture have yet to be worked out, according to Broach.

One hope is that such collaborations will not just get at the technological challenges of using machines in creative endeavors like fashion, but will also address the more personal relationships humans have with their machines.

“I think who we are, and how we use AI in fashion, as our identity, is not a superficial skin. It’s very, very important for how we define our future,” Broach said.

Image Credit: Inspirationfeed / Unsplash

#435621 ANYbotics Introduces Sleek New ANYmal C ...

Quadrupedal robots are making significant advances lately, and just in the past few months we’ve seen Boston Dynamics’ Spot hauling a truck, IIT’s HyQReal pulling a plane, MIT’s MiniCheetah doing backflips, Unitree Robotics’ Laikago towing a van, and Ghost Robotics’ Vision 60 exploring a mine. Robot makers are betting that their four-legged machines will prove useful in a variety of applications in construction, security, delivery, and even at home.

ANYbotics has been working on such applications for years, testing out their ANYmal robot in places where humans typically don’t want to go (like offshore platforms) as well as places where humans really don’t want to go (like sewers), and they have a better idea than most companies what can make quadruped robots successful.

This week, ANYbotics is announcing a completely new quadruped platform, ANYmal C, a major upgrade from the really quite research-y ANYmal B. The new quadruped has been optimized for ruggedness and reliability in industrial environments, with a streamlined body painted a color that lets you know it means business.

ANYmal C’s physical specs are pretty impressive for a production quadruped. It can move at 1 meter per second, manage 20-degree slopes and 45-degree stairs, cross 25-centimeter gaps, and squeeze through passages just 60 centimeters wide. It’s packed with cameras and 3D sensors, including a lidar for 3D mapping and simultaneous localization and mapping (SLAM). All these sensors (along with the vast volume of gait research that’s been done with ANYmal) make this one of the most reliably autonomous quadrupeds out there, with real-time motion planning and obstacle avoidance.
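To make those figures concrete, here is a minimal Python sketch that checks a planned inspection route against the limits quoted above. Only the five numbers come from the article; the `LocomotionLimits` and `RouteSegment` classes, the `blocking_reasons` function, and the sample route are hypothetical names invented for illustration and are not part of any ANYbotics software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocomotionLimits:
    """Limits quoted in the article for ANYmal C."""
    max_speed_m_s: float = 1.0
    max_slope_deg: float = 20.0
    max_stair_deg: float = 45.0
    max_gap_cm: float = 25.0
    min_passage_width_cm: float = 60.0


@dataclass(frozen=True)
class RouteSegment:
    """Hypothetical description of one stretch of an inspection route."""
    name: str
    slope_deg: float = 0.0
    stair_deg: float = 0.0
    gap_cm: float = 0.0
    passage_width_cm: float = float("inf")  # open floor by default


def blocking_reasons(segment, limits=LocomotionLimits()):
    """Return a list of reasons the segment exceeds the quoted limits."""
    reasons = []
    if segment.slope_deg > limits.max_slope_deg:
        reasons.append(f"slope {segment.slope_deg} deg > {limits.max_slope_deg} deg")
    if segment.stair_deg > limits.max_stair_deg:
        reasons.append(f"stairs {segment.stair_deg} deg > {limits.max_stair_deg} deg")
    if segment.gap_cm > limits.max_gap_cm:
        reasons.append(f"gap {segment.gap_cm} cm > {limits.max_gap_cm} cm")
    if segment.passage_width_cm < limits.min_passage_width_cm:
        reasons.append(
            f"passage {segment.passage_width_cm} cm < {limits.min_passage_width_cm} cm"
        )
    return reasons


# Hypothetical route: a corridor, a steep staircase, and a narrow hatch.
route = [
    RouteSegment("corridor", passage_width_cm=80),
    RouteSegment("steep stairs", stair_deg=50),
    RouteSegment("hatch", passage_width_cm=55, gap_cm=30),
]
for segment in route:
    problems = blocking_reasons(segment)
    print(segment.name, "OK" if not problems else "; ".join(problems))
```

A real deployment would depend on far more than these static thresholds (surface friction, payload, temperature), so this is a first-pass filter at best.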

Image: ANYbotics

ANYmal can autonomously attach itself to a cone-shaped docking station to recharge.

ANYmal C is also one of the ruggedest legged robots in existence. The 50-kilogram robot is IP67 rated, meaning that it’s completely impervious to dust and can withstand being submerged in a meter of water for an hour. If it’s submerged for longer than that, you’re absolutely doing something wrong. The robot will run for over 2 hours on battery power, and if that’s not enough endurance, don’t worry, because ANYmal can autonomously impale itself on a weird cone-shaped docking station to recharge.

Photo: ANYbotics

ANYmal C’s sensor payload includes cameras and a lidar for 3D mapping and SLAM.

As far as what ANYmal C is designed to actually do, it’s mostly remote inspection tasks where you need to move around through a relatively complex environment, but where for whatever reason you’d be better off not sending a human. ANYmal C has a sensor payload that gives it lots of visual options, like thermal imaging, and with the ability to handle a 10-kilogram payload, the robot can be adapted to many different environments.

Over the next few months, we’re hoping to see more examples of ANYmal C being deployed to do useful stuff in real-world environments, but for now, we do have a bit more detail from ANYbotics CTO Christian Gehring.

IEEE Spectrum: Can you tell us about the development process for ANYmal C?

Christian Gehring: We tested the previous generation of ANYmal (B) in a broad range of environments over the last few years and gained a lot of insights. Based on our learnings, it became clear that we would have to re-design the robot to meet the requirements of industrial customers in terms of safety, quality, reliability, and lifetime. There were different prototype stages both for the new drives and for single robot assemblies. Apart from electrical tests, we thoroughly tested the thermal control and ingress protection of various subsystems like the depth cameras and actuators.

What can ANYmal C do that the previous version of ANYmal can’t?

ANYmal C was redesigned with a focus on performance increase regarding actuation (new drives), computational power (new hexacore Intel i7 PCs), locomotion and navigation skills, and autonomy (new depth cameras). The new robot additionally features a docking system for autonomous recharging and an inspection payload as an option. The design of ANYmal C is far more integrated than its predecessor, which increases both performance and reliability.

How much of ANYmal C’s development and design was driven by your experience with commercial or industry customers?

Tests (such as the offshore installation with TenneT) and discussions with industry customers were important to get the necessary design input in terms of performance, safety, quality, reliability, and lifetime. Most customers ask for very similar inspection tasks that can be performed with our standard inspection payload and the required software packages. Some are looking for a robot that can also solve some simple manipulation tasks like pushing a button. Overall, most use cases customers have in mind are realistic and achievable, but some are really tough for the robot, like climbing 50° stairs in hot environments of 50°C.

Can you describe how much autonomy you expect ANYmal C to have in industrial or commercial operations?

ANYmal C is primarily developed to perform autonomous routine inspections in industrial environments. This autonomy especially adds value for operations that are difficult to access, as human operation is extremely costly. The robot can naturally also be operated via a remote control and we are working on long-distance remote operation as well.

Do you expect that researchers will be interested in ANYmal C? What research applications could it be useful for?

ANYmal C has been designed to also address the needs of the research community. The robot comes with two powerful hexacore Intel i7 computers and can additionally be equipped with an NVIDIA Jetson Xavier graphics card for learning-based applications. Payload interfaces enable users to easily install and test new sensors. By joining our established ANYmal Research community, researchers get access to simulation tools and software APIs, which boosts their research in various areas like control, machine learning, and navigation.

[ ANYmal C ]

#435614 3 Easy Ways to Evaluate AI Claims

When every other tech startup claims to use artificial intelligence, it can be tough to figure out if an AI service or product works as advertised. In the midst of the AI “gold rush,” how can you separate the nuggets from the fool’s gold?

There’s no shortage of cautionary tales involving overhyped AI claims. And applying AI technologies to health care, education, and law enforcement means that getting it wrong can have real consequences for society—not just for investors who bet on the wrong unicorn.

So IEEE Spectrum asked experts to share their tips for how to identify AI hype in press releases, news articles, research papers, and IPO filings.

“It can be tricky, because I think the people who are out there selling the AI hype—selling this AI snake oil—are getting more sophisticated over time,” says Tim Hwang, director of the Harvard-MIT Ethics and Governance of AI Initiative.

The term “AI” is perhaps most frequently used to describe machine learning algorithms (and deep learning algorithms, which require even less human guidance) that analyze huge amounts of data and make predictions based on patterns that humans might miss. These popular forms of AI are mostly suited to specialized tasks, such as automatically recognizing certain objects within photos. For that reason, they are sometimes described as “weak” or “narrow” AI.

Some researchers and thought leaders like to talk about the idea of “artificial general intelligence” or “strong AI” that has human-level capacity and flexibility to handle many diverse intellectual tasks. But for now, this type of AI remains firmly in the realm of science fiction and is far from being realized in the real world.

“AI has no well-defined meaning and many so-called AI companies are simply trying to take advantage of the buzz around that term,” says Arvind Narayanan, a computer scientist at Princeton University. “Companies have even been caught claiming to use AI when, in fact, the task is done by human workers.”

Here are three ways to recognize AI hype.

Look for Buzzwords
One red flag is what Hwang calls the “hype salad.” This means stringing together the term “AI” with many other tech buzzwords such as “blockchain” or “Internet of Things.” That doesn’t automatically disqualify the technology, but spotting a high volume of buzzwords in a post, pitch, or presentation should raise questions about what exactly the company or individual has developed.

Other experts agree that strings of buzzwords can be a red flag. That’s especially true if the buzzwords are never really explained in technical detail, and are simply tossed around as vague, poorly defined terms, says Marzyeh Ghassemi, a computer scientist and biomedical engineer at the University of Toronto in Canada.

“I think that if it looks like a Google search—picture ‘interpretable blockchain AI deep learning medicine’—it’s probably not high-quality work,” Ghassemi says.

Hwang also suggests mentally replacing all mentions of “AI” in an article with the term “magical fairy dust.” It’s a way of seeing whether an individual or organization is treating the technology like magic. If so—that’s another good reason to ask more questions about what exactly the AI technology involves.
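Hwang’s substitution is meant as a mental exercise, but it is simple enough to try literally. The Python sketch below is one possible way to do so, and also tallies the buzzwords named in this section; the `fairy_dust` and `buzzword_count` functions, the buzzword list, and the sample pitch are all made up for illustration.

```python
import re

# Buzzwords mentioned in this article; extend the list as needed (assumption).
BUZZWORDS = ["AI", "blockchain", "Internet of Things", "deep learning"]


def fairy_dust(text):
    """Replace every standalone 'AI' with 'magical fairy dust'."""
    return re.sub(r"\bAI\b", "magical fairy dust", text)


def buzzword_count(text):
    """Count case-insensitive whole-phrase occurrences of each buzzword."""
    counts = {}
    for word in BUZZWORDS:
        pattern = r"\b" + re.escape(word) + r"\b"
        counts[word] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Hypothetical marketing sentence, not taken from any real company.
pitch = "Our AI uses blockchain and AI-driven Internet of Things sensors."
print(fairy_dust(pitch))
print(buzzword_count(pitch))
```

If the rewritten pitch reads no differently from the original, that is a hint the text never explains what the technology actually does. A high count alone proves nothing, as the article notes.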

And even the visual imagery used to illustrate AI claims can indicate that an individual or organization is overselling the technology.

“I think that a lot of the people who work on machine learning on a day-to-day basis are pretty humble about the technology, because they’re largely confronted with how frequently it just breaks and doesn't work,” Hwang says. “And so I think that if you see a company or someone representing AI as a Terminator head, or a big glowing HAL eye or something like that, I think it’s also worth asking some questions.”

Interrogate the Data

It can be hard to evaluate AI claims without any relevant expertise, says Ghassemi at the University of Toronto. Even experts need to know the technical details of the AI algorithm in question and have some access to the training data that shaped the AI model’s predictions. Still, savvy readers with some basic knowledge of applied statistics can search for red flags.

To start, readers can look for possible bias in training data based on small sample sizes or a skewed population that fails to reflect the broader population, Ghassemi says. After all, an AI model trained only on health data from white men would not necessarily achieve similar results for other populations of patients.

How machine learning and deep learning models perform also depends on how well humans labeled the sample datasets used to train these programs. This task can be straightforward when labeling photos of cats versus dogs, but gets more complicated when assigning disease diagnoses to certain patient cases.

Medical experts frequently disagree with each other on diagnoses—which is why many patients seek a second opinion. Not surprisingly, this ambiguity can also affect the diagnostic labels that experts assign in training datasets. “For me, a red flag is not demonstrating deep knowledge of how your labels are defined,” Ghassemi says.

Such training data can also reflect the cultural stereotypes and biases of the humans who labeled the data, says Narayanan at Princeton University. Like Ghassemi, he recommends taking a hard look at exactly what the AI has learned: “A good way to start critically evaluating AI claims is by asking questions about the training data.”

Another red flag is presenting an AI system’s performance through a single accuracy figure without much explanation, Narayanan says. Claiming that an AI model achieves “99 percent” accuracy doesn’t mean much without knowing the baseline for comparison—such as whether other systems have already achieved 99 percent accuracy—or how well that accuracy holds up in situations beyond the training dataset.
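A minimal Python sketch shows why a lone accuracy figure says so little. Everything in it is invented for illustration: the dataset (1,000 cases, 10 of them positive) and the helper name `accuracy` are assumptions, not anything described by the experts quoted here. It uses a trivial "always say no" baseline, which is only one possible kind of comparison.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical dataset: 10 positive cases (1) and 990 negative cases (0).
labels = [1] * 10 + [0] * 990

# A "model" that never detects anything: it predicts negative every time.
always_negative = [0] * len(labels)

baseline_accuracy = accuracy(always_negative, labels)
positives_found = sum(p == 1 and y == 1 for p, y in zip(always_negative, labels))

print(f"Accuracy of always predicting 'negative': {baseline_accuracy:.0%}")
print(f"Positive cases correctly identified: {positives_found} of 10")
```

Under these assumptions the do-nothing baseline reaches 99 percent accuracy while finding none of the positive cases, so a claimed "99 percent" would be no improvement at all on such data.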

Narayanan also emphasized the need to ask questions about an AI model’s false positive rate, meaning how often it wrongly flags a condition that is not actually present. Even if the false positive rate of a hypothetical AI service is just one percent, that could have major consequences if that service ends up screening millions of people for cancer: one percent of every million healthy people screened is 10,000 false alarms.
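The arithmetic behind that warning can be sketched in a few lines of Python. Only the one percent false positive rate comes from the example above; the population size, the 0.5 percent prevalence, the 90 percent sensitivity, and the function name `screening_outcomes` are hypothetical values chosen for illustration.

```python
def screening_outcomes(population, prevalence, sensitivity, false_positive_rate):
    """Expected results of screening `population` people once."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity               # sick people correctly flagged
    false_positives = healthy * false_positive_rate   # healthy people wrongly flagged
    share_correct = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_correct

# Assumed scenario: 1 million people, 0.5% have the disease,
# the service catches 90% of real cases and has a 1% false positive rate.
tp, fp, share = screening_outcomes(1_000_000, 0.005, 0.90, 0.01)

print(f"True positives:  {tp:,.0f}")
print(f"False positives: {fp:,.0f}")
print(f"Share of flagged people who are actually sick: {share:.0%}")
```

Under these assumed numbers, about 9,950 healthy people are flagged against 4,500 real cases, so fewer than a third of the people who receive an alarming result actually have the disease.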

Readers can also consider whether using AI in a given situation offers any meaningful improvement compared to traditional statistical methods, says Clayton Aldern, a data scientist and journalist who serves as managing director for Caldern LLC. He gave the hypothetical example of a “super-duper-fancy deep learning model” that achieves a prediction accuracy of 89 percent, compared to a “little polynomial regression model” that achieves 86 percent on the same dataset.

“We're talking about a three-percentage-point increase on something that you learned about in Algebra 1,” Aldern says. “So is it worth the hype?”

Don’t Ignore the Drawbacks

The hype surrounding AI isn’t just about the technical merits of services and products driven by machine learning. Overblown claims about the beneficial impacts of AI technology—or vague promises to address ethical issues related to deploying it—should also raise red flags.

“If a company promises to use its tech ethically, it is important to question if its business model aligns with that promise,” Narayanan says. “Even if employees have noble intentions, it is unrealistic to expect the company as a whole to resist financial imperatives.”

One example might be a company with a business model that depends on leveraging customers’ personal data. Such companies “tend to make empty promises when it comes to privacy,” Narayanan says. And, if companies hire workers to produce training data, it’s also worth asking whether the companies treat those workers ethically.

The transparency—or lack thereof—about any AI claim can also be telling. A company or research group can minimize concerns by publishing technical claims in peer-reviewed journals or allowing credible third parties to evaluate their AI without giving away big intellectual property secrets, Narayanan says. Excessive secrecy is a big red flag.

With these strategies, you don’t need to be a computer engineer or data scientist to start thinking critically about AI claims. And, Narayanan says, the world needs many people from different backgrounds for societies to fully consider the real-world implications of AI.

Editor’s Note: The original version of this story misspelled Clayton Aldern’s last name as Alderton.