Tag Archives: wrong

#435726 This Is the Most Powerful Robot Arm Ever ...

Last month, engineers at NASA’s Jet Propulsion Laboratory wrapped up the installation of the Mars 2020 rover’s 2.1-meter-long robot arm. This is the most powerful arm ever installed on a Mars rover. Even though the Mars 2020 rover shares much of its design with Curiosity, the new arm was redesigned to be able to do much more complex science, drilling into rocks to collect samples that can be stored for later recovery.

JPL is well known for developing robots that do amazing work in incredibly distant and hostile environments. The Opportunity Mars rover, to name just one example, had a 90-day planned mission but remained operational for 5,498 days in a robot unfriendly place full of dust and wild temperature swings where even the most basic maintenance or repair is utterly impossible. (Its twin rover, Spirit, operated for 2,269 days.)

To learn more about the process behind designing robotic systems that are capable of feats like these, we talked with Matt Robinson, one of the engineers who designed the Mars 2020 rover’s new robot arm.

The Mars 2020 rover (which will be officially named through a public contest which opens this fall) is scheduled to launch in July of 2020, landing in Jezero Crater on February 18, 2021. The overall design is similar to the Mars Science Laboratory (MSL) rover, named Curiosity, which has been exploring Gale Crater on Mars since August 2012, except Mars 2020 will be a bit bigger and capable of doing even more amazing science. It will outweigh Curiosity by about 150 kilograms, but it’s otherwise about the same size, and uses the same type of radioisotope thermoelectric generator for power. Upgraded aluminum wheels will be more durable than Curiosity’s wheels, which have suffered significant wear. Mars 2020 will land on Mars in the same way that Curiosity did, with a mildly insane descent to the surface from a rocket-powered hovering “skycrane.”

Photo: NASA/JPL-Caltech

Last month, engineers at NASA's Jet Propulsion Laboratory install the main robotic arm on the Mars 2020 rover. Measuring 2.1 meters long, the arm will allow the rover to work as a human geologist would: by holding and using science tools with its turret.

Mars 2020 really steps it up when it comes to science. The most interesting new capability (besides serving as the base station for a highly experimental autonomous helicopter) is that the rover will be able to take surface samples of rock and soil, put them into tubes, seal the tubes up, and then cache the tubes on the surface for later retrieval (and potentially return to Earth for analysis). Collecting the samples is the job of a drill on the end of the robot arm that can be equipped with a variety of interchangeable bits, but the arm holds a number of other instruments as well. A “turret” can swap between the drill, a mineral identification sensor suite called SHERLOC, and an X-ray spectrometer and camera called PIXL. Fundamentally, most of Mars 2020’s science work is going to depend on the arm and the hardware that it carries, both in terms of close-up surface investigations and collecting samples for caching.

Matt Robinson is the Deputy Delivery Manager for the Sample Caching System on the Mars 2020 rover, which covers the robotic arm itself, the drill at the end of the arm, and the sample caching system within the body of the rover that manages the samples. Robinson has been at JPL since 2001, and he’s worked on the Mars Phoenix Lander mission as the robotic arm flight software developer and robotic arm test and operations engineer, as well as on Curiosity as the robotic arm test and operations lead engineer.

We spoke with Robinson about how the Mars 2020 arm was designed, and what it’s like to be building robots for exploring other planets.

IEEE Spectrum: How’d you end up working on robots at JPL?

Matt Robinson: When I was a grad student, my focus was on vision-based robotics research, so the kinds of things they do at JPL, or that we do at JPL now, were right within my wheelhouse. One of my advisors in grad school had a former student who was out here at JPL, so that’s how I made the contact. But I was very excited to come to JPL—as a young grad student working in robotics, space robotics was where it’s at.

For a robotics engineer, working in space is kind of the gold standard. You’re working in a challenging environment and you have to be prepared for any time of eventuality that may occur. And when you send your robot out to space, there’s no getting it back.

Once the rover arrives on Mars and you receive pictures back from it operating, there’s no greater feeling. You’ve built something that is now working 200+ million miles away. It’s an awesome experience! I have to pinch myself sometimes with the job I do. Working at JPL on space robotics is the holy grail for a roboticist.

What’s different about designing an arm for a rover that will operate on Mars?

We spent over five years designing, manufacturing, assembling, and testing the arm. Scientists have defined the high-level goals for what the mission has to do—acquire core samples and process them for return, carry science instruments on the arm to help determine what rocks to sample, and so on. We, as engineers, define the next level of requirements that support those goals.

When you’re building a robotic arm for another planet, you want to design something that is robust to the environment as well as robust from fault-protection standpoint. On Mars, we’re talking about an environment where the temperature can vary 100 degrees Celsius over the course of the day, so it’s very challenging thermally. With force sensing for instance, that’s a major problem. Force sensors aren’t typically designed to operate or even survive in temperature ranges that we’re talking about. So a lot of effort has to go into force sensor design and testing.

And then there’s a do-no-harm aspect—you’re sending this piece of hardware 200 million miles away, and you can’t get it back, so you want to make sure your hardware and software are robust and cannot do any harm to the system. It’s definitely a change in mindset from a terrestrial robot, where if you make a mistake, you can repair it.

“Once the rover arrives on Mars and you receive pictures back from it, there’s no greater feeling . . . I have to pinch myself sometimes with the job I do.”
—Matt Robinson, NASA JPL

How do you decide how much redundancy is enough?

That’s always a big question. It comes down to a couple of things, typically: mass and volume. You have a certain amount of mass that’s allocated to the robotic arm and we have a volume that it has to fit within, so those are often the drivers of the amount of redundancy that you can fit. We also have a lot of experience with sending arms to other planets, and at the beginning of projects, we establish a number of requirements that the design has to meet, and that’s where the redundancy is captured.

How much is the design of the arm driven by this need for redundancy, as opposed to trying to pack in all of the instrumentation that you want to have on there to do as much science as possible?

The requirements were driven by a couple of things. We knew roughly how big the instruments on the end of the arm were going to be, so the arm design is partially driven by that, because as the instruments get bigger and heavier, the arm has to get bigger and stronger. We have our coring drill at the end of the arm, and coring requires a certain level of force, so the arm has to be strong enough to do that. Those all became requirements that drove the design of the arm. On top of that, there was also that this arm also has to operate within the Martian environment, so you have things like the temperature changes and thermal expansion—you have to design for that as well. It’s a combination of both, really.

You were a test engineer for the arm used on the MSL rover. What did you learn from Spirit and Opportunity that informed the design of the arm on Curiosity?

Spirit and Opportunity did not have any force-sensing on the robotic arm. We had contact sensors that were good enough. Spirit and Opportunity’s arms were used to place instruments, that’s all it had to do, primarily. When you’re talking about actually acquiring samples, it’s not a matter of just placing the tool—you also have to apply forces to the environment. And once you start doing that, you really need a force sensor to protect you, and also to determine how much load to apply. So that was a big theme, a big difference between MSL and Spirit and Opportunity.

The size grew a lot too. If you look at Spirit and Opportunity, they’re the size of a riding lawnmower. Curiosity and the Mars 2020 rovers are the size of a small car. The Spirit and Opportunity arm was under a meter long, and the 2020 arm is twice that, and it has to apply forces that are much higher than the Spirit and Opportunity arm. From Curiosity to 2020, the payload of the arm grew by 50 percent, but the mass of the arm did not grow a whole lot, because our mass budget was kind of tight. We had to design an arm that was stronger, that had more capability, without adding more mass. That was a big challenge. We were fairly efficient on Curiosity, but on 2020, we sharpened the pencil even more.

Photo: NASA/JPL-Caltech

Three generations of Mars rovers developed at NASA’s Jet Propulsion Laboratory. Front and center: Sojourner rover, which landed on Mars in 1997 as part of the Mars Pathfinder Project. Left: Mars Exploration Rover Project rover (Spirit and Opportunity), which landed on Mars in 2004. Right: Mars Science Laboratory rover (Curiosity), which landed on Mars in August 2012.

MSL used its arm to drill into rocks like Mars 2020 will—how has the experience of operating MSL on Mars changed your thinking on how to make that work?

On MSL, the force sensor was used primarily for fault protection, just to protect the arm from being overloaded. [When drilling] we used a stiffness model of the arm to apply the force. The force sensor was only used in case you overloaded, and that’s very different from doing active force control, where you’re actually using the force sensor in a control loop.

On Mars 2020, we’re taking it to the next step, using the force sensor to actually actively control the level of force, both for pushing on the ground and for doing bit exchange. That’s a key point because fault protection to prevent damage usually has larger error bars. When you’re trying to actually push on the environment to apply force, and you’re doing active force control, the force sensor has to be significantly more accurate.

So a big thing that we learned on MSL—it was the first time we’d actually flown a force sensor, and we learned a lot about how to design and test force sensors to be used on the surface of Mars.

How do you effectively test the Mars 2020 arm on Earth?

That’s a good question. The arm was designed to operate on either Earth or Mars. It’s strong enough to do both. We also have a stiffness model of the arm which includes allows us to compensate for differences in gravity. For testing, we make two copies of the robotic arm. We have our copy that we’re going to fly to Mars, which is what we call our flight model, and we have our engineering model. They’re effectively duplicates of each other. The engineering arm stays on earth, so even once we’ve sent the flight model to Mars, we can continue to test. And if something were to happen, if say a drill bit got stuck in the ground on Mars, we could try to replicate those conditions on Earth with our engineering model arm, and use that to test out different scenarios to overcome the problem.

How much autonomy will the arm have?

We have different models of autonomy. We have pretty high levels flight software and, for instance, we have a command that just says “dock,” that moves the arm does all the force control to the dock the arm with the carousel. For surface interaction, we have stereo cameras on the rover, and those cameras allow us to generate 3D terrain models. Using those 3D terrain models, scientists can select a target on that surface, and then we can position the arm on the target.

Scientists like to select the particular sample targets, because they have very specific types of rocks they’re looking for to sample from. On 2020, we’re providing the ability for the next level of autonomy for the rover to drive up to an area and at least do the initial surveying of that area, so the scientists can select the specific target. So the way that that would happen is, if there’s an area off in the distance that the scientists find potentially interesting, the rover will autonomously drive up to it, and deploy the arm and take all the pictures so that we can generate those 3D terrain models and then the next day the scientists can pick the specific target they want. It’s really cool.

JPL is famous for making robots that operate for far longer than NASA necessarily plans for. What’s it like designing hardware and software for a system that will (hopefully) become part of that legacy?

The way that I look at it is, when you’re building an arm that’s going to go to another planet, all the things that could go wrong… You have to build something that’s robust and that can survive all that. It’s not that we’re trying to overdesign arms so that they’ll end up lasting much, much longer, it’s that, given all the things that you can encounter within a fairly unknown environment, and the level of robustness of the design you have to apply, it just so happens we end up with designs that end up lasting a lot longer than they do. Which is great, but we’re not held to that, although we’re very excited when we see them last that long. Without any calibration, without any maintenance, exactly, it’s amazing. They show their wear over time, but they still operate, it’s super exciting, it’s very inspirational to see.

[ Mars 2020 Rover ] Continue reading

Posted in Human Robots

#435626 Video Friday: Watch Robots Make a Crepe ...

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. Every week, we also post a calendar of upcoming robotics events; here's what we have so far (send us your events!):

Robotronica – August 18, 2019 – Brisbane, Australia
CLAWAR 2019 – August 26-28, 2019 – Kuala Lumpur, Malaysia
IEEE Africon 2019 – September 25-27, 2019 – Accra, Ghana
ISRR 2019 – October 6-10, 2019 – Hanoi, Vietnam
Ro-Man 2019 – October 14-18, 2019 – New Delhi
Humanoids 2019 – October 15-17, 2019 – Toronto
ARSO 2019 – October 31-November 2, 2019 – Beijing
ROSCon 2019 – October 31-November 1, 2019 – Macau
IROS 2019 – November 4-8, 2019 – Macau
Let us know if you have suggestions for next week, and enjoy today's videos.

Team CoSTAR (JPL, MIT, Caltech, KAIST, LTU) has one of the more diverse teams of robots that we’ve seen:

[ Team CoSTAR ]

A team from Carnegie Mellon University and Oregon State University is sending ground and aerial autonomous robots into a Pittsburgh-area mine to prepare for this month’s DARPA Subterranean Challenge.

“Look at that fire extinguisher, what a beauty!” Expect to hear a lot more of that kind of weirdness during SubT.

[ CMU ]

Unitree Robotics is starting to batch-manufacture Laikago Pro quadrupeds, and if you buy four of them, they can carry you around in a chair!

I’m also really liking these videos from companies that are like, “We have a whole bunch of robot dogs now—what weird stuff can we do with them?”

[ Unitree Robotics ]

Why take a handful of pills every day for all the stuff that's wrong with you, when you could take one custom pill instead? Because custom pills are time-consuming to make, that’s why. But robots don’t care!

Multiply Labs’ factory is designed to operate in parallel. All the filling robots and all the quality-control robots are operating at the same time. The robotic arm, in the meanwhile, shuttles dozens of trays up and down the production floor, making sure that each capsule is filled with the right drugs. The manufacturing cell shown in this article can produce 10,000 personalized capsules in an 8-hour shift. A single cell occupies just 128 square feet (12 square meters) on the production floor. This means that a regular production facility (~10,000 square feet, or 929 m2 ) can house 78 cells, for an overall output of 780,000 capsules per shift. This exceeds the output of most traditional manufacturers—while producing unique personalized capsules!

[ Multiply Labs ]

Thanks Fred!

If you’re getting tired of all those annoying drones that sound like giant bees, just have a listen to this turbine-powered one:

[ Malloy Aeronautics ]

In retrospect, it’s kind of amazing that nobody has bothered to put a functional robotic dog head on a quadruped robot before this, right?

Equipped with sensors, high-tech radar imaging, cameras and a directional microphone, this 100-pound (45-kilogram) super-robot is still a “puppy-in-training.” Just like a regular dog, he responds to commands such as “sit,” “stand,” and “lie down.” Eventually, he will be able to understand and respond to hand signals, detect different colors, comprehend many languages, coordinate his efforts with drones, distinguish human faces, and even recognize other dogs.

As an information scout, Astro’s key missions will include detecting guns, explosives and gun residue to assist police, the military, and security personnel. This robodog’s talents won’t just end there, he also can be programmed to assist as a service dog for the visually impaired or to provide medical diagnostic monitoring. The MPCR team also is training Astro to serve as a first responder for search-and-rescue missions such as hurricane reconnaissance as well as military maneuvers.

[ FAU ]

And now this amazing video, “The Coke Thief,” from ICRA 2005 (!):

[ Paper ]

CYBATHLON Series put the focus on one or two of the six disciplines and are organized in cooperation with international universities and partners. The CYBATHLON Arm and Leg Prosthesis Series took place in Karlsruhe, Germany, from 16 to 18 May and was organized in cooperation with the Karlsruhe Institute of Technology (KIT) and the trade fair REHAB Karlsruhe.

The CYBATHLON Wheelchair Series took place in Kawasaki, Japan on 5 May 2019 and was organized in cooperation with the CYBATHLON Wheelchair Series Japan Organizing Committee and supported by the Swiss Embassy.

[ Cybathlon ]

Rainbow crepe robot!

There’s also this other robot, which I assume does something besides what's in the video, because otherwise it appears to be a massively overengineered way of shaping cooked rice into a chubby triangle.

[ PC Watch ]

The Weaponized Plastic Fighting League at Fetch Robotics has had another season of shardation, deintegration, explodification, and other -tions. Here are a couple fan favorite match videos:

[ Fetch Robotics ]

This video is in German, but it’s worth watching for the three seconds of extremely satisfying footage showing a robot twisting dough into pretzels.

[ Festo ]

Putting brains into farming equipment is a no-brainer, since it’s a semi-structured environment that's generally clear of wayward humans driving other vehicles.

[ Lovol ]

Thanks Fan!

Watch some robots assemble suspiciously Lego-like (but definitely not actually Lego) minifigs.

[ DevLinks ]

The Robotics Innovation Facility (RIFBristol) helps businesses, entrepreneurs, researchers and public sector bodies to embrace the concept of ‘Industry 4.0'. From training your staff in robotics, and demonstrating how automation can improve your manufacturing processes, to prototyping and validating your new innovations—we can provide the support you need.

[ RIF ]

Ryan Gariepy from Clearpath Robotics (and a bunch of other stuff) gave a talk at ICRA with the title of “Move Fast and (Don’t) Break Things: Commercializing Robotics at the Speed of Venture Capital,” which is more interesting when you know that this year’s theme was “Notable Failures.”

[ Clearpath Robotics ]

In this week’s episode of Robots in Depth, Per interviews Michael Nielsen, a computer vision researcher at the Danish Technological Institute.

Michael worked with a fusion of sensors like stereo vision, thermography, radar, lidar and high-frame-rate cameras, merging multiple images for high dynamic range. All this, to be able to navigate the tricky situation in a farm field where you need to navigate close to or even in what is grown. Multibaseline cameras were also used to provide range detection over a wide range of distances.

We also learn about how he expanded his work into sorting recycling, a very challenging problem. We also hear about the problems faced when using time of flight and sheet of light cameras. He then shares some good results using stereo vision, especially combined with blue light random dot projectors.

[ Robots in Depth ] Continue reading

Posted in Human Robots

#435621 ANYbotics Introduces Sleek New ANYmal C ...

Quadrupedal robots are making significant advances lately, and just in the past few months we’ve seen Boston Dynamics’ Spot hauling a truck, IIT’s HyQReal pulling a plane, MIT’s MiniCheetah doing backflips, Unitree Robotics’ Laikago towing a van, and Ghost Robotics’ Vision 60 exploring a mine. Robot makers are betting that their four-legged machines will prove useful in a variety of applications in construction, security, delivery, and even at home.

ANYbotics has been working on such applications for years, testing out their ANYmal robot in places where humans typically don’t want to go (like offshore platforms) as well as places where humans really don’t want to go (like sewers), and they have a better idea than most companies what can make quadruped robots successful.

This week, ANYbotics is announcing a completely new quadruped platform, ANYmal C, a major upgrade from the really quite research-y ANYmal B. The new quadruped has been optimized for ruggedness and reliability in industrial environments, with a streamlined body painted a color that lets you know it means business.

ANYmal C’s physical specs are pretty impressive for a production quadruped. It can move at 1 meter per second, manage 20-degree slopes and 45-degree stairs, cross 25-centimeter gaps, and squeeze through passages just 60 centimeters wide. It’s packed with cameras and 3D sensors, including a lidar for 3D mapping and simultaneous localization and mapping (SLAM). All these sensors (along with the vast volume of gait research that’s been done with ANYmal) make this one of the most reliably autonomous quadrupeds out there, with real-time motion planning and obstacle avoidance.

Image: ANYbotics

ANYmal can autonomously attach itself to a cone-shaped docking station to recharge.

ANYmal C is also one of the ruggedest legged robots in existence. The 50-kilogram robot is IP67 rated, meaning that it’s completely impervious to dust and can withstand being submerged in a meter of water for an hour. If it’s submerged for longer than that, you’re absolutely doing something wrong. The robot will run for over 2 hours on battery power, and if that’s not enough endurance, don’t worry, because ANYmal can autonomously impale itself on a weird cone-shaped docking station to recharge.

Photo: ANYbotics

ANYmal C’s sensor payload includes cameras and a lidar for 3D mapping and SLAM.

As far as what ANYmal C is designed to actually do, it’s mostly remote inspection tasks where you need to move around through a relatively complex environment, but where for whatever reason you’d be better off not sending a human. ANYmal C has a sensor payload that gives it lots of visual options, like thermal imaging, and with the ability to handle a 10-kilogram payload, the robot can be adapted to many different environments.

Over the next few months, we’re hoping to see more examples of ANYmal C being deployed to do useful stuff in real-world environments, but for now, we do have a bit more detail from ANYbotics CTO Christian Gehring.

IEEE Spectrum: Can you tell us about the development process for ANYmal C?

Christian Gehring: We tested the previous generation of ANYmal (B) in a broad range of environments over the last few years and gained a lot of insights. Based on our learnings, it became clear that we would have to re-design the robot to meet the requirements of industrial customers in terms of safety, quality, reliability, and lifetime. There were different prototype stages both for the new drives and for single robot assemblies. Apart from electrical tests, we thoroughly tested the thermal control and ingress protection of various subsystems like the depth cameras and actuators.

What can ANYmal C do that the previous version of ANYmal can’t?

ANYmal C was redesigned with a focus on performance increase regarding actuation (new drives), computational power (new hexacore Intel i7 PCs), locomotion and navigation skills, and autonomy (new depth cameras). The new robot additionally features a docking system for autonomous recharging and an inspection payload as an option. The design of ANYmal C is far more integrated than its predecessor, which increases both performance and reliability.

How much of ANYmal C’s development and design was driven by your experience with commercial or industry customers?

Tests (such as the offshore installation with TenneT) and discussions with industry customers were important to get the necessary design input in terms of performance, safety, quality, reliability, and lifetime. Most customers ask for very similar inspection tasks that can be performed with our standard inspection payload and the required software packages. Some are looking for a robot that can also solve some simple manipulation tasks like pushing a button. Overall, most use cases customers have in mind are realistic and achievable, but some are really tough for the robot, like climbing 50° stairs in hot environments of 50°C.

Can you describe how much autonomy you expect ANYmal C to have in industrial or commercial operations?

ANYmal C is primarily developed to perform autonomous routine inspections in industrial environments. This autonomy especially adds value for operations that are difficult to access, as human operation is extremely costly. The robot can naturally also be operated via a remote control and we are working on long-distance remote operation as well.

Do you expect that researchers will be interested in ANYmal C? What research applications could it be useful for?

ANYmal C has been designed to also address the needs of the research community. The robot comes with two powerful hexacore Intel i7 computers and can additionally be equipped with an NVIDIA Jetson Xavier graphics card for learning-based applications. Payload interfaces enable users to easily install and test new sensors. By joining our established ANYmal Research community, researchers get access to simulation tools and software APIs, which boosts their research in various areas like control, machine learning, and navigation.

[ ANYmal C ] Continue reading

Posted in Human Robots

#435614 3 Easy Ways to Evaluate AI Claims

When every other tech startup claims to use artificial intelligence, it can be tough to figure out if an AI service or product works as advertised. In the midst of the AI “gold rush,” how can you separate the nuggets from the fool’s gold?

There’s no shortage of cautionary tales involving overhyped AI claims. And applying AI technologies to health care, education, and law enforcement mean that getting it wrong can have real consequences for society—not just for investors who bet on the wrong unicorn.

So IEEE Spectrum asked experts to share their tips for how to identify AI hype in press releases, news articles, research papers, and IPO filings.

“It can be tricky, because I think the people who are out there selling the AI hype—selling this AI snake oil—are getting more sophisticated over time,” says Tim Hwang, director of the Harvard-MIT Ethics and Governance of AI Initiative.

The term “AI” is perhaps most frequently used to describe machine learning algorithms (and deep learning algorithms, which require even less human guidance) that analyze huge amounts of data and make predictions based on patterns that humans might miss. These popular forms of AI are mostly suited to specialized tasks, such as automatically recognizing certain objects within photos. For that reason, they are sometimes described as “weak” or “narrow” AI.

Some researchers and thought leaders like to talk about the idea of “artificial general intelligence” or “strong AI” that has human-level capacity and flexibility to handle many diverse intellectual tasks. But for now, this type of AI remains firmly in the realm of science fiction and is far from being realized in the real world.

“AI has no well-defined meaning and many so-called AI companies are simply trying to take advantage of the buzz around that term,” says Arvind Narayanan, a computer scientist at Princeton University. “Companies have even been caught claiming to use AI when, in fact, the task is done by human workers.”

Here are three ways to recognize AI hype.

Look for Buzzwords
One red flag is what Hwang calls the “hype salad.” This means stringing together the term “AI” with many other tech buzzwords such as “blockchain” or “Internet of Things.” That doesn’t automatically disqualify the technology, but spotting a high volume of buzzwords in a post, pitch, or presentation should raise questions about what exactly the company or individual has developed.

Other experts agree that strings of buzzwords can be a red flag. That’s especially true if the buzzwords are never really explained in technical detail, and are simply tossed around as vague, poorly-defined terms, says Marzyeh Ghassemi, a computer scientist and biomedical engineer at the University of Toronto in Canada.

“I think that if it looks like a Google search—picture ‘interpretable blockchain AI deep learning medicine’—it's probably not high-quality work,” Ghassemi says.

Hwang also suggests mentally replacing all mentions of “AI” in an article with the term “magical fairy dust.” It’s a way of seeing whether an individual or organization is treating the technology like magic. If so—that’s another good reason to ask more questions about what exactly the AI technology involves.

And even the visual imagery used to illustrate AI claims can indicate that an individual or organization is overselling the technology.

“I think that a lot of the people who work on machine learning on a day-to-day basis are pretty humble about the technology, because they’re largely confronted with how frequently it just breaks and doesn't work,” Hwang says. “And so I think that if you see a company or someone representing AI as a Terminator head, or a big glowing HAL eye or something like that, I think it’s also worth asking some questions.”

Interrogate the Data

It can be hard to evaluate AI claims without any relevant expertise, says Ghassemi at the University of Toronto. Even experts need to know the technical details of the AI algorithm in question and have some access to the training data that shaped the AI model’s predictions. Still, savvy readers with some basic knowledge of applied statistics can search for red flags.

To start, readers can look for possible bias in training data based on small sample sizes or a skewed population that fails to reflect the broader population, Ghassemi says. After all, an AI model trained only on health data from white men would not necessarily achieve similar results for other populations of patients.

“For me, a red flag is not demonstrating deep knowledge of how your labels are defined.”
—Marzyeh Ghassemi, University of Toronto

How machine learning and deep learning models perform also depends on how well humans labeled the sample datasets use to train these programs. This task can be straightforward when labeling photos of cats versus dogs, but gets more complicated when assigning disease diagnoses to certain patient cases.

Medical experts frequently disagree with each other on diagnoses—which is why many patients seek a second opinion. Not surprisingly, this ambiguity can also affect the diagnostic labels that experts assign in training datasets. “For me, a red flag is not demonstrating deep knowledge of how your labels are defined,” Ghassemi says.

Such training data can also reflect the cultural stereotypes and biases of the humans who labeled the data, says Narayanan at Princeton University. Like Ghassemi, he recommends taking a hard look at exactly what the AI has learned: “A good way to start critically evaluating AI claims is by asking questions about the training data.”

Another red flag is presenting an AI system’s performance through a single accuracy figure without much explanation, Narayanan says. Claiming that an AI model achieves “99 percent” accuracy doesn’t mean much without knowing the baseline for comparison—such as whether other systems have already achieved 99 percent accuracy—or how well that accuracy holds up in situations beyond the training dataset.

Narayanan also emphasized the need to ask questions about an AI model’s false positive rate—the rate of making wrong predictions about the presence of a given condition. Even if the false positive rate of a hypothetical AI service is just one percent, that could have major consequences if that service ends up screening millions of people for cancer.

Readers can also consider whether using AI in a given situation offers any meaningful improvement compared to traditional statistical methods, says Clayton Aldern, a data scientist and journalist who serves as managing director for Caldern LLC. He gave the hypothetical example of a “super-duper-fancy deep learning model” that achieves a prediction accuracy of 89 percent, compared to a “little polynomial regression model” that achieves 86 percent on the same dataset.

“We're talking about a three-percentage-point increase on something that you learned about in Algebra 1,” Aldern says. “So is it worth the hype?”

Don’t Ignore the Drawbacks

The hype surrounding AI isn’t just about the technical merits of services and products driven by machine learning. Overblown claims about the beneficial impacts of AI technology—or vague promises to address ethical issues related to deploying it—should also raise red flags.

“If a company promises to use its tech ethically, it is important to question if its business model aligns with that promise,” Narayanan says. “Even if employees have noble intentions, it is unrealistic to expect the company as a whole to resist financial imperatives.”

One example might be a company with a business model that depends on leveraging customers’ personal data. Such companies “tend to make empty promises when it comes to privacy,” Narayanan says. And, if companies hire workers to produce training data, it’s also worth asking whether the companies treat those workers ethically.

The transparency—or lack thereof—about any AI claim can also be telling. A company or research group can minimize concerns by publishing technical claims in peer-reviewed journals or allowing credible third parties to evaluate their AI without giving away big intellectual property secrets, Narayanan says. Excessive secrecy is a big red flag.

With these strategies, you don’t need to be a computer engineer or data scientist to start thinking critically about AI claims. And, Narayanan says, the world needs many people from different backgrounds for societies to fully consider the real-world implications of AI.

Editor’s Note: The original version of this story misspelled Clayton Aldern’s last name as Alderton. Continue reading

Posted in Human Robots

#435127 Teaching AI the Concept of ‘Similar, ...

As a human you instinctively know that a leopard is closer to a cat than a motorbike, but the way we train most AI makes them oblivious to these kinds of relations. Building the concept of similarity into our algorithms could make them far more capable, writes the author of a new paper in Science Robotics.

Convolutional neural networks have revolutionized the field of computer vision to the point that machines are now outperforming humans on some of the most challenging visual tasks. But the way we train them to analyze images is very different from the way humans learn, says Atsuto Maki, an associate professor at KTH Royal Institute of Technology.

“Imagine that you are two years old and being quizzed on what you see in a photo of a leopard,” he writes. “You might answer ‘a cat’ and your parents might say, ‘yeah, not quite but similar’.”

In contrast, the way we train neural networks rarely gives that kind of partial credit. They are typically trained to have very high confidence in the correct label and consider all incorrect labels, whether ”cat” or “motorbike,” equally wrong. That’s a mistake, says Maki, because ignoring the fact that something can be “less wrong” means you’re not exploiting all of the information in the training data.

Even when models are trained this way, there will be small differences in the probabilities assigned to incorrect labels that can tell you a lot about how well the model can generalize what it has learned to unseen data.

If you show a model a picture of a leopard and it gives “cat” a probability of five percent and “motorbike” one percent, that suggests it picked up on the fact that a cat is closer to a leopard than a motorbike. In contrast, if the figures are the other way around it means the model hasn’t learned the broad features that make cats and leopards similar, something that could potentially be helpful when analyzing new data.

If we could boost this ability to identify similarities between classes we should be able to create more flexible models better able to generalize, says Maki. And recent research has demonstrated how variations of an approach called regularization might help us achieve that goal.

Neural networks are prone to a problem called “overfitting,” which refers to a tendency to pay too much attention to tiny details and noise specific to their training set. When that happens, models will perform excellently on their training data but poorly when applied to unseen test data without these particular quirks.

Regularization is used to circumvent this problem, typically by reducing the network’s capacity to learn all this unnecessary information and therefore boost its ability to generalize to new data. Techniques are varied, but generally involve modifying the network’s structure or the strength of the weights between artificial neurons.

More recently, though, researchers have suggested new regularization approaches that work by encouraging a broader spread of probabilities across all classes. This essentially helps them capture more of the class similarities, says Maki, and therefore boosts their ability to generalize.

One such approach was devised in 2017 by Google Brain researchers, led by deep learning pioneer Geoffrey Hinton. They introduced a penalty to their training process that directly punished overconfident predictions in the model’s outputs, and a technique called label smoothing that prevents the largest probability becoming much larger than all others. This meant the probabilities were lower for correct labels and higher for incorrect ones, which was found to boost performance of models on varied tasks from image classification to speech recognition.

Another came from Maki himself in 2017 and achieves the same goal, but by suppressing high values in the model’s feature vector—the mathematical construct that describes all of an object’s important characteristics. This has a knock-on effect on the spread of output probabilities and also helped boost performance on various image classification tasks.

While it’s still early days for the approach, the fact that humans are able to exploit these kinds of similarities to learn more efficiently suggests that models that incorporate them hold promise. Maki points out that it could be particularly useful in applications such as robotic grasping, where distinguishing various similar objects is important.

Image Credit: Marianna Kalashnyk / Shutterstock.com Continue reading

Posted in Human Robots