Tag Archives: recognition
#437974 China Wants to Be the World’s AI ...
China’s star has been steadily rising for decades. Besides slashing extreme poverty rates from 88 percent to under 2 percent in just 30 years, the country has become a global powerhouse in manufacturing and technology. Its pace of growth may slow due to an aging population, but China is nonetheless one of the world’s biggest players in multiple cutting-edge tech fields.
One of these fields, and perhaps the most significant, is artificial intelligence. The Chinese government announced a plan in 2017 to become the world leader in AI by 2030, and has since poured billions of dollars into AI projects and research across academia, government, and private industry. The government’s venture capital fund is investing over $30 billion in AI; the northeastern city of Tianjin budgeted $16 billion for advancing AI; and a $2 billion AI research park is being built in Beijing.
On top of these huge investments, the government and private companies in China have access to an unprecedented quantity of data, on everything from citizens’ health to their smartphone use. WeChat, a multi-functional app where people can chat, date, send payments, hail rides, read news, and more, gives the CCP full access to user data upon request; as one BBC journalist put it, WeChat “was ahead of the game on the global stage and it has found its way into all corners of people’s existence. It could deliver to the Communist Party a life map of pretty much everybody in this country, citizens and foreigners alike.” And that’s just one (albeit big) source of data.
Many believe these factors are giving China a serious leg up in AI development, even providing enough of a boost that its progress will surpass that of the US.
But there’s more to AI than data, and there’s more to progress than investing billions of dollars. Analyzing China’s potential to become a world leader in AI—or in any technology that requires consistent innovation—from multiple angles provides a more nuanced picture of its strengths and limitations. In a June 2020 article in Foreign Affairs, Oxford fellows Carl Benedikt Frey and Michael Osborne argued that China’s big advantages may not actually be that advantageous in the long run—and its limitations may be very limiting.
Moving the AI Needle
To get an idea of who’s likely to take the lead in AI, it could help to first consider how the technology will advance beyond its current state.
To put it plainly, AI is somewhat stuck at the moment. Algorithms and neural networks continue to achieve new and impressive feats—like DeepMind’s AlphaFold accurately predicting protein structures or OpenAI’s GPT-3 writing convincing articles based on short prompts—but for the most part these systems’ capabilities are still defined as narrow intelligence: completing a specific task for which the system was painstakingly trained on loads of data.
(It’s worth noting here that some have speculated OpenAI’s GPT-3 may be an exception, the first example of machine intelligence that, while not “general,” has surpassed the definition of “narrow”; the algorithm was trained to write text, but ended up being able to translate between languages, write code, autocomplete images, do math, and perform other language-related tasks it wasn’t specifically trained for. However, all of GPT-3’s capabilities are limited to skills it learned in the language domain, whether spoken, written, or programming language).
Both AlphaFold’s and GPT-3’s success was due largely to the massive datasets they were trained on; no revolutionary new training methods or architectures were involved. If all it was going to take to advance AI was a continuation or scaling-up of this paradigm—more input data yields increased capability—China could well have an advantage.
But one of the biggest hurdles AI needs to clear to advance in leaps and bounds rather than baby steps is precisely this reliance on extensive, task-specific data. Other significant challenges include the technology’s fast approach to the limits of current computing power and its immense energy consumption.
Thus, while China’s trove of data may give it an advantage now, it may not be much of a long-term foothold on the climb to AI dominance. It’s useful for building products that incorporate or rely on today’s AI, but not for pushing the needle on how artificially intelligent systems learn. WeChat data on users’ spending habits, for example, would be valuable in building an AI that helps people save money or suggests items they might want to purchase. It will enable (and already has enabled) highly tailored products that will earn their creators and the companies that use them a lot of money.
But data quantity isn’t what’s going to advance AI. As Frey and Osborne put it, “Data efficiency is the holy grail of further progress in artificial intelligence.”
To that end, research teams in academia and private industry are working on ways to make AI less data-hungry. New training methods like one-shot learning and less-than-one-shot learning have begun to emerge, along with myriad efforts to make AI that learns more like the human brain.
While not insignificant, these advancements still fall into the “baby steps” category. No one knows how AI is going to progress beyond these small steps—and that uncertainty, in Frey and Osborne’s opinion, is a major speed bump on China’s fast-track to AI dominance.
How Innovation Happens
A lot of great inventions have happened by accident, and some of the world’s most successful companies started in garages, dorm rooms, or similarly low-budget, nondescript circumstances (including Google, Facebook, Amazon, and Apple, to name a few). Innovation, the authors point out, often happens “through serendipity and recombination, as inventors and entrepreneurs interact and exchange ideas.”
Frey and Osborne argue that although China has great reserves of talent and a history of building on technologies conceived elsewhere, it doesn’t yet have a glowing track record in terms of innovation. They note that of the 100 most-cited patents from 2003 to present, none came from China. Giants Tencent, Alibaba, and Baidu are all wildly successful in the Chinese market, but they’re rooted in technologies or business models that came out of the US and were tweaked for the Chinese population.
“The most innovative societies have always been those that allowed people to pursue controversial ideas,” Frey and Osborne write. China’s heavy censorship of the internet and surveillance of citizens don’t quite encourage the pursuit of controversial ideas. The country’s social credit system rewards people who follow the rules and punishes those who step out of line. Frey adds that top-down execution of problem-solving is effective when the problem at hand is clearly defined—and the next big leaps in AI are not.
It’s debatable how strongly a culture of social conformism can impact technological innovation, and of course there can be exceptions. But a relevant historical example is the Soviet Union, which, despite heavy investment in science and technology that briefly rivaled the US in fields like nuclear energy and space exploration, ended up lagging far behind primarily due to political and cultural factors.
Similarly, China’s focus on computer science in its education system could give it an edge—but, as Frey told me in an email, “The best students are not necessarily the best researchers. Being a good researcher also requires coming up with new ideas.”
Winner Take All?
Beyond the question of whether China will achieve AI dominance is the issue of how it will use the powerful technology. Several of the ways China has already implemented AI could be considered morally questionable, from facial recognition systems used aggressively against ethnic minorities to smart glasses for policemen that can pull up information about whoever the wearer looks at.
This isn’t to say the US would use AI for purely ethical purposes. The military’s Project Maven, for example, used artificially intelligent algorithms to identify insurgent targets in Iraq and Syria, and American law enforcement agencies are also using (mostly unregulated) facial recognition systems.
It’s conceivable that “dominance” in AI won’t go to one country; each nation could meet milestones in different ways, or meet different milestones. Researchers from both countries, at least in the academic sphere, could (and likely will) continue to collaborate and share their work, as they’ve done on many projects to date.
If one country does take the lead, it will certainly see some major advantages as a result. Brookings Institute fellow Indermit Gill goes so far as to say that whoever leads in AI in 2030 will “rule the world” until 2100. But Gill points out that in addition to considering each country’s strengths, we should consider how willing they are to improve upon their weaknesses.
While China leads in investment and the US in innovation, both nations are grappling with huge economic inequalities that could negatively impact technological uptake. “Attitudes toward the social change that accompanies new technologies matter as much as the technologies, pointing to the need for complementary policies that shape the economy and society,” Gill writes.
Will China’s leadership be willing to relax its grip to foster innovation? Will the US business environment be enough to compete with China’s data, investment, and education advantages? And can both countries find a way to distribute technology’s economic benefits more equitably?
Time will tell, but it seems we’ve got our work cut out for us—and China does too.
Image Credit: Adam Birkett on Unsplash Continue reading
#437924 How a Software Map of the Entire Planet ...
i
“3D map data is the scaffolding of the 21st century.”
–Edward Miller, Founder, Scape Technologies, UK
Covered in cameras, sensors, and a distinctly spaceship looking laser system, Google’s autonomous vehicles were easy to spot when they first hit public roads in 2015. The key hardware ingredient is a spinning laser fixed to the roof, called lidar, which provides the car with a pair of eyes to see the world. Lidar works by sending out beams of light and measuring the time it takes to bounce off objects back to the source. By timing the light’s journey, these depth-sensing systems construct fully 3D maps of their surroundings.
3D maps like these are essentially software copies of the real world. They will be crucial to the development of a wide range of emerging technologies including autonomous driving, drone delivery, robotics, and a fast-approaching future filled with augmented reality.
Like other rapidly improving technologies, lidar is moving quickly through its development cycle. What was an expensive technology on the roof of a well-funded research project is now becoming cheaper, more capable, and readily available to consumers. At some point, lidar will come standard on most mobile devices and is now available to early-adopting owners of the iPhone 12 Pro.
Consumer lidar represents the inevitable shift from wealthy tech companies generating our world’s map data, to a more scalable crowd-sourced approach. To develop the repository for their Street View Maps product, Google reportedly spent $1-2 billion sending cars across continents photographing every street. Compare that to a live-mapping service like Waze, which uses crowd-sourced user data from its millions of users to generate accurate and real-time traffic conditions. Though these maps serve different functions, one is a static, expensive, unchanging map of the world while the other is dynamic, real-time, and constructed by users themselves.
Soon millions of people may be scanning everything from bedrooms to neighborhoods, resulting in 3D maps of significant quality. An online search for lidar room scans demonstrates just how richly textured these three-dimensional maps are compared to anything we’ve had before. With lidar and other depth-sensing systems, we now have the tools to create exact software copies of everywhere and everything on earth.
At some point, likely aided by crowdsourcing initiatives, these maps will become living breathing, real-time representations of the world. Some refer to this idea as a “digital twin” of the planet. In a feature cover story, Kevin Kelly, the cofounder of Wired magazine, calls this concept the “mirrorworld,” a one-to-one software map of everything.
So why is that such a big deal? Take augmented reality as an example.
Of all the emerging industries dependent on such a map, none are more invested in seeing this concept emerge than those within the AR landscape. Apple, for example, is not-so-secretly developing a pair of AR glasses, which they hope will deliver a mainstream turning point for the technology.
For Apple’s AR devices to work as anticipated, they will require virtual maps of the world, a concept AR insiders call the “AR cloud,” which is synonymous with the “mirrorworld” concept. These maps will be two things. First, they will be a tool that creators use to place AR content in very specific locations; like a world canvas to paint on. Second, they will help AR devices both locate and understand the world around them so they can render content in a believable way.
Imagine walking down a street wanting to check the trading hours of a local business. Instead of pulling out your phone to do a tedious search online, you conduct the equivalent of a visual google search simply by gazing at the store. Albeit a trivial example, the AR cloud represents an entirely non-trivial new way of managing how we organize the world’s information. Access to knowledge can be shifted away from the faraway monitors in our pocket, to its relevant real-world location.
Ultimately this describes a blurring of physical and digital infrastructure. Our public and private spaces will thus be comprised equally of both.
No example demonstrates this idea better than Pokémon Go. The game is straightforward enough; users capture virtual characters scattered around the real world. Today, the game relies on traditional GPS technology to place its characters, but GPS is accurate only to within a few meters of a location. For a car navigating on a highway or locating Pikachus in the world, that level of precision is sufficient. For drone deliveries, driverless cars, or placing a Pikachu in a specific location, say on a tree branch in a park, GPS isn’t accurate enough. As astonishing as it may seem, many experimental AR cloud concepts, even entirely mapped cities, are location specific down to the centimeter.
Niantic, the $4 billion publisher behind Pokémon Go, is aggressively working on developing a crowd-sourced approach to building better AR Cloud maps by encouraging their users to scan the world for them. Their recent acquisition of 6D.ai, a mapping software company developed by the University of Oxford’s Victor Prisacariu through his work at Oxford’s Active Vision Lab, indicates Niantic’s ambition to compete with the tech giants in this space.
With 6D.ai’s technology, Niantic is developing the in-house ability to generate their own 3D maps while gaining better semantic understanding of the world. By going beyond just knowing there’s a temporary collection of orange cones in a certain location, for example, the game may one day understand the meaning behind this; that a temporary construction zone means no Pokémon should spawn here to avoid drawing players to this location.
Niantic is not the only company working on this. Many of the big tech firms you would expect have entire teams focused on map data. Facebook, for example, recently acquired the UK-based Scape technologies, a computer vision startup mapping entire cities with centimeter precision.
As our digital maps of the world improve, expect a relentless and justified discussion of privacy concerns as well. How will society react to the idea of a real-time 3D map of their bedroom living on a Facebook or Amazon server? Those horrified by the use of facial recognition AI being used in public spaces are unlikely to find comfort in the idea of a machine-readable world subject to infinite monitoring.
The ability to build high-precision maps of the world could reshape the way we engage with our planet and promises to be one of the biggest technology developments of the next decade. While these maps may stay hidden as behind-the-scenes infrastructure powering much flashier technologies that capture the world’s attention, they will soon prop up large portions of our technological future.
Keep that in mind when a car with no driver is sharing your road.
Image credit: sergio souza / Pexels Continue reading
#437809 Q&A: The Masterminds Behind ...
Illustration: iStockphoto
Getting a car to drive itself is undoubtedly the most ambitious commercial application of artificial intelligence (AI). The research project was kicked into life by the 2004 DARPA Urban Challenge and then taken up as a business proposition, first by Alphabet, and later by the big automakers.
The industry-wide effort vacuumed up many of the world’s best roboticists and set rival companies on a multibillion-dollar acquisitions spree. It also launched a cycle of hype that paraded ever more ambitious deadlines—the most famous of which, made by Alphabet’s Sergei Brin in 2012, was that full self-driving technology would be ready by 2017. Those deadlines have all been missed.
Much of the exhilaration was inspired by the seeming miracles that a new kind of AI—deep learning—was achieving in playing games, recognizing faces, and transliterating voices. Deep learning excels at tasks involving pattern recognition—a particular challenge for older, rule-based AI techniques. However, it now seems that deep learning will not soon master the other intellectual challenges of driving, such as anticipating what human beings might do.
Among the roboticists who have been involved from the start are Gill Pratt, the chief executive officer of Toyota Research Institute (TRI) , formerly a program manager at the Defense Advanced Research Projects Agency (DARPA); and Wolfram Burgard, vice president of automated driving technology for TRI and president of the IEEE Robotics and Automation Society. The duo spoke with IEEE Spectrum’s Philip Ross at TRI’s offices in Palo Alto, Calif.
This interview has been condensed and edited for clarity.
IEEE Spectrum: How does AI handle the various parts of the self-driving problem?
Photo: Toyota
Gill Pratt
Gill Pratt: There are three different systems that you need in a self-driving car: It starts with perception, then goes to prediction, and then goes to planning.
The one that by far is the most problematic is prediction. It’s not prediction of other automated cars, because if all cars were automated, this problem would be much more simple. How do you predict what a human being is going to do? That’s difficult for deep learning to learn right now.
Spectrum: Can you offset the weakness in prediction with stupendous perception?
Photo: Toyota Research Institute for Burgard
Wolfram Burgard
Wolfram Burgard: Yes, that is what car companies basically do. A camera provides semantics, lidar provides distance, radar provides velocities. But all this comes with problems, because sometimes you look at the world from different positions—that’s called parallax. Sometimes you don’t know which range estimate that pixel belongs to. That might make the decision complicated as to whether that is a person painted onto the side of a truck or whether this is an actual person.
With deep learning there is this promise that if you throw enough data at these networks, it’s going to work—finally. But it turns out that the amount of data that you need for self-driving cars is far larger than we expected.
Spectrum: When do deep learning’s limitations become apparent?
Pratt: The way to think about deep learning is that it’s really high-performance pattern matching. You have input and output as training pairs; you say this image should lead to that result; and you just do that again and again, for hundreds of thousands, millions of times.
Here’s the logical fallacy that I think most people have fallen prey to with deep learning. A lot of what we do with our brains can be thought of as pattern matching: “Oh, I see this stop sign, so I should stop.” But it doesn’t mean all of intelligence can be done through pattern matching.
“I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur?”
—Gill Pratt, Toyota Research Institute
For instance, when I’m driving and I see a mother holding the hand of a child on a corner and trying to cross the street, I am pretty sure she’s not going to cross at a red light and jaywalk. I know from my experience being a human being that mothers and children don’t act that way. On the other hand, say there are two teenagers—with blue hair, skateboards, and a disaffected look. Are they going to jaywalk? I look at that, you look at that, and instantly the probability in your mind that they’ll jaywalk is much higher than for the mother holding the hand of the child. It’s not that you’ve seen 100,000 cases of young kids—it’s that you understand what it is to be either a teenager or a mother holding a child’s hand.
You can try to fake that kind of intelligence. If you specifically train a neural network on data like that, you could pattern-match that. But you’d have to know to do it.
Spectrum: So you’re saying that when you substitute pattern recognition for reasoning, the marginal return on the investment falls off pretty fast?
Pratt: That’s absolutely right. Unfortunately, we don’t have the ability to make an AI that thinks yet, so we don’t know what to do. We keep trying to use the deep-learning hammer to hammer more nails—we say, well, let’s just pour more data in, and more data.
Spectrum: Couldn’t you train the deep-learning system to recognize teenagers and to assign the category a high propensity for jaywalking?
Burgard: People have been doing that. But it turns out that these heuristics you come up with are extremely hard to tweak. Also, sometimes the heuristics are contradictory, which makes it extremely hard to design these expert systems based on rules. This is where the strength of the deep-learning methods lies, because somehow they encode a way to see a pattern where, for example, here’s a feature and over there is another feature; it’s about the sheer number of parameters you have available.
Our separation of the components of a self-driving AI eases the development and even the learning of the AI systems. Some companies even think about using deep learning to do the job fully, from end to end, not having any structure at all—basically, directly mapping perceptions to actions.
Pratt: There are companies that have tried it; Nvidia certainly tried it. In general, it’s been found not to work very well. So people divide the problem into blocks, where we understand what each block does, and we try to make each block work well. Some of the blocks end up more like the expert system we talked about, where we actually code things, and other blocks end up more like machine learning.
Spectrum: So, what’s next—what new technique is in the offing?
Pratt: If I knew the answer, we’d do it. [Laughter]
Spectrum: You said that if all cars on the road were automated, the problem would be easy. Why not “geofence” the heck out of the self-driving problem, and have areas where only self-driving cars are allowed?
Pratt: That means putting in constraints on the operational design domain. This includes the geography—where the car should be automated; it includes the weather, it includes the level of traffic, it includes speed. If the car is going slow enough to avoid colliding without risking a rear-end collision, that makes the problem much easier. Street trolleys operate with traffic still in some parts of the world, and that seems to work out just fine. People learn that this vehicle may stop at unexpected times. My suspicion is, that is where we’ll see Level 4 autonomy in cities. It’s going to be in the lower speeds.
“We are now in the age of deep learning, and we don’t know what will come after.”
—Wolfram Burgard, Toyota Research Institute
That’s a sweet spot in the operational design domain, without a doubt. There’s another one at high speed on a highway, because access to highways is so limited. But unfortunately there is still the occasional debris that suddenly crosses the road, and the weather gets bad. The classic example is when somebody irresponsibly ties a mattress to the top of a car and it falls off; what are you going to do? And the answer is that terrible things happen—even for humans.
Spectrum: Learning by doing worked for the first cars, the first planes, the first steam boilers, and even the first nuclear reactors. We ran risks then; why not now?
Pratt: It has to do with the times. During the era where cars took off, all kinds of accidents happened, women died in childbirth, all sorts of diseases ran rampant; the expected characteristic of life was that bad things happened. Expectations have changed. Now the chance of dying in some freak accident is quite low because of all the learning that’s gone on, the OSHA [Occupational Safety and Health Administration] rules, UL code for electrical appliances, all the building standards, medicine.
Furthermore—and we think this is very important—we believe that empathy for a human being at the wheel is a significant factor in public acceptance when there is a crash. We don’t know this for sure—it’s a speculation on our part. I’ve driven, I’ve had close calls; that could have been me that made that mistake and had that wreck. I think people are more tolerant when somebody else makes mistakes, and there’s an awful crash. In the case of an automated car, we worry that that empathy won’t be there.
Photo: Toyota
Toyota is using this
Platform 4 automated driving test vehicle, based on the Lexus LS, to develop Level-4 self-driving capabilities for its “Chauffeur” project.
Spectrum: Toyota is building a system called Guardian to back up the driver, and a more futuristic system called Chauffeur, to replace the driver. How can Chauffeur ever succeed? It has to be better than a human plus Guardian!
Pratt: In the discussions we’ve had with others in this field, we’ve talked about that a lot. What is the standard? Is it a person in a basic car? Or is it a person with a car that has active safety systems in it? And what will people think is good enough?
These systems will never be perfect—there will always be some accidents, and no matter how hard we try there will still be occasions where there will be some fatalities. At what threshold are people willing to say that’s okay?
Spectrum: You were among the first top researchers to warn against hyping self-driving technology. What did you see that so many other players did not?
Pratt: First, in my own case, during my time at DARPA I worked on robotics, not cars. So I was somewhat of an outsider. I was looking at it from a fresh perspective, and that helps a lot.
Second, [when I joined Toyota in 2015] I was joining a company that is very careful—even though we have made some giant leaps—with the Prius hybrid drive system as an example. Even so, in general, the philosophy at Toyota is kaizen—making the cars incrementally better every single day. That care meant that I was tasked with thinking very deeply about this thing before making prognostications.
And the final part: It was a new job for me. The first night after I signed the contract I felt this incredible responsibility. I couldn’t sleep that whole night, so I started to multiply out the numbers, all using a factor of 10. How many cars do we have on the road? Cars on average last 10 years, though ours last 20, but let’s call it 10. They travel on an order of 10,000 miles per year. Multiply all that out and you get 10 to the 10th miles per year for our fleet on Planet Earth, a really big number. I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur? And the answer was so incredibly good that I knew it would take a long time. That was five years ago.
Burgard: We are now in the age of deep learning, and we don’t know what will come after. We are still making progress with existing techniques, and they look very promising. But the gradient is not as steep as it was a few years ago.
Pratt: There isn’t anything that’s telling us that it can’t be done; I should be very clear on that. Just because we don’t know how to do it doesn’t mean it can’t be done. Continue reading
#437709 iRobot Announces Major Software Update, ...
Since the release of the very first Roomba in 2002, iRobot’s long-term goal has been to deliver cleaner floors in a way that’s effortless and invisible. Which sounds pretty great, right? And arguably, iRobot has managed to do exactly this, with its most recent generation of robot vacuums that make their own maps and empty their own dustbins. For those of us who trust our robots, this is awesome, but iRobot has gradually been realizing that many Roomba users either don’t want this level of autonomy, or aren’t ready for it.
Today, iRobot is announcing a major new update to its app that represents a significant shift of its overall approach to home robot autonomy. Humans are being brought back into the loop through software that tries to learn when, where, and how you clean so that your Roomba can adapt itself to your life rather than the other way around.
To understand why this is such a shift for iRobot, let’s take a very brief look back at how the Roomba interface has evolved over the last couple of decades. The first generation of Roomba had three buttons on it that allowed (or required) the user to select whether the room being vacuumed was small or medium or large in size. iRobot ditched that system one generation later, replacing the room size buttons with one single “clean” button. Programmable scheduling meant that users no longer needed to push any buttons at all, and with Roombas able to find their way back to their docking stations, all you needed to do was empty the dustbin. And with the most recent few generations (the S and i series), the dustbin emptying is also done for you, reducing direct interaction with the robot to once a month or less.
Image: iRobot
iRobot CEO Colin Angle believes that working toward more intelligent human-robot collaboration is “the brave new frontier” of AI. “This whole journey has been earning the right to take this next step, because a robot can’t be responsive if it’s incompetent,” he says. “But thinking that autonomy was the destination was where I was just completely wrong.”
The point that the top-end Roombas are at now reflects a goal that iRobot has been working toward since 2002: With autonomy, scheduling, and the clean base to empty the bin, you can set up your Roomba to vacuum when you’re not home, giving you cleaner floors every single day without you even being aware that the Roomba is hard at work while you’re out. It’s not just hands-off, it’s brain-off. No noise, no fuss, just things being cleaner thanks to the efforts of a robot that does its best to be invisible to you. Personally, I’ve been completely sold on this idea for home robots, and iRobot CEO Colin Angle was as well.
“I probably told you that the perfect Roomba is the Roomba that you never see, you never touch, you just come home everyday and it’s done the right thing,” Angle told us. “But customers don’t want that—they want to be able to control what the robot does. We started to hear this a couple years ago, and it took a while before it sunk in, but it made sense.”
How? Angle compares it to having a human come into your house to clean, but you weren’t allowed to tell them where or when to do their job. Maybe after a while, you’ll build up the amount of trust necessary for that to work, but in the short term, it would likely be frustrating. And people get frustrated with their Roombas for this reason. “The desire to have more control over what the robot does kept coming up, and for me, it required a pretty big shift in my view of what intelligence we were trying to build. Autonomy is not intelligence. We need to do something more.”
That something more, Angle says, is a partnership as opposed to autonomy. It’s an acknowledgement that not everyone has the same level of trust in robots as the people who build them. It’s an understanding that people want to have a feeling of control over their homes, that they have set up the way that they want, and that they’ve been cleaning the way that they want, and a robot shouldn’t just come in and do its own thing.
This change in direction also represents a substantial shift in resources for iRobot, and the company has pivoted two-thirds of its engineering organization to focus on software-based collaborative intelligence rather than hardware.
“Until the robot proves that it knows enough about your home and about the way that you want your home cleaned,” Angle says, “you can’t move forward.” He adds that this is one of those things that seem obvious in retrospect, but even if they’d wanted to address the issue before, they didn’t have the technology to solve the problem. Now they do. “This whole journey has been earning the right to take this next step, because a robot can’t be responsive if it’s incompetent,” Angle says. “But thinking that autonomy was the destination was where I was just completely wrong.”
The previous iteration of the iRobot app (and Roombas themselves) are built around one big fat CLEAN button. The new approach instead tries to figure out in much more detail where the robot should clean, and when, using a mixture of autonomous technology and interaction with the user.
Where to Clean
Knowing where to clean depends on your Roomba having a detailed and accurate map of its environment. For several generations now, Roombas have been using visual mapping and localization (VSLAM) to build persistent maps of your home. These maps have been used to tell the Roomba to clean in specific rooms, but that’s about it. With the new update, Roombas with cameras will be able to recognize some objects and features in your home, including chairs, tables, couches, and even countertops. The robots will use these features to identify where messes tend to happen so that they can focus on those areas—like around the dining room table or along the front of the couch.
We should take a minute here to clarify how the Roomba is using its camera. The original (primary?) purpose of the camera was for VSLAM, where the robot would take photos of your home, downsample them into QR-code-like patterns of light and dark, and then use those (with the assistance of other sensors) to navigate. Now the camera is also being used to take pictures of other stuff around your house to make that map more useful.
Photo: iRobot
The robots will now try to fit into the kinds of cleaning routines that many people already have established. For example, the app may suggest an “after dinner” routine that cleans just around the kitchen and dining room table.
This is done through machine learning using a library of images of common household objects from a floor perspective that iRobot had to develop from scratch. Angle clarified for us that this is all done via a neural net that runs on the robot, and that “no recognizable images are ever stored on the robot or kept, and no images ever leave the robot.” Worst case, if all the data iRobot has about your home gets somehow stolen, the hacker would only know that (for example) your dining room has a table in it and the approximate size and location of that table, because the map iRobot has of your place only stores symbolic representations rather than images.
Another useful new feature is intended to help manage the “evil Roomba places” (as Angle puts it) that every home has that cause Roombas to get stuck. If the place is evil enough that Roomba has to call you for help because it gave up completely, Roomba will now remember, and suggest that either you make some changes or that it stops cleaning there, which seems reasonable.
When to Clean
It turns out that the primary cause of mission failure for Roombas is not that they get stuck or that they run out of battery—it’s user cancellation, usually because the robot is getting in the way or being noisy when you don’t want it to be. “If you kill a Roomba’s job because it annoys you,” points out Angle, “how is that robot being a good partner? I think it’s an epic fail.” Of course, it’s not the robot’s fault, because Roombas only clean when we tell them to, which Angle says is part of the problem. “People actually aren’t very good at making their own schedules—they tend to oversimplify, and not think through what their schedules are actually about, which leads to lots of [figurative] Roomba death.”
To help you figure out when the robot should actually be cleaning, the new app will look for patterns in when you ask the robot to clean, and then recommend a schedule based on those patterns. That might mean the robot cleans different areas at different times every day of the week. The app will also make scheduling recommendations that are event-based as well, integrated with other smart home devices. Would you prefer the Roomba to clean every time you leave the house? The app can integrate with your security system (or garage door, or any number of other things) and take care of that for you.
More generally, Roomba will now try to fit into the kinds of cleaning routines that many people already have established. For example, the app may suggest an “after dinner” routine that cleans just around the kitchen and dining room table. The app will also, to some extent, pay attention to the environment and season. It might suggest increasing your vacuuming frequency if pollen counts are especially high, or if it’s pet shedding season and you have a dog. Unfortunately, Roomba isn’t (yet?) capable of recognizing dogs on its own, so the app has to cheat a little bit by asking you some basic questions.
A Smarter App
Image: iRobot
The previous iteration of the iRobot app (and Roombas themselves) are built around one big fat CLEAN button. The new approach instead tries to figure out in much more detail where the robot should clean, and when, using a mixture of autonomous technology and interaction with the user.
The app update, which should be available starting today, is free. The scheduling and recommendations will work on every Roomba model, although for object recognition and anything related to mapping, you’ll need one of the more recent and fancier models with a camera. Future app updates will happen on a more aggressive schedule. Major app releases should happen every six months, with incremental updates happening even more frequently than that.
Angle also told us that overall, this change in direction also represents a substantial shift in resources for iRobot, and the company has pivoted two-thirds of its engineering organization to focus on software-based collaborative intelligence rather than hardware. “It’s not like we’re done doing hardware,” Angle assured us. “But we do think about hardware differently. We view our robots as platforms that have longer life cycles, and each platform will be able to support multiple generations of software. We’ve kind of decoupled robot intelligence from hardware, and that’s a change.”
Angle believes that working toward more intelligent collaboration between humans and robots is “the brave new frontier of artificial intelligence. I expect it to be the frontier for a reasonable amount of time to come,” he adds. “We have a lot of work to do to create the type of easy-to-use experience that consumer robots need.” Continue reading