This is a guest post. The views expressed here are solely those of the author and do not represent positions of IEEE Spectrum or the IEEE.
Autonomous robots are coming around slowly. We already have autonomous vacuum cleaners, autonomous lawn mowers, toys that bleep and blink, and (maybe) soon autonomous cars. Yet, generation after generation, we keep waiting for the robots that we all know from movies and TV shows. Instead, businesses seem to get farther and farther away from robots that are able to do a large variety of tasks using general-purpose, human anatomy-inspired hardware.
Although these are the droids we have been looking for, anything that came close, such as Willow Garage’s PR2 or Rethink Robotics’ Baxter, has bitten the dust. With building a robotics company being particularly hard, compounding business risk with technological risk, the trend goes from selling robots to selling actual services like mowing your lawn, providing taxi rides, fulfilling retail orders, or picking strawberries by the pound. Unfortunately for fans of R2-D2 and C-3PO, these kinds of business models emphasize specialized, room- or fridge-sized hardware that is optimized for one very specific task, but does not contribute to a general-purpose robotic platform.
We have actually seen something very similar in the personal computer (PC) industry. In the 1950s, even though computers could be as big as an entire room and were only available to a select few, the public already had a good idea of what computers would look like. A long list of fictional computers started to populate mainstream entertainment during that time. In a 1962 New York Times article titled “Pocket Computer to Replace Shopping List,” visionary scientist John Mauchly stated that “there is no reason to suppose the average boy or girl cannot be master of a personal computer.”
In 1968, Douglas Engelbart gave us the “mother of all demos,” browsing hypertext on a graphical screen with a mouse and showing other ideas that became standard only decades later. Now that we have finally seen all of this, it might be helpful to examine what actually enabled the computing revolution to learn where robotics is really at and what we need to do next.
The parallels between computers and robots
In the 1970s, mainframes were about to be replaced by the emerging class of mini-computers, fridge-sized devices that cost less than US $25,000 ($165,000 in 2019 dollars). These computers did not use punch-cards, but could be programmed in Fortran and BASIC, dramatically expanding the ease with which potential applications could be created. Yet it was still unclear whether mini-computers could ever replace big mainframes in applications that require fast and efficient processing of large amounts of data, let alone enter every living room. This is very similar to the robotics industry right now, where large-scale factory robots (mainframes) that have existed since the 1960s are seeing competition from a growing industry of collaborative robots that can safely work next to humans and can easily be installed and programmed (minicomputers). As in the ’70s, applications for these devices that reach system prices comparable to that of a luxury car are quite limited, and it is hard to see how they could ever become a consumer product.
Yet, as in the computer industry, successful architectures are quickly being cloned, driving prices down, and entirely new approaches on how to construct or program robotic arms are sprouting left and right. Arm makers are joined by manufacturers of autonomous carts, robotic grippers, and sensors. These components can be combined, paving the way for standard general purpose platforms that follow the model of the IBM PC, which built a capable, open architecture relying as much on commodity parts as possible.
General purpose robotic systems have not been successful for reasons similar to those that kept general purpose, also known as “personal,” computers from emerging for decades. Mainframes were custom-built for each application, while typewriters got smarter and smarter, not really leaving room for general purpose computers in between. Indeed, given the cost of hardware and the relatively limited abilities of today’s autonomous robots, it is almost always smarter to build a special purpose machine than to try to make a collaborative mobile manipulator smart.
A current example is e-commerce grocery fulfillment. The current trend is to reserve underutilized parts of a brick-and-mortar store for a micro-fulfillment center that stores goods in little crates with an automated retrieval system and a (human) picker. A number of startups like Alert Innovation, Fabric, Ocado Technology, TakeOff Technologies, and Tompkins Robotics, to name just a few, have recently raised hundreds of millions of dollars in venture capital to build mainframe equivalents of robotic fulfillment centers. This is in contrast with a robotic picker, which would drive through the aisles to restock and pick from shelves. Such a robotic store clerk would come much closer to our vision of a general purpose robot, but would require many copies of itself crowding the aisles to churn out hundreds of orders per hour as a microwarehouse could. Although such robots might eventually be more efficient, the margins in retail are already low, making it unlikely that this industry will produce the technological jump that we need to get friendly C-3POs manning the aisles.
Mainframes were also attacked from the bottom. Fascination with the new digital technology led to a hobbyist movement to create microcomputers that were sold via mail order or at RadioShack. Initially, a large number of small businesses were selling tens, at most hundreds, of devices, usually as a kit and with wooden enclosures. This trend culminated in the “1977 Trinity” in the form of the Apple II, the Commodore PET, and the Tandy TRS-80, complete computers that were sold for prices around $2500 (TRS) to $5000 (Apple) in today’s dollars. The main application of these computers was their programmability (in BASIC), which would enable consumers to “learn to chart your biorhythms, balance your checking account, or even control your home environment,” according to an original Apple advertisement. Similarly, there exists a myriad of gadgets that explore different aspects of robotics such as mobility, manipulation, and entertainment.
As in the fledgling personal computing industry, the advertised functionality was at best a model of the real deal. A now-famous milestone in entertainment robotics was Sony’s original Aibo, a robotic dog that was advertised to have many properties that a real dog has, such as developing its own personality, playing with a toy, and interacting with its owner. Released in 1999, and re-launched in 2018, the platform has a solid following among hobbyists and academics who like its programmability, but probably only very few users who accept the device as a pet stand-in.
There also exist countless “build-your-own-robotic-arm” kits. One of the more successful examples is the uArm, which sells for around $800, and is advertised to perform pick and place, assembly, 3D printing, laser engraving, and many other things that sound like high value applications. Compelling videos of the robot actually doing these things in a constrained environment have led to two successful crowd-funding campaigns, and have established the robot as a successful educational tool.
Finally, there exist platforms that allow hobbyist programmers to explore mobility to construct robots that patrol your house, deliver items, or provide their users with telepresence abilities. An example of that is the Misty II. Much like with the original Apple II, there remains a disconnect between the price of the hardware and the fidelity of the applications that are available.
For computers, this disconnect began to disappear with the invention of the first electronic spreadsheet software, VisiCalc, which spun out of Harvard in 1979 and prompted many people to buy an entire microcomputer just to run the program. VisiCalc was soon joined by WordStar, a word processing application that sold for close to $2000 in today’s dollars. WordStar, too, would entice many people to buy the entire hardware just to use the software. The two programs are early examples of what became known as the “killer application.”
With factory automation being mature, and robots with the price tag of a minicomputer being capable of driving around and autonomously carrying out many manipulation tasks, the robotics industry is somewhere where the PC industry was between 1973—the release of the Xerox Alto, the first computer with a graphical user interface, mouse, and special software—and 1979—when microcomputers in the under $5000 category began to take off.
Killer apps for robots
So what would it take for robotics to continue to advance like computers did? The market itself already has done a good job distilling what the possible killer apps are. VCs and customers alike push companies who have set out with lofty goals to reduce their offering to a simple value proposition. As a result, companies that started at opposite ends often converge to mirror images of each other that offer very similar autonomous carts, (bin) picking, palletizing, depalletizing, or sorting solutions. Each of these companies usually serves a single application to a single vertical—for example bin-picking clothes, transporting warehouse goods, or picking strawberries by the pound. They are trying to prove that their specific technology works without spreading themselves too thin.
Very few of these companies have really taken off. One example is Kiva Systems, which turned into the logistic robotics division of Amazon. Kiva and others are structured around sound value propositions that are grounded in well-known user needs. As these solutions are very specialized, however, it is unlikely that they result in any economies of scale of the same magnitude that early computer users who bought both a spreadsheet and a word processor application for their expensive minicomputer could enjoy. What would make these robotic solutions more interesting is when functionality becomes stackable. Instead of just being able to do bin picking, palletizing, and transportation with the same hardware, these three skills could be combined to model entire processes.
A skill that is as yet little addressed by startups and is historically owned by the mainframe equivalent of robotics is assembly of simple mechatronic devices. The ability to assemble mechatronic parts is equivalent to other tasks such as changing a light bulb, changing the batteries in a remote control, or tending machines like a lever-based espresso machine. Mastering these tasks would make the autonomous execution of complete workflows possible using a single machine, eventually leading to an explosion of industrial productivity across all sectors. For example, picking up an item from a bin, arranging it on the robot, moving it elsewhere, and placing it into a shelf or a machine is a process that equally applies to a manufacturing environment, a retail store, or someone’s kitchen.
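To make the idea of stackable skills concrete, here is a minimal Python sketch of the pick, arrange, move, and place workflow described above. It is an illustration only, not how any vendor's software works: the `World` state, the skill names, and the locations are all hypothetical, and real skills would wrap perception, planning, and force control rather than edit a string.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class World:
    """Toy state (hypothetical): where the item is, plus a log of steps."""
    item_location: str = "bin"
    log: List[str] = field(default_factory=list)


# A skill is any function that takes the world state and returns it updated.
Skill = Callable[[World], World]


def pick_from_bin(world: World) -> World:
    if world.item_location != "bin":
        raise ValueError("nothing to pick: item is not in the bin")
    world.item_location = "gripper"
    world.log.append("picked item from bin")
    return world


def stow_on_robot(world: World) -> World:
    if world.item_location != "gripper":
        raise ValueError("nothing to stow: gripper is empty")
    world.item_location = "robot tray"
    world.log.append("arranged item on robot")
    return world


def drive_to(destination: str) -> Skill:
    """Build a mobility skill for a given (hypothetical) destination."""
    def skill(world: World) -> World:
        world.log.append(f"drove to {destination}")
        return world
    return skill


def place_on(target: str) -> Skill:
    """Build a placing skill for a shelf, a machine, or any other target."""
    def skill(world: World) -> World:
        if world.item_location != "robot tray":
            raise ValueError("nothing to place: tray is empty")
        world.item_location = target
        world.log.append(f"placed item on {target}")
        return world
    return skill


def run_workflow(skills: List[Skill], world: World) -> World:
    """Stack skills by running them in order on the same state."""
    for skill in skills:
        world = skill(world)
    return world


# The same four building blocks, stacked into one process.
restock = [pick_from_bin, stow_on_robot, drive_to("aisle 7"), place_on("shelf")]
result = run_workflow(restock, World())
print(result.item_location)
for step in result.log:
    print(step)
```

Swapping the last two entries for `drive_to("kitchen")` and `place_on("espresso machine")` reuses the same skills for a different setting, which is the sense in which one workflow applies equally to a factory, a store, or a kitchen.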
Image: Robotic Materials Inc.
Autonomous, vision and force-based assembly of the Siemens robot learning challenge.
Even though many of the above applications are becoming possible, it is still very hard to get a platform off the ground without added components that provide “killer app” value of their own. Interesting examples are Rethink Robotics and the Robot Operating System (ROS). Rethink Robotics’ Baxter and Sawyer robots pioneered a great user experience (like the 1973 Xerox Alto, really the first PC), but their applications were difficult to extend beyond simple pick-and-place and palletizing and depalletizing items.
ROS pioneered interprocess communication software that was adapted to robotic needs (multiple computers, different programming languages) and the idea of software modularity in robotics, but—in the absence of a common hardware platform—hasn’t yet delivered a single application, e.g. for navigation, path planning, or grasping, that performs beyond research-grade demonstration level and won’t get discarded once developers turn to production systems. At the same time, an increasing number of robotic devices, such as robot arms or 3D perception systems that offer intelligent functionality, provide other ways to wire them together that do not require an intermediary computer, while keeping close control over the real-time aspects of their hardware.
Image: Robotic Materials Inc.
Robotic Materials GPR-1 combines a MIR-100 autonomous cart with a UR-5 collaborative robotic arm, an onRobot force/torque sensor and Robotic Materials’ SmartHand to perform out-of-the-box mobile assembly, bin picking, palletizing, and depalletizing tasks.
At my company, Robotic Materials Inc., we have made strides to identify a few applications such as bin picking and assembly, making them configurable with a single click by combining machine learning and optimization with an intuitive user interface. Here, users can define object classes and how to grasp them using a web browser, which then appear as first-class objects in a robot-specific graphical programming language. We have also done this for assembly, allowing users to stack perception-based picking and force-based assembly primitives by simply dragging and dropping appropriate commands together.
While such an approach might answer the question of a killer app for robots priced in the “minicomputer” range, it is unclear how killer app-type value can be generated with robots in the less-than-$5000 category. A possible answer is two-fold: First, with low-cost arms, mobility platforms, and entertainment devices continuously improving, a confluence of technology readiness and user innovation, like with the Apple II and VisiCalc, will eventually happen. For example, there is not much innovation needed to turn Misty into a home security system; the uArm into a low-cost bin-picking system; or an Aibo-like device into a therapeutic system for the elderly or children with autism.
Second, robots and their components have to become dramatically cheaper. Indeed, computers have seen an exponential reduction in price accompanied by an exponential increase in computational power, thanks in great part to Moore’s Law. This development has helped robotics too, allowing us to reach breakthroughs in mobility and manipulation due to the ability to process massive amounts of image and depth data in real-time, and we can expect it to continue to do so.
Is there a Moore’s Law for robots?
One might ask, however, how similar dynamics might be possible for robots as a whole, including all their motors and gears, and what a “Moore’s Law” would look like for the robotics industry. Here, it helps to remember that the perpetuation of Moore’s Law is not the reason for, but the result of, the PC revolution. Indeed, the first killer apps for bookkeeping, editing, and gaming were so good that they unleashed tremendous consumer demand, beating the benchmark on what was thought to be physically possible over and over again. (I vividly remember 56 kbps to be the absolute maximum data rate for copper phone lines until DSL appeared.)
That these economies of scale are also applicable to mechatronics is impressively demonstrated by the car industry. A good example is the 2020 Prius Prime, a highly computerized plug-in hybrid that is available for one third of the cost of my company’s GPR-1 mobile manipulator while being orders of magnitude more complex, sporting an electric motor, a combustion engine, and a myriad of sensors and computers. It is therefore quite conceivable to produce a mobile manipulator that retails at one tenth of the cost of a modern car, once robotics enjoys similar mass-market appeal. Given that these robots are part of the equation, actively lowering the cost of production, this might happen faster than ever before in the history of industrialization.
There is one more driver that might make robots exponentially more capable: the cloud. Once a general purpose robot has learned or been programmed with a new skill, it could share it with every other robot. At some point, a grocer who buys a robot could assume that it already knows how to recognize and handle 99 percent of the retail items in the store. Likewise, a manufacturer can assume that the robot can handle and assemble every item available from McMaster-Carr and Misumi. Finally, families could expect a robot to know every kitchen item that Ikea and Pottery Barn are selling. This sounds like a labor-intensive problem, but it is probably more manageable than collecting footage for Google’s Street View using cars, tricycles, and snowmobiles, among other vehicles.
Strategies for robot startups
While we are waiting for these two trends—better and better applications and hardware with decreasing cost—to converge, we as a community have to keep exploring what the canonical robotic applications beyond mobility, bin picking, palletizing, depalletizing, and assembly are. We must also continue to solve the fundamental challenges that stand in the way of making these solutions truly general and robust.
For both questions, it might help to look at the strategies that have been critical in the development of the personal computer, which might equally well apply to robotics:
Start with a solution to a problem your customers have. Unfortunately, their problem is almost never that they need your sensor, widget, or piece of code, but something that already costs them money or negatively affects them in some other way. Example: There are many more people who had a problem calculating their taxes (and wanted to buy VisiCalc) than writing their own solution in BASIC.
Build as little of your own hardware as necessary. Your business model should be stronger than the margin you can make on the hardware. Why take the risk? Example: Why build your own typewriter if you can write the best typewriting application that makes it worth buying a computer just for that?
If your goal is a platform, make sure it comes with a killer application, which alone justifies the platform cost. Example: Microcomputer companies came and went until the “1977 Trinity” intersected with two killer apps, the spreadsheet and the word processor. Corollary: You can also get lucky.
Use an open architecture, which creates an ecosystem where others compete on creating better components and peripherals, while allowing others to integrate your solution into their vertical and stack it with other devices. Example: Both the Apple II and the IBM PC were completely open architectures, enabling many clones, thereby growing the user and developer base.
It’s worthwhile pursuing this. With most business processes already being digitized, general purpose robots will allow us to fill in gaps in mobility and manipulation, increasing productivity at levels only limited by the amount of resources and energy that are available, possibly creating a utopia in which creativity becomes the ultimate currency. Maybe we’ll even get R2-D2.
Nikolaus Correll is an associate professor of computer science at the University of Colorado at Boulder where he works on mobile manipulation and other robotics applications. He’s co-founder and CTO of Robotic Materials Inc., which is supported by the National Science Foundation and the National Institute of Standards and Technology via their Small Business Innovative Research (SBIR) programs.
What’s the world’s hardest machine learning problem? Autonomous vehicles? Robots that can walk? Cancer detection?
Nope, says Julian Sanchez. It’s agriculture.
Sanchez might be a little biased. He is the director of precision agriculture for John Deere, and is in charge of adding intelligence to traditional farm vehicles. But he does have a little perspective, having spent time working on software for both medical devices and air traffic control systems.
I met with Sanchez and Alexey Rostapshov, head of digital innovation at John Deere Labs, at the organization’s San Francisco offices last month. Labs launched in 2017 to take advantage of the area’s tech expertise, both to apply machine learning to in-house agricultural problems and to work with partners to build technologies that play nicely with Deere’s big green machines. Deere’s neighbors in San Francisco’s tech-heavy South of Market are LinkedIn, Salesforce, and Planet Labs, which puts it in a good position for recruiting.
“We’ve literally had folks knock on the door and say, ‘What are you doing here?’” says Rostapshov, and some return to drop off resumes.
Here’s why Sanchez believes agriculture is such a big challenge for artificial intelligence.
“It’s not just about driving tractors around,” he says, although autonomous driving technologies are part of the mix. (John Deere is doing a lot of work with precision GPS to improve autonomous driving, for example, and allow tractors to plan their own routes around fields.)
But more complex than the driving problem, says Sanchez, are the classification problems.
Corn: A Classic Classification Problem
Photo: Tekla Perry
One key effort, Sanchez says, involves AI systems “that allow me to tell whether grain being harvested is good quality or low quality and to make automatic adjustment systems for the harvester.” The company is already selling an early version of this image analysis technology. But the many differences between grain types, and grains grown under different conditions, make this task a tough one for machine learning.
“Take corn,” Sanchez says. “Let’s say we are building a deep learning algorithm to detect this corn. And we take lots of pictures of kernels to give it. Say we pick those kernels in central Illinois. But, one mile over, the farmer planted a slightly different hybrid which has slightly different coloration of yellow. Meanwhile, this other farm harvested three days later in a field five miles away; it’s the same hybrid, but it also looks different.
“It’s an overwhelming classification challenge, and that’s just for corn. But you are not only doing it for corn, you have to add 20 more varieties of grain to the mix; and some, like canola, are almost microscopic.”
Even the ground conditions vary dramatically—far more than road conditions, Sanchez points out.
“Let’s say we are building a deep learning algorithm to detect how much residue is left on the soil after a harvest, including stubble and some chaff. Let’s drive 2,000 acres of fields in the Midwest looking at residue. That’s great, but I guarantee that if you go drive those the next year, it will look significantly different.
“Deep learning is great at interpolating conditions between what it knows; it is not good at extrapolating to situations it hasn’t seen. And in agriculture, you always feel that there is a set of conditions that you haven’t yet classified.”
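Sanchez's distinction between interpolating and extrapolating can be shown with a toy model. The Python sketch below is not deep learning and has nothing to do with Deere's systems: it uses a tiny nearest-neighbor regressor as a stand-in for any model that learns from examples, and the training function and query points are assumptions chosen purely for illustration.

```python
from typing import List, Tuple


def true_value(x: float) -> float:
    """The relationship the model is supposed to learn (illustrative)."""
    return x * x


# Training data only covers conditions between 0.0 and 1.0.
training: List[Tuple[float, float]] = [
    (i / 10, true_value(i / 10)) for i in range(11)
]


def predict(x: float, k: int = 3) -> float:
    """Average the targets of the k training points closest to x."""
    nearest = sorted(training, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


inside_query = 0.52   # lies between conditions seen in training
outside_query = 2.0   # lies well beyond anything seen in training

inside_error = abs(predict(inside_query) - true_value(inside_query))
outside_error = abs(predict(outside_query) - true_value(outside_query))

print(f"error inside the training range:  {inside_error:.3f}")
print(f"error outside the training range: {outside_error:.3f}")
```

Inside the range it was trained on, the model's error is small; at a condition it never saw, it can only repeat what it learned at the edge of its data and the error becomes large. In this analogy, a corn hybrid with a slightly different color, or the same field a year later, plays the role of the query outside the training range.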
A Flood of Big Data
The scale of the data is also daunting, Rostapshov points out. “We are one of the largest users of cloud computing services in the world,” he says. “We are gathering 5 to 15 million measurements per second from 130,000 connected machines globally. We have over 150 million acres in our databases, using petabytes and petabytes [of storage]. We process more data than Twitter does.”
Much of this information is so-called dirty data, that is, it doesn’t share the same format or structure, because it’s coming not only from a wide variety of John Deere machines, but also includes data from some 100 other companies that have access to the platform, including weather information, aerial imagery, and soil analyses.
As a result, says Sanchez, Deere has had to make “tremendous investments in back-end data cleanup.”
“We have gotten progressively more skilled at that problem,” he says. “We started simply by cleaning up our own data. You’d think it would be nice and neat, since it’s coming from our own machines, but there is a wide variety of different models and different years. Then we started geospatially tagging the agronomic data—the information about where you are applying herbicides and fertilizer and the like—coming in from our vehicles. When we started bringing in other data, from drones, say, we were already good at cleaning it up.”
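As a rough illustration of the kind of cleanup Sanchez describes, the Python sketch below maps records from two differently structured sources onto one geotagged schema. Everything in it is hypothetical: the field names (`lat`, `latitude`, `rate_lb_ac`, `rate_kg_ha`), the sample values, and the idea that two machine generations differ in exactly this way are assumptions made for the example, not a description of Deere's data formats.

```python
from typing import Any, Dict

# 1 lb = 0.45359237 kg and 1 acre = 0.40468564224 ha, so about 1.1209.
LB_PER_ACRE_TO_KG_PER_HA = 0.45359237 / 0.40468564224


def normalize(record: Dict[str, Any]) -> Dict[str, float]:
    """Map a raw record onto one schema: lat, lon, rate in kg/ha."""
    # Accept either naming convention for the position (hypothetical names).
    lat = record.get("lat", record.get("latitude"))
    lon = record.get("lon", record.get("longitude"))
    if lat is None or lon is None:
        raise ValueError("record has no usable position")

    # Accept either unit for the application rate and convert to metric.
    if "rate_kg_ha" in record:
        rate = float(record["rate_kg_ha"])
    elif "rate_lb_ac" in record:
        rate = float(record["rate_lb_ac"]) * LB_PER_ACRE_TO_KG_PER_HA
    else:
        raise ValueError("record has no application rate")

    return {"lat": float(lat), "lon": float(lon), "rate_kg_ha": rate}


# Two made-up records describing similar work in different formats.
older_machine = {"latitude": "40.1", "longitude": "-88.2", "rate_lb_ac": 100}
newer_machine = {"lat": 40.1, "lon": -88.2, "rate_kg_ha": 112.0}

cleaned = [normalize(older_machine), normalize(newer_machine)]
for row in cleaned:
    print(row)
```

Once every record carries the same keys, units, and coordinates, data from a new source such as a drone only needs its own small mapping into the shared schema, which is one plausible reading of why cleaning up in-house data first made third-party data easier to absorb.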
John Deere’s Hiring Pitch
Hard problems can be a good thing to have for a company looking to hire machine learning engineers.
“Our opening line to potential recruits,” Sanchez says, “is ‘This stuff matters.’ Then, if we get a chance to talk to them more, we follow up with ‘Not only does this stuff matter, but the problems are really hard and interesting.’ When we explain the variability in farming and how we have to apply all the latest tools to these problems, we get their attention.”
Software engineers “know that feeding a growing population is a massive problem and are excited about the prospect of making a difference,” Rostapshov says.
Only 20 engineers work in the San Francisco labs right now, and that’s on a busy day—some of the researchers spend part of their time at Blue River Technology, a startup based in Sunnyvale that was acquired by Deere in 2017. About half of the researchers are focusing on AI. The Lab is in the process of doubling its office space (no word on staffing plans for that expansion yet).
Company-wide, Deere has thousands of software engineers, with many using AI and machine learning tools in their work, and about the same number of mechanical and electrical engineers, Sanchez reports. “If you look at our hiring 10 years ago,” he says, “it was heavily weighted to mechanical engineers. But if you look at those numbers now, it is by a large majority [engineers working] in the software space. We still need mechanical engineers—we do build green machines—but if you go by our footprint of tech talent, it is pretty safe to call John Deere a software company. And if you follow the key conversations that are happening in the company right now, 95 percent of them are software-related.”
For now, these software engineers are focused on developing technologies that allow farmers to “do more with less,” Sanchez says. Meaning, to get more and better crops from less fuel, less seed, less fertilizer, less pesticide, and fewer workers, and putting together building blocks that, he says, could eventually lead to fully autonomous farm vehicles. The data Deere collects today, for the most part, stays in silos (the virtual kind), with AI algorithms that analyze specific sets of data to provide guidance to individual farmers. At some point, however, with tools to anonymize data and buy-in from farmers, aggregating data could provide some powerful insights.
“We are not asking farmers for that yet,” Sanchez says. “We are not doing aggregation to look for patterns. We are focused on offering technology that allows an individual farmer to use less, on positioning ourselves to be in a neutral spot. We are not about selling you more seed or more fertilizer. So we are building up a good trust level. In the long term, we can have conversations about doing more with deep learning.”
In 2016, Cruise, an autonomous vehicle startup acquired by General Motors, had about 50 employees. At the beginning of 2019, the headcount at its San Francisco headquarters—mostly software engineers, mostly working on projects connected to machine learning and artificial intelligence—hit around 1000. Now that number is up to 1500, and by the end of this year it’s expected to reach about 2000, sprawling into a recently purchased building that had housed Dropbox. And that’s not counting the 200 or so tech workers that Cruise is aiming to install in a Seattle, Wash., satellite development center and a handful of others in Phoenix, Ariz., and Pasadena, Calif.
Cruise’s recent hires aren’t all engineers—it takes more than engineering talent to manage operations. And there are hundreds of so-called safety drivers that are required to sit in the 180 or so autonomous test vehicles whenever they roam the San Francisco streets. But that’s still a lot of AI experts to be hiring in a time of AI engineer shortages.
Hussein Mehanna, head of AI/ML at Cruise, says the company’s hiring efforts are on track, due to the appeal of the challenge of autonomous vehicles in drawing in AI experts from other fields. Mehanna himself joined Cruise in May from Google, where he was director of engineering at Google Cloud AI. Mehanna had been there about a year and a half, a relatively quick career stop after a short stint at Snap following four years working in machine learning at Facebook.
Mehanna has been immersed in AI and machine learning research since his graduate studies in speech recognition and natural language processing at the University of Cambridge. I sat down with Mehanna to talk about his career, the challenges of recruiting AI experts and autonomous vehicle development in general—and some of the challenges specific to San Francisco. We were joined by Michael Thomas, Cruise’s manager of AI/ML recruiting, who had also spent time recruiting AI engineers at Google and then Facebook.
IEEE Spectrum: When you were at Cambridge, did you think AI was going to take off like a rocket?
Mehanna: Did I imagine that AI was going to be as dominant and prevailing and sometimes hyped as it is now? No. I do recall in 2003 that my supervisor and I were wondering if neural networks could help at all in speech recognition. I remember my supervisor saying if anyone could figure out how to use a neural net for speech he would give them a grant immediately. So he was on the right path. Now neural networks have dominated vision, speech, and language [processing]. But that boom started in 2012.
I didn’t [expect it], but I certainly aimed for it when [I was at] Microsoft, where I deliberately pushed my career towards machine learning instead of big data, which was more popular at the time. And [I aimed for it] when I joined Facebook.
In the early days, Facebook wasn’t that open to PhDs, or researchers. It actually had a negative sentiment about researchers. And then Facebook shifted to becoming one of the key places where PhD students wanted to do internships or join after they graduated. It was a mindset shift, they were [once] at a point in time where they thought what was needed for success wasn’t research, but now it’s different.
There was definitely an element of risk [in taking a machine learning career path], but I was very lucky, things developed very fast.
IEEE Spectrum: Is it getting harder or easier to find AI engineers to hire, given the reported shortages?
Mehanna: There is a mismatch [between job openings and qualified engineers], though it is hard to quantify it with numbers. There is good news as well: I see a lot more students diving deep into machine learning and data in their [undergraduate] computer science studies, so it’s not as bleak as it seems. But there is massive demand in the market.
Here at Cruise, demand for AI talent is just growing and growing. It might be saturating or slowing down at other kinds of companies, though, [which] are leveraging more traditional applications (ad prediction, recommendations) that have been out there in the market for a while. These are more mature, better understood problems.
I believe autonomous vehicle technology is the most difficult AI problem out there. The magnitude of the challenge of these problems is 1000 times more than other problems. They aren’t as well understood yet, and they require far deeper technology. And also the quality at which they are expected to operate is off the roof.
The autonomous vehicle problem is the engineering challenge of our generation. There’s a lot of code to write, and if we think we are going to hire armies of people to write it line by line, it’s not going to work. Machine learning can accelerate the process of generating the code, but that doesn’t mean we aren’t going to have engineers; we actually need a lot more engineers.
Sometimes people worry that AI is taking jobs. It is taking some developer jobs, but it is actually generating other developer jobs as well, protecting developers from the mundane and helping them build software faster and faster.
IEEE Spectrum: Are you concerned that the demand for AI in industry is drawing out the people in academia who are needed to educate future engineers, that is, the “eating the seed corn” problem?
Mehanna: There are some negative examples in the industry, but that’s not our style. We are looking for collaborations with professors, we want to cultivate a very deep and respectful relationship with universities.
And there’s another angle to this: Universities require a thriving industry for them to thrive. It is going to be extremely beneficial for academia to have this flourishing industry in AI, because it attracts more students to academia. I think we are doing them a fantastic favor by building these career opportunities. This is not the same as in my early days, [when] people told me “don’t go to AI; go to networking, work in the mobile industry; mobile is flourishing.”
IEEE Spectrum: Where are you looking as you try to find a thousand or so engineers to hire this year?
Thomas: We look for people who want to use machine learning to solve problems. They can be in many different industries—in the financial markets, in social media, in advertising. The autonomous vehicle industry is in its infancy. You can compare it to mobile in the early days: When the iPhone first came out, everyone was looking for developers with mobile experience, but you weren’t going to find them unless you went straight to Apple, [so you had to hire other kinds of engineers]. This is the same type of thing: it is so new that you aren’t going to find experts in this area, because we are all still learning.
Mehanna: Because autonomous vehicle technology is the new frontier for AI experts, [the number of] people with both AI and autonomous vehicle experience is quite limited. So we are acquiring AI experts wherever they are, and helping them grow into the autonomous vehicle area. You don’t have to be an autonomous vehicle expert to flourish in this world. It’s not too late to move; even though there is a lot of great tech developed, there’s even more innovation ahead, so now would be a great time for AI experts working on other problems or applications to shift their attention to autonomous vehicles.
It feels like the Internet in 1980. It’s about to happen, but there are endless applications [to be developed over] the next few decades. Even if we can get a car to drive safely, there is the question of how can we tune the ride comfort, and then applying it all to different cities, different vehicles, different driving situations, and who knows to what other applications.
I can see how I can spend a lifetime career trying to solve this problem.
IEEE Spectrum: Why are you doing most of your development in San Francisco?
Mehanna: I think the best talent of the world is in Silicon Valley, and solving the autonomous vehicle problem is going to require the best of the best. It’s not just the engineering talent that is here, but [also] the entrepreneurial spirit. Solving the problem just as a technology is not going to be successful, you need to solve the product and the technology together. And the entrepreneurial spirit is one of the key reasons Cruise secured $7.5 billion in funding [besides GM, the company has a number of outside investors, including Honda, Softbank, and T. Rowe Price]. That [funding] is another reason Cruise is ahead of many others, because this problem requires deep resources.
[And then there is the driving environment.] When I speak to my peers in the industry, they have a lot of respect for us, because the problems to solve in San Francisco technically are an order of magnitude harder. It is a tight environment, with a lot of pedestrians, and driving patterns that, let’s put it this way, are not necessarily the best in the nation. Which means we are seeing more problems ahead of our competitors, which gets us to better [software]. I think if you can do an autonomous vehicle in San Francisco you can do it almost anywhere.
A version of this post appears in the September 2019 print magazine as “AI Engineers: The Autonomous-Vehicle Industry Wants You.”
Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):
IEEE Africon 2019 – September 25-27, 2019 – Accra, Ghana
RoboBusiness 2019 – October 1-3, 2019 – Santa Clara, CA, USA
ISRR 2019 – October 6-10, 2019 – Hanoi, Vietnam
Ro-Man 2019 – October 14-18, 2019 – New Delhi, India
Humanoids 2019 – October 15-17, 2019 – Toronto, Canada
ARSO 2019 – October 31-November 1, 2019 – Beijing, China
ROSCon 2019 – October 31-November 1, 2019 – Macau
IROS 2019 – November 4-8, 2019 – Macau
Let us know if you have suggestions for next week, and enjoy today's videos.
We got a sneak peek of a new version of ANYmal equipped with actuated wheels for feet at the DARPA SubT Challenge, where it did surprisingly well at quickly and (mostly) robustly navigating some very tricky terrain. And when you're not expecting it to travel through a muddy, rocky, and dark tunnel, it looks even more capable:
[ Paper ]
In Langley’s makerspace lab, researchers are developing a series of soft robot actuators to investigate the viability of soft robotics in space exploration and assembly. By design, the actuator has chambers, or air bladders, that expand and compress based on the amount of air in them.
[ NASA ]
I’m not normally a fan of the AdultSize RoboCup soccer competition, but NimbRo had a very impressive season.
I don’t know how it managed to not fall over at 45 seconds, but damn.
[ NimbRo ]
This is more AI than robotics, but that’s okay, because it’s totally cool.
I’m wondering whether the hiders ever tried another possibly effective strategy: trapping the seekers in a locked shelter right at the start.
[ OpenAI ]
We haven’t heard much from Piaggio Fast Forward in a while, but evidently they’ve still got a Gita robot going on, designed to be your personal autonomous caddy for absolutely anything that can fit into something the size of a portable cooler.
Available this fall, I guess?
[ Gita ]
This passively triggered robotic hand is startlingly fast, and seems almost predatory when it grabs stuff, especially once they fit it onto a drone.
[ New Dexterity ]
Autonomous vehicles seem like a recent thing, but CMU has been working on them since the mid-1980s.
CMU was also working on drones back before drones were even really a thing:
[ CMU NavLab ] and [ CMU ]
Welcome to the most complicated and expensive robotic ice cream deployment system ever created.
[ Niska ]
Some impressive dexterity from a robot hand equipped with magnetic gears.
[ Ishikawa Senoo Lab ]
The Buddy Arduino social robot kit is now live on Kickstarter, and you can pledge for one of these little dudes for 49 bucks.
[ Kickstarter ]
Mobile manipulation robots have high potential to support rescue forces in disaster-response missions. Despite the difficulties imposed by real-world scenarios, robots are promising to perform mission tasks from a safe distance. In the CENTAURO project, we developed a disaster-response system which consists of the highly flexible Centauro robot and suitable control interfaces including an immersive telepresence suit and support-operator controls on different levels of autonomy.
[ CENTAURO ]
Determined robots are the cutest robots.
[ Paper ]
The goal of the Dronument project is to create an aerial platform enabling interior and exterior documentation of heritage sites.
It’s got a base station that helps with localization, but still, flying that close to a chandelier in a UNESCO world heritage site makes me nervous.
[ Dronument ]
Avast ye! No hornswaggling, lick-spittlering, or run-rigging over here – Only serious tech for devs. All hands hoay to check out Misty's capabilities and to build your own skills with plenty of heave ho! ARRRRRRRRGH…
International Talk Like a Pirate Day was yesterday, but I'm sure nobody will look at you funny if you keep at it today too.
[ Misty Robotics ]
This video presents an unobtrusive bimanual teleoperation setup with very low weight, consisting of two Vive visual motion trackers and two Myo surface electromyography bracelets. The video demonstrates complex, dexterous teleoperated bimanual daily-living tasks performed by the torque-controlled humanoid robot TORO.
[ DLR RMC ]
Lex Fridman interviews iRobot’s Colin Angle on the Artificial Intelligence Podcast.
Colin Angle is the CEO and co-founder of iRobot, a robotics company that for 29 years has been creating robots that operate successfully in the real world, not as a demo or on a scale of dozens, but on a scale of thousands and millions. As of this year, iRobot has sold more than 25 million robots to consumers, including the Roomba vacuum cleaning robot, the Braava floor mopping robot, and soon the Terra lawn mowing robot. 25 million robots successfully operating autonomously in people's homes to me is an incredible accomplishment of science, engineering, logistics, and all kinds of entrepreneurial innovation.
[ AI Podcast ]
This week’s CMU RI Seminar comes from CMU’s own Sarah Bergbreiter, on Microsystems-Inspired Robotics.
The ability to manufacture micro-scale sensors and actuators has inspired the robotics community for over 30 years. There have been huge success stories; MEMS inertial sensors have enabled an entire market of low-cost, small UAVs. However, the promise of ant-scale robots has largely failed. Ants can move at high speeds on surfaces from picnic tables to front lawns, but the few legged microrobots that have walked have done so at slow speeds (< 1 body length/sec) on smooth silicon wafers. In addition, the vision of large numbers of microfabricated sensors interacting directly with the environment has suffered in part due to the brittle materials used in micro-fabrication. This talk will present our progress in the design of sensors, mechanisms, and actuators that utilize new microfabrication processes to incorporate materials with widely varying moduli and functionality to achieve more robustness, dynamic range, and complexity in smaller packages.
[ CMU RI ]
How each of us sees the world is about to change dramatically.
For all of human history, the experience of looking at the world was roughly the same for everyone. But boundaries between the digital and physical are beginning to fade.
The world around us is gaining layer upon layer of digitized, virtually overlaid information—making it rich, meaningful, and interactive. As a result, our respective experiences of the same environment are becoming vastly different, personalized to our goals, dreams, and desires.
Welcome to Web 3.0, or the Spatial Web. In version 1.0, static documents and read-only interactions limited the internet to one-way exchanges. Web 2.0 provided quite an upgrade, introducing multimedia content, interactive web pages, and participatory social media. Yet, all this was still mediated by two-dimensional screens.
Today, we are witnessing the rise of Web 3.0, riding the convergence of high-bandwidth 5G connectivity, rapidly evolving AR eyewear, an emerging trillion-sensor economy, and powerful artificial intelligence.
As a result, we will soon be able to superimpose digital information atop any physical surrounding—freeing our eyes from the tyranny of the screen, immersing us in smart environments, and making our world endlessly dynamic.
In the third post of our five-part series on augmented reality, we will explore the convergence of AR, AI, sensors, and blockchain and dive into the implications through a key use case in manufacturing.
A Tale of Convergence
Let’s deconstruct everything beneath the sleek AR display.
It all begins with graphics processing units (GPUs), electronic circuits that perform rapid calculations to render images. (GPUs can be found in mobile phones, game consoles, and computers.)
However, because AR requires such extensive computing power, single GPUs will not suffice. Instead, blockchain can now enable distributed GPU processing power, and blockchains specifically dedicated to AR holographic processing are on the rise.
Next up, cameras and sensors will aggregate real-time data from any environment to seamlessly integrate physical and virtual worlds. Meanwhile, body-tracking sensors are critical for aligning a user’s self-rendering in AR with a virtually enhanced environment. Depth sensors then provide data for 3D spatial maps, while cameras absorb more surface-level, detailed visual input. In some cases, sensors might even collect biometric data, such as heart rate and brain activity, to incorporate health-related feedback in our everyday AR interfaces and personal recommendation engines.
The next step in the pipeline involves none other than AI. Processing enormous volumes of data instantaneously, embedded AI algorithms will power customized AR experiences in everything from artistic virtual overlays to personalized dietary annotations.
In retail, AIs will use your purchasing history, current closet inventory, and possibly even mood indicators to display digitally rendered items most suitable for your wardrobe, tailored to your measurements.
In healthcare, smart AR glasses will provide physicians with immediately accessible and maximally relevant information (parsed from the entirety of a patient’s medical records and current research) to aid in accurate diagnoses and treatments, freeing doctors to engage in the more human-centric tasks of establishing trust, educating patients and demonstrating empathy.
Image Credit: PHD Ventures.
Convergence in Manufacturing
One of the nearest-term use cases of AR is manufacturing, as large producers begin dedicating capital to enterprise AR headsets. And over the next ten years, AR will converge with AI, sensors, and blockchain to multiply manufacturer productivity and employee experience.
(1) Convergence with AI
In initial application, digital guides superimposed on production tables will vastly improve employee accuracy and speed, while minimizing error rates.
Already, the International Air Transport Association (IATA) — whose airlines supply 82 percent of air travel — recently implemented industrial tech company Atheer’s AR headsets in cargo management. And with barely any delay, IATA reported a whopping 30 percent improvement in cargo handling speed and no less than a 90 percent reduction in errors.
With similar success rates, Boeing brought Skylight’s smart AR glasses to the runway, now used in the manufacturing of hundreds of airplanes. Sure enough—the aerospace giant has now seen a 25 percent drop in production time and near-zero error rates.
Beyond cargo management and air travel, however, smart AR headsets will also enable on-the-job training without reducing the productivity of other workers or sacrificing hardware. Jaguar Land Rover, for instance, implemented Bosch’s Re’flekt One AR solution to gear technicians with “x-ray” vision: allowing them to visualize the insides of Range Rover Sport vehicles without removing any dashboards.
And as enterprise capabilities continue to soar, AIs will soon become the go-to experts, offering support to manufacturers in need of assembly assistance. Instant guidance and real-time feedback will dramatically reduce production downtime, boost overall output, and even help customers struggling with DIY assembly at home.
Perhaps one of the most profitable business opportunities, AR guidance through centralized AI systems will also serve to mitigate supply chain inefficiencies at extraordinary scale. Coordinating moving parts, eliminating the need for manned scanners at each checkpoint, and directing traffic within warehouses, joint AI-AR systems will vastly improve workflow while overseeing quality assurance.
After its initial implementation of AR “vision picking” in 2015, leading courier company DHL recently announced it would continue to use Google’s newest smart lens in warehouses across the world. Motivated by the initial group’s reported 15 percent jump in productivity, DHL’s decision is part of the logistics giant’s $300 million investment in new technologies.
And as direct-to-consumer e-commerce fundamentally transforms the retail sector, supply chain optimization will only grow increasingly vital. AR could very well prove the definitive step for gaining a competitive edge in delivery speeds.
As explained by Vital Enterprises CEO Ash Eldritch, “All these technologies that are coming together around artificial intelligence are going to augment the capabilities of the worker and that’s very powerful. I call it Augmented Intelligence. The idea is that you can take someone of a certain skill level and by augmenting them with artificial intelligence via augmented reality and the Internet of Things, you can elevate the skill level of that worker.”
Already, large producers like Goodyear, thyssenkrupp, and Johnson Controls are using the Microsoft HoloLens 2—priced at $3,500 per headset—for manufacturing and design purposes.
Perhaps the most heartening outcome of the AI-AR convergence is that, rather than replacing humans in manufacturing, AR is an ideal interface for human collaboration with AI. And as AI merges with human capital, prepare to see exponential improvements in productivity, professional training, and product quality.
(2) Convergence with Sensors
On the hardware front, these AI-AR systems will require a mass proliferation of sensors to detect the external environment and apply computer vision in AI decision-making.
To measure depth, for instance, some scanning depth sensors project a structured pattern of infrared light dots onto a scene, detecting and analyzing reflected light to generate 3D maps of the environment. Stereoscopic imaging, using two lenses, has also been commonly used for depth measurements. But leading technologies like Microsoft’s HoloLens 2 and Intel’s RealSense 400-series camera implement a new method called “phased time-of-flight” (ToF).
In ToF sensing, the HoloLens 2 uses numerous lasers, each with 100 milliwatts (mW) of power, in quick bursts. The distance between nearby objects and the headset wearer is then measured by how far the returning light has shifted in phase relative to the original signal. Finally, the phase difference reveals the location of each object within the field of view, which enables accurate hand-tracking and surface reconstruction.
With a far lower computing power requirement, the phased ToF sensor is also more durable than stereoscopic sensing, which relies on the precise alignment of two prisms. The phased ToF sensor’s silicon base also makes it easily mass-produced, rendering the HoloLens 2 a far better candidate for widespread consumer adoption.
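As a rough illustration of the phase-based measurement described above, here is a minimal sketch of the standard continuous-wave ToF relation, d = c · Δφ / (4π · f). The 100 MHz modulation frequency used in the example is purely an assumption for illustration, not a HoloLens 2 specification, and the function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_phase(phase_shift_rad: float, modulation_hz: float) -> float:
    """Convert a measured phase shift into a one-way distance in meters.

    The light travels to the object and back, so the round trip covers 2 * d.
    That is why the denominator is 4 * pi * f rather than 2 * pi * f.
    """
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    # A phase can only be measured modulo one full cycle.
    wrapped = phase_shift_rad % (2 * math.pi)
    return SPEED_OF_LIGHT_M_PER_S * wrapped / (4 * math.pi * modulation_hz)


def unambiguous_range(modulation_hz: float) -> float:
    """Largest distance measurable before the phase wraps around."""
    return SPEED_OF_LIGHT_M_PER_S / (2 * modulation_hz)


if __name__ == "__main__":
    f_mod = 100e6  # assumed modulation frequency, for illustration only
    print(f"half-cycle shift -> {distance_from_phase(math.pi, f_mod):.3f} m")
    print(f"unambiguous range -> {unambiguous_range(f_mod):.3f} m")
```

The wrap-around is the main design trade-off in this scheme: a higher modulation frequency gives finer depth resolution but a shorter unambiguous range, which suits close-range tasks such as hand tracking.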
To apply inertial measurement—typically used in airplanes and spacecraft—the HoloLens 2 additionally uses a built-in accelerometer, gyroscope, and magnetometer. Further equipped with four “environment understanding cameras” that track head movements, the headset also uses a 2.4MP HD photographic video camera and ambient light sensor that work in concert to enable advanced computer vision.
For natural viewing experiences, sensor-supplied gaze tracking increasingly creates depth in digital displays. Nvidia’s work on Foveated AR Display, for instance, brings the primary foveal area into focus, while peripheral regions fall into a softer background— mimicking natural visual perception and concentrating computing power on the area that needs it most.
Gaze tracking sensors are also slated to grant users control over their (now immersive) screens without any hand gestures. Simple visual cues, even staring at an object for more than three seconds, will activate commands instantaneously.
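A dwell-based trigger of the kind described above can be sketched in a few lines. This is only an illustrative sketch: the sample format (a timestamp in seconds plus the name of the object under the user's gaze, or None), the function name, and the target names are all assumptions, and a real headset SDK would expose gaze data differently.

```python
from typing import Iterable, Iterator, Optional, Tuple

DWELL_SECONDS = 3.0  # threshold mentioned in the text

GazeSample = Tuple[float, Optional[str]]  # (timestamp, gazed-at target or None)


def dwell_selections(
    samples: Iterable[GazeSample],
    dwell_seconds: float = DWELL_SECONDS,
) -> Iterator[Tuple[float, str]]:
    """Yield (timestamp, target) once per uninterrupted stare longer than dwell_seconds."""
    current: Optional[str] = None
    started_at = 0.0
    fired = False
    for timestamp, target in samples:
        if target != current:
            # Gaze moved to a new target (or to nothing): restart the timer.
            current, started_at, fired = target, timestamp, False
            continue
        if current is not None and not fired and timestamp - started_at > dwell_seconds:
            fired = True  # fire only once per stare
            yield timestamp, current


if __name__ == "__main__":
    stream = [
        (0.0, "lamp"), (1.0, "lamp"), (2.0, None),
        (2.5, "valve"), (4.0, "valve"), (5.6, "valve"), (7.0, "valve"),
    ]
    for when, what in dwell_selections(stream):
        print(f"activate {what} at t={when}s")
```

Firing only once per stare prevents a long look from triggering the same command repeatedly, and looking away resets the timer.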
And our manufacturing example above is not the only one. Stacked convergence of blockchain, sensors, AI and AR will disrupt almost every major industry.
Take healthcare, for example, wherein biometric sensors will soon customize users’ AR experiences. Already, MIT Media Lab’s Deep Reality group has created an underwater VR relaxation experience that responds to real-time brain activity detected by a modified version of the Muse EEG. The experience even adapts to users’ biometric data, from heart rate to electrodermal activity (inputted from an Empatica E4 wristband).
Now rapidly dematerializing, sensors will converge with AR to improve physical-digital surface integration, intuitive hand and eye controls, and an increasingly personalized augmented world. Keep an eye on companies like MicroVision, now making tremendous leaps in sensor technology.
While I’ll be doing a deep dive into sensor applications across each industry in our next blog, it’s critical to first discuss how we might power sensor- and AI-driven augmented worlds.
(3) Convergence with Blockchain
Because AR requires much more compute power than typical 2D experiences, centralized GPUs and cloud computing systems are hard at work to provide the necessary infrastructure. Nonetheless, the workload is taxing and blockchain may prove the best solution.
A major player in this pursuit, Otoy aims to create the largest distributed GPU network in the world, called the Render Network (RNDR). Built specifically on the Ethereum blockchain for holographic media, and undergoing beta testing, this network is set to revolutionize AR deployment accessibility.
Alphabet Chairman Eric Schmidt (an investor in Otoy’s network) has even said, “I predicted that 90% of computing would eventually reside in the web based cloud… Otoy has created a remarkable technology which moves that last 10%—high-end graphics processing—entirely to the cloud. This is a disruptive and important achievement. In my view, it marks the tipping point where the web replaces the PC as the dominant computing platform of the future.”
Leveraging the crowd, RNDR allows anyone with a GPU to contribute their power to the network for a commission of up to $300 a month in RNDR tokens. These can then be redeemed in cash or used to create users’ own AR content.
In a double win, Otoy’s blockchain network and similar iterations not only allow designers to profit when not using their GPUs, but also democratize the experience for newer artists in the field.
And beyond these networks’ power suppliers, distributing GPU processing power will allow more manufacturing companies to access AR design tools and customize learning experiences. By further dispersing content creation across a broad network of individuals, blockchain also has the valuable potential to boost AR hardware investment across a number of industry beneficiaries.
On the consumer side, startups like Scanetchain are also entering the blockchain-AR space for a different reason. Allowing users to scan items with their smartphone, Scanetchain’s app provides access to a trove of information, from manufacturer and price, to origin and shipping details.
Based on NEM (a peer-to-peer cryptocurrency that implements a blockchain consensus algorithm), the app aims to make information far more accessible and, in the process, create a social network of purchasing behavior. Users earn tokens by watching ads, and all transactions are hashed into blocks and securely recorded.
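To make "hashed into blocks" concrete, here is a generic hash-chain sketch. It is not NEM's actual data format or consensus algorithm, and the field names and sample transactions are invented for illustration; it only shows why changing a recorded transaction becomes detectable once each block stores the hash of the previous one.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, previous_hash, transactions):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "previous_hash": previous_hash, "transactions": transactions},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches):
    """Turn a list of transaction batches into a list of linked blocks."""
    chain = []
    previous_hash = GENESIS_HASH
    for index, transactions in enumerate(batches):
        digest = block_hash(index, previous_hash, transactions)
        chain.append({
            "index": index,
            "previous_hash": previous_hash,
            "transactions": transactions,
            "hash": digest,
        })
        previous_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check that each block links to the one before it."""
    previous_hash = GENESIS_HASH
    for block in chain:
        expected = block_hash(block["index"], block["previous_hash"], block["transactions"])
        if block["previous_hash"] != previous_hash or block["hash"] != expected:
            return False
        previous_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = build_chain([
        [{"user": "alice", "event": "watched_ad", "tokens": 1}],
        [{"user": "bob", "event": "scanned_item", "tokens": 2}],
    ])
    print(is_valid(chain))  # chain as built
    chain[0]["transactions"][0]["tokens"] = 99  # tamper with history
    print(is_valid(chain))  # tampering is detected
```

A real network adds consensus among many nodes on top of this structure; the sketch covers only the tamper-evidence part.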
The writing is on the wall—our future of brick-and-mortar retail will largely lean on blockchain to create the necessary digital links.
Integrating AI into AR creates an “auto-magical” manufacturing pipeline that will fundamentally transform the industry, cutting down on marginal costs, reducing inefficiencies and waste, and maximizing employee productivity.
Bolstering the AI-AR convergence, sensor technology is already blurring the boundaries between our augmented and physical worlds, soon to be near-undetectable. While intuitive hand and eye motions dictate commands in a hands-free interface, biometric data is poised to customize each AR experience to be far more in touch with our mental and physical health.
And underpinning it all, distributed computing power with blockchain networks like RNDR will democratize AR, boosting global consumer adoption at plummeting price points.
As AR soars in importance—whether in retail, manufacturing, entertainment, or beyond—the stacked convergence discussed above merits significant investment over the next decade. The augmented world is only just getting started.
This article originally appeared on Diamandis.com
Image Credit: Funky Focus / Pixabay