Tag Archives: speech

#436151 Natural Language Processing Dates Back ...

This is part one of a six-part series on the history of natural language processing.

We’re in the middle of a boom time for natural language processing (NLP), the field of computer science that focuses on linguistic interactions between humans and machines. Thanks to advances in machine learning over the past decade, we’ve seen vast improvements in speech recognition and machine translation software. Language generators are now good enough to write coherent news articles, and virtual agents like Siri and Alexa are becoming part of our daily lives.

Most trace the origins of this field back to the beginning of the computer age, when Alan Turing, writing in 1950, imagined a smart machine that could interact fluently with a human via typed text on a screen. For this reason, machine-generated language is mostly understood as a digital phenomenon—and a central goal of artificial intelligence (AI) research.

This six-part series will challenge that common understanding of NLP. In fact, attempts to design formal rules and machines that can analyze, process, and generate language go back hundreds of years.

Attempts to design formal rules and machines that can analyze, process, and generate language go back hundreds of years.

While specific technologies have changed over time, the basic idea of treating language as a material that can be artificially manipulated by rule-based systems has been pursued by many people in many cultures and for many different reasons. These historical experiments reveal the promise and perils of attempting to simulate human language in non-human ways—and they hold lessons for today’s practitioners of cutting-edge NLP techniques.

The story begins in medieval Spain. In the late 1200s, a Jewish mystic by the name of Abraham Abulafia sat down at a table in his small house in Barcelona, picked up a quill, dipped it in ink, and began combining the letters of the Hebrew alphabet in strange and seemingly random ways. Aleph with Bet, Bet with Gimmel, Gimmel with Aleph and Bet, and so on.

Abulafia called this practice “the science of the combination of letters.” He wasn’t actually combining letters at random; instead he was carefully following a secret set of rules that he had devised while studying an ancient Kabbalistic text called the Sefer Yetsirah. This book describes how God created “all that is formed and all that is spoken” by combining Hebrew letters according to sacred formulas. In one section, God exhausts all possible two-letter combinations of the 22 Hebrew letters.

By studying the Sefer Yetsirah, Abulafia gained the insight that linguistic symbols can be manipulated with formal rules in order to create new, interesting, insightful sentences. To this end, he spent months generating thousands of combinations of the 22 letters of the Hebrew alphabet and eventually emerged with a series of books that he claimed were endowed with prophetic wisdom.

For Abulafia, generating language according to divine rules offered insight into the sacred and the unknown, or as he put it, allowed him to “grasp things which by human tradition or by thyself thou would not be able to know.”

Combining letters to generate language allows thou to “grasp things which by human tradition or by thyself thou would not be able to know.”
—Abraham Abulafia, mystic

But other Jewish scholars considered this rudimentary language generation a dangerous act that bordered on the profane. The Talmud tells stories of rabbis who, by the magical act of permuting language according to the formulas set out in the Sefer Yetsirah, created artificial creatures called golems. In these tales, rabbis manipulated the letters of the Hebrew alphabet to replicate God’s act of creation, using the sacred formulas to imbue inanimate objects with life.

In some of these myths, the rabbis used this skill for practical reasons, to make animals to eat when hungry or servants to help them with domestic duties. But many of these golem stories end badly. In one particularly well-known fable, Judah Loew ben Bezalel, the 16th century rabbi of Prague, used the sacred practice of letter combinatorics to conjure a golem to protect the Jewish community from antisemitic attacks, only to see the golem turn violently on him instead.

This “science of the combination of letters” was a rudimentary form of natural language processing, as it involved combining letters of the Hebrew alphabet according to specific rules. For Kabbalists, it was a double-edged sword: a way to access new forms of knowledge and wisdom, but also an inherently dangerous practice that could bring about unintended consequences.

This tension reappears throughout the long history of language processing, and still echoes in discussions about the most cutting-edge NLP technology of our digital era.

This is the first installment of a six-part series on the history of natural language processing. Come back next Monday for part two, “In the 17th Century, Leibniz Dreamed of a Machine That Could Calculate Ideas​.”

You can also check out our prior series on the untold history of AI. Continue reading

Posted in Human Robots

#435824 A Q&A with Cruise’s head of AI, ...

In 2016, Cruise, an autonomous vehicle startup acquired by General Motors, had about 50 employees. At the beginning of 2019, the headcount at its San Francisco headquarters—mostly software engineers, mostly working on projects connected to machine learning and artificial intelligence—hit around 1000. Now that number is up to 1500, and by the end of this year it’s expected to reach about 2000, sprawling into a recently purchased building that had housed Dropbox. And that’s not counting the 200 or so tech workers that Cruise is aiming to install in a Seattle, Wash., satellite development center and a handful of others in Phoenix, Ariz., and Pasadena, Calif.

Cruise’s recent hires aren’t all engineers—it takes more than engineering talent to manage operations. And there are hundreds of so-called safety drivers that are required to sit in the 180 or so autonomous test vehicles whenever they roam the San Francisco streets. But that’s still a lot of AI experts to be hiring in a time of AI engineer shortages.

Hussein Mehanna, head of AI/ML at Cruise, says the company’s hiring efforts are on track, due to the appeal of the challenge of autonomous vehicles in drawing in AI experts from other fields. Mehanna himself joined Cruise in May from Google, where he was director of engineering at Google Cloud AI. Mehanna had been there about a year and a half, a relatively quick career stop after a short stint at Snap following four years working in machine learning at Facebook.

Mehanna has been immersed in AI and machine learning research since his graduate studies in speech recognition and natural language processing at the University of Cambridge. I sat down with Mehanna to talk about his career, the challenges of recruiting AI experts and autonomous vehicle development in general—and some of the challenges specific to San Francisco. We were joined by Michael Thomas, Cruise’s manager of AI/ML recruiting, who had also spent time recruiting AI engineers at Google and then Facebook.

IEEE Spectrum: When you were at Cambridge, did you think AI was going to take off like a rocket?

Mehanna: Did I imagine that AI was going to be as dominant and prevailing and sometimes hyped as it is now? No. I do recall in 2003 that my supervisor and I were wondering if neural networks could help at all in speech recognition. I remember my supervisor saying if anyone could figure out how use a neural net for speech he would give them a grant immediately. So he was on the right path. Now neural networks have dominated vision, speech, and language [processing]. But that boom started in 2012.

“In the early days, Facebook wasn’t that open to PhDs, it actually had a negative sentiment about researchers, and then Facebook shifted”

I didn’t [expect it], but I certainly aimed for it when [I was at] Microsoft, where I deliberately pushed my career towards machine learning instead of big data, which was more popular at the time. And [I aimed for it] when I joined Facebook.

In the early days, Facebook wasn’t that open to PhDs, or researchers. It actually had a negative sentiment about researchers. And then Facebook shifted to becoming one of the key places where PhD students wanted to do internships or join after they graduated. It was a mindset shift, they were [once] at a point in time where they thought what was needed for success wasn’t research, but now it’s different.

There was definitely an element of risk [in taking a machine learning career path], but I was very lucky, things developed very fast.

IEEE Spectrum: Is it getting harder or easier to find AI engineers to hire, given the reported shortages?

Mehanna: There is a mismatch [between job openings and qualified engineers], though it is hard to quantify it with numbers. There is good news as well: I see a lot more students diving deep into machine learning and data in their [undergraduate] computer science studies, so it’s not as bleak as it seems. But there is massive demand in the market.

Here at Cruise, demand for AI talent is just growing and growing. It might be is saturating or slowing down at other kinds of companies, though, [which] are leveraging more traditional applications—ad prediction, recommendations—that have been out there in the market for a while. These are more mature, better understood problems.

I believe autonomous vehicle technologies is the most difficult AI problem out there. The magnitude of the challenge of these problems is 1000 times more than other problems. They aren’t as well understood yet, and they require far deeper technology. And also the quality at which they are expected to operate is off the roof.

The autonomous vehicle problem is the engineering challenge of our generation. There’s a lot of code to write, and if we think we are going to hire armies of people to write it line by line, it’s not going to work. Machine learning can accelerate the process of generating the code, but that doesn’t mean we aren’t going to have engineers; we actually need a lot more engineers.

Sometimes people worry that AI is taking jobs. It is taking some developer jobs, but it is actually generating other developer jobs as well, protecting developers from the mundane and helping them build software faster and faster.

IEEE Spectrum: Are you concerned that the demand for AI in industry is drawing out the people in academia who are needed to educate future engineers, that is, the “eating the seed corn” problem?

Mehanna: There are some negative examples in the industry, but that’s not our style. We are looking for collaborations with professors, we want to cultivate a very deep and respectful relationship with universities.

And there’s another angle to this: Universities require a thriving industry for them to thrive. It is going to be extremely beneficial for academia to have this flourishing industry in AI, because it attracts more students to academia. I think we are doing them a fantastic favor by building these career opportunities. This is not the same as in my early days, [when] people told me “don’t go to AI; go to networking, work in the mobile industry; mobile is flourishing.”

IEEE Spectrum: Where are you looking as you try to find a thousand or so engineers to hire this year?

Thomas: We look for people who want to use machine learning to solve problems. They can be in many different industries—in the financial markets, in social media, in advertising. The autonomous vehicle industry is in its infancy. You can compare it to mobile in the early days: When the iPhone first came out, everyone was looking for developers with mobile experience, but you weren’t going to find them unless you went to straight to Apple, [so you had to hire other kinds of engineers]. This is the same type of thing: it is so new that you aren’t going to find experts in this area, because we are all still learning.

“You don’t have to be an autonomous vehicle expert to flourish in this world. It’s not too late to move…now would be a great time for AI experts working on other problems to shift their attention to autonomous vehicles.”

Mehanna: Because autonomous vehicle technology is the new frontier for AI experts, [the number of] people with both AI and autonomous vehicle experience is quite limited. So we are acquiring AI experts wherever they are, and helping them grow into the autonomous vehicle area. You don’t have to be an autonomous vehicle expert to flourish in this world. It’s not too late to move; even though there is a lot of great tech developed, there’s even more innovation ahead, so now would be a great time for AI experts working on other problems or applications to shift their attention to autonomous vehicles.

It feels like the Internet in 1980. It’s about to happen, but there are endless applications [to be developed over] the next few decades. Even if we can get a car to drive safely, there is the question of how can we tune the ride comfort, and then applying it all to different cities, different vehicles, different driving situations, and who knows to what other applications.

I can see how I can spend a lifetime career trying to solve this problem.

IEEE Spectrum: Why are you doing most of your development in San Francisco?

Mehanna: I think the best talent of the world is in Silicon Valley, and solving the autonomous vehicle problem is going to require the best of the best. It’s not just the engineering talent that is here, but [also] the entrepreneurial spirit. Solving the problem just as a technology is not going to be successful, you need to solve the product and the technology together. And the entrepreneurial spirit is one of the key reasons Cruise secured 7.5 billion in funding [besides GM, the company has a number of outside investors, including Honda, Softbank, and T. Rowe Price]. That [funding] is another reason Cruise is ahead of many others, because this problem requires deep resources.

“If you can do an autonomous vehicle in San Francisco you can do it almost anywhere.”

[And then there is the driving environment.] When I speak to my peers in the industry, they have a lot of respect for us, because the problems to solve in San Francisco technically are an order of magnitude harder. It is a tight environment, with a lot of pedestrians, and driving patterns that, let’s put it this way, are not necessarily the best in the nation. Which means we are seeing more problems ahead of our competitors, which gets us to better [software]. I think if you can do an autonomous vehicle in San Francisco you can do it almost anywhere.

A version of this post appears in the September 2019 print magazine as “AI Engineers: The Autonomous-Vehicle Industry Wants You.” Continue reading

Posted in Human Robots

#435742 This ‘Useless’ Social Robot ...

The recent high profile failures of some home social robots (and the companies behind them) have made it even more challenging than it was before to develop robots in that space. And it was challenging enough to begin with—making a robot that can autonomous interact with random humans in their homes over a long period of time for a price that people can afford is extraordinarily difficult. However, the massive amount of initial interest in robots like Jibo, Kuri, Vector, and Buddy prove that people do want these things, or at least think they do, and while that’s the case, there’s incentive for other companies to give social home robots a try.

One of those companies is Zoetic, founded in 2107 by Mita Yun and Jitu Das, both ex-Googlers. Their robot, Kiki, is more or less exactly what you’d expect from a social home robot: It’s cute, white, roundish, has big eyes, promises that it will be your “robot sidekick,” and is not cheap: It’s on Kicksterter for $800. Kiki is among what appears to be a sort of tentative second wave of social home robots, where designers have (presumably) had a chance to take everything that they learned from the social home robot pioneers and use it to make things better this time around.

Kiki’s Kickstarter video is, again, more or less exactly what you’d expect from a social home robot crowdfunding campaign:

We won’t get into all of the details on Kiki in this article (the Kickstarter page has tons of information), but a few distinguishing features:

Each Kiki will develop its own personality over time through its daily interactions with its owner, other people, and other Kikis.
Interacting with Kiki is more abstract than with most robots—it can understand some specific words and phrases, and will occasionally use a few specific words or two, but otherwise it’s mostly listening to your tone of voice and responding with sounds rather than speech.
Kiki doesn’t move on its own, but it can operate for up to two hours away from its charging dock.
Depending on how your treat Kiki, it can get depressed or neurotic. It also needs to be fed, which you can do by drawing different kinds of food in the app.
Everything Kiki does runs on-board the robot. It has Wi-Fi connectivity for updates, but doesn’t rely on the cloud for anything in real-time, meaning that your data stays on the robot and that the robot will continue to function even if its remote service shuts down.

It’s hard to say whether features like these are unique enough to help Kiki be successful where other social home robots haven’t been, so we spoke with Zoetic co-founder Mita Yun and asked her why she believes that Kiki is going to be the social home robot that makes it.

IEEE Spectrum: What’s your background?

Mita Yun: I was an only child growing up, and so I always wanted something like Doraemon or Totoro. Something that when you come home it’s there to greet you, not just because it’s programmed to do that but because it’s actually actively happy to see you, and only you. I was so interested in this that I went to study robotics at CMU and then after I graduated I joined Google and worked there for five years. I tended to go for the more risky and more fun projects, but they always got cancelled—the first project I joined was called Android at Home, and then I joined Google Glass, and then I joined a team called Robots for Kids. That project was building educational robots, and then I just realized that when we’re adding technology to something, to a product, we’re actually taking the life away somehow, and the kids were more connected with stuffed animals compared to the educational robots we were building. That project was also cancelled, and in 2017, I left with a coworker of mine (Jitu Das) to bring this dream into reality. And now we’re building Kiki.

“Jibo was Alexa plus cuteness equals $800, and I feel like that equation doesn’t work for most people, and that eventually killed the company. So, for Kiki, we are actually building something very different. We’re building something that’s completely useless”
—Mita Yun, Zoetic

You started working on Kiki in 2017, when things were already getting challenging for Jibo—why did you decide to start developing a social home robot at that point?

I thought Jibo was great. It had a special magical way of moving, and it was such a new idea that you could have this robot with embodiment and it can actually be your assistant. The problem with Jibo, in my opinion, was that it took too long to fulfill the orders. It took them three to four years to actually manufacture, because it was a very complex piece of hardware, and then during that period of time Alexa and Google Home came out, and they started selling these voice systems for $30 and then you have Jibo for $800. Jibo was Alexa plus cuteness equals $800, and I feel like that equation doesn’t work for most people, and that eventually killed the company. So, for Kiki, we are actually building something very different. We’re building something that’s completely useless.

Can you elaborate on “completely useless?”

I feel like people are initially connected with robots because they remind them of a character. And it’s the closest we can get to a character other than an organic character like an animal. So we’re connected to a character like when we have a robot in a mall that’s roaming around, even if it looks really ugly, like if it doesn’t have eyes, people still take selfies with it. Why? Because they think it’s a character. And humans are just hardwired to love characters and love stories. With Kiki, we just wanted to build a character that’s alive, we don’t want to have a character do anything super useful.

I understand why other robotics companies are adding Alexa integration to their robots, and I think that’s great. But the dream I had, and the understanding I have about robotics technology, is that for a consumer robot especially, it is very very difficult for the robot to justify its price through usefulness. And then there’s also research showing that the more useless something is, the easier it is to have an emotional connection, so that’s why we want to keep Kiki very useless.

What kind of character are you creating with Kiki?

The whole design principle around Kiki is we want to make it a very vulnerable character. In terms of its status at home, it’s not going to be higher or equal status as the owner, but slightly lower status than the human, and it’s vulnerable and needs you to take care of it in order to grow up into a good personality robot.

We don’t let Kiki speak full English sentences, because whenever it does that, people are going to think it’s at least as intelligent as a baby, which is impossible for robots at this point. And we also don’t let it move around, because when you have it move around, people are going to think “I’m going to call Kiki’s name, and then Kiki is will come to me.” But that is actually very difficult to build. And then also we don’t have any voice integration so it doesn’t tell you about the stock market price and so on.

Photo: Zoetic

Kiki is designed to be “vulnerable,” and it needs you to take care of it so it can “grow up into a good personality robot,” according to its creators.

That sounds similar to what Mayfield did with Kuri, emphasizing an emotional connection rather than specific functionality.

It is very similar, but one of the key differences from Kuri, I think, is that Kuri started with a Kobuki base, and then it’s wrapped into a cute shell, and they added sounds. So Kuri started with utility in mind—navigation is an important part of Kuri, so they started with that challenge. For Kiki, we started with the eyes. The entire thing started with the character itself.

How will you be able to convince your customers to spend $800 on a robot that you’ve described as “useless” in some ways?

Because it’s useless, it’s actually easier to convince people, because it provides you with an emotional connection. I think Kiki is not a utility-driven product, so the adoption cycle is different. For a functional product, it’s very easy to pick up, because you can justify it by saying “I’m going to pay this much and then my life can become this much more efficient.” But it’s also very easy to be replaced and forgotten. For an emotional-driven product, it’s slower to pick up, but once people actually pick it up, they’re going to be hooked—they get be connected with it, and they’re willing to invest more into taking care of the robot so it will grow up to be smarter.

Maintaining value over time has been another challenge for social home robots. How will you make sure that people don’t get bored with Kiki after a few weeks?

Of course Kiki has limits in what it can do. We can combine the eyes, the facial expression, the motors, and lights and sounds, but is it going to be constantly entertaining? So we think of this as, imagine if a human is actually puppeteering Kiki—can Kiki stay interesting if a human is puppeteering it and interacting with the owner? So I think what makes a robot interesting is not just in the physical expressions, but the part in between that and the robot conveying its intentions and emotions.

For example, if you come into the room and then Kiki decides it will turn the other direction, ignore you, and then you feel like, huh, why did the robot do that to me? Did I do something wrong? And then maybe you will come up to it and you will try to figure out why it did that. So, even though Kiki can only express in four different dimensions, it can still make things very interesting, and then when its strategies change, it makes it feel like a new experience.

There’s also an explore and exploit process going on. Kiki wants to make you smile, and it will try different things. It could try to chase its tail, and if you smile, Kiki learns that this works and will exploit it. But maybe after doing it three times, you no longer find it funny, because you’re bored of it, and then Kiki will observe your reactions and be motivated to explore a new strategy.

Photo: Zoetic

Kiki’s creators are hoping that, with an emotionally engaging robot, it will be easier for people to get attached to it and willing to spend time taking care of it.

A particular risk with crowdfunding a robot like this is setting expectations unreasonably high. The emphasis on personality and emotional engagement with Kiki seems like it may be very difficult for the robot to live up to in practice.

I think we invested more than most robotics companies into really building out Kiki’s personality, because that is the single most important thing to us. For Jibo a lot of the focus was in the assistant, and for Kuri, it’s more in the movement. For Kiki, it’s very much in the personality.

I feel like when most people talk about personality, they’re mainly talking about expression. With Kiki, it’s not just in the expression itself, not just in the voice or the eyes or the output layer, it’s in the layer in between—when Kiki receives input, how will it make decisions about what to do? We actually don’t think the personality of Kiki is categorizable, which is why I feel like Kiki has a deeper implementation of how personalities should work. And you’re right, Kiki doesn’t really understand why you’re feeling a certain way, it just reads your facial expressions. It’s maybe not your best friend, but maybe closer to your little guinea pig robot.

Photo: Zoetic

The team behind Kiki paid particular attention to its eyes, and designed the robot to always face the person that it is interacting with.

Is that where you’d put Kiki on the scale of human to pet?

Kiki is definitely not human, we want to keep it very far away from human. And it’s also not a dog or cat. When we were designing Kiki, we took inspiration from mammals because humans are deeply connected to mammals since we’re mammals ourselves. And specifically we’re connected to predator animals. With prey animals, their eyes are usually on the sides of their heads, because they need to see different angles. A predator animal needs to hunt, they need to focus. Cats and dogs are predator animals. So with Kiki, that’s why we made sure the eyes are on one side of the face and the head can actuate independently from the body and the body can turn so it’s always facing the person that it’s paying attention to.

I feel like Kiki is probably does more than a plant. It does more than a fish, because a fish doesn’t look you in the eyes. It’s not as smart as a cat or a dog, so I would just put it in this guinea pig kind of category.

What have you found so far when running user studies with Kiki?

When we were first designing Kiki we went through a whole series of prototypes. One of the earlier prototypes of Kiki looked like a CRT, like a very old monitor, and when we were testing that with people they didn’t even want to touch it. Kiki’s design inspiration actually came from an airplane, with a very angular, futuristic look, but based on user feedback we made it more round and more friendly to the touch. The lights were another feature request from the users, which adds another layer of expressivity to Kiki, and they wanted to see multiple Kikis working together with different personalities. Users also wanted different looks for Kiki, to make it look like a deer or a unicorn, for example, and we actually did take that into consideration because it doesn’t look like any particular mammal. In the future, you’ll be able to have different ears to make it look like completely different animals.

There has been a lot of user feedback that we didn’t implement—I believe we should observe the users reactions and feedback but not listen to their advice. The users shouldn’t be our product designers, because if you test Kiki with 10 users, eight of them will tell you they want Alexa in it. But we’re never going to add Alexa integration to Kiki because that’s not what it’s meant to do.

While it’s far too early to tell whether Kiki will be a long-term success, the Kickstarter campaign is currently over 95 percent funded with 8 days to go, and 34 robots are still available for a May 2020 delivery.

[ Kickstarter ] Continue reading

Posted in Human Robots

#435681 Video Friday: This NASA Robot Uses ...

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here’s what we have so far (send us your events!):

ICRES 2019 – July 29-30, 2019 – London, U.K.
DARPA SubT Tunnel Circuit – August 15-22, 2019 – Pittsburgh, Pa., USA
IEEE Africon 2019 – September 25-27, 2019 – Accra, Ghana
ISRR 2019 – October 6-10, 2019 – Hanoi, Vietnam
Let us know if you have suggestions for next week, and enjoy today’s videos.

Robots can land on the Moon and drive on Mars, but what about the places they can’t reach? Designed by engineers as NASA’s Jet Propulsion Laboratory in Pasadena, California, a four-limbed robot named LEMUR (Limbed Excursion Mechanical Utility Robot) can scale rock walls, gripping with hundreds of tiny fishhooks in each of its 16 fingers and using artificial intelligence to find its way around obstacles. In its last field test in Death Valley, California, in early 2019, LEMUR chose a route up a cliff, scanning the rock for ancient fossils from the sea that once filled the area.

The LEMUR project has since concluded, but it helped lead to a new generation of walking, climbing and crawling robots. In future missions to Mars or icy moons, robots with AI and climbing technology derived from LEMUR could discover similar signs of life. Those robots are being developed now, honing technology that may one day be part of future missions to distant worlds.

[ NASA ]

This video demonstrates the autonomous footstep planning developed by IHMC. Robots in this video are the Atlas humanoid robot (DRC version) and the NASA Valkyrie. The operator specifies a goal location in the world, which is modeled as planar regions using the robot’s perception sensors. The planner then automatically computes the necessary steps to reach the goal using a Weighted A* algorithm. The algorithm does not reject footholds that have a certain amount of support, but instead modifies them after the plan is found to try and increase that support area.

Currently, narrow terrain has a success rate of about 50%, rough terrain is about 90%, whereas flat ground is near 100%. We plan on increasing planner speed and the ability to plan through mazes and to unseen goals by including a body-path planner as the first step. Control, Perception, and Planning algorithms by IHMC Robotics.

[ IHMC ]

I’ve never really been able to get into watching people play poker, but throw an AI from CMU and Facebook into a game of no-limit Texas hold’em with five humans, and I’m there.

[ Facebook ]

In this video, Cassie Blue is navigating autonomously. Right now, her world is very small, the Wavefield at the University of Michigan, where she is told to turn left at intersections. You’re right, that is not a lot of independence, but it’s a first step away from a human and an RC controller!

Using a RealSense RGBD Camera, an IMU, and our version of an InEKF with contact factors, Cassie Blue is building a 3D semantic map in real time that identifies sidewalks, grass, poles, bicycles, and buildings. From the semantic map, occupancy and cost maps are built with the sidewalk identified as walk-able area and everything else considered as an obstacle. A planner then sets a goal to stay approximately 50 cm to the right of the sidewalk’s left edge and plans a path around obstacles and corners using D*. The path is translated into way-points that are achieved via Cassie Blue’s gait controller.

[ University of Michigan ]

Thanks Jesse!

Dave from HEBI Robotics wrote in to share some new actuators that are designed to get all kinds of dirty: “The R-Series takes HEBI’s X-Series to the next level, providing a sealed robotics solution for rugged, industrial applications and laying the groundwork for industrial users to address challenges that are not well met by traditional robotics. To prove it, we shot some video right in the Allegheny River here in Pittsburgh. Not a bad way to spend an afternoon :-)”

The R-Series Actuator is a full-featured robotic component as opposed to a simple servo motor. The output rotates continuously, requires no calibration or homing on boot-up, and contains a thru-bore for easy daisy-chaining of wiring. Modular in nature, R-Series Actuators can be used in everything from wheeled robots to collaborative robotic arms. They are sealed to IP67 and designed with a lightweight form factor for challenging field applications, and they’re packed with sensors that enable simultaneous control of position, velocity, and torque.

[ HEBI Robotics ]

Thanks Dave!

If your robot hands out karate chops on purpose, that’s great. If it hands out karate chops accidentally, maybe you should fix that.

COVR is short for “being safe around collaborative and versatile robots in shared spaces”. Our mission is to significantly reduce the complexity in safety certifying cobots. Increasing safety for collaborative robots enables new innovative applications, thus increasing production and job creation for companies utilizing the technology. Whether you’re an established company seeking to deploy cobots or an innovative startup with a prototype of a cobot related product, COVR will help you analyze, test and validate the safety for that application.

[ COVR ]

Thanks Anna!

EPFL startup Flybotix has developed a novel drone with just two propellers and an advanced stabilization system that allow it to fly for twice as long as conventional models. That fact, together with its small size, makes it perfect for inspecting hard-to-reach parts of industrial facilities such as ducts.

[ Flybotix ]

SpaceBok is a quadruped robot designed and built by a Swiss student team from ETH Zurich and ZHAW Zurich, currently being tested using Automation and Robotics Laboratories (ARL) facilities at our technical centre in the Netherlands. The robot is being used to investigate the potential of ‘dynamic walking’ and jumping to get around in low gravity environments.

SpaceBok could potentially go up to 2 m high in lunar gravity, although such a height poses new challenges. Once it comes off the ground the legged robot needs to stabilise itself to come down again safely – like a mini-spacecraft. So, like a spacecraft. SpaceBok uses a reaction wheel to control its orientation.

[ ESA ]

A new video from GITAI showing progress on their immersive telepresence robot for space.

[ GITAI ]

Tech United’s HERO robot (a Toyota HSR) competed in the RoboCup@Home competition, and it had a couple of garbage-related hiccups.

[ Tech United ]

Even small drones are getting better at autonomous obstacle avoidance in cluttered environments at useful speeds, as this work from the HKUST Aerial Robotics Group shows.

[ HKUST ]

DelFly Nimbles now come in swarms.

[ DelFly Nimble ]

This is a very short video, but it’s a fairly impressive look at a Baxter robot collaboratively helping someone put a shirt on, a useful task for folks with disabilities.

[ Shibata Lab ]

ANYmal can inspect the concrete in sewers for deterioration by sliding its feet along the ground.

[ ETH Zurich ]

HUG is a haptic user interface for teleoperating advanced robotic systems as the humanoid robot Justin or the assistive robotic system EDAN. With its lightweight robot arms, HUG can measure human movements and simultaneously display forces from the distant environment. In addition to such teleoperation applications, HUG serves as a research platform for virtual assembly simulations, rehabilitation, and training.

[ DLR ]

This video about “image understanding” from CMU in 1979 (!) is amazing, and even though it’s long, you won’t regret watching until 3:30. Or maybe you will.

[ ARGOS (pdf) ]

Will Burrard-Lucas’ BeetleCam turned 10 this month, and in this video, he recounts the history of his little robotic camera.

[ BeetleCam ]

In this week’s episode of Robots in Depth, Per speaks with Gabriel Skantze from Furhat Robotics.

Gabriel Skantze is co-founder and Chief Scientist at Furhat Robotics and Professor in speech technology at KTH with a specialization in conversational systems. He has a background in research into how humans use spoken communication to interact.

In this interview, Gabriel talks about how the social robot revolution makes it necessary to communicate with humans in a human ways through speech and facial expressions. This is necessary as we expand the number of people that interact with robots as well as the types of interaction. Gabriel gives us more insight into the many challenges of implementing spoken communication for co-bots, where robots and humans work closely together. They need to communicate about the world, the objects in it and how to handle them. We also get to hear how having an embodied system using the Furhat robot head helps the interaction between humans and the system.

[ Robots in Depth ] Continue reading

Posted in Human Robots

#435593 AI at the Speed of Light

Neural networks shine for solving tough problems such as facial and voice recognition, but conventional electronic versions are limited in speed and hungry for power. In theory, optics could beat digital electronic computers in the matrix calculations used in neural networks. However, optics had been limited by their inability to do some complex calculations that had required electronics. Now new experiments show that all-optical neural networks can tackle those problems.

The key attraction of neural networks is their massive interconnections among processors, comparable to the complex interconnections among neurons in the brain. This lets them perform many operations simultaneously, like the human brain does when looking at faces or listening to speech, making them more efficient for facial and voice recognition than traditional electronic computers that execute one instruction at a time.

Today's electronic neural networks have reached eight million neurons, but their future use in artificial intelligence may be limited by their high power usage and limited parallelism in connections. Optical connections through lenses are inherently parallel. The lens in your eye simultaneously focuses light from across your field of view onto the retina in the back of your eye, where an array of light-detecting nerve cells detects the light. Each cell then relays the signal it receives to neurons in the brain that process the visual signals to show us an image.

Glass lenses process optical signals by focusing light, which performs a complex mathematical operation called a Fourier transform that preserves the information in the original scene but rearranges is completely. One use of Fourier transforms is converting time variations in signal intensity into a plot of the frequencies present in the signal. The military used this trick in the 1950s to convert raw radar return signals recorded by an aircraft in flight into a three-dimensional image of the landscape viewed by the plane. Today that conversion is done electronically, but the vacuum-tube computers of the 1950s were not up to the task.

Development of neural networks for artificial intelligence started with electronics, but their AI applications have been limited by their slow processing and need for extensive computing resources. Some researchers have developed hybrid neural networks, in which optics perform simple linear operations, but electronics perform more complex nonlinear calculations. Now two groups have demonstrated simple all-optical neural networks that do all processing with light.

In May, Wolfram Pernice of the Institute of Physics at the University of Münster in Germany and colleagues reported testing an all-optical “neuron” in which signals change target materials between liquid and solid states, an effect that has been used for optical data storage. They demonstrated nonlinear processing, and produced output pulses like those from organic neurons. They then produced an integrated photonic circuit that incorporated four optical neurons operating at different wavelengths, each of which connected to 15 optical synapses. The photonic circuit contained more than 140 components and could recognize simple optical patterns. The group wrote that their device is scalable, and that the technology promises “access to the high speed and high bandwidth inherent to optical systems, thus enabling the direct processing of optical telecommunication and visual data.”

Now a group at the Hong Kong University of Science and Technology reports in Optica that they have made an all-optical neural network based on a different process, electromagnetically induced transparency, in which incident light affects how atoms shift between quantum-mechanical energy levels. The process is nonlinear and can be triggered by very weak light signals, says Shengwang Du, a physics professor and coauthor of the paper.

In their demonstration, they illuminated rubidium-85 atoms cooled by lasers to about 10 microKelvin (10 microdegrees above absolute zero). Although the technique may seem unusually complex, Du said the system was the most accessible one in the lab that could produce the desired effects. “As a pure quantum atomic system [it] is ideal for this proof-of-principle experiment,” he says.

Next, they plan to scale up the demonstration using a hot atomic vapor center, which is less expensive, does not require time-consuming preparation of cold atoms, and can be integrated with photonic chips. Du says the major challenges are reducing cost of the nonlinear processing medium and increasing the scale of the all-optical neural network for more complex tasks.

“Their demonstration seems valid,” says Volker Sorger, an electrical engineer at George Washington University in Washington who was not involved in either demonstration. He says the all-optical approach is attractive because it offers very high parallelism, but the update rate is limited to about 100 hertz because of the liquid crystals used in their test, and he is not completely convinced their approach can be scaled error-free. Continue reading

Posted in Human Robots