Tag Archives: online
#436258 For Centuries, People Dreamed of a ...
This is part six of a six-part series on the history of natural language processing.
In February of this year, OpenAI, one of the foremost artificial intelligence labs in the world, announced that a team of researchers had built a powerful new text generator called the Generative Pre-Trained Transformer 2, or GPT-2 for short. The researchers used an unsupervised learning algorithm to train their system on a broad set of natural language processing (NLP) capabilities, including reading comprehension, machine translation, and the ability to generate long strings of coherent text.
But as is often the case with NLP technology, the tool held both great promise and great peril. Researchers and policy makers at the lab were concerned that their system, if widely released, could be exploited by bad actors and misappropriated for “malicious purposes.”
The people of OpenAI, which defines its mission as “discovering and enacting the path to safe artificial general intelligence,” were concerned that GPT-2 could be used to flood the Internet with fake text, thereby degrading an already fragile information ecosystem. For this reason, OpenAI decided that it would not release the full version of GPT-2 to the public or other researchers.
GPT-2 is an example of a technique in NLP called language modeling, whereby the computational system internalizes a statistical blueprint of a text so it’s able to mimic it. Just like the predictive text on your phone—which selects words based on words you’ve used before—GPT-2 can look at a string of text and then predict what the next word is likely to be based on the probabilities inherent in that text.
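To make the idea of next-word prediction concrete, here is a minimal sketch of a word-level model that counts which word follows which. It is not GPT-2's method (GPT-2 is a large neural network); the function names and the toy corpus are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never followed."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus, invented for illustration only.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # prints "cat" (follows "the" twice)
```

Like predictive text on a phone, the sketch knows only the frequencies in what it has seen; a word it has never seen followed by anything yields no prediction at all.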
GPT-2 can be seen as a descendant of the statistical language modeling that the Russian mathematician A. A. Markov developed in the early 20th century (covered in part three of this series).
What’s different with GPT-2, though, is the scale of the textual data modeled by the system. Whereas Markov analyzed a string of 20,000 letters to create a rudimentary model that could predict the likelihood of the next letter of a text being a consonant or a vowel, GPT-2 used roughly 8 million web pages linked from Reddit to predict what the next word might be within that entire dataset.
And whereas Markov manually trained his model by counting only two parameters—vowels and consonants—GPT-2 used cutting-edge machine learning algorithms to do linguistic analysis with over 1.5 billion parameters, burning through huge amounts of computational power in the process.
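Markov's kind of counting can be sketched in a few lines. The example below is an illustration under stated assumptions, not a reconstruction of his procedure: it uses the English vowels a, e, i, o, u (Markov worked with Russian text), treats every other letter as a consonant, and runs on a toy string instead of 20,000 letters. The function names are hypothetical.

```python
from collections import Counter

VOWELS = set("aeiou")  # assumption: English vowels, for illustration only

def letter_classes(text):
    """Map each alphabetic character to 'V' (vowel) or 'C' (consonant)."""
    return ["V" if ch in VOWELS else "C" for ch in text.lower() if ch.isalpha()]

def vowel_probabilities(text):
    """Estimate P(next letter is a vowel | class of current letter) by counting pairs."""
    classes = letter_classes(text)
    pair_counts = Counter(zip(classes, classes[1:]))
    probs = {}
    for prev in "VC":
        total = pair_counts[(prev, "V")] + pair_counts[(prev, "C")]
        if total:
            probs[prev] = pair_counts[(prev, "V")] / total
    return probs

# In "banana" every consonant is followed by a vowel and every vowel by a consonant.
print(vowel_probabilities("banana"))  # prints {'V': 0.0, 'C': 1.0}
```

The whole model fits in two numbers, which is the contrast the comparison with GPT-2's parameter count is drawing.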
The results were impressive. In their blog post, OpenAI reported that GPT-2 could generate synthetic text in response to prompts, mimicking whatever style of text it was shown. If you prompt the system with a line of William Blake’s poetry, it can generate a line back in the Romantic poet’s style. If you prompt the system with a cake recipe, you get a newly invented recipe in response.
Perhaps the most compelling feature of GPT-2 is that it can answer questions accurately. For example, when OpenAI researchers asked the system, “Who wrote the book The Origin of Species?”—it responded: “Charles Darwin.” While only able to respond accurately some of the time, the feature does seem to be a limited realization of Gottfried Leibniz’s dream of a language-generating machine that could answer any and all human questions (described in part two of this series).
After observing the power of the new system in practice, OpenAI elected not to release the fully trained model. In the lead-up to its release in February, there had been heightened awareness about “deepfakes”—synthetic images and videos, generated via machine learning techniques, in which people do and say things they haven’t really done and said. Researchers at OpenAI worried that GPT-2 could be used to essentially create deepfake text, making it harder for people to trust textual information online.
Responses to this decision varied. On one hand, OpenAI’s caution prompted an overblown reaction in the media, with articles about the “dangerous” technology feeding into the Frankenstein narrative that often surrounds developments in AI.
Others took issue with OpenAI’s self-promotion, with some even suggesting that OpenAI purposefully exaggerated GPT-2’s power in order to create hype—while contravening a norm in the AI research community, where labs routinely share data, code, and pre-trained models. As machine learning researcher Zachary Lipton tweeted, “Perhaps what's *most remarkable* about the @OpenAI controversy is how *unremarkable* the technology is. Despite their outsize attention & budget, the research itself is perfectly ordinary—right in the main branch of deep learning NLP research.”
OpenAI stood by its decision to release only a limited version of GPT-2, but has since released larger models for other researchers and the public to experiment with. As yet, there has been no reported case of a widely distributed fake news article generated by the system. But there have been a number of interesting spin-off projects, including GPT-2 poetry and a webpage where you can prompt the system with questions yourself.
There’s even a Reddit group populated entirely with text produced by GPT-2-powered bots. Mimicking humans on Reddit, the bots have long conversations about a variety of topics, including conspiracy theories and Star Wars movies.
This bot-powered conversation may signify the new condition of life online, where language is increasingly created by a combination of human and non-human agents, and where maintaining the distinction between human and non-human, despite our best efforts, is increasingly difficult.
The idea of using rules, mechanisms, and algorithms to generate language has inspired people in many different cultures throughout history. But it’s in the online world that this powerful form of wordcraft may really find its natural milieu—in an environment where the identity of speakers becomes more ambiguous, and perhaps, less relevant. It remains to be seen what the consequences will be for language, communication, and our sense of human identity, which is so bound up with our ability to speak in natural language.
This is the sixth installment of a six-part series on the history of natural language processing. Last week’s post explained how an innocent Microsoft chatbot turned instantly racist on Twitter.
You can also check out our prior series on the untold history of AI.
#436252 After AI, Fashion and Shopping Will ...
AI and broadband are eating retail for breakfast. In the first half of 2019, we’ve seen 19 retailer bankruptcies. And the retail apocalypse is only accelerating.
What’s coming next is astounding. Why drive when you can speak? Revenue from products purchased via voice commands is expected to quadruple from today’s US$2 billion to US$8 billion by 2023.
Virtual reality, augmented reality, and 3D printing are converging with artificial intelligence, drones, and 5G to transform shopping on every dimension. And as a result, shopping is becoming dematerialized, demonetized, democratized, and delocalized… a top-to-bottom transformation of the retail world.
Welcome to Part 1 of our series on the future of retail, a deep-dive into AI and its far-reaching implications.
Let’s dive in.
A Day in the Life of 2029
Welcome to April 21, 2029, a sunny day in Dallas. You’ve got a fundraising luncheon tomorrow, but nothing to wear. The last thing you want to do is spend the day at the mall.
No sweat. Your body image data is still current, as you were scanned only a week ago. Put on your VR headset and have a conversation with your AI. “It’s time to buy a dress for tomorrow’s event” is all you have to say. In a moment, you’re teleported to a virtual clothing store. Zero travel time. No freeway traffic, parking hassles, or angry hordes wielding baby strollers.
Instead, you’ve entered your own personal clothing store. Everything is in your exact size…. And I mean everything. The store has access to nearly every designer and style on the planet. Ask your AI to show you what’s hot in Shanghai, and presto—instant fashion show. Every model strutting down the runway looks exactly like you, only dressed in Shanghai’s latest.
When you’re done selecting an outfit, your AI pays the bill. And as your new clothes are being 3D printed at a warehouse—before speeding your way via drone delivery—a digital version has been added to your personal inventory for use at future virtual events.
The cost? Thanks to an era of no middlemen, less than half of what you pay in stores today. Yet this future is not all that far off…
Digital Assistants
Let’s begin with the basics: the act of turning desire into purchase.
Most of us navigate shopping malls or online marketplaces alone, hoping to stumble across the right item and fit. But if you’re lucky enough to employ a personal assistant, you have the luxury of describing what you want to someone who knows you well enough to buy that exact right thing most of the time.
For most of us who don’t, enter the digital assistant.
Right now, the four horsemen of the retail apocalypse are waging war for our wallets. Amazon’s Alexa, Google’s Now, Apple’s Siri, and Alibaba’s Tmall Genie are going head-to-head in a battle to become the platform du jour for voice-activated, AI-assisted commerce.
For baby boomers who grew up watching Captain Kirk talk to the Enterprise’s computer on Star Trek, digital assistants seem a little like science fiction. But for millennials, it’s just the next logical step in a world that is auto-magical.
And as those millennials enter their consumer prime, revenue from products purchased via voice-driven commands is projected to leap from today’s US$2 billion to US$8 billion by 2023.
We are already seeing a major change in purchasing habits. On average, consumers using Amazon Echo spent more than standard Amazon Prime customers: US$1,700 versus US$1,300.
And as far as AI fashion advisors go, those too are here, courtesy of both Alibaba and Amazon. During its annual Singles’ Day (November 11) shopping festival, Alibaba’s FashionAI concept store uses deep learning to make suggestions based on advice from human fashion experts and store inventory, driving a significant portion of the day’s US$25 billion in sales.
Similarly, Amazon’s shopping algorithm makes personalized clothing recommendations based on user preferences and social media behavior.
Customer Service
But AI is disrupting more than just personalized fashion and e-commerce. Its next big break will take place in the customer service arena.
According to a recent Zendesk study, good customer service increases the possibility of a purchase by 42 percent, while bad customer service translates into a 52 percent chance of losing that sale forever. This means more than half of us will stop shopping at a store due to a single disappointing customer service interaction. These are significant financial stakes. They’re also problems perfectly suited for an AI solution.
During the 2018 Google I/O conference, CEO Sundar Pichai demoed Google Duplex, the company’s next-generation digital assistant. Pichai played the audience a series of pre-recorded phone calls made by Google Duplex. The first call made a reservation at a restaurant; the second booked a haircut appointment, amusing the audience with a long “hmmm” mid-call.
In neither case did the person on the other end of the phone have any idea they were talking to an AI. The system’s success speaks to how seamlessly AI can blend into our retail lives and how convenient it will continue to make them. The same technology Pichai demonstrated that can make phone calls for consumers can also answer phones for retailers—a development that’s unfolding in two different ways:
(1) Customer service coaches: First, for organizations interested in keeping humans involved, there’s Beyond Verbal, a Tel Aviv-based startup that has built an AI customer service coach. Simply by analyzing customer voice intonation, the system can tell whether the person on the phone is about to blow a gasket, is genuinely excited, or anything in between.
Based on research of over 70,000 subjects in more than 30 languages, Beyond Verbal’s app can detect 400 different markers of human moods, attitudes, and personality traits. Already it’s been integrated in call centers to help human sales agents understand and react to customer emotions, making those calls more pleasant, and also more profitable.
For example, by analyzing word choice and vocal style, Beyond Verbal’s system can tell what kind of shopper the person on the line actually is. If they’re an early adopter, the AI alerts the sales agent to offer them the latest and greatest. If they’re more conservative, it suggests items more tried-and-true.
(2) Replacing customer service agents: Second, companies like New Zealand’s Soul Machines are working to replace human customer service agents altogether. Powered by IBM’s Watson, Soul Machines builds lifelike customer service avatars designed for empathy, making them one of many helping to pioneer the field of emotionally intelligent computing.
With their technology, 40 percent of all customer service interactions are now resolved with a high degree of satisfaction, no human intervention needed. And because the system is built using neural nets, it’s continuously learning from every interaction—meaning that percentage will continue to improve.
The number of these interactions continues to grow as well. Software manufacturer Autodesk now includes a Soul Machines avatar named AVA (Autodesk Virtual Assistant) in all of its new offerings. She lives in a small window on the screen, ready to soothe tempers, troubleshoot problems, and forever banish those long tech support hold times.
For Daimler Financial Services, Soul Machines built an avatar named Sarah, who helps customers with arguably three of modernity’s most annoying tasks: financing, leasing, and insuring a car.
This isn’t just about AI—it’s about AI converging with additional exponentials. Add networks and sensors to the story and it raises the scale of disruption, upping the FQ—the frictionless quotient—in our frictionless shopping adventure.
Final Thoughts
AI makes retail cheaper, faster, and more efficient, touching everything from customer service to product delivery. It also redefines the shopping experience, making it frictionless and—once we allow AI to make purchases for us—ultimately invisible.
Prepare for a future in which shopping is dematerialized, demonetized, democratized, and delocalized—otherwise known as “the end of malls.”
Of course, if you wait a few more years, you’ll be able to take an autonomous flying taxi to Westfield’s Destination 2028—so perhaps today’s converging exponentials are not so much spelling the end of malls but rather the beginning of an experience economy far smarter, more immersive, and whimsically imaginative than today’s shopping centers.
Either way, it’s a top-to-bottom transformation of the retail world.
Over the coming blog series, we will continue our discussion of the future of retail. Stay tuned to learn new implications for your business and how to future-proof your company in an age of smart, ultra-efficient, experiential retail.
Want a copy of my next book? If you’ve enjoyed this blogified snippet of The Future is Faster Than You Think, sign up here to be eligible for an early copy and access up to $800 worth of pre-launch giveaways!
This article originally appeared on diamandis.com.
Image Credit: Image by Pexels from Pixabay
#436220 How Boston Dynamics Is Redefining Robot ...
Gif: Bob O’Connor/IEEE Spectrum
With their jaw-dropping agility and animal-like reflexes, Boston Dynamics’ bioinspired robots have always seemed to have no equal. But that preeminence hasn’t stopped the company from pushing its technology to new heights, sometimes literally. Its latest crop of legged machines can trudge up and down hills, clamber over obstacles, and even leap into the air like a gymnast. There’s no denying their appeal: Every time Boston Dynamics uploads a new video to YouTube, it quickly racks up millions of views. These are probably the first robots you could call Internet stars.
Spot
Photo: Bob O’Connor
HEIGHT: 84 cm
WEIGHT: 25 kg
SPEED: 5.76 km/h
SENSING: Stereo cameras, inertial measurement unit, position/force sensors
ACTUATION: 12 DC motors
POWER: Battery (90 minutes per charge)
Boston Dynamics, once owned by Google’s parent company, Alphabet, and now by the Japanese conglomerate SoftBank, has long been secretive about its designs. Few publications have been granted access to its Waltham, Mass., headquarters, near Boston. But one morning this past August, IEEE Spectrum got in. We were given permission to do a unique kind of photo shoot that day. We set out to capture the company’s robots in action—running, climbing, jumping—by using high-speed cameras coupled with powerful strobes. The results you see on this page: freeze-frames of pure robotic agility.
We also used the photos to create interactive views, which you can explore online on our Robots Guide. These interactives let you spin the robots 360 degrees, or make them walk and jump on your screen.
Boston Dynamics has amassed a minizoo of robotic beasts over the years, with names like BigDog, SandFlea, and WildCat. When we visited, we focused on the two most advanced machines the company has ever built: Spot, a nimble quadruped, and Atlas, an adult-size humanoid.
Spot can navigate almost any kind of terrain while sensing its environment. Boston Dynamics recently made it available for lease, with plans to manufacture something like a thousand units per year. It envisions Spot, or even packs of them, inspecting industrial sites, carrying out hazmat missions, and delivering packages. And its YouTube fame has not gone unnoticed: Even entertainment is a possibility, with Cirque du Soleil auditioning Spot as a potential new troupe member.
“It’s really a milestone for us going from robots that work in the lab to these that are hardened for work out in the field,” Boston Dynamics CEO Marc Raibert says in an interview.
Atlas
Photo: Bob O’Connor
HEIGHT: 150 cm
WEIGHT: 80 kg
SPEED: 5.4 km/h
SENSING: Lidar and stereo vision
ACTUATION: 28 hydraulic actuators
POWER: Battery
Our other photographic subject, Atlas, is Boston Dynamics’ biggest celebrity. This 150-centimeter-tall (4-foot-11-inch-tall) humanoid is capable of impressive athletic feats. Its actuators are driven by a compact yet powerful hydraulic system that the company engineered from scratch. The unique system gives the 80-kilogram (176-pound) robot the explosive strength needed to perform acrobatic leaps and flips that don’t seem possible for such a large humanoid to do. Atlas has inspired a string of parody videos on YouTube and more than a few jokes about a robot takeover.
While Boston Dynamics excels at making robots, it has yet to prove that it can sell them. Ever since its founding in 1992 as a spin-off from MIT, the company has been an R&D-centric operation, with most of its early funding coming from U.S. military programs. The emphasis on commercialization seems to have intensified after the acquisition by SoftBank, in 2017. SoftBank’s founder and CEO, Masayoshi Son, is known to love robots—and profits.
The launch of Spot is a significant step for Boston Dynamics as it seeks to “productize” its creations. Still, Raibert says his long-term goals have remained the same: He wants to build machines that interact with the world dynamically, just as animals and humans do. Has anything changed at all? Yes, one thing, he adds with a grin. In his early career as a roboticist, he used to write papers and count his citations. Now he counts YouTube views.
In the Spotlight
Photo: Bob O’Connor
Boston Dynamics designed Spot as a versatile mobile machine suitable for a variety of applications. The company has not announced how much Spot will cost, saying only that it is being made available to select customers, which will be able to lease the robot. A payload bay lets you add up to 14 kilograms of extra hardware to the robot’s back. One of the accessories that Boston Dynamics plans to offer is a 6-degrees-of-freedom arm, which will allow Spot to grasp objects and open doors.
Super Senses
Photo: Bob O’Connor
Spot’s hardware is almost entirely custom-designed. It includes powerful processing boards for control as well as sensor modules for perception. The sensors are located on the front, rear, and sides of the robot’s body. Each module consists of a pair of stereo cameras, a wide-angle camera, and a texture projector, which enhances 3D sensing in low light. The sensors allow the robot to use the navigation method known as SLAM, or simultaneous localization and mapping, to get around autonomously.
Stepping Up
Photo: Bob O’Connor
In addition to its autonomous behaviors, Spot can also be steered by a remote operator with a game-style controller. But even when in manual mode, the robot still exhibits a high degree of autonomy. If there’s an obstacle ahead, Spot will go around it. If there are stairs, Spot will climb them. The robot goes into these operating modes and then performs the related actions completely on its own, without any input from the operator. To go down a flight of stairs, Spot walks backward, an approach Boston Dynamics says provides greater stability.
Funky Feet
Gif: Bob O’Connor/IEEE Spectrum
Spot’s legs are powered by 12 custom DC motors, each geared down to provide high torque. The robot can walk forward, sideways, and backward, and trot at a top speed of 1.6 meters per second. It can also turn in place. Other gaits include crawling and pacing. In one wildly popular YouTube video, Spot shows off its fancy footwork by dancing to the pop hit “Uptown Funk.”
Robot Blood
Photo: Bob O’Connor
Atlas is powered by a hydraulic system consisting of 28 actuators. These actuators are basically cylinders filled with pressurized fluid that can drive a piston with great force. Their high performance is due in part to custom servo valves that are significantly smaller and lighter than the aerospace models that Boston Dynamics had been using in earlier designs. Though not visible from the outside, the innards of an Atlas are filled with these hydraulic actuators as well as the lines of fluid that connect them. When one of those lines ruptures, Atlas bleeds the hydraulic fluid, which happens to be red.
Next Generation
Gif: Bob O’Connor/IEEE Spectrum
The current version of Atlas is a thorough upgrade of the original model, which was built for the DARPA Robotics Challenge in 2015. The newest robot is lighter and more agile. Boston Dynamics used industrial-grade 3D printers to make key structural parts, giving the robot a greater strength-to-weight ratio than earlier designs. The next-gen Atlas can also do something that its predecessor, famously, could not: It can get up after a fall.
Walk This Way
Photo: Bob O’Connor
To control Atlas, an operator provides general steering via a manual controller while the robot uses its stereo cameras and lidar to adjust to changes in the environment. Atlas can also perform certain tasks autonomously. For example, if you add special bar-code-type tags to cardboard boxes, Atlas can pick them up and stack them or place them on shelves.
Biologically Inspired
Photos: Bob O’Connor
Atlas’s control software doesn’t explicitly tell the robot how to move its joints, but rather it employs mathematical models of the underlying physics of the robot’s body and how it interacts with the environment. Atlas relies on its whole body to balance and move. When jumping over an obstacle or doing acrobatic stunts, the robot uses not only its legs but also its upper body, swinging its arms to propel itself just as an athlete would.
This article appears in the December 2019 print issue as “By Leaps and Bounds.”
#436188 The Blogger Behind “AI ...
Sure, artificial intelligence is transforming the world’s societies and economies—but can an AI come up with plausible ideas for a Halloween costume?
Janelle Shane has been asking such probing questions since she started her AI Weirdness blog in 2016. She specializes in training neural networks (which underpin most of today’s machine learning techniques) on quirky data sets such as compilations of knitting instructions, ice cream flavors, and names of paint colors. Then she asks the neural net to generate its own contributions to these categories—and hilarity ensues. AI is not likely to disrupt the paint industry with names like “Ronching Blue,” “Dorkwood,” and “Turdly.”
Shane’s antics have a serious purpose. She aims to illustrate the serious limitations of today’s AI, and to counteract the prevailing narrative that describes AI as well on its way to superintelligence and complete human domination. “The danger of AI is not that it’s too smart,” Shane writes in her new book, “but that it’s not smart enough.”
The book, which came out on Tuesday, is called You Look Like a Thing and I Love You. It takes its odd title from a list of AI-generated pick-up lines, all of which would at least get a person’s attention if shouted, preferably by a robot, in a crowded bar. Shane’s book is shot through with her trademark absurdist humor, but it also contains real explanations of machine learning concepts and techniques. It’s a painless way to take AI 101.
She spoke with IEEE Spectrum about the perils of placing too much trust in AI systems, the strange AI phenomenon of “giraffing,” and her next potential Halloween costume.
Janelle Shane on . . .
The un-delicious origin of her blog
“The narrower the problem, the smarter the AI will seem”
Why overestimating AI is dangerous
Giraffing!
Machine and human creativity
The un-delicious origin of her blog
IEEE Spectrum: You studied electrical engineering as an undergrad, then got a master’s degree in physics. How did that lead to you becoming the comedian of AI?
Janelle Shane: I’ve been interested in machine learning since freshman year of college. During orientation at Michigan State, a professor who worked on evolutionary algorithms gave a talk about his work. It was full of the most interesting anecdotes–some of which I’ve used in my book. He told an anecdote about people setting up a machine learning algorithm to do lens design, and the algorithm did end up designing an optical system that works… except one of the lenses was 50 feet thick, because they didn’t specify that it couldn’t do that.
I started working in his lab on optics, doing ultra-short laser pulse work. I ended up doing a lot more optics than machine learning, but I always found it interesting. One day I came across a list of recipes that someone had generated using a neural net, and I thought it was hilarious and remembered why I thought machine learning was so cool. That was in 2016, ages ago in machine learning land.
Spectrum: So you decided to “establish weirdness as your goal” for your blog. What was the first weird experiment that you blogged about?
Shane: It was generating cookbook recipes. The neural net came up with ingredients like: “Take ¼ pounds of bones or fresh bread.” That recipe started out: “Brown the salmon in oil, add creamed meat to the mixture.” It was making mistakes that showed the thing had no memory at all.
Spectrum: You say in the book that you can learn a lot about AI by giving it a task and watching it flail. What do you learn?
Shane: One thing you learn is how much it relies on surface appearances rather than deep understanding. With the recipes, for example: It got the structure of title, category, ingredients, instructions, yield at the end. But when you look more closely, it has instructions like “Fold the water and roll it into cubes.” So clearly this thing does not understand water, let alone the other things. It’s recognizing certain phrases that tend to occur, but it doesn’t have a concept that these recipes are describing something real. You start to realize how very narrow the algorithms in this world are. They only know exactly what we tell them in our data set.
“The narrower the problem, the smarter the AI will seem”
Spectrum: That makes me think of DeepMind’s AlphaGo, which was universally hailed as a triumph for AI. It can play the game of Go better than any human, but it doesn’t know what Go is. It doesn’t know that it’s playing a game.
Shane: It doesn’t know what a human is, or if it’s playing against a human or another program. That’s also a nice illustration of how well these algorithms do when they have a really narrow and well-defined problem.
The narrower the problem, the smarter the AI will seem. If it’s not just doing something repeatedly but instead has to understand something, coherence goes down. For example, take an algorithm that can generate images of objects. If the algorithm is restricted to birds, it could do a recognizable bird. If this same algorithm is asked to generate images of any animal, if its task is that broad, the bird it generates becomes an unrecognizable brown feathered smear against a green background.
Spectrum: That sounds… disturbing.
Shane: It’s disturbing in a weird amusing way. What’s really disturbing is the humans it generates. It hasn’t seen them enough times to have a good representation, so you end up with an amorphous, usually pale-faced thing with way too many orifices. If you ask it to generate an image of a person eating pizza, you’ll have blocks of pizza texture floating around. But if you give that image to an image-recognition algorithm that was trained on that same data set, it will say, “Oh yes, that’s a person eating pizza.”
Why overestimating AI is dangerous
Spectrum: Do you see it as your role to puncture the AI hype?
Shane: I do see it that way. Not a lot of people are bringing out this side of AI. When I first started posting my results, I’d get people saying, “I don’t understand, this is AI, shouldn’t it be better than this? Why doesn't it understand?” Many of the impressive examples of AI have a really narrow task, or they’ve been set up to hide how little understanding it has. There’s a motivation, especially among people selling products based on AI, to represent the AI as more competent and understanding than it actually is.
Spectrum: If people overestimate the abilities of AI, what risk does that pose?
Shane: I worry when I see people trusting AI with decisions it can’t handle, like hiring decisions or decisions about moderating content. These are really tough tasks for AI to do well on. There are going to be a lot of glitches. I see people saying, “The computer decided this so it must be unbiased, it must be objective.”
“If the algorithm’s task is to replicate human hiring decisions, it’s going to glom onto gender bias and race bias.”
—Janelle Shane, AI Weirdness blogger
That’s another thing I find myself highlighting in the work I’m doing. If the data includes bias, the algorithm will copy that bias. You can’t tell it not to be biased, because it doesn’t understand what bias is. I think that message is an important one for people to understand.
If there’s bias to be found, the algorithm is going to go after it. It’s like, “Thank goodness, finally a signal that’s reliable.” Take a tough problem like: Look at these resumes and decide who’s best for the job. If its task is to replicate human hiring decisions, it’s going to glom onto gender bias and race bias. There’s an example in the book of a hiring algorithm that Amazon was developing that discriminated against women, because the historical data it was trained on had that gender bias.
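To make the point about copied bias concrete, here is a minimal sketch in Python. It is not the Amazon system or any real hiring tool: the records are invented for illustration, and the "model" is just a frequency count of past decisions. It only shows that a learner asked to replicate historical outcomes will reproduce whatever gap those outcomes contain.

```python
from collections import defaultdict

# Hypothetical historical decisions: (qualified, gender, hired).
# The data is invented so that equally qualified women were hired less often.
history = (
    [(True, "m", True)] * 80 + [(True, "m", False)] * 20
    + [(True, "f", True)] * 40 + [(True, "f", False)] * 60
    + [(False, "m", False)] * 50 + [(False, "f", False)] * 50
)


def fit_hire_rates(records):
    """Learn P(hired | qualified, gender) by counting past decisions."""
    counts = defaultdict(lambda: [0, 0])  # key -> [times hired, total seen]
    for qualified, gender, hired in records:
        counts[(qualified, gender)][0] += int(hired)
        counts[(qualified, gender)][1] += 1
    return {key: hired / total for key, (hired, total) in counts.items()}


rates = fit_hire_rates(history)
print(rates[(True, "m")])  # 0.8
print(rates[(True, "f")])  # 0.4
```

Nothing in the code mentions bias, and nothing could tell it to avoid bias. The gap between the two printed rates comes entirely from the historical labels.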
Spectrum: What are the other downsides of using AI systems that don’t really understand their tasks?
Shane: There is a risk in putting too much trust in AI and not examining its decisions. Another issue is that it can solve the wrong problems, without anyone realizing it. There have been a couple of cases in medicine. For example, there was an algorithm that was trained to recognize things like skin cancer. But instead of recognizing the actual skin condition, it latched onto signals like the markings a surgeon makes on the skin, or a ruler placed there for scale. It was treating those things as a sign of skin cancer. It’s another indication that these algorithms don’t understand what they’re looking at and what the goal really is.
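The "solving the wrong problem" failure can be sketched with a toy example. The Python below is an illustration under invented assumptions, not the medical algorithm Shane describes: the feature names, the data, and the one-feature learner are all hypothetical. The learner simply keeps whichever single feature best matches the training labels.

```python
# Hypothetical training set: (irregular_border, ruler_in_photo, malignant).
# In this invented data, every malignant case was photographed with a ruler.
train = (
    [(1, 1, 1)] * 40    # malignant, irregular border, ruler present
    + [(0, 1, 1)] * 10  # malignant, regular border, ruler present
    + [(0, 0, 0)] * 40  # benign, regular border, no ruler
    + [(1, 0, 0)] * 10  # benign, irregular border, no ruler
)

FEATURES = {"irregular_border": 0, "ruler_in_photo": 1}


def accuracy(feature_index, rows):
    """Fraction of rows where the feature value equals the label."""
    return sum(row[feature_index] == row[2] for row in rows) / len(rows)


def fit_stump(rows):
    """Pick the single feature that best predicts the training labels."""
    return max(FEATURES, key=lambda name: accuracy(FEATURES[name], rows))


chosen = fit_stump(train)
print(chosen)  # ruler_in_photo

# A benign lesion photographed next to a ruler is now flagged as malignant.
benign_with_ruler = (0, 1, 0)
prediction = benign_with_ruler[FEATURES[chosen]]
print(prediction)  # 1
```

On the training data the ruler is a perfect predictor, so the learner has no reason to prefer the feature that actually describes the skin.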
Giraffing
Spectrum: In your blog, you often have neural nets generate names for things—such as ice cream flavors, paint colors, cats, mushrooms, and types of apples. How do you decide on topics?
Shane: Quite often it’s because someone has written in with an idea or a data set. They’ll say something like, “I’m the MIT librarian and I have a whole list of MIT thesis titles.” That one was delightful. Or they’ll say, “We are a high school robotics team, and we know where there’s a list of robotics team names.” It’s fun to peek into a different world. I have to be careful that I’m not making fun of the naming conventions in the field. But there’s a lot of humor simply in the neural net’s complete failure to understand. Puns in particular—it really struggles with puns.
Spectrum: Your blog is quite absurd, but it strikes me that machine learning is often absurd in itself. Can you explain the concept of giraffing?
Shane: This concept was originally introduced by [internet security expert] Melissa Elliott. She proposed this phrase as a way to describe the algorithms’ tendency to see giraffes way more often than would be likely in the real world. She posted a whole bunch of examples, like a photo of an empty field in which an image-recognition algorithm has confidently reported that there are giraffes. Why does it think giraffes are present so often when they’re actually really rare? Because the algorithms are trained on data sets gathered online. People tend to say, “Hey look, a giraffe!” And then take a photo and share it. They don’t do that so often when they see an empty field with rocks.
There’s also a chatbot that has a delightful quirk. If you show it some photo and ask it how many giraffes are in the picture, it will always answer with some nonzero number. This quirk comes from the way the training data was generated: These were questions asked and answered by humans online. People tended not to ask the question “How many giraffes are there?” when the answer was zero. So you can show it a picture of someone holding a Wii remote. If you ask it how many giraffes are in the picture, it will say two.
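The giraffe-counting quirk follows directly from what is missing in the training data. The Python sketch below is a toy, not the chatbot's actual architecture: the list of answers is invented, and the "model" ignores the image and returns the most common answer it saw. It illustrates why an answer that never appears in training can never come out.

```python
from collections import Counter

# Hypothetical human-written answers to "How many giraffes are there?".
# Nobody asked the question about giraffe-free photos, so 0 never appears.
training_answers = [2, 1, 2, 3, 1, 2, 4, 2, 1, 2]


def fit_answer_prior(answers):
    """Estimate how often each answer occurred in the training data."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def answer(prior):
    """Ignore the image and return the most frequent training answer."""
    return max(prior, key=prior.get)


prior = fit_answer_prior(training_answers)
print(answer(prior))      # 2
print(prior.get(0, 0.0))  # 0.0
```

With this invented data the toy model says "two" for every photo, Wii remote included, because zero has no probability mass to draw on.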
Machine and human creativity
Spectrum: AI can be absurd, and maybe also creative. But you make the point that AI art projects are really human-AI collaborations: Collecting the data set, training the algorithm, and curating the output are all artistic acts on the part of the human. Do you see your work as a human-AI art project?
Shane: Yes, I think there is artistic intent in my work; you could call it literary or visual. It’s not so interesting to just take a pre-trained algorithm that’s been trained on utilitarian data, and tell it to generate a bunch of stuff. Even if the algorithm isn’t one that I’ve trained myself, I think about, what is it doing that’s interesting, what kind of story can I tell around it, and what do I want to show people.
The Halloween costume algorithm “was able to draw on its knowledge of which words are related to suggest things like sexy barnacle.”
—Janelle Shane, AI Weirdness blogger
Spectrum: For the past three years you’ve been getting neural nets to generate ideas for Halloween costumes. As language models have gotten dramatically better over the past three years, are the costume suggestions getting less absurd?
Shane: Yes. Before I would get a lot more nonsense words. This time I got phrases that were related to real things in the data set. I don’t believe the training data had the words Flying Dutchman or barnacle. But it was able to draw on its knowledge of which words are related to suggest things like sexy barnacle and sexy Flying Dutchman.
Spectrum: This year, I saw on Twitter that someone made the gothy giraffe costume happen. Would you ever dress up for Halloween in a costume that the neural net suggested?
Shane: I think that would be fun. But there would be some challenges. I would love to go as the sexy Flying Dutchman. But my ambition may constrict me to do something more like a list of leg parts.
#436186 Video Friday: Invasion of the Mini ...
Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):
DARPA SubT Urban Circuit – February 18-27, 2020 – Olympia, Wash., USA
Let us know if you have suggestions for next week, and enjoy today’s videos.
There will be a Mini-Cheetah Workshop (sponsored by Naver Labs) a year from now at IROS 2020 in Las Vegas. Mini-Cheetahs for everyone!
That’s just a rendering, of course, but this isn’t:
[ MCW ]
I was like 95 percent sure that the Urban Circuit of the DARPA SubT Challenge was going to be in something very subway station-y. Oops!
In the Subterranean (SubT) Challenge, teams deploy autonomous ground and aerial systems to attempt to map, identify, and report artifacts along competition courses in underground environments. The artifacts represent items a first responder or service member may encounter in unknown underground sites. This video provides a preview of the Urban Circuit event location. The Urban Circuit is scheduled for February 18-27, 2020, at Satsop Business Park west of Olympia, Washington.
[ SubT ]
Researchers at SEAS and the Wyss Institute for Biologically Inspired Engineering have developed a resilient RoboBee powered by soft artificial muscles that can crash into walls, fall onto the floor, and collide with other RoboBees without being damaged. It is the first microrobot powered by soft actuators to achieve controlled flight.
To solve the problem of power density, the researchers built upon the electrically-driven soft actuators developed in the lab of David Clarke, the Extended Tarr Family Professor of Materials. These soft actuators are made using dielectric elastomers, soft materials with good insulating properties, that deform when an electric field is applied. By improving the electrode conductivity, the researchers were able to operate the actuator at 500 Hertz, on par with the rigid actuators used previously in similar robots.
Next, the researchers aim to increase the efficiency of the soft-powered robot, which still lags far behind more traditional flying robots.
[ Harvard ]
We present a system for fast and robust handovers with a robot character, together with a user study investigating the effect of robot speed and reaction time on perceived interaction quality. The system can match and exceed human speeds and confirms that users prefer human-level timing.
In a 3×3 user study, we vary the speed of the robot and add variable sensorimotor delays. We evaluate the social perception of the robot using the Robot Social Attribute Scale (RoSAS). Inclusion of a small delay, mimicking the delay of the human sensorimotor system, leads to an improvement in perceived qualities over both no delay and long delay conditions. Specifically, with no delay the robot is perceived as more discomforting and with a long delay, it is perceived as less warm.
[ Disney Research ]
When cars are autonomous, they’re not going to be able to pump themselves full of gas. Or, more likely, electrons. Kuka has the solution.
[ Kuka ]
This looks like fun, right?
[ Robocoaster ]
NASA is leading the way in the use of On-orbit Servicing, Assembly, and Manufacturing to enable large, persistent, upgradable, and maintainable spacecraft. This video was developed by the Advanced Concepts Lab (ACL) at NASA Langley Research Center.
[ NASA ]
The noisiest workshop by far at Humanoids last month (by far) was Musical Interactions With Humanoids, the end result of which was this:
[ Workshop ]
IROS is an IEEE event, and in furthering the IEEE mission to benefit humanity through technological innovation, IROS is doing a great job. But don’t take it from us – we are joined by IEEE President-Elect Professor Toshio Fukuda to find out a bit more about the impact events like IROS can have, as well as examine some of the issues around intelligent robotics and systems – from privacy to transparency of the systems at play.
[ IROS ]
Speaking of IROS, we hope you’ve been enjoying our coverage. We have already featured Harvard’s strange sea-urchin-inspired robot and a Japanese quadruped that can climb vertical ladders, with more stories to come over the next several weeks.
In the meantime, enjoy these 10 videos from the conference (as usual, we’re including the title, authors, and abstract for each—if you’d like more details about any of these projects, let us know and we’ll find out more for you).
“A Passive Closing, Tendon Driven, Adaptive Robot Hand for Ultra-Fast, Aerial Grasping and Perching,” by Andrew McLaren, Zak Fitzgerald, Geng Gao, and Minas Liarokapis from the University of Auckland, New Zealand.
Current grasping methods for aerial vehicles are slow, inaccurate and they cannot adapt to any target object. Thus, they do not allow for on-the-fly, ultra-fast grasping. In this paper, we present a passive closing, adaptive robot hand design that offers ultra-fast, aerial grasping for a wide range of everyday objects. We investigate alternative uses of structural compliance for the development of simple, adaptive robot grippers and hands and we propose an appropriate quick release mechanism that facilitates an instantaneous grasping execution. The quick release mechanism is triggered by a simple distance sensor. The proposed hand utilizes only two actuators to control multiple degrees of freedom over three fingers and it retains the superior grasping capabilities of adaptive grasping mechanisms, even under significant object pose or other environmental uncertainties. The hand achieves a grasping time of 96 ms, a maximum grasping force of 56 N and it is able to secure objects of various shapes at high speeds. The proposed hand can serve as the end-effector of grasping capable Unmanned Aerial Vehicle (UAV) platforms and it can offer perching capabilities, facilitating autonomous docking.
“Unstructured Terrain Navigation and Topographic Mapping With a Low-Cost Mobile Cuboid Robot,” by Andrew S. Morgan, Robert L. Baines, Hayley McClintock, and Brian Scassellati from Yale University, USA.
Current robotic terrain mapping techniques require expensive sensor suites to construct an environmental representation. In this work, we present a cube-shaped robot that can roll through unstructured terrain and construct a detailed topographic map of the surface that it traverses in real time with low computational and monetary expense. Our approach devolves many of the complexities of locomotion and mapping to passive mechanical features. Namely, rolling movement is achieved by sequentially inflating latex bladders that are located on four sides of the robot to destabilize and tip it. Sensing is achieved via arrays of fine plastic pins that passively conform to the geometry of underlying terrain, retracting into the cube. We developed a topography by shade algorithm to process images of the displaced pins to reconstruct terrain contours and elevation. We experimentally validated the efficacy of the proposed robot through object mapping and terrain locomotion tasks.
“Toward a Ballbot for Physically Leading People: A Human-Centered Approach,” by Zhongyu Li and Ralph Hollis from Carnegie Mellon University, USA.
This work presents a new human-centered method for indoor service robots to provide people with physical assistance and active guidance while traveling through congested and narrow spaces. As most previous work is robot-centered, this paper develops an end-to-end framework which includes a feedback path of the measured human positions. The framework combines a planning algorithm and a human-robot interaction module to guide the led person to a specified planned position. The approach is deployed on a person-size dynamically stable mobile robot, the CMU ballbot. Trials were conducted where the ballbot physically led a blindfolded person to safely navigate in a cluttered environment.
“Achievement of Online Agile Manipulation Task for Aerial Transformable Multilink Robot,” by Fan Shi, Moju Zhao, Tomoki Anzai, Keita Ito, Xiangyu Chen, Kei Okada, and Masayuki Inaba from the University of Tokyo, Japan.
Transformable aerial robots are favorable in aerial manipulation tasks for their flexible ability to change configuration during the flight. By assuming robot keeping in the mild motion, the previous researches sacrifice aerial agility to simplify the complex non-linear system into a single rigid body with a linear controller. In this paper, we present a framework towards agile swing motion for the transformable multi-links aerial robot. We introduce a computational-efficient non-linear model predictive controller and joints motion primitive framework to achieve agile transforming motions and validate with a novel robot named HYRURS-X. Finally, we implement our framework under a table tennis task to validate the online and agile performance.
“Small-Scale Compliant Dual Arm With Tail for Winged Aerial Robots,” by Alejandro Suarez, Manuel Perez, Guillermo Heredia, and Anibal Ollero from the University of Seville, Spain.
Winged aerial robots represent an evolution of aerial manipulation robots, replacing the multirotor vehicles by fixed or flapping wing platforms. The development of this morphology is motivated in terms of efficiency, endurance and safety in some inspection operations where multirotor platforms may not be suitable. This paper presents a first prototype of compliant dual arm as preliminary step towards the realization of a winged aerial robot capable of perching and manipulating with the wings folded. The dual arm provides 6 DOF (degrees of freedom) for end effector positioning in a human-like kinematic configuration, with a reach of 25 cm (half-scale w.r.t. the human arm), and 0.2 kg weight. The prototype is built with micro metal gear motors, measuring the joint angles and the deflection with small potentiometers. The paper covers the design, electronics, modeling and control of the arms. Experimental results in test-bench validate the developed prototype and its functionalities, including joint position and torque control, bimanual grasping, the dynamic equilibrium with the tail, and the generation of 3D maps with laser sensors attached at the arms.
“A Novel Small-Scale Turtle-inspired Amphibious Spherical Robot,” by Huiming Xing, Shuxiang Guo, Liwei Shi, Xihuan Hou, Yu Liu, Huikang Liu, Yao Hu, Debin Xia, and Zan Li from Beijing Institute of Technology, China.
This paper describes a novel small-scale turtle-inspired Amphibious Spherical Robot (ASRobot) to accomplish exploration tasks in the restricted environment, such as amphibious areas and narrow underwater cave. A Legged, Multi-Vectored Water-Jet Composite Propulsion Mechanism (LMVWCPM) is designed with four legs, one of which contains three connecting rod parts, one water-jet thruster and three joints driven by digital servos. Using this mechanism, the robot is able to walk like amphibious turtles on various terrains and swim flexibly in submarine environment. A simplified kinematic model is established to analyze crawling gaits. With simulation of the crawling gait, the driving torques of different joints contributed to the choice of servos and the size of links of legs. Then we also modeled the robot in water and proposed several underwater locomotion. In order to assess the performance of the proposed robot, a series of experiments were carried out in the lab pool and on flat ground using the prototype robot. Experiments results verified the effectiveness of LMVWCPM and the amphibious control approaches.
“Advanced Autonomy on a Low-Cost Educational Drone Platform,” by Luke Eller, Theo Guerin, Baichuan Huang, Garrett Warren, Sophie Yang, Josh Roy, and Stefanie Tellex from Brown University, USA.
PiDrone is a quadrotor platform created to accompany an introductory robotics course. Students build an autonomous flying robot from scratch and learn to program it through assignments and projects. Existing educational robots do not have significant autonomous capabilities, such as high-level planning and mapping. We present a hardware and software framework for an autonomous aerial robot, in which all software for autonomy can run onboard the drone, implemented in Python. We present an Unscented Kalman Filter (UKF) for accurate state estimation. Next, we present an implementation of Monte Carlo (MC) Localization and Fast-SLAM for Simultaneous Localization and Mapping (SLAM). The performance of UKF, localization, and SLAM is tested and compared to ground truth, provided by a motion-capture system. Our evaluation demonstrates that our autonomous educational framework runs quickly and accurately on a Raspberry Pi in Python, making it ideal for use in educational settings.
“FlightGoggles: Photorealistic Sensor Simulation for Perception-driven Robotics using Photogrammetry and Virtual Reality,” by Winter Guerra, Ezra Tal, Varun Murali, Gilhyun Ryou and Sertac Karaman from the Massachusetts Institute of Technology, USA.
FlightGoggles is a photorealistic sensor simulator for perception-driven robotic vehicles. The key contributions of FlightGoggles are twofold. First, FlightGoggles provides photorealistic exteroceptive sensor simulation using graphics assets generated with photogrammetry. Second, it provides the ability to combine (i) synthetic exteroceptive measurements generated in silico in real time and (ii) vehicle dynamics and proprioceptive measurements generated in motio by vehicle(s) in flight in a motion-capture facility. FlightGoggles is capable of simulating a virtual-reality environment around autonomous vehicle(s) in flight. While a vehicle is in flight in the FlightGoggles virtual reality environment, exteroceptive sensors are rendered synthetically in real time while all complex dynamics are generated organically through natural interactions of the vehicle. The FlightGoggles framework allows for researchers to accelerate development by circumventing the need to estimate complex and hard-to-model interactions such as aerodynamics, motor mechanics, battery electrochemistry, and behavior of other agents. The ability to perform vehicle-in-the-loop experiments with photorealistic exteroceptive sensor simulation facilitates novel research directions involving, e.g., fast and agile autonomous flight in obstacle-rich environments, safe human interaction, and flexible sensor selection. FlightGoggles has been utilized as the main test for selecting nine teams that will advance in the AlphaPilot autonomous drone racing challenge. We survey approaches and results from the top AlphaPilot teams, which may be of independent interest. FlightGoggles is distributed as open-source software along with the photorealistic graphics assets for several simulation environments, under the MIT license at http://flightgoggles.mit.edu.
“An Autonomous Quadrotor System for Robust High-Speed Flight Through Cluttered Environments Without GPS,” by Marc Rigter, Benjamin Morrell, Robert G. Reid, Gene B. Merewether, Theodore Tzanetos, Vinay Rajur, KC Wong, and Larry H. Matthies from University of Sydney, Australia; NASA Jet Propulsion Laboratory, California Institute of Technology, USA; and Georgia Institute of Technology, USA.
Robust autonomous flight without GPS is key to many emerging drone applications, such as delivery, search and rescue, and warehouse inspection. These and other applications require accurate trajectory tracking through cluttered static environments, where GPS can be unreliable, while high-speed, agile, flight can increase efficiency. We describe the hardware and software of a quadrotor system that meets these requirements with onboard processing: a custom 300 mm wide quadrotor that uses two wide-field-of-view cameras for visual-inertial motion tracking and relocalization to a prior map. Collision-free trajectories are planned offline and tracked online with a custom tracking controller. This controller includes compensation for drag and variability in propeller performance, enabling accurate trajectory tracking, even at high speeds where aerodynamic effects are significant. We describe a system identification approach that identifies quadrotor-specific parameters via maximum likelihood estimation from flight data. Results from flight experiments are presented, which 1) validate the system identification method, 2) show that our controller with aerodynamic compensation reduces tracking error by more than 50% in both horizontal flights at up to 8.5 m/s and vertical flights at up to 3.1 m/s compared to the state-of-the-art, and 3) demonstrate our system tracking complex, aggressive, trajectories.
“Morphing Structure for Changing Hydrodynamic Characteristics of a Soft Underwater Walking Robot,” by Michael Ishida, Dylan Drotman, Benjamin Shih, Mark Hermes, Mitul Luhar, and Michael T. Tolley from the University of California, San Diego (UCSD) and University of Southern California, USA.
Existing platforms for underwater exploration and inspection are often limited to traversing open water and must expend large amounts of energy to maintain a position in flow for long periods of time. Many benthic animals overcome these limitations using legged locomotion and have different hydrodynamic profiles dictated by different body morphologies. This work presents an underwater legged robot with soft legs and a soft inflatable morphing body that can change shape to influence its hydrodynamic characteristics. Flow over the morphing body separates behind the trailing edge of the inflated shape, so whether the protrusion is at the front, center, or back of the robot influences the amount of drag and lift. When the legged robot (2.87 N underwater weight) needs to remain stationary in flow, an asymmetrically inflated body resists sliding by reducing lift on the body by 40% (from 0.52 N to 0.31 N) at the highest flow rate tested while only increasing drag by 5.5% (from 1.75 N to 1.85 N). When the legged robot needs to walk with flow, a large inflated body is pushed along by the flow, causing the robot to walk 16% faster than it would with an uninflated body. The body shape significantly affects the ability of the robot to walk against flow as it is able to walk against 0.09 m/s flow with the uninflated body, but is pushed backwards with a large inflated body. We demonstrate that the robot can detect changes in flow velocity with a commercial force sensor and respond by morphing into a hydrodynamically preferable shape.