#436258 For Centuries, People Dreamed of a ...
This is part six of a six-part series on the history of natural language processing.
In February of this year, OpenAI, one of the foremost artificial intelligence labs in the world, announced that a team of researchers had built a powerful new text generator called the Generative Pre-Trained Transformer 2, or GPT-2 for short. The researchers used an unsupervised learning algorithm to train their system on a broad set of natural language processing (NLP) capabilities, including reading comprehension, machine translation, and the ability to generate long strings of coherent text.
But as is often the case with NLP technology, the tool held both great promise and great peril. Researchers and policy makers at the lab were concerned that their system, if widely released, could be exploited by bad actors and misappropriated for “malicious purposes.”
The people of OpenAI, which defines its mission as “discovering and enacting the path to safe artificial general intelligence,” were concerned that GPT-2 could be used to flood the Internet with fake text, thereby degrading an already fragile information ecosystem. For this reason, OpenAI decided that it would not release the full version of GPT-2 to the public or other researchers.
GPT-2 is an example of a technique in NLP called language modeling, whereby the computational system internalizes a statistical blueprint of a text so it’s able to mimic it. Just like the predictive text on your phone—which selects words based on words you’ve used before—GPT-2 can look at a string of text and then predict what the next word is likely to be based on the probabilities inherent in that text.
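To make the idea of next-word prediction concrete, here is a minimal sketch of a word-pair (bigram) counter. It is far simpler than GPT-2 and is not how GPT-2 works internally; the function names and the toy corpus are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    follower_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts

def predict_next_word(follower_counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = follower_counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus (hypothetical); "the" is followed by "cat" twice and "mat" once.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
print(predict_next_word(model, "the"))  # prints "cat"
```

A phone keyboard or a large neural model conditions on much more context than one preceding word, but the principle of picking a likely continuation from observed statistics is the same.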
GPT-2 can be seen as a descendant of the statistical language modeling that the Russian mathematician A. A. Markov developed in the early 20th century (covered in part three of this series).
What’s different with GPT-2, though, is the scale of the textual data modeled by the system. Whereas Markov analyzed a string of 20,000 letters to create a rudimentary model that could predict the likelihood of the next letter of a text being a consonant or a vowel, GPT-2 was trained on roughly 8 million web pages linked from Reddit to predict what the next word might be.
And whereas Markov manually trained his model by counting only two categories (vowels and consonants), GPT-2 used cutting-edge machine learning algorithms to do linguistic analysis with 1.5 billion parameters, burning through huge amounts of computational power in the process.
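Markov's letter-counting exercise can be approximated in a few lines. The sketch below estimates how likely a vowel or consonant is to follow the current letter class. It assumes the English vowel set for illustration, whereas Markov worked by hand on a Russian text, and the function names are hypothetical.

```python
from collections import Counter

# Assumption: English vowels only; any other alphabetic character counts as a consonant.
VOWELS = set("aeiou")

def letter_classes(text):
    """Map each alphabetic character to 'V' (vowel) or 'C' (consonant)."""
    return ["V" if ch in VOWELS else "C" for ch in text.lower() if ch.isalpha()]

def transition_probabilities(text):
    """Estimate P(next class | current class) from counts of adjacent letters."""
    classes = letter_classes(text)
    pair_counts = Counter(zip(classes, classes[1:]))
    totals = Counter(classes[:-1])  # every letter that has a successor
    return {
        (current, nxt): count / totals[current]
        for (current, nxt), count in pair_counts.items()
    }

# In "banana" the classes strictly alternate, so each transition has probability 1.0.
print(transition_probabilities("banana"))
```

On a long passage the four resulting probabilities (vowel to vowel, vowel to consonant, and so on) form the kind of two-state model described above.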
The results were impressive. In their blog post, OpenAI reported that GPT-2 could generate synthetic text in response to prompts, mimicking whatever style of text it was shown. If you prompt the system with a line of William Blake’s poetry, it can generate a line back in the Romantic poet’s style. If you prompt the system with a cake recipe, you get a newly invented recipe in response.
Perhaps the most compelling feature of GPT-2 is that it can answer questions accurately. For example, when OpenAI researchers asked the system, “Who wrote the book The Origin of Species?”—it responded: “Charles Darwin.” While only able to respond accurately some of the time, the feature does seem to be a limited realization of Gottfried Leibniz’s dream of a language-generating machine that could answer any and all human questions (described in part two of this series).
After observing the power of the new system in practice, OpenAI elected not to release the fully trained model. In the lead-up to its release in February, there had been heightened awareness about “deepfakes,” synthetic images and videos, generated via machine learning techniques, in which people do and say things they haven’t really done and said. Researchers at OpenAI worried that GPT-2 could be used to essentially create deepfake text, making it harder for people to trust textual information online.
Responses to this decision varied. On one hand, OpenAI’s caution prompted an overblown reaction in the media, with articles about the “dangerous” technology feeding into the Frankenstein narrative that often surrounds developments in AI.
Others took issue with OpenAI’s self-promotion, with some even suggesting that OpenAI purposefully exaggerated GPT-2’s power in order to create hype, while contravening a norm in the AI research community, where labs routinely share data, code, and pre-trained models. As machine learning researcher Zachary Lipton tweeted, “Perhaps what's *most remarkable* about the @OpenAI controversy is how *unremarkable* the technology is. Despite their outsize attention & budget, the research itself is perfectly ordinary—right in the main branch of deep learning NLP research.”
OpenAI stood by its decision to release only a limited version of GPT-2, but has since released larger models for other researchers and the public to experiment with. As yet, there has been no reported case of a widely distributed fake news article generated by the system. But there have been a number of interesting spin-off projects, including GPT-2 poetry and a webpage where you can prompt the system with questions yourself.
There’s even a Reddit group populated entirely with text produced by GPT-2-powered bots. Mimicking humans on Reddit, the bots have long conversations about a variety of topics, including conspiracy theories and Star Wars movies.
This bot-powered conversation may signify the new condition of life online, where language is increasingly created by a combination of human and non-human agents, and where maintaining the distinction between human and non-human, despite our best efforts, is increasingly difficult.
The idea of using rules, mechanisms, and algorithms to generate language has inspired people in many different cultures throughout history. But it’s in the online world that this powerful form of wordcraft may really find its natural milieu—in an environment where the identity of speakers becomes more ambiguous, and perhaps, less relevant. It remains to be seen what the consequences will be for language, communication, and our sense of human identity, which is so bound up with our ability to speak in natural language.
Last week’s post explained how an innocent Microsoft chatbot turned instantly racist on Twitter.
You can also check out our prior series on the untold history of AI.
#436174 How Selfish Are You? It Matters for ...
Our personalities impact almost everything we do, from the career path we choose to the way we interact with others to how we spend our free time.
But what about the way we drive—could personality be used to predict whether a driver will cut someone off, speed, or, say, zoom through a yellow light instead of braking?
There must be something to the idea that those of us who are more mild-mannered are likely to drive a little differently than the more assertive among us. At least, that’s what a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is betting on.
“Working with and around humans means figuring out their intentions to better understand their behavior,” said graduate student Wilko Schwarting, lead author on the paper published this week in Proceedings of the National Academy of Sciences. “People’s tendencies to be collaborative or competitive often spills over into how they behave as drivers. In this paper we sought to understand if this was something we could actually quantify.”
The team is building a model that classifies drivers according to how selfish or selfless they are, then uses that classification to help predict how drivers will behave on the road. Ideally, the system will help improve safety for self-driving cars by integrating a degree of ‘humanity’ into how their software perceives its surroundings; right now, human drivers and their cars are just another object, not much different than a tree or a sign.
But unlike trees and signs, humans have behavioral patterns and motivations. For greater success on roads that are still dominated by us mercurial humans, the CSAIL team believes, driverless cars should take our personalities into account.
How Selfish Are You?
How important is your own well-being to you compared with the well-being of other people? It’s a hard question to answer without specifying who the other people are; your answer would likely differ if we’re talking about your friends, loved ones, strangers, or people you actively dislike.
In social psychology, social value orientation (SVO) refers to people’s preferences for allocating resources between themselves and others. The two broad categories people can fall into are prosocial (people who are more cooperative, and expect cooperation from others) and pro-self (pretty self-explanatory: “Me first!”).
Based on drivers’ behavior in two different road scenarios (merging and making a left turn), the CSAIL team’s model classified drivers as prosocial or egoistic. Slowing down to let someone merge into your lane in front of you would earn you a prosocial classification, while cutting someone off or not slowing down to allow a left turn would make you egoistic.
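One way a preference like SVO is sometimes formalized is as an angle that trades off reward to oneself against reward to others. The sketch below assumes that angular form. The function names and the 22.5-degree cutoff are hypothetical choices for illustration, not details taken from the CSAIL paper.

```python
import math

def svo_utility(reward_self, reward_other, svo_angle_deg):
    """Combine own and others' reward, weighted by an SVO angle (assumed formulation).

    0 degrees means only one's own reward counts; larger angles put
    progressively more weight on the reward to others.
    """
    angle = math.radians(svo_angle_deg)
    return math.cos(angle) * reward_self + math.sin(angle) * reward_other

def classify_driver(svo_angle_deg, prosocial_threshold_deg=22.5):
    """Label a driver from an estimated angle; the threshold is a hypothetical cutoff."""
    return "prosocial" if svo_angle_deg >= prosocial_threshold_deg else "egoistic"

# A purely self-interested driver (0 degrees) ignores the other driver's reward.
print(svo_utility(reward_self=1.0, reward_other=5.0, svo_angle_deg=0))  # prints 1.0
print(classify_driver(45))  # prints "prosocial"
```

Under this kind of model, a driver who slows to let someone merge is behaving as if the other driver's reward carries real weight, which corresponds to a larger angle.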
On the Road
The system then uses these classifications to model and predict drivers’ behavior. The team demonstrated that using their model, errors in predicting the behavior of other cars were reduced by 25 percent.
In a left-turn simulation, for example, their car would wait when an approaching car had an egoistic driver, but go ahead and make the turn when the other driver was prosocial. Similarly, if a self-driving car is trying to merge into the left lane and it’s identified the drivers in that lane as egoistic, it will assume they won’t slow down to let it in, and will wait to merge behind them. If, on the other hand, the self-driving car knows that the human drivers in the left lane are prosocial, it will attempt to merge between them since they’re likely to let it in.
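The two scenarios above boil down to simple decision rules once each driver has a label. The following is a minimal sketch of those rules only. A real planner would presumably reason over speeds, gaps, and trajectories rather than string labels, and all names here are hypothetical.

```python
def left_turn_action(oncoming_driver_type):
    """Turn only when the oncoming driver is labeled prosocial; otherwise wait."""
    return "turn" if oncoming_driver_type == "prosocial" else "wait"

def merge_action(lane_driver_types):
    """Merge between drivers only if every nearby driver in the lane is prosocial.

    Note: an empty list (no nearby drivers) also yields "merge_between",
    because there is nobody who might refuse to yield.
    """
    if all(driver == "prosocial" for driver in lane_driver_types):
        return "merge_between"
    return "merge_behind"

print(left_turn_action("egoistic"))                # prints "wait"
print(merge_action(["prosocial", "prosocial"]))    # prints "merge_between"
print(merge_action(["prosocial", "egoistic"]))     # prints "merge_behind"
```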
So how does this all translate to better safety?
It’s essentially a starting point for imbuing driverless cars with some of the abilities and instincts that are innate to humans. If you’re driving down the highway and you see a car swerving outside its lane, you’ll probably distance yourself from that car because you know it’s more likely to cause an accident. Our senses take in information we can immediately interpret and act on, and this includes predictions about what might happen based on observations of what just happened. Our observations can clue us in to a driver’s personality (the swerver must be careless) or simply to the circumstances of a given moment (the swerver was texting).
But right now, self-driving cars assume all human drivers behave the same way, and they have no mechanism for incorporating observations about behavioral differences between drivers into their decisions.
“Creating more human-like behavior in autonomous vehicles (AVs) is fundamental for the safety of passengers and surrounding vehicles, since behaving in a predictable manner enables humans to understand and appropriately respond to the AV’s actions,” said Schwarting.
Though it may feel a bit unsettling to think of an algorithm lumping you into a category and driving accordingly around you, maybe it’s less unsettling than thinking of self-driving cars as pre-programmed, oblivious robots unable to adapt to different driving styles.
The team’s next step is to apply their model to pedestrians, bikes, and other agents frequently found in driving environments. They also plan to look into other robotic systems that act among people, such as household robots, and to integrate social value orientation into their algorithms.
Image Credit: Image by Free-Photos from Pixabay