Tag Archives: human
A scientist from the Graduate School of Engineering at Osaka University proposed a numerical scale to quantify the expressiveness of robotic android faces. By focusing on the range of deformation of the face instead of the number of mechanical actuators, the new system can more accurately measure how much robots are able to mimic actual human emotions. This work, published in Advanced Robotics, may help develop more lifelike robots that can rapidly convey information.
Progress in artificial intelligence has enabled the creation of AIs that perform tasks previously thought only possible for humans, such as translating languages, driving cars, playing board games at world-champion level, and extracting the structure of proteins. However, each of these AIs has been designed and exhaustively trained for a single task and has the ability to learn only what’s needed for that specific task.
Recent AIs that produce fluent text, including in conversation with humans, and generate impressive and unique art can give the false impression of a mind at work. But even these are specialized systems that carry out narrowly defined tasks and require massive amounts of training.
It remains a daunting challenge to combine multiple AIs into one that can learn and perform many different tasks, let alone one that can pursue the full breadth of tasks humans perform or draw on the range of human experience, which reduces the amount of data otherwise needed to learn those tasks. The best current AIs in this respect, such as AlphaZero and Gato, can handle a variety of tasks that fit a single mold, like game-playing. Artificial general intelligence (AGI) that is capable of a breadth of tasks remains elusive.
Ultimately, AGIs need to be able to interact effectively with each other and people in various physical environments and social contexts, integrate the wide varieties of skill and knowledge needed to do so, and learn flexibly and efficiently from these interactions.
Building AGIs comes down to building artificial minds, albeit greatly simplified compared to human minds. And to build an artificial mind, you need to start with a model of cognition.
From Human to Artificial General Intelligence
Humans have an almost unbounded set of skills and knowledge, and quickly learn new information without needing to be re-engineered to do so. It is conceivable that an AGI can be built using an approach that is fundamentally different from human intelligence. However, as three longtime researchers in AI and cognitive science, our approach is to draw inspiration and insights from the structure of the human mind. We are working toward AGI by trying to better understand the human mind, and better understand the human mind by working toward AGI.
From research in neuroscience, cognitive science, and psychology, we know that the human brain is neither a huge homogeneous set of neurons nor a massive set of task-specific programs that each solves a single problem. Instead, it is a set of regions with different properties that support the basic cognitive capabilities that together form the human mind.
These capabilities include perception and action; short-term memory for what is relevant in the current situation; long-term memories for skills, experience, and knowledge; reasoning and decision making; emotion and motivation; and learning new skills and knowledge from the full range of what a person perceives and experiences.
Instead of focusing on specific capabilities in isolation, AI pioneer Allen Newell in 1990 suggested developing Unified Theories of Cognition that integrate all aspects of human thought. Researchers have been able to build software programs called cognitive architectures that embody such theories, making it possible to test and refine them.
Cognitive architectures are grounded in multiple scientific fields with distinct perspectives. Neuroscience focuses on the organization of the human brain, cognitive psychology on human behavior in controlled experiments, and artificial intelligence on useful capabilities.
The Common Model of Cognition
We have been involved in the development of three cognitive architectures: ACT-R, Soar, and Sigma. Other researchers have also been busy on alternative approaches. One paper identified nearly 50 active cognitive architectures. This proliferation of architectures is partly a direct reflection of the multiple perspectives involved, and partly an exploration of a wide array of potential solutions. Yet, whatever the cause, it raises awkward questions both scientifically and with respect to finding a coherent path to AGI.
Fortunately, this proliferation has brought the field to a major inflection point. The three of us have identified a striking convergence among architectures, reflecting a combination of neural, behavioral, and computational studies. In response, we initiated a communitywide effort to capture this convergence in a manner akin to the Standard Model of Particle Physics that emerged in the second half of the 20th century.
This basic model of cognition both explains human thinking and provides a blueprint for true artificial intelligence. Andrea Stocco, CC BY-ND
This Common Model of Cognition divides humanlike thought into multiple modules, with a short-term memory module at the center of the model. The other modules (perception, action, skills, and knowledge) interact through it.
Learning, rather than occurring intentionally, happens automatically as a side effect of processing. In other words, you don’t decide what is stored in long-term memory. Instead, the architecture determines what is learned based on whatever you do think about. This can yield learning of new facts you are exposed to or new skills that you attempt. It can also yield refinements to existing facts and skills.
The modules themselves operate in parallel, allowing you, for example, to remember something while listening and looking around your environment. Each module’s computations are massively parallel, meaning that many small computational steps happen at the same time. For example, in retrieving a relevant fact from a vast trove of prior experiences, the long-term memory module can determine the relevance of all known facts simultaneously, in a single step.
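As a rough illustration of the retrieval step described above, the sketch below scores every stored fact against a cue and returns the best match. It is a toy under stated assumptions, not part of the Common Model or of any cognitive architecture: the `relevance` function (shared-word count), the `retrieve` helper, and the sample facts are all hypothetical, and the list comprehension runs sequentially where the model describes the scoring as happening in parallel.

```python
def relevance(cue, fact):
    """Hypothetical relevance score: how many words the cue and the fact share."""
    return len(set(cue.lower().split()) & set(fact.lower().split()))


def retrieve(cue, long_term_memory):
    """Score every stored fact against the cue, then return the top fact and all scores."""
    # Conceptually this scoring happens for all facts at once;
    # plain Python evaluates it one fact at a time.
    scores = [relevance(cue, fact) for fact in long_term_memory]
    best = max(range(len(scores)), key=scores.__getitem__)
    return long_term_memory[best], scores


# Invented facts, for illustration only.
memory = [
    "the package is in the lobby",
    "the robot patrols the building at night",
    "puzzles have rules",
]
fact, scores = retrieve("where is the package", memory)
print(fact, scores)
```

The point of the sketch is only that relevance is computed for the whole memory before a single fact is selected; a real architecture would use a far richer notion of relevance.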
Guiding the Way to Artificial General Intelligence
The Common Model is based on the current consensus in research in cognitive architectures and has the potential to guide research on both natural and artificial general intelligence. When used to model communication patterns in the brain, the Common Model yields more accurate results than leading models from neuroscience. This extends its ability to model humans—the one system proven capable of general intelligence—beyond cognitive considerations to include the organization of the brain itself.
We are starting to see efforts to relate existing cognitive architectures to the Common Model and to use it as a baseline for new work—for example, an interactive AI designed to coach people toward better health behavior. One of us was involved in developing an AI based on Soar, dubbed Rosie, that learns new tasks via instructions in English from human teachers. It learns 60 different puzzles and games and can transfer what it learns from one game to another. It also learns to control a mobile robot for tasks such as fetching and delivering packages and patrolling buildings.
Rosie is just one example of how to build an AI that approaches AGI via a cognitive architecture that is well characterized by the Common Model. In this case, the AI automatically learns new skills and knowledge during general reasoning that combines natural language instruction from humans and a minimal amount of experience—in other words, an AI that functions more like a human mind than today’s AIs, which learn via brute computing force and massive amounts of data.
From a broader AGI perspective, we look to the Common Model both as a guide in developing such architectures and AIs, and as a means for integrating the insights derived from those attempts into a consensus that ultimately leads to AGI.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Image Credit: Shutterstock.com/wowowG
Humans behave and act in a way that other humans can recognize as human-like. If humanness has specific features, is it possible to replicate these features on a machine like a robot? Researchers at IIT-Istituto Italiano di Tecnologia (Italian Institute of Technology) tried to answer that question by implementing a non-verbal Turing test in a human-robot interaction task. They involved human participants and the humanoid robot iCub in a joint action experiment. What they found is that specific features of human behavior, namely response timing, can be transferred to the robot so that humans cannot tell whether they are interacting with a person or a machine.
Watching robots operate with speed and precision is always impressive, if not, at this point, always surprising. Sophisticated sensors and fast computing mean that a powerful and agile robot, like a drone, that knows exactly where it is and exactly where it’s going can reliably move in highly dynamic ways. This is not to say that it’s easy for the drone, but if you’ve got a nice external localization system, a powerful off-board computer, and a talented team of roboticists, you can perform some amazingly agile high-speed maneuvers that most humans could never hope to match.
I say “most” humans, because there are some exceptionally talented humans who are, in fact, able to achieve a level of performance similar to that of even the fastest and most agile drones. The sport of FPV (first-person view) drone racing tests what’s possible with absurdly powerful drones in the hands of humans who must navigate complex courses with speed and precision that seems like it shouldn’t be possible, all while relying solely on a video feed sent from a camera on the front of the drone to the pilot’s VR headset. It’s honestly astonishing to watch.
A year ago, autonomous racing quadrotors from Davide Scaramuzza’s Robotics and Perception Group at the University of Zurich (UZH) proved that they could beat the world’s fastest humans in a drone race. However, the drones relied on a motion-capture system to provide very high resolution position information in real time, along with a computer sending control information from the safety and comfort of a nearby desk, which doesn’t really seem like a fair competition.
Earlier this month, a trio of champion drone racers traveled to Zurich for a rematch, but this time, the race would be fair: no motion-capture system. Nothing off-board. Just drones and humans using their own vision systems and their own computers (or brains) to fly around a drone racing track as fast as possible.
To understand what kind of a challenge this is, it’s important to have some context for the level of speed and agility. So here’s a video of one of UZH’s racing drones completing three laps of a track using the motion-capture system and off-board computation. This particular demo isn’t “fair,” but it does give an indication of what peak performance of a racing drone looks like, with a reaction from one of the professional human pilots (Thomas Bitmatta) at the end:
As Thomas says at the end of the video, the autonomous drone made it through one lap of the course in 5.3 seconds. With a peak speed of 110 kilometers per hour, this was a staggering 1.8 seconds per lap faster than Thomas, who has twice won FPV drone racing’s MultiGP International World Cup.
The autonomous drone has several advantages in this particular race. First, it has near-perfect state estimation, thanks to a motion-capture system that covers the entire course. In other words, the drone always knows exactly where it is, as well as its precise speed and orientation. Experienced human pilots develop an intuition for estimating the state of their system, but they can’t even watch their own drone while racing since they’re immersed in the first-person view the entire time. The second advantage the autonomous drone has is that it’s able to compute a trajectory that traverses the course in a time-optimal way, considering the course layout and the constraints imposed by the drone itself. Human pilots have to practice on a course for hours (or even days) to discover what they think is an optimal trajectory, but they have no way of knowing for sure whether their racing lines can be improved or not.
Elia Kaufmann prepares one of UZH's vision-based racing drones on its launch platform. Evan Ackerman/IEEE Spectrum
So what, then, would make for a drone race in which humans and robots can compete fairly but doesn’t ask the robots to be less robotic or the humans to be less human-y?
No external help. No motion-capture system or off-board compute. Arguably, the humans have something of an advantage here, since they are off-board by definition, but the broader point of this research is to endow drones with the ability to fly themselves in aggressive and agile ways, so it’s a necessary compromise.
Complete knowledge of the course. Nothing on the course is secret, and humans can walk through it and develop a mental model. The robotic system, meanwhile, gets an actual CAD model. Both humans and robots also get practice time: humans on the physical course with real drones, and the autonomous system in simulation. Both humans and robots can use this practice time to find an optimal trajectory in advance.
Vision only. The autonomous drones use Intel RealSense stereo-vision sensors, while the humans use a monocular camera streaming video from the drone. The humans may not get a stereo feed, but they do get better resolution and higher frames per second than the RealSense gives the autonomous drone.
Three world-class human pilots were invited to Zurich for this race. Along with Thomas Bitmatta, UZH hosted Alex Vanover (2019 Drone Racing League champion) and Marvin Schäpper (2021 Swiss Drone League champion). Each pilot had as much time as they wanted on the course in advance, flying more than 700 practice laps in total. And on a Friday night in a military aircraft hangar outside of Zurich, the races began. Here are some preliminary clips from one of the vision-based autonomous drones flying computer-to-head with a human; the human-piloted drone is red, while the autonomous drone is blue:
With a top speed of 80 km/h, the vision-based autonomous drone outraced the fastest human by 0.5 seconds during a three-lap race, where just one- or two-tenths of a second is frequently the difference between a win and a loss. This victory for the vision-based autonomous drone is a big deal, as Davide Scaramuzza explains:
“This demonstrates that AI-vs.-human drone racing has the potential to revolutionize drone racing as a sport. What’s clear is that superhuman performance with AI drones can be achieved, but there is still a lot of work to be done to robustify these AI systems to bring them from a controlled environment to real-world applications. More details will be given in follow-up scientific publications.”
I was at this event in Zurich, and I’d love to tell you more about it. I will tell you more about it, but as Davide says, the UZH researchers are working on publishing their results, meaning that all the fascinating details about exactly what happened and why will have to wait a bit until they’ve got everything properly written up. So stay tuned: we’ll have lots more for you on this.
The University of Zurich provided travel support to assist us with covering this event in person.
When you read a sentence like this one, your past experience tells you that it’s written by a thinking, feeling human. And, in this case, there is indeed a human typing these words: [Hi, there!]. But these days, some sentences that appear remarkably humanlike are actually generated by artificial intelligence systems trained on massive amounts of human text.
People are so accustomed to assuming that fluent language comes from a thinking, feeling human that evidence to the contrary can be difficult to wrap your head around. How are people likely to navigate this relatively uncharted territory? Because of a persistent tendency to associate fluent expression with fluent thought, it is natural—but potentially misleading—to think that if an AI model can express itself fluently, that means it thinks and feels just like humans do.
Thus, it is perhaps unsurprising that a former Google engineer recently claimed that Google’s AI system LaMDA has a sense of self because it can eloquently generate text about its purported feelings. This event and the subsequent media coverage led to a number of rightly skeptical articles and posts about the claim that computational models of human language are sentient, meaning capable of thinking and feeling and experiencing.
The question of what it would mean for an AI model to be sentient is complicated (see, for instance, our colleague’s take), and our goal here is not to settle it. But as language researchers, we can use our work in cognitive science and linguistics to explain why it is all too easy for humans to fall into the cognitive trap of thinking that an entity that can use language fluently is sentient, conscious, or intelligent.
Using AI to Generate Humanlike Language
Text generated by models like Google’s LaMDA can be hard to distinguish from text written by humans. This impressive achievement is a result of a decades-long program to build models that generate grammatical, meaningful language.
The first computer system to engage people in dialogue was psychotherapy software called Eliza, built more than half a century ago. Image Credit: Rosenfeld Media/Flickr, CC BY
Early versions dating back to at least the 1950s, known as n-gram models, simply counted up occurrences of specific phrases and used them to guess what words were likely to occur in particular contexts. For instance, it’s easy to know that “peanut butter and jelly” is a more likely phrase than “peanut butter and pineapples.” If you have enough English text, you will see the phrase “peanut butter and jelly” again and again but might never see the phrase “peanut butter and pineapples.”
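The counting idea behind n-gram models can be sketched in a few lines of Python. This is a minimal illustration, assuming a tiny invented corpus and a four-word window; the function names `train_ngram_counts` and `next_word_probability` are hypothetical and do not come from any particular historical system.

```python
from collections import Counter, defaultdict


def train_ngram_counts(text, n=4):
    """Count which word follows each (n-1)-word context in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        next_word = words[i + n - 1]
        counts[context][next_word] += 1
    return counts


def next_word_probability(counts, context, word):
    """Estimate P(word | context) as a relative frequency; 0.0 for an unseen context."""
    followers = counts.get(tuple(context))
    if not followers:
        return 0.0
    return followers[word] / sum(followers.values())


# Toy corpus, invented for illustration only.
corpus = (
    "i like peanut butter and jelly . "
    "she ate peanut butter and jelly . "
    "he made peanut butter and banana toast ."
)
counts = train_ngram_counts(corpus, n=4)
context = ["peanut", "butter", "and"]
p_jelly = next_word_probability(counts, context, "jelly")
p_pineapples = next_word_probability(counts, context, "pineapples")
print(p_jelly, p_pineapples)
```

In this toy corpus, “jelly” follows “peanut butter and” in two of three cases, while “pineapples” never appears, so the model assigns it a probability of zero. With real text the counts would come from a much larger collection.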
Today’s models, sets of data and rules that approximate human language, differ from these early attempts in several important ways. First, they are trained on essentially the entire internet. Second, they can learn relationships between words that are far apart, not just words that are neighbors. Third, they are tuned by a huge number of internal “knobs”—so many that it is hard for even the engineers who design them to understand why they generate one sequence of words rather than another.
The models’ task, however, remains the same as in the 1950s: determine which word is likely to come next. Today, they are so good at this task that almost all sentences they generate seem fluid and grammatical.
Peanut Butter and Pineapples?
We asked a large language model, GPT-3, to complete the sentence “Peanut butter and pineapples___”. It said: “Peanut butter and pineapples are a great combination. The sweet and savory flavors of peanut butter and pineapple complement each other perfectly.” If a person said this, one might infer that they had tried peanut butter and pineapple together, formed an opinion, and shared it with the reader.
But how did GPT-3 come up with this paragraph? By generating a word that fit the context we provided. And then another one. And then another one. The model never saw, touched, or tasted pineapples—it just processed all the texts on the internet that mention them. And yet reading this paragraph can lead the human mind—even that of a Google engineer—to imagine GPT-3 as an intelligent being that can reason about peanut butter and pineapple dishes.
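The “one word, then another” loop can be sketched as follows. This is only an illustration of the generation loop, not of how GPT-3 works internally: a tiny table of word-pair counts stands in for the neural network, the corpus is invented, the helper names `train_bigrams` and `generate` are hypothetical, and always picking the single most frequent next word is a simplifying assumption.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.split()
    table = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        table[prev][nxt] += 1
    return table


def generate(table, prompt, max_new_words=5):
    """Extend the prompt one word at a time with the most frequent follower."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = table.get(words[-1])
        if not followers:
            break  # nothing has ever followed this word in the corpus
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Toy corpus, invented for illustration only.
corpus = "peanut butter and jelly taste great . toast and jelly taste great ."
table = train_bigrams(corpus)
completion = generate(table, "peanut butter and", max_new_words=4)
print(completion)
```

Nothing in the loop consults taste, experience, or opinion; each step only asks which word tends to come next given the words so far.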
The human brain is hardwired to infer intentions behind words. Every time you engage in conversation, your mind automatically constructs a mental model of your conversation partner. You then use the words they say to fill in the model with that person’s goals, feelings, and beliefs.
The process of jumping from words to the mental model is seamless, getting triggered every time you receive a fully fledged sentence. This cognitive process saves you a lot of time and effort in everyday life, greatly facilitating your social interactions.
However, in the case of AI systems, it misfires, building a mental model out of thin air.
A little more probing can reveal the severity of this misfire. Consider the following prompt: “Peanut butter and feathers taste great together because___”. GPT-3 continued: “Peanut butter and feathers taste great together because they both have a nutty flavor. Peanut butter is also smooth and creamy, which helps to offset the feather’s texture.”
The text in this case is as fluent as our example with pineapples, but this time the model is saying something decidedly less sensible. One begins to suspect that GPT-3 has never actually tried peanut butter and feathers.
Ascribing Intelligence to Machines, Denying it to Humans
A sad irony is that the same cognitive bias that makes people ascribe humanity to GPT-3 can cause them to treat actual humans in inhumane ways. Sociocultural linguistics—the study of language in its social and cultural context—shows that assuming an overly tight link between fluent expression and fluent thinking can lead to bias against people who speak differently.
For instance, people with a foreign accent are often perceived as less intelligent and are less likely to get the jobs they are qualified for. Similar biases exist against speakers of dialects that are not considered prestigious, such as Southern English in the US, against deaf people using sign languages, and against people with speech impediments such as stuttering.
These biases are deeply harmful, often lead to racist and sexist assumptions, and have been shown again and again to be unfounded.
Fluent Language Alone Does Not Imply Humanity
Will AI ever become sentient? This question requires deep consideration, and indeed philosophers have pondered it for decades. What researchers have determined, however, is that you cannot simply trust a language model when it tells you how it feels. Words can be misleading, and it is all too easy to mistake fluent speech for fluent thought.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Image Credit: Tancha/Shutterstock.com