Tag Archives: serious
After 25 million games, the AI agents playing hide-and-seek with each other had mastered four basic game strategies. The researchers expected that part.
After a total of 380 million games, the AI players developed strategies that the researchers didn’t know were possible in the game environment—which the researchers had themselves created. That was the part that surprised the team at OpenAI, a research company based in San Francisco.
The AI players learned everything via a machine learning technique known as reinforcement learning. In this learning method, AI agents start out by taking random actions. Sometimes those random actions produce desired results, which earn them rewards. Via trial-and-error on a massive scale, they can learn sophisticated strategies.
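To make the trial-and-error idea concrete, here is a minimal sketch of reward-driven learning on a toy problem: an agent choosing among three slot-machine "arms" with different payout odds. This is not OpenAI's training code. The epsilon-greedy rule, the payout probabilities, and the function name train_bandit are all assumptions chosen for illustration.

```python
import random


def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn which action pays best purely from trial and error.

    reward_probs: hypothetical chance of a reward for each action.
    epsilon: fraction of the time the agent acts at random (explores).
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n      # how often each action was tried
    values = [0.0] * n    # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)  # random exploratory action
        else:
            # Otherwise exploit the action that looks best so far.
            action = max(range(n), key=lambda a: values[a])

        # The environment hands out a reward with some probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the average reward for this action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


values, counts = train_bandit([0.2, 0.5, 0.8])
best = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", values)
print("times each action was tried:", counts)
print("action the agent prefers:", best)
```

The agent is never told which arm is best. It settles on the high-paying arm only because random tries occasionally earn rewards, which is the same principle the hide-and-seek agents rely on at a vastly larger scale.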
In the context of games, this process can be abetted by having the AI play against another version of itself, ensuring that the opponents will be evenly matched. It also locks the AI into a process of one-upmanship, where any new strategy that emerges forces the opponent to search for a countermeasure. Over time, this “self-play” amounted to what the researchers call an “auto-curriculum.”
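The one-upmanship dynamic can be illustrated with a far simpler game than hide-and-seek. The sketch below uses rock-paper-scissors and a classic technique called fictitious play, in which each player picks the best reply to everything its opponent has done so far. This is a stand-in chosen for brevity, not the method the OpenAI team used, and the names best_response and self_play are hypothetical.

```python
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def best_response(opponent_counts):
    """Pick the move that scores best against the opponent's past moves."""
    def score(move):
        wins = opponent_counts[BEATS[move]]
        losses = sum(c for other, c in opponent_counts.items()
                     if BEATS[other] == move)
        return wins - losses
    return max(MOVES, key=score)


def self_play(rounds, first_a="rock", first_b="paper"):
    """Two copies of the same decision rule play each other repeatedly."""
    seen_by_a = {m: 0 for m in MOVES}  # what A has watched B play
    seen_by_b = {m: 0 for m in MOVES}  # what B has watched A play
    a, b = first_a, first_b
    history = []
    for _ in range(rounds):
        history.append((a, b))
        seen_by_a[b] += 1
        seen_by_b[a] += 1
        # Each side searches for a countermeasure to the other's habits.
        a, b = best_response(seen_by_a), best_response(seen_by_b)
    return history


for round_number, (a, b) in enumerate(self_play(10), start=1):
    print(round_number, a, b)
```

Neither player can settle on a single move for long: as soon as one strategy becomes common, the opponent switches to whatever beats it, which forces another switch in turn.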
According to OpenAI researcher Igor Mordatch, this experiment shows that self-play “is enough for the agents to learn surprising behaviors on their own—it’s like children playing with each other.”
Reinforcement learning is a hot field of AI research right now. OpenAI’s researchers used the technique when they trained a team of bots to play the video game Dota 2; the bots squashed a world-champion human team last April. The Alphabet subsidiary DeepMind has used it to triumph in the ancient board game Go and the video game StarCraft.
Aniruddha Kembhavi, a researcher at the Allen Institute for Artificial Intelligence (AI2) in Seattle, says games such as hide-and-seek offer a good way for AI agents to learn “foundational skills.” He worked on a team that taught their AllenAI to play Pictionary with humans, viewing the gameplay as a way for the AI to work on common sense reasoning and communication. “We are, however, quite far away from being able to translate these preliminary findings in highly simplified environments into the real world,” says Kembhavi.
AI agents construct a fort during a hide-and-seek game developed by OpenAI.
In OpenAI’s game of hide-and-seek, both the hiders and the seekers received a reward only if they won the game, leaving the AI players to develop their own strategies. Within a simple 3D environment containing walls, blocks, and ramps, the players first learned to run around and chase each other (strategy 1). The hiders next learned to move the blocks around to build forts (2), and then the seekers learned to move the ramps (3), enabling them to jump inside the forts. Then the hiders learned to move all the ramps into their forts before the seekers could use them (4).
The two strategies that surprised the researchers came next. First the seekers learned that they could jump onto a box and “surf” it over to a fort (5), allowing them to jump in—a maneuver that the researchers hadn’t realized was physically possible in the game environment. So as a final countermeasure, the hiders learned to lock all the boxes into place (6) so they weren’t available for use as surfboards.
An AI agent uses a nearby box to surf its way into a competitor’s fort.
In this circumstance, having AI agents behave in an unexpected way wasn’t a problem: They found different paths to their rewards, but didn’t cause any trouble. However, you can imagine situations in which the outcome would be rather serious. Robots acting in the real world could do real damage. And then there’s Nick Bostrom’s famous example of a paper clip factory run by an AI, whose goal is to make as many paper clips as possible. As Bostrom told IEEE Spectrum back in 2014, the AI might realize that “human bodies consist of atoms, and those atoms could be used to make some very nice paper clips.”
Bowen Baker, another member of the OpenAI research team, notes that it’s hard to predict all the ways an AI agent will act inside an environment—even a simple one. “Building these environments is hard,” he says. “The agents will come up with these unexpected behaviors, which will be a safety problem down the road when you put them in more complex environments.”
AI researcher Katja Hofmann at Microsoft Research Cambridge, in England, has seen a lot of gameplay by AI agents: She started a competition that uses Minecraft as the playing field. She says the emergent behavior seen in this game, and in prior experiments by other researchers, shows that games can be useful for studies of safe and responsible AI.
“I find demonstrations like this, in games and game-like settings, a great way to explore the capabilities and limitations of existing approaches in a safe environment,” says Hofmann. “Results like these will help us develop a better understanding on how to validate and debug reinforcement learning systems–a crucial step on the path towards real-world applications.”
Baker says there’s also a hopeful takeaway from the surprises in the hide-and-seek experiment. “If you put these agents into a rich enough environment they will find strategies that we never knew were possible,” he says. “Maybe they can solve problems that we can’t imagine solutions to.”
The Tunnel Circuit of the DARPA Subterranean Challenge starts later this week at the NIOSH research mine just outside of Pittsburgh, Pennsylvania. From 15-22 August, 11 teams will send robots into a mine that they've never seen before, with the goal of making maps and locating items. All DARPA SubT events involve tunnels of one sort or another, but in this case, the “Tunnel Circuit” refers to mines as opposed to urban underground areas or natural caves. This month’s challenge is the first of three discrete events leading up to a huge final event in August of 2021.
While the Tunnel Circuit competition will be closed to the public, and media are only allowed access for a single day (which we'll be at, of course), DARPA has provided a substantial amount of information about what teams will be able to expect. We also have details from the SubT Integration Exercise, called STIX, which was a completely closed event that took place back in April. STIX was aimed at giving some teams (and DARPA) a chance to practice in a real tunnel environment.
For more general background on SubT, here are some articles to get you all caught up:
SubT: The Next DARPA Challenge for Robotics
Q&A with DARPA Program Manager Tim Chung
Meet The First Nine Teams
It makes sense to take a closer look at what happened at April's STIX exercise, because it is (probably) very similar to what teams will experience in the upcoming Tunnel Circuit. STIX took place at Edgar Experimental Mine in Colorado, and while no two mines are the same (and many are very, very different), there are enough similarities for STIX to have been a valuable experience for teams. Here's an overview video of the exercise from DARPA:
DARPA has also put together a much more detailed walkthrough of the STIX mine exercise, which gives you a sense of just how vast, complicated, and (frankly) challenging for robots the mine environment is:
So, that's the kind of thing that teams had to deal with back in April. Since the event was an exercise, rather than a competition, DARPA didn't really keep score, and wouldn't comment on the performance of individual teams. We've been trolling YouTube for STIX footage, though, to get a sense of how things went, and we found a few interesting videos.
Here's a nice overview from Team CERBERUS, which used drones plus an ANYmal quadruped:
Team CTU-CRAS also used drones, along with a tracked robot:
Team Robotika was brave enough to post video of a “fatal failure” experienced by its wheeled robot; the poor little bot gets rescued at about 7:00 in case you get worried:
So that was STIX. But what about the Tunnel Circuit competition this week? Here's a course preview video from DARPA:
It sort of looks like the NIOSH mine might be a bit less dusty than the Edgar mine was, but it could also be wetter and muddier. It’s hard to tell, because we’re just getting a few snapshots of what’s probably an enormous area with kilometers of tunnels that the robots will have to explore. But DARPA has promised “constrained passages, sharp turns, large drops/climbs, inclines, steps, ladders, and mud, sand, and/or water.” Combine that with the serious challenge to communications imposed by the mine itself, and robots will have to be both physically capable, and almost entirely autonomous. Which is, of course, exactly what DARPA is looking to test with this challenge.
Lastly, we had a chance to catch up with Tim Chung, Program Manager for the Subterranean Challenge at DARPA, and ask him a few brief questions about STIX and what we have to look forward to this week.
IEEE Spectrum: How did STIX go?
Tim Chung: It was a lot of fun! I think it gave a lot of the teams a great opportunity to really get a taste of what these types of real world environments look like, and also what DARPA has in store for them in the SubT Challenge. STIX I saw as an experiment—a learning experience for all the teams involved (as well as the DARPA team) so that we can continue our calibration.
What do you think teams took away from STIX, and what do you think DARPA took away from STIX?
I think the thing that teams took away was that, when DARPA hosts a challenge, we have very audacious visions for what the art of the possible is. And that's what we want—in my mind, the purpose of a DARPA Grand Challenge is to provide that inspiration of, ‘Holy cow, someone thinks we can do this!’ So I do think the teams walked away with a better understanding of what DARPA's vision is for the capabilities we're seeking in the SubT Challenge, and hopefully walked away with a better understanding of the technical, physical, even maybe mental challenges of doing this in the wild— which will all roll back into how they think about the problem, and how they develop their systems.
This was a collaborative exercise, so the DARPA field team was out there interacting with the other engineers, figuring out what their strengths and weaknesses and needs might be, and even understanding how to handle the robots themselves. That will help [strengthen] connections between these university teams and DARPA going forward. Across the board, I think that collaborative spirit is something we really wish to encourage, and something that the DARPA folks were able to take away.
What do we have to look forward to during the Tunnel Circuit?
The vision here is that the Tunnel Circuit is representative of one of the three subterranean subdomains, along with urban and cave. Characteristics of all of these three subdomains will be mashed together in an epic final course, so that teams will have to face hints of tunnel once again in that final event.
Without giving too much away, the NIOSH mine will be similar to the Edgar mine in that it's a human-made environment that supports mining operations and research. But of course, every site is different, and these differences, I think, will provide good opportunities for the teams to shine.
Again, we'll be visiting the NIOSH mine in Pennsylvania during the Tunnel Circuit and will post as much as we can from there. But if you’re an actual participant in the Subterranean Challenge, please tweet me @BotJunkie so that I can follow and help share live updates.
[ DARPA Subterranean Challenge ]
When I lived in Beijing back in the 90s, a man walking his bike was nothing to look at. But today, I did a serious double-take at a video of a bike walking his man.
The bike itself looks overloaded but otherwise completely normal. Underneath its simplicity, however, is a hybrid computer chip that combines brain-inspired circuits with machine learning processes into a computing behemoth. Thanks to its smart chip, the bike self-balances as it gingerly rolls down a paved track before smoothly gaining speed into a jogging pace while navigating dexterously around obstacles. It can even respond to simple voice commands such as “speed up,” “left,” or “straight.”
Far from a circus trick, the bike is a real-world demo of the AI community’s latest attempt at fashioning specialized hardware to keep up with the challenges of machine learning algorithms. The Tianjic (天机*) chip isn’t just your standard neuromorphic chip. Rather, it has the architecture of a brain-like chip, but can also run deep learning algorithms—a match made in heaven that basically mashes together neuro-inspired hardware and software.
The study shows that China is readily nipping at the heels of Google, Facebook, NVIDIA, and other tech behemoths investing in developing new AI chip designs—hell, with billions in government investment it may have already had a head start. A sweeping AI plan from 2017 looks to catch up with the US on AI technology and application by 2020. By 2030, China’s aiming to be the global leader—and a champion for building general AI that matches humans in intellectual competence.
The country’s ambition is reflected in the team’s parting words.
“Our study is expected to stimulate AGI [artificial general intelligence] development by paving the way to more generalized hardware platforms,” said the authors, led by Dr. Luping Shi at Tsinghua University.
A Hardware Conundrum
Shi’s autonomous bike isn’t the first robotic two-wheeler. Back in 2015, the famed research nonprofit SRI International in Menlo Park, California teamed up with Yamaha to engineer MOTOBOT, a humanoid robot capable of driving a motorcycle. Powered by state-of-the-art robotic hardware and machine learning, MOTOBOT eventually raced MotoGP world champion Valentino Rossi in a nail-biting match-off.
However, the technological cores of MOTOBOT and Shi’s bike differ vastly, and that difference reflects two pathways towards more powerful AI. One, exemplified by MOTOBOT, is software: developing brain-like algorithms with increasingly efficient architecture, efficacy, and speed. That sounds great, but deep neural nets demand so many computational resources that general-purpose chips can’t keep up.
As Shi told China Science Daily: “CPUs and other chips are driven by miniaturization technologies based on physics. Transistors might shrink to nanoscale-level in 10, 20 years. But what then?” As more transistors are squeezed onto these chips, efficient cooling becomes a limiting factor in computational speed. Tax them too much, and they melt.
For AI processes to continue, we need better hardware. An increasingly popular idea is to build neuromorphic chips, which resemble the brain from the ground up. IBM’s TrueNorth, for example, contains a massively parallel architecture nothing like the traditional Von Neumann structure of classic CPUs and GPUs. Similar to biological brains, TrueNorth’s memory is stored within “synapses” between physical “neurons” etched onto the chip, which dramatically cuts down on energy consumption.
But even these chips are limited. Because computation is tethered to hardware architecture, most chips resemble just one specific type of brain-inspired network called spiking neural networks (SNNs). Without doubt, neuromorphic chips are highly efficient setups with dynamics similar to biological networks. But they don’t play nicely with deep learning and other software-based AI.
Brain-AI Hybrid Core
Shi’s new Tianjic chip brought the two incompatible approaches together onto a single piece of brainy hardware.
The first step was to bridge the deep learning and SNN divide. The two have very different computation philosophies and memory organizations, the team said. The biggest difference, however, is that artificial neural networks transform multidimensional data (image pixels, for example) into a single, continuous, multi-bit stream of 0s and 1s. In contrast, neurons in SNNs activate using something called “binary spikes” that code for specific activation events in time.
Confused? Yeah, it’s hard to wrap my head around it too. That’s because SNNs act very similarly to our neural networks and nothing like computers. A particular neuron needs to generate an electrical signal (a “spike”) large enough to transfer down to the next one; little blips in signals don’t count. The way they transmit data also heavily depends on how they’re connected, or the network topology. The takeaway: SNNs work pretty differently than deep learning.
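A few lines of code can make the "little blips don't count" rule easier to see. The sketch below uses a leaky integrate-and-fire neuron, a standard textbook simplification of a spiking neuron. It is not taken from the Tianjic design, and the threshold and leak values are arbitrary assumptions.

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9):
    """Simulate one leaky integrate-and-fire neuron.

    inputs: incoming signal strength at each time step.
    threshold: how much charge must build up before the neuron fires.
    leak: fraction of the stored charge kept from one step to the next.
    Returns a list of 0s and 1s, where 1 marks a spike.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        # Old charge leaks away a little, then new input is added.
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)   # strong enough: fire a spike
            potential = 0.0    # and reset
        else:
            spikes.append(0)   # too weak: nothing is passed on
    return spikes


print(simulate_lif([0.1, 0.1, 0.1, 0.1, 0.1]))  # weak blips
print(simulate_lif([0.6, 0.6, 0.0, 0.6, 0.6]))  # stronger bursts
```

The weak inputs never push the neuron over its threshold, so it stays silent. The stronger bursts accumulate fast enough to trigger a spike, and what matters downstream is when those spikes occur.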
Shi’s team first recreated this firing quirk in the language of computers—0s and 1s—so that the coding mechanism would become compatible with deep learning algorithms. They then carefully aligned the step-by-step building blocks of the two models, which allowed them to tease out similarities into a common ground to further build on. “On the basis of this unified abstraction, we built a cross-paradigm neuron scheme,” they said.
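One common way to picture a bridge between the two worlds is rate coding, where a continuous value becomes a train of binary spikes whose frequency carries the number. The sketch below is a generic illustration of that idea and should not be read as the coding scheme the Tianjic team built. The function names encode_rate and decode_rate are hypothetical.

```python
def encode_rate(value, steps):
    """Turn a value between 0 and 1 into a binary spike train.

    The larger the value, the more often a spike (1) appears.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0 and 1")
    accumulator = 0.0
    spikes = []
    for _ in range(steps):
        accumulator += value
        if accumulator >= 1.0:
            spikes.append(1)
            accumulator -= 1.0
        else:
            spikes.append(0)
    return spikes


def decode_rate(spikes):
    """Recover the value as the fraction of time steps with a spike."""
    return sum(spikes) / len(spikes)


train = encode_rate(0.25, 8)
print(train)
print(decode_rate(train))
```

A deep learning layer would pass along the number 0.25 directly, while a spiking network would see the same information as one spike in every four time steps.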
In general, the design allowed both computational approaches to share the synapses, where neurons connect and store data, and the dendrites, the branches that collect incoming signals. In contrast, the neuron body, where signals integrate, was left reconfigurable for each type of computation, as were the axons, which carry signals between neurons. Each building block was combined into a single unified functional core (FCore), which acts like a deep learning/SNN converter depending on its specific setup. Translation: the chip can do both types of previously incompatible computation.
Using nanoscale fabrication, the team arranged 156 FCores, containing roughly 40,000 neurons and 10 million synapses, onto a chip less than a fifth of an inch in length and width. Initial tests showcased the chip’s versatility, in that it can run both SNNs and deep learning algorithms such as the popular convolutional neural networks (CNNs) often used in machine vision.
Compared to IBM TrueNorth, the density of Tianjic’s cores increased by 20 percent, speeding up performance ten times and increasing bandwidth at least 100-fold, the team said. When pitted against GPUs, the current hardware darling of machine learning, the chip increased processing throughput up to 100 times, while using just a sliver (1/10,000) of energy.
Although these stats are great, real-life performance is even better as a demo. Here’s where the authors gave their Tianjic brain a body. The team combined one chip with multiple specialized networks to process vision, balance, voice commands, and decision-making in real time. Object detection and target tracking, for example, relied on a CNN, whereas voice commands and balance data were recognized using an SNN. The inputs were then integrated inside a neural state machine, which churned out decisions to downstream output modules, for example by controlling the handlebar to turn left.
Thanks to the chip’s brain-like architecture and bilingual ability, Tianjic “allowed all of the neural network models to operate in parallel and realized seamless communication across the models,” the team said. The result is an autonomous bike that rolls after its human, balances across speed bumps, avoids crashing into roadblocks, and answers to voice commands.
“It’s a wonderful demonstration and quite impressive,” said the editorial team at Nature, which published the study on its cover last week.
However, they cautioned, when comparing Tianjic with state-of-the-art chips designed for a single problem toe-to-toe on that particular problem, Tianjic falls behind. But building these jack-of-all-trades hybrid chips is definitely worth the effort. Compared to today’s limited AI, what people really want is artificial general intelligence, which will require new architectures that aren’t designed to solve one particular problem.
Until people start to explore, innovate, and play around with different designs, it’s not clear how we can further progress in the pursuit of general AI. A self-driving bike might not be much to look at, but its hybrid brain is a pretty neat place to start.
*The name, in Chinese, means “heavenly machine,” “unknowable mystery of nature,” or “confidentiality.” Go figure.
Image Credit: Alexander Ryabintsev / Shutterstock.com