Tag Archives: memory

#437620 The Trillion-Transistor Chip That Just ...

The history of computer chips is a thrilling tale of extreme miniaturization.

The smaller, the better is a trend that’s given birth to the digital world as we know it. So, why on earth would you want to reverse course and make chips a lot bigger? Well, while there’s no particularly good reason to have a chip the size of an iPad in an iPad, such a chip may prove to be genius for more specific uses, like artificial intelligence or simulations of the physical world.

At least, that’s what Cerebras, the maker of the biggest computer chip in the world, is hoping.

The Cerebras Wafer-Scale Engine is massive any way you slice it. The chip is 8.5 inches to a side and houses 1.2 trillion transistors. The next biggest chip, NVIDIA’s A100 GPU, measures an inch to a side and has a mere 54 billion transistors. The former is new, largely untested and, so far, one-of-a-kind. The latter is well-loved, mass-produced, and has taken over the world of AI and supercomputing in the last decade.
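
For a rough sense of scale, the figures above can be turned into ratios. This is only back-of-the-envelope arithmetic: it treats both chips as perfect squares with the side lengths quoted here, which is a simplification, and the variable names are just for illustration.

```python
# Figures as quoted in the article; square dies are an assumption.
wse_side_in, wse_transistors = 8.5, 1.2e12
a100_side_in, a100_transistors = 1.0, 54e9

# How many times larger the WSE is by area and by transistor count.
area_ratio = wse_side_in ** 2 / a100_side_in ** 2
transistor_ratio = wse_transistors / a100_transistors

print(f"area: {area_ratio:.1f}x, transistors: {transistor_ratio:.1f}x")
```

On these numbers, the area grows faster than the transistor count, so the WSE's advantage comes from sheer size rather than from packing transistors more tightly.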

So can Goliath flip the script on David? Cerebras is on a mission to find out.

Big Chips Beyond AI
When Cerebras first came out of stealth last year, the company said it could significantly speed up the training of deep learning models.

Since then, the WSE has made its way into a handful of supercomputing labs, where the company’s customers are putting it through its paces. One of those labs, the National Energy Technology Laboratory, is looking to see what it can do beyond AI.

So, in a recent trial, researchers pitted the chip—which is housed in an all-in-one system about the size of a dorm room mini-fridge called the CS-1—against a supercomputer in a fluid dynamics simulation. Simulating the movement of fluids is a common supercomputer application useful for solving complex problems like weather forecasting and airplane wing design.

The trial was described in a preprint paper written by a team led by Cerebras’s Michael James and NETL’s Dirk Van Essendelft and presented at the supercomputing conference SC20 this week. The team said the CS-1 completed a simulation of combustion in a power plant roughly 200 times faster than it took the Joule 2.0 supercomputer to do a similar task.

The CS-1 was actually faster-than-real-time. As Cerebras wrote in a blog post, “It can tell you what is going to happen in the future faster than the laws of physics produce the same result.”

The researchers said the CS-1’s performance couldn’t be matched by any number of CPUs and GPUs. And CEO and cofounder Andrew Feldman told VentureBeat that would be true “no matter how large the supercomputer is.” At a certain point, scaling a supercomputer like Joule no longer produces better results on this kind of problem. That’s why Joule’s simulation speed peaked at 16,384 cores, a fraction of its total 86,400 cores.
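
Why would adding cores ever stop helping? A toy model gives the flavor. Assume the computation divides evenly across cores while the cost of coordinating them grows with the core count. Both the model and its constants are hypothetical, chosen only for illustration, and are not fitted to Joule or taken from the paper.

```python
def run_time(cores, compute=1.0, comm_per_core=1e-6):
    """Toy run time in arbitrary units: divisible work plus a
    coordination cost that grows with the number of cores."""
    return compute / cores + comm_per_core * cores

# Try powers of two from 64 up to 16,384 cores.
core_counts = [2 ** k for k in range(6, 15)]
best_cores = min(core_counts, key=run_time)

for cores in core_counts:
    print(cores, round(run_time(cores), 6))
```

In this sketch the run time falls at first, bottoms out at `best_cores`, and then climbs again as coordination swamps the savings. Real machines are far messier, but the shape of the curve is the point.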

A comparison of the two machines drives the point home. Joule is the 81st fastest supercomputer in the world, takes up dozens of server racks, consumes up to 450 kilowatts of power, and required tens of millions of dollars to build. The CS-1, by comparison, fits in a third of a server rack, consumes 20 kilowatts of power, and sells for a few million dollars.

While the task is niche (but useful) and the problem well-suited to the CS-1, it’s still a pretty stunning result. So how’d they pull it off? It’s all in the design.

Cut the Commute
Computer chips begin life on a big piece of silicon called a wafer. Multiple chips are etched onto the same wafer and then the wafer is cut into individual chips. While the WSE is also etched onto a silicon wafer, the wafer is left intact as a single, operating unit. This wafer-scale chip contains almost 400,000 processing cores. Each core is connected to its own dedicated memory and its four neighboring cores.
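
The "own memory plus four neighbors" layout can be pictured as a grid. The sketch below assumes a simple rectangular mesh in which cores on the edge have fewer links; the article does not spell out how the edges are handled, so treat that detail and the function name as illustrative.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the grid positions directly above, below, left of, and
    right of (row, col), dropping any that fall off the mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]

# An interior core on a tiny 3x3 mesh talks to four neighbors.
print(mesh_neighbors(1, 1, 3, 3))
# A corner core only has two.
print(mesh_neighbors(0, 0, 3, 3))
```

Every hop in such a mesh is a short on-chip link, which is the property the rest of this section leans on.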

Putting that many cores on a single chip and giving them their own memory is why the WSE is bigger; it’s also why, in this case, it’s better.

Most large-scale computing tasks depend on massively parallel processing. Researchers distribute the task among hundreds or thousands of chips. The chips need to work in concert, so they’re in constant communication, shuttling information back and forth. A similar process takes place within each chip, as information moves between processor cores, which are doing the calculations, and shared memory to store the results.

It’s a little like an old-timey company that does all its business on paper.

The company uses couriers to send and collect documents from other branches and archives across town. The couriers know the best routes through the city, but the trips take some minimum amount of time determined by the distance between the branches and archives, the courier’s top speed, and how many other couriers are on the road. In short, distance and traffic slow things down.

Now, imagine the company builds a brand new gleaming skyscraper. Every branch is moved into the new building and every worker gets a small filing cabinet in their office to store documents. Now any document they need can be stored and retrieved in the time it takes to step across the office or down the hall to their neighbor’s office. The information commute has all but disappeared. Everything’s in the same house.

Cerebras’s megachip is a bit like that skyscraper. The way it shuttles information—aided further by its specially tailored compiling software—is far more efficient than a traditional supercomputer that needs to network a ton of traditional chips.

Simulating the World as It Unfolds
It’s worth noting the chip can only handle problems small enough to fit on the wafer. But such problems may have quite practical applications because of the machine’s ability to do high-fidelity simulation in real-time. The authors note, for example, the machine should in theory be able to accurately simulate the air flow around a helicopter trying to land on a flight deck and semi-automate the process—something not possible with traditional chips.

Another opportunity, they note, would be to use a simulation as input to train a neural network also residing on the chip. In an intriguing and related example, a Caltech machine learning technique recently proved to be 1,000 times faster at solving the same kind of partial differential equations at play here to simulate fluid dynamics.

They also note that improvements in the chip (and others like it, should they arrive) will push back the limits of what can be accomplished. Already, Cerebras has teased the release of its next-generation chip, which will have 2.6 trillion transistors, 850,000 cores, and more than double the memory.

Of course, it still remains to be seen whether wafer-scale computing really takes off. The idea has been around for decades, but Cerebras is the first to pursue it seriously. Clearly, they believe they’ve solved the problem in a way that’s useful and economical.

Other new architectures are also being pursued in the lab. Memristor-based neuromorphic chips, for example, mimic the brain by putting processing and memory into individual transistor-like components. And of course, quantum computers are in a separate lane, but tackle similar problems.

It could be that one of these technologies eventually rises to rule them all. Or, and this seems just as likely, computing may splinter into a bizarre quilt of radical chips, all stitched together to make the most of each depending on the situation.

Image credit: Cerebras

Posted in Human Robots

#437600 Brain-Inspired Robot Controller Uses ...

Robots operating in the real world are starting to find themselves constrained by the amount of computing power they have available. Computers are certainly getting faster and more efficient, but they’re not keeping up with the potential of robotic systems, which have access to better sensors and more data, which in turn makes decision making more complex. A relatively new kind of computing device called a memristor could potentially help robotics smash through this barrier, through a combination of lower complexity, lower cost, and higher speed.

In a paper published today in Science Robotics, a team of researchers from the University of Southern California in Los Angeles and the Air Force Research Laboratory in Rome, N.Y., demonstrate a simple self-balancing robot that uses memristors to form a highly effective analog control system, inspired by the functional structure of the human brain.

First, we should go over just what the heck a memristor is. As the name suggests, it’s a type of memory that is resistance-based. That is, the resistance of a memristor can be programmed, and the memristor remembers that resistance even after it’s powered off (the resistance depends on the magnitude of the voltage applied to the memristor’s two terminals and the length of time that voltage has been applied). Memristors are potentially the ideal hybrid between RAM and flash memory, offering high speed, high density, non-volatile storage. So that’s cool, but what we’re most interested in as far as robot control systems go is that memristors store resistance, making them analog devices rather than digital ones.
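
To make the "programmable resistance that sticks" idea concrete, here is a deliberately crude software model. It assumes an internal state between 0 and 1 that drifts in proportion to voltage multiplied by time, and a resistance that interpolates between two limits. The class name, the constants, and the linear behavior are all assumptions for illustration; real devices are considerably more complicated.

```python
class ToyMemristor:
    """Toy model only: resistance depends on how much voltage has been
    applied for how long, and stays put when nothing is applied."""

    def __init__(self, r_on=100.0, r_off=16000.0, rate=10.0, state=0.0):
        self.r_on = r_on      # resistance when fully "on" (ohms, hypothetical)
        self.r_off = r_off    # resistance when fully "off" (ohms, hypothetical)
        self.rate = rate      # state change per volt-second (hypothetical)
        self.state = state    # internal state, clamped to [0, 1]

    def resistance(self):
        return self.r_on * self.state + self.r_off * (1.0 - self.state)

    def apply(self, volts, seconds):
        """Apply a voltage for a duration and return the new resistance."""
        new_state = self.state + self.rate * volts * seconds
        self.state = min(1.0, max(0.0, new_state))
        return self.resistance()


m = ToyMemristor()
print(m.resistance())      # starts fully "off"
print(m.apply(1.0, 0.05))  # a voltage pulse lowers the resistance
print(m.apply(0.0, 10.0))  # no voltage: the value is remembered
```

The last call is the memory part: with no voltage applied, the state does not move, no matter how much time passes.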

By adding a memristor to an analog circuit with inputs from a gyroscope and an accelerometer, the researchers created a completely analog Kalman filter, which, coupled to a second memristor, functioned as a PD controller.

Nowadays, the word “analog” sounds like a bad thing, but robots are stuck in an analog world, and any physical interactions they have with the world (mediated through sensors) are fundamentally analog in nature. The challenge is that an analog signal is often “messy”—full of noise and non-linearities—and as such, the usual approach now is to get it converted to a digital signal and then processed to get anything useful out of it. This is fine, but it’s also not particularly fast or efficient. Where memristors come in is that they’re inherently analog, and in addition to storing data, they can also act as tiny analog computers, which is pretty wild.

By adding a memristor to an analog circuit with inputs from a gyroscope and an accelerometer, the researchers, led by Wei Wu, an associate professor of electrical engineering at USC, created a completely analog and completely physical Kalman filter to remove noise from the sensor signal. In addition, they used a second memristor to implement a proportional-derivative (PD) controller that acts on that sensor data. Next they put those two components together to build an analog system that can do a bunch of the work required to keep an inverted pendulum robot upright far more efficiently than a traditional system. The difference in performance is readily apparent:
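
For readers who have not met these two building blocks, here is roughly what they compute, written as ordinary digital code. This is a minimal textbook-style sketch and not the researchers' analog circuit: a one-dimensional Kalman filter that blends a gyroscope rate with an accelerometer angle, followed by a PD law. The noise settings and the gains `kp` and `kd` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """One-dimensional Kalman filter for a tilt angle (illustrative values)."""
    angle: float = 0.0              # current estimate (radians)
    variance: float = 1.0           # uncertainty of that estimate
    process_noise: float = 1e-3     # how much we distrust the gyro prediction
    measurement_noise: float = 1e-1 # how much we distrust the accelerometer

    def step(self, gyro_rate, accel_angle, dt):
        # Predict: integrate the gyroscope's angular rate.
        self.angle += gyro_rate * dt
        self.variance += self.process_noise
        # Update: nudge the prediction toward the accelerometer reading.
        gain = self.variance / (self.variance + self.measurement_noise)
        self.angle += gain * (accel_angle - self.angle)
        self.variance *= 1.0 - gain
        return self.angle


def pd_command(angle, angular_rate, kp=20.0, kd=1.5):
    """PD law: push back in proportion to the tilt and to how fast it changes."""
    return -(kp * angle + kd * angular_rate)


kf = ScalarKalman()
for _ in range(200):
    estimate = kf.step(gyro_rate=0.0, accel_angle=0.1, dt=0.001)
print(estimate, pd_command(estimate, 0.0))
```

A digital controller has to run a loop like this on every cycle, after converting the sensor signals to numbers. The appeal of the memristor version is that the equivalent filtering and control happen directly in the analog circuit.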

The shaking you see in the traditionally-controlled robot on the bottom comes from the non-linearity of the dynamic system, which changes faster than the on-board controller can keep up with. The memristors substantially reduce the cycle time, so the robot can balance much more smoothly. Specifically, cycle time is reduced from 3,034 microseconds to just 6 microseconds.
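
Those cycle times translate into control-loop rates with a line or two of arithmetic. Only the two cycle times come from the article; the rest follows from them.

```python
digital_cycle_us = 3034  # microseconds per control cycle, traditional controller
analog_cycle_us = 6      # microseconds per control cycle, memristor-based controller

# Updates per second is one million microseconds divided by the cycle time.
digital_rate_hz = 1e6 / digital_cycle_us
analog_rate_hz = 1e6 / analog_cycle_us
speedup = digital_cycle_us / analog_cycle_us

print(f"{digital_rate_hz:.0f} Hz vs {analog_rate_hz:.0f} Hz, about {speedup:.0f}x")
```

In other words, the controller goes from a few hundred corrections per second to well over a hundred thousand.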

Of course, there’s more going on here, like motor drivers and a digital computer to talk to them, so this robot is really a hybrid system. But guess what? As the researchers point out, so are we!

The human brain consists of the cerebrum, the cerebellum, and the brainstem. The cerebrum is a major part of the brain in charge of vision, hearing, and thinking, whereas the cerebellum plays an important role in motion control. Through this cooperation of the cerebrum and the cerebellum, the human brain can conduct multiple tasks simultaneously with extremely low power consumption. Inspired by this, we developed a hybrid analog-digital computation platform, in which the digital component runs the high-level algorithm, whereas the analog component is responsible for sensor fusion and motion control.

By offloading a bunch of computation onto the memristors, the higher brain functions of the robot have more breathing room. Overall, you reduce power, space, and cost, while substantially improving performance. This has only become possible relatively recently due to memristor advances and availability, and the researchers expect that memristor-based hybrid computing will soon be able to “improve the robustness and the performance of mobile robotic systems with higher” degrees of freedom.

“A memristor-based hybrid analog-digital computing platform for mobile robotics,” by Buyun Chen, Hao Yang, Boxiang Song, Deming Meng, Xiaodong Yan, Yuanrui Li, Yunxiang Wang, Pan Hu, Tse-Hsien Ou, Mark Barnell, Qing Wu, Han Wang, and Wei Wu, from USC Viterbi and AFRL, was published in Science Robotics.

Posted in Human Robots

#437471 How Giving Robots a Hybrid, Human-Like ...

Squeezing a lot of computing power into robots without using up too much space or energy is a constant battle for their designers. But a new approach that mimics the structure of the human brain could provide a workaround.

The capabilities of most of today’s mobile robots are fairly rudimentary, but giving them the smarts to do their jobs is still a serious challenge. Controlling a body in a dynamic environment takes a surprising amount of processing power, which requires both real estate for chips and considerable amounts of energy to power them.

As robots get more complex and capable, those demands are only going to increase. Today’s most powerful AI systems run in massive data centers across far more chips than can realistically fit inside a machine on the move. And the slow death of Moore’s Law suggests we can’t rely on conventional processors getting significantly more efficient or compact anytime soon.

That prompted a team from the University of Southern California to resurrect an idea from more than 40 years ago: mimicking the human brain’s division of labor between two complementary structures. While the cerebrum is responsible for higher cognitive functions like vision, hearing, and thinking, the cerebellum integrates sensory data and governs movement, balance, and posture.

When the idea was first proposed the technology didn’t exist to make it a reality, but in a paper recently published in Science Robotics, the researchers describe a hybrid system that combines analog circuits that control motion and digital circuits that govern perception and decision-making in an inverted pendulum robot.

“Through this cooperation of the cerebrum and the cerebellum, the robot can conduct multiple tasks simultaneously with a much shorter latency and lower power consumption,” write the researchers.

The type of robot the researchers were experimenting with looks essentially like a pole balancing on a pair of wheels. Such robots have a broad range of applications, from hoverboards to warehouse logistics—Boston Dynamics’ recently-unveiled Handle robot operates on the same principles. Keeping them stable is notoriously tough, but the new approach managed to significantly improve on all-digital control approaches by radically improving the speed and efficiency of computations.

Key to bringing the idea alive was the recent emergence of memristors—electrical components whose resistance relies on previous input, which allows them to combine computing and memory in one place in a way similar to how biological neurons operate.

The researchers used memristors to build an analog circuit that runs an algorithm responsible for integrating data from the robot’s accelerometer and gyroscope, which is crucial for detecting the angle and velocity of its body, and another that controls its motion. One key advantage of this setup is that the signals from the sensors are analog, so it does away with the need for extra circuitry to convert them into digital signals, saving both space and power.

More importantly, though, the analog system is an order of magnitude faster and more energy-efficient than a standard all-digital system, the authors report. This not only lets them slash the power requirements, but also lets them cut the processing loop from 3,000 microseconds to just 6. That significantly improves the robot’s stability, with it taking just one second to settle into a steady state compared to more than three seconds using the digital-only platform.

At the minute this is just a proof of concept. The robot the researchers have built is small and rudimentary, and the algorithms being run on the analog circuit are fairly basic. But the principle is a promising one, and there is currently a huge amount of R&D going into neuromorphic and memristor-based analog computing hardware.

As often turns out to be the case, it seems like we can’t go too far wrong by mimicking the best model of computation we have found so far: our own brains.

Image Credit: Photos Hobby / Unsplash

Posted in Human Robots

#437293 These Scientists Just Completed a 3D ...

Human brain maps are a dime a dozen these days. Maps that detail neurons in a certain region. Maps that draw out functional connections between those cells. Maps that dive deeper into gene expression. Or even meta-maps that combine all of the above.

But have you ever wondered: how well do those maps represent my brain? After all, no two brains are alike. And if we’re ever going to reverse-engineer the brain as a computer simulation—as Europe’s Human Brain Project is trying to do—shouldn’t we ask whose brain they’re hoping to simulate?

Enter a new kind of map: the Julich-Brain, a probabilistic map of human brains that accounts for individual differences using a computational framework. Rather than being a static PDF of a brain map, the Julich-Brain atlas is dynamic, in that it continuously changes to incorporate more recent brain mapping results. So far, the map has data from over 24,000 thinly sliced sections from 23 postmortem brains covering most years of adulthood at the cellular level. But the atlas can also continuously adapt to progress in mapping technologies to aid brain modeling and simulation, and link to other atlases and alternatives.

In other words, rather than “just another” human brain map, the Julich-Brain atlas is its own neuromapping API—one that could unite previous brain-mapping efforts with more modern methods.

“It is exciting to see how far the combination of brain research and digital technologies has progressed,” said Dr. Katrin Amunts of the Institute of Neuroscience and Medicine at Research Centre Jülich in Germany, who spearheaded the study.

The Old Dogma
The Julich-Brain atlas embraces traditional brain-mapping while also yanking the field into the 21st century.

First, the new atlas includes the brain’s cytoarchitecture, or how brain cells are organized. As brain maps go, these kinds of maps are the oldest and most fundamental. Rather than exploring how neurons talk to each other functionally—which is all the rage these days with connectome maps—cytoarchitecture maps draw out the physical arrangement of neurons.

Like a census, these maps literally capture how neurons are distributed in the brain, what they look like, and how they layer within and between different brain regions.

Because neurons aren’t packed together the same way between different brain regions, this provides a way to parse the brain into areas that can be further studied. When we talk about the brain’s “memory center,” the hippocampus, or its “emotion center,” the amygdala, these distinctions are based on cytoarchitectural maps.

Some may call this type of mapping “boring.” But cytoarchitecture maps form the very basis of any sort of neuroscience understanding. Like hand-drawn maps from early explorers sailing to the western hemisphere, these maps provide the brain’s geographical patterns from which we try to decipher functional connections. If brain regions are cities, then cytoarchitecture maps attempt to show trading or other “functional” activities that occur in the interlinking highways.

You might’ve heard of the most common cytoarchitecture map used today: the Brodmann map from 1909 (yup, that old), which divided the brain into classical regions based on the cells’ morphology and location. The map, while impactful, wasn’t able to account for brain differences between people. More recent brain-mapping technologies have allowed us to dig deeper into neuronal differences and divide the brain into more regions—180 areas in the cortex alone, compared with 43 in the original Brodmann map.

The new study took inspiration from that age-old map and transformed it into a digital ecosystem.

A Living Atlas
Work began on the Julich-Brain atlas in the mid-1990s, with a little help from the crowd.

The preparation of human tissue and its microstructural mapping, analysis, and data processing is incredibly labor-intensive, the authors lamented, making it impossible to do for the whole brain at high resolution in just one lab. To build their “Google Earth” for the brain, the team hooked up with EBRAINS, a shared computing platform set up by the Human Brain Project to promote collaboration between neuroscience labs in the EU.

First, the team acquired MRI scans of 23 postmortem brains, sliced the brains into wafer-thin sections, and scanned and digitized them. They corrected distortions from the chopping using data from the MRI scans and then lined up neurons in consecutive sections—picture putting together a 3D puzzle—to reconstruct the whole brain. Overall, the team had to analyze 24,000 brain sections, which prompted them to build a computational management system for individual brain sections—a win, because they could now track individual donor brains too.

Their method was quite clever. They first mapped their results to a brain template from a single person, called the MNI-Colin27 template. Because the reference brain was extremely detailed, this allowed the team to better figure out the location of brain cells and regions in a particular anatomical space.

However, MNI-Colin27’s brain isn’t your or my brain—or any of the brains the team analyzed. To dilute any of Colin’s potential brain quirks, the team also mapped their dataset onto an “average brain,” dubbed the ICBM2009c (catchy, I know).

This step allowed the team to “standardize” their results with everything else from the Human Connectome Project and the UK Biobank, kind of like adding their Google Maps layer to the existing map. To highlight individual brain differences, the team overlaid their dataset on existing ones, and looked for differences in the cytoarchitecture.

The microscopic architecture of neurons changes between two areas (dotted line), forming the basis of different identifiable brain regions. To account for individual differences, the team also calculated a probability map (right hemisphere). Image credit: Forschungszentrum Juelich / Katrin Amunts

Based on structure alone, the brains were both remarkably different and shockingly similar at the same time. For example, the cortexes—the outermost layer of the brain—were physically different across donor brains of different age and sex. The region especially divergent between people was Broca’s region, which is traditionally linked to speech production. In contrast, parts of the visual cortex were almost identical between the brains.

The Brain-Mapping Future
Rather than relying on the brain’s visible “landmarks,” which can still differ between people, the probabilistic map is far more precise, the authors said.
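
One simple way to picture a probabilistic map: once every donor brain has been aligned to the same reference space, count, for each location, the fraction of brains in which a given region was found there. The sketch below does this on tiny two-dimensional grids. It is a simplified reading of the idea for illustration only; the function name and data are made up, and the real atlas works in 3D with far more elaborate registration.

```python
def probability_map(masks):
    """Average a list of same-sized 0/1 grids, one per brain.
    Each output cell is the fraction of brains where the region is present."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[r][c] for mask in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


# Three hypothetical brains, each marking where one region was found.
brain_a = [[1, 1], [0, 0]]
brain_b = [[1, 0], [0, 0]]
brain_c = [[1, 1], [0, 1]]

example = probability_map([brain_a, brain_b, brain_c])
for row in example:
    print([round(p, 2) for p in row])
```

Cells where every brain agrees come out as 1 or 0, while the in-between values mark the places where individuals differ.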

What’s more, the map could also pool yet unmapped regions in the cortex—about 30 percent or so—into “gap maps,” providing neuroscientists with a better idea of what still needs to be understood.

“New maps are continuously replacing gap maps with progress in mapping while the process is captured and documented … Consequently, the atlas is not static but rather represents a ‘living map,’” the authors said.

Thanks to its structurally-sound architecture down to individual cells, the atlas can contribute to brain modeling and simulation down the line—especially for personalized brain models for neurological disorders such as seizures. Researchers can also use the framework for other species, and they can even incorporate new data-crunching processors into the workflow, such as mapping brain regions using artificial intelligence.

Fundamentally, the goal is to build shared resources to better understand the brain. “[These atlases] help us—and more and more researchers worldwide—to better understand the complex organization of the brain and to jointly uncover how things are connected,” the authors said.

Image credit: Richard Watts, PhD, University of Vermont and Fair Neuroimaging Lab, Oregon Health and Science University

Posted in Human Robots

#437182 MIT’s Tiny New Brain Chip Aims for AI ...

The human brain operates on roughly 20 watts of power (a third of a 60-watt light bulb) in a space the size of, well, a human head. The biggest machine learning algorithms use closer to a nuclear power plant’s worth of electricity and racks of chips to learn.

That’s not to slander machine learning, but nature may have a tip or two to improve the situation. Luckily, there’s a branch of computer chip design heeding that call. By mimicking the brain, super-efficient neuromorphic chips aim to take AI off the cloud and put it in your pocket.

The latest such chip is smaller than a piece of confetti and has tens of thousands of artificial synapses made out of memristors—chip components that can mimic their natural counterparts in the brain.

In a recent paper in Nature Nanotechnology, a team of MIT scientists say their tiny new neuromorphic chip was used to store, retrieve, and manipulate images of Captain America’s Shield and MIT’s Killian Court. Whereas images stored with existing methods tended to lose fidelity over time, the new chip’s images remained crystal clear.

“So far, artificial synapse networks exist as software. We’re trying to build real neural network hardware for portable artificial intelligence systems,” Jeehwan Kim, associate professor of mechanical engineering at MIT, said in a press release. “Imagine connecting a neuromorphic device to a camera on your car, and having it recognize lights and objects and make a decision immediately, without having to connect to the internet. We hope to use energy-efficient memristors to do those tasks on-site, in real-time.”

A Brain in Your Pocket
Whereas the computers in our phones and laptops use separate digital components for processing and memory—and therefore need to shuttle information between the two—the MIT chip uses analog components called memristors that process and store information in the same place. This is similar to the way the brain works and makes memristors far more efficient. To date, however, they’ve struggled with reliability and scalability.
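
What does "process and store in the same place" look like in practice? A common way to arrange memristors is a crossbar grid, where each device's conductance acts as a stored weight and the wiring itself does the multiply-and-add. The article does not describe the MIT chip's circuit, so the crossbar layout here is an assumption, and the code is only a numerical stand-in for what the physics would do.

```python
def crossbar_output_currents(voltages, conductances):
    """Simulate an idealized crossbar: conductances[i][j] sits where input
    row i crosses output column j. By Ohm's law each device passes
    voltage * conductance, and each column wire sums its devices' currents."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Two input voltages, two output columns; all values are made up.
stored_weights = [[0.5, 0.0],
                  [0.25, 1.0]]
print(crossbar_output_currents([1.0, 2.0], stored_weights))
```

The weights never leave the array: the same devices that hold them also carry out the arithmetic, which is where the efficiency argument comes from.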

To overcome these challenges, the MIT team designed a new kind of silicon-based, alloyed memristor. Ions flowing in memristors made from unalloyed materials tend to scatter as the components get smaller, meaning the signal loses fidelity and the resulting computations are less reliable. The team found an alloy of silver and copper helped stabilize the flow of silver ions between electrodes, allowing them to scale the number of memristors on the chip without sacrificing functionality.

While MIT’s new chip is promising, there’s likely a ways to go before memristor-based neuromorphic chips go mainstream. Between now and then, engineers like Kim have their work cut out for them to further scale and demonstrate their designs. But if successful, they could make for smarter smartphones and other even smaller devices.

“We would like to develop this technology further to have larger-scale arrays to do image recognition tasks,” Kim said. “And some day, you might be able to carry around artificial brains to do these kinds of tasks, without connecting to supercomputers, the internet, or the cloud.”

Special Chips for AI
The MIT work is part of a larger trend in computing and machine learning. As progress in classical chips has flagged in recent years, there’s been an increasing focus on more efficient software and specialized chips to continue pushing the pace.

Neuromorphic chips, for example, aren’t new. IBM and Intel are developing their own designs. So far, their chips have been based on groups of standard computing components, such as transistors (as opposed to memristors), arranged to imitate neurons in the brain. These chips are, however, still in the research phase.

Graphics processing units (GPUs)—chips originally developed for graphics-heavy work like video games—are the best practical example of specialized hardware for AI and were heavily used in this generation of machine learning early on. In the years since, Google, NVIDIA, and others have developed even more specialized chips that cater more specifically to machine learning.

The gains from such specialized chips are already being felt.

In a recent cost analysis of machine learning, research and investment firm ARK Invest said cost declines have far outpaced Moore’s Law. In a particular example, they found the cost to train an image recognition algorithm (ResNet-50) went from around $1,000 in 2017 to roughly $10 in 2019. The fall in cost to actually run such an algorithm was even more dramatic. It took $10,000 to classify a billion images in 2017 and just $0.03 in 2019.
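
To see how lopsided that comparison is, the quoted prices can be set against a Moore's Law baseline. The dollar figures are the ones cited above; the baseline assumes a doubling every two years, which is one common reading of Moore's Law and an assumption here.

```python
years = 2019 - 2017

# Cost improvement factors from the quoted figures.
training_factor = 1000 / 10       # cost to train ResNet-50
inference_factor = 10_000 / 0.03  # cost to classify a billion images

# Baseline: improvement expected if things merely doubled every two years.
moore_factor = 2 ** (years / 2)

print(f"training: {training_factor:.0f}x cheaper")
print(f"inference: {inference_factor:,.0f}x cheaper")
print(f"Moore's Law baseline over {years} years: {moore_factor:.0f}x")
```

Even allowing for a more generous doubling period, the gap between the baseline and the quoted declines is several orders of magnitude for inference.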

Some of these declines can be traced to better software, but according to ARK, specialized chips have improved performance by nearly 16 times in the last three years.

As neuromorphic chips—and other tailored designs—advance further in the years to come, these trends in cost and performance may continue. Eventually, if all goes to plan, we might all carry a pocket brain that can do the work of today’s best AI.

Image credit: Peng Lin

Posted in Human Robots