The Trillion-Transistor Chip That Just Left a Supercomputer in the Dust

8:14
 
Share
 

Archived series ("Inactive feed" status)

When? This feed was archived on January 09, 2021 04:30 (6d ago). Last successful fetch was on December 06, 2020 21:47 (1M ago)

Why? Inactive feed status. Our servers were unable to retrieve a valid podcast feed for a sustained period.

What now? You might be able to find a more up-to-date version using the search function. This series will no longer be checked for updates. If you believe this to be in error, please check if the publisher's feed link below is valid and contact support to request the feed be restored or if you have any other concerns about this.

Manage episode 278311360 series 2515134
By Singularity Hub. Discovered by Player FM and our community — copyright is owned by the publisher, not Player FM, and audio is streamed directly from their servers. Hit the Subscribe button to track updates in Player FM, or paste the feed URL into other podcast apps.
The history of computer chips is a thrilling tale of extreme miniaturization. The smaller, the better is a trend that’s given birth to the digital world as we know it. So, why on earth would you want to reverse course and make chips a lot bigger? Well, while there’s no particularly good reason to have a chip the size of an iPad in an iPad, such a chip may prove to be genius for more specific uses, like artificial intelligence or simulations of the physical world. At least, that’s what Cerebras, the maker of the biggest computer chip in the world, is hoping. The Cerebras Wafer-Scale Engine is massive any way you slice it. The chip is 8.5 inches to a side and houses 1.2 trillion transistors. The next biggest chip, NVIDIA’s A100 GPU, measures an inch to a side and has a mere 54 billion transistors. The former is new, largely untested and, so far, one-of-a-kind. The latter is well-loved, mass-produced, and has taken over the world of AI and supercomputing in the last decade. So can Goliath flip the script on David? Cerebras is on a mission to find out. Big Chips Beyond AI When Cerebras first came out of stealth last year, the company said it could significantly speed up the training of deep learning models. Since then, the WSE has made its way into a handful of supercomputing labs, where the company’s customers are putting it through its paces. One of those labs, the National Energy Technology Laboratory, is looking to see what it can do beyond AI. So, in a recent trial, researchers pitted the chip—which is housed in an all-in-one system about the size of a dorm room mini-fridge called the CS-1—against a supercomputer in a fluid dynamics simulation. Simulating the movement of fluids is a common supercomputer application useful for solving complex problems like weather forecasting and airplane wing design. The trial was described in a preprint paper written by a team led by Cerebras’s Michael James and NETL’s Dirk Van Essendelft and presented at the supercomputing conference SC20 this week. The team said the CS-1 completed a simulation of combustion in a power plant roughly 200 times faster than it took the Joule 2.0 supercomputer to do a similar task. The CS-1 was actually faster-than-real-time. As Cerebrus wrote in a blog post, “It can tell you what is going to happen in the future faster than the laws of physics produce the same result.” The researchers said the CS-1’s performance couldn’t be matched by any number of CPUs and GPUs. And CEO and cofounder Andrew Feldman told VentureBeat that would be true “no matter how large the supercomputer is.” At a point, scaling a supercomputer like Joule no longer produces better results in this kind of problem. That’s why Joule’s simulation speed peaked at 16,384 cores, a fraction of its total 86,400 cores. A comparison of the two machines drives the point home. Joule is the 81st fastest supercomputer in the world, takes up dozens of server racks, consumes up to 450 kilowatts of power, and required tens of millions of dollars to build. The CS-1, by comparison, fits in a third of a server rack, consumes 20 kilowatts of power, and sells for a few million dollars. While the task is niche (but useful) and the problem well-suited to the CS-1, it’s still a pretty stunning result. So how’d they pull it off? It’s all in the design. Cut the Commute Computer chips begin life on a big piece of silicon called a wafer. Multiple chips are etched onto the same wafer and then the wafer is cut into individual chips. While the WSE is also etched onto a silicon wafer, the wafer is left intact as a single, operating unit. This wafer-scale chip contains almost 400,000 processing cores. Each core is connected to its own dedicated memory and its four neighboring cores. Putting that many cores on a single chip and giving them their own memory is why the WSE is bigger; it’s also why, in this case, it’s better. Most large-scale computing tasks depend on massively parallel proces...

575 episodes