The World’s Largest Computer Chip
Deep learning, the artificial-intelligence technology that powers voice assistants, autonomous cars, and Go champions, relies on complicated “neural network” software arranged in layers. A deep-learning system can live on a single computer, but the biggest ones are spread over thousands of machines wired together into “clusters,” which sometimes live at large data centers, like those operated by Google. In a big cluster, as many as forty-eight pizza-box-size servers slide into a rack as tall as a person; these racks stand in rows, filling buildings the size of warehouses. The neural networks in such systems can tackle daunting problems, but they also face clear challenges. A network spread across a cluster is like a brain that’s been scattered around a room and wired together. Electrons move fast, but, even so, cross-chip communication is slow, and uses extravagant amounts of energy.
Eric Vishria, a general partner at Benchmark, a venture-capital firm in San Francisco, first came to understand this problem in the spring of 2016, while listening to a presentation from a new computer-chip company called Cerebras Systems. Benchmark is known for having made early investments in companies such as Twitter, Uber, and eBay—that is, in software, not hardware. The firm looks at about two hundred startup pitches a year, and invests in maybe one. “We’re in this kissing-a-thousand-frogs kind of game,” Vishria told me. As the presentation started, he had already decided to toss the frog back. “I’m, like, Why did I agree to this? We’re not gonna do a hardware investment,” he recalled thinking. “This is so dumb.”
Andrew Feldman, Cerebras’s co-founder, began his slide deck with a cover slide, then a team slide, catching Vishria’s attention: the talent was impressive. Then Feldman compared two kinds of computer chips. First, he looked at graphics-processing units, or G.P.U.s—chips designed for creating 3-D images. For a variety of reasons, today’s machine-learning systems depend on these graphics chips. Next, he looked at central processing units, or C.P.U.s—the general-purpose chips that do most of the work on a typical computer. “Slide 3 was something along the lines of, ‘G.P.U.s actually suck for deep learning—they just happen to be a hundred times better than C.P.U.s,’ ” Vishria recalled. “And, as soon as he said it, I was, like, facepalm. Of course! Of course!” Cerebras was proposing a new kind of chip—one built not for graphics but for A.I. specifically.