Wafer-scale technology is making waves once more, this time promising to let artificial intelligence (AI) models with trillions of parameters run faster and more efficiently than conventional GPU-based systems. Engineers at the University of California, Riverside (UCR) report that a chip the size of a frisbee can move massive amounts of data without overheating or consuming excessive electricity.
They call these large chips wafer-scale accelerators, which Cerebras manufactures on dinner plate-sized silicon wafers. These wafer-scale processors can deliver far more computing power with much better energy efficiency, traits that are essential as AI models continue to grow larger and more demanding.
The dinner plate-sized silicon wafers stand in stark contrast to postage stamp-sized GPUs, which are now considered essential in AI designs because they can perform multiple computational tasks, such as processing images, language, and data streams, in parallel.
However, as AI model complexity increases, even high-end GPUs are starting to hit performance and energy limits, says Mihri Ozkan, a professor of electrical and computer engineering in UCR's Bourns College of Engineering and the lead author of the paper published in the journal Device.
Figure 1 The Wafer-Scale Engine 3 (WSE-3), manufactured by Cerebras, avoids the delays and power losses associated with chip-to-chip communication. Source: The University of California, Riverside
“AI computing isn’t just about speed anymore,” Ozkan added. “It’s about designing systems that can move massive amounts of data without overheating or consuming excessive electricity.” He compared GPUs to busy highways, which are effective, but traffic jams waste energy. “Wafer-scale engines are more like monorails: direct, efficient, and less polluting.”
The Cerebras Wafer-Scale Engine 3 (WSE-3), examined in the UCR paper, contains 4 trillion transistors and 900,000 AI-specific cores on a single wafer. Moreover, as Cerebras reports, inference workloads on the WSE-3 system use one-sixth the power of equivalent GPU-based cloud setups.
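To put the one-sixth power figure in perspective, a quick back-of-envelope calculation shows what it would mean over a year of continuous inference. The baseline cluster draw below is an illustrative assumption, not vendor data; only the one-sixth ratio comes from the cited Cerebras figure.

```python
# Back-of-envelope energy comparison based on the one-sixth power figure
# cited by Cerebras. The 60 kW baseline is an assumed, illustrative value
# for a GPU-based inference cluster, not a published specification.

def annual_energy_kwh(power_w: float, hours: float = 24 * 365) -> float:
    """Energy in kWh for a load running continuously at power_w watts."""
    return power_w * hours / 1000.0

gpu_cluster_w = 60_000                 # assumed GPU-cluster draw (illustrative)
wafer_scale_w = gpu_cluster_w / 6      # one-sixth, per the cited figure

saved = annual_energy_kwh(gpu_cluster_w) - annual_energy_kwh(wafer_scale_w)
print(f"Annual savings: {saved:,.0f} kWh")  # prints "Annual savings: 438,000 kWh"
```

Under these assumptions, the wafer-scale system would save roughly 438 MWh per year, which is why the efficiency claim matters at data-center scale.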
Then there is Tesla’s Dojo D1, another wafer-scale accelerator, which contains 1.25 trillion transistors and nearly 9,000 cores per module. These wafer-scale chips are engineered to eliminate the performance bottlenecks that occur when data travels between multiple smaller chips.
Figure 2 The Dojo D1 chip, launched in 2021, aims to enhance full self-driving and autopilot systems. Source: Tesla
However, as UCR’s Ozkan acknowledges, heat remains a challenge. With thermal design power reaching 10,000 watts, wafer-scale chips require advanced cooling. Here, Cerebras uses a glycol-based loop built into the chip package, while Tesla employs a coolant system that distributes liquid evenly across the chip surface.
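The scale of that cooling problem can be sketched with the standard heat-balance relation Q = ṁ·c_p·ΔT. The coolant properties and allowed temperature rise below are illustrative assumptions (typical of a 50/50 glycol-water mix), not figures from Cerebras or Tesla.

```python
# Rough sizing of a liquid-cooling loop for a 10,000 W wafer-scale package,
# using Q = m_dot * c_p * dT. Coolant properties and the 10 K temperature
# rise are illustrative assumptions, not vendor data.

def coolant_flow_lpm(heat_w: float, delta_t_k: float,
                     cp_j_per_kg_k: float = 3600.0,   # ~50/50 glycol-water
                     density_kg_per_l: float = 1.05) -> float:
    """Volumetric flow (liters/minute) needed to absorb heat_w
    with a coolant temperature rise of delta_t_k kelvin."""
    mass_flow_kg_s = heat_w / (cp_j_per_kg_k * delta_t_k)
    return mass_flow_kg_s / density_kg_per_l * 60.0

print(f"{coolant_flow_lpm(10_000, 10.0):.1f} L/min")  # prints "15.9 L/min"
```

Roughly 16 liters per minute for a single package, under these assumptions, illustrates why both vendors integrate the liquid loop directly into the chip packaging rather than relying on air cooling.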
Related Content
- Wafer Scale Rising
- Startup Spins Whole Wafer for AI
- Powering and Cooling a Wafer Scale Die
- Cerebras’ Third-Gen Wafer-Scale Chip Doubles Performance
- Cerebras Wafer-Scale Chip will Power Scottish Supercomputer
The post Wafer-scale chip claims to offer GPU alternative for AI models appeared first on EDN.