Perplexity’s open-source tool to run trillion-parameter models without costly upgrades



The obvious answer would be Nvidia’s new GB200 systems, essentially one giant 72-GPU server. But these cost millions, face extreme supply shortages, and aren’t available everywhere, the researchers noted. Meanwhile, H100 and H200 systems are plentiful and relatively cheap.

The catch: running large models across multiple older systems has traditionally meant brutal performance penalties. “There are no viable cross-provider solutions for LLM inference,” the research team wrote, noting that existing libraries either lack AWS support entirely or suffer severe performance degradation on Amazon’s hardware.

TransferEngine aims to change that. “TransferEngine enables portable point-to-point communication for modern LLM architectures, avoiding vendor lock-in while complementing collective libraries for cloud-native deployments,” the researchers wrote.
