
As AI Matures, Chip Industry Will Look Beyond GPUs, AMD Chief Says

Future computer chips may be able to help ease the alarming energy demands of generative artificial intelligence, but chip makers say they need something from AI first: a slowdown in the sizzling pace of change.

Graphics processing units have so far handled the bulk of training and running large-scale AI models. The chips, originally built for gaming graphics, offer a unique blend of high performance and the flexibility and programmability required to keep up with today’s constantly shifting swirl of AI models.
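
That dominance reflects the shape of the workload. The heart of a large language model is an enormous batch of identical multiply-accumulate operations that can run side by side, which is what a GPU’s thousands of cores are built for. A minimal sketch in Python with NumPy (the shapes are illustrative only; production workloads run through GPU libraries, not NumPy):

import numpy as np

rng = np.random.default_rng(0)
# Activations for 8 sequences of 128 tokens with 1,024 hidden units each,
# multiplied by one weight matrix -- the core operation of a transformer layer.
x = rng.standard_normal((8, 128, 1024), dtype=np.float32)
w = rng.standard_normal((1024, 1024), dtype=np.float32)

# Each of the 8 * 128 rows is independent, so hardware with thousands of
# parallel cores (a GPU) can compute them all at once.
y = x @ w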

Nvidia’s dominance in the GPU market has propelled it to a trillion-dollar valuation, but others, including Advanced Micro Devices, also make the chips.

As the industry coalesces around more standardized model designs, however, there will be an opportunity to build more custom chips that don’t require as much programmability and flexibility, said Lisa Su, chief executive at AMD. That will make them more energy-efficient, smaller and cheaper. 

Lisa Su, chairwoman and CEO of Advanced Micro Devices. PHOTO: I-HWA CHENG/AGENCE FRANCE-PRESSE/GETTY IMAGES

“GPUs right now are the architecture of choice for large language models, because they’re very, very efficient for parallel processing, but they give you just a little bit of programmability,” Su said. “Do I believe that that’s going to be the architecture of choice in five-plus years? I think it will change.” 

What Su expects in five or seven years’ time isn’t a shift away from GPUs, but rather a broadening beyond GPUs. 

Nvidia and AMD haven’t been vocal about specific plans in this area. Nvidia declined to comment for this article.

Some custom chips are already hard at work handling aspects of AI.

Large cloud providers like Amazon.com and Google have developed their own custom AI chips for internal use, such as Amazon’s AWS Trainium and AWS Inferentia, and Google’s tensor processing units, or TPUs. These are built to execute only specific functions: Trainium can only train models, for example, while Inferentia can only run inference, the less computationally intensive stage in which a trained model processes new information and produces responses.
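
The difference between the two jobs is easy to see in code. In this deliberately tiny sketch (plain Python with NumPy and made-up data, not how production systems are written), training loops through forward passes, gradient computations and weight updates, while inference is a single forward pass with frozen weights:

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))                        # made-up inputs
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)   # made-up labels
w = np.zeros(3)

def forward(inputs, weights):
    return 1.0 / (1.0 + np.exp(-(inputs @ weights)))     # model predictions

# Training: hundreds of forward passes, each followed by a backward pass
# that computes gradients and updates the weights.
for _ in range(500):
    p = forward(X, w)
    grad = X.T @ (p - y) / len(y)
    w -= 0.1 * grad

# Inference: a single forward pass with the finished weights --
# far less work per input, which is why it can run on simpler chips.
new_input = rng.standard_normal((1, 3))
prediction = forward(new_input, w)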

Broadcom CEO Hock Tan said in an internal address this year that his company’s custom chip division, which mostly helped Google make AI chips, was bringing in over $1 billion in operating profit a quarter. 

Custom chips can be far more energy-efficient, cheaper and smaller because they can be hard-wired to varying degrees: to perform one specific function, run one specific type of model or even run one specific model, said Shane Rau, research vice president for computing semiconductors at market intelligence firm International Data Corp.
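
A software analogy, hypothetical and simplified, captures the trade-off Rau describes: a programmable chip is like a function that decides what to do on every call, while a hard-wired chip is like a function that only ever does one thing and skips the deciding:

import numpy as np

def programmable_unit(op, a, b):
    # Flexible like a GPU: chooses the operation at run time,
    # and pays a cost for that choice on every call.
    if op == "add":
        return a + b
    if op == "matmul":
        return a @ b
    raise ValueError("unsupported operation")

def fixed_function_unit(a, b):
    # "Hard-wired": only ever multiplies matrices. In silicon, giving up
    # the choice is what buys energy efficiency, smaller area and lower cost.
    return a @ b

a = np.ones((4, 4))
b = np.ones((4, 4))
assert np.allclose(programmable_unit("matmul", a, b), fixed_function_unit(a, b))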

But the market for commercially selling these super-custom, application-specific chips is still immature, Rau said, a symptom of how much innovation is happening in AI models. 

Highly customized chips also present a challenging lack of flexibility and interoperability, said Chirag Dekate, a vice president analyst at research firm Gartner. To the extent that they are programmable, they’re very difficult to program, typically requiring custom software stacks, and it can be difficult to make them work with other kinds of chips. 

Many chip offerings today exist on a continuum, however, with some GPUs that can be more customized and some specialized chips that provide a level of programmability. That gives chip makers an opportunity, even before generative AI becomes more standardized. It can also be a conundrum. 

“That’s a big thing we’ve struggled with here,” said Gavin Uberti, co-founder and CEO of Etched. The startup makes chips that run inference only on the transformer architecture, developed by Google in 2017, which has since become the standard for large language models. Even with that degree of specialization, the chips have to be flexible enough to adapt to smaller operations that vary from model to model.

“Right now, the models have plateaued enough that I think making a bet on the transformer makes sense, but I don’t think making a bet on say, Llama 3.1 405B makes sense yet,” Uberti said, referring to an AI model from Facebook owner Meta Platforms. “Transformers are going to stick around, but they’re going to keep getting bigger and evolving.” He added, “You do have to be careful to not overspecialize.” 
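
The “bet on the transformer” is a bet that one core computation will stay put even as models grow: scaled dot-product attention, published in Google’s 2017 paper. A reference sketch of that operation in NumPy follows (this is the published formula, not a description of Etched’s hardware, and the shapes are made up):

import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q @ K^T / sqrt(d)) @ V.
    # The structure is identical across transformer models; only the
    # sizes and surrounding details vary from model to model.
    d = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 64))   # 8 tokens, 64-dimensional heads (made up)
out = attention(q, q, q)           # self-attention over one sequence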

There is also no one-size-fits-all when it comes to computing, said Su, the AMD CEO. AI models in the future will use a combination of different types of chips, including today’s dominant GPUs and also more specialized chips still to be developed, for various functions. 

“There will be other architectures,” she said. “It’s just that it’ll depend on the evolution of the models.”
