
These AI Models Are Pretty Mid. That’s Why Companies Love Them.

Companies are looking for simpler and cheaper ways to deploy artificial intelligence

Companies are increasingly deploying smaller and midsize generative artificial intelligence models, favoring the scaled-down, cost-efficient technology over the large, flashy models that made waves in the AI boom’s early days.

Unlike foundation models such as OpenAI’s GPT-4, which cost more than $100 million to develop and uses more than one trillion parameters (a measure of its size), smaller models are trained on less data and are often designed for specific tasks.

Nearly all model providers, including Microsoft and Google as well as startups like Mistral, Anthropic and Cohere, are moving to offer more of these types of models.

Chief information officers say that for some of their most common AI use cases, which often involve narrow, repetitive tasks like classifying documents, smaller and midsize models simply make more sense. And because they use less computing power, smaller models can cost less to run.

The shift comes as companies slowly move to deploy more AI use cases, while they are also under pressure to manage costs and returns on the pricey technology.

“A giant LLM [large language model] that’s been trained on the entire World Wide Web can be massive overkill,” said Robert Blumofe, chief technology officer at cybersecurity, content delivery and cloud computing company Akamai. For enterprise use cases, he said, “You don’t need an AI model that knows the entire cast of ‘The Godfather,’ knows every movie that’s ever been made, knows every TV show that’s ever been made.” 


Oliver Parker, vice president of global generative AI go-to-market at Google Cloud, said he has seen enterprises shifting to midsize models over the past three months, in part because those models meet the requirements of a much wider range of enterprise use cases.

Nonbank mortgage servicer and originator Mr. Cooper is testing the capabilities of midsize models in its call center to analyze voice data to help agents understand where conversations are likely going and what customers are likely to ask, said Chief Information Officer Sridhar Sharma.

“We don’t need to overengineer something just because it’s bigger,” Sharma said, adding Mr. Cooper is also using large foundation models for more complex use cases. 

TD Bank so far has been using OpenAI’s GPT family of models, including GPT-4, and others to help call-center workers answer customer inquiries faster. But the bank recently signed a partnership with AI model provider Cohere and will be looking at whether Cohere’s smaller or midsize models are more effective and cost-efficient for that and other use cases, said Maksims Volkovs, TD’s chief AI scientist.

Volkovs said he would evaluate Cohere’s models alongside OpenAI’s offerings on cost, accuracy and latency, and that he anticipates midsize models will win out in some scenarios.

“The trade-off between accuracy and cost should be more favorable,” he said.

A year ago, enterprises gravitated toward a handful of large models, said Stephan Pretorius, CTO of marketing services company WPP. That was fine when companies used them in limited pilot capacities, he said, but now as they scale up, costs for the large models can quickly get out of control. WPP is using several models from Google’s Gemini family, including its midsize model, Flash. 

Flash is suited for tasks like analyzing shopping habits in different countries and using the findings to write relevant web copy for given products. For example, copy for mascara sold in the United Kingdom might do best highlighting its waterproof qualities. Writing copy like that is where a midsize model can shine, Pretorius said.

Large models still have relevance and value for complex use cases requiring a lot of data, creativity and interpretation, Pretorius said. For example, a large model would be suited for ingesting all of Shakespeare’s works and analyzing female characters versus male characters over time, he said. But that isn’t on WPP’s to-do list.

Source: wsj.com
