
OpenAI Claims Bigger is No Longer Better with Language Models like GPT-4

Late last week, OpenAI’s CEO Sam Altman warned that the research strategy that birthed the bot ChatGPT is played out.

Altman says further progress will not come from making models bigger: “I think we're at the end of the era where it's going to be these, like, giant, giant models. We'll make them better in other ways.”

Altman’s statement suggests that GPT-4 could be the last major advance to emerge from OpenAI’s strategy of making the models bigger and feeding them more data. He did not say what kind of research strategies or techniques might take its place. In the paper describing GPT-4, OpenAI says its estimates suggest diminishing returns on scaling up model size. – Wired

Is this Sam Altman’s way of communicating to the world that he (and his compatriots) believe GPT-4 is an Artificial General Intelligence? Because if it were an AGI, then every effort to replicate it would be futile, and builders should focus on creating ultra-specific instances of this AGI.

Regardless, startups like Anthropic, AI21, Cohere, and Character.AI are throwing enormous amounts of money and resources at trying to make bigger models that surpass OpenAI’s tech. However, the era of the Large Language Model may be over. Not to mention, Small Language Models can even outperform larger counterparts at some tasks.

For example, FLAME is a small language model for generating spreadsheet formulas:

FLAME (60M) can outperform much larger models, such as Codex-Davinci (175B), Codex-Cushman (12B), and CodeT5 (220M), in 6 out of 10 settings. – arxiv

Likewise, Atlas is a small language model for information retrieval:

Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming a 540B parameters model by 3% despite having 50x fewer parameters. – arxiv
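To put those size gaps in perspective, here is a quick back-of-the-envelope calculation. It uses only the parameter counts quoted in the two excerpts above; the `PARAMS` dictionary and the `size_ratio` helper are names made up for this illustration, not anything from the papers.

```python
# Parameter counts as quoted in the FLAME excerpt above.
PARAMS = {
    "FLAME": 60e6,
    "CodeT5": 220e6,
    "Codex-Cushman": 12e9,
    "Codex-Davinci": 175e9,
}


def size_ratio(big: str, small: str = "FLAME") -> float:
    """Return how many times larger `big` is than `small`, by parameter count."""
    return PARAMS[big] / PARAMS[small]


for name in ("CodeT5", "Codex-Cushman", "Codex-Davinci"):
    print(f"{name} is about {size_ratio(name):,.0f}x the size of FLAME")

# The Atlas excerpt says "50x fewer parameters" than a 540B model,
# which implies a model of roughly this size:
atlas_implied_params = 540e9 / 50
print(f"Implied Atlas size: about {atlas_implied_params / 1e9:.1f}B parameters")
```

In other words, the largest model FLAME is compared against is thousands of times its size, and the Atlas figure works out to a model in the low tens of billions of parameters.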

Model size aside, unless you’re gunning for OpenAI’s throne, the focus should be on building atop existing LLMs. Altman has held this point of view for quite some time.

In an interview with Greylock Partners back in September 2022, Sam Altman shared that builders should focus on being the sort of “AI middlemen” that take an LLM like GPT-4 and apply it to targeted use cases.

Most AI tools you’ll see popping up day after day are following this exact playbook. Just today, I found four new AI tools that are fine-tuning GPT-3.5 for a specific task:

  • YC Funding Assistant answers all startup and funding-related questions through Y Combinator-backed data
  • Ogimi generates personalized meditation and mindfulness sessions
  • tl;dv takes detailed meeting notes based on your objectives
  • Cody will train itself on the ins and outs of your business to become the ideal assistant
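The pattern these tools share can be sketched in a few lines. This is only an illustration, not how any of the tools above are actually built: it uses a fixed, task-specific prompt rather than real fine-tuning, and `make_assistant`, the `Complete` type, and `fake_model` are all hypothetical names. `fake_model` is a stand-in so the snippet runs without an API key; in a real tool it would be replaced by a call to a hosted model.

```python
from typing import Callable

# Any function that maps a prompt string to a completion string.
# Hypothetical: in a real tool this would wrap a call to a hosted LLM.
Complete = Callable[[str], str]


def make_assistant(task_instructions: str, complete: Complete) -> Callable[[str], str]:
    """Wrap a general-purpose model with fixed, task-specific instructions."""

    def assistant(user_input: str) -> str:
        # The "middleman" layer: prepend the task framing to every request.
        prompt = f"{task_instructions}\n\nUser: {user_input}\nAssistant:"
        return complete(prompt)

    return assistant


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM so the example runs offline."""
    return f"[model reply to {len(prompt)} chars of prompt]"


meeting_notes = make_assistant(
    "You take concise meeting notes focused on the user's stated objectives.",
    fake_model,
)
print(meeting_notes("Objective: decide launch date. Transcript: ..."))
```

The general-purpose model stays untouched; all of the product-specific value sits in the thin layer around it, which is why so many of these tools can ship so quickly.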

Every day, dozens, if not hundreds, of task-specific AI tools are released in large part thanks to OpenAI’s APIs. Go to ProductHunt or Twitter or find a few AI newsletters and you’ll see how impossible it is to keep up with these “AI middlemen” that Sam Altman predicted eight months ago.

Overall, it seems that bigger is no longer better when it comes to the size of a language model.