"We are not here to jerk ourselves off about parameter count."
Big News
Large language models (LLMs) like OpenAI's are getting bigger and better with each new iteration.
Last month, the company unveiled its long-awaited GPT-4, a beefy and substantially larger upgrade to the LLM underlying its chatbot. The model proved so impressive that it quickly inspired a large group of experts and tech CEOs, including Elon Musk, to sign an open letter calling for a moratorium on training AI systems more powerful than OpenAI's latest model.
With results like that, you'd think OpenAI would want to double down and push out even larger models. But its CEO Sam Altman is now cautioning that the age of simply scaling up AI models to make them more powerful may already be over. From here, the approach will have to be decidedly less size-focused.
"I think we're at the end of the era where it's going to be these, like, giant, giant models," Altman said at an MIT event last week, as quoted by Wired. "We'll make them better in other ways."
Diminishing Returns
Generally speaking, when it comes to AI, and LLMs in particular, bigger has been better. OpenAI's first landmark model, GPT-2, released in 2019, boasted around 1.5 billion parameters: the adjustable weights connecting the neurons of a neural network, which the model tunes to "learn" and refine itself based on its training data.
By the time GPT-3 rolled out the next year, it packed a whopping 175 billion parameters, and GPT-4 is said to weigh in at around one trillion, according to some outside estimates. But crucially, as Wired notes, OpenAI itself hasn't shared GPT-4's exact size, which is perhaps emblematic of the company's pivot away from simply enlarging its models.
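For a concrete sense of what those headline figures count, here's a minimal sketch, not OpenAI's code, that builds a small transformer-style model in PyTorch and tallies its trainable parameters the same way the billion- and trillion-parameter numbers above are tallied. The sizes used are toy values, since GPT-4's actual architecture is undisclosed.

```python
# Minimal illustration (not OpenAI's code): what a "parameter" is and how
# the headline counts are tallied. The sizes below are toy values, not GPT-4's.
import torch.nn as nn

d_model, n_heads, n_layers = 512, 8, 6  # illustrative sizes only

# A stack of standard transformer encoder layers stands in for an LLM.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads),
    num_layers=n_layers,
)

# The "parameters" are these tensors of adjustable weights; counting their
# elements gives the kind of figure quoted for GPT-2, GPT-3, and GPT-4.
param_count = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{param_count:,} trainable parameters")  # roughly 19 million here
```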
While each of these jumps in parameter count has been met with an uptick in the GPT models' capabilities, the approach may now be yielding diminishing returns, according to the findings of OpenAI's own technical report. You can't just keep adding cylinders to a car engine and expect it to keep getting proportionally more powerful.
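To make "diminishing returns" concrete: OpenAI's own scaling-law research (Kaplan et al., 2020) found that test loss falls off roughly as a power law in parameter count, so each additional order of magnitude of size buys a smaller absolute improvement than the last. The sketch below plugs rough, illustrative constants into that power-law form; it's a demonstration of the curve's shape, not a reproduction of the GPT-4 technical report's numbers.

```python
# Illustration of diminishing returns from scale, using the power-law form
# L(N) = (Nc / N) ** alpha from OpenAI's scaling-law work (Kaplan et al., 2020).
# The constants below are ballpark/illustrative, not fitted to GPT-4.
Nc, alpha = 8.8e13, 0.076

def predicted_loss(n_params: float) -> float:
    """Roughly predicted test loss for a model with n_params parameters."""
    return (Nc / n_params) ** alpha

# GPT-2-scale, GPT-3-scale, and (rumored) GPT-4-scale parameter counts.
for n in (1.5e9, 175e9, 1e12):
    print(f"{n:.2g} params -> predicted loss {predicted_loss(n):.2f}")

# Prints roughly 2.30, 1.60, 1.41: a ~100x jump in size bought a ~0.7 drop in
# loss, while the next ~6x jump buys only ~0.2; the returns keep shrinking.
```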
Still Heading Up
It's worth mentioning that Altman conceded parameter counts may still trend upward regardless (diminishing returns, after all, are still returns), but maintained that the metric gets "way too much focus."
"This reminds me a lot of the gigahertz race in chips in the 1990s and 2000s, where everybody was trying to point to a big number," Altman said, as quoted by TechCrunch.
"What we want to deliver to the world is the most capable and useful and safe models," he added. "We are not here to jerk ourselves off about parameter count."
More on AI: Google Surprised When Experimental AI Learns Language It Was Never Trained On