"We are not here to jerk ourselves off about parameter count."
Large language models (LLMs) like OpenAI's are getting bigger and better with each new iteration.
Last month, the company unveiled its long-awaited GPT-4, a beefy and substantially larger upgrade to its chatbot's underlying LLM. The release was so impressive that it immediately inspired a massive group of experts and tech CEOs, including Elon Musk, to sign a letter calling for a moratorium on experimenting with AI more advanced than OpenAI's latest model.
With results like that, you'd think OpenAI would want to double down and push out even larger models than before. But its CEO, Sam Altman, is now cautioning that the age of simply scaling up AI models to make them more powerful may already be over. From here, the approach will have to be decidedly less size-focused.
Generally speaking, when it comes to AIs, and LLMs in particular, bigger has been better. OpenAI's first landmark model, GPT-2, released in 2019, boasted around 1.5 billion parameters: the adjustable values that weight the connections between an AI's artificial neurons, and that are tuned as it "learns" from its input data.
By the time GPT-3 rolled out the next year, it boasted a whopping 175 billion parameters, and by GPT-4, one trillion, according to some outside estimates. But crucially, as Wired notes, OpenAI itself has not shared GPT-4's exact size, which is perhaps emblematic of the company's pivot away from simply enlarging its models.
While each of these increases in parameters has been met with an uptick in the capabilities of the GPT models, this approach may now be yielding diminishing returns, according to the findings of OpenAI's own technical report. It's much like how you can't just keep adding cylinders to a car engine to make it more powerful.
Still Heading Up
It's worth mentioning that Altman conceded that parameter counts may trend up regardless — diminishing returns, after all, are still returns — but he maintains that the metric gets "way too much focus."
"This reminds me a lot of the gigahertz race in chips in the 1990s and 2000s, where everybody was trying to point to a big number," Altman said, as quoted by TechCrunch.
"What we want to deliver to the world is the most capable and useful and safe models," he added. "We are not here to jerk ourselves off about parameter count."