OpenAI Just Unveiled Its Next-Generation AI That Can Turn Images into Text

It's here, everyone. GPT-4 is here.

Well, actually, it's been here for a little while, as Microsoft's OpenAI-powered Bing AI has been using the next-gen tech this whole time.

Now, though, OpenAI has made GPT-4 itself available for broader public use, at a price. The large language model (LLM) will only be available to users who upgrade to ChatGPT Plus for $20 a month.

"GPT-4 is OpenAI's most advanced system, producing safer and more useful responses," reads an OpenAI blog post.

According to the company, its new-and-improved LLM contains several notable updates over its previous iteration, GPT-3.5, and is more accurate thanks to the even larger amount of training material it's been fed.

It's an absolutely badass test-taker, the company claims, utterly crushing pretty much every standardized test out there.

It also reportedly shines at copy editing and can come up with high-quality summaries, comparisons, and breakdowns of written material — an ability that seems to have impressed experts.

"To do a high-quality summary and a high-quality comparison, it has to have a level of understanding of a text and an ability to articulate that understanding," Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence, told the New York Times. "That is an advanced form of intelligence."

It's also multimodal, meaning that users can bolster text prompts with image inputs. For example, if you upload a photo of a few kitchen ingredients and ask what you might be able to bake with them, it'll serve you up some recipes to try.

In other words, it can "see" — or make sense of images you feed it.

OpenAI further claims that the tech "surpasses ChatGPT in its advanced reasoning capabilities" — an area where GPT and other LLMs really struggle — with OpenAI CEO Sam Altman telling the NYT that the bot could reason "a little bit."

According to the report, though, GPT-4's reasoning skills still break down often, and the bot remains far from human-level analytical reasoning.

The company says that there have also been some much-needed safety improvements.

"Following the research path from GPT, GPT-2, and GPT-3, our deep learning approach leverages more data and more computation to create increasingly sophisticated and capable language models," reads the blog, claiming that after spending "six months" working to make GPT-4 "safer and more aligned," the new model is "82 percent less likely to respond to requests for disallowed content and 40 percent more likely to produce factual responses than GPT-3.5 on our internal evaluations."

So, in short, the new model is markedly better at defending itself against prompt injection attacks and jailbreaking attempts, and it also hallucinates (that is, makes facts up) a lot less.

But while it might be better at both, it's not perfect at either.

As the NYT found, GPT-4 still has a tendency to hallucinate, despite OpenAI's best efforts — making it less than ideal for doing research on the internet.

All in all, while GPT-4 represents a marked improvement over previous models, it's still only a tiny iterative step towards a future where the lines between human and machine start to blur.

READ MORE: 10 Ways GPT-4 Is Impressive but Still Flawed [The New York Times]
