AI developer Hugging Face says it's created an open-source AI research agent that can trade blows with OpenAI's latest Deep Research feature — in just 24 hours.
The Sam Altman-led OpenAI released its Deep Research agent, which "uses reasoning to synthesize large amounts of online information and complete multi-step research tasks for you," over the weekend.
Simply put, Deep Research — the company technically doesn't capitalize the name, but we're just going to go ahead and do so because it looks bizarre not to — sits on top of an existing AI model to provide new functionality to the user. In practice, you can ask it to do things like generate a "competitive analysis on streaming platforms or a personalized report on the best commuter bike," according to OpenAI, which could take "anywhere from five to 30 minutes."
But it didn't take long for Hugging Face researchers to come up with a worthy alternative.
"While powerful LLMs are now freely available in open-source, OpenAI didn’t disclose much about the agentic framework underlying Deep Research," Hugging Face wrote in a Tuesday announcement. "So we decided to embark on a 24-hour mission to reproduce their results and open-source the needed framework along the way!"
The company built an "agent" framework that writes its actions as executable code rather than the JSON-style tool calls used by conventional agents, an approach it says immediately led to a major bump in performance.
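For a rough sense of what that looks like in practice, here's a minimal sketch using Hugging Face's open-source smolagents library, which the company says underpins Open Deep Research. The query is an arbitrary illustrative example, and the real pipeline layers on far more tools and browsing steps than this.

```python
# Minimal sketch of a code-writing agent with Hugging Face's smolagents library.
# The agent plans each step as a short Python snippet, runs it, and feeds the
# result back into the next step. The query below is purely illustrative.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # lets the generated code run web searches
    model=HfApiModel(),              # defaults to a hosted open-weight model
)

report = agent.run(
    "Compare the subscription prices of three major music streaming services "
    "and summarize the differences in a short table."
)
print(report)
```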
It's not quite perfect yet, it's worth pointing out. Hugging Face's Open Deep Research scored 55.15 percent accuracy on a benchmark called General AI Assistants, while OpenAI's version scored 67.36 percent, leaving some room for improvement. (OpenAI's version itself still has a lot of trouble distinguishing "information from rumors," greatly undercutting its current usefulness as a research analyst.)
But considering that Hugging Face, which has far fewer resources to work with than OpenAI, created its agent in a mere 24 hours, the feat highlights just how replaceable OpenAI's AI tools have become. Every time it drops a hot new AI, there now seems to be a race to duplicate its capabilities with a fraction of the gigantically funded company's resources.
While Hugging Face's Aymeric Roucher, who led the research, told Ars Technica that the agent "worked well" with OpenAI's o1, he added that Hugging Face's open-source model called open-R1 may soon work even "better."
The exchangeability of AI models is an especially pertinent topic, considering the emergence of Chinese AI startup DeepSeek, which upended the entire tech sector with its extremely lean and efficient model called R1 last month. (Hugging Face's open-R1 is an open-source version of DeepSeek's model.)
DeepSeek also likely flexed the power of distillation, the strategy of creating "reasoning" capabilities by training an AI model on the output of another one. Whether that amounts to an intellectual property violation, as OpenAI has since accused DeepSeek of committing, remains to be seen, especially considering that OpenAI's own AI was built by indiscriminately ripping off protected content on the internet.
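In code terms, distillation boils down to ordinary supervised fine-tuning, just with a bigger model supplying the training answers. The sketch below is purely illustrative: the model names and prompt are placeholders, not anything DeepSeek, OpenAI, or the researchers mentioned here actually used.

```python
# Illustrative sketch of distillation: fine-tune a small "student" model on
# answers generated by a larger "teacher" model. Model names are hypothetical
# placeholders, not real checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "large-teacher-model"   # placeholder for a strong "reasoning" model
student_name = "small-student-model"   # placeholder for a much smaller base model

teacher_tok = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForCausalLM.from_pretrained(teacher_name)

prompts = ["If a train covers 60 km in 45 minutes, what is its average speed in km/h?"]

# Step 1: have the teacher generate worked answers; this becomes the training data.
dataset = []
for prompt in prompts:
    inputs = teacher_tok(prompt, return_tensors="pt")
    output_ids = teacher.generate(**inputs, max_new_tokens=256)
    answer = teacher_tok.decode(output_ids[0], skip_special_tokens=True)
    dataset.append({"prompt": prompt, "completion": answer})

# Step 2: fine-tune the student on the teacher's prompt/answer pairs with
# ordinary supervised training (cross-entropy over the combined text).
student_tok = AutoTokenizer.from_pretrained(student_name)
student = AutoModelForCausalLM.from_pretrained(student_name)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

student.train()
for example in dataset:
    text = example["prompt"] + "\n" + example["completion"]
    batch = student_tok(text, return_tensors="pt")
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```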
But it's a clever workaround that could give AI industry stalwarts like OpenAI a run for their money. Case in point: researchers at Stanford and the University of Washington developed a worthy rival to OpenAI's o1 "reasoning" model for less than $50 worth of cloud compute credits, as detailed in a yet-to-be-peer-reviewed paper that was first spotted by TechCrunch.
That new model, dubbed s1, performed on par with DeepSeek's R1 and OpenAI's o1 on math and coding tests. It was distilled using the output of Google's (mostly) free-to-use Gemini 2.0 Flash Thinking Experimental reasoning model.
The team trained s1 on a dataset of just 1,000 curated questions paired with answers from Google's AI. The training took less than 30 minutes on a mere 16 Nvidia AI chips, as TechCrunch reports, and was enough to achieve strong performance on AI benchmarks.
Meanwhile, the industry's biggest players like OpenAI and Meta are planning to pour hundreds of billions of dollars into initiatives to expand AI infrastructure in the US, enormous investments that were thrown into question by the emergence of alternatives like DeepSeek that are far cheaper to train and run.
Whether these tools can ever even turn a profit, let alone stop hallucinating facts every step of the way, remains as uncertain as ever — especially when the little guys can quickly clone their best work and offer it for free.
More on OpenAI: OpenAI Shows Off AI "Researcher" That Compiles Detailed Reports, Struggles to Differentiate "Information From Rumors"