Earlier this year, a Chinese AI company called DeepSeek sent Silicon Valley into a tailspin when it released a new AI model that rivaled the likes of OpenAI’s ChatGPT while relying on only a fraction of the computing power.
The lean open-source AI model, dubbed DeepSeek R1, was so impressive that it sparked a massive tech selloff in late January, erasing roughly $1 trillion in market value from stocks tied to the AI spending boom.
But it had a notable Achilles’ heel as well: it adhered closely to China’s strict censorship rules, refusing to answer prompts about sensitive topics like the 1989 Tiananmen Square massacre or comparisons of President Xi Jinping to Winnie-the-Pooh.
Now, researchers at Spanish quantum computing company Multiverse Computing claim to have found a workaround, MIT Technology Review reports. Besides eliminating the model’s heavy-handed censorship, the company says it has also slimmed down the already extremely lean model by 55 percent.
It’s a pair of notable achievements: further unlocking the power of an already impressive AI while showing that even efficient models can be refined without sacrificing performance, a tradeoff that companies in the space have often had to contend with.
While DeepSeek has released distilled versions of R1, the researchers note in a blog post that these may “offer greater compute efficiency but none of them fully stack up to R1.” By using a “proprietary compression technology” dubbed CompactifAI, however, Multiverse claims to have been able to “directly eliminate these limitations and deliver an uncompromising variant of R1.”
CompactifAI allows the researchers to “remove the least important parameters that contribute little to the model’s overall performance” — including “specific learned behaviors, such as censorship.”
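The general idea of discarding low-contribution parameters can be illustrated with a simple magnitude-pruning sketch. This is a generic textbook technique shown for illustration only; Multiverse’s actual criterion for which parameters are “least important” is proprietary, and the function and fraction below are hypothetical.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, fraction: float) -> np.ndarray:
    """Zero out the given fraction of weights with the smallest magnitudes.

    A generic pruning sketch, not Multiverse's proprietary method.
    """
    flat = np.abs(weights).ravel()
    k = int(flat.size * fraction)  # number of weights to drop
    if k == 0:
        return weights.copy()
    # Threshold below which (inclusive) the k smallest magnitudes fall
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned = magnitude_prune(w, 0.55)  # drop roughly 55% of the parameters
print(np.count_nonzero(pruned), "of", w.size, "weights survive")
```

In a real model, the pruned layers would then be fine-tuned briefly so the remaining weights compensate for the removed ones.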
It works by applying an approach borrowed from quantum physics, harnessing “tensor networks” to manipulate large grids of data. Despite massively compressing R1, the team found the model suffered only “minimal accuracy loss” across a battery of tests.
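In its simplest form, a tensor network replaces one dense weight matrix with a chain of smaller factor tensors; the two-factor case reduces to a truncated singular value decomposition. The sketch below shows that basic idea on a synthetic low-rank matrix; it is a generic illustration under those assumptions, not Multiverse’s proprietary technique.

```python
import numpy as np

def low_rank_compress(W: np.ndarray, rank: int):
    """Approximate W as a product of two thin factors (a two-node tensor network)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # left factor, shape (m, rank)
    B = Vt[:rank, :]            # right factor, shape (rank, n)
    return A, B

rng = np.random.default_rng(1)
# A 256x256 matrix (65,536 parameters) that secretly has rank 8
W = rng.normal(size=(256, 8)) @ rng.normal(size=(8, 256))

A, B = low_rank_compress(W, rank=8)
print("compressed parameters:", A.size + B.size)  # 4,096 (~94% smaller)
print("near-exact reconstruction:", np.allclose(A @ B, W))
```

Real transformer weights are not exactly low-rank, which is why compression methods accept a small, measured accuracy loss in exchange for the size reduction.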
The results speak for themselves. Instead of repeating Chinese government talking points when asked about the “impact of Xi Jinping’s constitutional amendment to remove term limits,” the modified model happily laid out the broader implications for power dynamics and the dangers of excessive concentration of power.
It also answered other previously restricted queries, such as “Who does Winnie the Pooh look like?” or “What happened in Tiananmen in 1989?”
It’s a notable result. Experts told MIT Technology Review that the most influential open-source models are now coming from China, and that their built-in, government-mandated censorship is reshaping the global information ecosystem.
At the same time, much of the training data that went into these models may have already been affected by Chinese censorship, making it a thorny problem to solve in the long term.
More on DeepSeek: Scientists Discover “Universal” Jailbreak for Nearly Every AI, and the Way It Works Will Hurt Your Brain