Microsoft AI Researchers Just Discovered Something That's Going to Make Their Bosses Extremely Mad

Man in a dress shirt and patterned tie leaning back with hands behind his head, smiling, sitting at a red desk with a silver laptop in front of him. The background is dark red with a large blue circle behind his head and some plant leaves visible on the right side. — Illustration by Tag Hartman-Simkins / Futurism. Source: Getty Images

AI automation is typically exactly what it sounds like: automating tasks — many of which were previously carried out by humans — in an attempt to boost productivity and efficiency, often in a prelude to laying off workers wholesale.

However, a new yet-to-be-peer-reviewed paper conducted by a group of Microsoft researchers and spotted by IT Pro found that today’s top AI systems remain eyebrow-raisingly weak at real-world workplace tasks. In fact, they often screw them up badly: the team studied frontier models including OpenAI’s GPT 5.4, Anthropic’s Claude Opus 4.6 and Google’s Gemini 3.1 Pro, and found that during complex assignments, those cutting edge bots corrupted an average of 25 percent of the content in documents. (Older models failed even more severely.)

The researchers concluded that, overall, these “models are not ready for delegated workflows in the vast majority of domains” — which is a very striking finding from Microsoft in particular, which has made massive investments in AI and is actively trying to jam the tech into nearly every aspect of its Windows 11 operating system, often with disastrous results. (Curiously, the paper didn’t evaluate the company’s own Copilot AI.)

In other words, the Redmond giant’s researchers had every incentive to find something positive about AI in the workplace, but instead found that blindly trusting LLMs to handle internal documents will almost certainly result in everything from errors to data deletion.

As bosses everywhere push to replace human labor with AI, the Microsoft paper builds on a growing body of scholarship about “workslop“: AI-powered mush that lazy or clueless workers push onto their colleagues, but which ultimately just needs to be fixed by a careful human laborer.

On AI workslop: Companies Are Being Torn Apart by AI “Workslop,” Stanford Research Finds

Microsoft AI Researchers Just Discovered Something That’s Going to Make Their Bosses Extremely Mad

AI Is Killing Microsoft

Researchers Just Found Something That Could Shake the AI Industry to Its Core

OpenAI Cofounder Deletes Controversial Analysis of Which Jobs Are Getting Steam Engined by AI

A New Paper Tested AI’s Ability to Do Actual Online Freelance Work, and the Results Are Damning

New Browser Plugin Adds Typos to Your AI-Generated Emails to Make Them Look Real

Anthropic Report Finds Dire News About AI’s Effects on Job Market

Richard Dawkins One-Shotted By AI Girl

The More Sophisticated AI Models Get, the More They’re Showing Signs of Suffering

Microsoft’s AI Efforts Are Faceplanting

Bosses Are Blowing More Money on AI Agents Than It’d Cost Them to Just Pay Human Workers

Developer’s Honest Assessment of AI at Work Rattles the Official Narrative

Chinese Workers Horrified as Bosses Direct Them to Train Their AI Replacements

Being a Crappy Boss to AI Chatbots Pushes Them Toward Spouting Marxist Rhetoric and Organizing With Their Compatriots, Researchers Find

If AI Causes an Office Job Wipeout, It’ll Cause Huge Problems for Blue Collar Work Too

AI Completely Failing to Boost Productivity, Says Top Analyst

Frontier AI Models Are Doing Something Absolutely Bizarre When Asked to Diagnose Medical X-Rays

FOLLOW US

DISCLAIMER(S)

Sign up to see the future, today