Tax season, that dreaded time of year, is upon us. But if you were hoping that newfangled AI tech could help you file the laborious paperwork — and perhaps find a way of saving you a few bucks — think again.
After testing four leading AI chatbots, the New York Times found that all of them struggled to pick and fill out the correct forms, fumbling key calculations. In all, the bots miscalculated the tax owed to the IRS by an average of more than $2,000.
“The problem with taxes is all those very small little details matter, and it’s not going to get every single little detail right,” Benedict Evans, an analyst who writes a technology newsletter, told the NYT.
“These models get dramatically better over the course of every six months,” he continued. “But they still give you what is roughly the right answer, and that’s not what you want.”
AI can be useful for processing and summarizing large amounts of information, but it struggles with precision in virtually every domain. Chatbots will often fabricate factual claims, even when asked to summarize a single document. AI programming assistants will slip errors into their code. Image generators produce strange visual artifacts and inconsistencies.
The conundrum is the same with arithmetic. Pair that with byzantine tax laws and their corresponding, highly specific forms, and you have a recipe for, if not disaster, then a taxing and expensive back-and-forth with the IRS.
To test the AI models — OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini, and xAI’s Grok — the NYT had them attempt to solve a series of tax scenarios described in training materials by the tax service TaxSlayer. Only after supplying the models with highly specific instructions, like where each piece of information should go in each IRS document, did the AIs begin to fare better.
That, you might argue, defeats the point of using an automated tool in the first place. Your average Joe uses overpriced tax software precisely because they don't know the nitty-gritty of the process. Software like TurboTax or TaxAct "is procedural, following 'if-then' logic built for mathematical precision," Erik Brynjolfsson, a senior fellow at the Stanford Institute for Human-Centered AI, explained to the NYT, whereas large language models are prediction engines that "can be superhuman at many tasks yet fail at some that seem simpler to humans."
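To see what Brynjolfsson means by "if-then" logic, consider a minimal sketch of how deterministic tax software computes a progressive tax bill. The bracket thresholds and rates below are invented for illustration, not actual IRS figures, and this is not TurboTax's real code; the point is that the same input always yields the same answer, with no token-by-token prediction involved.

```python
# Illustrative sketch of the procedural "if-then" approach: walk each
# bracket in order, taxing only the slice of income that falls inside it.
# All numbers here are hypothetical, chosen only for demonstration.

def tax_owed(taxable_income: float) -> float:
    """Deterministically compute tax across hypothetical brackets."""
    brackets = [            # (upper bound of bracket, marginal rate)
        (10_000, 0.10),
        (40_000, 0.12),
        (90_000, 0.22),
        (float("inf"), 0.24),
    ]
    tax, lower = 0.0, 0.0
    for upper, rate in brackets:
        if taxable_income <= lower:   # no income reaches this bracket
            break
        # tax only the portion of income inside this bracket
        tax += (min(taxable_income, upper) - lower) * rate
        lower = upper
    return round(tax, 2)
```

Run twice, on any machine, the function returns the identical figure; an LLM asked the same question may not, which is exactly the precision gap the NYT's test exposed.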
A prime example of how hallucinating LLMs can cock up your tax homework? TurboTax's own experiments with the tech. When the tax software company deployed its "Intuit Assist" chatbot to answer tax questions, it spat out irrelevant answers. When the answers were on topic, they were often wrong.
More on AI: Grammarly Offering Manuscript Reviews by AI Versions of Recently Deceased Professors