We’re racing towards a future in which devices will be able to read our thoughts.
You see signs of it everywhere, from brain-computer interfaces to algorithms that detect emotions from facial scans. And though the tech remains imperfect, it’s getting closer all the time: now a team of scientists says it has developed a model that can generate descriptions of what people’s brains are seeing simply by analyzing a scan of their brain activity.
They’re calling the technique “mind captioning,” and it may represent an effective way to transcribe what someone is thinking, with impressively comprehensive and accurate results.
“This is hard to do,” Alex Huth, coauthor of a new study in the journal Science Advances, and a computational neuroscientist at the University of California, Berkeley, told Nature. “It’s surprising you can get that much detail.”
The technology is a double-edged sword: on the one hand, it could give a voice to people who struggle to speak due to stroke, aphasia, or other medical conditions. On the other, it could threaten our mental privacy in an age when so many other facets of our lives are already surveilled and codified. But the team stresses that the model can’t decode your private thoughts. “Nobody has shown you can do that, yet,” Huth added.
The researchers’ new technique relies on several AI models. To train them, a deep language model first analyzed the text captions of more than 2,000 short-form videos, generating a unique “meaning signature” for each one. Then another AI tool was trained on the MRI brain scans of six participants, taken while they watched the same videos, learning to match the brain activity to those signatures.
Combined, the resulting brain decoder could analyze a new brain scan from someone watching a video and predict its meaning signature, while an AI text generator searched for sentences that matched the predicted signature, producing dozens of candidate descriptions and refining them along the way.
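To make that pipeline concrete, here’s a minimal sketch in Python. To be clear, this is an illustration built on stand-ins, not the study’s implementation: a general-purpose sentence embedder plays the deep language model, ridge regression plays the trained brain decoder, and random numbers play the fMRI data.

```python
# Minimal sketch of the two-stage pipeline described above. Everything here
# is a stand-in: a sentence-transformers model plays the deep language model,
# ridge regression plays the brain decoder, and the "scans" are random numbers.
import numpy as np
from sklearn.linear_model import Ridge
from sentence_transformers import SentenceTransformer

lm = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any text embedder

# Stage 1: turn each training video's caption into a "meaning signature"
# (here, simply the caption's embedding vector).
captions = ["a person jumps over a deep waterfall", "a dog runs across a field"]
signatures = lm.encode(captions)              # shape: (n_videos, embed_dim)

# Stage 2: learn a linear map from brain activity to those signatures.
# brain_scans would be (n_videos, n_voxels) fMRI responses recorded while
# the participant watched each video; synthetic data stands in here.
rng = np.random.default_rng(0)
brain_scans = rng.normal(size=(len(captions), 5000))
decoder = Ridge(alpha=1.0).fit(brain_scans, signatures)

# Decoding: predict a signature from a new scan, then score candidate
# sentences by cosine similarity to that prediction.
new_scan = rng.normal(size=(1, 5000))
predicted = decoder.predict(new_scan)[0]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

candidates = ["spring flow", "a person jumps over a deep waterfall"]
scores = {c: cosine(lm.encode([c])[0], predicted) for c in candidates}
print(max(scores, key=scores.get))            # best-matching description
```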
While it sounds like an elaborate chain of guessing games, the results were remarkably descriptive and mostly on the money. According to Nature, by analyzing the brain activity of a participant who watched a video of someone jumping from the top of a waterfall, the AI model initially predicted the string “spring flow,” refined that into “above rapid falling water fall” on the 10th guess, and finally landed on “a person jumps over a deep water fall on a mountain ridge” on the 100th guess.
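That guess-by-guess improvement can be pictured as a hill-climbing search over sentences, as in the hedged toy sketch below. The vocabulary, the edit moves, and the use of the target sentence’s own embedding as the “decoded” signature are all illustrative simplifications; the study’s actual search procedure differs.

```python
# Toy version of the refinement loop: randomly edit a candidate sentence and
# keep the edit only if it moves the sentence's embedding closer to the
# decoded signature. The target sentence's own embedding stands in for a
# real decoded signature here.
import random
import numpy as np
from sentence_transformers import SentenceTransformer

lm = SentenceTransformer("all-MiniLM-L6-v2")

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

target = lm.encode(["a person jumps over a deep water fall on a mountain ridge"])[0]

vocabulary = ["a", "person", "jumps", "over", "deep", "water", "fall", "on",
              "mountain", "ridge", "spring", "flow", "rapid", "above"]

def refine(candidate: str, steps: int = 100) -> str:
    best = candidate.split()
    best_score = cosine(lm.encode([candidate])[0], target)
    for _ in range(steps):
        trial = list(best)
        if random.random() < 0.5 and len(trial) < 12:   # grow the sentence
            trial.insert(random.randrange(len(trial) + 1), random.choice(vocabulary))
        else:                                           # or swap one word
            trial[random.randrange(len(trial))] = random.choice(vocabulary)
        score = cosine(lm.encode([" ".join(trial)])[0], target)
        if score > best_score:  # keep the edit only if it improves the match
            best, best_score = trial, score
    return " ".join(best)

print(refine("spring flow"))  # drifts toward the target description over time
```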
Overall, the generated text descriptions achieved 50 percent accuracy in identifying the correct video out of 100 possibilities. That’s far better than random chance, which would be just one percent, and impressive in the context of essentially divining coherent thoughts out of brain patterns.
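In other words, the test works like a lineup: the generated description has to pick out the right video from all 100 candidates. A minimal sketch of that scoring, with illustrative function and variable names:

```python
# Sketch of the identification test behind that 50 percent figure: a decode
# counts as correct only if the generated description is more similar to the
# true video's caption than to the other 99. Random guessing would therefore
# score 1/100, or one percent.
import numpy as np

def identification_accuracy(generated_embs: np.ndarray,
                            caption_embs: np.ndarray) -> float:
    # generated_embs[i] is the embedding of the description decoded for
    # video i; caption_embs holds the embeddings of all 100 true captions.
    g = generated_embs / np.linalg.norm(generated_embs, axis=1, keepdims=True)
    c = caption_embs / np.linalg.norm(caption_embs, axis=1, keepdims=True)
    sims = g @ c.T                      # (100, 100) cosine similarities
    picks = sims.argmax(axis=1)         # closest caption per decode
    return float((picks == np.arange(len(picks))).mean())
```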
The researchers aren’t the only ones to claim they’ve developed a technique for scanning thoughts. But other attempts only produced crude strings of keywords rather than detailed context, study coauthor Tomoyasu Horikawa, a computational neuroscientist at NTT Communication Science Laboratories in Kanagawa, Japan, told Nature. Or they used AI models to directly compose the sentences, blurring the line between the person’s actual thoughts and what was AI-generated.
Other techniques were wildly impractical. Meta, for example, created a device that lets you type text with your brain by pairing a deep learning AI model with a magnetoencephalography scanner. But such a machine is prohibitively expensive and bulky, and can only be used inside a room shielded from the Earth’s magnetic field.
While this latest approach relied on scans from an MRI machine, which is no more practical for daily use, the researchers hope the technique could one day be paired with brain implants that would provide the readings instead.
“If we can do that using these artificial systems, maybe we can help out these people with communication difficulties,” Huth told Nature.
More on brain tech: Neuralink Head of Surgery Says Robot-Human Interface Happening “Very Soon”