Be careful what you tell AI chatbots, because apparently it's easy for hackers to spy on those conversations.

"Currently, anybody can read private chats sent from ChatGPT and other services," Yisroel Mirsky, the head of the Offensive AI Research Lab at Israel's Ben-Gurion University, told Ars Technica in an email. "This includes malicious actors on the same Wi-Fi or LAN as a client (e.g., same coffee shop), or even a malicious actor on the internet — anyone who can observe the traffic."

These types of hacks, as the report explained, are what's known as "side-channel attacks," in which third parties passively infer data from metadata or other indirect exposures rather than breaking through a target's defenses directly. While this sort of exploit can occur with any kind of tech, AI chatbots appear particularly vulnerable because their providers' encryption practices aren't necessarily up to snuff.

"The attack is passive and can happen without OpenAI or their client's knowledge," the researcher revealed. "OpenAI encrypts their traffic to prevent these kinds of eavesdropping attacks, but our research shows that the way OpenAI is using encryption is flawed, and thus the content of the messages are exposed."

Although side-channel attacks are less invasive than other forms of hacking, they can, as Ars reports, infer a given chatbot prompt with roughly 55 percent accuracy, which makes any sensitive question someone might ask an AI relatively easy for bad actors to detect.

While the Ben-Gurion researchers focused primarily on OpenAI's encryption flaws, most chatbots on the market today, with the notable exception of Google's Gemini, can be exploited this way, the report indicates.

This issue arises from the way chatbots break text into small units of data known as "tokens," which large language models (LLMs) use to process inputs and generate legible responses. To keep a user's "conversation" with the chatbot flowing naturally, these tokens are typically streamed out one by one as they're generated, like someone typing out a reply, rather than an entire paragraph's worth of text appearing all at once.

While the delivery process is generally encrypted, the tokens themselves produce a side channel that researchers hadn't previously been aware of: the encryption hides what each token says, but not how long it is. Anyone who can observe this real-time traffic could, as the Ben-Gurion researchers explain in a new paper, infer your prompts based on the length of each token they observe, sort of like inferring the topic of a hushed conversation heard on the other side of a door or wall.
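To get a rough sense of the mechanism, here's a minimal Python sketch, our own illustration rather than anything from the paper, which assumes each token travels in its own encrypted packet with a fixed amount of overhead, so the size of each packet on the wire tracks the length of the token inside it:

```python
# Minimal sketch (illustrative, not from the paper): if each token is streamed
# in its own encrypted packet with a fixed overhead, the packet sizes an
# eavesdropper sees on the wire reveal the length of every token.

FIXED_OVERHEAD = 5  # hypothetical per-packet header/padding, in bytes

def simulate_stream(tokens):
    """Packet sizes a passive observer would see for a token-by-token reply."""
    return [len(token.encode("utf-8")) + FIXED_OVERHEAD for token in tokens]

def infer_token_lengths(packet_sizes):
    """What the eavesdropper recovers without ever breaking the encryption."""
    return [size - FIXED_OVERHEAD for size in packet_sizes]

reply = ["I", " am", " sorry", " to", " hear", " about", " your", " diagnosis"]
observed = simulate_stream(reply)       # visible to anyone on the network
leaked = infer_token_lengths(observed)  # [1, 3, 6, 3, 5, 6, 5, 10]

print(observed, leaked)
```

The words themselves stay encrypted; only their lengths leak. But that, it turns out, is enough to work with.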

To document this exploit, Mirsky and his team at Ben-Gurion fed raw data acquired through the unintended side channel into a second LLM trained to identify keywords, as they explain in their paper, which has yet to be formally published. They found that the second LLM had a roughly 50/50 shot at inferring the general prompts, and was able to predict them nearly perfectly a whopping 29 percent of the time.
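The researchers relied on a trained LLM for that reconstruction step, but a toy example, entirely hypothetical, hints at why a bare sequence of lengths is so revealing: it can be enough to pick the right sentence out of a pool of candidates.

```python
# Toy illustration (hypothetical; the actual attack used a trained LLM and the
# chatbot's real tokenizer): a sequence of lengths acts like a fingerprint.

def length_fingerprint(text):
    """Word lengths as a crude stand-in for per-token lengths."""
    return [len(word) for word in text.split()]

leaked = [2, 1, 8, 4, 2, 9]  # lengths recovered via the side channel (made up)

candidates = [
    "is a headache sign of pregnancy",
    "is my symptom a sign of diabetes",
    "what is the capital of France",
]

def score(candidate):
    """Lower is better: how closely the candidate matches the leaked pattern."""
    fingerprint = length_fingerprint(candidate)
    if len(fingerprint) != len(leaked):
        return float("inf")
    return sum(abs(a - b) for a, b in zip(fingerprint, leaked))

best_guess = min(candidates, key=score)
print(best_guess)  # "is a headache sign of pregnancy"
```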

In a statement, Microsoft told Ars that the exploit, which also affects its Copilot AI, does not compromise personal details.

"Specific details like names are unlikely to be predicted," a Microsoft spokesperson told the website. "We are committed to helping protect our customers against these potential attacks and will address it with an update."

The findings are ominous. When it comes to fraught topics such as abortion or LGBTQ issues, both of which are being criminalized in the United States as we speak, the exploit could be used to harm or punish people who are simply seeking information.

More on AI exploits: Microsoft Says Copilot's Alternate Personality as a Godlike and Vengeful AGI Is an "Exploit, Not a Feature"

