Amid its long-running copyright infringement lawsuit against OpenAI, the paper of record will soon have access to all of OpenAI's user archives — including the stuff that was deleted.
As Ars Technica reports, the federal judge presiding over the lawsuit by the New York Times against OpenAI has granted the newspaper and its co-plaintiffs, the New York Daily News and the Center for Investigative Reporting, access to the AI company's logs to see exactly how much copyrighted material was infringed.
In its previous reporting on the order, which came down last month and was affirmed this week over OpenAI's attempts to appeal, Ars noted that the NYT justifies this broad sweep by suggesting that people who use ChatGPT to bypass its paywalls may delete their history after doing so. The newspaper also claims that searching through those logs may prove the crux of the whole suit: that OpenAI's large language models (LLMs) are not only trained on its copyrighted material, but are also able to plagiarize it.
OpenAI, as you might imagine, is none too pleased. Last month, the company claimed that the order will force it to bypass its "long-standing privacy norms," and after the latest ruling, an OpenAI spokesperson told Ars that OpenAI intends to "keep fighting" against it.
Notably, all this is occurring as the NYT et al. and OpenAI negotiate how to handle the data trove search. As OpenAI noted in a statement last month, the order covers everything from free ChatGPT logs to more sensitive information from folks who use its API. (The order does specifically note that logs from ChatGPT Enterprise and ChatGPT Edu, its custom model specifically for colleges and universities, will be exempt.)
Beyond the search for evidence of copyright infringement, this OpenAI log gambit may also help prove that ChatGPT dilutes the news market by summarizing articles within the chatbot, which ultimately costs media outlets ad revenue because would-be readers bypass their links entirely.
Earlier this year, the content licensing platform TollBit found, per Forbes, that chatbots from OpenAI, Google, and others sent 96 percent less traffic to publishers than traditional search engines would — a trend that has already started to hurt the news industry.
In the existential fight between word purveyors and AI, proof of market dilution could, as a judge told a publisher suing Anthropic last month, tip the scales in favor of copyright holders — a momentous implication for anyone trying to bypass annoying paywalls.