Well, that was quick. In a very unfortunate turn of events, there:
- Already appears to be an AI-generated Substack blog.
- That blog, The Rationalist, is already word-for-word plagiarizing human-made work.
- That plagiarized work was shared by another platform, amassing views and sparking conversation.
- There seems to be little recourse, if any, for the human writer whose work was ripped off, and approximately zero hope that this extremely frustrating and otherwise damaging scenario will be prevented from happening again. (And again and again and again.)
"Over the weekend, a new Substack called The Rationalist lifted analysis and writing directly from Big Technology," tech journalist — and plagiarized party — Alex Kantrowitz wrote in an explanatory Substack post of his own. "It would've been a terrific debut for any publication, if it was authentic."
Indeed, to Kantrowitz's point, the success of The Rationalist's post was impressive. It ended up on Hacker News, a site with a large and engaged readership, and sparked substantial discussion there.
But again, as Kantrowitz very clearly lays out, the article was a blatant copy of his own, both in specific sentences and in overall message and structure. And though Substack promised to investigate the ordeal, there are a number of bright red flags: the Rationalist's writer goes only by the single name "PETRA," and the writing itself was repetitive and rarely linked to sources.
"The Rationalist is an odd publication. It has no mission. No named authors outside of PETRA. It's been live for a week," Kantrowitz continued. "And yet two days after it went live, it was lifting passages directly from Big Technology."
"The speed at which they were able to copy, remix, publish, and distribute their inauthentic story was impressive," he warned. "It outpaced the platforms' ability, and perhaps willingness, to stop it, signaling generative AI's darker side will be difficult to tame."
Of course, while we certainly have some skin in the game here, writing isn't the only field where similar situations are playing out. Related conversations are happening around AI-powered image generators, particularly OpenAI's DALL-E 2; artists are rightfully concerned about theft and proper crediting, especially considering that OpenAI has already okayed the commercial use of the images its system generates.
Still, though there's an existential debate to be had over where exactly the line between inspiration and full-on copying really is, Kantrowitz's ordeal seems to be a prime example of the latter.
That said, there is some nuance here. Though PETRA failed to note that the article was AI-written in the Substack post itself, they did come clean to some skeptical Hacker News commenters, explaining that they used GPT-3 and Hugging Face, along with "a few other writing tools," to help them "wordsmith it," though not because they wanted to plagiarize.
"I am not a native English speaker, so I have been using these tools to avoid awkward sentences/paragraphs," they added. "Clearly this has not been the outcome I was hoping for."
If true, that's something to sympathize with, and a complicating point indeed. (Although, to Kantrowitz's point, details about their publication, their profile, and their failure to be up-front about the AI use from the start do sound some alarm bells.)
But even if the plagiarism was entirely unintentional, it still absolutely sucks for the plagiarized party. And though Hacker News readers were able to notice that something was up with the writing, generative AI programs are only going to get stronger. If this is already happening now, it's a pretty grim sign of the confusion and theft that's to come.
"Imagine AI remixing the Financial Times' ten most-read stories of the day — or The Information's VC coverage — and making the reporting available sans paywall," Kantrowitz continued. "AI is already writing non-plagiarized stories for publications like CNET. At a certain point, some publishers will cut corners."
Maybe the most troubling part is that Substack ultimately decided, according to Kantrowitz, that PETRA's blog didn't actually break any of the platform's rules.
And when you start thinking about the material used to train large AI systems, the questions about intellectual property get even more complicated.
READ MORE: A Writer Used AI To Plagiarize Me. Now What? [Big Technology]
More on generative AI ethics: CNET Is Quietly Publishing Entire Articles Generated by AI