We Tested OpenAI's New AI-Detector and Uhhhhh

OpenAI, the company behind blockbuster AI chatbot ChatGPT, has released a tool meant to help teachers detect if a text was written by a student or an AI.

The tool couldn't have come at a more appropriate time, with educators across the country battling with a new reality. According to one recent survey, a whopping 48 percent of students confessed they had already used ChatGPT to complete an at-home test or quiz.

OpenAI's new tool — with the uninspired name "AI Text Classifier" — requires a sample of at least 150 words and rates a text as either "very unlikely, unlikely, unclear if it is, possibly, or likely AI-generated."

"The model is primarily trained and evaluated on English language text from the public web (in the case of the human-written dataset) and from models trained on English language text (in the case of the model-written dataset)," the company noted in a small FAQ section appended to the tool's webpage.

But whether it'll actually prove useful to educators remains to be seen.

"The classifier isn't always accurate," the company admits. "It can mislabel both AI-generated and human-written text."

In other words, it's not very good yet. In our own cursory testing, the tool easily identified all ten of the human-written blog samples we fed it. Nine were evaluated as being "very unlikely AI-generated" while one was classified as only "unlikely AI-generated."

But things started to look drastically different when we fed it ten text samples generated by ChatGPT. Only four of the samples were rated as "likely" to be generated by an AI, and three as "possibly" AI-generated.

One sample — we asked it to generate 1,000 words on the causes of global poverty — was even listed as "very unlikely" to have been AI-generated. A further three AI-generated samples were classified as only "possibly" AI-generated.

Those are some pretty dismal results for a detection tool developed by the same company that came up with the language model it's being measured against.

That also means that it's not going to be very useful to educators. After all, only getting a vague semblance of an answer won't be enough for teachers to accuse their students of plagiarism, a serious charge.

In fairness, that's something OpenAI is well aware of.

"The results may help, but should not be the sole piece of evidence when deciding whether a document was generated with AI," OpenAI notes.

It's not the first ChatGPT detection tool we've encountered. An app called GPTZero, developed by Princeton University computer science student Edward Tian, made headlines earlier this month for its ability to "quickly and efficiently detect whether an essay is ChatGPT or human written."

"Think are high school teachers going to want students using ChatGPT to write their history essays?" the 22-year-old student tweeted. "Likely not."

According to The Wall Street Journal, Tian's app has already amassed a waitlist of 23,000 teachers for an upcoming version.

But after testing it for ourselves, we came away unimpressed. Out of sixteen pieces of text total, GPTZero correctly identified the ChatGPT text in seven out of eight attempts and the human writing six out of eight times.

Reminder: to be actually useful in an academic environment, it would need a near-perfect score.
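For the curious, the arithmetic behind that GPTZero score can be sketched in a few lines of Python. This is only a rough tally of the counts reported above, not anything produced by GPTZero itself. The function name and dictionary keys are our own invention, and the sketch assumes the two human-written texts that weren't correctly identified were wrongly flagged as AI.

```python
def summarize(ai_samples, ai_caught, human_samples, human_cleared):
    """Turn raw test counts into rates. All names here are hypothetical."""
    total = ai_samples + human_samples
    return {
        # Share of all samples labeled correctly
        "accuracy": (ai_caught + human_cleared) / total,
        # Share of ChatGPT texts that slipped through
        "missed_ai_rate": (ai_samples - ai_caught) / ai_samples,
        # Share of human texts not recognized as human
        # (assumed here to mean they were flagged as AI)
        "false_flag_rate": (human_samples - human_cleared) / human_samples,
    }


# Counts from our informal test: 8 ChatGPT texts, 8 human texts
result = summarize(ai_samples=8, ai_caught=7, human_samples=8, human_cleared=6)

for name, value in result.items():
    print(f"{name}: {value:.2%}")
```

With these counts, overall accuracy works out to 13 of 16, or 81.25 percent. The more worrying figure for a classroom is the last one: two of eight human-written texts, or 25 percent, were not recognized as human.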

It remains to be seen if OpenAI's own tool will fare any better in the long term. The company clearly still has a lot of work to do in perfecting its classifier given our results.

"We have not thoroughly assessed the effectiveness of the classifier in detecting content written in collaboration with human authors," the company admitted.

In short, the cat is officially out of the bag. Even the creator of ChatGPT, which is already being used by students across the country to cheat, isn't sure how to tell AI-generated text from human-written text, which doesn't exactly instill confidence.

On a more optimistic note, ChatGPT may force educators to grapple with a new reality and find new ways of incorporating AI into their lessons — which can only be a good thing, given AI's rampant rise in popularity and staying power.

READ MORE: OpenAI releases tool to detect machine-written text [Axios]

