There is no reliable way to distinguish text written by a human from text written by a large language model. OpenAI writes:
> Do AI detectors work?
>
> - In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.
> - Additionally, ChatGPT has no “knowledge” of what content could be AI-generated. It will sometimes make up responses to questions like “did you write this [essay]?” or “could this have been written by AI?” These responses are random and have no basis in fact.
> - To elaborate on our research into the shortcomings of detectors, one of our key findings was that these tools sometimes suggest that human-written content was generated by AI.
> - When we at OpenAI tried to train an AI-generated content detector, we found that it labeled human-written text like Shakespeare and the Declaration of Independence as AI-generated.
> - There were also indications that it could disproportionately impact students who had learned or were learning English as a second language and students whose writing was particularly formulaic or concise.
> - Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.
There is some good research on watermarking LLM-generated text, but the watermarks are generally not robust: paraphrasing or even light editing tends to destroy the statistical signal.
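To make that concrete, one well-known scheme (the “green list” watermark of Kirchenbauer et al., 2023, not anything OpenAI has confirmed shipping) biases generation toward a pseudo-random subset of the vocabulary seeded by the previous token, then detects the watermark with a simple statistical test. Here is a minimal sketch with a toy string vocabulary; the function names and the 50% green fraction are illustrative choices, not part of any published API:

```python
import hashlib
import math
import random

GREEN_FRACTION = 0.5  # share of the vocabulary marked "green" at each step


def green_list(prev_token: str, vocab: list[str]) -> set[str]:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token.

    During generation the model's logits are softly biased toward this set;
    the detector re-derives the same set later without needing the model.
    """
    seed = int.from_bytes(hashlib.sha256(prev_token.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * GREEN_FRACTION)))


def watermark_z_score(tokens: list[str], vocab: list[str]) -> float:
    """One-proportion z-test over consecutive token pairs.

    Unwatermarked text lands in the green list about GREEN_FRACTION of the
    time; watermarked text lands there far more often, so a large z-score
    flags the watermark.
    """
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(tok in green_list(prev, vocab) for prev, tok in pairs)
    n = len(pairs)
    expected = GREEN_FRACTION * n
    return (hits - expected) / math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
```

The fragility is visible in the math: the test only works while each token still sits in the green list derived from its predecessor, so paraphrasing the output (or asking another model to reword it) scrambles those pairs and drives the z-score back toward zero.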
I don’t think the detectors are going to win this arms race.