How to Tell If What You’re Reading Was Written By AI

4 months ago

This post is part of Lifehacker’s “Exposing AI” series. We’re exploring six different types of AI-generated media, and highlighting the common quirks, byproducts, and hallmarks that help you tell the difference between artificial and human-created content.

From the moment ChatGPT introduced the world to generative AI in late 2022, it was apparent that, going forward, you could no longer trust that something you’re reading was written by a human. You can ask an AI program like ChatGPT to write something, anything at all, and it will, in mere seconds. So how can you trust that what you’re reading came from the mind of a person, and not the product of an algorithm?

If the ongoing deflation of the AI bubble has shown us anything, it’s that most people kind of hate AI in general, which means they probably aren’t keen on the idea that what they are reading was thoughtlessly spit out by a machine. Still, some have fully embraced AI’s ability to generate realistic text, for better or, often, worse. Last year, CNET quietly began publishing AI content alongside human-written articles, only to face scorn and backlash from its own employees. Former Lifehacker parent company G/O Media also published AI content on its sites, albeit openly, and experienced the same blowback—both for implementing the tech with zero employee input, and because the content itself was just terrible.

But not all AI-generated text announces itself quite so plainly. When used correctly, AI programs can generate text that is convincing—even if you can still spot clues that reveal its inhuman source.

How AI writing works

Generative AI isn’t some all-knowing digital consciousness that can answer your questions like a human would. It’s not actually “intelligent” at all. Current AI tools are powered by large language models (LLMs), which are deep-learning algorithms trained on huge data sets—in this case, data sets of text. This training informs all of their responses to user queries. When you ask ChatGPT to write you something, the AI breaks down your question and identifies what it “thinks” are the most important elements in your query. It then “predicts” what the right sequence of words would be to answer your request, based on its understanding of the relationship between words.
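To make "predicting the right sequence of words" a little more concrete, here is a deliberately tiny sketch. It is not how ChatGPT or any real LLM is built: real models use neural networks trained on enormous data sets, while this toy only counts which word most often follows another in a one-sentence corpus I made up. The function names (train_bigrams, predict_next, generate) are hypothetical and exist only for this illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

def generate(follows, start, length=5):
    """Chain predictions together, one word at a time."""
    words = [start.lower()]
    for _ in range(length):
        nxt = predict_next(follows, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

# A made-up training corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" most often here
print(generate(model, "sat", 3))
```

The toy has no idea what a cat is. It only knows that "cat" tends to come after "the" in the text it was shown, which is the same basic limitation, at a vastly smaller scale, that the rest of this article leans on.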

More powerful models can both take in more information at once and return longer, more natural results in kind. Plus, it’s common for chatbots to be programmed with custom instructions that apply to all prompts, which, if used strategically, can potentially mask the usual signs of AI-generated text.

That said, no matter how you coax the AI into responding, it is beholden to its training, and there will likely be signs such a piece of text was generated by an LLM. Here are some things to look out for.

Watch for commonly used words and phrases

Because chatbots have been trained to look for the relationships between words, they tend to use certain words and phrases more often than a person would. There’s no specific list of words and phrases that serve as red flags, but if you use a tool like ChatGPT enough, you may start to pick up on them.

For example, ChatGPT frequently uses the word “delve,” especially during transitions in writing. (e.g. “Let’s delve into its meaning.”) The tool also loves to express how an idea “underscores” the overall argument (e.g. “This experience underscores the importance of perseverance…”), and how one thing is “a testament to” something else. (I generated three essays with ChatGPT for this section—two with GPT-4o and one with GPT-4o mini—and “testament” popped up in each one.)

Similarly, you may see repeated uses of words like “emerge,” “relentless,” “groundbreaking,” among other notable regulars. In particular, when ChatGPT is describing a collection of something, it will often call it a “mosaic” or a “tapestry.” (e.g. “Madrid’s cultural landscape is a vibrant mosaic.”)

This Reddit thread from r/chatgpt highlights a bunch of these commonly generated words, though it’s worth noting that the post is 10 months old, and OpenAI frequently updates its models, so some of it may not be as relevant today. In my testing, I found some of the Reddit thread’s most-cited words didn’t appear in my test essays at all, while others certainly did, with frequency.

All these words are certainly perfectly fine to use when doing your own writing. If a student writes “delve into” in their essay, that isn’t a smoking gun that proves they generated it with ChatGPT. If an employee writes that something is “a testament to” something else in a report, that doesn’t mean they’re outsourcing their work to AI. This is just one aspect of AI writing to note as you analyze text going forward.
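If you want to tally these words rather than eyeball them, a rough sketch might look like the following. The watch list is just the handful of words mentioned in this section, not an authoritative list, and the function names are hypothetical. It only matches exact words (so "delves" would not count), and a high tally is a prompt to read more closely, not proof of anything.

```python
import re
from collections import Counter

# Illustrative only: the words named in this section, not a definitive list.
WATCH_LIST = ["delve", "underscores", "testament", "tapestry", "mosaic",
              "emerge", "relentless", "groundbreaking"]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_watch_words(text, watch_list=WATCH_LIST):
    """Return how often each watch-list word appears (exact matches only)."""
    counts = Counter(tokenize(text))
    return {word: counts[word] for word in watch_list if counts[word] > 0}

def rate_per_thousand(text, watch_list=WATCH_LIST):
    """Watch-list hits per 1,000 words, so long and short texts can be compared."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(count_watch_words(text, watch_list).values())
    return 1000 * hits / len(tokens)

sample = ("Let's delve into its meaning. This experience underscores the "
          "importance of perseverance and is a testament to grit.")
print(count_watch_words(sample))
print(round(rate_per_thousand(sample), 1))
```

Comparing the rate against writing you know came from the same person is likely more telling than any single number on its own.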

Consider the style of the writing

It’s impressive how quickly AI can generate a response to a query, especially when you’re working with a particularly powerful LLM. And while some of that writing can appear very natural, if you’re reading closely, you’ll start to notice quirks that most human writers wouldn’t use.

Whether you’re using OpenAI’s GPT model or Google’s Gemini, AI has a bad habit of using flowery language in its generations, as if it was mostly trained on marketing copy. AI will often try to sell you hard on whatever it happens to be talking about: The city it’s writing about is often “integral,” “vibrant,” and a “cornerstone” of the country it’s in; the analogy it uses “beautifully” highlights the overall argument; a negative consequence is not just bad, but “devastating.” None of these examples is damning in isolation, but if you read enough AI text, you’ll start to feel like you’ve been talking to a thesaurus.

This becomes even more apparent when a chatbot is attempting to use a casual tone. If the bot is purporting to be a real person, for example, it often will present as bubbly and over-the-top, and far too enthusiastic to listen to anything you have to say. To be fair, in my testing for this article, ChatGPT’s GPT-4o model didn’t appear to do this as much as it used to, preferring more succinct responses to personal queries—but Meta AI’s chatbot absolutely still does it, stepping into the roles of both best friend and therapist whenever I shared a fake problem I was having.

If you’re reading an essay or article that expresses an argument, take note of how the “writer” structures their points. Someone who asks an AI tool to write an essay on a subject without giving it too much coaching will often receive an essay that doesn’t actually delve into the arguments all that much. The AI will likely generate short paragraphs offering surface-level points that don’t add much to deepen the argument or contribute to the narrative, masking these limitations with the aforementioned $10 words and flowery language. Each paragraph might come across as more of a summary of the argument, rather than an attempt to contribute to the argument itself. Remember, an LLM doesn’t even know what it’s arguing; it just strings together words it believes belong together.

If you feel you’ve walked away from the piece having learned nothing at all, that might be AI’s doing.

Fact check and proofread

LLMs are black boxes. Their training is so intricate, we can’t peer inside to see exactly how they established their understanding of the relationships between words. What we do know is, all AI has the capability (and the tendency) to hallucinate. In other words, sometimes an AI will just make things up. Again, LLMs don’t actually know anything: They just predict patterns of words based on their training. So while a lot of what they spit out will likely be rooted in the truth, sometimes they predict incorrectly, and you might get some bizarre results on the other end. If you’re reading a piece of text, and you see a claim you know isn’t true stated as fact, especially without a source, be skeptical.

On the flip side, consider how much proofreading the piece required. If there were zero typos and no grammatical mistakes, that’s also an AI tell: These models might make things up, but they don’t output mistakes like misspellings. Sure, maybe the author made sure to dot every “i” and cross every “t,” but if you’re already concerned the text was generated with AI, stilted perfectionism can be a giveaway.

Try an AI text detector (but you can’t trust those either)

AI detectors, like LLMs, are based on AI models. However, instead of being trained on large volumes of general text, these detectors are trained specifically on AI text. In theory, this means they should be able to spot AI text when presented with a sample. That’s not always the case.

When I wrote about AI detectors last year, I warned not to use them, because they were not as reliable as they claimed to be. It’s tough to say how much they have improved in the time since: When I feed one of my stories through a tool like ZeroGPT, it says my piece was 100% human-written. (Damn right.) If I submit an essay generated by Gemini about the significance of Harry losing his parents in the Harry Potter series, the tool identifies 94.95% of the piece as AI-generated. (The only sentence it thinks was written by a human was: “This personal stake in the conflict distinguishes Harry from other characters, granting him an unwavering purpose.” Sure.)

And yet the detector still fails the same test I gave it in 2023: It believes 100% of Article I, Section 2 of the United States Constitution is AI-generated. Someone tell Congress! I also set it to analyzing this short article from The New York Times, published July 16, 2015, long before the advent of modern LLMs. Again, I was assured the piece was 100% AI.
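Tests like these are really checks for false positives: human text wrongly flagged as AI. If you want to run the same kind of check more systematically, a minimal sketch could look like this. Everything here is an assumption for illustration: toy_detector is a made-up stand-in (it is not ZeroGPT or any real product's API), and the labeled samples are placeholders you would swap for text whose origin you actually know.

```python
def evaluate_detector(detector, samples):
    """Score a detector against texts of known origin.

    detector: any callable that returns True when it judges text AI-generated.
    samples:  list of (text, is_ai) pairs where is_ai is the known truth.
    """
    false_pos = false_neg = humans = ais = 0
    for text, is_ai in samples:
        flagged = detector(text)
        if is_ai:
            ais += 1
            if not flagged:
                false_neg += 1  # AI text that slipped through
        else:
            humans += 1
            if flagged:
                false_pos += 1  # human text wrongly flagged
    return {
        "false_positive_rate": false_pos / humans if humans else None,
        "false_negative_rate": false_neg / ais if ais else None,
    }

def toy_detector(text):
    """Hypothetical stand-in: flags any text containing the word 'testament'."""
    return "testament" in text.lower()

# Placeholder samples; replace with writing whose source you are sure of.
known_samples = [
    ("We the People of the United States", False),
    ("This victory is a testament to hard work", False),
    ("The city is a testament to resilience", True),
    ("Cats are popular pets", True),
]

results = evaluate_detector(toy_detector, known_samples)
print(results)
```

A detector that flags a meaningful share of text you know to be human-written is not one to rely on for text you are unsure about.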

There are a lot of AI detectors on the market, and maybe some are better than others. If you find one that tends to reliably identify text you know to be human-generated as such, and likewise for text you know is AI, go ahead and test writing you aren’t sure about. But I still think the superior method is analyzing it yourself. AI text is getting more realistic, but it still comes with plenty of tells that give it away, and often, you’ll know it when you see it.