The Uncanny Valley of Words
Why Large Language Models Feel Eerie
When Masahiro Mori first sketched the famous “uncanny valley” curve in 1970, he was thinking about faces and bodies. The Japanese roboticist noticed something odd: the more human a robot looked, the more we warmed to it—until it got too close. Suddenly the warmth dropped off a cliff. Almost-human figures triggered unease, revulsion even. A porcelain doll, a zombie, a humanoid robot with glassy eyes and stiff fingers—each sits in that unsettling dip between “cute machine” and “believably human.” But what Mori didn’t know back then is that the uncanny valley would not be limited to appearances. Today we are meeting its linguistic cousin: a valley of words.
Large language models (LLMs)—the technology behind ChatGPT and its peers—don’t walk or blink or grimace. They don’t have faces at all. Yet many of us feel the same shiver of strangeness when we talk to them. Their replies are polished, smooth, polite. Too polished. Too polite. They sound human until they don’t. They imitate empathy until the seams show. They answer as if they understand, but a strange hollowness lingers. Why? Because LLMs pull us into a new kind of uncanny valley—one that lives not in the body, but in language itself.
The uncanny valley is always about mismatched expectations. We expect smooth skin but get waxy plastic. We expect warm eyes but get lifeless glass.
With LLMs, the mismatch is subtler. We expect genuine thought, but what we get is probability. We expect insight, but sometimes we get nonsense dressed up as sense. We expect a conversation partner, but the “partner” never interrupts, never pushes back, never remembers what it’s like to be tired or angry or bored.
It is the almost-ness that disturbs. The “close but not quite.” The fact that these systems can produce text that looks indistinguishable from human writing—until suddenly it doesn’t. One jarring slip and the illusion collapses.
The Politeness Problem
While it might be nice to have our egos stroked by ‘someone’ telling us that ‘that is a really insightful question’, or that our blog post is ‘thoughtful’ … the contentment is fleeting. If you’ve used an LLM, you’ll know the tone: unfailingly polite, deferential, careful. It is the tone of a machine that has been sanded smooth by guardrails and safety training.
“I’m sorry, but I can’t provide that information.”
“Great question! Here are three possible answers…”
“It’s important to note that…”
Polite. Respectful. Helpful. But also… wrong. Humans aren’t this polite. We forget to say thank you. We cut each other off. We hedge, grumble, snap. We use sarcasm and irony. We say the wrong thing, backtrack, laugh at ourselves.
In human interaction, rough edges are a mark of authenticity. The very things designers try to scrub out of LLMs are the things that make conversation feel real. Over-politeness becomes its own kind of mask. This is why people sometimes describe LLMs as “obsequious” or liken them to a creepy butler. We don’t fully trust someone who agrees too quickly, flatters too much, or apologizes at every turn. A system that always serves feels less human, not more.
The Awareness Gap
There’s another factor at work: we know they’re programmed. A robot face might trick us until we notice the jerky eye movements; with LLMs we go in with full awareness. We know this is code, not consciousness. An algorithm. A program. It doesn’t know us! We know that the system is guessing the next word based on probabilities, not weighing a thought or feeling.
“[LLMs] are like [stochastic] parrots, they repeat without understanding.” (Prof. Emily Bender, Director of the Computational Linguistics Laboratory, University of Washington)
That double-awareness is a key ingredient in the uncanny sensation. You type a question, get a smooth, human-like response, and feel the tug of natural conversation. But then, in the back of your mind, the reminder: this thing has never had an experience, never cared about anything, never believed or doubted a word it just generated.
It is like listening to an actor who can deliver the lines with perfect cadence but has never lived the story.
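For the technically curious, the “guessing the next word based on probabilities” mentioned above really is as mechanical as it sounds. Here is a deliberately toy sketch in Python of sampling a next word from probabilities; the lookup table, the numbers, and the guess_next_word helper are all invented for illustration, standing in for the billions of learned parameters inside a real model:

```python
# A toy illustration of "guessing the next word based on probabilities".
# A real LLM scores tens of thousands of tokens with a neural network;
# here the "model" is just a hand-made lookup table of made-up numbers.
import random

toy_next_word_probs = {
    "The weather today is": {"sunny": 0.45, "cold": 0.30, "uncertain": 0.25},
    "I'm sorry, but I": {"can't": 0.70, "won't": 0.20, "will": 0.10},
}

def guess_next_word(context: str) -> str:
    """Sample the next word from the probability table for this context.

    There is no understanding here, and nothing at stake:
    just weighted dice rolled over likely continuations.
    """
    probs = toy_next_word_probs[context]
    words = list(probs.keys())
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

print("I'm sorry, but I", guess_next_word("I'm sorry, but I"))
```

Scale that dice roll up to a vast vocabulary and an enormous learned model of context and you get fluent text; what you do not get is experience, belief, or doubt.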
What Gives It Away?
We don’t need the sophistication of the Voight-Kampff test in this scenario! Here are some of the most common linguistic tics that tip people into the uncanny valley of words:
The Self-Disclosure Disclaimer: “As an AI language model, I can’t…”
The Over-Polite Apology: “I’m sorry, but I cannot do that.”
The Teacherly Clarification: “It’s important to note that…”
The Obsequious Acknowledgment: “That’s an excellent point!”
The Safe-But-Vague Statement: “There are pros and cons to both approaches.”
The Robotic Formalism: “Therefore, it can be concluded that…”
Each of these reveals the machinery underneath. They are the textual equivalents of waxy skin or glass eyes. They remind us, abruptly, that the voice we’re hearing is not a mind but a mirror.
The strangeness of LLMs becomes clearer when you hold them up against the quirks of human talk.
We interrupt. We trail off. We contradict ourselves. We say “um” and “you know.” We change tone halfway through a sentence. We argue bluntly. We joke in ways that only make sense because of shared history. These are not flaws; they are signals of presence. They show that there is a mind improvising, groping, adjusting, caring. The very messiness of human dialogue is what makes it feel ‘normal’.
LLMs, by contrast, produce text that is too tidy, too smoothed over. They rarely interrupt themselves. They rarely leave a thought dangling. They rarely shift register in a way that feels spontaneous. The absence of those imperfections is precisely what drops them into the uncanny valley.
You might ask: so what? If the text is useful, does it matter if it feels a bit off? It does matter—because trust in language is fragile. Conversation is not just about exchanging information; it is about building confidence that the other party understands, cares, and means what they say. When a response feels hollow, scripted, or eerily polite, we sense that something vital is missing.
That missing “something” is authenticity. We don’t just want answers; we want answers from someone who has a stake in them. LLMs don’t.
This doesn’t mean LLMs are useless. Far from it. We love them. They are extraordinary tools for drafting, summarising, brainstorming, or exploring ideas. But when we use them as conversational partners, the uncanny valley of words reminds us: this is not another mind. It is an echo of ours.
The Future of the Linguistic Valley
So where does this go next? Two possibilities:
Deeper Disguise: Designers might try to close the valley by making LLMs sound even more human. Add filler words. Insert fake hesitations. Script in small talk. Some companies are already experimenting with this. The risk is that it makes the systems more deceptive, blurring the line between tool and person. This becomes more … what word shall we use … perhaps ‘pernicious’ when our usual mode of interaction shifts from the keyboard and screen to the spoken word (which, of course, is already an option today).
Radical Honesty: Alternatively, we might lean into the difference. Stop pretending the system is a person. Design voices that sound confidently artificial rather than creepily human. Celebrate the tool-like qualities instead of masking them. Think of R2-D2 chirping rather than a waxy humanoid smiling.
The second path might be more ethical, even more comfortable. We don’t need machines to be our friends; we need them to be our partners. And partnerships are built on clarity, not disguise.
The uncanny valley of words is not going away. In fact, as LLMs get better, it may deepen. The closer they get to fluency, the more sharply we notice the gaps. But perhaps that’s not a failure—it’s a reminder. A reminder that conversation is not just about strings of words but about the human weight behind them. Beliefs, emotions, memories, intentions. The things a language model cannot have. When we feel that shiver of eeriness talking to an LLM, it may not just be discomfort. It may be our cognitive immune system at work, warning us that something sounds human but isn’t. That’s not a bug. That’s a safeguard.
In the end, maybe the best lesson of the uncanny valley is not about machines at all, but about us. The valley shows us that what we value in human conversation is not perfect grammar, not polished tone, not endless politeness. What we value are the interruptions, the slips, the emotional tangles, the messy sparks of real thought. We don’t trust the flawless. We trust the flawed, the quirky. So as we learn to live and work with language models, the trick will be to see them clearly—not as human replacements, not as eerily polite butlers, but as tools. Useful, powerful, sometimes strange. And always different from us. Because the real uncanny valley is not a gap in the machines. It is the gap between what we imagine they are and what they really are.
Who wrote this post? Adam, David or ChatGPT? What gives it away?



