Not Quite There: Exploring the Flaws of Using AI in Writing
Updated February 2026 by Dr. Katharina Grimm
Dr. Katharina Grimm is a UX Writer, educator, and founder of The UX Writing School, with 8+ years of industry experience and a PhD in Technology Management and Communications.
While working on my Voice & Tone Masterclass, I came across various bold claims about what AI, and especially LLMs like ChatGPT, can do for you regarding voice and tone.
Upload a couple of sample texts (newsletters, social media captions, press releases) and let it derive voice & tone guidelines from them. Teach it your voice and tone and let it create style-consistent copy for you. Brief the tool so that non-writers can handle writing tasks and writers don't become their teams' bottleneck.
All those things should be possible with tools like ChatGPT. You just need to brief and train it correctly, right? Actually, that's right. Broken down, it really is that simple. But it's not easy.
The Root of AI Scepticism
The reason why so many professional writers will tell you that it still takes a trained expert to write high-quality copy with AI-based tools like ChatGPT is simple: these tools are flawed in ways non-writers often don't even recognize.
For many users, the process looks like entering a well-thought-out prompt, maybe uploading a voice and tone style guide, and watching ChatGPT produce a text that sounds pretty good. As a non-writer, it might not be obvious that the text doesn't actually align with the overall brand voice, sounds AI-ish, or simply feels… off.
To raise awareness, this blog post summarizes the most significant flaws of LLMs that naturally compromise the quality of your copy. We will look at the specific case of ChatGPT, as it is currently still the most widespread and popular model.
The Flaws of LLMs in 2026
Many of the inherent flaws of LLMs like ChatGPT stem from the fact that these tools do not operate based on critical thinking, but rather on training with language data. Therefore, the quality of ChatGPT's output is not only determined by the prompt you give it, but also by the training data and the way it is processed — both of which are complete black boxes. Let's look at the most consequential flaws that follow from this reality.
1. Adding a WEIRD Bias
Psychology research has long criticized "WEIRD" sampling studies that overrepresent participants from Western, Educated, Industrialized, Rich, Democratic countries. The issue is clear: if you only study a small, biased slice of humanity, your results can't be generalized to everyone.
The same problem applies to LLMs. Research shows that models like ChatGPT are trained disproportionately on internet text — written mostly in English and heavily shaped by WEIRD cultural norms. A 2024 study by Cornell University researchers found that responses from five GPT models consistently aligned with the values of English-speaking and Protestant European countries, even when prompted to represent other cultures. A separate Nature analysis confirmed that AI models continue to be geared toward the needs of English-speaking people in high-income countries.
For voice and tone in UX Writing, this bias poses a real risk: it pulls text output toward a "default" communication style that reflects the statistical center of the training data, not a brand's unique identity or the preferences of your specific target audience, even if that audience sits in a particular global region.
2. Bringing in Its Own Tone
Even if you upload a carefully designed voice and tone guide, ChatGPT often brings in its own tone. You can feel the output being pulled toward the central, default style its training produced. This happens partly because many voice and tone guides are vague or incomplete, and ChatGPT fills the gaps with its own defaults. These defaults include characteristic syntax patterns like the typical binary constructs ("This is not a. This is b." or "This is not only a. This is b.") that feel formulaic precisely because they are.
And there's a compounding effect: the more AI-generated content people publish without editing, the more of that tone ends up in future training data, which means future outputs sound even more AI-ish. The result is a slow drift toward a monoculture of sameness in content.
“The more AI-generated content gets published without expert editing, the more of that AI-ish tone feeds back into future training data. We’re looking at a compounding monoculture of sameness — and voice and tone are the first casualties.”
3. Being Unreliable
You uploaded your voice and tone guide and ChatGPT produced some decent copy last Tuesday? Sometimes it works remarkably well. Other times, it's completely off — and there's no reliable way to predict which one you'll get.
For non-writers, this means potentially relying on something fundamentally inconsistent without realizing it, because the nuances that make copy sound "off" are often very subtle. Every single output therefore needs expert review. Otherwise, you risk publishing text that undermines your brand voice rather than reinforcing it.
For companies hoping that "anyone can create professional copy with AI," this is probably the hardest truth to accept: without someone trained specifically in voice and tone reviewing and adjusting the output, you're not working with a reliable system. You're buying lottery tickets.
4. Having Weak Memory
Closely related to unreliability is another issue: ChatGPT doesn't reliably retain training or instructions across sessions. Even in Projects mode, corrections and feedback are frequently forgotten after a few hours, and rules get applied inconsistently. Custom GPTs and standard chat sessions don't maintain any memory across sessions by default, so you have to manually reinsert all instructions every time you want consistent results.
Training ChatGPT is therefore nothing like training a team member. It's more like writing reminders on sticky notes and hoping the right one catches your eye at the right moment. Consistent quality is always at risk.
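In practice, teams that automate copy generation work around this weak memory by re-sending the full voice and tone briefing with every single request. Here is a minimal sketch of that pattern; the helper name, the guide text, and the wording of the system message are illustrative assumptions, not part of any official API.

```python
# Because the model keeps no memory between sessions, every request has
# to carry the full voice-and-tone briefing again. This helper simply
# prepends the guide as a system message so nothing depends on the model
# "remembering" earlier instructions.
def build_briefed_messages(style_guide: str, user_prompt: str) -> list[dict]:
    return [
        {"role": "system",
         "content": f"Follow this voice and tone guide:\n{style_guide}"},
        {"role": "user", "content": user_prompt},
    ]

# Usage with the OpenAI Python SDK (sketch; requires an API key):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o",
#     messages=build_briefed_messages(GUIDE, "Write a welcome email subject line."),
# )
```

The point of the sketch is the workflow, not the code: the briefing lives in your version-controlled guide, and the tool receives it fresh every time, because relying on session memory is exactly what makes output drift.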
5. Knowing Little About Accessibility and Inclusive Language
There are also more structurally serious flaws to consider. AI is not reliably trained on accessibility standards or inclusive language guidelines. Those topics may appear somewhere in the training data, but the majority of what ChatGPT learned from is simply what exists on the internet — and a significant portion of that content is flawed when it comes to accessibility and inclusivity.
This can result in copy that doesn't work for screen readers, language that unintentionally excludes or alienates part of the audience, a tone that fails to meet trauma-informed or culturally sensitive standards, or humor that's inappropriate for the context.
The risk here is not just editorial. It is also legal, ethical, and reputational.
6. The Risk of Superficial Expertise
When you combine all of the above, you arrive at a deeper danger: the illusion of competence. Because ChatGPT produces text, many people assume the task is complete. The gap between "some text" and "good text", however, is significant.
Every day, LinkedIn fills with posts that are clearly generated by an LLM. They feel copy-pasted and generic because they share the same voice and tone patterns, including:
Anaphora: "It's edgy. It's groundbreaking. It's the new standard."
Binary constructs: "It's not only edgy. It's groundbreaking."
Strongly shortened syntax: "The idea? Groundbreaking."
Lack of punctuation diversity: Heavy reliance on full stops and em-dashes, minimal variation.
Overuse of gerunds: "Groundbreaking ideas have real impact on your business, increasing your revenue one step at a time."
Overly figurative language: "That's why mediocre ideas won't cut it."
Unnatural idioms: "But here's the kicker:"
The default tone ChatGPT uses for marketing copy is designed to feel bold, confident, and attention-grabbing. But how attention-grabbing can a text actually be when it reads like 60% of everything else out there? An author who isn't familiar with these patterns might not even notice. And while the same is true for some readers, awareness is growing — particularly in marketing and communications contexts where people read a lot.
“LLMs like ChatGPT produce copy with recognizable stylistic fingerprints: anaphora, binary constructs, gerund-heavy sentences, and a default boldness that reduces brand distinctiveness at scale.”
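Some of the patterns listed above are mechanical enough to flag automatically. The toy heuristic below sketches that idea; the regexes and the pattern names are illustrative assumptions, not a validated detector, and real AI-text detection is far harder than simple pattern matching.

```python
import re

# Toy regexes for a few of the stylistic fingerprints listed above.
# These are illustrative heuristics only, with plenty of false
# positives and negatives; they are not a real AI-text detector.
AI_ISH_PATTERNS = {
    # Binary constructs: "It's not (only) a. It's b."
    "binary_construct": re.compile(
        r"\b(?:It'?s|This is) not(?: only| just)?\b[^.!?]*[.!?]\s*(?:It'?s|This is)\b",
        re.IGNORECASE,
    ),
    # Strongly shortened syntax: "The idea? Groundbreaking."
    "fragment_question": re.compile(r"\b\w+\?\s+[A-Z]\w+\."),
    # Stock idioms that LLM marketing copy overuses.
    "stock_idiom": re.compile(r"here'?s the kicker", re.IGNORECASE),
}

def flag_ai_ish(text: str) -> list[str]:
    """Return the names of the patterns that match the given text."""
    return [name for name, pat in AI_ISH_PATTERNS.items() if pat.search(text)]
```

A hit from a heuristic like this doesn't prove a text is AI-generated, and a clean result doesn't prove it's human. It simply shows how recognizable and machine-matchable these default patterns have become, which is exactly the distinctiveness problem.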
Why This Still Matters in 2026
The research landscape has continued to develop. A 2025 PNAS study found that value-aligned LLMs still exhibit widespread stereotype biases across race, gender, religion, and health categories — even when those models pass explicit bias benchmarks. A separate study published in PNAS in 2025 found that LLMs actively favor content written by other LLMs over content written by humans, raising new concerns about what this means for human writers competing in AI-influenced environments.
These findings matter for brand voice and tone. If an LLM subtly favors certain cultural registers, certain sentence structures, and certain types of confidence — and if human audiences are increasingly primed by AI-generated content to expect those patterns — the case for expert human oversight in writing becomes stronger, not weaker.
What AI Can and Cannot Do for Writing
Does all of this mean AI has no place in writing work? No. It can support expert writers, enrich their workflows, and open up new ways of working with language. What it doesn't mean is that the process automatically becomes faster — because effective use of AI tools for writing requires intense briefing, training, prompting, and reviewing. But used well, it can genuinely improve both the writing experience and the final output.
Having a professional writer in the loop, however, remains a non-negotiable. Even after reading this post, a non-writer will not be able to identify all the flaws an AI-generated text carries, nor correct them — especially for more complex aspects like the right type and amount of humor, cultural sensitivity, inclusivity, or subtle brand alignment.
The calculation is simple: having non-writers produce copy with AI tools might save some time and budget in the short term. But the risk to brand voice, audience trust, and long-term reputation is real.
Key Takeaways
LLMs like ChatGPT are trained on biased, WEIRD-skewed data that shapes their default voice and tone in ways most users don't recognize.
AI tools are inconsistent: a prompt that works today may produce off-brand results tomorrow, with no warning.
ChatGPT doesn't retain training across sessions, making reliable brand voice consistency structurally difficult.
AI is not reliably trained on accessibility or inclusive language standards.
AI-generated content carries recognizable stylistic patterns that, at scale, reduce brand distinctiveness.
Research from 2025 confirms that bias in LLMs is structural, not incidental — and is unlikely to disappear through prompt engineering alone.
Frequently Asked Questions
Can AI tools like ChatGPT accurately replicate a brand's voice and tone?
Not reliably. While ChatGPT can produce stylistically consistent copy in isolated cases, it lacks the memory and contextual judgment needed to maintain a brand's voice and tone consistently across time, channels, and audiences. Expert human review is always necessary.
Why do AI-generated texts often sound similar to each other?
LLMs are trained on large datasets that center certain communication patterns — particularly in marketing copy. These patterns (anaphora, binary constructs, heavy em-dash usage) get amplified because they appear frequently in training data. The result is a recognizable default tone that reduces brand distinctiveness.
Is AI bias in writing actually a problem for UX Writing?
Yes. Studies show that LLMs are trained on WEIRD-biased data, meaning their default outputs favor communication styles rooted in Western, English-speaking cultural norms. For UX Writers working with global or diverse audiences, this creates a real risk of misalignment between the AI's output and the target audience's expectations.
Can you "train away" the AI's default tone with a good voice and tone guide?
Partially. A well-structured voice and tone guide can improve output quality, but it cannot override the model's training-level tendencies. Gaps in the guide are filled by the model's defaults, and those defaults tend to resurface — especially in longer or more complex writing tasks.
Do professional writers still need to be involved when using AI for copy?
Yes. The subtleties of brand voice, cultural sensitivity, accessibility, and inclusive language require trained human judgment. AI can support the writing process, but it cannot replace the expertise needed to evaluate and refine copy at a professional level.
Are the flaws of LLMs in writing getting better over time?
Some aspects are improving, but structural issues like training bias and inconsistent memory are not fully resolved. Research from 2025 confirms that bias in LLMs persists even in value-aligned models. Users should approach AI writing tools with realistic expectations and maintain expert oversight.
Learn more at writewithdrkat.com | The UX Writing School | YouTube