Do you regularly find text content that you know is AI-written (but is not marked as such)? Because honestly I don't, and it must exist in decent quantity by now. Or perhaps it's still sparse?
Have a look here [1] and here [2] - I think they are good resources, but fallible in the long run. To answer the question: yes, I do, and it's often confirmed by talking to people I know (i.e. I suspect they have used AI to make something -> I ask). This falls victim to confirmation bias, though. I suspect a nontrivial amount of the writing I read is AI-generated without me realising, and I'm also wary of falsely flagging as AI-generated content that is actually from humans.
- Other source-to-text integrity issues; for example, the WWF source says very little about Malaysia specifically, only mentions Sunda tigers (Panthera tigris sondaica), and does not mention tapirs at all
- Very short yet consistent paragraph length
- Generic "see also" links, one of which is redlinked
This is not the sort of thing that I pay attention to unless I'm doing detailed research. And even then I'd probably have a bot check these for me, ironically, since it's such a mechanical job. At the very least detecting AI like this requires conscious effort.
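For what it's worth, the "very short yet consistent paragraph length" item is the kind of check a script can do. Here's a rough sketch in Python. The function names and both thresholds (`max_mean_words`, `max_cv`) are my own assumptions, not anything from the linked resources, and a hit only means "worth a closer look", not "AI-written".

```python
import re
import statistics


def paragraph_length_stats(text: str) -> dict:
    """Return per-paragraph word counts, their mean, and their coefficient of variation."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    counts = [len(p.split()) for p in paragraphs]
    mean = statistics.mean(counts) if counts else 0.0
    spread = statistics.pstdev(counts) if counts else 0.0
    cv = spread / mean if mean else 0.0
    return {"counts": counts, "mean": mean, "cv": cv}


def looks_short_and_uniform(text: str, max_mean_words: float = 60, max_cv: float = 0.25) -> bool:
    """Flag text whose paragraphs are both short on average and similar in length.

    The thresholds are arbitrary placeholders; tune them on text you trust.
    """
    stats = paragraph_length_stats(text)
    enough_paragraphs = len(stats["counts"]) >= 3
    return enough_paragraphs and stats["mean"] <= max_mean_words and stats["cv"] <= max_cv
```

Plenty of human-written pages will trip this too, so it can only narrow down what to read closely.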
I think the second resource that you linked to is valuable. The first is useless unless you're a Wikipedia editor, the significance of verifying citations notwithstanding.
The gap between LLM-generated writing and the composite style of the average Wikipedia page is narrower than most people may believe.
You will start to recognize it over time. The major AI models each have their own voice and patterns that they overuse.
The more you see those patterns the more you start recognizing them. By now I can recognize quickly if a blog post or README.md was generated by Claude or ChatGPT because the signs are so obvious.
Even Hacker News comments that are AI-written are easy to spot if they weren't edited. I know I'm not alone because when I recognize an AI comment I check the commenter's history and find other people calling out their AI-generated submissions, too.
Learning how to recognize the output of the popular AI models is becoming a critical business skill, too. You need to be able to separate out the content from someone who was doing real work that you should take seriously as opposed to the output of someone who is having ChatGPT produce volumes of text that they don't review. The people who do that will waste your time.
I don't see how to interpret your claims. How do you yourself know that you're right when you "recognize" Claude or ChatGPT? How do you know how much of the text you don't recognize as any LLM is actually LLM-generated? My recollection is whenever I've seen data on this--the educators who think they can spot students cheating--the conclusion is people are really bad at identifying LLM-generated content.
I'm not claiming to be able to spot 100% of LLM-written output.
However, the default tone and output style of Claude and ChatGPT are very obvious.
> My recollection is whenever I've seen data on this--the educators who think they can spot students cheating--the conclusion is people are really bad at identifying LLM-generated content.
If you can share that data we can discuss it, but there's nothing really to discuss here without a source.
It’s very obvious if you leave the default tone. If you specifically ask it to hide its AI voice and make it appear human, it does a really good job. Even better if you give it an example of the writing style.
Ask it to write in the style of patio11 or someone else with a distinctive tone, and it will do a remarkable job.
It will pass pretty consistently. Not sure I love it.
This is a temporary problem. Look at how fast things are progressing. Things will improve until none of this matters because the output is indistinguishable.
Yes, often, and when I point it out here on HN or on Substack, it usually doesn't lead to anything good. Many don't recognize it, many do, the author gets defensive, etc.
This article doesn't have the tells, it looks human written.
For example, the first front-page post I read just now (I haven't checked others) was, I'm fairly sure, written with the use of AI (based on a human draft, I would guess): https://news.ycombinator.com/item?id=47566442
I can't prove it but I'm comfortable enough in my judgment to say it.
I've found that many people don't have a radar for this. They may know about "delve", em dashes, "tapestry", "multifaceted" or "not just X but Y", and if these are not there they don't see it.
They probably don't care enough to notice the tells. I think that it's generally those ambivalent, skeptical or opposed to AI who notice, while those who wholeheartedly support AI see no reason to differentiate between it and humans and so do not even try to.
I don't think it's that simple. I'm not blanket opposed to it. I'm more along the lines of the author of the article: use it for what it's good at (sifting through unstructured info, converting information from one format to another, implementing things that are planned out well with iterations and feedback, etc.) and generally map out its capabilities.
I think those who are very opposed to AI often don't know much about the real limitations since they don't use it, and their complaints are often a year or more out of date.
I think the ideal demographic for spotting these is people who use the frontier LLMs a lot and who have also worked with text in detail: copywriters, people who have learned foreign languages and grammar, people who have edited articles for language. They generally take a more "wordsmith" look at language and are sensitive to its flow and rhythm on a more technical level.