Outsmarting AI

A few months ago someone made an ad with an AI-generated person with Down syndrome. The character looked into the camera and asked you, with the model's best approximation of sincerity, not to scroll past. The ad went viral, not for the cause it was nominally for, but because everyone was correctly furious about it. The thread is worth a read. The replies are unanimous.

I want to write about why that ad felt different from ten thousand other manipulative pieces of media we've all gotten used to. Because it is different. And the difference points at the thing that's actually changing about content in the AI era, which is not what most people think it is.

The detection arms race

The first thing people say, when AI content lands somewhere uncomfortable, is "we need to be able to detect this." Reasonable instinct. The detection problem has been at the center of the AI safety conversation since the GPT-2 days. There are watermarking schemes. There are classifier models. There are browser extensions and forensic tools and academic papers on stylometric drift.

None of them work, in the medium term. Not really. The detection arms race is the same as every other detection arms race in security: the side trying to evade is structurally favored, because evasion only has to win once per piece of content. The detector has to win every time. The economics of the two sides are not symmetric.
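Here is that asymmetry as back-of-envelope arithmetic. This is a sketch, not a model of any real detector: the function name and the detection rates are made up for illustration, and it assumes every evasion attempt is an independent coin flip, which real detectors and real evaders won't respect.

```python
def evasion_success_probability(detect_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries gets past the detector.

    detect_rate is the (hypothetical) probability the detector flags any single try.
    """
    if not 0.0 <= detect_rate <= 1.0:
        raise ValueError("detect_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # The detector has to win every round; the evader needs one miss.
    return 1.0 - detect_rate ** attempts


if __name__ == "__main__":
    # Illustrative numbers only: a detector that is right 99% of the time per try.
    for attempts in (1, 10, 100):
        chance = evasion_success_probability(0.99, attempts)
        print(f"{attempts:>3} tries -> {chance:.1%} chance something gets through")
```

Under those assumptions, a detector that catches 99% of individual attempts still lets something through roughly 63% of the time once the evader gets a hundred tries. Regenerating is nearly free, so a hundred tries is not a high bar.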

What this means in practice is that by the end of 2026, the average person will not be able to reliably tell AI-generated images from real ones. Some experts will. Some specialized tools will, for some classes of media, under some conditions. Most people, most of the time, looking at most pieces of content, will not.

This is not, as is often claimed, an emergency. It's a phase transition. The question stops being "can you detect it?" and starts being "now that you can't, what are you going to use to decide what to care about?"

Skin in the game becomes the watermark

There's a Taleb concept that's been floating around the corners of finance and philosophy Twitter for fifteen years now, which is the idea of "skin in the game." The argument is that the most reliable signal of whether someone is serious about a claim is whether they personally bear the cost of being wrong. The trader who shorts a stock he says is overvalued is more credible than the analyst who downgrades it in a research note.

I think we are about to discover that this principle applies to almost everything, because AI removed the cost of generating most other signals.

When generating content cost real human time, the existence of the content was already evidence of something. Someone cared enough to write a 2000-word essay. Someone cared enough to make the video. Someone cared enough to draw the picture. The artifact was a costly signal, in the technical biology sense, because the production cost itself was the proof of investment.

That signal is gone now. Or rather, it's broken. A 2000-word essay can be produced in eleven seconds for less than a cent. A video can be generated by typing one sentence. A picture costs nothing. The content is the same. The signal of investment behind it is not.

What replaces the production-cost signal is skin in the game. The artifact that says "I actually went somewhere and saw this." The artifact that says "I am betting my own reputation that this is true." The artifact that says "I have something to lose if I am wrong about this." These signals are not free to generate, because they require the person to actually have a reputation, actually go somewhere, actually risk something. They are the new costly signal. They are what the audience will increasingly look for, often without realizing that's what they are doing.
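One way to see why the old signal broke and the new one holds is to run it through Bayes' rule. This is a toy sketch: the function name, the prior, and every probability below are numbers I made up to illustrate the shape of the argument, not measurements of anything.

```python
def posterior_invested(prior: float, p_signal_if_invested: float,
                       p_signal_if_not: float) -> float:
    """Bayes' rule: probability someone really invested, given that you saw the signal."""
    numerator = prior * p_signal_if_invested
    denominator = numerator + (1.0 - prior) * p_signal_if_not
    if denominator == 0.0:
        raise ValueError("the signal cannot occur under either hypothesis")
    return numerator / denominator


# Hypothetical prior: 20% of what reaches you has a person with real stakes behind it.
PRIOR = 0.2

# A polished 2000-word essay: now about as likely with or without investment,
# so seeing it tells you nothing and the posterior equals the prior.
polished_essay = posterior_invested(PRIOR, 0.9, 0.9)

# A named person on the record with a reputation to lose: hard to produce
# without real stakes, so seeing it moves the estimate a long way.
on_the_record = posterior_invested(PRIOR, 0.6, 0.02)

if __name__ == "__main__":
    print(f"after a polished essay:     {polished_essay:.2f}")
    print(f"after an on-the-record claim: {on_the_record:.2f}")
```

The point is the ratio, not the specific values. When a signal is equally cheap for everyone, the two likelihoods match and the evidence is worth nothing. A signal is only informative to the degree that it is hard to fake.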

The Down syndrome ad, in this frame

This is why the AI-generated Down syndrome character felt different. The ad was not making a claim about a person. It was making a claim about a category of human experience that exists in the world, and it was making that claim with no one behind the claim. There was no actual person with Down syndrome being represented. There was no actual family choosing to share their story. There was no skin. The vulnerability the ad was using to move you to action was synthetic, generated to bypass the part of your brain that asks "is this real?"

This is not the same as a stock photo. A stock photo, at minimum, shows a real human who agreed to be photographed. The image is a fictionalization of their context, but the human is real, and somewhere, that human went to a shoot, signed a release, got paid. There is an actual person whose face is on the artifact.

The AI ad doesn't have that. There is no person. The vulnerability being depicted is fully synthetic. The empathy you feel when you look at the character is being extracted from you by a machine that has no relationship to the human experience it is mimicking. This is the difference. It is a real difference. It is worth being unsubtle about.

But the dog was already in the river

Here is the part where I have to be honest about my own view. The instinct to denounce the AI ad is correct, but the instinct to treat AI as the cause of the problem is wrong.

Long before the diffusion models showed up, there was the genre of video where a man in a parking lot films himself "finding" an abandoned dog, takes the dog home, gives it a bath, and posts the rescue arc on TikTok. The man, in many cases, put the dog there. The man had skin in the game in only the most degenerate sense: he was risking his reputation only with people who didn't watch enough of these videos to notice they were all the same. Within his subculture, he was lying. Within the algorithm, he was winning.

The same pattern applies to a thousand other genres. The influencer "crying" about a problem they made up. The street interview where the interviewer fed the answer. The before-and-after photo where the before was staged. The viral story that was a marketing campaign. We have been swimming in synthetic-empathy content for at least fifteen years. AI did not introduce this. AI accelerated it.

This matters because the response of "we need to ban AI-generated emotional content" is going to fail, and is going to fail in the same way that "we need to ban deceptive videos" failed. The categories don't crisply exist. The intent is mostly what matters. And intent is what regulators have always been bad at.

What to do with the gray

There is a gray area. There is always a gray area. The AI ad with the Down syndrome character is on the clearly-bad side. A children's book illustrator using diffusion to speed up their workflow is on the clearly-fine side. The middle is most of where content actually lives, and the middle is going to get bigger.

What I think the right response looks like, for the people who care about this:

Adjust your priors on cheap-to-generate content. Most of what shows up in your feed is going to be either AI-assisted or AI-generated by 2027. Discount accordingly. The default assumption should not be that someone made it. The default assumption should be that someone prompted for it. Then notice what the marginal information actually is.

Reward skin in the game when you see it. When a person actually shows up, in person, in their own face, on the record, with a reputation to lose, give that artifact more weight than the comparable AI-generated alternative. Pay them. Subscribe to them. Cite them. They are doing the costly thing in a market increasingly flooded with cheap substitutes, and the only way to keep that signal alive is to reward it.

Be willing to be fooled, and care about the marginal harm. You will be fooled. Sometimes a lot. The framing that helps most here is harm reduction, not perfect avoidance. If you got fooled into watching a fake-rescue video and you didn't donate to anything based on it, the harm to you is rounding error. If you almost got fooled into donating to a campaign based on a synthetic vulnerable person, the harm is real, and you should update your filters. The criterion isn't "did I detect it." It's "what bad thing almost happened."

Hold the line on the cases that matter. Synthetic depiction of vulnerable populations to manipulate empathy is one of the cases where the line should be loud. Not because AI did it (the staged dog-rescue video did it too) but because the category is one where the harm of being fooled is large. Some uses are clearly off the menu. Be willing to say so without softening it.

What this means for builders

If you build with AI, take this seriously. The line between "useful tool" and "manipulation engine" is mostly a function of the choices you make in your product. Most of the choices are small. A few are large.

Don't generate humans depicted with vulnerable characteristics to make a marketing claim. Don't synthesize testimonials. Don't fake a face on a fake story. Watermark when you can. Disclose when you should. Build for the reader who can't tell, because that's almost always who you are building for.

The interesting thing about the intelligence era is that the technology gives you almost arbitrary creative power and almost no friction against using it badly. The friction has to come from you. From the people on your team. From the audience that walks away when they catch you. From the slow accumulation of trust that doesn't survive much manipulation before it's gone.

The detection arms race is not the war. The trust market is the war. You can win it by being the kind of operator who doesn't have to detect everything, because you've stopped trusting things that don't show their skin.