If you solve this CAPTCHA, you are not welcome here

About this page

This page shows a bunch of annoying dialogs to anyone who has JavaScript enabled.

The checks can be bypassed by hitting the Escape key five times. There are a few other ways to get through the checks, which I’ll leave undocumented.

An inverse CAPTCHA

It’s been the case for a few years now that bots are better at solving CAPTCHAs than humans are.

That being the case, we are approaching the point where any client that bothers to solve a CAPTCHA is more likely to be a bot than a human.^[1]
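
As a rough illustration of that claim, here is a minimal sketch of the Bayes calculation behind it. The solve rates and the share of bot traffic are made-up numbers chosen for illustration, not measurements, and the function name is hypothetical.

```python
def p_bot_given_solved(bot_share: float,
                       bot_solve_rate: float,
                       human_solve_rate: float) -> float:
    """Probability that a client is a bot, given that it solved the CAPTCHA."""
    # Bayes' rule: weigh each group's solve rate by its share of traffic.
    bots = bot_share * bot_solve_rate
    humans = (1.0 - bot_share) * human_solve_rate
    return bots / (bots + humans)


# Assumed inputs: bots solve 95% of challenges, humans solve 70%.
for bot_share in (0.2, 0.5, 0.8):
    p = p_bot_given_solved(bot_share, bot_solve_rate=0.95, human_solve_rate=0.70)
    print(f"bot share {bot_share:.0%}: P(bot | solved) = {p:.2f}")
```

With these assumed rates, a solver is more likely a bot than a human only once bots make up a large enough share of the traffic, which is the base rate point made in the footnote.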

This page explores that idea.

Passing the gauntlet of prompts that pop up on this page results in being locked out. Only those who attempt to exit the process, reject all of the cookies, hit escape, or click away from the incessant dialogs are able to get through.

How this page works

This is a normal blog entry; it just loads a script that does a bunch of annoying stuff.

The script doesn’t really do any of the things it claims to do. These are strictly fake versions of some of the more annoying junk that makes the Web unpleasant to use: CAPTCHAs, cookie banners, anti-fraud interstitials, sign-up forms, and so forth.

All this content is added by that script.

The trap is the point

Anyone who successfully “solves” this CAPTCHA gets to visit a page (here, it’s safe to visit, I promise) that tells them they have failed the test.

This is a trap. The script saves a value to your browser. Any future visits to this page will be immediately forwarded to the rejection page.

The rejection page does give you a way to reset this state, because to do otherwise seemed too much like a middle finger to visitors.

What about the bots?

This page is not really for bots; it’s just a demonstration. Well, that and a chance to build some really annoying stuff. Many bots won’t even see the nonsense that I made because they don’t bother running scripts. This page is mostly just an experiment that explores the shape that a bot trap might take.

It would not be particularly hard to make a real bot trap. The trick is to reduce the number of humans it catches.

Many bots will solve CAPTCHAs to get to where they want to go. A non-AI bot might need to be taught about new techniques, but AI tends to just do whatever is necessary to accomplish its task.

If the site is built so that “passing” through all the checks puts the bot in a trap, this creates the right sort of incentives.

A trap doesn’t need to lead to a total lock-out. It probably shouldn’t, either, because having the bot know that it is trapped doesn’t help. Putting the bot in a tarpit, serving garbage, or providing some lower class of service ensures that you can seamlessly provide distinct handling for bots.

Humans will get trapped too

That humans will end up caught out is the main reason an inverse CAPTCHA probably isn’t a great idea for most sites.

Some people will attempt to solve these increasingly annoying puzzles too. Those who do will end up in the same traps as the bots.

So we aren’t yet at the point where we can completely invert CAPTCHAs.

CAPTCHAs are user hostile

One reason I built this tool was to highlight just how much typical abuse mitigations – whether it be CAPTCHA, proof of work, or similar (along with cookie banners) – degrade user experience.

CAPTCHA is annoying for everyone. CAPTCHA excludes many people. Besides, CAPTCHA doesn’t work.

Sites that set such demeaning tasks make it very clear that they think very little of the people who visit them.

So why are CAPTCHAs still common?

Everyone knows that a determined burglar is not slowed much by a locked door. Still, we take the time to lock up anyway.

Slowing an adversary down can be the whole point. You set up a simple cost-benefit equation for them. If what your security protects is not worth the cost of getting around those protections, that might be enough to encourage the adversary to move on.

In other words, websites: if your CAPTCHA is working, think carefully about what that says about the value of the stuff it protects.

Either that, or it could mean that the papier-mâché lock – while not an effective lock – is an effective signal nonetheless.

A sign on the door

Bots could choose to interpret the presence of a CAPTCHA as something more like a polite request to stay out.

After all, solving CAPTCHA isn’t that hard. I mean, bot solving is a service that is widely advertised for 1 or 2 dollars per thousand uses (with 100% reliability). A general purpose model is probably not cheaper, but any model would be capable of solving novel challenges more reliably than a human.

Any bot that stays out is therefore either respecting a request to stay out or deciding that the cost-benefit balance leans toward not solving the challenge.

It certainly could be the case that 0.1c per site is too high a cost. But it seems like that is a cost that can be amortized over multiple interactions if cost were the main reason. After all, humans that solve CAPTCHA get a cookie. The bot could do the same.
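
The arithmetic is simple enough to sketch. The 1 to 2 dollars per thousand solves comes from the advertised prices mentioned above; the number of requests a single solve covers is an assumed value for illustration, and the function name is hypothetical.

```python
def cost_per_request_cents(dollars_per_thousand: float,
                           requests_per_solve: int) -> float:
    """Cost in cents of each request when one solved CAPTCHA covers several."""
    cents_per_solve = dollars_per_thousand * 100 / 1000
    return cents_per_solve / requests_per_solve


# One solve per request versus one solve reused (via a cookie) for 100 requests.
for price in (1.0, 2.0):
    single = cost_per_request_cents(price, requests_per_solve=1)
    reused = cost_per_request_cents(price, requests_per_solve=100)
    print(f"${price:.0f}/1000 solves: {single:.3f}c alone, {reused:.4f}c amortized")
```

Under that assumption, holding on to the cookie cuts the per-request cost by two orders of magnitude, so cost alone is a weak explanation for a bot declining to solve.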

Maybe, in the near term, we should regard the effect of CAPTCHA as the moral equivalent of a “keep off the grass” sign.

Could asking nicely work?

That suggests that having a way to ask bots to behave, without annoying your human visitors, could help.

robots.txt is founded on the same basic idea. It is the simple ask that precedes more drastic action.

The encouraging part is that, for many years, asking nicely was largely enough. My theory is that this was because there was mutual benefit in something like search indexing.

It was also because crawlers ignored the requests in robots.txt at their peril. If their bots were to act badly, sites could move to block their IP addresses and cut off their access.

This implied threat no longer works especially well. For one, search visibility is too valuable to risk and sites are unable to block crawlers if that risks cutting them off from search indexes. At the same time, it has become easier to avoid consequences by moving cloud hosts or pretending to be a more reputable bot.

That showed robots.txt to be a poor shape for managing the new suite of problems. It could never deal with outright abuse, and the simple Allow/Disallow it offers is not enough to deal with the complexities of the new reality.
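
For a sense of how little robots.txt can express, here is a minimal sketch using Python’s standard urllib.robotparser module. The bot names, paths, and example.com URLs are hypothetical. Note that it is the crawler that runs this check, voluntarily; nothing here stops a bot that skips it.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: one named bot is asked to stay out entirely,
# and everyone else is asked to avoid a single directory.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching and respects the answer.
for agent, url in [
    ("ExampleBot", "https://example.com/post"),
    ("OtherBot", "https://example.com/post"),
    ("OtherBot", "https://example.com/private/notes"),
]:
    verdict = "allowed" if parser.can_fetch(agent, url) else "asked to stay out"
    print(f"{agent} {url}: {verdict}")
```

The only answer available is yes or no per path and per user agent, with no way to say what the content may be used for.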

The basic idea of robots.txt, that asking nicely can be effective, still potentially contributes to improving the situation for sites. The work we’re doing in the IETF on AI preferences is grounded on that idea.

When asking fails, what then?

There will always be bots that fail to respect such requests. Positively distinguishing a human from a bot that wants to evade detection, and any reprisal that might follow, is not really viable today.

It’s worth considering though: how much of that evasion is driven by the fear of reprisal?

The challenge is that the volume of bot traffic is huge. Even if most bots respect polite requests to follow behavioural norms, there are enough that don’t to leave sites with a huge problem.

Things will improve

The knowledge that many bots are well-behaved isn’t much comfort to a site that has to deal with a flood of bot-driven abuse.

Nor can we really ask sites to wait for solutions that are still in development. PACT could eventually provide sites with useful tools for managing the bot flood.

PACT offers sites a way to move to rate limits as their primary defense. All visitors get the same class of service as a human up to some limit, after which they get cut off. Evading detection does the bot little good, because bots tend to rely on being able to make many more requests than a human.
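
To show what rate limiting as a primary defense looks like in general, here is a minimal sketch of a fixed-window limiter. This is not PACT, and it says nothing about how PACT identifies clients; the class name, client labels, and limits are all assumptions made for illustration.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Allow each client at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        # Request counts keyed by (client, window index). A real deployment
        # would also expire old windows; this sketch never does.
        self._counts = defaultdict(int)

    def allow(self, client: str, now: float) -> bool:
        key = (client, int(now // self.window))
        if self._counts[key] >= self.limit:
            return False  # over the limit: cut off until the next window
        self._counts[key] += 1
        return True


# Everyone gets the same treatment until the limit is reached.
limiter = FixedWindowLimiter(limit=3, window=60.0)
results = [limiter.allow("client-a", now=t) for t in (0.0, 1.0, 2.0, 3.0)]
print(results)
```

A human-scale visitor rarely reaches the limit, while a client that needs many more requests gets cut off whether or not it looks like a bot.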

Bots might offer voluntary self-identification in hopes of persuading the site to loosen its constraints. That has the potential to generate positive feedback. As more bots provide identification, more sites see benefits, and the more sites that have accommodations for bots, the more reason there is for bots to identify themselves.

All these efforts point the same way. Cooperation between sites, users, and other good actors helps keep the system functioning despite increasing involvement from selfishly-motivated bots.

The ride will be rough for some time yet

Of course, all of that is a poor consolation for a site suffering under bot abuse today. That’s why the rest of us are likely to suffer as well for a while yet.

Just know that the problem has reached a critical threshold. The wheels are turning, so that some day soon the web could start being a whole lot less annoying.

Other than this page, of course. I reserve the right to make this page more annoying (suggestions in that direction are welcome).

Either way, no tears will be shed as the last CAPTCHAs disappear from the web.

[1] The base rate fallacy suggests that this becomes more true as the number of bots begins to exceed the number of humans, but you get the idea.