It's not that hard to dominate bots. I do it for fun, I do it for profit. Block datacenters. Run bot motels. Poison them. Lie to them. Make them have really really bad luck. Change the cost equation so that it costs them more than it costs you.
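For "block datacenters", a rough sketch of the check, using only the Python standard library. The two networks below are documentation placeholders (RFC 5737), not real datacenter ranges, and `is_datacenter` is a made-up helper name; you'd feed it whatever range list you actually trust.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks). A real deployment
# would load these from a datacenter/ASN list of your own choosing.
DATACENTER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_datacenter(addr: str) -> bool:
    """Return True if addr falls inside any of the listed networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DATACENTER_NETWORKS)
```

What you do with a hit (drop it, tarpit it, feed it garbage) is the part that changes the cost equation.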
You're thinking of it wrong. The seeds of the thinking error are here: "I wonder how soon it becomes actually infeasible to operate a website with actual original content".
Bots want original content, no? So what's the problem with giving it to them? But that's the issue, isn't it? Clearly, in context, what you should be saying is "I wonder how soon it becomes actually infeasible to operate a website for actual organic users" or something like that. But phrased that way, I'm not sure a CDN helps. I'm not convinced CDNs don't produce false positives that interfere with organic traffic when they intermediate. A lot of it is security theater, because hangings and executions look good: look at the numbers of enemy dead.
Take measures that any damn fool (or at least your desired audience) can recognize.
Reading for comprehension, I think Rachel understands this.
The easy way is to implement, e.g., a 4xx handler that serves content whose links generate further 4xx errors, and to rewrite the status code to something like 200 before it goes back to the requester. Load the garbage pages up with... garbage.
Since this is getting upvoted, I will put forth a suggestion I've made to the people who've paid me to help with this sort of subterfuge: turn your 404 handler into search. Then a human who goes there has a way out. But absolutely, load it up with garbage and broken links.
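A minimal sketch of that handler as a plain WSGI app, assuming nothing beyond the standard library. `KNOWN_PAGES`, `trap_page`, and the `/search` route are all made up for illustration; the point is only that a missing path gets a 200, a search box for humans, and links that lead to more missing paths.

```python
import html
import random
import string

KNOWN_PAGES = {"/": "<h1>Home</h1>"}  # hypothetical stand-in for real routing


def garbage_word(rng, length=8):
    """Random lowercase junk, used for both link paths and link text."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def trap_page(path, rng):
    """Body served instead of a plain 404: a search form plus broken links."""
    links = "".join(
        f'<li><a href="/{garbage_word(rng)}/{garbage_word(rng)}">'
        f"{garbage_word(rng)}</a></li>"
        for _ in range(5)
    )
    return (
        "<h1>Not found</h1>"
        f"<p>Nothing at {html.escape(path)}. Try a search:</p>"
        '<form action="/search"><input name="q"><button>Search</button></form>'
        f"<ul>{links}</ul>"
    )


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    body = KNOWN_PAGES.get(path)
    if body is None:
        # Missing page: build the trap and deliberately report success.
        # Seeding on the path keeps each trap page stable across visits.
        body = trap_page(path, random.Random(path))
    data = body.encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "text/html; charset=utf-8"),
         ("Content-Length", str(len(data)))],
    )
    return [data]
```

Every garbage link is itself a missing path, so it lands back in the same handler. You can try it locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.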
Once in a while people pay you to do something you enjoy doing, like making people cry and wish they had a job flipping burgers instead. But I do it on my own systems for fun, honestly.
The idea is that bots are inflexible about deviations from accepted norms and can't actually "see" rendered browser content. So your generic 404 and 403 error pages return a 200 status instead, with invisible links to other inaccessible pages. The bots will follow the links but real users will not, which traps the bots in a kind of isolated labyrinth of recursive links (the URLs should be slightly different each time, though). It's basically how a lobster trap works, if you want a visual metaphor.
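One way the invisible, slightly-different links could be generated, as a sketch. The `/archive/` prefix and the `hidden_links` name are invented here, and `display:none` is just one assumed way of hiding a link from a rendered page; a bot that renders CSS would not fall for it.

```python
import hashlib


def hidden_links(current_path: str, count: int = 3) -> str:
    """Return anchors that a rendered page won't show.

    Each href is derived from the current path, so every trap page
    points at different URLs and the crawl never closes on itself.
    """
    anchors = []
    for i in range(count):
        digest = hashlib.sha256(f"{current_path}#{i}".encode()).hexdigest()[:10]
        # display:none keeps the link out of view and out of tab order,
        # but it is still sitting in the raw HTML for a scraper to find.
        anchors.append(
            f'<a href="/archive/{digest}" style="display:none">more</a>'
        )
    return "".join(anchors)
```

Append the result to the body of each fake-200 error page. Because the hrefs are hashed from the path, the same page always emits the same links, while the next page down emits new ones.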
The important part here is to do this chaotically. The worst sites to scrape are buggy ones. You are, in essence, deliberately following bad practices in a way real users wouldn't notice but that still affects bots.
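To make "chaotically" concrete, here is a sketch that picks a different misbehavior per trap response. The four quirks and the `chaotic_response` name are my own assumptions, not a known recipe, and this should only ever run on trap pages that real users never reach.

```python
import random


def chaotic_response(body: bytes, path: str, rng: random.Random):
    """Return (status, headers, body, delay_seconds) for a trap page."""
    quirk = rng.choice(["slow", "truncated", "redirect", "plain"])
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if quirk == "slow":
        # The caller is expected to sleep this long before answering.
        return "200 OK", headers, body, rng.uniform(2.0, 10.0)
    if quirk == "truncated":
        # Cut the page off at a random point, like a buggy backend would.
        cut = rng.randint(0, len(body))
        return "200 OK", headers, body[:cut], 0.0
    if quirk == "redirect":
        # Bounce to a deeper, equally nonexistent URL.
        target = f"{path.rstrip('/')}/{rng.randint(1, 9999)}"
        return "302 Found", headers + [("Location", target)], b"", 0.0
    return "200 OK", headers, body, 0.0
```

Pass in `random.Random()` for real unpredictability, or a seeded one while you test it.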