Hacker News

Comment by ryan_n | original | Monetization Gateway: Charge for any resource behind Cloudflare via x402
[−]ryan_n · 2026-07-01 Wed 17:29 UTC · link
Is there actually a reliable way to differentiate human from bot?
[−]mpeg · 2026-07-01 Wed 18:04 UTC · link
There are reliable ways of differentiating human from cheap, bulk scraping bots.

But if the bot is advanced or expensive enough, it gets a lot harder. This product's market is in offering a paid way to access content, as opposed to having to spin up bots that run JS from real IP addresses, etc., all of which is more expensive.

[−]xur17 · 2026-07-01 Wed 18:27 UTC · link
Agreed. To me this feels like the perfect solution for websites and AI crawlers. Instead of crawlers paying proxy services and captcha solvers, they can pay the website itself. As a web scraper, I'd happily pay the website provider if it meant easy access to the content. Heck, as a human, I'd pay to avoid the dumb captchas.
[−]cphoover · 2026-07-01 Wed 20:14 UTC · link
As I understand it, as the models driving agent behavior in headless browsers get more and more sophisticated, it's getting harder to reliably predict whether a visitor is a human or a bot.

The same way output from LLMs without watermarking cannot be reliably classified as "not-human", neural-network-driven scraping tools are getting harder to detect.

Cloudflare and DataDome position themselves as companies that can detect automated traffic using things like IP reputation, behavioral signals, timing... But these things can be faked through proxy networks, and human behavior signals can be imitated with generative AI the same way text can be: web bots can use neural networks to generate trajectories and timings similar to those of humans.

If you can have an AI use a browser the same way a human can, how can you distinguish the two?