A well-written scraper would check the image against a CLIP model or other capti...

Simran-B · 2025-07-12T07:42:19 1752306139

Then captions that are somewhat believable? "Abstract digital art piece by F. U. Botts resembling wide landscapes in vibrant colors"

Someone · 2025-07-12T11:42:35 1752320555

Do scrapers actually do such things on every page they download? I can see them sampling a small fraction of a site to check how trustworthy it is, but I would think they’d rather scrape many more pages than spend resources running such checks on every page.

Or is the internet so full of garbage nowadays that it is necessary to do that on every page?
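
To make the sampling idea concrete, here is a rough sketch of what I have in mind. It is not how any real scraper is known to work. `estimate_site_trust`, `check_page`, `sample_fraction` and `min_samples` are all made-up names, and `check_page` stands in for whatever expensive check (a CLIP comparison, say) you would not want to run on every page.

```python
import random
from typing import Callable, Sequence


def estimate_site_trust(
    pages: Sequence[str],
    check_page: Callable[[str], bool],
    sample_fraction: float = 0.05,
    min_samples: int = 1,
    seed: int = 0,
) -> float:
    """Return the fraction of a random sample of pages that pass check_page.

    check_page is a hypothetical, expensive per-page check supplied by the
    caller; only the sampled pages are ever passed to it.
    """
    if not pages:
        raise ValueError("no pages to sample")

    # Sample a fixed fraction of the site, but at least min_samples pages
    # and never more pages than the site actually has.
    k = max(min_samples, int(len(pages) * sample_fraction))
    k = min(k, len(pages))

    # A seeded generator keeps the sample reproducible between runs.
    rng = random.Random(seed)
    sample = rng.sample(list(pages), k)

    passed = sum(1 for page in sample if check_page(page))
    return passed / k
```

With 1,000 pages and the default `sample_fraction`, the expensive check runs 50 times instead of 1,000, and the scraper could decide from the returned score whether the rest of the site is worth downloading at all.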

vintermann · 2025-07-13T09:22:23 1752398543

Ain't nobody got the processing time for that! Scraping is about more, more, more. If they do any filtering it'll be afterwards.