AI Firms Are Getting Around Web Protocol That Lets Publishers Block Scraping (mediapost.com)
The debate over content scraping took a new turn on Friday when TollBit, a content licensing startup, alleged that artificial intelligence companies are bypassing a web standard used by publishers to block scraping, Reuters reports.
The web standard is the Robots Exclusion Protocol, or robots.txt, which was created in the 1990s to prevent websites from being overwhelmed by web crawlers, according to Reuters.
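For illustration, here is a minimal sketch of how the protocol works, using Python's standard urllib.robotparser to evaluate a hypothetical publisher's robots.txt. The user-agent names and URL below are assumptions for the example, not details from the article, and the key point is that compliance is voluntary: nothing in the protocol itself enforces the answer a crawler gets back.

from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve to turn away AI crawlers
# (agent names are illustrative, not taken from the article).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler checks can_fetch() before requesting a page;
# a non-compliant one simply ignores the file and fetches anyway.
for agent in ("GPTBot", "PerplexityBot", "SomeOtherBot"):
    print(agent, parser.can_fetch(agent, "https://example.com/article"))

Running this prints False for the two blocked agents and True for the generic one, which is the entire extent of the "blocking" the protocol provides.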
A Wired investigation alleges that AI search startup Perplexity likely bypassed efforts to block its web crawler via robots.txt, Reuters continues, and TollBit says its analytics show that numerous AI agents are bypassing the protocol.
more at: https://www.mediapost.com/publications/article/397097/ai-firms-are-getting-around-web-protocol-that-lets.html