AI Firms Are Getting Around Web Protocol That Lets Publishers Block Scraping (mediapost.com)

The debate over content scraping took a new turn on Friday when TollBit, a content licensing startup, alleged that artificial intelligence companies are bypassing a web standard used by publishers to block scraping, Reuters reports.

The web standard is the Robots Exclusion Protocol, or robots.txt, which was created in the 1990s to prevent websites from being overwhelmed by web crawlers, according to Reuters.
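As a minimal sketch of how the protocol works: a site publishes a robots.txt file listing which user agents may fetch which paths, and a well-behaved crawler checks those rules before fetching. The crawler name `ExampleAIBot` and the URLs below are hypothetical; Python's standard `urllib.robotparser` is used to evaluate the rules. Note that compliance is entirely voluntary, which is why bypassing is possible in the first place.

```python
from urllib import robotparser

# A hypothetical robots.txt that blocks one AI crawler while allowing others.
rules = [
    "User-agent: ExampleAIBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# A compliant crawler would call can_fetch() before requesting a page.
print(rp.can_fetch("ExampleAIBot", "https://example.com/article"))  # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/article"))  # True
```

Nothing in the protocol enforces these answers; a crawler that ignores the check (or identifies itself under a different user-agent string) simply fetches the page anyway.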

A Wired investigation alleges that AI search startup Perplexity likely bypassed publishers' attempts to block its web crawler via robots.txt, Reuters continues. And TollBit says its analytics show that numerous AI agents are bypassing the robots.txt protocol.
More at: https://www.mediapost.com/publications/article/397097/ai-firms-are-getting-around-web-protocol-that-lets.html
