Reddit Toughens Rules to Block AI Data Scraping

On Tuesday, social media giant Reddit announced plans to update the web standard it uses to block automated data scraping of its platform. The decision follows reports that artificial intelligence startups have been bypassing the rule to harvest content.

This development occurs amid growing concerns regarding AI companies allegedly using content from publishers without permission, leading to the creation of AI-generated summaries that fail to credit the original sources.

Reddit said it would update its use of the Robots Exclusion Protocol, commonly known as "robots.txt," a widely used standard for telling automated crawlers which sections of a website they may crawl.
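To illustrate how a robots.txt rule set is read, here is a minimal Python sketch that uses the standard library's `urllib.robotparser`. The rules, the bot names (`ExampleArchiveBot`, `UnknownScraper`), and the URL are hypothetical and chosen only for illustration; they are not Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only (not Reddit's actual robots.txt):
# one named crawler is allowed everywhere, every other crawler is disallowed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/r/python/"
    print(can_crawl(EXAMPLE_ROBOTS_TXT, "ExampleArchiveBot", page))
    print(can_crawl(EXAMPLE_ROBOTS_TXT, "UnknownScraper", page))
```

Note that this check runs on the crawler's side. The file only states the site's wishes, so a crawler that never consults it is not stopped by the file itself.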

In conjunction with this update, the company plans to enforce rate-limiting measures, which serve to restrict the volume of requests from individual entities. Additionally, Reddit will block unidentified bots and crawlers from collecting data on its site.
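Rate limiting of this kind can be sketched in a few lines of Python. The example below is a generic sliding-window limiter, not a description of Reddit's system; the class name, the limits, and the client identifiers are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps each client identifier to the timestamps of its accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Record and accept the request if the client is under its limit."""
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    # Hypothetical limit: 2 requests per 10 seconds for each client.
    limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow("crawler-a", now=t))
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real service would typically read a clock and key clients by something like an IP address or API token.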

Recently, the robots.txt file has emerged as a crucial mechanism for publishers seeking to prevent tech firms from exploiting their materials for free in order to train AI algorithms and generate search query summaries.

A report from last week highlighted findings from content licensing startup TollBit, which indicated that several AI companies were bypassing the web standard to scrape content from publishing sites.

This follows a Wired investigation that found AI search startup Perplexity likely bypassed efforts to block its web crawler via robots.txt.

Earlier in June, Forbes, a business media publisher, accused Perplexity of using its investigative work in generative AI systems without proper attribution.

Despite these measures, Reddit said that researchers and organizations such as the Internet Archive will continue to have access to its content for non-commercial use.
