FILE PHOTO: Reddit's logo is displayed at the New York Stock Exchange in New York City, U.S., March 21, 2024. REUTERS/Brendan McDermid/File Photo
Reddit said that it would update the Robots Exclusion Protocol, or "robots.txt," a widely accepted standard that tells automated crawlers which parts of a site they may access. More recently, robots.txt has become a key tool that publishers employ to prevent tech companies from using their content free of charge to train AI algorithms and to create summaries in response to some search queries.
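For illustration, a site that wanted to block AI crawlers while still allowing traditional search indexing might publish a robots.txt like the sketch below. The bot names shown (such as GPTBot and PerplexityBot) are examples of crawler user agents that some companies have documented; the exact directives any given publisher uses will vary.

```text
# Hypothetical robots.txt blocking AI training crawlers
# while permitting a conventional search crawler.

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /private/
```

Note that robots.txt is advisory: compliance is voluntary, and a crawler that ignores the file faces no technical barrier from the protocol itself, which is why publishers have pushed for stronger enforcement.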
The move follows a Wired investigation that found AI search startup Perplexity likely bypassed publishers' robots.txt directives to crawl pages it had been asked not to access.