Reddit Tells Microsoft Pay Up To Scrape Its Data Or Buzz Off

Reddit has grown to become one of the most popular websites in the world, and the company is leveraging that position to change the way online search operates. After forging deals with Google and OpenAI for access to its data, the site now wants other companies to pay up as well. Until they do, you won't see very much of Reddit on search engines like Bing. Monetizing its wealth of data is key to the company's quest to stop losing money.

In a recent interview, Reddit CEO Steve Huffman called out Microsoft, Anthropic, and Perplexity for refusing to negotiate for access to Reddit's data. This marks a dramatic shift in how Reddit interacts with the internet as a whole. The site updated its robots.txt file in May to block crawlers that it had not authorized. That's also when Reddit rolled out its new paid API, which killed most third-party Reddit apps and services. These changes coincided with the company's IPO, which has seen Reddit's value rise by more than 25% as of August.
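
For readers curious how a robots.txt block works in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules, the bot names (`ExampleLicensedBot`, `SomeOtherBot`), and the `example.com` URL are hypothetical and chosen for illustration; they are not a copy of Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: one named crawler is allowed,
# and every other crawler is told to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: ExampleLicensedBot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl("ExampleLicensedBot", "https://example.com/r/news/"))  # True
print(can_crawl("SomeOtherBot", "https://example.com/r/news/"))        # False
```

Note that robots.txt is a voluntary convention. It tells well-behaved crawlers what the site wants, but it does not technically prevent anyone from fetching a page.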

Reddit's objection is less about how it appears in search results than about the use of its data to train AI models. OpenAI famously ingested virtually the entire public-facing internet to train ChatGPT, which has led to a series of legal headaches for the AI leader. As Huffman put it in the interview, "Search and summarization and training are merging, and the value exchange of crawling in exchange for traffic back is becoming muddied."

The newly public Reddit sees its collection of user-generated data as a valuable asset, which it can license for use in search and AI. Its $50 million deal with Google allows the search giant to crawl Reddit for search and use its data for AI training. Huffman holds this deal up as an example of what Reddit wants from other search and AI firms. OpenAI has agreed to a similar arrangement, but neither side has revealed the price tag.

As a result of Reddit's new stance, you will only see older Reddit links (from before May 2024) on search engines like Bing. Microsoft has apparently refused to engage in negotiations, and the company's AI chief has been dismissive of the idea that anyone should pay for scraping the open internet. Microsoft's head of search noted on X (formerly Twitter) that Bing provides websites with crawling tools to control how their data is used by Microsoft. Despite that, Bing is on the block list.

Reddit doesn't currently intend to enter into any exclusive deals. So, it's possible Bing and other AI-fueled search tools will be able to access Reddit data in the future. Microsoft does have one advantage here. Its close ties with OpenAI give it access to the company's best models for use in Bing search and Copilot. While it can't display links to Reddit, Microsoft's AI should continue to benefit from training with Reddit's data. But if you want to actually find something on Reddit, it looks like Google is your best bet.