Reddit Slams AI Firm Anthropic In Privacy Lawsuit Alleging Mass Data Scraping

The number of lawsuits filed against AI companies training their models on data posted to the internet continues to grow. The ongoing legal battle between Meta and a group of authors over copyright infringement is perhaps the most prominent, but Reddit has also filed a lawsuit against Anthropic (an AI firm), accusing it of training its Claude AI model with data scraped from the Reddit online community.

The lawsuit accuses Anthropic of acting without restraint in collecting and using users' posts from Reddit's platform, without authorization, from as far back as 2021. It alleges that Anthropic's actions amounted to a breach of Reddit's rules and regulations. As a result, Reddit is seeking damages for these alleged breaches and an order compelling Anthropic to comply with its regulations.

Reddit contends that Anthropic has admitted that it used content belonging to Reddit users to train its AI tool. The lawsuit alleges that Anthropic's CEO specifically listed Reddit comments as one of the sources used to train Claude. Although a spokesperson for Anthropic has since claimed that the company stopped using Reddit content, Reddit claims this is false. Reddit argues that Anthropic's crawlers continued to access Reddit users' content more than one hundred thousand times after the claim was made in July 2024.

Anthropic is not the only company accused of training its AI model with Reddit users' content. We reported last year that AI powerhouses like OpenAI and Google have agreements with Reddit regarding the use of its content, while Microsoft and Perplexity have been heavily criticized for the unauthorized scraping of data from Reddit.

The lawsuit further highlights a dilemma that AI companies face as they try to continually enhance their models and compete with rivals. They either pay to access content or limit themselves to freely available public materials. The challenge, however, is that while the former option is expensive, especially for fledgling AI companies, the latter could mean AI models will be trained on outdated material that's similar to what competitors use. This could render the models inaccurate and unable to compete with rival models trained on more modern and robust datasets.