According to a document, Reddit is suing Anthropic, accusing it of using the website's data to train AI models without an appropriate licensing agreement. In the complaint filed on Wednesday in the Northern District of California, Reddit claims that Anthropic used the site’s data for commercial purposes without authorization, which is illegal, and accuses the AI startup of violating Reddit's user agreement.
This lawsuit by Reddit makes it the first major tech company to legally challenge an AI model provider over training data practices, joining a list of publishers who have sued tech companies for similar reasons.
The New York Times has also filed a lawsuit against OpenAI and Microsoft, alleging that they used its news articles for training without payment or permission. Meanwhile, Sarah Silverman and other authors have sued Meta for training AI models on their books without approval. Music publishers and artists have similarly accused AI audio, video, and image generation startups of misusing their content.
"We will not tolerate profit-driven entities like Anthropic commercially exploiting Reddit content for billions of dollars without giving anything back to Reddit users or respecting their privacy," said Reddit's Chief Legal Officer Ben Lee in a statement to TechCrunch.
Notably, Reddit has reached agreements with other AI model providers, including OpenAI and Google, allowing these companies to train AI models on Reddit’s data and display posts from the site in responses generated by their respective AI chatbots. However, in the documents, Reddit stated that it imposed certain terms on OpenAI and Google to protect its users' interests and privacy.
OpenAI's CEO, Sam Altman, owns 8.7% of Reddit's shares, making him the third-largest shareholder, and he was previously a member of the company's board.
In the documents, Reddit claims it reached out to Anthropic and explicitly stated that the AI startup had no authorization to scrape or use Reddit’s content. However, Reddit alleges that Anthropic "refused to cooperate."
Reddit asserts in its complaint that Anthropic's scraping bots ignored the social network's robots.txt file, a standard method for signaling automated systems not to crawl a site. As further evidence, Reddit claims that Anthropic's AI chatbot, Claude, frequently cites Reddit communities and topics discussed on the platform.
Reddit is seeking compensation from Anthropic for damages and the profits gained from scraping Reddit's content. The company is also requesting an injunction to prevent Anthropic from continuing to use its content.