Reddit Sues Perplexity and AI Data Scraping Companies for Unauthorized Use of Its Data

2025-10-23

Reddit Inc. has filed a lawsuit against startup Perplexity AI Inc. and three data scraping service providers, accusing them of trawling through the company's copyrighted content to train artificial intelligence models.

The social media platform likened the data scraping firms — SerpApi, Oxylabs, and AWMProxy — to "bank robbers," adding that one company "would stop at nothing to obtain the Reddit data it desperately needs to power its 'answer engine'—except, apparently, striking a direct deal with Reddit, as some of its competitors have done."

Some AI companies have already reached agreements with Reddit, including OpenAI, which signed a deal last year to use Reddit’s vast data trove for training its large language models. Although the financial terms were not officially disclosed, the deal was reportedly worth $60 million. At the time, Reddit said it aimed to generate approximately $200 million in revenue from licensing agreements over the next three years. Google LLC has also signed a licensing deal.

Reddit later sued Anthropic PBC, alleging that the company was scraping Reddit content to train its Claude series of AI models. That makes the latest filing, submitted today in the U.S. District Court for the Southern District of New York, one of the few ongoing legal actions of its kind.

Data scraping companies represent a relatively new phenomenon that emerged shortly after the explosion of generative AI. According to the New York Times, SerpApi is based in Texas and serves multiple clients. Oxylabs operates from Lithuania, while AWMProxy is based in Russia.

"AI companies are locked in an arms race for high-quality human-generated content—this pressure fuels an economy of industrial-scale 'data laundering,'" Reddit’s Chief Legal Officer Ben Lee told the Times. "Scrapers bypass technical protections to steal data, which they then sell to clients hungry for training materials."

In the lawsuit, Reddit claims it set a trap for Perplexity by posting a "test post" on its platform that was only visible to Google’s search engine and inaccessible anywhere else on the internet. According to Reddit, the hidden post’s content appeared in Perplexity’s search results within hours.

Perplexity stated it had not yet received the lawsuit but told media outlets it would “vigorously defend the rights of users to freely and fairly access public knowledge.” The company added, “Our approach remains principled and responsible, as we deliver factual answers through accurate AI. We will not tolerate threats to openness and the public good.”