Wikipedia Asks AI Companies to Stop Scraping Data and to Start Paying Up

The Wikimedia Foundation, the nonprofit organization that hosts Wikipedia, wants AI companies to stop scraping its data to train AI models and to begin paying to use its Application Programming Interface instead, the foundation stated in a blog post on Monday.

Wikimedia contends that AI companies need high-quality, human-curated information to keep their models working. Wikipedia's extensive volunteer network of editors ensures that its information remains well-sourced, and its content is available in over 300 languages.

At the same time, running Wikipedia is a costly endeavor. It's currently the seventh-most visited website in the world, according to Semrush. It cost $179 million to run Wikipedia for the 2023-2024 fiscal year, according to a Wikimedia Foundation audit. Wikimedia keeps Wikipedia afloat primarily through donations and doesn't run advertising.

But AI is changing people's research habits. Instead of researching subjects on Wikipedia, people are turning to AI to answer their questions. Although Wikipedia is free to use, if people circumvent it by using ChatGPT, they won't see donation requests at the top of the Wikipedia home page, and the site could lose money.

Wikimedia is asking AI companies to pay to use its Enterprise API, which will allow them "to use Wikipedia content at scale and sustainably without severely taxing Wikipedia's servers, while also enabling them to support our nonprofit mission."

Representatives for Google, OpenAI, Meta, Anthropic, DeepSeek, xAI and Wikimedia didn't respond to requests for comment. Perplexity declined to comment, stating that it does not comment on partnerships without partner consent. Microsoft also declined to comment.

Google did agree to a deal with Wikimedia in 2022 to commercially access Wikipedia content.

Wikimedia's request comes as online content creators are pushing back against AI companies using online data without permission or payment. Online publishers, such as Penske, the New York Times and News Corp, are suing AI companies for copyright infringement. Other companies, such as the Associated Press and Reuters, have signed licensing deals with AI firms.

(Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

During the AI boom, Big Tech stocks have soared to stratospheric heights. Nvidia briefly became the world's first $5 trillion company late last month, while Microsoft and Alphabet, Google's parent company, broke the $4 trillion barrier earlier this year.