Chris Petrillo stands on stage in front of several large screens showing Wikimedia and Wikipedia logos

Wikimedia Enterprise at Stanford Human-Centered Artificial Intelligence Seminar

Members of the Wikimedia Enterprise team presented on “Wikipedia in the Age of AI and Bots” at the February 2026 seminar of Stanford’s Institute for Human-Centered Artificial Intelligence. Stanford HAI offers a front-row seat to some of the most pressing questions in AI today, covering how AI systems are built and how they interact with society at scale. The seminar drew over 220 attendees, in person and online, for a talk exploring how the Wikimedia Foundation can stay human-centric while also innovating alongside current developments in AI.

Chris Petrillo and Chia Hwu covered how Wikimedia projects have always been entangled with machine learning and AI: algorithmic bot accounts have been part of Wikipedia’s edit history since as early as 2002. In 2023, Wikipedia published its first policy on LLMs and generative AI, in response to editors using ChatGPT to draft and edit articles. AI has long relied on Wikimedia project data, with Wikipedia heavily weighted as a source in AI training and cited often in AI chatbot answers.

For 25 years, Wikipedia has been a foundational source of text data for natural language processing and machine learning. Its openly licensed content has powered education, innovation, and research across academia and industry, and it has been cited over 2.85 million times on Google Scholar. With the rise of LLMs, this data consumption has only increased. Wikipedia now faces rapidly changing traffic patterns driven not only by human readers but increasingly by large-scale automated access: bot traffic has grown, and off-platform readership has risen to the detriment of on-platform readership.

People seated at the Stanford lunch seminar watch Chris Petrillo present, with multiple overhead screens showing blue slides.

Keeping the platform sustainable while prioritizing human access has required nuanced approaches. New resources are being created and leveraged to optimize reuse and stability: new technical and commercial partnerships are being forged, attribution guidelines are being drafted, and more focus is being placed on Wikimedia Enterprise services and the use cases and integrations they enable.

Wikimedia Enterprise APIs were presented as one of the crucial new resources for reducing strain on Wikimedia’s infrastructure while meeting the needs of automated readers. These APIs provide access to parsed article content via the Structured Contents Initiative, eliminating the need to build and maintain custom parsing libraries. They also include credibility signals that offer critical insight for making informed decisions about the current revision of an article and whether its edits are backed by verifiable sources.
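As a rough illustration of what consuming such an API looks like, the sketch below builds an authenticated request for structured article content. The base URL, endpoint path, request body, and bearer-token scheme are assumptions for illustration only; consult the current Wikimedia Enterprise API documentation for the actual contract.

```python
# Minimal sketch: building a request for parsed ("structured contents")
# article data from a Wikimedia Enterprise-style API.
# NOTE: the base URL, path, body, and auth scheme below are assumptions,
# not the documented API contract.
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.enterprise.wikimedia.com/v2"  # assumed base URL


def build_article_request(title: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request for structured article content."""
    url = f"{API_BASE}/structured-contents/{urllib.parse.quote(title)}"
    return urllib.request.Request(
        url,
        data=json.dumps({"limit": 1}).encode(),  # assumed request body
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_article_request("Stanford University", token="YOUR_TOKEN")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would then return JSON containing the pre-parsed article fields, sparing the client from parsing wikitext or HTML itself.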

In the end, Wikimedia content is valuable because of its reliable sourcing, its commitment to verifiability, its transparency through talk pages and public discussion, and most of all because it is created by and for humans. These humans work together to build consensus and to represent different views of the same reality. Wikimedia Enterprise is one part of the answer to keeping Wikipedia and its sister projects human, while also creating spaces for automated traffic and bots to thrive.

– The Wikimedia Enterprise Team

Photo Credits

Photos of Stanford HAI seminar, taken by Chia Hwu, CC BY-SA 4.0