Wikimedia Enterprise Blog - Releases, News, Announcements

How Databricks Parsed Wikipedia to Markdown with Python

26 May 2026

Parsing raw wikitext into a clean text corpus is notoriously hard. Databricks engineers used Wikimedia Enterprise’s Structured Contents endpoints and Apache Spark to convert millions of Wikipedia articles to Markdown at scale, skipping the regex-heavy parsing layer entirely.
Read this article →
How CivicLens Uses Wikidata APIs to Make Civic Data More Accessible

28 Apr 2026

Ryland Hale wanted to make it easier for people to see what their politicians are actually doing. His project, CivicLens, pulls from sources like Congress.gov, LegiScan, the FEC, and Wikipedia to show voters exactly who represents them and how they vote. But turning raw government databases into a readable website requires a massive amount of
Read this article →
Article Images and Lists now in Structured Contents payloads

09 Apr 2026

Structured Contents endpoints now parse and return article images and lists as structured JSON, directly within each section of a Wikimedia project article. Both features are available today in the On-demand API and Snapshot API.
Read this article →
Wikidata APIs from Wikimedia Enterprise: Connect to the World’s Knowledge Graph

31 Mar 2026

Wikidata API endpoints are now part of Wikimedia Enterprise’s On-demand and Realtime APIs. Use your existing access token to query structured entity data, multilingual labels, and cross-language article links alongside Wikipedia and other Wikimedia project data.
Read this article →
Aligned AI is developing Ethical AI products for families, with the help of Wikimedia Enterprise

24 Mar 2026

Aligned AI uses the Wikimedia Enterprise Snapshot API to download comprehensive datasets from Wikimedia projects and host them directly on its devices. This lets people use Aligned’s offline search tool while still finding relevant Wikipedia articles to read and learn from.
Read this article →
Firecrawl Replaces Wikipedia Scraping with Wikimedia Enterprise APIs

17 Mar 2026

Firecrawl has partnered with Wikimedia Enterprise to support the millions of monthly requests their users make to retrieve Wikimedia project data. With Firecrawl aiming to become the main infrastructure layer between web data and AI, getting Wikimedia data in any format for agentic workflows becomes as easy as a single request.
Read this article →
Ecosia Enriches Search Results and AI Answers with Wikimedia Enterprise

16 Feb 2026

Ecosia, the search engine that uses 100% of its profits for the planet, provides a search experience with a positive environmental impact. Ecosia is careful about its data sources: it aims to return clear, truthful results quickly, with links to sources and correct attribution. That’s why Ecosia partnered with Wikimedia
Read this article →
Wikimedia Enterprise at Stanford Human-Centered Artificial Intelligence Seminar

29 Jan 2026

Over 220 attendees, both in person and online, joined this seminar talk exploring how the Wikimedia Foundation can stay human-centric while innovating alongside current developments in AI.
Read this article →
Mistral AI partners with Wikimedia Enterprise to Leverage Open Knowledge for AI

27 Jan 2026

Mistral AI and Wikimedia Enterprise are excited to announce the start of a three-year partnership. Data from Wikipedia and other Wikimedia projects is indispensable as a source of knowledge for Mistral’s AI projects, such as its AI assistant Le Chat.
Read this article →
Announcing New Wikimedia Enterprise Partners for Wikipedia’s 25th Birthday

15 Jan 2026

Wikipedia celebrates 25 years of human-created knowledge on 15 January 2026. To mark this milestone, Wikimedia Enterprise is publicly announcing partnerships with Amazon, Meta, Microsoft, Mistral AI, and Perplexity. Learn how our infrastructure delivers human-governed knowledge at scale.
Read this article →
2025 Year in Review: Wikimedia Enterprise & The Evolution of Open Data

15 Dec 2025

Read our full annual wrap-up to explore how 2025 marked a fundamental shift in open knowledge. From Wikipedia’s recognition as a Digital Public Good to the launch of Parsed Tables and Quality Scoring models, discover how we are helping developers build a sustainable future for ethical data with Wikimedia Enterprise.
Read this article →
Reef Media uses Wikimedia Enterprise Snapshot API to Fact Check and Verify Sources

05 Nov 2025

Reef Media is building a comprehensive platform to help users analyze media for its strengths and weaknesses, creating a more informed public. One of the key datasets underpinning their technical strategy is a robust, verifiable data source: Wikimedia data extracted through the Wikimedia Enterprise Snapshot API.
Read this article →
Wikimedia Enterprise at NeurIPS – Events, Talks, and Recaps

17 Oct 2025

Learn about the intersection between generative AI data and open, trusted datasets in the talks by Wikimedia Enterprise, MLCommons, and the AI Alliance at NeurIPS.
Read this article →
Unlock Wikipedia Tables as Structured JSON: Introducing Parsed Tables in Wikimedia Enterprise

10 Sep 2025

Access Wikipedia’s most valuable tables as structured JSON with the new Parsed Tables feature from Wikimedia Enterprise. Instantly convert complex tables into clean, machine-readable data without scraping. Enhance your AI, search, and knowledge graph projects with reliable, human-curated facts that were previously locked away in HTML and wikitext.
Read this article →
Nomic AI’s NOMAD Projection uses Enterprise Datasets to Visually Map Multilingual Wikipedia

19 Jun 2025

Nomic AI used Wikimedia Enterprise’s Structured Contents dataset, via Hugging Face, to build the first full open-source vectorization of multilingual Wikipedia. Their work highlights how structured open data can accelerate AI research, improve model performance, and enable new forms of data visualization.
Read this article →
Wikipedia Kaggle Dataset using Structured Contents Snapshot

16 Apr 2025

Explore Wikipedia content in a clean, structured format with our new beta dataset on Kaggle. Built from our Snapshot API using the Structured Contents beta, it’s ideal for data science, ML training, and experimentation.
Read this article →
Wikimedia Enterprise Partners with ProRata.ai to Champion Sustainable Search Engine Practices

31 Mar 2025

Wikimedia Enterprise is partnering with ProRata.ai to power its new search engine, Gist.ai, with reliable, human-curated Wikimedia content. The collaboration supports a sustainable content ecosystem through transparent attribution and API-driven innovation—ensuring creators are credited and content remains discoverable in the AI era.
Read this article →
Parsing Wikipedia References with Quality Scoring Models

19 Mar 2025

The latest API release boosts Wikipedia data integration with parsed references in JSON and two quality scoring models – Reference Need and Reference Risk. These enhancements streamline citation access and improve content reliability for developers.
Read this article →
Exploring the Future of Open Access and AI at SXSW 2025

20 Feb 2025

Wikimedia Enterprise joined Creative Commons at SXSW 2025 on March 9th in downtown Austin for a day of insightful conversations and panels exploring the intersection of artificial intelligence and open data. The discussions emphasized the critical importance of protecting and ethically advancing open access principles amidst rapid technological growth. The event brought together industry leaders,
Read this article →