Nomic AI used Wikimedia Enterprise’s Structured Contents dataset, via Hugging Face, to build the first full open-source vectorization of multilingual Wikipedia. Their work highlights how structured open data can accelerate AI research, improve model performance, and enable new forms of data visualization.
We launched Wikimedia Enterprise last year with a goal of making it easy to programmatically access data from across the Wikimedia Foundation projects. Since then, we have been busy building a product that can serve the needs of commercial users of any size. Today, we are thrilled to share some of the first customers using…