Structured Contents endpoints now parse and return article images and lists as structured JSON, directly within each section of a Wikimedia project article. Both features are available today in the On-demand API and Snapshot API.
Since launching the Structured Contents Initiative, the goal has been straightforward: deliver Wikimedia project data in clean, ready-to-use JSON so your pipeline doesn’t have to build its own parser. Image and list parsing is the next major step toward that goal. Every image attached to an article section, and every list, whether ordered, unordered, or definition-style, now comes through in full structured detail.
In this article: Article Images | Article Lists
All Article Images now included
Wikipedia articles in Structured Contents have always had a main image representing the article as a whole; this field remains unchanged. What’s new is that images appearing throughout the article body are now parsed and included in the has_parts array of the section they belong to, returned alongside the text content they illustrate.
This matters because context is everything. An image of Josephine Baker in military uniform means something different sitting next to a caption and a paragraph about her World War II intelligence work than it does in isolation. By attaching each image to its section, Structured Contents preserves that relationship: the image, the surrounding text, and the caption all arrive together in a single coherent object. For search indexing, AI pipelines, and content applications, that co-located context is what makes the image actually useful rather than just present.

Each section image also includes encoding_format and media_type alongside its URL, dimensions, identifier, name, and caption, so your application knows exactly what it’s receiving before it fetches anything. Decorative icons and images smaller than 16px are excluded, keeping payloads focused on content that carries meaning.
"name": "World War II",
"type": "section",
"has_parts": [
  {
    "type": "image",
    "images": [
      {
        "content_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Baker_Harcourt_1948.jpg/250px-Baker_Harcourt_1948.jpg",
        "identifier": "5a03b69c25031ef6fdfee65e4b378020468f0d637ffafc0eb1018f03c4030ca3",
        "name": "File:Baker_Harcourt_1948.jpg",
        "caption": "Baker in uniform, 1948",
        "height": 344,
        "width": 250,
        "encoding_format": "image/jpeg",
        "media_type": "bitmap"
      }
    ]
  },
  [...]
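If you want to pull section images out of a payload together with the section they belong to, a minimal Python sketch could look like the following. It assumes a section shaped like the sample above; the helper name iter_section_images is hypothetical, and the idea that subsections nest inside has_parts is an assumption rather than something the sample confirms.

```python
from __future__ import annotations

from typing import Any, Iterator


def iter_section_images(section: dict[str, Any]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (section name, image) pairs for every image part under a section."""
    for part in section.get("has_parts", []):
        if part.get("type") == "image":
            for image in part.get("images", []):
                yield section.get("name", ""), image
        elif part.get("type") == "section":
            # Assumption: subsections appear as nested "section" parts.
            yield from iter_section_images(part)


# Sample section copied from the payload excerpt above.
section = {
    "name": "World War II",
    "type": "section",
    "has_parts": [
        {
            "type": "image",
            "images": [
                {
                    "content_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Baker_Harcourt_1948.jpg/250px-Baker_Harcourt_1948.jpg",
                    "identifier": "5a03b69c25031ef6fdfee65e4b378020468f0d637ffafc0eb1018f03c4030ca3",
                    "name": "File:Baker_Harcourt_1948.jpg",
                    "caption": "Baker in uniform, 1948",
                    "height": 344,
                    "width": 250,
                    "encoding_format": "image/jpeg",
                    "media_type": "bitmap",
                }
            ],
        }
    ],
}

# Use encoding_format to decide what to fetch before downloading anything.
jpegs = [
    (name, image["content_url"], image.get("caption"))
    for name, image in iter_section_images(section)
    if image.get("encoding_format") == "image/jpeg"
]
for name, url, caption in jpegs:
    print(f"{name}: {caption} -> {url}")
```

Keeping the section name next to each image is the point of the exercise: the caption and the surrounding section travel with the URL, so downstream indexing does not have to re-derive that context.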
Article Lists content is now available
Lists are one of the most information-dense structures in any Wikipedia article. Competition results, glossary terms, chronological timelines, participant rosters: information that is precise, ordered, and often critical to the subject of the article. Structured Contents now parses all of them, preserving the full hierarchy, nesting, and structure of the original article content and delivering it as clean JSON in the has_parts array.
Three list types are supported, each mapped to the structure Wikipedia uses to present them.
Unordered lists capture collections where the items belong together but have no required order. Think named participants in a conflict, award nominees, or feature sets. Ordered lists are for sequences where position carries meaning, and Structured Contents guarantees that order is preserved exactly as it appears in the article. Definition lists handle the term-and-description pattern common in glossaries, lexicons, and disambiguation-style content, keeping each term paired with its definition in the structured output.
Inline links within list items are preserved and returned with their URL and anchor text, so relationships between list content and other Wikipedia articles are not lost in parsing. Nested lists, lists inside infoboxes, and flat lists are all handled. Empty lists are omitted rather than returned as empty objects, keeping payloads clean.
Unordered lists

"name": "History",
"type": "section",
"has_parts": [
  {
    "type": "list",
    "has_parts": [
      {
        "type": "list_item",
        "value": "1964–1984: Wheelchair Powerlifting"
      },
      {
        "type": "list_item",
        "value": "1984–2016: Paralympic Powerlifting / IPC Powerlifting"
      },
      {
        "type": "list_item",
        "value": "2017–present: Para Powerlifting"
      }
    ]
  }
]
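To read list items back out of a section like the one above, a short recursive walk over has_parts is usually enough. The sketch below is one possible approach, not the API's own tooling: list_item_values is a hypothetical helper, and it assumes that any nested lists are also reachable through has_parts.

```python
from __future__ import annotations

from typing import Any


def list_item_values(node: dict[str, Any]) -> list[str]:
    """Collect the "value" of every list_item at or below node, in document order."""
    values: list[str] = []
    if node.get("type") == "list_item" and "value" in node:
        values.append(node["value"])
    # Assumption: children, including nested lists, live under has_parts.
    for child in node.get("has_parts", []):
        values.extend(list_item_values(child))
    return values


# Sample section copied from the payload excerpt above.
history = {
    "name": "History",
    "type": "section",
    "has_parts": [
        {
            "type": "list",
            "has_parts": [
                {"type": "list_item", "value": "1964–1984: Wheelchair Powerlifting"},
                {
                    "type": "list_item",
                    "value": "1984–2016: Paralympic Powerlifting / IPC Powerlifting",
                },
                {"type": "list_item", "value": "2017–present: Para Powerlifting"},
            ],
        }
    ],
}

names = list_item_values(history)
for value in names:
    print(value)
```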
Ordered lists
Ordered lists will always retain their order in Structured Contents JSON.

{
  "type": "ordered_list",
  "has_parts": [
    {
      "type": "list_item",
      "value": "A pastime in general, usually involving some form of competing.",
      "links": [
        {
          "url": "https://en.wikipedia.org/wiki/Glossary_of_card_game_terms#cite_note-FOOTNOTEPhillips1957401-64"
        }
      ]
    },
    {
      "type": "list_item",
      "value": "A variant of a basic game e.g. Gin Rummy or Wendish Schafkopf.",
      "links": [
        {
          "url": "https://en.wikipedia.org/wiki/Gin_Rummy",
          "text": "Gin Rummy"
        },
        [...]
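Because order is preserved and links travel with each item, an ordered list can be turned into numbered entries with their outbound links in a few lines. The sketch below works on a trimmed copy of the excerpt above; item_links is a hypothetical helper. Note that the first link in the sample has a url but no text, so the code treats text as optional.

```python
from __future__ import annotations

from typing import Any, Optional


def item_links(item: dict[str, Any]) -> list[tuple[str, Optional[str]]]:
    """Return (url, anchor text) pairs for a list item; text may be missing."""
    return [(link["url"], link.get("text")) for link in item.get("links", [])]


# Trimmed copy of the excerpt above (only the links shown there are included).
ordered = {
    "type": "ordered_list",
    "has_parts": [
        {
            "type": "list_item",
            "value": "A pastime in general, usually involving some form of competing.",
            "links": [
                {
                    "url": "https://en.wikipedia.org/wiki/Glossary_of_card_game_terms#cite_note-FOOTNOTEPhillips1957401-64"
                }
            ],
        },
        {
            "type": "list_item",
            "value": "A variant of a basic game e.g. Gin Rummy or Wendish Schafkopf.",
            "links": [
                {"url": "https://en.wikipedia.org/wiki/Gin_Rummy", "text": "Gin Rummy"}
            ],
        },
    ],
}

# Position in has_parts is the position in the article's list.
numbered = [
    (position, item["value"], item_links(item))
    for position, item in enumerate(ordered["has_parts"], start=1)
]
for position, value, links in numbered:
    print(position, value, links)
```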
Definition lists

{
  "type": "definition_list",
  "has_parts": [
    {
      "type": "definition_term",
      "value": "game points",
      "has_parts": [
        {
          "type": "definition",
          "value": "In point-trick games, the score awarded to the players based on the outcome of a hand, the game value of a contract and any bonuses earned. Game points are accumulated (or deducted) to decide the overall winner. Not to be confused with card points.",
          "links": [
            {
              "url": "https://en.wikipedia.org/wiki/Point-trick_game",
              "text": "point-trick games"
            },
            {
              "url": "https://en.wikipedia.org/wiki/Glossary_of_card_game_terms#hand",
              "text": "hand"
            },
            {
              "url": "https://en.wikipedia.org/wiki/Glossary_of_card_game_terms#contract",
              "text": "contract"
            },
            {
              "url": "https://en.wikipedia.org/wiki/Glossary_of_card_game_terms#bonus",
              "text": "bonuses"
            },
            {
              "url": "https://en.wikipedia.org/wiki/Glossary_of_card_game_terms#card_points",
              "text": "card points"
            }
          ]
        }
      ]
    }
  ]
}
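Since each definition_term carries its definition parts in its own has_parts array, a glossary lookup can be built with a single pass. This is a minimal sketch under that assumption; glossary_from is a hypothetical helper, and the sample omits the links array for brevity.

```python
from __future__ import annotations

from typing import Any


def glossary_from(definition_list: dict[str, Any]) -> dict[str, list[str]]:
    """Map each term to the list of definitions nested under it."""
    glossary: dict[str, list[str]] = {}
    for term in definition_list.get("has_parts", []):
        if term.get("type") != "definition_term" or "value" not in term:
            continue
        definitions = [
            part["value"]
            for part in term.get("has_parts", [])
            if part.get("type") == "definition" and "value" in part
        ]
        # A term may appear more than once; keep all of its definitions.
        glossary.setdefault(term["value"], []).extend(definitions)
    return glossary


# Sample copied from the excerpt above, without the links array.
sample = {
    "type": "definition_list",
    "has_parts": [
        {
            "type": "definition_term",
            "value": "game points",
            "has_parts": [
                {
                    "type": "definition",
                    "value": (
                        "In point-trick games, the score awarded to the players "
                        "based on the outcome of a hand, the game value of a "
                        "contract and any bonuses earned. Game points are "
                        "accumulated (or deducted) to decide the overall winner. "
                        "Not to be confused with card points."
                    ),
                }
            ],
        }
    ],
}

glossary = glossary_from(sample)
for term, definitions in glossary.items():
    print(term, "->", definitions[0])
```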
Structured Contents is a living initiative. As it evolves, the features added reflect what teams building on Wikimedia project data actually need. Image and list parsing both grew directly from feedback from developers and organizations using the On-demand and Snapshot APIs. If you’re working with Structured Contents endpoints and have opinions on what we should prioritize next, we’d like to hear from you.
For a full overview of the initiative and what’s currently available and on the roadmap, visit the Structured Contents Initiative page.
Get Started
Structured Contents payloads are available for free in the On-demand API today. The same image and list parsing is also available across Snapshot API files for teams that need bulk access. Sign up for a free account to get started, or contact our sales team to discuss your use case.
— The Wikimedia Enterprise Team
Photo Credits
Portrait of Jean Miélot, by Jean le Tavernier, Public Domain Mark, via Wikimedia Commons

