Realtime API Documentation
Receive event updates from all supported Wikimedia projects in real time using the streaming endpoint, or in batch files that are updated hourly.
Batch files return NDJSON in a packaged tarball file. Streaming supports server-sent events (SSE) by default or NDJSON when you pass the Accept: application/x-ndjson header. Event types are: update, delete, visibility-change:
- An update event type is sent when an article is created, its content is updated, or its name or namespace is changed.
- A delete event type is sent when an article has been deleted.
- A visibility-change event type is sent when the visibility of an article’s editor, comment, or content is changed by community volunteers.
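The two streaming formats differ only in framing, so a client can decode both into the same objects. The sketch below is illustrative: it assumes the SSE stream follows standard framing (data: lines terminated by a blank line) and that each payload is a single JSON object. The function names are hypothetical and not part of the API.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded object per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each complete SSE event.

    Assumes standard SSE framing: 'data:' lines accumulate until a blank
    line ends the event. Other SSE fields (event:, id:, comments) are ignored.
    """
    buffer = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            yield json.loads("\n".join(buffer))
            buffer = []
    # An event without its terminating blank line is incomplete and is dropped.
```

Either iterator can be fed the decoded lines of an open HTTP response, so the rest of a consumer does not need to know which Accept header was sent.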
For access to Realtime APIs, contact our sales team.
Blog: Realtime API Parallel Connections and Restart Support
Article Updates (Streaming)
Returns a stream of new articles, updates, name changes, deletes, and visibility changes across all supported projects. The type of event can be determined from article.event.type (possible values: update, delete, visibility-change).
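Because article.event.type distinguishes the three kinds of event, a consumer will usually branch on it. The following is a minimal sketch that assumes each event has already been decoded into a dict. The in-memory store, its (project, article) key, and the way visibility changes are applied are illustrative assumptions rather than behavior defined by the API.

```python
KNOWN_EVENT_TYPES = {"update", "delete", "visibility-change"}


def event_type(article: dict) -> str:
    """Return article.event.type, raising if it is missing or unexpected."""
    kind = article.get("event", {}).get("type")
    if kind not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unexpected event type: {kind!r}")
    return kind


def apply_event(store: dict, article: dict) -> None:
    """Apply one event to a local store (a plain dict, hypothetical)."""
    kind = event_type(article)
    # Assumption: article identifiers are only unique within a project,
    # so the key combines is_part_of.identifier with the article identifier.
    key = (article.get("is_part_of", {}).get("identifier"), article["identifier"])
    if kind == "update":
        store[key] = article
    elif kind == "delete":
        store.pop(key, None)
    elif key in store:
        # visibility-change: record the new visibility flags on the stored copy.
        store[key]["visibility"] = article.get("visibility")
```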
since
- string
- Optional
- Return events since this date, in RFC 3339 format ('2006-01-02T15:04:05Z07:00').
fields
- array
- Optional
- Specify the return fields that you need (for example, version.* will return all version object fields).
filters
- array
- Optional
- Filters to apply to the returned data, given as field and value pairs (for example, field in_language.identifier with value en).
parts
- array
- Optional
- This parameter is used when opening parallel connections to the Realtime API. Using parts, each parallel connection can target a subset of partitions. The maximum number of parallel connections is 10, so parts can take the values 0 through 9. Each part represents one tenth of the partitions. For instance, part 0 corresponds to partitions 0 through 4, part 1 corresponds to partitions 5 through 9, and so on.
offsets
- object
- Optional
- This parameter is used when reconnecting to the Realtime API. You can pass a map of partition:offset when reconnecting. For each partition listed, it tells the Realtime API the offset from which to start sending events. If an irrelevant partition (one that is not represented by the parts param) is included in the offsets map, it is simply ignored. If the offsets param does not include a partition that is represented by parts, the events for that partition are delivered in 'live mode' (as they appear).
since_per_partition
- object
- Optional
- This parameter is used when reconnecting to the Realtime API. You can pass a map of partition:timestamp (date in RFC 3339 format) when reconnecting. For each partition listed, it tells the Realtime API the timestamp from which to start sending events. If an irrelevant partition (one that is not represented by the parts param) is included in the since_per_partition map, it is simply ignored. If the since_per_partition param does not include a partition that is represented by parts, the events for that partition are delivered in 'live mode' (as they appear).
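The relationship between parts, partitions, and the per-partition maps can be sketched in a few lines. This is only an illustration: the figure of 5 partitions per part is inferred from the examples in the parts description (part 0 covers partitions 0 through 4, part 1 covers 5 through 9), and all function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, List

PARTITIONS_PER_PART = 5  # inferred from the documented examples, not stated outright
MAX_PARTS = 10           # parts take the values 0 through 9


def partitions_for_part(part: int) -> range:
    """Return the partitions covered by a single part."""
    if not 0 <= part < MAX_PARTS:
        raise ValueError("part must be between 0 and 9")
    start = part * PARTITIONS_PER_PART
    return range(start, start + PARTITIONS_PER_PART)


def split_parts(connections: int) -> List[List[int]]:
    """Distribute parts 0..9 round-robin across N parallel connections."""
    if not 1 <= connections <= MAX_PARTS:
        raise ValueError("connections must be between 1 and 10")
    return [list(range(i, MAX_PARTS, connections)) for i in range(connections)]


def rfc3339(when: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def since_per_partition(parts: Iterable[int], when: datetime) -> Dict[str, str]:
    """Build a since_per_partition map covering every partition of the given parts."""
    stamp = rfc3339(when)
    return {str(p): stamp for part in parts for p in partitions_for_part(part)}
```

For example, split_parts(3) yields three parts lists, one per connection, and since_per_partition can then build a matching restart map for each of them. Partition keys are emitted as strings because JSON object keys must be strings.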
-
text/event-stream
{ "event": { "identifier": "string", "type": "string", "date_created": "string", "date_published": "string", "partition": "integer", "offset": "integer" }, "additional_entities": "array", "article_body": { "html": "string", "wikitext": "string" }, "categories": "array", "date_modified": "string", "identifier": "integer", "in_language": { "identifier": "string", "name": "string" }, "is_part_of": { "date_modified": "string", "identifier": "string", "in_language": { "identifier": "string", "name": "string" }, "name": "string", "size": { "unit_text": "string", "value": "number" }, "url": "string", "version": "string" }, "license": "array", "main_entity": { "aspects": "array", "identifier": "string", "url": "string" }, "name": "string", "abstract": "string", "namespace": { "identifier": "integer", "name": "string" }, "protection": "array", "redirects": "array", "templates": "array", "url": "string", "version": { "comment": "string", "editor": { "date_started": "string", "edit_count": "integer", "groups": "array", "identifier": "integer", "is_anonymous": "boolean", "is_bot": "boolean", "name": "string" }, "identifier": "integer", "is_flagged_stable": "boolean", "is_minor_edit": "boolean", "is_breaking_news": "boolean", "noindex": "boolean", "scores": { "revertrisk": { "prediction": "boolean", "probability": "object" } }, "maintenance_tags": { "citation_needed_count": "integer", "pov_count": "integer", "clarification_needed_count": "integer", "update_count": "integer" }, "tags": "array" }, "visibility": { "comment": "boolean", "text": "boolean", "user": "boolean" } }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
since
- string
fields
- array
filters
- array
parts
- array
offsets
- object
since_per_partition
- object
application/json
{ "since": "2006-01-02T15:04:05Z", "fields": [ "name", "identifier" ], "filters": "[{\"field\": \"in_language.identifier\",\"value\": \"en\"}]\n", "parts": [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], "offsets": { "0": 3614782, "4": 3593806, "8": 3588693 }, "since_per_partition": { "1": "2023-06-05T12:00:00Z", "2": "2023-06-05T12:00:00Z" } }
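The event schema includes event.partition and event.offset, and the request body accepts an offsets map keyed by partition, so a client can record the latest offset it has seen per partition and send that map back when it reconnects. The sketch below shows one way to do that. The OffsetTracker class is hypothetical, and whether the API resumes at the given offset or just after it is not stated here, so a consumer should be prepared to see the last event again.

```python
from typing import Dict, List


class OffsetTracker:
    """Remember the highest offset seen for each partition (illustrative helper)."""

    def __init__(self) -> None:
        self._latest: Dict[int, int] = {}

    def observe(self, article: dict) -> None:
        """Record event.partition and event.offset from one received event."""
        event = article["event"]
        partition, offset = int(event["partition"]), int(event["offset"])
        if offset > self._latest.get(partition, -1):
            self._latest[partition] = offset

    def reconnect_body(self, parts: List[int]) -> dict:
        """Build a request body with parts and the offsets map for a reconnect."""
        # Partitions outside the requested parts are documented as ignored,
        # so every tracked partition is included without filtering.
        offsets = {str(p): o for p, o in sorted(self._latest.items())}
        return {"parts": parts, "offsets": offsets}
```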
-
text/event-stream
{ "event": { "identifier": "string", "type": "string", "date_created": "string", "date_published": "string", "partition": "integer", "offset": "integer" }, "additional_entities": "array", "article_body": { "html": "string", "wikitext": "string" }, "categories": "array", "date_modified": "string", "identifier": "integer", "in_language": { "identifier": "string", "name": "string" }, "is_part_of": { "date_modified": "string", "identifier": "string", "in_language": { "identifier": "string", "name": "string" }, "name": "string", "size": { "unit_text": "string", "value": "number" }, "url": "string", "version": "string" }, "license": "array", "main_entity": { "aspects": "array", "identifier": "string", "url": "string" }, "name": "string", "abstract": "string", "namespace": { "identifier": "integer", "name": "string" }, "protection": "array", "redirects": "array", "templates": "array", "url": "string", "version": { "comment": "string", "editor": { "date_started": "string", "edit_count": "integer", "groups": "array", "identifier": "integer", "is_anonymous": "boolean", "is_bot": "boolean", "name": "string" }, "identifier": "integer", "is_flagged_stable": "boolean", "is_minor_edit": "boolean", "is_breaking_news": "boolean", "noindex": "boolean", "scores": { "revertrisk": { "prediction": "boolean", "probability": "object" } }, "maintenance_tags": { "citation_needed_count": "integer", "pov_count": "integer", "clarification_needed_count": "integer", "update_count": "integer" }, "tags": "array" }, "visibility": { "comment": "boolean", "text": "boolean", "user": "boolean" } }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Available Hourly Batches
Returns a list of available Realtime (Batch) bundles by date. Includes identifiers, file sizes and other relevant metadata.
date
- string
- Required
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
filters
- array
- Optional
- Allows you to filter the response payload.
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" } } ]
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
date
- string
- Required
fields
- array
filters
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n", "filters": "[{\"field\": \"namespace.identifier\",\"value\": 0}]\n" }
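A request body like the one above can be assembled programmatically. The sketch below assumes that fields and filters may be sent as JSON arrays, as their array parameter types suggest, rather than as the string-encoded form shown in the example. The helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def make_filter(field: str, value: Any) -> Dict[str, Any]:
    """Build one filter entry as a field and value pair."""
    return {"field": field, "value": value}


def batch_list_body(fields: List[str], filters: List[Dict[str, Any]]) -> str:
    """Serialize a request body that selects fields and filters the batch list.

    Assumption: the API accepts real JSON arrays for both parameters.
    """
    return json.dumps({"fields": fields, "filters": filters})
```

For instance, batch_list_body(["name", "identifier"], [make_filter("namespace.identifier", 0)]) restricts the listing to namespace 0 and returns only the two named fields.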
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" } } ]
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Single Hourly Metadata
Information on a specific hourly batch. Includes identifier, file size and other relevant metadata.
date
- string
- Required
identifier
- string
- Required
- Batch identifier.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" } }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
date
- string
- Required
identifier
- string
- Required
- Batch identifier.
fields
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n" }
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" } }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Project Updates (Batch)
Downloadable bundle of updated articles by project, namespace, and date. Updated hourly starting at 00:00 UTC each day.
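Since batch files are NDJSON packaged in a gzipped tarball, a downloaded bundle can be read without unpacking it to disk. This is a minimal sketch using Python's standard tarfile module. It assumes every regular file inside the archive is NDJSON with one article per line; the internal file layout is not described here, and the function name is hypothetical.

```python
import io
import json
import tarfile
from typing import BinaryIO, Iterator


def iter_batch_articles(fileobj: BinaryIO) -> Iterator[dict]:
    """Yield each article object from a gzipped tarball of NDJSON files."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            for raw_line in extracted:
                raw_line = raw_line.strip()
                if raw_line:
                    yield json.loads(raw_line)
```

The function takes any seekable binary file object, so it works equally with a file opened in "rb" mode or with an io.BytesIO holding a downloaded response body.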
date
- string
- Required
identifier
- string
- Required
- Batch identifier.
Range
- string
- Optional
- The Range HTTP request header indicates the part of a document that the server should return.
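The Range header makes it possible to resume an interrupted download or to fetch a bundle in chunks. The helper below is a sketch of how the header value is formed using the standard bytes unit from HTTP range requests. The function name is hypothetical, and the byte positions in the examples are arbitrary.

```python
from typing import Dict, Optional


def range_header(start: int, end: Optional[int] = None) -> Dict[str, str]:
    """Build a Range header for bytes start..end (inclusive), or start to EOF."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    last = "" if end is None else str(end)
    return {"Range": f"bytes={start}-{last}"}
```

To resume a partial download, a client could pass the number of bytes already written to disk as start and omit end.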
-
application/gzip
{}
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Set of headers that describe the hourly download.
date
- string
- Required
identifier
- string
- Required
- Batch identifier.
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }