LDES

Introduction

This tutorial walks you through how to consume the Rijksmuseum Linked Data Event Stream (LDES).

The documentation page explains the available endpoints and technical specifications. This tutorial focuses on something different:

How do you actually use LDES in a real workflow?

After completing this tutorial, you will understand:

  • what LDES is and when to use it
  • how to access the stream
  • how pagination works in practice
  • how to process events and retrieve records

What is LDES?

LDES (Linked Data Event Streams) is a specification for publishing changes to a dataset as a stream of immutable events. Instead of updating records in place, a publisher adds a new event to the stream for every change.

This means:

  • each event represents a published state of a resource at a specific moment
  • events are immutable once published
  • consumers are responsible for reconstructing the current state themselves, typically by retaining the latest event per canonical identifier
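The last point can be sketched in a few lines of Python. This is a minimal illustration, not part of the Rijksmuseum documentation: the function names and the sample events are hypothetical, and it assumes every event carries a published timestamp and an object.canonical identifier as in the examples later in this tutorial. Treating a resource as removed when its latest event is a Delete is also an assumption.

```python
from datetime import datetime


def parse_published(value):
    # Replace the "Z" suffix so fromisoformat also works on Python < 3.11
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_state(events):
    """Keep the most recent event per canonical identifier (hypothetical helper)."""
    latest = {}
    for event in events:
        canonical = event["object"]["canonical"]
        known = latest.get(canonical)
        if known is None or parse_published(event["published"]) > parse_published(known["published"]):
            latest[canonical] = event
    # Assumption: a resource whose latest event is a Delete no longer exists
    return {c: e for c, e in latest.items() if e["@type"] != "Delete"}


# Illustrative sample events (timestamps are made up)
sample_events = [
    {"@type": "Create", "published": "2026-05-07T10:00:00.000000Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/311234800"}},
    {"@type": "Update", "published": "2026-05-08T09:30:00.000000Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/311234800"}},
    {"@type": "Create", "published": "2026-05-07T11:00:00.000000Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/301345857"}},
    {"@type": "Delete", "published": "2026-05-08T12:00:00.000000Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/301345857"}},
]

state = latest_state(sample_events)
for canonical, event in state.items():
    print(canonical, event["@type"], event["published"])
```

Keying on the canonical identifier follows the description above. A consumer that stores each metadata representation separately might key on the representation URL instead.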

The LDES model is append-only by design and supports preserving full event history over time. However, implementations may apply retention policies that limit what is retained.

The Rijksmuseum implementation uses a LatestVersionSubset retention policy and only retains the latest version of each resource. This makes it suitable for synchronization workflows, but not for reconstructing historical changes.

When should you use LDES?

Use LDES when you want to:

  • build and maintain a local copy of collection metadata
  • synchronize metadata incrementally over time
  • process metadata as a stream of immutable events
  • process changes to records over time

LDES is not the right choice when you need to:

  • search for specific objects interactively
  • retrieve single records on demand
  • perform low-latency lookup queries

In those cases, the Search API or Linked Data Resolver is more appropriate.


How does LDES compare to OAI-PMH?

Both LDES and OAI-PMH are designed for harvesting metadata in bulk.

The key difference is:

  • OAI-PMH returns the most recent version of each record as it exists at the time of harvesting, with support for incremental updates using from / until parameters.
  • LDES treats every change as an immutable event and exposes a full stream of versions over time, making it well suited for event-based processing and incremental synchronization workflows.

In theory, LDES offers a richer model than OAI-PMH — preserving full event history and supporting event-based processing. In practice, the Rijksmuseum implementation uses a LatestVersionSubset retention policy, which means only the most recent version of each resource is retained. This reduces the practical difference between the two approaches to mainly the data model and the navigation mechanism.

How LDES works

LDES exposes a dataset as a continuous stream of immutable events.

Instead of returning a single current state, it returns a sequence of changes over time.

LDES is primarily a mechanism for publishing and consuming changes over time. The stream tells you that something changed, not the current state of the resource. To reconstruct the current state, consumers must aggregate events per canonical identifier.

When you consume an LDES feed, you are:

  • receiving a sequence of pages (fragments), structured and linked according to the TREE specification — a standard that describes how a collection is fragmented and how consumers can navigate between fragments using typed relations
  • processing events in chronological order; consumers typically sort events by their published timestamp so that the latest state of each resource can be reconstructed correctly
  • encountering multiple versions of the same resource
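Ordering by published timestamp could look like the sketch below. The helper names are hypothetical. The three events reuse identifiers and timestamps from the sample output later in this tutorial, where events are listed newest first, so sorting is not a no-op.

```python
from datetime import datetime


def published_at(event):
    # Parse the ISO 8601 timestamp; "Z" is rewritten for Python < 3.11
    return datetime.fromisoformat(event["published"].replace("Z", "+00:00"))


def in_chronological_order(events):
    """Return a new list with the oldest event first (hypothetical helper)."""
    return sorted(events, key=published_at)


fragment_events = [
    {"@type": "Create", "published": "2026-05-12T06:50:06.739776Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/202914060"}},
    {"@type": "Update", "published": "2026-05-12T06:50:06.688243Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/200569560"}},
    {"@type": "Create", "published": "2026-05-12T06:50:06.532529Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/200914060"}},
]

ordered = in_chronological_order(fragment_events)
for event in ordered:
    print(event["published"], event["object"]["canonical"])
```

Parsing the timestamps instead of comparing strings avoids surprises if the number of fractional-second digits ever varies between events.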

Each page contains a limited number of events and links to the next fragment in the stream.

In the LDES model:

  • updates are published as new events, not replacements
  • historical versions can be preserved
  • ordering matters when interpreting the data

However, the Rijksmuseum implementation is configured with a LatestVersionSubset retention policy with amount set to 1, meaning that only the most recent version of each object remains available. Historical versions are not retained.

Step-by-step tutorial

Step 1 — Access the stream

The LDES stream is organized in three levels:

  • Collection: points to yearly fragments
  • Year: points to monthly fragments
  • Month: points to daily fragments, with optional pagination within each day

Start at the root:

https://data.rijksmuseum.nl/ldes/collection.json

This returns the EventStream with a view containing relation entries that point to yearly fragments:

{
  "@id": "https://data.rijksmuseum.nl/ldes/collection.json#eventstream",
  "@type": "EventStream",
  "view": {
    "relation": [
      {
        "@type": "tree:GreaterThanOrEqualToRelation",
        "path": { "@id": "as:published" },
        "value": { "@value": "2026-01-01T00:00:00Z" },
        "node": { "@id": "https://data.rijksmuseum.nl/ldes/2026.json" }
      },
      {
        "@type": "tree:LessThanRelation",
        "path": { "@id": "as:published" },
        "value": { "@value": "2027-01-01T00:00:00Z" },
        "node": { "@id": "https://data.rijksmuseum.nl/ldes/2026.json" }
      }
    ]
  }
}

The key fields are:

  • @type — the type of relation: GreaterThanOrEqualToRelation or LessThanRelation. Two relations together define a time interval.
  • path — the field on which the relation applies. Here as:published refers to the publication timestamp of each event. This allows consumers to navigate directly to the fragment containing events for a specific time period, without fetching all fragments.
  • value — the boundary value of the interval, as a timestamp
  • node.@id — the URL of the fragment containing events within this interval

In the Rijksmuseum implementation, relations are exposed in pairs. A GreaterThanOrEqualToRelation and a LessThanRelation that point to the same node together define the interval covered by a single fragment. For example, >= 2026-01-01T00:00:00Z and < 2027-01-01T00:00:00Z together cover all events from 2026 (in 2026.json).

The same pattern repeats at each level — yearly fragments point to monthly fragments, and monthly fragments point to daily fragments.

Consumers do not need to traverse the entire hierarchy sequentially. TREE relations allow clients to navigate directly to fragments covering a specific time interval.
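As a rough sketch of this kind of direct navigation, the function below picks the fragment whose interval contains a given timestamp. It assumes the relation layout shown in the JSON example above (paired relations that share a node). The function name is hypothetical, and the 2025 entries in the sample data are illustrative rather than taken from the live stream.

```python
from datetime import datetime


def parse_ts(value):
    # "Z" is rewritten so fromisoformat also works on Python < 3.11
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_fragment(relations, timestamp):
    """Return the node URL whose [lower, upper) interval contains timestamp."""
    target = parse_ts(timestamp)
    bounds = {}  # node URL -> {"lower": datetime, "upper": datetime}
    for relation in relations:
        node = relation["node"]["@id"]
        value = parse_ts(relation["value"]["@value"])
        entry = bounds.setdefault(node, {})
        if relation["@type"] == "tree:GreaterThanOrEqualToRelation":
            entry["lower"] = value
        elif relation["@type"] == "tree:LessThanRelation":
            entry["upper"] = value
    for node, entry in bounds.items():
        if "lower" in entry and "upper" in entry and entry["lower"] <= target < entry["upper"]:
            return node
    return None


def pair(year):
    # Build an illustrative relation pair for one yearly fragment
    node = {"@id": f"https://data.rijksmuseum.nl/ldes/{year}.json"}
    return [
        {"@type": "tree:GreaterThanOrEqualToRelation",
         "value": {"@value": f"{year}-01-01T00:00:00Z"}, "node": node},
        {"@type": "tree:LessThanRelation",
         "value": {"@value": f"{year + 1}-01-01T00:00:00Z"}, "node": node},
    ]


relations = pair(2025) + pair(2026)
print(find_fragment(relations, "2026-05-08T13:42:40.698816Z"))
```

The same function can be applied again to the relations of the yearly and monthly fragments to walk down to a specific day.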

Step 2 — Navigate to a daily fragment

Each yearly fragment points to monthly fragments, and each monthly fragment points to daily fragments. For example, to access events from 8 May 2026, follow the relations down to the daily fragment:

https://data.rijksmuseum.nl/ldes/2026/5/8.json

If a day contains many events, the daily fragment may be split into multiple fragments.

These additional fragments are connected via TREE pagination relations. Following these relations ensures you retrieve the complete event stream for that day.

Consumers should not assume that fragment URLs remain permanently stable over long periods of time. In production workflows, harvesting is typically resumed using timestamps or recently processed fragments.
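One possible way to resume from a timestamp is to keep only the events that are newer than the last one processed. The sketch below is an assumption about how a consumer might do this, not a documented Rijksmuseum mechanism. The function names are hypothetical, and the sample events reuse timestamps from the sample output later in this tutorial.

```python
from datetime import datetime


def parse_published(value):
    # "Z" is rewritten so fromisoformat also works on Python < 3.11
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def events_since(events, checkpoint):
    """Return (events published after checkpoint, new checkpoint)."""
    threshold = parse_published(checkpoint)
    fresh = [e for e in events if parse_published(e["published"]) > threshold]
    # If nothing is new, the checkpoint stays where it was
    newest = max((e["published"] for e in fresh), key=parse_published, default=checkpoint)
    return fresh, newest


harvested = [
    {"@type": "Create", "published": "2026-05-12T06:50:06.739776Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/202914060"}},
    {"@type": "Update", "published": "2026-05-12T06:50:06.688243Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/200569560"}},
    {"@type": "Create", "published": "2026-05-12T06:50:06.532529Z",
     "object": {"canonical": "https://data.rijksmuseum.nl/200914060"}},
]

fresh, checkpoint = events_since(harvested, "2026-05-12T06:50:06.600000Z")
print(len(fresh), checkpoint)
```

The strict comparison skips events that share the exact checkpoint timestamp. A cautious consumer could use >= and deduplicate on the event @id instead.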

Step 3 — Understand an event

An important thing to understand upfront: LDES is a signal mechanism, not a content delivery mechanism. Each event tells you that something changed, not what exactly changed.

To reconstruct the current state, consumers must aggregate events per canonical identifier.

Each daily fragment contains a member array with events. Each entry in member is an event, not the resource itself.

The stream covers a wide range of resource types — not only collection objects, but also related resources such as persons, organizations, and thesaurus terms.

A simplified example of a collection object event (dc profile):

{
  "@id": "https://data.rijksmuseum.nl/200880469#2026-05-07T13:32:08.399049Z",
  "@type": "Update",
  "object": {
    "@id": "https://data.rijksmuseum.nl/200880469?_profile=dc",
    "canonical": "https://data.rijksmuseum.nl/200880469",
    "mediaType": "application/ld+json",
    "profile": "http://purl.org/dc/terms/"
  },
  "published": "2026-05-07T13:32:08.399049Z",
  "@graph": {
    "@context": { "@vocab": "http://purl.org/dc/terms/" },
    "@id": "https://id.rijksmuseum.nl/200880469",
    "@type": "PhysicalResource",
    "title": "Trekschuit bij de Joodse begraafplaats Beth Haim te Ouderkerk aan de Amstel",
    "creator": {
      "@id": "https://id.rijksmuseum.nl/2101990",
      "title": "Romeyn de Hooghe"
    },
    "date": "1675 - ca. 1695",
    "identifier": "KOG-ZG-1-19-6026",
    "relation": {
      "@id": "https://iiif.micr.io/qnNbo/full/max/0/default.jpg",
      "type": { "@id": "http://purl.org/dc/dcmitype/Image" }
    },
    "rights": { "@id": "https://creativecommons.org/publicdomain/mark/1.0/" }
  }
}

And an example of a person event (la-framed profile):

{
  "@id": "https://data.rijksmuseum.nl/210170701#2026-05-08T13:42:40.698816Z",
  "@type": "Create",
  "object": {
    "@id": "https://data.rijksmuseum.nl/210170701?_profile=la-framed",
    "canonical": "https://data.rijksmuseum.nl/210170701",
    "mediaType": "application/ld+json",
    "profile": "https://linked.art/ns/v1/linked-art.json"
  },
  "published": "2026-05-08T13:42:40.698816Z",
  "@graph": {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://id.rijksmuseum.nl/210170701",
    "type": "Person",
    "identified_by": [
      {
        "type": "Name",
        "content": "Cornelia Tinnekens"
      }
    ],
    "born": {
      "type": "Birth",
      "timespan": {
        "type": "TimeSpan",
        "identified_by": [
          { "type": "Name", "content": "1719 - 1720" }
        ]
      }
    },
    "died": {
      "type": "Death",
      "timespan": {
        "type": "TimeSpan",
        "identified_by": [
          { "type": "Name", "content": "1787 - 1788" }
        ]
      }
    }
  }
}

The key fields are:

  • @id — unique identifier for this specific version of the event, combining the canonical URI and a timestamp
  • @type — type of change: Create, Update, or Delete
  • object["@id"] — identifier of the specific representation of the resource, including the metadata profile
  • object.canonical — base identifier of the underlying resource, independent of the metadata representation
  • object.mediaType — media type of the representation
  • published — timestamp of the event
  • @graph — inline metadata for this version of the resource. Only present for certain profiles such as dc and la-framed. For other profiles, the metadata must be fetched separately using the object["@id"] URL. Delete events typically do not contain inline metadata in @graph.

Note: like the Change Discovery API, a single resource change produces multiple events — one per metadata representation. Group events by object.canonical when processing the stream.
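Grouping by object.canonical can be sketched as follows. The function name is hypothetical and the sample events are illustrative: the second representation of 200880469 is made up to show one resource change appearing under two profiles.

```python
from collections import defaultdict


def group_by_canonical(events):
    """Map each canonical identifier to the list of its events."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["object"]["canonical"]].append(event)
    return dict(grouped)


# Illustrative events: two representations of one object, plus one person
stream_events = [
    {"@type": "Update", "published": "2026-05-07T13:32:08.399049Z",
     "object": {"@id": "https://data.rijksmuseum.nl/200880469?_profile=dc",
                "canonical": "https://data.rijksmuseum.nl/200880469"}},
    {"@type": "Update", "published": "2026-05-07T13:32:08.399049Z",
     "object": {"@id": "https://data.rijksmuseum.nl/200880469?_profile=la-framed",
                "canonical": "https://data.rijksmuseum.nl/200880469"}},
    {"@type": "Create", "published": "2026-05-08T13:42:40.698816Z",
     "object": {"@id": "https://data.rijksmuseum.nl/210170701?_profile=la-framed",
                "canonical": "https://data.rijksmuseum.nl/210170701"}},
]

groups = group_by_canonical(stream_events)
for canonical, members in groups.items():
    representations = [m["object"]["@id"] for m in members]
    print(canonical, len(members), representations)
```

The number of keys in the result is the number of distinct resources that changed, regardless of how many representations each change produced.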

Step 4 — Python examples

Basic example — reading a single LDES fragment

Before running this script, make sure the required library is installed:

pip install requests

This example shows how to fetch a single LDES daily fragment and iterate over its events.

This example does not follow TREE pagination relations and only retrieves a single fragment. As a result, it may miss additional events when a day is split across multiple fragments.

Each event represents a change (Create, Update, or Delete) on a resource.

import requests

# URL of a daily LDES fragment containing events for a specific date
url = "https://data.rijksmuseum.nl/ldes/2026/5/8.json"

# Fetch the JSON data from the endpoint and parse it into a Python dictionary
data = requests.get(url).json()

# Extract the list of events from the "member" field
events = data["member"]

# Iterate over each event in the fragment
for event in events:
    # Print the event type (e.g. Create, Update, Delete)
    # and the canonical identifier of the affected object
    print(event["@type"], event["object"]["canonical"])

This will produce output like:

Create https://data.rijksmuseum.nl/311234800
Update https://data.rijksmuseum.nl/302343418
Update https://data.rijksmuseum.nl/300284702
Delete https://data.rijksmuseum.nl/301345857

Fetching a single LDES fragment

This example fetches a single daily fragment from the LDES stream and inspects its contents.

import json
import requests

def fetch_page(url):
    # Fetch a single LDES fragment and return it as a JSON object
    response = requests.get(url)
    response.raise_for_status()
    return response.json()

# URL of a daily fragment
url = "https://data.rijksmuseum.nl/ldes/2026/5/8.json"

# Fetch the fragment
data = fetch_page(url)

# Extract the events from the fragment
events = data.get("member", [])

print(f"Events in fragment: {len(events)}")

# Print the first event if one exists
if events:
    print("\nFirst event:\n")
    print(json.dumps(events[0], indent=2))

This will produce output like:

Events in fragment: 10

First event:

{
  "@id": "https://data.rijksmuseum.nl/302368881#2026-05-08T15:00:47.482427Z",
  "@type": "Update",
  "object": {
    "@id": "https://data.rijksmuseum.nl/302368881?_profile=bf",
    "canonical": "https://data.rijksmuseum.nl/302368881",
    "mediaType": "application/n-triples",
    "profile": "http://id.loc.gov/ontologies/bibframe/"
  },
  "published": "2026-05-08T15:00:47.482427Z"
}

This example only retrieves a single fragment.

A single day may contain multiple paginated fragments when the number of events is large. Additional fragments can be discovered through the TREE relation entries.

The next example shows how to follow these relations and retrieve all events for a specific day.

Reading all events for a specific day

This example follows TREE pagination relations and retrieves all fragments for a given day. In the Rijksmuseum implementation, forward TREE relations can be followed sequentially to retrieve all fragments for a day.

import requests

def fetch_page(url):
    # Fetch a single LDES fragment and return it as JSON
    response = requests.get(url)
    response.raise_for_status()
    return response.json()


def get_next_page(data):
    # Find the next fragment using TREE pagination relations
    relations = data.get("view", {}).get("relation", [])

    for relation in relations:
        relation_type = relation.get("@type", "")
        node_id = relation.get("node", {}).get("@id")

        # Only follow forward pagination relations
        if relation_type in [
            "tree:GreaterThanRelation",
            "tree:GreaterThanOrEqualToRelation"
        ]:
            return node_id

    return None


def get_events_for_day(year, month, day):
    # Start from the daily entry point
    url = f"https://data.rijksmuseum.nl/ldes/{year}/{month}/{day}.json"

    all_events = []
    visited = set()

    # Continue following TREE relations until no next fragment is available.
    while url and url not in visited:
        print(f"Fetching: {url}")
        visited.add(url)

        data = fetch_page(url)

        # Collect events from this fragment
        all_events.extend(data.get("member", []))

        # Move to next fragment (if any)
        url = get_next_page(data)

    return all_events


# Example usage
events = get_events_for_day(2026, 5, 12)

for event in events:
    canonical = event.get("object", {}).get("canonical", "")
    event_type = event.get("@type", "")
    published = event.get("published", "")

    print(f"{published} {event_type} {canonical}")

This will produce output like:

2026-05-12T06:50:06.739776Z Create https://data.rijksmuseum.nl/202914060
2026-05-12T06:50:06.688243Z Update https://data.rijksmuseum.nl/200569560
2026-05-12T06:50:06.532529Z Create https://data.rijksmuseum.nl/200914060

Note: like the Change Discovery API, the same canonical may appear multiple times — once per metadata representation. Group by object.canonical to count unique objects.

Fetching resource metadata separately

Not all events contain full metadata inline. In many cases, the event only points to a resource using object["@id"].

To retrieve the actual metadata, you need to fetch that URL separately.

import requests
import json

# Example event (normally coming from the LDES stream)
event = {
    "object": {
        "@id": "https://data.rijksmuseum.nl/311234800?_profile=schema-framed"
    }
}

# The event only contains a reference
resource_url = event["object"]["@id"]

# Fetch the actual resource metadata
response = requests.get(resource_url)
response.raise_for_status()

metadata = response.json()

print(json.dumps(metadata, indent=2))

This returns the full metadata of the resource:

{
  "@context": "https://schema.org/docs/jsonldcontext.jsonld",
  "id": "https://id.rijksmuseum.nl/311234800",
  "type": "Organization",
  "name": "Galerie Medici Solothurn"
}

This illustrates an important principle of LDES: an event signals that a resource changed. The actual data is retrieved separately from the referenced resource.
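Putting the inline and separate cases together, a consumer might decide per event where the metadata comes from. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, Delete events are skipped without fetching, and only representations with mediaType application/ld+json are parsed as JSON (other formats, such as the application/n-triples representation shown earlier, are returned as raw text). The fetcher is passed in as a parameter so the logic can be tried without network access.

```python
import json
import urllib.request


def fetch_text_http(url):
    # Real fetcher using only the standard library; not called in this demo
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def resolve_metadata(event, fetch_text):
    """Return metadata for an event: inline if present, fetched otherwise."""
    if event.get("@type") == "Delete":
        return None  # assumption: nothing to retrieve for deleted resources
    if "@graph" in event:
        return event["@graph"]  # inline metadata, no request needed
    body = fetch_text(event["object"]["@id"])
    if event["object"].get("mediaType") == "application/ld+json":
        return json.loads(body)
    return body  # non-JSON representation, returned unparsed


def fake_fetch(url):
    # Stand-in for a network call, returning a canned JSON document
    return '{"name": "Galerie Medici Solothurn"}'


inline_event = {"@type": "Update", "@graph": {"title": "Example"},
                "object": {"@id": "https://data.rijksmuseum.nl/200880469?_profile=dc",
                           "mediaType": "application/ld+json"}}
reference_event = {"@type": "Create",
                   "object": {"@id": "https://data.rijksmuseum.nl/311234800?_profile=schema-framed",
                              "mediaType": "application/ld+json"}}

print(resolve_metadata(inline_event, fake_fetch))
print(resolve_metadata(reference_event, fake_fetch))
```

In a real run, fetch_text_http (or an equivalent built on requests) would be passed in place of fake_fetch.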

Collecting changes over a date range

This example collects all Create, Update and Delete events over a range of days in an LDES stream.

For each day, the daily LDES endpoint is used as an entry point. If multiple fragments exist for that day, they are retrieved by following TREE pagination relations.

All events are aggregated and grouped by their canonical identifier.

This example collects events from the last three days, including the current day. The time window is computed dynamically at runtime.

import requests
from datetime import date, timedelta

def fetch_page(url):
    # Fetch a single LDES fragment and return it as JSON
    response = requests.get(url)
    response.raise_for_status()
    return response.json()

def get_events_for_day(year, month, day):
    # Start from the daily LDES entry point
    url = f"https://data.rijksmuseum.nl/ldes/{year}/{month}/{day}.json"

    all_events = []
    visited = set()

    # Follow TREE pagination to retrieve all fragments for the day
    while url and url not in visited:
        visited.add(url)
        data = fetch_page(url)

        # Collect events from the current fragment
        all_events.extend(data.get("member", []))

        # Find the next fragment (if available)
        next_url = None

        for relation in data.get("view", {}).get("relation", []):
            relation_type = relation.get("@type", "")
            node_id = relation.get("node", {}).get("@id")

            # Follow forward TREE relations (based on node links)
            if relation_type in [
                "tree:GreaterThanRelation",
                "tree:GreaterThanOrEqualToRelation"
            ]:
                if node_id:
                    next_url = node_id
                    break

        url = next_url

    return all_events


# Define a relative time window (last 3 days, including today)
end_date = date.today()
start_date = end_date - timedelta(days=2)

# Use sets to deduplicate resources across multiple events and days
created = set()
updated = set()
deleted = set()

current = start_date

while current <= end_date:
    print(f"Fetching {current}...")
    events = get_events_for_day(current.year, current.month, current.day)

    for event in events:
        canonical = event.get("object", {}).get("canonical", "")
        event_type = event.get("@type", "")

        if event_type == "Create":
            created.add(canonical)
        elif event_type == "Update":
            updated.add(canonical)
        elif event_type == "Delete":
            deleted.add(canonical)

    current += timedelta(days=1)


print("\nSummary of changes")
print(f"Created: {len(created)}")
print(f"Updated: {len(updated)}")
print(f"Deleted: {len(deleted)}")

This will produce output like:

Fetching 2026-05-10...
Fetching 2026-05-11...
Fetching 2026-05-12...

Summary of changes
Created: 731
Updated: 2943
Deleted: 13

A single day may consist of multiple fragments. This example follows TREE pagination to ensure complete retrieval. The same canonical may appear multiple times — this example aggregates unique resources. In production workflows, persist the last processed timestamp or fragment URL to resume harvesting incrementally.
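Persisting the last processed timestamp could be as simple as a small JSON file. The sketch below is only one way to do it: the file name, the last_published key, and both function names are hypothetical, and a temporary directory is used so the example can run anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path):
    """Return the stored timestamp, or None on the first run."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("last_published")


def save_checkpoint(path, last_published):
    """Write the timestamp to a temporary file, then move it into place."""
    path = Path(path)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"last_published": last_published}), encoding="utf-8")
    tmp.replace(path)  # avoids a half-written checkpoint if the run is interrupted


# Demo in a temporary directory (hypothetical file name)
checkpoint_file = Path(tempfile.mkdtemp()) / "ldes_checkpoint.json"

first_run = load_checkpoint(checkpoint_file)  # nothing stored yet
save_checkpoint(checkpoint_file, "2026-05-12T06:50:06.739776Z")
resumed = load_checkpoint(checkpoint_file)

print(first_run, resumed)
```

On the next run, the loaded timestamp determines the first day to fetch, and events at or before it can be skipped.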

Summary

In this tutorial, you learned how to consume the Rijksmuseum Linked Data Event Stream. You explored the hierarchical structure of the stream, from the collection root through yearly and monthly fragments down to daily fragments, and how to navigate through paginated fragments using TREE relations. You learned how to interpret events, extract inline metadata where available, and collect changes over a date range.

For more information, see the LDES specification, the TREE specification, and the EU Academy e-learning module.