The most common production outage you will trigger as an API designer is the unbounded list endpoint. The fix is simple, but you must build pagination from the start — retrofitting it on a public API breaks every client.
Why Pagination Matters
An endpoint that returns "all orders" works fine when the demo customer has 12. It melts down when one real customer has 4 million. The database query is slow, the network payload is huge, the client cannot render the result, and more often than not the request times out before any of it completes.
- Always default to a small page size (25–50).
- Cap maximum page size (100–1000).
- Document both clearly.
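The default and the cap can be enforced in one small helper. This is a minimal sketch: the names DEFAULT_LIMIT, MAX_LIMIT and resolve_limit are illustrative, and the values 25 and 100 are just picked from the ranges above. It also assumes oversized requests are clamped rather than rejected, which is one reasonable policy among several.

```python
from typing import Optional

DEFAULT_LIMIT = 25   # assumed default, taken from the 25-50 range above
MAX_LIMIT = 100      # assumed cap, taken from the 100-1000 range above


def resolve_limit(raw: Optional[str]) -> int:
    """Turn the raw `limit` query value into a safe page size."""
    if raw is None or raw == "":
        return DEFAULT_LIMIT
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"limit must be an integer, got {raw!r}")
    if value < 1:
        raise ValueError("limit must be at least 1")
    # Clamp instead of rejecting, so an oversized request still succeeds.
    return min(value, MAX_LIMIT)
```

With these assumed values, a missing limit resolves to 25 and limit=5000 resolves to 100. Whichever policy you choose (clamp or reject), it belongs in the documentation.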
Offset Pagination
GET /orders?limit=25&offset=50
- + Trivial to implement.
- + Caller can jump to any page directly.
- − Slow for large offsets — the database often has to scan and skip.
- − Items inserted or deleted between pages cause duplicates or gaps.
Use it for small datasets and admin tools. Avoid it for scrolling timelines and big tables.
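The duplicate problem is easy to reproduce without a database. The sketch below uses an in-memory list, newest first, as a stand-in for an orders table; the page helper and the order IDs are made up for illustration.

```python
def page(items, limit, offset):
    """Offset pagination over a list: skip `offset` items, take `limit`."""
    return items[offset:offset + limit]


# Newest first, as a timeline endpoint would return them.
orders = ["ord_5", "ord_4", "ord_3", "ord_2", "ord_1"]

first_page = page(orders, limit=2, offset=0)

# A new order arrives between the client's two requests.
orders.insert(0, "ord_6")

second_page = page(orders, limit=2, offset=2)

# Items that the client receives twice.
duplicates = set(first_page) & set(second_page)
```

Here the client sees ord_4 on both pages, because the insert shifted every existing row down by one. A delete between requests shifts rows the other way and causes a gap instead.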
Cursor (Keyset) Pagination
Encode "where I am" as an opaque cursor that the server can resume from. Typically the cursor is the last seen sort key plus the last seen ID.
GET /orders?limit=25
→ {
"data": [ ... ],
"next_cursor": "eyJpZCI6ICJvcmRfMTAwIn0="
}
GET /orders?limit=25&cursor=eyJpZCI6ICJvcmRfMTAwIn0=
→ next page
- + Fast at any depth: the query becomes WHERE (created_at, id) < (?, ?) ORDER BY created_at DESC, id DESC LIMIT 25.
- + Stable under inserts and deletes.
- − Cannot jump to "page 47" — only forward (and sometimes backward).
This is the right default for most modern APIs.
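A minimal sketch of keyset pagination, using Python's built-in sqlite3 module so it runs anywhere. Several things here are assumptions for illustration: the two-column orders table, integer IDs, the base64-encoded JSON cursor layout, and the function names. The row comparison is written in its expanded form (created_at < ? OR (created_at = ? AND id < ?)), which is equivalent to the tuple comparison and works on databases without row-value support.

```python
import base64
import json
import sqlite3
from typing import Optional


def encode_cursor(created_at: str, order_id: int) -> str:
    """Pack the last seen sort key and ID into an opaque token."""
    raw = json.dumps({"created_at": created_at, "id": order_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["created_at"], data["id"]


def list_orders(conn, limit: int, cursor: Optional[str] = None) -> dict:
    # Fetch one extra row to learn whether another page exists.
    if cursor is None:
        rows = conn.execute(
            "SELECT id, created_at FROM orders "
            "ORDER BY created_at DESC, id DESC LIMIT ?",
            (limit + 1,),
        ).fetchall()
    else:
        created_at, order_id = decode_cursor(cursor)
        rows = conn.execute(
            "SELECT id, created_at FROM orders "
            "WHERE created_at < ? OR (created_at = ? AND id < ?) "
            "ORDER BY created_at DESC, id DESC LIMIT ?",
            (created_at, created_at, order_id, limit + 1),
        ).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    next_cursor = encode_cursor(rows[-1][1], rows[-1][0]) if has_more else None
    return {
        "data": [{"id": r[0], "created_at": r[1]} for r in rows],
        "next_cursor": next_cursor,
        "has_more": has_more,
    }


# Demo data: note the ties on created_at, which the ID tie-break resolves.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2025-01-01"), (2, "2025-01-01"), (3, "2025-01-02"),
     (4, "2025-01-03"), (5, "2025-01-03")],
)

page1 = list_orders(conn, limit=2)
page2 = list_orders(conn, limit=2, cursor=page1["next_cursor"])
page3 = list_orders(conn, limit=2, cursor=page2["next_cursor"])
```

In a real service the cursor would usually also be signed or encrypted so clients cannot forge one, and the table would need an index on (created_at, id) for the query to stay fast at depth.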
Page Tokens (Google-Style)
Google APIs use a similar pattern with named token fields:
GET /orders?pageSize=25&pageToken=ABC...
→ { "items": [...], "nextPageToken": "DEF..." }
Equivalent to cursors, just with a different naming convention. Pick one and stick with it.
Total Counts: Be Honest
Clients often want a total. The database may not give you one cheaply on a billion-row table. Options:
- Don't return it. Acceptable for infinite-scroll UIs.
- Return an approximate count from cached statistics.
- Return an exact count only on filtered queries small enough to count fast.
Document which you do; never silently switch.
Sorting
Use a simple, stable convention:
GET /orders?sort=created_at,desc
GET /orders?sort=-created_at (JSON:API style: leading - = desc)
GET /orders?sort=status,-created_at (multiple keys, applied left to right)
- Whitelist sortable fields — never let callers sort on arbitrary columns (no index, full scan).
- Always tie-break with the primary key for stable ordering across pages.
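Both rules fit in a small parser. The sketch below assumes the leading-minus convention shown above; the SORTABLE set, the default of newest first, and the function names are hypothetical. Because every column name is checked against the whitelist first, it is safe to interpolate the names into the ORDER BY clause.

```python
from typing import List, Optional, Tuple

SORTABLE = {"created_at", "status", "amount"}  # hypothetical whitelist
TIE_BREAKER = "id"                             # assumed primary key column


def parse_sort(raw: Optional[str]) -> List[Tuple[str, str]]:
    """Parse e.g. 'status,-created_at' into (column, direction) pairs."""
    terms = []
    for part in (raw or "-created_at").split(","):
        part = part.strip()
        descending = part.startswith("-")
        field = part[1:] if descending else part
        if field not in SORTABLE:
            raise ValueError(f"cannot sort by {field!r}")
        terms.append((field, "DESC" if descending else "ASC"))
    # Tie-break on the primary key, in the direction of the last sort key.
    terms.append((TIE_BREAKER, terms[-1][1]))
    return terms


def order_by_clause(terms: List[Tuple[str, str]]) -> str:
    return "ORDER BY " + ", ".join(f"{col} {direction}" for col, direction in terms)
```

For example, order_by_clause(parse_sort("-created_at")) yields ORDER BY created_at DESC, id DESC, while a request for an unlisted column such as password_hash is rejected before any SQL is built.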
Filtering
Most APIs converge on simple equality filters in query strings:
GET /orders?status=open
GET /orders?status=open&customer_id=cus_42
GET /orders?created_at[gte]=2025-01-01&created_at[lt]=2025-02-01
For richer filtering, three approaches scale:
- Bracket operators, e.g. created_at[gte]=... Readable.
- Predefined filter object on POST: POST /orders/search with a JSON body. Bypasses URL length limits and lets you express complex queries.
- RSQL / FIQL: ?filter=status==open;amount=gt=1000. Standardised but less common.
Whatever you pick, document it once and apply it everywhere.
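As an illustration of the bracket-operator style, here is a sketch that turns filter parameters into a parameterised WHERE clause. The FILTERABLE set, the operator names beyond gte and lt, and the build_where function are assumptions. The caller is expected to pass only the filter parameters, with limit, cursor, sort and similar keys already removed.

```python
import re
from typing import List, Tuple
from urllib.parse import parse_qsl

FILTERABLE = {"status", "customer_id", "created_at"}  # hypothetical whitelist
OPERATORS = {"eq": "=", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}

# Matches "field" or "field[op]".
_KEY = re.compile(r"^(\w+)(?:\[(\w+)\])?$")


def build_where(params: List[Tuple[str, str]]) -> Tuple[str, List[str]]:
    """Return a WHERE clause with ? placeholders and the matching values."""
    clauses, values = [], []
    for key, value in params:
        match = _KEY.match(key)
        if match is None:
            raise ValueError(f"malformed filter {key!r}")
        field, op = match.group(1), match.group(2) or "eq"
        if field not in FILTERABLE:
            raise ValueError(f"cannot filter on {field!r}")
        if op not in OPERATORS:
            raise ValueError(f"unknown operator {op!r}")
        # Only whitelisted names reach the SQL text; values stay parameters.
        clauses.append(f"{field} {OPERATORS[op]} ?")
        values.append(value)
    where = "WHERE " + " AND ".join(clauses) if clauses else ""
    return where, values


query = "status=open&created_at[gte]=2025-01-01&created_at[lt]=2025-02-01"
where, values = build_where(parse_qsl(query))
```

For the query string above, where is WHERE status = ? AND created_at >= ? AND created_at < ? and values holds the three literals in the same order.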
Field Selection (Sparse Fieldsets)
Let clients ask for only the fields they need to reduce payload size:
GET /orders?fields=id,status,amount
Useful for large objects on slow networks. GraphQL gives this for free; in REST you opt in.
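On the server side, field selection can be a thin projection step applied just before serialisation. A minimal sketch, assuming a hypothetical ALLOWED_FIELDS whitelist and a plain dict per order; rejecting unknown field names is a design choice here, not something every API does.

```python
from typing import Optional

# Hypothetical set of fields that clients are allowed to request.
ALLOWED_FIELDS = {"id", "status", "amount", "created_at", "customer_id"}


def select_fields(record: dict, raw: Optional[str]) -> dict:
    """Project a record down to the comma-separated fields in `raw`."""
    if not raw:
        return dict(record)  # no `fields` parameter: return everything
    requested = [name.strip() for name in raw.split(",") if name.strip()]
    unknown = [name for name in requested if name not in ALLOWED_FIELDS]
    if unknown:
        raise ValueError(f"unknown fields: {', '.join(unknown)}")
    return {name: record[name] for name in requested if name in record}


order = {
    "id": "ord_100",
    "status": "open",
    "amount": 4200,
    "created_at": "2025-01-15",
    "customer_id": "cus_42",
}
slim = select_fields(order, "id,status,amount")
```

Projecting after the query keeps the sketch short. Pushing the field list into the SELECT itself saves database and network work as well, at the cost of more query-building code.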
Embedding / Expanding Relations
Avoid forcing N+1 client requests by allowing inline expansion of related resources:
GET /orders/100?expand=customer,line_items
→ {
"id": "ord_100",
"customer": { ... },
"line_items": [ ... ]
}
Whitelist what can be expanded; cap nesting depth.
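Validating the expand parameter might look like the sketch below. It assumes nested relations are written as dotted paths (for example customer.address), which the examples above do not specify; the EXPANDABLE set and the MAX_DEPTH of 2 are likewise illustrative.

```python
from typing import List, Optional

# Hypothetical whitelist; dotted paths denote nested relations.
EXPANDABLE = {"customer", "line_items", "customer.address"}
MAX_DEPTH = 2  # assumed cap on nesting


def parse_expand(raw: Optional[str]) -> List[str]:
    """Validate a comma-separated `expand` value and return its paths."""
    if not raw:
        return []
    paths = []
    for path in raw.split(","):
        path = path.strip()
        depth = path.count(".") + 1
        if depth > MAX_DEPTH:
            raise ValueError(f"expansion {path!r} exceeds depth {MAX_DEPTH}")
        if path not in EXPANDABLE:
            raise ValueError(f"cannot expand {path!r}")
        paths.append(path)
    return paths
```

Checking depth before the whitelist lets the error message distinguish "too deep" from "not allowed", which is easier for client developers to act on.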
Putting It Together
GET /orders?
status=open&
created_at[gte]=2025-01-01&
sort=-created_at&
limit=50&
cursor=...&
fields=id,status,amount&
expand=customer
→ {
"data": [ ... ],
"next_cursor": "...",
"has_more": true
}
Five small conventions and your list endpoints scale from prototype to billion-row table without breaking changes.
Cert Mapping
| Cert | Pagination scope |
|---|---|
| AWS SAA | API Gateway pagination; DynamoDB pagination tokens |
| AWS Data Engineer | Pagination in batch ingestion APIs |
The next lesson moves to GraphQL, which approaches these problems differently.