GraphQL N+1 and DataLoader
Why GraphQL resolvers cause N+1 query storms and how DataLoader batches and caches them away. Clear examples, real code, and the pitfalls.
What you'll learn
- Why GraphQL resolvers naturally cause N+1
- How DataLoader batches calls within a tick
- How per-request caching avoids duplicate work
- How to wire DataLoader into a typical schema
- Pitfalls with auth and cache scope
Prerequisites
- Basic GraphQL schemas and resolvers
The first time a GraphQL server hits production, the database lights up. A query for 50 posts triggers 50 user lookups plus 50 comment counts. That is N+1, and it is built into how naive resolvers work. DataLoader is the standard fix. This post is about why N+1 happens and how to make it go away without rewriting your schema.
Why N+1 happens
In GraphQL, each field is resolved independently. A list field returns N items, then each item’s nested field is resolved once per item. Each of those resolvers is its own function and does its own database call. No magic batching happens.
{
  posts(limit: 50) {
    title
    author { name }
  }
}
Naively: one query for posts, then 50 queries for authors. That is 51 queries for 50 posts.
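To make the count concrete, here is a minimal sketch that imitates naive resolution against an in-memory stand-in for the database. The `fakeDb` object, its methods, and its `queryCount` field are hypothetical and exist only for this illustration.

```javascript
// Hypothetical in-memory "database" that counts every query it receives.
const fakeDb = {
  queryCount: 0,
  posts: Array.from({ length: 50 }, (_, i) => ({
    id: i + 1,
    title: `Post ${i + 1}`,
    authorId: i + 1,
  })),
  findPosts(limit) {
    this.queryCount += 1; // one query for the whole list
    return this.posts.slice(0, limit);
  },
  findUserById(id) {
    this.queryCount += 1; // one query per call
    return { id, name: `User ${id}` };
  },
};

// Naive resolution: fetch the list, then look up the author once per post.
const posts = fakeDb.findPosts(50);
const result = posts.map((post) => ({
  title: post.title,
  author: { name: fakeDb.findUserById(post.authorId).name },
}));

console.log(fakeDb.queryCount); // 51
```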
Mental model
Without DataLoader (N+1):
  posts query --> [p1..p50]
                   |
                   +-- author(p1)  -> SELECT user WHERE id=1
                   +-- author(p2)  -> SELECT user WHERE id=2
                   ...
                   +-- author(p50) -> SELECT user WHERE id=50

With DataLoader (batched):
  posts query --> [p1..p50]
                   |
                   +-- author(p1..p50) collected in one tick
                       -> SELECT user WHERE id IN (1..50)

DataLoader collects all the keys you ask for during one event-loop tick and dispatches a single batch function with all of them at once.
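The "collect within a tick" idea can be sketched in a few lines. This is not DataLoader's actual implementation: `makeTinyLoader` is a hypothetical toy with no caching and simplified scheduling, meant only to show how keys queued in the same tick end up in one batch call.

```javascript
// A deliberately tiny batching loader, for illustration only.
function makeTinyLoader(batchFn) {
  let queue = []; // pending { key, resolve, reject } entries

  function dispatch() {
    const batch = queue;
    queue = [];
    // Call the batch function once with every key queued so far.
    batchFn(batch.map((item) => item.key)).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i])),
      (err) => batch.forEach((item) => item.reject(err)),
    );
  }

  return {
    load(key) {
      return new Promise((resolve, reject) => {
        // The first key queued in this tick schedules the dispatch.
        if (queue.length === 0) process.nextTick(dispatch);
        queue.push({ key, resolve, reject });
      });
    },
  };
}

const batches = [];
const loader = makeTinyLoader(async (ids) => {
  batches.push(ids); // record what each batch call received
  return ids.map((id) => ({ id, name: `User ${id}` }));
});

Promise.all([loader.load(1), loader.load(2), loader.load(3)]).then((users) => {
  console.log(batches);      // [ [ 1, 2, 3 ] ]
  console.log(users.length); // 3
});
```

All three load calls run synchronously, so they share one queue and one dispatch.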
Hands-on: a typical DataLoader
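const DataLoader = require('dataloader');

// Build a fresh loader per request; db is the application's database client.
function makeUserLoader(db) {
  return new DataLoader(async (ids) => {
    // One query for every id collected in this batch.
    const rows = await db.users.findMany({ where: { id: { in: ids } } });
    // Index rows by id, then answer in the same order as ids.
    const byId = new Map(rows.map((r) => [r.id, r]));
    return ids.map((id) => byId.get(id) ?? null);
  });
}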
The contract is strict: the batch function takes an array of keys and returns a Promise for an array of the same length, in the same order. Missing keys become null or an Error. Get the ordering wrong and values attach to the wrong keys, which is silent data corruption.
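The ordering contract can be isolated into a small helper so that every batch function gets it right the same way. The sketch below assumes rows carry an `id` field. `alignToKeys` is a hypothetical name, not part of DataLoader, and it uses the Error option for missing keys rather than null.

```javascript
// Hypothetical helper: return one entry per key, in key order.
// Keys with no matching row get an Error instead of a value.
function alignToKeys(keys, rows, getKey = (row) => row.id) {
  const byKey = new Map(rows.map((row) => [getKey(row), row]));
  return keys.map((key) =>
    byKey.has(key) ? byKey.get(key) : new Error(`No row found for key ${key}`),
  );
}

// Rows come back in arbitrary order, and id 2 is missing.
const rows = [{ id: 3, name: 'Cy' }, { id: 1, name: 'Ann' }];
const aligned = alignToKeys([1, 2, 3], rows);

console.log(aligned.length);              // 3
console.log(aligned[0].name);             // Ann
console.log(aligned[1] instanceof Error); // true
console.log(aligned[2].name);             // Cy
```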
Wiring it into a schema
Loaders belong on the per-request context. Create them fresh for each request so the cache does not leak data between users.
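const { ApolloServer } = require('@apollo/server');
const { startStandaloneServer } = require('@apollo/server/standalone');

// typeDefs and db are defined elsewhere in the app.
const server = new ApolloServer({
  typeDefs,
  resolvers: {
    Post: {
      // Each call only queues a key; the loader batches them.
      author: (post, _, ctx) => ctx.loaders.user.load(post.authorId),
    },
  },
});

// CommonJS has no top-level await, so start the server inside an async function.
async function main() {
  await startStandaloneServer(server, {
    // Runs once per request, so every request gets fresh loaders.
    context: async () => ({
      loaders: { user: makeUserLoader(db) },
    }),
  });
}

main();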
Now the 50-post query fires one batched user lookup. From 51 queries to 2.
Beyond simple by-id loaders
DataLoader works for anything keyable: counts, joins, even per-key paginated lists.
const commentCountLoader = new DataLoader(async (postIds) => {
  // One grouped query for every post id in the batch.
  const rows = await db.$queryRaw`
    SELECT post_id, COUNT(*)::int AS n
    FROM comments
    WHERE post_id = ANY(${postIds})
    GROUP BY post_id`;
  const byId = new Map(rows.map((r) => [r.post_id, r.n]));
  // Posts with no comments produce no row, so default them to 0.
  return postIds.map((id) => byId.get(id) ?? 0);
});
For loaders that take composite keys (e.g., “comments for post X with status Y”), key by a stable string and parse it inside the batch function.
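One way to build such a key is to serialize the parts as a JSON array, which keeps the order fixed and parses back unambiguously. The helper names below (`toKey`, `fromKey`, `groupByStatus`) are hypothetical. The grouping step shows how a batch function might split keys so it can run one query per status.

```javascript
// Hypothetical key helpers for a "comments for post X with status Y" loader.
function toKey(postId, status) {
  return JSON.stringify([postId, status]);
}

function fromKey(key) {
  const [postId, status] = JSON.parse(key);
  return { postId, status };
}

// Inside a batch function: group requested post ids by status.
function groupByStatus(keys) {
  const groups = new Map();
  for (const key of keys) {
    const { postId, status } = fromKey(key);
    if (!groups.has(status)) groups.set(status, []);
    groups.get(status).push(postId);
  }
  return groups;
}

const keys = [toKey(1, 'approved'), toKey(2, 'approved'), toKey(1, 'pending')];
const groups = groupByStatus(keys);

console.log(groups.get('approved')); // [ 1, 2 ]
console.log(groups.get('pending'));  // [ 1 ]
```

The batch function still has to return results in the order of the original keys, not in group order.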
The cache is per-request
DataLoader caches load(key) results within its lifetime. Because the loader is created per request, the cache is per-request too. That is good: it avoids stale or cross-tenant data. It also means you do not get cache hits across requests; for that, use a Redis or in-process cache below the DataLoader.
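A cache "below" the loader can live inside the batch function: check the shared cache first and fetch only the misses. The sketch below uses a plain Map as the in-process cache. `withSharedCache` and `fetchMany` are hypothetical names, and the sketch leaves out expiry and invalidation, which any cache shared across requests needs.

```javascript
// Hypothetical wrapper: consult a shared cache first, fetch only the misses.
// fetchMany(ids) must resolve to values in the same order and length as ids.
function withSharedCache(cache, fetchMany) {
  return async function batchFn(ids) {
    const misses = ids.filter((id) => !cache.has(id));
    if (misses.length > 0) {
      const fetched = await fetchMany(misses);
      misses.forEach((id, i) => cache.set(id, fetched[i]));
    }
    // Answer in the order of the requested ids.
    return ids.map((id) => cache.get(id));
  };
}

const sharedCache = new Map(); // lives longer than any single request
let fetchCalls = 0;
const batchFn = withSharedCache(sharedCache, async (ids) => {
  fetchCalls += 1;
  return ids.map((id) => ({ id }));
});

async function main() {
  await batchFn([1, 2]); // both miss: one fetch
  await batchFn([2, 3]); // only 3 misses: one more fetch
  await batchFn([1, 3]); // all hits: no fetch
  console.log(fetchCalls); // 2
}

main();
```

The result can be passed to a per-request loader as its batch function. Only data that is safe to share between users belongs in such a cache.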
Common pitfalls
- Returning a different array length from the batch function. DataLoader maps results by index; getting this wrong silently misaligns data.
- Sharing a loader across requests. The cache is now global and you will leak data across users.
- Loading the same key with different “shapes” (e.g., user with vs without email). Use separate loaders per query shape.
- Calling await between collecting keys. DataLoader batches only what is queued within one tick; awaiting in the middle breaks the batch.
- Throwing inside the batch function for one bad key. Return an Error for that index instead, so the others still resolve.
- Ignoring authorization. DataLoader does not know about access checks. Authorize the result before returning to the resolver.
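For the authorization pitfall, one option is a small wrapper that loads through the loader and then checks access before handing the value back. The sketch below assumes a simple rule (users see their own record, admins see any). `canView`, `loadUserFor`, and `stubLoader` are hypothetical names, and `stubLoader` stands in for a per-request DataLoader.

```javascript
// Hypothetical access rule: users may see their own record; admins may see any.
function canView(viewer, user) {
  return viewer.role === 'admin' || viewer.id === user.id;
}

// Load through the loader first, then authorize the result.
async function loadUserFor(viewer, loader, id) {
  const user = await loader.load(id);
  if (user === null) return null;
  if (!canView(viewer, user)) throw new Error('Not authorized');
  return user;
}

// Stand-in for a per-request DataLoader; only load(key) matters here.
const stubLoader = { load: async (id) => ({ id, name: `User ${id}` }) };

loadUserFor({ id: 1, role: 'member' }, stubLoader, 1)
  .then((user) => console.log(user.name)); // User 1
loadUserFor({ id: 1, role: 'member' }, stubLoader, 2)
  .catch((err) => console.log(err.message)); // Not authorized
```

Whether to throw or return null for a denied record is a design choice. The check has to run on every path that returns loaded data.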
Practical tips
- Make a loaders factory keyed by user, so context construction is one line.
- For lists per parent (“comments of post X”), key by postId and have the batch return arrays. Document the contract clearly.
- Use Prisma’s findMany({ where: { id: { in } } }) or raw SQL with ANY($1) for the batched lookup; both are fast.
- Add metrics: batch size distribution tells you whether DataLoader is helping or whether resolvers are forced into sequential awaits.
- Pair DataLoader with query complexity limits. Batching helps databases; it does not stop a client from asking for a million nodes.
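The batch-size metric from the tips above can be collected by wrapping the batch function before handing it to the loader. This is a minimal sketch. `withBatchMetrics` and `recordSize` are hypothetical names, and a real setup would send the sizes to a metrics client rather than an array.

```javascript
// Hypothetical wrapper: record the size of every batch, then delegate.
function withBatchMetrics(batchFn, recordSize) {
  return (keys) => {
    recordSize(keys.length);
    return batchFn(keys);
  };
}

const sizes = [];
const batchUsers = withBatchMetrics(
  async (ids) => ids.map((id) => ({ id })), // stand-in for a real batched query
  (n) => sizes.push(n),
);

batchUsers([1, 2, 3]);
batchUsers([4]);

console.log(sizes); // [ 3, 1 ]
```

A distribution dominated by batches of size 1 suggests that keys are being requested one at a time, for example behind sequential awaits.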
Wrap-up
GraphQL’s per-field resolver model is what makes the API expressive, and it is also what makes N+1 inevitable. DataLoader solves the problem with two ideas: collect within a tick, return in order. Wire one loader per data source, scope to the request, and most of your “GraphQL is slow” complaints will go away.