Dataloader Problem

(See example query below)

Sequence of HTTP requests (with naive DataLoader usage):

  1. Fetch a page of article IDs

    /article-ids/?first=10

  2. Batch fetch the articles represented by each ID

    /articles/batch/{id1,id2,...id10}/

  3. For each article, fetch a page of comment IDs

    /articles/{n}/comment-ids/?first=3 (repeats 10 times)

  4. For each page of comment IDs, fetch the comments (these can't be batched together, because the previous steps don't complete at the same time)

    /comments/batch/{id1,id2,id3}/ (repeats 10 times)

Total number of requests: 22
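
A minimal sketch of what the naive usage looks like, assuming JSON endpoints shaped like the URLs above (the fetchJson helper and loader names are illustrative, not part of the original):

    const DataLoader = require('dataloader');

    // Hypothetical helper; assumes each endpoint returns a JSON array
    // aligned with the requested IDs.
    const fetchJson = (url) => fetch(url).then((res) => res.json());

    // Articles batch cleanly: the key is just an ID, so 10 loads
    // collapse into one request.
    const articleLoader = new DataLoader((ids) =>
      fetchJson(`/articles/batch/${ids.join(',')}/`)
    );

    // Comments are also keyed by plain IDs...
    const commentLoader = new DataLoader((ids) =>
      fetchJson(`/comments/batch/${ids.join(',')}/`)
    );

    // ...but each article fetches its page of comment IDs independently,
    // so the comment loads resolve on different ticks and the loader
    // issues 10 batches of 3 instead of 1 batch of 30.
    async function commentsForArticle(articleId, first) {
      const commentIds = await fetchJson(
        `/articles/${articleId}/comment-ids/?first=${first}`
      );
      return commentLoader.loadMany(commentIds);
    }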

Now with more sophisticated DataLoader usage:

  1. Fetch a page of article IDs

    /article-ids/?first=10

  2. Batch fetch the articles represented by each ID

    /articles/batch/{id1,id2,...id10}/

  3. Batch fetch pages of 3 comment IDs for each article

    /articles/{id1,id2,...id10}/comment-ids/batch/?first=3 (one request for all 10 articles)

  4. Fetch the comments for all pages in a single batch

    /comments/batch/{id1,id2,...id30}/

Total number of requests: 4
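
Getting there requires the page size to be part of the loader key. One way to sketch it, using DataLoader's cacheKeyFn option and assuming the batch endpoint above returns a map from article ID to its page of comment IDs:

    // The key is { articleId, first }: pagination arguments ride along
    // with the ID. cacheKeyFn is needed because two distinct objects
    // never compare equal as cache keys.
    const commentPageLoader = new DataLoader(
      async (keys) => {
        // Assumes every key in this batch shares one page size; see the
        // grouping sketch below for the mixed case.
        const first = keys[0].first;
        const ids = keys.map((key) => key.articleId).join(',');
        const pages = await fetchJson(
          `/articles/${ids}/comment-ids/batch/?first=${first}`
        );
        return keys.map((key) => pages[key.articleId]);
      },
      { cacheKeyFn: (key) => `${key.articleId}:${key.first}` }
    );

Because all 10 pages now resolve together, the 30 comment IDs they contain arrive in the same tick, and the plain comment loader coalesces them into one final batch, hence 4 requests total.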

Why is the second way harder? DataLoader assumes that the only information needed to construct a batch request is a list of IDs. When dealing with graph nodes, that assumption holds. But when dealing with edges (paginated connections), we need more information, specifically the page size.
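
If different connections in the same query ask for different page sizes, the batch function can't assume a single first value; it has to group its keys by page size before issuing requests. A sketch of that grouping step (reusing the assumed fetchJson helper and endpoint from above):

    // Partition composite keys by page size, issue one request per group,
    // and hand each page back in the original key order.
    async function batchCommentPages(keys) {
      const groups = new Map();
      keys.forEach((key, index) => {
        if (!groups.has(key.first)) groups.set(key.first, []);
        groups.get(key.first).push({ key, index });
      });

      const results = new Array(keys.length);
      await Promise.all(
        [...groups].map(async ([first, entries]) => {
          const ids = entries.map(({ key }) => key.articleId).join(',');
          const pages = await fetchJson(
            `/articles/${ids}/comment-ids/batch/?first=${first}`
          );
          entries.forEach(({ key, index }) => {
            results[index] = pages[key.articleId];
          });
        })
      );
      return results;
    }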

Example query:

    {
      viewer {
        articles(first: 10) {
          edges {
            node {
              title
              slug
              body
              comments(first: 3) {
                edges {
                  node {
                    email
                    body
                  }
                }
              }
            }
          }
        }
      }
    }