@paralin
Last active July 3, 2018 19:28
GraphQL @defer, @live, @stream implementation notes

Notes moved from graph-gophers/graphql-go#15

Graphql-js can parse directives like @live just fine:

parsed.definitions[0].selectionSet.selections[0].selectionSet.selections[0].directives[0] =
{ kind: 'Directive',
  name: { kind: 'Name', value: 'live', loc: { start: 26, end: 30 } },
  arguments: [],
  loc: { start: 25, end: 30 } }

So, we can write things like this:

{
  likeCount @live
}

The question then is how the data is returned. This is more of a client question - I plan to patch this stuff into apollo-client. Apollo client and relay both have internal caches / stores. A query looks like this:

  • Parse the query
  • Transform it (remove unnecessary / cached fields, add __typename)
  • Send the query to the server (graphql-go)
  • Receive the response, parse it, apply changes to the internal cache
  • Rebuild the response data using the original user-supplied query, and the cache as the datastore.
  • Deliver the data back to the user.

Right now responses look like this:

{
  "data": {
    "regions": [
      {
        "__typename": "Region",
        "id": "av1",
        "ipRange": {
          "__typename": "IPRange",
          "ip": {
            "__typename": "IPAddress",
            "address": [
              10,
              110,
              1,
              0
            ]
          },
          "plen": 24
        },
        "location": {
          "__typename": "GeoLocation",
          "latitude": 40.4257789,
          "longitude": -86.9193591
        },
        "name": "Anvil",
        "zoomlevel": 16
      }
    ]
  }
}

With @live we would expect realtime updates to come back. With @defer, you would initially ignore the deferred fields and return without them, then later send the data back in a separate message. Facebook's proposed "patch" message would look like this:

{
  "path": ["feed", "stories", 0, "comments"],
  "data": [{
      "name": "tester",
      "comment": "test"
    }]
}

I would argue more for something like my json-mutate package, where you can do more granular updates of messages if they change. But perhaps adhering to Facebook's style is better, as they seem to already be using it in production and probably won't want to change down the line.

So, as far as I can tell the things that need to be done here to support this (experimentally!) are:

  • Graphql-js already supports the directives
  • Assess graphql-go's directive parser and see if anything needs to be done there (done)
  • Add the relevant code to apollo-client to:
    • Handle long-term queries and provide an API handle to cancel / terminate them (or do so automatically)
    • Batch together @defer and @live queries to avoid duplicate streams of data
    • Add a NetworkInterface to support live queries
    • Decide on a format for the patch messages and add a mechanism to process them + apply them to the cache. Apollo's existing code should be capable of handling updates to the cached data and communicating these updates back to the UI code.
  • Add the relevant code to graphql-go to handle these types of queries.

In terms of graphql-go - I think for @defer we can implement it without any changes to the resolver function pattern. Instead of waiting for all of the resolvers to finish, we can wait only for the required ones, send back the response immediately, and send patches later as the resolvers finish.
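
The shape of that idea can be sketched roughly like this (all names here are hypothetical illustrations, not graphql-go's actual internals): required resolvers gate the initial response, while deferred resolvers report patches through a channel as they complete.

```go
package main

import (
	"fmt"
	"sync"
)

// patch mirrors the path-style patch message described above.
type patch struct {
	Path []interface{}
	Data interface{}
}

// execDeferred is a hypothetical sketch: required resolvers gate the initial
// response, deferred resolvers deliver patches later as they finish.
func execDeferred(required, deferred map[string]func() interface{}) (map[string]interface{}, <-chan patch) {
	initial := make(map[string]interface{})
	var mu sync.Mutex
	var requiredWG, allWG sync.WaitGroup
	patches := make(chan patch)

	for name, resolve := range required {
		requiredWG.Add(1)
		go func(name string, resolve func() interface{}) {
			defer requiredWG.Done()
			v := resolve()
			mu.Lock()
			initial[name] = v
			mu.Unlock()
		}(name, resolve)
	}
	for name, resolve := range deferred {
		allWG.Add(1)
		go func(name string, resolve func() interface{}) {
			defer allWG.Done()
			patches <- patch{Path: []interface{}{name}, Data: resolve()}
		}(name, resolve)
	}
	// Close the patch channel once every deferred resolver has finished.
	go func() {
		allWG.Wait()
		close(patches)
	}()

	requiredWG.Wait() // wait only for the required fields
	return initial, patches
}

func main() {
	initial, patches := execDeferred(
		map[string]func() interface{}{"name": func() interface{} { return "Tom" }},
		map[string]func() interface{}{"description": func() interface{} { return "slow field" }},
	)
	fmt.Println(initial["name"])
	for p := range patches {
		fmt.Println(p.Path[0], p.Data)
	}
}
```
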

For @live and subscriptions, we already have Context in the resolver functions. I think the difference here is, we need to return something like a channel from the resolver. When the query is terminated, the context will be canceled. If the resolver needs to stop sending data, it can close the channel, and optionally send an error down the channel just before doing that.

Shouldn't be too difficult! :)

User-Facing API

Right now we have a single Execute() call which returns a single response object. Obviously we need an API change if we want to handle multiple responses. graphql-go should support a second mode of operation that would allow these kinds of deferred requests. I'm proposing calling this function ExecStream, which would return a channel of *Response objects instead of just a single response. Again, the server implementation can indicate query completion by closing the response channel, and the requester can terminate the long-lived query by canceling the parent context.

Here are the behaviors for various combinations of the new directives:

  • @live with @defer should not wait for an initial value.
  • @live without @defer should wait for an initial value to include in the first response.
  • @stream implies @defer. Therefore, @stream with @defer is equivalent to just @stream.

These behaviors hold true in the GraphQL-Go example implementation.

Message Streams

There also needs to be a mechanism to pair messages coming from a server with a specific query context. In apollo-client this is left up to the network transport, and returns an integer ID for the query context, which can later be released (like a setTimeout handle). This is a good approach in my opinion.

I'm already using a similar approach in my WebSocket transport between Apollo and GraphQL-Go for regular requests - since we don't have the context of an HTTP request, there needs to be an identifier for each request. I think this is best left up to the user / network transport to handle. There are a number of varying approaches depending entirely on the transport - if you're using long-lived HTTP requests for streaming requests, for example, such an identifier might be unnecessary, as you still have the structure of an HTTP request.

Internal Implementation Changes

Since you could send standard queries alongside complex long-lived queries together through this kind of call, it then makes sense to standardize the server code. Calling resolvers should be the same code regardless of Stream or no Stream. We can add a parameter to iExec.exec for "channel" which, when false, would cause the server to completely ignore directives like defer and live.

Right now exec returns an interface{}. Lists are executed in parallel, spawning a goroutine for each element in the list, remembering the index the result should be in. This code is already super close to what would be necessary to do a @stream directive, which returns array elements over time as they are resolved (although index would probably not be necessary in this case). Relevant code for processing things like lists is here.

I would propose allowing resolvers to return a channel, rather than an array, with an optional argument for a struct with information about the request (is it @stream or not?). If we are running a standard Execute we can wait until the channel is closed to return the result from that resolver. If we are running an ExecuteStream with the @stream set on the field, we can send each result entry in the channel as it comes in over the wire, and not wait for the channel to close to send the initial response.
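
One way to picture how a single channel-returning resolver could serve both modes (the helpers below are invented for illustration, not actual graphql-go code): a plain Execute drains the channel into a slice, while ExecStream with @stream forwards each element as it arrives.

```go
package main

import "fmt"

// collect drains a resolver's channel into a slice, as a plain Execute would:
// it does not return until the resolver closes the channel.
func collect[T any](ch <-chan T) []T {
	var out []T
	for v := range ch {
		out = append(out, v)
	}
	return out
}

// stream forwards each element as it arrives, as ExecStream with @stream
// would, invoking send once per list entry instead of waiting for close.
func stream[T any](ch <-chan T, send func(index int, v T)) {
	i := 0
	for v := range ch {
		send(i, v)
		i++
	}
}

// heros is a stand-in channel-returning resolver.
func heros() <-chan string {
	ch := make(chan string, 2)
	ch <- "Tom"
	ch <- "Jerry"
	close(ch)
	return ch
}

func main() {
	fmt.Println(collect(heros())) // Execute: wait for everything
	stream(heros(), func(i int, v string) { // ExecStream + @stream: per element
		fmt.Println(i, v)
	})
}
```
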

iExec.exec() needs a parameter that describes where it lies in the query, so that we can identify the path to the current resolver execution for live update messages. Again, this could be omitted in the non-Stream mode of execution, as we don't need it in that case.

We need a root WaitGroup for nested fields, so we know when the query is finished. Wait to call Wait() on this until after the initial response is built.

We need a channel for responses passed to each of the exec() calls, which can be used for deferred / live responses and passed back to the caller of ExecStream().
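
Taken together, the path parameter, the root WaitGroup, and the shared response channel might be plumbed roughly like this (a sketch; execState and its methods are invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// execState sketches the extra plumbing Stream-mode execution needs: the path
// to the current field, a root WaitGroup tracking outstanding resolvers, and
// a shared channel for deferred/live responses.
type execState struct {
	path      []interface{}
	wg        *sync.WaitGroup
	responses chan<- patchMsg
}

type patchMsg struct {
	Path []interface{}
	Data interface{}
}

// child derives the state for a nested field, appending one path segment.
func (s execState) child(segment interface{}) execState {
	p := make([]interface{}, len(s.path), len(s.path)+1)
	copy(p, s.path)
	s.path = append(p, segment)
	return s
}

// emit sends a deferred/live result tagged with the field's path.
func (s execState) emit(data interface{}) {
	s.responses <- patchMsg{Path: s.path, Data: data}
}

func main() {
	responses := make(chan patchMsg, 1)
	var wg sync.WaitGroup
	root := execState{wg: &wg, responses: responses}

	wg.Add(1)
	go func() {
		defer wg.Done()
		root.child("heros").child(0).child("name").emit("Jerry")
	}()
	wg.Wait() // after the initial response is built, wait for stragglers
	close(responses)
	for m := range responses {
		fmt.Println(m.Path, m.Data)
	}
}
```
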

Doing @live is a bit more complex. I propose optionally returning a channel on non-array fields. If the field is not an array, it would be interpreted as a channel for values over time, rather than objects over time. The resolver is expected to immediately send an initial value over the channel, and then when the value changes, send the new value through the channel later. GraphQL-Go can then interpret the changes, format them into change messages for the client (in the path style as written above), and then format + send the Response object over the response channel.
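
A toy sketch of that interpretation (watchLive is a hypothetical name): the first value received from the channel belongs in the initial response, and every later value becomes a change message.

```go
package main

import "fmt"

// liveValue records whether a value was the field's initial value or a
// later change that should become a path-style patch message.
type liveValue struct {
	Initial bool
	Data    interface{}
}

// watchLive sketches how graphql-go could consume a @live resolver's channel:
// the first value is embedded in the initial response, each subsequent value
// is formatted as a change message for the client.
func watchLive(ch <-chan interface{}) []liveValue {
	var out []liveValue
	first := true
	for v := range ch {
		out = append(out, liveValue{Initial: first, Data: v})
		first = false
	}
	return out
}

func main() {
	ch := make(chan interface{}, 2)
	ch <- "Tom"   // initial value, part of the first response
	ch <- "Jerry" // later change, sent as a patch
	close(ch)
	fmt.Println(watchLive(ch))
}
```
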

The introspection code should conditionally return the streaming directives depending on if Execute or ExecuteStream was called.

// (Example assumes "context" and "time" are imported.)

schema := `
type Character {
  name: String
  description: String
}
type RootQuery {
  hero(episode: String): Character
  heros(episode: String): [Character]
}
`

type ExampleResolver struct{}

// Current kinds of resolvers (in master today):
basicQuery := `
query testQuery($episode: String) {
  heros(episode: $episode) {
    name
  }
}
`

liveQuery := `
query liveQuery($episode: String) {
  heros(episode: $episode) @stream {
    name @live
    description @defer
  }
}
`

func (e *ExampleResolver) Hero(args *struct{ Episode string }) *CharacterResolver {
	return &CharacterResolver{}
}

func (e *ExampleResolver) Heros(ctx context.Context, args *struct{ Episode string }) *[]*CharacterResolver {
	res := []*CharacterResolver{
		&CharacterResolver{},
		&CharacterResolver{},
	}
	return &res
}

// Proposed "live" resolvers
type CharacterResolver struct{}

// This resolver allows the character's name to change over time.
func (r *CharacterResolver) Name(ctx context.Context) <-chan *string {
	ch := make(chan *string, 10)
	go func() {
		defer close(ch)
		value1 := "Tom"
		value2 := "Jerry"
		ch <- &value1
		select {
		case <-time.After(1 * time.Second):
		case <-ctx.Done(): // If we are not doing a live query, the context will be canceled immediately after the first value.
			return
		}
		ch <- &value2
	}()
	return ch
}

// This resolver takes some time to resolve the character's description.
func (r *CharacterResolver) Description(ctx context.Context) *string {
	select {
	case <-time.After(500 * time.Millisecond):
	case <-ctx.Done(): // If we are not doing a live query, the context will be canceled immediately after the first value.
		return nil
	}
	value1 := "Tom was a rather foolish cat who could never catch any mice."
	return &value1
}

// This resolver returns heros over time if @stream is specified. Otherwise, graphql-go will build an array with all the results.
// NOTE: you could also return a channel of interface{} if you wanted to potentially return an error.
func (e *ExampleResolver) Heros(ctx context.Context, args *struct{ Episode string }) <-chan *CharacterResolver {
	ch := make(chan *CharacterResolver, 2)
	go func() {
		defer close(ch)
		// Wrap these sends in a select on ctx.Done() if we were actually doing any real work.
		ch <- &CharacterResolver{}
		ch <- &CharacterResolver{}
	}()
	return ch
}

In the above code example, we introduce the following:

  • @stream: returns array elements over time, rather than all at once.
  • @live: observes changes to a field over time
  • @defer: returns the rest of the query immediately, and resolves + returns the data in this field later.

Here is what a set of response objects to the above might look like.

[{
  "data": {}
},
{
  "data": {
    "path": ["heros", 0],
    "data": {
      "name": "Tom"
    }
  }
}, {
  "data": {
    "path": ["heros", 0, "description"],
    "data": "Tom was a rather foolish cat who could never catch any mice."
  }
}, {
  "data": {
    "path": ["heros", 0, "name"],
    "data": "Jerry"
  }
}]

Obviously, there's the issue of what happens when a field returns an error - see the comment in the above example code.

It's up to graphql-go to determine the index of each of the returned objects in a @stream.

The implementation of the server is now complete at http://github.com/paralin/graphql-go/ on the feat-defer branch.

Demos:

TODO:

  • Allow changes to entire object results. For example, return a <-chan <-chan *ResolverType to invalidate an entire result by returning a new one.
  • Handle returning errors over a @live channel.
  • Handle returning errors over a @stream.
paralin commented Feb 6, 2017

I've moved effort on this to my RealTime GraphQL project.
