@slightlyoff
Last active September 30, 2022 23:11
Delivering H/2 Push Payloads To Userland

Background

One of the biggest missed opportunities thus far with HTTP/2 ("H/2") is that we are not yet able to sunset WebSockets in favor of H/2. Web Sockets and H/2 both support multiplexing messages bi-directionally and can send both textual and binary data.

Server Sent Events ("SSE"), by contrast, are not bi-directional (they're a "server-push-only" channel) and binary data cannot be sent easily. They are, however, very simple to implement. Adding to the menagerie of options, RTCPeerConnection can also be used to signal data to applications in a low-latency (but potentially lossy) way.

Because H/2 does not support the handshake (upgrade) that WebSockets use to negotiate a connection (abandoned draft approach to repairing here), web developers are faced with a complex set of choices regarding how to handle low-latency data updates for their apps.

Web apps may benefit from a WebSocket-like API for receiving data updates without a custom protocol (e.g., Web Sockets or SSE) and associated infrastructure. An API for H/2 push receipts seems like a natural solution to this quandary, but H/2 has created a non-origin sharing concept for pushed responses. The upshot is that H/2's behavior needs to be further constrained to only consider single origins (or URLs) so that an application is not incidentally confused by pushes into a shared H/2 cache by an app for which the server is also authoritative but which the application does not wish to trust.

Proposal

The Server Sent Events API, perhaps, provides a way forward. What if, instead of only connecting to SSE-protocol supporting endpoints, a call to new EventSource(...) could also subscribe to pushed updates to individual URLs (or groups of URLs) and allow notification to the application whenever an H/2 PUSH_PROMISE resolves?

A minor extension to SSE might allow use of H/2 push as a substrate for delivering messages, with a slight modification of the event payload to accommodate providing Response objects as values:

// If an H/2 channel is open, any PUSH_PROMISE for the resource at
// `/endpoint` will trigger a "resourcepush" event
let es = new EventSource("/endpoint", { 
  withCredentials: true
});
es.addEventListener("resourcepush", (e) => {
  console.log(e.data instanceof Response);
});

Opting in to globbing for all push promises below a particular scope might look like:

// Any resource pushed to "/endpoint" or anything that matches that
// as a prefix will deliver to this handler
let es = new EventSource("/endpoint", { 
  withCredentials: true,
  matchScope: true
});
es.addEventListener("resourcepush", (e) => {
  console.log(e.data.url);
});

// If multiple handlers are installed, it's longest-prefix-wins, à la
// Service Workers. (A distinct variable name avoids redeclaring `es`.)
let channelEs = new EventSource("/endpoint/channel", {
  withCredentials: true,
  matchScope: true
});
// If a push is sent to `/endpoint/blargity`, it will be handled by the
// previous source. If a push is sent to `/endpoint/channel`,
// `/endpoint/channel1`, or `/endpoint/channel2`, it will be delivered below:
channelEs.addEventListener("resourcepush", (e) => {
  console.log("channel-pushed resource:", e.data.url);
});
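
The longest-prefix-wins rule described in the comments above can be sketched as a plain function. This is only an illustration of the matching logic, not part of the proposal: `selectSource` and the `{ name, scope, matchScope }` record shape are hypothetical names invented for this sketch.

```javascript
// Hypothetical helper: pick which registered source should receive a push.
// Each entry is an assumed record shape: { name, scope, matchScope }.
function selectSource(sources, pushedPath) {
  let best = null;
  for (const source of sources) {
    // With matchScope the scope acts as a prefix; without it, only an
    // exact path match counts.
    const matches = source.matchScope
      ? pushedPath.startsWith(source.scope)
      : pushedPath === source.scope;
    if (matches && (best === null || source.scope.length > best.scope.length)) {
      best = source; // the longest matching scope wins
    }
  }
  return best;
}

const sources = [
  { name: "endpoint", scope: "/endpoint", matchScope: true },
  { name: "channel", scope: "/endpoint/channel", matchScope: true }
];

console.log(selectSource(sources, "/endpoint/blargity").name); // "endpoint"
console.log(selectSource(sources, "/endpoint/channel1").name); // "channel"
```

A push that matches no registered scope returns `null` here; what the platform should do in that case is left open by the proposal.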

Open questions

  • "resourcepush" is a terrible name.
  • should updates to previously pushed resources be opt-in instead of transparent?
  • do we need a version of this API that allows delivery to Service Workers?
  • how much of a problem is the H/2 vs. Origin Model split in practice? Do we actually know?
  • via Ryan Sleevi, to what extent should we allow subscriptions to any origin?

Alternatives Considered

A notable downside of the proposed design is that it might "miss" same-origin payloads sent earlier in the page's lifecycle over the channel. One could imagine a version of the proposal that adds a flag to enable subscription-time delivery of these resources so long as they're still in the push cache. Putting this feature as another optional flag feels clunky, though.

One way to make it less clunky would be to describe the API in terms of an Observer pattern (e.g. Intersection Observer). The above example might then become:

let options = {
  path: "/endpoint",
  withCredentials: true,
  matchScope: true
};
let callback = (records, observer) => {
  records.forEach((r) => {
    console.log(r instanceof Response); // === true
  });
};

let resourceObserver = new ResourceObserver(callback, options);

This feels somewhat better as one of the key distinguishers of Observer-pattern APIs is their ability to scope interest in a specific subset of all events internal to the system. We also don't need to cancel delivery, so phrasing it as an Event might not be great. It is, however, much more new API surface area.

Another considered alternative might instead put registration for listening inside a Service Worker's installation or activation phase; e.g.:

self.addEventListener("install", (e) => {
  e.registerResourceListener("/endpoint", { 
    withCredentials: true,
    matchScope: true
  });
});

self.addEventListener("resourcepush", (e) => {
  console.log("channel-pushed resource:", e.data.url);
});

This is interesting because it would allow us to avoid delivering these events to all windows and tabs which are open, and doesn't upset our design goal for Service Workers to be shut down when not in use. The ergonomics suffer somewhat: what about pages that don't have a SW (e.g., first-load)? Does handling resource pushes then gate on SW install time (which can take a long while)? Also, Response objects can't be sent via postMessage to clients as they aren't structured-cloneable, meaning the SW would need to signal to a page that an update is available and then, perhaps, transfer the data by extracting it from the Response body to send via postMessage or transfer using the Cache API.
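
The postMessage workaround can be sketched with existing APIs only. This is a minimal illustration under stated assumptions: `responseToMessage` and `relayToClients` are hypothetical helper names, the message shape is invented for this sketch, and `clients` is expected to behave like a Service Worker's `self.clients`.

```javascript
// Hypothetical helper: flatten a Response into a structured-cloneable object.
async function responseToMessage(response) {
  return {
    url: response.url,
    status: response.status,
    headers: [...response.headers], // array of [name, value] pairs
    body: await response.arrayBuffer()
  };
}

// Hypothetical helper: send a copy of the response data to every window
// client. `clients` is assumed to look like `self.clients` in a Service Worker.
async function relayToClients(response, clients) {
  const windows = await clients.matchAll({ type: "window" });
  for (const client of windows) {
    // Clone per client so the original body is never consumed.
    const message = await responseToMessage(response.clone());
    // Transfer the ArrayBuffer rather than copying it.
    client.postMessage(message, [message.body]);
  }
  return windows.length;
}
```

On the page side, the receiver could rebuild a Response with `new Response(msg.body, { status: msg.status, headers: msg.headers })`, though the original `url` cannot be restored that way.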

@benaadams

Initiating the stream with a fetch would be ok; but the fetch send would need to be incomplete as it's N requests for M responses.

e.g. the client might be sending data at 1 chunk per second and the server responding at 30 chunks per second or vice versa; with no actual response being directly related to any request (other than the initialization); or any request related to any response.

To replace WS it needs to be full-duplex (2 unidirectional data flows that aren't request->response, request->response)

request->open stream->bi-directional flows (requests without responses, responses without requests)

response and requests might still be a good unit/representation of a complete message/sub-stream; rather than raw socket where it's just chunks with no protocol indicating where a message begins or ends in the chunk; and they can be interpreted as different data types (arraybuffer, blob, json etc).

Simple example would be chat. The client would periodically send messages (user's typed text); the server would periodically send the other users' chats when they send them (probably at a higher rate).

However, other than in the users' heads, to AI/ML, or in some application-defined way, the messages the client sends and those the server sends aren't necessarily related to each other, other than being in the same medium/channel/stream etc.

@kornelski

For flushing H/2 cached resources I think EventSource API is enough and I don't see much point in using Observer API.

When SSE is created, it could immediately receive events for all cached pushes. If you want that to be explicit, you can introduce a new readyState for this or add a flag to the events to mark them as cached. Normal SSE receives a backlog of previous messages with last-event-id, so getting a backlog of previous pushes is not too different conceptually.
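
The backlog-on-subscribe idea, with a flag marking replayed events, might look roughly like the following in-page model. Everything here is hypothetical: `PushChannel`, its `push`/`subscribe` methods, and the `cached` field are invented for illustration, and the `Map` merely stands in for the browser's push cache.

```javascript
// Hypothetical in-page model: replay earlier pushes to late subscribers.
class PushChannel {
  constructor() {
    this.cache = new Map(); // url -> payload, standing in for the push cache
    this.listeners = new Set();
  }

  // Simulates a push arriving from the server.
  push(url, payload) {
    this.cache.set(url, payload);
    for (const listener of this.listeners) {
      listener({ url, payload, cached: false });
    }
  }

  // New subscribers first receive everything already cached, flagged as
  // such, and then live pushes as they arrive.
  subscribe(listener) {
    for (const [url, payload] of this.cache) {
      listener({ url, payload, cached: true });
    }
    this.listeners.add(listener);
    return () => this.listeners.delete(listener); // unsubscribe function
  }
}
```

The `cached` flag lets a handler tell replayed entries from live ones without needing a separate readyState.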

Alternatively, is it fair to say that H/2 push is a feature of the cache/inseparable from a cache? Maybe it should be an event on window.Cache?

@jakearchibald

@pornel

For flushing H/2 cached resources I think EventSource API is enough

What do you mean by flushing, and why restrict this to EventSource?

@triblondon

triblondon commented Oct 13, 2017

I believe there's always a request to attach to

Wouldn't the request that you want to 'attach' to potentially be the navigation that loaded the current page, not a fetched request?

Personally I like restricting the observability of pushes to service workers, because I'm concerned about potential complexities of triggering push events in multiple open pages. I'm also wondering if the practical implementation optimisations of H2 Push will be a problem for this kind of use case - specifically, if I take a common SSE usecase, like stock price changes, and use H2 push for that, I'll be doing frequent pushes of the same resource over the same connection. It might complicate implementations that optimise around an assumption that once pushed a resource won't be pushed again.

@tyoshino

Web Sockets and H/2 both support multiplexing messages bi-directionally

I was editing an RFC about multiplexing for WebSockets, but it's been abandoned in favor of H2. For now, WebSockets are bidirectional but not multiplexed at all. There have been several proposals of WS/H2 made but nothing's standardized/implemented yet.

What's described in jakearchibald's post is really what's been envisioned for Streams/Fetch integration. It would hopefully realize bi-directional multiplexed communication within the standard HTTP semantics.

That said, WebSockets have an established ecosystem (community, infrastructure, code, library, etc.). According to Chrome's statistics, 4% of page visits are using WebSockets. My gut impression is that both fetch()/Streams and WebSocket API would continue working as a pair of wheels for the Web for a while, at least the API and possibly the framing would also.

To address the issues in the abandoned proposal https://tools.ietf.org/html/draft-hirano-httpbis-websocket-over-http2-01, I've been working on https://tools.ietf.org/html/draft-yoshino-wish-03, and Patrick from Mozilla has also just proposed a WS/H2 bootstrapping idea https://tools.ietf.org/html/draft-mcmanus-httpbis-h2-websockets-00. I'm investigating his proposal now.


Notifying pushed resources to the page and allowing them to be used easily is also an important topic in terms of better resource loading, but should be discussed separately, I think. I feel that PUSH_PROMISE is not the right tool for building something replacing WebSockets as benaadams@ analyzed above. Comet/H2 would just work well for some cases.

I agree that it can be something EventSource-like (in terms of subscribing to some identifier and receiving multiple events notifying some changes received), but it should be a separate API, I think, as EventSource is designed to be a raw data communication interface, not about resources. It would be confusing if we mix it with the "resourcepush" stuff which is very different from the original semantics.

Maybe it should be an event on window.Cache?

window.Cache is a part of the CacheStorage API. AFAIU, it's not an abstraction of the HTTP cache of a UA though it's closely related.

But I agree that the interface for probing PUSH_PROMISE should not be specific to PUSH_PROMISE but a more generalized HTTP cache probing interface. It looks like Ilya also said so in https://gist.github.com/slightlyoff/18dc42ae00768c23fbc4c5097400adfb#gistcomment-2227368. Do we want to distinguish refreshing of a cache entry by push from one by re-validation/reload/etc.? My understanding has been no, though I might be missing some point.

IIUC, the goal of such interface is

  • to allow the page to look at the cache before initiating requests (this can be done also by using the cache options of the Fetch API) for doing something intelligent such as request reordering
  • to allow a server to push new data to use but
    • without polling (though the cost of polling is not so big with H2)
    • unlike WebSockets and EventSource, having the UA parse/decode the data in C++, etc. level so that it can be immediately used for , etc.

Am I understanding correctly?

So, it would be something like the following, I guess:

const cacheStatePromise = probeCache('/foobar');
cacheStatePromise.then(cacheState => {
  // cacheState has various members e.g. timeStamp.
  // It may have a flag indicating whether it's been pushed or not as Ilya said if needed

  // Resolves once any further change happens for the cache entry(entries) for '/foobar'.
  // In case e.g. there's Vary header, it's unknown whether the cache entry has been
  // updated for a certain request with some headers.
  const cacheStatePromise = cacheState.probe();
});
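
For comparison, the first goal listed above (looking at the cache before initiating a request) can be approximated today with the Fetch API's existing `cache: "only-if-cached"` option. This is a rough sketch: `probeHttpCache` is a hypothetical helper name, the `fetchImpl` parameter exists only so the function can be exercised outside a browser, and treating a 504 as a miss is an assumption about how implementations report one. It consults the ordinary HTTP cache and says nothing about whether an entry arrived via push.

```javascript
// Hypothetical helper: resolves to a cached Response, or null on a miss.
// `fetchImpl` is injectable for testing; it defaults to the global fetch.
async function probeHttpCache(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url, {
      cache: "only-if-cached",
      mode: "same-origin" // "only-if-cached" is only permitted with this mode
    });
    // Assumption: some implementations surface a miss as a 504 response
    // instead of rejecting, so a 504 is treated as a miss here.
    return response.status === 504 ? null : response;
  } catch (err) {
    // A miss reported as a network error rejects with a TypeError.
    if (err instanceof TypeError) return null;
    throw err;
  }
}
```

Unlike the `probeCache` idea above, this gives a one-shot answer and cannot notify the page when the entry later changes.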

@KenjiBaheux

KenjiBaheux commented Oct 19, 2017

The discussion around resource notifications to the page is very interesting. Also, the use cases and motivation seem more concrete. Should we split this gist into 2 separate things?

I find it much easier to reason about new proposals when there is a good list of use cases.
Do we have more examples?

Here is one use case I heard from a partner that seems related:
"Publishers tend to add a lot of third party scripts on their pages which hurts performance. RUM/Analytics companies could perform optimization at scale if they were given the proper APIs. For example, assuming the RUM vendor determines that async script X on the main page doesn't affect anything above the fold. It would be a good idea to delay its load and/or delay its processing by the main thread."

Sidenote: offering an API surface to control how/when resources are loaded/processed is scary. What's to prevent a malicious third party script, included in the main document, to down-prioritize / hold-hostage everything except what matters to its own needs? Which mechanisms do we have to help publishers (or the partners they trust) to remain in control of the user experience?

@jakearchibald

@triblondon

Wouldn't the request that you want to 'attach' to potentially be the navigation that loaded the current page, not a fetched request?

Yep. But that would be a fetched request within a service worker.

@rektide

rektide commented Nov 6, 2017

Contrast

I really appreciate how @slightlyoff's SSE proposal, in contrast to Jake's FetchObserver,

  1. Discovers PUSH_PROMISEs bound to resources requested by means other than Fetch!
  2. Means that I do not have to instrument every fetch call with an observer. Globally maintaining one & sticking it on everywhere would be a common not-great practice under Fetch Observing.
  3. Can also limit my scope of listening if I want to (by using matchScope)
  4. Fits in with existing knowledge of how a core piece of the web platform works: EventTarget.

These are very handy advantages for practical development. Getting an event source of what's being sent to the browser is an ideal top level interface, versus some octopus creature where you're trying to make sure any given hand knows what it's got.

SSE Stream ID

Pushed responses are always associated with an explicit request from the client. -8.2.1 Push Requests.

While I like that the SSE proposal isn't tied to only seeing Fetch requests, I do think whatever API is created for supporting PUSH_PROMISE ought to allow users making Fetch requests to know that PUSH_PROMISEs arriving are in reply to their Fetch requests. This, I believe, is currently missing in the SSE proposal. Two approaches I've considered for fixing this:

  1. Create a fetch attribute to the ResourcePushEvent, having the Fetch Response object.
  2. Create a numerical streamId attribute on the ResourcePushEvent and the Fetch Response.

I'm a little concerned that the first approach, a fetch attribute, could extend the lifetime of the Fetch Response object. And its limited world view is fetch-only, which I hope ResourcePushEvent can be better than.

Cancellation

If we're going to see some kind of ResourcePushEvent, let the browser be capable of cancelling these events. Sending a RST_STREAM in response to a PUSH_PROMISE is an important tool for clients to prevent re-sending of data they already have. It ought be exposed here.


@KenjiBaheux here's one trivial use-case: as a web application, when a JSON resource of type http://schema.org/Event is PUSH_PROMISEd to me, I want to be able to detect that PUSH so that I can ask for the resource, then animate the new Event into view on a calendar I have been rendering on the user's screen.

@martinthomson

I like @jakearchibald's model here. Another addition to SSE would further entrench the use of what is a pretty bad mechanism. Fetch is infinitely more appropriate a place to house this sort of addition. EventSource has some odd requirements for the request it makes and few options for modifying them.

The bigger question for me is whether you consider it interesting to surface information about multi-origin pushes and how that might be done.

A server that is authoritative for two origins can take a request for one origin and push for the other. Jake's design might work for the first origin (exposing a crossoriginpush event, say), but the second origin doesn't get to learn about the push fetch.

You could maybe address that with a SW-global push event, but that means that (a) the second origin doesn't get to see the context, and (b) you have two different SWs competing for the same fetch. An accessor for the original request would address (a), and running the SW events in series would address (b).

I'm not sure if that is worth adding to the proposal immediately though; deferring the decision and iterating on the simpler single-origin design would probably be OK provided that there is a general agreement that this won't prevent something from being done later.

@rektide

rektide commented Nov 7, 2017

@martinthomson I agree generally that "Another addition to SSE would further entrench the use of what is a pretty bad mechanism." At the same time, the SSE proposal on the table is IMO a significantly stronger, more useful recommendation than anything else I've seen. Its ability to act broadly, cross-resourcefully, at scopes, is incredibly powerful & matches the mindspace I have as an application developer.

Like, say, Bayeux, the SSE extension features the ability to subscribe to scopes. We see similar patterns in Etcd's wait-for-change API with its ?recursive=true option. PUSH is different than this long-poll API form, but this resourceful view of pushed, subscribed content is a powerful one, and one SSE is far better mated to than the one-by-one Fetch proposal on the table.

The Fetch proposal on the table is by far the cleanest, most direct API I can see for implementing martin's WebPush Protocol. In this use case, a resource (or group of resources) is explicitly requested (Fetch'ed) with the expectation that the request will be held open indefinitely and PUSHes will arrive in reply to it. In this case, the Fetch proposal works great. "Fetch is infinitely more appropriate a place to house this sort of addition" makes good semantic sense for cracking this use case & is very elegant.

But I find the ramifications deeply deeply unsettling and very limiting, such as for basic cases like trying to detect & render linked-data-island resources pushed in response to a page (like index.html). If we want to start extending the DOM such that we can see replies pushed in response to documents and images and everything else, we can escape the pit-trap of writing something specific to a very limited piece of the platform (Fetch), but there's got to be more than specific-Fetch grade awareness here.

A long, long time ago, in an issue tracker far far away, there was Define a Fetch registry, which was merged and resolved. Today, in a different place, we still have a CustomElementRegistry. To look at this problem differently (not tied to Fetch or SSE) the idea of having some registry where content is registered (very close to the original Fetch registry purpose), but also extending EventTarget and dispatching ResourcePushEvents, feels to me like a more readily adaptable plan, using the best event-sourcing capabilities of the platform. With streamIds, it can readily implement WebPush protocol, while also allowing for a more general implementation of the "content was pushed I want to detect it and render it". Leaving this last need unfulfilled invites a tactic I have already desultorily fallen back to, polling the server for a list of resources that have been pushed at me (along with the cost of collecting and storing that list of pushed resources in the serving-farm). This is... radically unideal, and a limitation readily circumventable with painful imo unnecessary state tracking & coordination on the server's part. I don't want to see the web-platform create that kind of burden; it's not sympathetic to a use case people will make happen one way or another.

@LPardue

LPardue commented Nov 28, 2017

I revisit this proposal with great interest; I gain more insight into the many facets each time.

I've spent some time recently looking at Server Push in HTTP/QUIC, it currently does away with notions of Stream ID and favours a Push ID. To that end, I'd encourage some care in assessing proposals such that any solution would naturally migrate to a future version of HTTP.

@jhampton

@jakearchibald I found this thread and your article about browser support/implementation for HTTP/2 in the nick of time. I'm working on a project to implement a general-purpose fetch/XHR client that supports HTTP/2 push. I'm wondering if the needle has moved toward a standard since this thread began. I'll ping you on Twitter as well, and thank you very much for sharing your (hard-won) knowledge and thoughts with the community at large.
