As I see it, there are a few options, each with its own tradeoffs.
Option 1: Run the expensive query/calculation synchronously whenever you write data that affects it, and store the result in a way that's optimized for fast reads (i.e. in a cache somewhere, or just a JSON blob in the database). There's a sketch of this write path after the cons below.
PROS:
- Reads are fast, and the data stays up to date as long as you recompute it in every write operation that might affect the outcome.
CONS:
- This moves the latency to your app's writes, which, if we're talking about 40+ seconds, might not be ideal. Also, if the cached data itself is seldom read, you could end up hurting your write performance for little overall gain.
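For illustration, here's a minimal sketch of that write path in Python, assuming a Redis-style cache; save_article() and compute_expensive_metric() are hypothetical stand-ins for your own write path and slow query:

    import json
    import redis  # assuming Redis; any fast key/value store works

    cache = redis.Redis()

    def save_article(article, db):
        # 1) Do the normal write.
        db.save(article)

        # 2) Recompute the expensive metric in the same request and store
        #    the result somewhere cheap to read. This is the 40+ second
        #    part, so the write now pays that cost.
        metric = compute_expensive_metric()  # hypothetical slow query
        cache.set("dashboard_metric", json.dumps(metric))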
Option 2: When a request for this metric comes in, do the following:
* 1). Compute a cache key based on whatever's relevant. It could be the concatenation of the latest update timestamps of the tables that feed the metric, e.g.:
    -- cache key
    -- Pseudo-code: adapt table/column names to your schema.
    SELECT CONCAT(
      (SELECT MAX(updated_at) FROM users),
      (SELECT MAX(updated_at) FROM articles)
    ) AS cache_key;
* 2). Check the cache for that key. The cache could be a database table, Redis, Memcache, or even a per-process global lookup.
* 2.a). if the cache has data for the key, serve it back to the client.
* 2.b). if the cache does NOT have data for the key:
* 2.b.1). Spin up a thread to run the expensive calculation. When it's done, it will populate the cache with the key and data.
* 2.b.2). While the thread above runs:
* 2.b.2.1). Does the cache have a previous entry (i.e. stale data)? Serve it back to the client.
* 2.b.2.2). Is the cache empty? Serve a default, zeroed-out metric to the client, or an error response.
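To make the flow concrete, here's a rough sketch in Python, assuming a Redis-style cache; current_cache_key() and compute_expensive_metric() are hypothetical stand-ins for the SQL above and your slow calculation:

    import json
    import threading
    import redis

    cache = redis.Redis()

    def compute_and_store(key):
        # The slow part (40+ seconds) runs off the request path.
        data = compute_expensive_metric()
        cache.set(key, json.dumps(data))
        cache.set("metric:latest_key", key)  # remember the freshest entry

    def get_metric():
        key = "metric:" + current_cache_key()  # e.g. from the SQL above
        fresh = cache.get(key)
        if fresh is not None:
            return json.loads(fresh)  # 2.a) fresh hit

        # 2.b) miss: kick off the computation in the background...
        threading.Thread(target=compute_and_store, args=(key,), daemon=True).start()

        # ...and serve whatever we have right now.
        stale_key = cache.get("metric:latest_key")
        stale = cache.get(stale_key) if stale_key else None
        if stale is not None:
            return json.loads(stale)  # 2.b.2.1) stale data
        return {"value": 0, "stale": True}  # 2.b.2.2) zeroed-out default

Note this sketch skips the lock described in the caveats below, so two simultaneous misses could both start a thread.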
There are some important details about this one; I'll come back to them in a second.
PROS:
- Responses are always fast, whether the app serves fresh data from the cache, stale data from the cache, or a default zeroed-out initial value for the metric.
- While a response might serve slightly stale data, it also kicks off the computation in the background, so subsequent requests get fresh data.
- This works if it's OK for some requests to get stale data from time to time, mostly right after the source data changes.
CONS:
- There are some technical caveats, to follow:
- If your cache is just an in-memory data structure in each process, each of your web worker processes needs to launch its own thread to populate its local cache. If you have multiple workers behind a load balancer, clients might get responses at varying degrees of staleness depending on which worker they happen to hit, until they're all eventually up to date.
- If your cache is off-process (i.e. a database, Redis, etc.), you'll want each worker process to check and set a lock before launching the background thread, so that requests hitting other workers can check the lock, see that one worker already has a thread busy computing the new data, and skip launching their own. The lock can just be a value in a database table, or Redis' SETNX, for example (see the sketch after this list).
- These short-lived background threads use a database connection concurrently with the request that launched them, so you have to factor that in when sizing your database connection pool.
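For the locking bit, a minimal sketch with Redis (NX plus an expiry so a dead worker can't hold the lock forever; compute_and_store() is the same hypothetical helper as in the earlier sketch):

    import threading
    import redis

    cache = redis.Redis()

    def try_acquire_lock(key, ttl_seconds=120):
        # SET key value NX EX ttl: only the first caller gets True;
        # the expiry guards against a worker dying mid-computation.
        return cache.set("lock:" + key, "1", nx=True, ex=ttl_seconds)

    def maybe_start_background_refresh(key):
        if try_acquire_lock(key):
            threading.Thread(target=compute_and_store, args=(key,), daemon=True).start()
        # else: another worker is already computing it; just serve stale/default data.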
Option 3: This is a variation of the above, but if you already have Sidekiq or similar (not sure what stack you're on), it might be easier to compute and populate the cache in a background job instead of a thread. You would still need to check/set a lock before enqueueing the job, to make sure you don't have multiple jobs computing the same data unnecessarily. The points above about serving stale cache entries in the meantime, or a default empty metric, still apply, since those are what allow you to always serve something immediately to the client.
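Sketched with a generic job queue (enqueue_job() is a made-up stand-in for whatever Sidekiq/Celery/etc. gives you, and the lock/cache helpers are from the sketches above):

    def refresh_metric_if_needed(key):
        # Only enqueue if nobody else is already computing this key.
        if try_acquire_lock(key):
            enqueue_job("compute_metric", key)  # hypothetical enqueue call

    def compute_metric_job(key):
        # Runs in a worker process instead of a request thread.
        data = compute_expensive_metric()
        cache.set(key, json.dumps(data))
        cache.set("metric:latest_key", key)
        cache.delete("lock:" + key)  # release the lock once the cache is fresh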
I recently did something similar using the in-memory cache + thread version above, and it's working well, with the caveat that, after source data changes, clients might get inconsistent responses for a minute or so (but they always get something fast).