Let's examine how account and container caching works for a simple
middleware like container_quotas.

Abbreviations used:
s.c.m   --> swift.common.middleware
s.p.c.a --> swift.proxy.controllers.account
s.p.c.b --> swift.proxy.controllers.base
s.p.c.c --> swift.proxy.controllers.container
s.p.c.o --> swift.proxy.controllers.obj

Okay, so we've got an object PUT request that comes in.

So, s.c.m.container_quotas.ContainerQuotaMiddleware.__call__() calls
s.p.c.b.get_container_info(), then pulls the quota settings out of the
return value and enforces them. We don't care about the enforcement;
let's look into get_container_info().

s.p.c.b.get_container_info() checks for container info in a few
places. If it's in the WSGI environment, then that's what is returned.
Otherwise, if it's in memcache, then it's placed into the WSGI
environment and returned. If it's not in memcache, then a container
HEAD request is made, the container info is generated from the HEAD
response, and the result is stuck into the WSGI environment and
returned.
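
Here's a rough sketch of that lookup order. It's a toy model, not
Swift's code: lookup_container_info(), the head_container callback, and
the "container/a/c" key are all made up for illustration, and a plain
dict stands in for memcache.

```python
# Toy model only: these names are hypothetical, not Swift's real API.
def lookup_container_info(env, memcache, head_container, key):
    """Return container info, checking the cheapest source first."""
    # 1. Per-request cache in the WSGI environment.
    if key in env:
        return env[key]
    # 2. Shared cache. A hit gets copied into the environment below.
    info = memcache.get(key)
    if info is None:
        # 3. Last resort: ask the container servers. Note that this
        #    function itself never writes to memcache.
        info = head_container()
    env[key] = info
    return info


env = {}
memcache = {"container/a/c": {"quota_bytes": 100}}
info = lookup_container_info(
    env, memcache, lambda: {"quota_bytes": 0}, "container/a/c")
```

In this run the memcache entry wins, the HEAD callback is never
invoked, and the result ends up in env for later callers.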

Interestingly, get_container_info() does not ever put anything into
memcache. It relies on the container HEAD request to populate the
cache. On the one hand, it's an odd asymmetry, but on the other hand,
it does mean that there's no chance of overwriting
ever-so-slightly-fresher cached data with an ever-so-slightly-stale
response.

(To see what I mean: imagine that get_container_info() issues a
container HEAD request. Shortly thereafter, someone issues a POST to
that same container. Then the POST completes, then the HEAD completes.
If get_container_info() stored things into memcache, it could
overwrite the new stuff that got stored on the POST request, thereby
taking the cache from fresh to stale. Not good.)
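
That interleaving can be replayed by hand. This is only an
illustration under made-up assumptions: plain dicts stand in for the
container servers and for memcache, and the "quota" field and its
values are invented.

```python
# Hypothetical timeline; plain dicts stand in for the backend and memcache.
memcache = {}
backend = {"quota": 10}           # what the container servers hold

head_snapshot = dict(backend)     # the HEAD reads the old state...

backend["quota"] = 20             # ...then a POST changes the container
memcache["c"] = dict(backend)     # and the fresh state lands in the cache.

# The HEAD finishes last. If its caller also wrote to memcache, the
# older snapshot would clobber the fresher entry:
cache_if_caller_writes = dict(memcache)
cache_if_caller_writes["c"] = head_snapshot

stale = cache_if_caller_writes["c"]["quota"]   # 10
fresh = memcache["c"]["quota"]                 # 20
```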

Now let's drill down a little more into that container HEAD request.
After some routing and whatnot, we wind up in
s.p.c.c.ContainerController.GETorHEAD(). This method makes requests to
the container servers, then stashes the result in memcache. Note that
it never *reads* from memcache; it only writes to it. There's even a
comment there talking about ratelimiting. Yikes.

To recap: if get_container_info() suffers two cache misses (WSGI
environment and memcache), then it relies on the container HEAD to
populate memcache, and populates only the WSGI environment itself.

Okay, so now we're done, right? Wrong! We've passed the
container_quotas middleware without error, so now our original object
PUT request is on its way down to the proxy.

Running total:
* 1 memcache get
* 1 container-server HEAD
* 1 memcache set

The container info is now in the WSGI environment and in memcache.

Got it? Okay, let's keep going. Remember, we're done with middleware
now, and we're on to the proxy. (I'll look at how multiple middlewares
interact with the cache and each other at a future point. Summary:
it's complicated.)

So, this object PUT request makes its way to
s.p.c.o.ObjectController.PUT(). The first thing this method does is
call s.p.c.b.Controller.container_info() (for ACLs, versioning, et
cetera; it's got good reasons). That method checks memcache for
container info, and returns it if found. Note that it *does not* check
the WSGI environment for container info, so the data that
s.p.c.b.get_container_info() stuffed into the environment earlier was
for nothing.

Let's say that our cache was big enough and this request moved fast
enough that we got a memcache hit for the container info.

Running total:
* 2 memcache gets
* 1 container-server HEAD
* 1 memcache set
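
To double-check that tally, here's a tiny simulation of the two code
paths. Everything in it is a stand-in written for this note (mc_get,
mc_set, container_head, and the two lookup functions only mimic the
behavior described above; none of it is Swift's code).

```python
from collections import Counter

tally = Counter()
memcache, env = {}, {}


def mc_get(key):
    tally["memcache get"] += 1
    return memcache.get(key)


def mc_set(key, value):
    tally["memcache set"] += 1
    memcache[key] = value


def container_head():
    # Stand-in for the container HEAD path: hit the backend, then
    # write (never read) memcache.
    tally["container-server HEAD"] += 1
    info = {"status": 204}
    mc_set("container", info)
    return info


def get_container_info():
    # Middleware path: environment, then memcache, then HEAD.
    if "container" not in env:
        env["container"] = mc_get("container") or container_head()
    return env["container"]


def container_info():
    # Proxy path: consults memcache only and ignores the environment.
    return mc_get("container") or container_head()


get_container_info()   # container_quotas
container_info()       # ObjectController.PUT()
print(dict(tally))
```

The counts come out to 2 memcache gets, 1 container-server HEAD, and
1 memcache set, same as the running total.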

Now we're down to just the rest of the PUT method, which is 300
lines of code (ugh) that doesn't seem to do any more container or
account info fetching.

Scenario 2: Small / Missing Cache
=================================

Well, the basic scenario doesn't look too bad. Let's see how this
plays out with a small cache (so we have misses) or just no memcache
at all.

The container_quotas middleware goes as before, which brings us to:

Running total:
* 1 memcache get
* 1 container-server HEAD
* 1 memcache set

We get back to s.p.c.b.Controller.container_info(), and this time we
get a cache miss instead of a hit.

Now something interesting happens: instead of just doing a
container-server HEAD request, s.p.c.b.Controller.container_info()
calls s.p.c.b.Controller.account_info() for some reason. This checks
memcache for the account info, and let's say it misses. Now
account_info() goes and makes a HEAD request to the account servers,
then stashes the result in memcache.

Now, at the end of account_info(), we have:

Running total:
* 3 memcache gets
* 1 account-server HEAD
* 1 container-server HEAD
* 2 memcache sets

Right? Okay, back to s.p.c.b.Controller.container_info(). Now, having
verified that the account exists (I guess), it makes a
container-server HEAD request, then stashes the result in memcache
before returning.

Final total:
* 3 memcache gets
* 1 account-server HEAD
* 2 container-server HEADs
* 3 memcache sets
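
The same kind of toy simulation reproduces this total. The assumption
here is that every memcache get misses and every set is attempted but
retained by nothing; all function names are stand-ins for the behavior
described above, not Swift's code.

```python
from collections import Counter

tally = Counter()


def mc_get(key):
    tally["memcache get"] += 1
    return None                  # assumption: every lookup misses


def mc_set(key, value):
    tally["memcache set"] += 1   # attempted, but nothing is retained


def head(kind):
    # Backend HEAD followed by the usual memcache write.
    tally[kind + "-server HEAD"] += 1
    info = {"status": 204}
    mc_set(kind, info)
    return info


def get_container_info(env):
    # Middleware path: environment, then memcache, then HEAD.
    if "container" not in env:
        env["container"] = mc_get("container") or head("container")
    return env["container"]


def account_info():
    return mc_get("account") or head("account")


def container_info():
    # Proxy path: memcache only; on a miss, check the account first.
    info = mc_get("container")
    if info is None and account_info():
        info = head("container")
    return info


get_container_info({})   # container_quotas
container_info()         # ObjectController.PUT()
print(dict(tally))
```

That gives 3 memcache gets, 1 account-server HEAD, 2 container-server
HEADs, and 3 memcache sets.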

New code + requests:
====================

Run through Scenario 1 again here: container_quotas middleware

Assume empty cache to start.

First, container_quotas calls get_container_info() as before, which
calls get_info(). This calls _get_info_cache(), which looks in the
WSGI environment and then in memcache for stuff. Since we're in
cache-miss land, we try memcache, but find nothing.

Now, the get_info() call for the container goes and recursively calls
itself for the account, resulting in another memcache miss. Post-miss,
get_info() makes a HEAD request into the application for the account.
The account HEAD handler ends up calling
s.p.c.b.Controller.GETorHEAD_base(), which stores the account info in
memcache. get_info() then takes the account info, sticks it in the
WSGI environment, and returns it.

Okay, recursive call over; we're now back in get_info() for the
container. Since get_info() for the account returned a truthy value,
we then continue on with a container HEAD request. Like its sibling in
AccountController, this guy calls up to Controller.GETorHEAD_base(),
which stores the result in memcache. Back up to get_info(), which
stashes things in the WSGI environment and returns.

Running total:
* 2 memcache gets
* 1 account-server HEAD
* 1 container-server HEAD
* 2 memcache sets

Moving on from container_quotas, we hit the proxy server in
s.p.c.o.ObjectController.PUT(). This method calls
s.p.c.b.Controller.container_info().

Now we're back to get_info() for the container again. Fortunately,
this time the result is cached in the WSGI environment, so no more
memcache traffic is necessary.

Final total:
* 2 memcache gets
* 1 account-server HEAD
* 1 container-server HEAD
* 2 memcache sets (account info and container info)

Old final total:
* 2 memcache gets
* 1 container-server HEAD
* 1 memcache set (container info)
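
A sketch of the new flow, for checking the new final total. This
get_info() is a simplified stand-in that only mimics the behavior
described above (environment, then memcache, then account check, then
HEAD); its signature and the head() helper are assumptions made for
this example.

```python
from collections import Counter

tally = Counter()
memcache = {}


def head(kind):
    # Stand-in for the HEAD path: backend request, then memcache write.
    tally[kind + "-server HEAD"] += 1
    tally["memcache set"] += 1
    memcache[kind] = {"status": 204}
    return memcache[kind]


def get_info(env, kind):
    if kind in env:                      # WSGI environment first
        return env[kind]
    tally["memcache get"] += 1
    info = memcache.get(kind)
    if info is None:
        # A container lookup first resolves its account, recursively.
        if kind == "container" and not get_info(env, "account"):
            return None
        info = head(kind)
    env[kind] = info
    return info


env = {}
get_info(env, "container")   # container_quotas
get_info(env, "container")   # ObjectController.PUT() -> container_info()
print(dict(tally))
```

The second call is answered straight from env, so the totals stay at
2 memcache gets, 1 account-server HEAD, 1 container-server HEAD, and
2 memcache sets.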

Run through Scenario 2 again here (where memcache is broken)

Due to the caching in the WSGI environment, we get the same final
total, minus the memcache.

Final total:
* 1 account-server HEAD
* 1 container-server HEAD

Old final total:
* 3 memcache gets
* 1 account-server HEAD
* 2 container-server HEADs
* 3 memcache sets

However, if get_info() for a container didn't call itself for the
account, then we'd get:

Better final total:
* 1 container-server HEAD
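
The difference between those two totals can be sketched with a flag.
This is a hypothetical model with memcache left out entirely, so only
the WSGI environment caches anything; run() and its check_account
parameter are invented for the comparison and do not exist in Swift.

```python
from collections import Counter


def run(check_account):
    """Count backend HEADs for one object PUT with no memcache at all."""
    tally = Counter()
    env = {}

    def get_info(kind):
        if kind in env:                  # per-request cache only
            return env[kind]
        if kind == "container" and check_account:
            get_info("account")          # the recursive account lookup
        tally[kind + "-server HEAD"] += 1
        env[kind] = {"status": 204}
        return env[kind]

    get_info("container")   # middleware
    get_info("container")   # proxy
    return dict(tally)


with_account = run(check_account=True)
without_account = run(check_account=False)
```

with_account holds one account-server HEAD and one container-server
HEAD; without_account holds just the single container-server HEAD.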