dnsdist has a built-in cache that 1) is really fast and 2) saves a lot of packets from being sent.
The recursor supports EDNS Client Subnet (ECS), where it feeds part of the client IP address to authoritative servers to get better answers.
If the recursor is hidden behind dnsdist, dnsdist can be configured to pass on part of the client IP address to the recursor. So ECS then gets used twice: once to the recursor, once to the authoritative server.
When this setup is enabled, the dnsdist cache contains packets with the ECS option on the question. This means the cache hit rate goes down tremendously, as it will only deliver 'hits' within the same /24 (say).
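To illustrate why the hit rate drops, here is a minimal sketch of the truncation step, assuming an IPv4 source prefix length of 24. The helper name `ecs_prefix` is hypothetical and is not part of dnsdist.

```python
import ipaddress


def ecs_prefix(client_ip: str, prefix_len: int = 24) -> str:
    """Truncate a client address to the prefix that would be sent in ECS.

    Hypothetical helper for illustration; the /24 default is an assumption.
    """
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network)


# Clients in the same /24 share a prefix; a client elsewhere does not.
for client in ["192.0.2.10", "192.0.2.200", "198.51.100.7"]:
    print(client, "->", ecs_prefix(client))
```

Every distinct prefix ends up in the hashed query, so each /24 effectively gets its own slice of the cache.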
When ECS is enabled, most domains are still not ECS-variable. This means we have split up the cache into thousands of /24 shards for no good reason.
The recursor internally knows if an answer it sent back is ECS-variable or not. It uses this in its own cache. In fact, it even knows how variable an answer is ('not at all', or 'valid for 1.2.0.0/16').
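As a rough illustration of what "how variable" means, the sketch below models the scope as either `None` (not ECS-variable at all) or a network such as 1.2.0.0/16. The function `answer_applies` and this representation are assumptions made for the example, not the recursor's internal data structure.

```python
import ipaddress
from typing import Optional


def answer_applies(scope: Optional[str], client_ip: str) -> bool:
    """Return True if a cached answer with this scope may serve the client.

    scope=None models an answer that does not vary with ECS at all.
    """
    if scope is None:
        return True
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(scope)


print(answer_applies(None, "203.0.113.9"))
print(answer_applies("1.2.0.0/16", "1.2.200.1"))
print(answer_applies("1.2.0.0/16", "1.3.0.1"))
```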
As it stands, the recursor does not return that knowledge to dnsdist. In fact, it offers no ECS option at all on responses.
A small bit on how the dnsdist cache works. It is keyed on a 32-bit hash of the query. To find responses, we calculate the hash of the query (with the ID field set to 0), see if something is there, and check if the response actually matches the query. Crucially, this hash is calculated after adding the ECS option. When there is no hit and we send the query to the backend, now with the ECS option, we insert the answer in the cache based on that hash-with-ECS.
To get the dnsdist cache to be useful with ECS, the following needs to happen:
- Calculate the hash before adding ECS as well, check the cache for a hit, and if there is one, use it
- Calculate the hash after adding ECS, check the cache for a hit, and if there is one, use it
- If both were cache misses, send the query to the backend resolver with the ECS option
- If the response carries a /0 ECS scope, remove the ECS option and store the response in the cache using the pre-ECS hash
- If it carries a non-zero ECS scope, remove the ECS option too and store the response in the cache using the earlier calculated hash with ECS
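The steps above can be sketched as a two-key lookup. This is a simplified model under stated assumptions, not dnsdist code: tuples stand in for the pre-ECS and post-ECS hashes, TTL handling and the query/response match check are left out, and `EcsAwareCache`, `fake_backend` and the backend signature (returning an answer plus a scope prefix length) are hypothetical.

```python
import ipaddress
from typing import Callable, Dict, Tuple

# Hypothetical backend: (qname, qtype, subnet) -> (answer, ECS scope prefix length)
Backend = Callable[[str, str, str], Tuple[str, int]]


class EcsAwareCache:
    def __init__(self, backend: Backend, prefix_len: int = 24) -> None:
        self.backend = backend
        self.prefix_len = prefix_len
        self.store: Dict[tuple, str] = {}
        self.backend_calls = 0

    def resolve(self, qname: str, qtype: str, client_ip: str) -> str:
        subnet = str(
            ipaddress.ip_network(f"{client_ip}/{self.prefix_len}", strict=False)
        )
        plain_key = (qname, qtype)        # stand-in for the pre-ECS hash
        ecs_key = (qname, qtype, subnet)  # stand-in for the hash with ECS

        # Steps 1 and 2: try the pre-ECS key first, then the ECS key.
        if plain_key in self.store:
            return self.store[plain_key]
        if ecs_key in self.store:
            return self.store[ecs_key]

        # Step 3: both missed, so ask the backend with the ECS subnet.
        self.backend_calls += 1
        answer, scope = self.backend(qname, qtype, subnet)

        # Steps 4 and 5: a /0 scope is stored under the pre-ECS key,
        # any other scope under the ECS key. The answer is stored without ECS.
        self.store[plain_key if scope == 0 else ecs_key] = answer
        return answer


def fake_backend(qname: str, qtype: str, subnet: str) -> Tuple[str, int]:
    """Pretend only geo.example. is ECS-variable."""
    if qname == "geo.example.":
        return f"answer for {subnet}", 24
    return "same answer for everyone", 0


cache = EcsAwareCache(fake_backend)
cache.resolve("static.example.", "A", "192.0.2.10")
cache.resolve("static.example.", "A", "198.51.100.7")
print("backend calls after two static lookups:", cache.backend_calls)
```

In this model a non-variable name costs one backend query no matter how many /24s ask for it, while an ECS-variable name still costs one query per /24.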
Required changes:
- On the recursor: add an option that returns a scope if use-incoming-edns-subnet was set
- In dnsdist: store the pre-ECS hash and strip ECS from the responses dnsdist sends out