@cben
Created September 10, 2015 20:46
The problem with CloudFlare's DNS "CNAME flattening" TTL

[attaching this to a CloudFlare survey why I left them (for DNSimple)]

I was a free user using CloudFlare only for DNS, chiefly because it can simulate a CNAME at an apex domain. The apex domains mathdown.net and mathdown.com point to mathdown-cben.rhcloud.com. CloudFlare's "CNAME flattening" nicely returns an A record; unfortunately it's served with a huge TTL of 7 days(!), which causes a long outage when the underlying IP changes.

I asked support how I can lower the TTL (BTW it's great that you provide free support at all) and was told [https://support.cloudflare.com/hc/en-us/requests/522551, emphasis mine]:

This is based on the TTL of your authoritative provider for mathdown-cben.rhcloud.com:

$ dig mathdown-cben.rhcloud.com
...
;; ANSWER SECTION:
mathdown-cben.rhcloud.com. 60 IN CNAME ex-std-node638.prod.rhcloud.com.
ex-std-node638.prod.rhcloud.com. 150 IN CNAME ec2-52-7-66-21.compute-1.amazonaws.com.
ec2-52-7-66-21.compute-1.amazonaws.com. 604800 IN A 52.7.66.21
...

You would need to have them alter the TTL of the resulting A record for you

So CloudFlare took the maximum TTL of the CNAME chain. I believe this is both technically wrong and impractical — you should take the minimum TTL. See below.
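
To make the numbers concrete, here is a small Python sketch of the arithmetic on the three records from the support reply. The `parse_answer` helper is hypothetical (not part of any tool), and it assumes the usual `name TTL class type value` layout of a dig answer line; the last hostname is written as `ec2-...` so it matches the CNAME target above it.

```python
# The ANSWER SECTION quoted in the support reply.
ANSWER = """\
mathdown-cben.rhcloud.com. 60 IN CNAME ex-std-node638.prod.rhcloud.com.
ex-std-node638.prod.rhcloud.com. 150 IN CNAME ec2-52-7-66-21.compute-1.amazonaws.com.
ec2-52-7-66-21.compute-1.amazonaws.com. 604800 IN A 52.7.66.21
"""


def parse_answer(text):
    """Return (name, ttl, rtype, value) tuples from dig-style answer lines."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:  # skip anything that is not a simple record line
            continue
        name, ttl, _cls, rtype, value = fields
        records.append((name, int(ttl), rtype, value))
    return records


records = parse_answer(ANSWER)
ttls = [ttl for _name, ttl, _rtype, _value in records]

served = max(ttls)      # what the flattened A record was served with
suggested = min(ttls)   # what this post argues for

print("served TTL:", served, "s =", served // 86400, "days")
print("suggested TTL:", suggested, "s")
```

The gap is four orders of magnitude: 604800 seconds versus 60.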

But the more immediate issue is bad UX: I had no idea this record would be served with a TTL of 7 days. On the contrary, CloudFlare's UI let me choose the TTL on this record (as on any other). IIRC I changed it from "auto" to 10min or something of that order. It didn't occur to me that what I chose might not be what's actually served...

Why is telling me that I'd need the original provider to change their TTL(s) impractical? The 7-day TTL comes from ec2-52-7-66-21.compute-1.amazonaws.com. 604800 IN A 52.7.66.21. As is clear from the name, Amazon intends ec2-52-7-66-21 to have IP 52.7.66.21 "forever", so they set a TTL of a week. Even if I were their direct customer (I'm not, I'm a customer of RHcloud), they wouldn't change this for me.

What would be most practical is what DNSimple and DnsMadeEasy do on their ANAME/ALIAS records: just let me set the TTL of the resulting A record.

Why is taking the maximum technically wrong? Compare to a non-flattened chain of CNAMEs:

  1. E.g. the non-apex www.mathdown.com is a CNAME for mathdown-cben.rhcloud.com. with TTL of 1min;
  2. mathdown-cben.rhcloud.com. is a CNAME for ex-std-node638.prod.rhcloud.com. with TTL of 1min;
  3. that's a CNAME for some Amazon machine with TTL of 2.5min;
  4. the Amazon machine has an A record with huge TTL.

The parts that change a lot are (2) and (3). AFAIK a client caches each DNS record separately, up to its own TTL. The Amazon record (4) can take an eternity to time out, but that doesn't matter: within minutes of (2) or (3) changing, the client will re-fetch them and a different Amazon record will come into play. Thus the effective TTL of a non-flattened chain is the minimum of the chain's TTLs. If you care about flattening precisely simulating the chain, you should use the minimum.
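
The caching argument can be checked with a toy model. This is only a sketch under stated assumptions: `resolve` is a hypothetical, simplified stand-in for a caching resolver (each record is cached until its own TTL expires, nothing else), the replacement machine `ec2-203-0-113-9...` and its documentation-range IP are made up for illustration, and the flattened zone models the apex A record as it was served, with the week-long TTL.

```python
def resolve(name, zone, cache, now):
    """Follow CNAMEs from `name`, caching each record until its own TTL expires."""
    while True:
        entry = cache.get(name)
        if entry is None or entry[2] <= now:   # missing or expired: re-fetch
            ttl, rtype, value = zone[name]
            entry = (rtype, value, now + ttl)
            cache[name] = entry
        rtype, value, _expires = entry
        if rtype == "A":
            return value
        name = value                            # chase the CNAME target


OLD = "ec2-52-7-66-21.compute-1.amazonaws.com."
NEW = "ec2-203-0-113-9.compute-1.amazonaws.com."  # hypothetical new machine

# Non-flattened chain, steps (1) to (4), TTLs in seconds.
chain = {
    "www.mathdown.com.": (60, "CNAME", "mathdown-cben.rhcloud.com."),
    "mathdown-cben.rhcloud.com.": (60, "CNAME", "ex-std-node638.prod.rhcloud.com."),
    "ex-std-node638.prod.rhcloud.com.": (150, "CNAME", OLD),
    OLD: (604800, "A", "52.7.66.21"),
    NEW: (604800, "A", "203.0.113.9"),
}
# Flattened apex record, served with the final A record's TTL.
flat = {"mathdown.com.": (604800, "A", "52.7.66.21")}

chain_cache, flat_cache = {}, {}
before_chain = resolve("www.mathdown.com.", chain, chain_cache, now=0)
before_flat = resolve("mathdown.com.", flat, flat_cache, now=0)

# Shortly after t=0 the app moves: step (3) now points at the new machine.
chain["ex-std-node638.prod.rhcloud.com."] = (150, "CNAME", NEW)
flat["mathdown.com."] = (604800, "A", "203.0.113.9")

after_chain = resolve("www.mathdown.com.", chain, chain_cache, now=150)
after_flat = resolve("mathdown.com.", flat, flat_cache, now=150)

print("chain:", before_chain, "->", after_chain)
print("flat: ", before_flat, "->", after_flat)
```

In this model the chain client picks up the new address as soon as the changed CNAME's 150-second TTL runs out, while the client holding the flattened record keeps the old address until the 604800-second TTL expires. The week-long TTL on the old Amazon A record never matters for the chain, because the client simply stops asking for that name.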

In any case, make sure your UI reflects the TTL that will actually be served.

@henrik

henrik commented Nov 8, 2016

According to a Cloudflare employee in this comment, you can customise the TTL.

Also, if you select the "orange cloud" (more details in the linked comment), Cloudflare will point the record to their own IPs and proxy to your site, which should mean instant IP updates.

@cben
Author

cben commented Nov 8, 2016

UPDATE: Pavel from CloudFlare mailed me saying they fixed this 🎉. I haven't yet re-tested it (I'm no longer actively using CF).

@jukeee

jukeee commented Nov 9, 2016

Hey, I am a free user of Cloudflare and will leave them soon.
In my experience, CNAME flattening from Cloudflare was working without any problem, then one day images were no longer retrieved (http://www.mtrlst.com/wp-content/uploads/2016/10/Look-15.jpg vs http://main-mtrlst.rhcloud.com/wp-content/uploads/2016/10/Look-15.jpg). It's kind of strange for it to stop working from one day to the next.
