Since I couldn't really use the network at my hotel, I spent some time poking around to see just how horribly broken it was.
$ grep ^nameserver /etc/resolv.conf
nameserver 172.16.2.5
nameserver 172.18.82.11
nameserver 4.2.2.2
Seems reasonable enough. Let's test them!
mawagner ~ $ dig google.com @172.16.2.5
; <<>> DiG 9.8.3-P1 <<>> google.com @172.16.2.5
;; global options: +cmd
;; connection timed out; no servers could be reached
Oh, okay... Your primary nameserver is down? Maybe that's why things are so slow. But at least there's a backup!
mawagner ~ $ dig google.com @172.18.82.11
; <<>> DiG 9.8.3-P1 <<>> google.com @172.18.82.11
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14184
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 4, ADDITIONAL: 4
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 50 IN A 74.125.224.99
google.com. 50 IN A 74.125.224.98
google.com. 50 IN A 74.125.224.104
google.com. 50 IN A 74.125.224.100
google.com. 50 IN A 74.125.224.96
google.com. 50 IN A 74.125.224.102
google.com. 50 IN A 74.125.224.103
google.com. 50 IN A 74.125.224.105
google.com. 50 IN A 74.125.224.101
google.com. 50 IN A 74.125.224.97
google.com. 50 IN A 74.125.224.110
;; AUTHORITY SECTION:
google.com. 46614 IN NS ns1.google.com.
google.com. 46614 IN NS ns4.google.com.
google.com. 46614 IN NS ns3.google.com.
google.com. 46614 IN NS ns2.google.com.
;; ADDITIONAL SECTION:
ns3.google.com. 220294 IN A 216.239.36.10
ns2.google.com. 219426 IN A 216.239.34.10
ns1.google.com. 234245 IN A 216.239.32.10
ns4.google.com. 219777 IN A 216.239.38.10
;; Query time: 133 msec
;; SERVER: 172.18.82.11#53(172.18.82.11)
;; WHEN: Tue Feb 26 21:57:10 2013
;; MSG SIZE rcvd: 340
As much as I think that 133ms is pretty poor for an in-cache response from a local nameserver, 133ms is blazingly fast here.
So, here is where I tried to show you that the public DNS server they have as a tertiary NS is so much faster:
mawagner ~ $ dig google.com @4.2.2.2
; <<>> DiG 9.8.3-P1 <<>> google.com @4.2.2.2
;; global options: +cmd
;; connection timed out; no servers could be reached
But whoops, they appear to block outbound DNS queries at the firewall.
Except, not TCP queries. So, our moment of zen:
mawagner ~ $ dig google.com @4.2.2.2 +tcp
; <<>> DiG 9.8.3-P1 <<>> google.com @4.2.2.2 +tcp
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58987
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 213 IN A 173.194.38.133
google.com. 213 IN A 173.194.38.137
google.com. 213 IN A 173.194.38.142
google.com. 213 IN A 173.194.38.132
google.com. 213 IN A 173.194.38.128
google.com. 213 IN A 173.194.38.136
google.com. 213 IN A 173.194.38.131
google.com. 213 IN A 173.194.38.134
google.com. 213 IN A 173.194.38.129
google.com. 213 IN A 173.194.38.130
google.com. 213 IN A 173.194.38.135
;; Query time: 17 msec
;; SERVER: 4.2.2.2#53(4.2.2.2)
;; WHEN: Tue Feb 26 22:01:56 2013
;; MSG SIZE rcvd: 204
17ms to get an answer from a remote nameserver; 133ms from the local cache.
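If you want to repeat the UDP-versus-TCP comparison without eyeballing full dig output each time, here is a rough sketch. The helper names query_time and probe_transport are made up for this post; the +tcp, +notcp, +time and +tries options are regular dig options. It assumes dig prints a ";; Query time: N msec" line on success, as it does in the output above.

```shell
# Hypothetical helper: read dig output on stdin and print the query time in msec.
# Prints "timeout" if no ";; Query time:" line is present (e.g. no server reached).
query_time() {
  awk '/^;; Query time:/ { print $4; found = 1 }
       END { if (!found) print "timeout" }'
}

# Hypothetical helper: query one server over UDP (+notcp) and then over TCP (+tcp).
# Usage: probe_transport SERVER [NAME]
probe_transport() {
  server=$1
  name=${2:-google.com}
  for flag in +notcp +tcp; do
    printf '%s %s: ' "$server" "$flag"
    # One attempt with a 2 second timeout, so a blocked transport fails quickly.
    dig "$name" "@$server" "$flag" +time=2 +tries=1 | query_time
  done
}
```

Running probe_transport 4.2.2.2 on a network like this one should report "timeout" for +notcp and a number for +tcp, assuming the firewall behaves as it did above.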
But, hmm... Let's run those queries again against the internal ones, to see if maybe I just got a bad result.
Here's their primary nameserver, which was down:
mawagner ~ $ dig google.com @172.16.2.5
; <<>> DiG 9.8.3-P1 <<>> google.com @172.16.2.5
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6986
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 4, ADDITIONAL: 4
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 259 IN A 74.125.224.104
google.com. 259 IN A 74.125.224.101
google.com. 259 IN A 74.125.224.98
google.com. 259 IN A 74.125.224.99
google.com. 259 IN A 74.125.224.102
google.com. 259 IN A 74.125.224.96
google.com. 259 IN A 74.125.224.100
google.com. 259 IN A 74.125.224.97
google.com. 259 IN A 74.125.224.105
google.com. 259 IN A 74.125.224.110
google.com. 259 IN A 74.125.224.103
;; AUTHORITY SECTION:
google.com. 46220 IN NS ns2.google.com.
google.com. 46220 IN NS ns1.google.com.
google.com. 46220 IN NS ns4.google.com.
google.com. 46220 IN NS ns3.google.com.
;; ADDITIONAL SECTION:
ns3.google.com. 219900 IN A 216.239.36.10
ns2.google.com. 219032 IN A 216.239.34.10
ns1.google.com. 233851 IN A 216.239.32.10
ns4.google.com. 219383 IN A 216.239.38.10
;; Query time: 368 msec
;; SERVER: 172.16.2.5#53(172.16.2.5)
;; WHEN: Tue Feb 26 22:03:44 2013
;; MSG SIZE rcvd: 340
It's back! 368ms is pretty lousy, but at least it's back.
And how about the secondary? Can it beat its previous record of 133ms, or will it be in the 350ms+ range like its sibling?
mawagner ~ $ dig google.com @172.18.82.11
; <<>> DiG 9.8.3-P1 <<>> google.com @172.18.82.11
;; global options: +cmd
;; connection timed out; no servers could be reached
Oh. It's like they take turns, with only one of them functioning at a time.
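Two samples per server isn't much evidence for the "taking turns" theory. A sketch of how one might poll every configured nameserver over several rounds follows. The function names list_nameservers and poll_nameservers are hypothetical, the 5 second pause is an arbitrary choice, and "up" here only means that dig printed ";; Got answer" within the timeout.

```shell
# Hypothetical helper: print the nameserver addresses from a resolv.conf-style file.
# Usage: list_nameservers [FILE]   (defaults to /etc/resolv.conf)
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: query each configured nameserver once per round.
# Usage: poll_nameservers ROUNDS [NAME]
poll_nameservers() {
  rounds=$1
  name=${2:-google.com}
  round=1
  while [ "$round" -le "$rounds" ]; do
    for server in $(list_nameservers); do
      # One attempt, 2 second timeout; "up" means dig reported an answer.
      if dig "$name" "@$server" +time=2 +tries=1 | grep -q '^;; Got answer'; then
        state=up
      else
        state=down
      fi
      printf 'round %s: %s %s\n' "$round" "$server" "$state"
    done
    round=$((round + 1))
    sleep 5   # arbitrary pause between rounds
  done
}
```

Something like poll_nameservers 10 would give ten rounds of up/down lines per server, which is enough to see whether the two internal servers really do alternate.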
So I wonder, since I'm getting such weird results, if maybe the gateway is off-premises or something?
mawagner ~ $ netstat -nr | head -n5
Routing tables
Internet:
Destination Gateway Flags Refs Use Netif Expire
default 10.77.60.1 UGSc 8 0 en0
So, let's try a traceroute there:
mawagner ~ $ traceroute 10.77.60.1
traceroute to 10.77.60.1 (10.77.60.1), 64 hops max, 52 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * 10.77.60.1 (10.77.60.1) 45.394 ms !X
6 * * *
7 * 10.77.60.1 (10.77.60.1) 7.348 ms !X *
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 10.77.60.1 (10.77.60.1) 10.740 ms !X * 14.764 ms !X
Oh, of course. It's five, seven, and 17 hops away.
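For context: traceroute sends probes with increasing TTLs, so a directly attached default gateway would normally answer on hop 1, and the !X annotation means "communication administratively prohibited". To pull the responding hop numbers out of output like the above, a small awk filter is enough. This is only a sketch; hops_for is a name invented here, and it assumes the usual layout where each hop line starts with the hop number.

```shell
# Hypothetical helper: read traceroute output on stdin and print each hop number
# on which the given address replied at least once.
# Usage: traceroute HOST | hops_for ADDRESS
hops_for() {
  awk -v target="$1" '
    $1 ~ /^[0-9]+$/ {              # only hop lines; skips the header line
      for (i = 2; i <= NF; i++)
        if ($i == target) { print $1; break }
    }'
}
```

Feeding it the traceroute output above with 10.77.60.1 as the address prints 5, 7 and 17, one per line.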
So, maybe my earlier speed test was just a bad server. Let's test again with a new server, to be fair!
Wait, seriously? It's worse?