
One of the best things about joining a new company is that you get to go through a series of training sessions for that company. At Cloudflare, I get a technical bootcamp focused on Internet technologies. I learned Internet technologies a long time ago, so it’s a chance to catch up and immerse myself in the improvements that have happened along the way. My role has focused on developers recently (and it still does), but that doesn’t mean I can be oblivious to standard troubleshooting.

Tools for troubleshooting DNS

Take DNS, for example - subject #1 in the Cloudflare tech bootcamp. There is an old mantra when things go wrong on the Internet - “It’s always DNS”. There’s even a t-shirt. So, quite obviously, you will want to fix or rule out DNS quickly. Fortunately, there are tools for that. Unfortunately - some of them need DNS to work.

This post isn’t about learning DNS. There are much better sites than mine for that.

Looking up a name - nslookup, dig, delv

The most obvious thing you are going to need to do is look up something within DNS. Nearly every system - Windows, Mac, Linux - has nslookup installed, or at most one package install away. It lets you run a query against a specific DNS resolver. For instance, you might type the following:

$ nslookup -query=a -timeout=10 cloudflare.com 1.1.1.1
Server:		1.1.1.1
Address:	1.1.1.1#53

Non-authoritative answer:
Name:	cloudflare.com
Address: 104.16.132.229
Name:	cloudflare.com
Address: 104.16.133.229

It gives you the information without any fuss. If you want more information, however, you need a better tool. That tool is dig. If I do the same query using dig, I get more information:

$ dig cloudflare.com A @1.1.1.1

; <<>> DiG 9.10.6 <<>> cloudflare.com A @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29931
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;cloudflare.com.			IN	A

;; ANSWER SECTION:
cloudflare.com.		134	IN	A	104.16.133.229
cloudflare.com.		134	IN	A	104.16.132.229

;; Query time: 24 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Mon Apr 20 09:11:10 BST 2026
;; MSG SIZE  rcvd: 75

Same result, but now I can see some details of the DNS protocol used behind the scenes. DNS is hierarchical with multiple servers involved, so I may want to see why the resolution happened that way - I can add +trace to the command (although I would also add +nodnssec to avoid seeing too much).
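When I script around dig, the header is the first thing I check: status says whether the lookup succeeded, and the ad flag says whether the resolver reports the answer as DNSSEC-validated. Here is a minimal sketch that pulls both out of dig output on stdin. The function name dig_summary is my own invention, and it assumes the classic dig header layout shown above.

```shell
# dig_summary: read dig output on stdin and print the response status plus
# whether the "ad" (authenticated data) flag is present.
# Hypothetical helper name; assumes the header layout printed by dig above.
dig_summary() {
  awk '
    BEGIN { s = "unknown"; ad = "no" }
    /->>HEADER<<-/ {
      # The status value is the field that follows "status:"
      for (i = 1; i < NF; i++)
        if ($i == "status:") { s = $(i + 1); sub(/,$/, "", s) }
    }
    /^;; flags:/ {
      line = $0
      sub(/^;; flags: */, "", line)   # drop the label
      sub(/;.*/, "", line)            # keep only the flag list
      if ((" " line " ") ~ / ad /) ad = "yes"
    }
    END { printf "status=%s ad=%s\n", s, ad }
  '
}
```

Usage would be something like dig cloudflare.com A @1.1.1.1 | dig_summary. Anything other than status=NOERROR is worth a closer look at the full output.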

Talking of DNSSEC, sometimes you are going to need to check the chain of trust (the signatures from the root down to your zone) behind a DNS answer. dig doesn’t quite work for that. Older versions of dig had +sigchase for this, but that option is obsolete and has been removed in favor of delv. If you say delv name (where name is in a signed zone), delv will report “fully validated”, giving you confidence that things are working. delv is preferred because it validates in a way that is much closer to what really happens inside a validating DNS resolver.

If you are on a Mac, you can install both dig and delv using brew install bind. On Linux (and WSL on Windows), install them through your distribution’s package manager or from the packages that ISC publishes. There are also web sites that will do this for you. However, you should install the tools locally so you can check YOUR resolver.

Handling modern protocols - kdig and dog

While nslookup and dig are great for standard DNS (which uses TCP or UDP port 53), they aren’t designed for newer encrypted protocols like DNS-over-HTTPS (DoH) and DNS-over-TLS (DoT). DoT operates on port 853 and creates a TLS tunnel for DNS traffic, making it easier for network admins to identify (and potentially block) compared to DoH. kdig is available from the knot-dnsutils package (or knot package for brew) for handling this:

$ kdig +tls @1.1.1.1 www.google.com
;; TLS session (TLS1.3)-(ECDHE-X25519)-(ECDSA-SECP256R1-SHA256)-(AES-256-GCM)
;; ->>HEADER<<- opcode: QUERY; status: NOERROR; id: 34748
;; Flags: qr rd ra; QUERY: 1; ANSWER: 8; AUTHORITY: 0; ADDITIONAL: 1

;; EDNS PSEUDOSECTION:
;; Version: 0; flags: ; UDP size: 1232 B; ext-rcode: NOERROR
;; PADDING: 293 B

;; QUESTION SECTION:
;; www.google.com.              IN      A

;; ANSWER SECTION:
www.google.com.         298     IN      A       142.251.150.119
www.google.com.         298     IN      A       142.251.156.119
www.google.com.         298     IN      A       142.251.154.119
www.google.com.         298     IN      A       142.251.153.119
www.google.com.         298     IN      A       142.251.151.119
www.google.com.         298     IN      A       142.251.155.119
www.google.com.         298     IN      A       142.251.157.119
www.google.com.         298     IN      A       142.251.152.119

;; Received 468 B
;; Time 2026-04-20 10:04:54 BST
;; From 1.1.1.1@853(TLS) in 97.6 ms

Yes, it looks much like the output from dig, but it’s using DNS-over-TLS (see the last line for confirmation). You can also use dog, a Rust-based DNS client that tends to be more user-friendly for encrypted queries. While dog is popular for its JSON output, it’s more painful to install. kdig has the advantage of being human readable and installable from the normal package managers you use.

DoH (DNS-over-HTTPS) wraps DNS queries inside standard HTTPS traffic on port 443. This makes it nearly indistinguishable from regular web traffic, which is excellent for bypassing censorship but harder for enterprise monitoring. You can use kdig and dog again:

$ kdig @1.1.1.1 +https cloudflare.com
;; TLS session (TLS1.3)-(ECDHE-SECP256R1)-(ECDSA-SECP384R1-SHA384)-(AES-256-GCM)
;; HTTP session (HTTP/2-POST)-(1.1.1.1/dns-query)-(status: 200)
;; ->>HEADER<<- opcode: QUERY; status: NOERROR; id: 0
;; Flags: qr rd ra ad; QUERY: 1; ANSWER: 2; AUTHORITY: 0; ADDITIONAL: 1

;; EDNS PSEUDOSECTION:
;; Version: 0; flags: ; UDP size: 1232 B; ext-rcode: NOERROR
;; PADDING: 389 B

;; QUESTION SECTION:
;; cloudflare.com.              IN      A

;; ANSWER SECTION:
cloudflare.com.         153     IN      A       104.16.133.229
cloudflare.com.         153     IN      A       104.16.132.229

;; Received 468 B
;; Time 2026-04-20 10:16:30 BST
;; From 1.1.1.1@443(HTTPS) in 128.9 ms

I prefer kdig to dog as it is more readily available on more platforms, being part of Knot DNS.
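Because DoH is just HTTPS, you can also poke at it with curl when neither kdig nor dog is to hand. Cloudflare’s resolver accepts a JSON form of the query when the request carries an accept: application/dns-json header. The sketch below is a rough illustration: the function names and the DOH_ENDPOINT override variable are made up for this post, and the default endpoint hostname itself has to be resolved before the request can be sent.

```shell
# doh_url: build the URL for a DNS-over-HTTPS JSON lookup.
# DOH_ENDPOINT is a hypothetical override; the default is Cloudflare's public endpoint.
doh_url() {
  printf '%s?name=%s&type=%s\n' \
    "${DOH_ENDPOINT:-https://cloudflare-dns.com/dns-query}" "$1" "${2:-A}"
}

# doh_lookup: fetch the answer as JSON. Needs curl and network access.
doh_lookup() {
  curl --silent --show-error \
       --header 'accept: application/dns-json' \
       "$(doh_url "$@")"
}
```

Calling doh_lookup cloudflare.com AAAA sends the request; doh_url on its own only prints the URL, which is handy for checking what would be asked before anything leaves the machine.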

Getting to the server - traceroute and mtr

One of the common problems you have to solve is “can I get there from here?” - is the resolver (or name server) I am checking reachable from where I am? For this, there are two tools - traceroute (tracert on Windows) and mtr. You will probably have access to traceroute, but mtr may need to be installed.

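Traceroute sends probe packets (UDP by default on most Unix-like systems, ICMP echo with tracert on Windows) with an ever-increasing time-to-live, and uses the ICMP “time exceeded” replies from each router to plot the route a packet will take to your specified destination. Its functionality is basic: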

$ traceroute -n linux.die.net.
traceroute: Warning: linux.die.net. has multiple addresses; using 172.67.69.187
traceroute to linux.die.net (172.67.69.187), 64 hops max, 40 byte packets
 1  192.168.1.1  8.735 ms  6.944 ms  7.335 ms
 2  212.158.250.39  13.987 ms  13.826 ms  10.548 ms
 3  63.130.172.37  11.505 ms  11.517 ms *
 4  90.255.251.37  17.378 ms  11.493 ms  14.453 ms
 5  162.158.32.9  15.367 ms  14.616 ms
    162.158.32.45  13.774 ms
 6  172.67.69.187  10.802 ms  14.140 ms  9.756 ms

Note the * at hop 3. It means that no reply came back for that probe before the timeout - typically because the router at that hop rate-limits or ignores the ICMP replies that traceroute relies on. This is normal on the Internet and not a concern. You can also see that the probes for hop 5 were answered by two different routers - again, relatively common.

On the Internet, routes may be asymmetric - you may not take the same route back from a destination as you did to get to that destination. Thus, in an ideal world, you would be able to do a traceroute from either end. Unfortunately, it doesn’t work like that. Fortunately, mtr helps a little: it cannot trace the return path, but its per-hop loss statistics give hints about where round trips are failing. mtr combines the functionality of traceroute and ping in a single network diagnostic tool. While traceroute shows a single snapshot, mtr provides rolling statistics, which makes it better at catching intermittent packet loss.

$ sudo mtr google.com -c 10 -r
Start: 2026-04-21T09:33:36+0100
HOST: DC2K0HQTXH                  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 162.158.73.16              0.0%    10   27.1  13.3  10.1  27.1   5.0
  2.|-- 162.158.73.16              0.0%    10   13.2  13.6  11.4  18.3   2.4
  3.|-- 104.28.0.0                 0.0%    10   11.6  13.1  11.5  16.3   1.7
  4.|-- 162.158.73.1               0.0%    10   22.2  18.8  12.1  39.5   8.1
  5.|-- 162.158.32.44              0.0%    10   13.5  18.9  12.7  35.2   8.0
  6.|-- man-b2-link.ip.twelve99.n  0.0%    10   21.1  18.6  13.8  33.1   5.7
  7.|-- dln-b6-link.ip.twelve99.n  0.0%    10   18.9  20.1  18.0  26.1   2.6
  8.|-- dln-b3-link.ip.twelve99.n 30.0%    10   16.3  17.0  15.9  20.5   1.7
  9.|-- 72.14.243.178              0.0%    10   23.7  18.2  15.9  23.7   2.7
 10.|-- lclhrb-in-f139.1e100.net   0.0%    10   28.6  30.5  23.8  65.0  12.4

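Notice that there is packet loss at hop 8: 30% means that 3 out of 10 probes to that hop didn’t complete the round trip. When loss shows up at one hop but not at the hops after it, the usual cause is that router rate-limiting its ICMP replies, or a problem on the return leg from that router, not loss of forwarded traffic. The final hop shows 0% packet loss, so it’s not a problem.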
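That reading rule (loss only matters if it is still there at the final hop) is easy to script. This is a minimal sketch, assuming the report layout printed by mtr -r above with one hostname or address per hop; mtr_verdict is a name I made up.

```shell
# mtr_verdict: read an "mtr -r" report on stdin and say whether any loss
# persists to the final hop. Hypothetical helper; assumes hop lines shaped
# like "  8.|-- host  30.0%  10 ..." with the loss figure in the third column.
mtr_verdict() {
  awk '
    /\|--/ {
      loss = $3; sub(/%$/, "", loss)       # "30.0%" -> "30.0"
      hop = $1;  sub(/\..*$/, "", hop)     # "8.|--" -> "8"
      if (loss + 0 > 0) lossy = lossy (lossy ? "," : "") hop
      last = loss + 0                      # ends up holding the final hop loss
    }
    END {
      if (last > 0)   print "end-to-end loss: " last "%"
      else if (lossy) print "no end-to-end loss; intermediate loss at hop(s) " lossy
      else            print "no loss"
    }
  '
}
```

Piping the report in, as in sudo mtr google.com -c 10 -r | mtr_verdict, gives a one-line summary. For the run above, that summary says there is no end-to-end loss and names hop 8 as the only lossy hop.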

You can install mtr using brew on Mac and through the normal Linux package managers. On Windows, there is a package called WinMTR.

Note: mtr needs to create raw sockets to send ICMP or UDP packets, which is a privileged operation on most Unix-like systems. You may need to run mtr with sudo, depending on permissions.

Use another resolver

If you work in an enterprise (or you have a solid home lab), you are probably running your own recursive resolver. What if that is broken? At that point, you will want to do comparisons between the results from your resolver and someone else’s resolver. Fortunately, there are plenty of those that you can use for free:

  • Cloudflare has 1.1.1.1 (this is the one I recommend because it’s everywhere you are and privacy focused)
  • Google has 8.8.8.8 and 8.8.4.4
  • Quad9 has 9.9.9.9, 9.9.9.10, 9.9.9.11
  • Cisco OpenDNS has 208.67.222.222 and 208.67.220.220

When you are wondering what the Internet sees, target a public resolver instead of your own resolver.
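The comparison itself is a short loop around dig +short. The sketch below is a rough one: compare_resolvers is a name I made up, and 192.168.1.1 in the usage line stands in for whatever your own resolver’s address is.

```shell
# compare_resolvers: print one line per resolver with the sorted A records
# it returned for a name. Hypothetical helper; needs dig on the PATH.
# Runs in a subshell so its variables do not leak into the caller.
compare_resolvers() (
  name=$1
  shift
  for resolver in "$@"; do
    # Sort so that answer ordering (which resolvers rotate) does not look like a difference
    answers=$(dig +short "$name" A "@$resolver" | sort | tr '\n' ' ')
    printf '%s: %s\n' "$resolver" "${answers% }"
  done
)
```

Running compare_resolvers cloudflare.com 192.168.1.1 1.1.1.1 8.8.8.8 puts your resolver next to two public ones. Differences are not automatically a fault (some names legitimately resolve differently depending on where you ask from), but an empty line for your resolver next to populated lines for the others is a strong hint.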

Web sites you might want to know

One of the common things to watch for is mis-configured DNS. It works - it’s just giving out the wrong information. Here are a few websites I have in my collection:

  • MxToolbox is focused on DNS for email transfer. It will not only do DNS lookups, but it will analyze email headers, see if your SMTP outbound IP is on a blacklist, and perform basic SMTP diagnostics. This is a useful site if your DNS problems stem from an email report.
  • DNSChecker is a basic DNS lookup - much like nslookup, but using someone else’s computer. Its main feature is being able to see whether your change has propagated out to the Internet or not.
  • DNSTools provides access to all the tools that you can run on your local machine - but on someone else’s machine.
  • who.is provides access to the registrar information. The main job here is to understand which name servers are authoritative for a specific domain.

You also want to have specific tests for handling DNSSEC:

  • Verisign Labs provides a DNSSEC domain debugger.
  • DNSViz visualizes the status of a DNS zone, explicitly providing a visual analysis of the DNSSEC chain.
  • DNS CAA Tester lets you view the certificate authority authorization (CAA) records published in your DNS, which let you specify which certificate authorities are allowed to issue certificates for the domains you own.

Final thoughts

It’s always DNS, but it doesn’t have to be. With these tools in your pocket, you can quickly and easily determine if the problem is actually DNS. You still need to “learn DNS”, but there are lots of resources for that. Make sure the tools are available on all the machines you use to diagnose issues and that you have practiced how to use them before they are needed. You need to know what “good” looks like before you can determine if the current state is good or bad.

DNS will never be problematic again with these tools, DNS know-how, and basic troubleshooting skills.
