Editor’s note: Chris will be doing a workshop called Setting up Your Web App – Powerful Alternatives to LAMP at FOWA London in October.
DNS is a big topic, and I’m certainly not going to try to cover all of it here. However, I think that by the end of this article, we should have covered the parts that developers should be aware of and understand properly. We’ll start by talking about the sequence of servers that may get consulted whenever a DNS lookup is performed.
The DNS system is in place to translate the representations of internet destinations from things that we humans care about (domain names) to things that computers can use (IP addresses). The system is naturally distributed as we’ll see, which keeps the servers that handle DNS requests from getting too overloaded. There are two distinct sorts of nameservers that we’ll discuss, and each is responsible for a different sort of task. They are:
- Authoritative nameservers
- Resolving nameservers
If you’ve ever moved a domain name between hosting providers, you’ve probably had to do something to “update the nameservers” for the domain. What you are doing there is changing the authoritative nameservers for that domain. Similarly, if you perform a whois query on a domain, it will tell you what the authoritative nameservers are for it. As the name suggests, the job of an authoritative nameserver is to know all the DNS information for the domains assigned to it.
What’s not understood as well is that whatever computer you are using to read this article is almost certainly not talking to an authoritative nameserver. It’s talking to a resolving nameserver, often referred to as just a “resolver”. The resolvers are typically provided to you by your ISP. Their job is to answer the requests of client computers for arbitrary DNS lookups.
When you surf to a website, say amazon.com, your computer asks the resolver what the corresponding IP address is (assuming it’s not stored already in a local cache somewhere on your computer). If the resolver knows, it just answers immediately. If it doesn’t know, the following sequence of things happens very, very quickly:
- The resolver determines what nameservers are authoritative for the domain.
- The resolver asks one of the authoritative nameservers for the needed info.
- Upon getting the info, the resolver stores the information, and answers your computer.
So in this way, DNS info gets distributed around the internet. Typically lots of people will be using any particular group of resolvers from an ISP, and once the resolvers know the answers for any given lookup, they remember the answers for future use. This is why Google’s nameservers can stay up. It’s not as if they get queried every single time anybody goes to google.com, since the relevant resolvers have that information cached close to all the people surfing there.
Of course, eventually, a resolving nameserver will need to check back in with an authoritative one to see if anything has changed. Otherwise, no DNS information could ever update on the Internet. I’ll explain how this works next.
So in the last section, we noted how a resolver will ask an authoritative nameserver for information that it doesn’t already have. It will always get back a little more than just the information it asked for though. Specifically, it will get something called a TTL value for the lookup, which stands for “Time To Live”. The TTL is measured in seconds, and it tells the resolver how long the information it got back should be considered valid. After the amount of time specified in the TTL has passed, the resolver should check back in with the authoritative server even if it still has the information for the lookup stored.
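To make the caching behavior concrete, here is a minimal Python sketch of a resolver that honors TTLs. It is a toy model, not how any real resolver is implemented: the `CachingResolver` class, the `AUTHORITATIVE` dictionary, and the 192.0.2.x addresses are all made up for illustration, and a fake clock stands in for real time.

```python
import time

# Hypothetical authoritative data: name -> (ip, ttl_seconds). Placeholder values.
AUTHORITATIVE = {"example.com": ("192.0.2.10", 14400)}


class CachingResolver:
    """Toy resolver that remembers answers until their TTL runs out."""

    def __init__(self, authoritative, clock=time.monotonic):
        self.authoritative = authoritative
        self.clock = clock
        self.cache = {}            # name -> (ip, expires_at)
        self.upstream_queries = 0  # how often we had to ask the authoritative side

    def lookup(self, name):
        now = self.clock()
        cached = self.cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]       # still valid, answer straight from the cache
        # Missing or expired: ask the authoritative side and store the answer.
        self.upstream_queries += 1
        ip, ttl = self.authoritative[name]
        self.cache[name] = (ip, now + ttl)
        return ip


# A fake clock lets us step through four hours without waiting.
fake_now = [0.0]
resolver = CachingResolver(AUTHORITATIVE, clock=lambda: fake_now[0])

first = resolver.lookup("example.com")                 # goes upstream
AUTHORITATIVE["example.com"] = ("192.0.2.20", 14400)   # the record changes
fake_now[0] = 3600.0
during_ttl = resolver.lookup("example.com")            # old answer, from cache
fake_now[0] = 14400.0
after_ttl = resolver.lookup("example.com")             # TTL expired, new answer
print(first, during_ttl, after_ttl, resolver.upstream_queries)
```

Only two queries ever reach the authoritative data here, no matter how many clients ask in between, which is the whole point of the TTL.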
This concept is important enough that it’s probably worth talking through an example. Let’s say that I have the TTL for chrislea.com set at four hours, and that a resolver asks my authoritative nameservers for the IP address at exactly 12:00pm. Let’s then assume that I change the IP address in the authoritative servers at exactly 12:01pm. In this scenario, any client computer talking to the resolver that checked in at noon won’t pick up the IP address change until 4:00pm. This is why it can take some time for an IP address change to propagate across the Internet. Had I been smarter, at 8:00am that morning, I would have lowered the TTL to some low value, say 300 seconds. That way, by the time I wanted to make the change at noon, all the resolvers that cared to check would have gotten the five minute TTL value. Therefore, in this case, the IP address change would only take five minutes to propagate across the Internet. After the propagation was finished, I would have updated the TTL again and put it back to a higher value to save strain on my authoritative nameservers.
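The arithmetic in that example can be checked with a few lines of Python. This is only a worked calculation under the assumptions of the scenario above: the function name `old_answer_served_until` is made up, and the calendar date is arbitrary since only the times of day matter.

```python
from datetime import datetime, timedelta


def old_answer_served_until(fetched_at, ttl_seconds, changed_at):
    """Return the latest moment a resolver may still hand out the old answer."""
    expires_at = fetched_at + timedelta(seconds=ttl_seconds)
    # If the cached entry had already expired before the change, the resolver
    # fetches the new value on its next query, so the change time is the bound.
    return max(expires_at, changed_at)


noon = datetime(2009, 1, 1, 12, 0)      # arbitrary date, only the times matter
change = noon + timedelta(minutes=1)    # record edited at 12:01pm

four_hour_ttl = old_answer_served_until(noon, 4 * 60 * 60, change)
five_minute_ttl = old_answer_served_until(noon, 300, change)

print(four_hour_ttl.strftime("%H:%M"))    # 16:00
print(five_minute_ttl.strftime("%H:%M"))  # 12:05
```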
There’s really no standard for what TTL values are supposed to be. Typical values might correspond to 12 or 24 hours, though they certainly can be longer in some cases. One important note though relates to the “lowering the TTL” trick I just described. I recommend never using a TTL lower than 300 seconds. This is because some resolvers are set up such that if they see a TTL value that’s considered “too low”, it’s assumed to be wrong, and the resolver will just use an internal default value. Said internal default will generally be something like 12 or 24 hours, so you would effectively have induced the exact opposite behavior of what you were shooting for.
We’ve now covered how the relevant servers talk to each other, and how DNS distributes information around. Now, we can dive into exactly what that information is and how it’s used. When a DNS query happens, it’s for a specific type of record. There are lots of different types of DNS records, and we’re not going to talk about all of them here, but we’re going to cover the most common ones that you should know about.
A lookup for an A record is what people are generally talking about when they generically say “DNS Lookup”. Your computer gives the resolver a domain name and gets an IP address back in response. It’s the sort of lookup that occurs when you are surfing around in a web browser, pinging something, or telnetting to a port.
An important point here is that every time you add something ending in a dot character “.” to the left of a domain, then as far as DNS is concerned, it’s a different domain. For example, the following two domains are distinct:
- chrislea.com
- www.chrislea.com
In practice, basically everybody points the www. to the same IP address as just the base domain, but you don’t have to. They could point to completely different places. Conveniently, you can create a “star” record for subdomains if you want a catchall. For example, I have *.chrislea.com set up as a catchall for my personal domain, and the catchall is pointed to the same IP address as just chrislea.com. Therefore, if you try and ping the domain vogons.chrislea.com, which doesn’t have an explicit entry, it will match the *.chrislea.com entry and ping the same IP as if you had just pinged chrislea.com. However, if you ping debian01.chrislea.com, which does have an explicit entry, you will see the IP assigned to that domain directly. If a catchall doesn’t exist, and you query a subdomain with no entry, the nameserver will respond that there is no entry, and the behavior of the program you’re using in that case is dependent on the program itself.
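The catchall behavior can be sketched in Python as follows. This is a simplified model: real nameservers follow more detailed wildcard rules than "swap the leftmost labels for a star", and the `ZONE` dictionary, the `lookup_a` function, and the 192.0.2.x addresses are placeholders I made up for illustration.

```python
# Hypothetical zone data. The names mirror the examples above; the
# addresses are placeholder documentation IPs, not the real ones.
ZONE = {
    "chrislea.com": "192.0.2.1",
    "*.chrislea.com": "192.0.2.1",
    "debian01.chrislea.com": "192.0.2.2",
}


def lookup_a(name, zone):
    """Return the address for name, falling back to a catchall entry."""
    if name in zone:
        return zone[name]          # an explicit entry always wins
    labels = name.split(".")
    # Replace leading labels with "*" and look for a matching catchall.
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in zone:
            return zone[candidate]
    return None                    # no entry at all


print(lookup_a("vogons.chrislea.com", ZONE))    # matched by the catchall
print(lookup_a("debian01.chrislea.com", ZONE))  # explicit entry

no_catchall = {k: v for k, v in ZONE.items() if not k.startswith("*")}
print(lookup_a("vogons.chrislea.com", no_catchall))  # None
```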
Understanding how lookups for A records happen is quite straightforward as we’ve seen. There are more complex queries though that require multi stage lookups to complete. Let’s talk about one of these next.
The MX part of the name stands for “Mail Exchanger”. As it turns out, email is so special that it has its own class of DNS entries. They aren’t too complex, though it is a bigger topic than the A records we just covered. To understand it, we first have to cover just the basics of how an email gets from your computer to wherever it’s supposed to go.
Many people use desktop email clients such as Outlook, Thunderbird, Mail.app, or (if you’re lucky) Evolution. When you send an email out using one of these programs, it’s not this client software that’s actually responsible for getting that email to its final destination. What happens is that your client program forwards the mail on to a server running Mail Transport Agent (MTA) software such as postfix. This MTA program is what actually sends the mail for you. When it gets an email message, the MTA looks up the domain to the right of the @ symbol in the email address, and then checks DNS for the MX record(s) for that domain. The response should be one or more domain names. Again, just to be clear, it gets domain names in response, not IP addresses, and each of those names has an A record of its own. The MTA then looks up the IP address for one of those names, just as discussed in the previous section, and sends the mail to that IP address.
As an example, let’s assume you’re sending an email to my GMail account. When the MTA you’re using gets the email message, it will check the MX records for gmail.com. The response it gets back will (currently) include:
- gmail.com mail is handled by 40 alt4.gmail-smtp-in.l.google.com.
- gmail.com mail is handled by 10 alt1.gmail-smtp-in.l.google.com.
- gmail.com mail is handled by 20 alt2.gmail-smtp-in.l.google.com.
- gmail.com mail is handled by 5 gmail-smtp-in.l.google.com.
- gmail.com mail is handled by 30 alt3.gmail-smtp-in.l.google.com.
Don’t worry about the syntax here. The important points to note are that
- There are multiple domains returned in the response.
- These domains have A records of their own.
- There is a number, used for priority, attached to each domain.
The priority number tells the MTA in what order to try the different domains to send the message to. It will try from lowest to highest. So in this case, the MTA will first try to deliver the message to gmail-smtp-in.l.google.com since that has a priority of 5. If it can’t deliver it to the IP that corresponds to that domain, it will try alt1.gmail-smtp-in.l.google.com next, with a priority of 10, and so on.
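Here is a small Python sketch of that ordering and fallback logic, using the gmail.com records listed above. The `deliver` function and its `try_host` callback are hypothetical stand-ins for a real delivery attempt, and this is not how postfix or any particular MTA is written. Real MTAs also typically pick randomly among hosts that share the same priority, which this sketch does not do.

```python
# (priority, mail exchanger) pairs, copied from the gmail.com response above.
MX_RECORDS = [
    (40, "alt4.gmail-smtp-in.l.google.com."),
    (10, "alt1.gmail-smtp-in.l.google.com."),
    (20, "alt2.gmail-smtp-in.l.google.com."),
    (5, "gmail-smtp-in.l.google.com."),
    (30, "alt3.gmail-smtp-in.l.google.com."),
]


def delivery_order(records):
    """Hosts sorted from lowest priority number (tried first) to highest."""
    return [host for _, host in sorted(records)]


def deliver(records, try_host):
    """Try each host in order; return the first one that accepts the mail."""
    for host in delivery_order(records):
        if try_host(host):
            return host
    return None  # every exchanger refused or was unreachable


# Pretend the primary exchanger is down and everything else is up.
down = {"gmail-smtp-in.l.google.com."}
accepted_by = deliver(MX_RECORDS, lambda host: host not in down)
print(delivery_order(MX_RECORDS)[0])
print(accepted_by)
```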
There are a variety of reasons for this infrastructure to be in place. First, it allows for some resiliency in email sending, since the receiving entity can have more than one place that mail can be delivered to. If a server goes down, then there may be another one still up that can handle getting the message. Second, it means that the place where your mail goes can be a completely different server, or group of servers, than where your website’s IP address is. And, it can do this without having to have really annoying domains to the right of the @ symbol in your email address. This is frequently done to use third party mail services for your mail, such as a hosted Exchange service or GMail.
Next, let’s cover another two stage lookup type of record. This one is actually a bit simpler.
A CNAME, which stands for Canonical Name, is best thought of as an alias. The “Canonical Name” name is very poorly worded, and I recommend just always referring to it as a CNAME. The way a CNAME works is that you have a domain you control, and you assign it to be an alias of some other domain name. For example, I currently have the domain thechris.org CNAME’d to the domain virb.com. There is no A record set up for thechris.org, so if you surf there in your browser, the CNAME value is what’s returned by DNS. Much like with MX records, the domain virb.com is returned, and then a second lookup happens to determine what IP that is. Then, your browser knows to go to that IP address. However, it will still show thechris.org in the URL bar.
Now, I could have simply pointed the domain thechris.org to the virb.com IP address using an A record. Virb would not know any difference if I did. The problem with that approach is that if, in the future, we ever change the IP address for virb.com, I would have to go and explicitly update my A record for thechris.org to whatever the new IP address was in order for things to keep working. Using a CNAME alias like this, things just keep working automagically as we’d like them to.
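The two stage lookup, and the reason the alias keeps working after an IP change, can be sketched like this. The `RECORDS` table and `resolve` function are made up for illustration, and the addresses are placeholders rather than Virb’s real ones.

```python
# Hypothetical record table keyed by (name, record type).
RECORDS = {
    ("thechris.org", "CNAME"): "virb.com",
    ("virb.com", "A"): "192.0.2.50",     # placeholder address
}


def resolve(name, records, max_hops=8):
    """Follow CNAME aliases until an A record is found."""
    for _ in range(max_hops):
        if (name, "A") in records:
            return records[(name, "A")]
        if (name, "CNAME") in records:
            name = records[(name, "CNAME")]   # second lookup, on the target
            continue
        return None                           # no entry for this name
    raise RuntimeError("too many CNAME hops")


before = resolve("thechris.org", RECORDS)
RECORDS[("virb.com", "A")] = "192.0.2.99"     # the target's IP changes
after = resolve("thechris.org", RECORDS)      # the alias follows along
print(before, after)
```

Nothing about thechris.org had to be edited for the second lookup to return the new address.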
Okay, enough with these fancy double lookup records. Let’s move on to our next topic which will bring us back to a straightforward lookup mechanism, albeit a good one to understand.
The TXT record is really boring, at least in terms of how it’s defined in the RFC. Essentially, this field is supposed to be a “comments” area, where you can put whatever you want, but it’s not supposed to contain important or machine readable information. So why am I mentioning it? Well, as it turns out, even though you’re not supposed to put important stuff in there, people do it all the time. The most common reason currently is to provide SPF data. Really covering SPF is outside of the scope of this article, but in a nutshell, it’s a way to specify via DNS which IP addresses are allowed to send mail out for a certain domain. For example, here is the TXT record currently in place for chrislea.com:
v=spf1 a mx ip4:126.96.36.199 ip4:188.8.131.52 ip4:184.108.40.206 ip4:220.127.116.11 ~all
This says that the four IPs listed there are “blessed” by me as being allowed to send email from the domain chrislea.com. The information is used to help in determining if a message is SPAM or not. For example, if I send a message to my GMail account from my chrislea.com domain, and I look at the headers in GMail by selecting “show original” from the actions, one of the things I see is this.
Received-SPF: pass (google.com: domain of firstname.lastname@example.org designates 18.104.22.168 as permitted sender) client-ip=22.214.171.124;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of email@example.com designates 126.96.36.199 as permitted sender) firstname.lastname@example.org
So Google checked that I had an SPF record set up, and that the sending IP address from my MTA was in fact permitted. This contributed to a “not spammy” score and the message made it to my Inbox.
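To give a feel for what a receiving server does with such a record, here is a deliberately incomplete Python sketch. It only understands `ip4:` entries and the trailing `all` term, and it skips everything else (including `a` and `mx`), so it is nowhere near a full SPF implementation. The function name `spf_ip4_check`, the `RECORD` string, and the 192.0.2.x addresses are placeholders I made up.

```python
import ipaddress

# Same shape as the record above, with placeholder documentation addresses.
RECORD = "v=spf1 a mx ip4:192.0.2.10 ip4:192.0.2.11 ~all"

# How the qualifier in front of "all" maps to a result.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_ip4_check(record, sender_ip):
    """Check sender_ip against the ip4: entries of an SPF record."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"                       # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return "pass"               # explicitly listed sender
        elif term.lstrip("+-~?") == "all":
            qualifier = term[0] if term[0] in QUALIFIERS else "+"
            return QUALIFIERS[qualifier]    # catch-all verdict
        # Other mechanisms (a, mx, include, ...) are ignored in this sketch.
    return "neutral"


print(spf_ip4_check(RECORD, "192.0.2.10"))    # pass
print(spf_ip4_check(RECORD, "198.51.100.7"))  # softfail
```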
It’s important to note again that this technique is really sort of a hack. The DNS specification clearly states that useful information such as this isn’t supposed to go into TXT records. It’s done though because it led to people being able to adopt SPF very quickly, since all the nameserver software has supported TXT records forever. In 2005, IANA introduced a new record, specifically called an SPF record, to get around this. The syntax is identical to what people are putting into the TXT records. However, due to how recently it was introduced, not all nameserver software supports this new record type yet. Therefore most people are still putting SPF information into TXT records.
There is one more record type that I’m going to go over, and as it turns out, it’s also relevant to you largely because of the implications with SPAM.
A PTR record, or “pointer record”, is generally used to set up what’s called a reverse DNS lookup. Basically, the idea is that if there is a server associated with a given IP address, you should be able to ask DNS about that IP address and get some indication that there’s a server there, in the form of a domain name. This is important because many MTA programs that receive mail are set up to flatly reject any message from an IP address that does not have reverse DNS set up. The idea here is that bad people doing spammy things aren’t going to properly set up reverse DNS, and thus any message coming from such an IP address is considered untrustworthy.
As an example, my personal mail server at mail.chrislea.com has the IP address 188.8.131.52 currently. If I check reverse DNS for this, I will see that the reverse entry is for debian01.chrislea.com, as this is the name of the server that I handle my mail on. Note that it’s not critical for the domain names to match here. The forward lookup is for the “mail” subdomain and the reverse lookup returns the “debian01” subdomain. Some SPAM classifying software might use the fact that the base domains match in part of their scoring, but it’s generally not that critical. What is critical is the fact that the reverse DNS entry exists and is a valid domain name that itself has a forward DNS entry. Because of this (and the fact I don’t send spammy looking emails), my mail typically doesn’t get rejected when I send out to other people. It is a good idea to make sure that whoever you are using to send mail through has reverse DNS set up properly for their sending servers. That said, if you haven’t experienced massive problems with your mail going into SPAM folders, they almost certainly already have.
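The check described above can be sketched in Python. The standard library’s `ipaddress` module can build the special in-addr.arpa name that a reverse lookup actually queries. The `PTR_RECORDS` and `A_RECORDS` tables are hypothetical stand-ins for real DNS data, and the address is a placeholder.

```python
import ipaddress

# Hypothetical DNS data with a placeholder address.
PTR_RECORDS = {"25.2.0.192.in-addr.arpa": "debian01.chrislea.com"}
A_RECORDS = {
    "mail.chrislea.com": "192.0.2.25",
    "debian01.chrislea.com": "192.0.2.25",
}


def reverse_name(ip):
    """Build the name a PTR lookup queries: octets reversed + in-addr.arpa."""
    return ipaddress.ip_address(ip).reverse_pointer


def has_valid_reverse_dns(ip, ptr_records, a_records):
    """True if the IP has a PTR entry whose name has a forward entry too."""
    hostname = ptr_records.get(reverse_name(ip))
    if hostname is None:
        return False               # no reverse DNS set up at all
    return hostname in a_records   # the name must resolve forward as well


print(reverse_name("192.0.2.25"))
print(has_valid_reverse_dns("192.0.2.25", PTR_RECORDS, A_RECORDS))
print(has_valid_reverse_dns("192.0.2.99", PTR_RECORDS, A_RECORDS))
```

Note that the sketch, like the text, does not require the reverse name to be the same as the “mail” name.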
Using host to check DNS Entries
There is a command line program available on every *NIX type system I know of called host which can be used to check DNS entries directly from nameservers. If you are using Microsoft Windows, you won’t have the host program. I’d recommend scrapping Windows, installing Ubuntu or Fedora on your computer and using that. But since that may not be practical for you, you might want to just skip to the next section. 🙂
Using host is not hard. First you’ll need to open up a Terminal. If you are using OS X, the Terminal program is in the Applications -> Utilities folder. If you are using Linux or some other *NIX, I’ll assume you already know how to open up a Terminal. 🙂
It’s easiest to learn by example, so now that you have the terminal open, type in the following:
host chrislea.com
The output you get should look like this:
chrislea.com has address 184.108.40.206
chrislea.com mail is handled by 10 mail.chrislea.com.
You just looked up the A record for chrislea.com, and found that the IP is currently 220.127.116.11. By default, host also looks up the MX records for a domain when you do this sort of lookup. That’s what you see in the second line there. I only have one MX record for my personal email. If I had more, as discussed before, they would all show up here.
A very important thing to be aware of here is that with the above command, you just queried the resolving nameservers that your computer is currently using. If you want to query a different nameserver, you put that in as a final argument like so.
host chrislea.com ns2.mediatemple.net
This should return essentially the same information since the (mt) Media Temple nameservers are authoritative for chrislea.com. If it doesn’t, then it probably means that the authoritative nameserver has been updated with new information, like a new IP address, but that this hasn’t propagated to the resolver you are using yet.
If you want to look at a record that’s not an A record, you can accomplish this with the -t flag using host. Let’s say you want to check on the TXT record for my domain. The command and results on my Linux machine look like this.
chl@melian:~$ host -t txt chrislea.com
chrislea.com descriptive text "v=spf1 a mx ip4:18.104.22.168 ip4:22.214.171.124 ip4:126.96.36.199 ip4:188.8.131.52 ~all"
Here, you can see that it returned the same SPF information that we talked about before, because I have that set up in the TXT record for chrislea.com. As another example, let’s look at the MX records for carsonified.com.
chl@melian:~$ host -t mx carsonified.com
carsonified.com mail is handled by 0 ASPMX4.GOOGLEMAIL.com.
carsonified.com mail is handled by 0 ASPMX5.GOOGLEMAIL.com.
carsonified.com mail is handled by 5 ALT1.ASPMX.L.GOOGLE.com.
carsonified.com mail is handled by 5 ALT2.ASPMX.L.GOOGLE.com.
carsonified.com mail is handled by 10 ASPMX.L.GOOGLE.com.
carsonified.com mail is handled by 0 ASPMX2.GOOGLEMAIL.com.
carsonified.com mail is handled by 0 ASPMX3.GOOGLEMAIL.com.
As you can probably guess, Ryan and crew are using GMail to handle their mail needs.
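If you ever want to post-process this kind of output in a script, a rough Python sketch might look like the following. It assumes the line format stays exactly as shown above, which may not hold for every version of host, and the `parse_mx_lines` function is my own invention.

```python
# The MX output from above, pasted in as a string.
HOST_OUTPUT = """\
carsonified.com mail is handled by 0 ASPMX4.GOOGLEMAIL.com.
carsonified.com mail is handled by 0 ASPMX5.GOOGLEMAIL.com.
carsonified.com mail is handled by 5 ALT1.ASPMX.L.GOOGLE.com.
carsonified.com mail is handled by 5 ALT2.ASPMX.L.GOOGLE.com.
carsonified.com mail is handled by 10 ASPMX.L.GOOGLE.com.
carsonified.com mail is handled by 0 ASPMX2.GOOGLEMAIL.com.
carsonified.com mail is handled by 0 ASPMX3.GOOGLEMAIL.com.
"""

MARKER = " mail is handled by "


def parse_mx_lines(text):
    """Turn host's MX lines into sorted (priority, exchanger) pairs."""
    records = []
    for line in text.splitlines():
        if MARKER not in line:
            continue                          # skip anything that isn't MX
        _domain, rest = line.split(MARKER, 1)
        priority, exchanger = rest.split()
        # DNS names are case-insensitive; drop the trailing root dot.
        records.append((int(priority), exchanger.rstrip(".").lower()))
    return sorted(records)


parsed = parse_mx_lines(HOST_OUTPUT)
for priority, exchanger in parsed:
    print(priority, exchanger)
```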
Finally, if you want to look up a PTR record to see the reverse DNS for an IP, just feed that IP address to host like you’d expect to.
chl@melian:~$ host 184.108.40.206
220.127.116.11.in-addr.arpa domain name pointer debian01.chrislea.com.
You can put a different nameserver in as the final argument for any of those last examples if you want to query something other than your resolvers. And, of course, if you’d like to learn about all the other fun options the host command has, you should consult the man page.
Using nslookup to check DNS Entries
If you are using Microsoft Windows, I’m sorry (seriously… Ubuntu will almost certainly install painlessly on your hardware, and it’s much prettier). But, if that’s the boat you are in, the tool they provide to query nameservers directly is called nslookup. I should point out that I’ve basically never used Windows Vista, so I am assuming it works the same on there as it does in Windows XP. If I’m wrong and there’s an astute reader out there who uses Vista and can correct me, please do so.
You’ll first need to open up a DOS shell, which is sort of the same as opening up a Terminal on a *NIX type system. In Windows XP, this can be accomplished by clicking Start, clicking Run, typing “cmd” with no quotes into the dialog and pressing Enter. I’m told that in Vista, you simply click Start, type “cmd” with no quotes, and press Enter.
The nslookup command has a non-interactive mode, and a shell mode. I find the shell mode much more useful so that’s what I’m going to explain. In the DOS prompt, simply type nslookup in order to enter the shell. Once you’re in the shell, type the domain name you are interested in followed by a period to do a lookup of the A record for that domain. If you don’t remember the trailing period, you won’t get the expected results. As an example:
If you’d like to do a different type of query, you use the syntax set query=<type>. So, if I’d like to check on the TXT record for chrislea.com, the commands would look like this.
> set query=txt
chrislea.com text =
"v=spf1 a mx ip4:18.104.22.168 ip4:22.214.171.124 ip4:126.96.36.199 ip4:188.8.131.52 ~all"
This should look familiar to you from the examples that used the host command (assuming you read those). Now, you’ll note in these results that the first lines indicate that the Server referenced is called mjolnir.chrislea.com at address 192.168.0.1. This device is my local router which happens to be running a resolver. As you might expect, by default, nslookup uses whatever resolving nameservers your Windows installation is itself using by default. If you want to change this, you use the syntax server <IP address of nameserver> to tell nslookup to use something different. Unfortunately, you can’t just use the hostname of the nameserver as far as I know. You can see what the IP address is for a nameserver by just pinging whatever that server’s hostname is from the DOS shell. For example:
C:\Documents and Settings\Chris Lea>ping ns1.mediatemple.net
Pinging ns1.mediatemple.net [184.108.40.206] with 32 bytes of data:
C:\Documents and Settings\Chris Lea>nslookup
Default Server: mjolnir.chrislea.com
> server 220.127.116.11
Default Server: ns1.mediatemple.net
Once you’ve used the server directive to change which nameserver nslookup is querying, it stays with that nameserver until you change it back or you exit the program.
I think this basically covers the same functionality as what I went over with the host command. You should now be able to query arbitrary nameservers for all the sorts of DNS records that we’ve talked about here.
There are just a few last things that I feel I should mention that didn’t fit in so well elsewhere.
The first is that many of the routers commonly in use today run their own, local, resolving nameservers. As I pointed out in my nslookup examples above, the standard “Default Server” read mjolnir.chrislea.com with IP address 192.168.0.1. That is the name and IP address of my local Linksys router. The Tomato firmware I’m using runs a local resolver. If the resolver on the router doesn’t know some answer that is requested, it will then ask my ISP’s resolvers. If they don’t know, they find authoritative nameservers to ask. This is relevant, because sometimes it may be necessary to reboot your local router to flush DNS entries in order to get the most accurate data. I’ve found this to be particularly true with Apple’s Airport routers.
Another thing I’d like to mention is that authoritative nameservers are, generally, “open”. By this I mean that they will answer a DNS request from any IP address that queries them. They basically have to, since they by definition have authoritative information that other random systems may need to know. However, resolving nameservers do not share this concern. Therefore, typically, ISPs will limit connectivity to their resolving nameservers to their own IP space. For example, the resolving nameservers we run at (mt) Media Temple will only respond to queries from the (mt) IP space. This is something that is wise to keep in mind, but that practically will not cause much trouble for you in my experience.
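The “only answer our own IP space” rule boils down to a network membership test. Here is a minimal sketch in Python, assuming a made-up address range. The `ALLOWED_NETWORKS` list and `resolver_will_answer` function are hypothetical and do not reflect how (mt) Media Temple or any other provider actually configures its resolvers.

```python
import ipaddress

# Hypothetical IP space that a provider's resolvers are willing to serve.
ALLOWED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def resolver_will_answer(client_ip, allowed=ALLOWED_NETWORKS):
    """A restricted resolver only answers clients inside its own networks."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in network for network in allowed)


print(resolver_will_answer("198.51.100.42"))  # inside the provider's space
print(resolver_will_answer("203.0.113.9"))    # somewhere else on the Internet
```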
Lastly, as I said at the beginning, DNS is a big topic. This should have given you a pretty solid grounding, but there is certainly a lot more to know about it if you’re trying to be comprehensive. If this applies to you, the standard first place to start reading is the book DNS and BIND from O’Reilly press. It’s commonly just known as “the bugs book” among systems administrators due to its ubiquity. A great deal of the book focuses on administering the BIND software package, which you will hopefully never have to deal with. That said, it’s still the standard reference for learning DNS.
With those final notes, I will conclude. I hope you’ve found this article informative. If you have any questions or (cough) corrections, please comment below or contact me.
Photo credit: flickr.com/photos/seanosh