We are using Windows Server 2012. Sometimes we are unable to resolve .gov websites. When we check with the following commands, they do resolve, so we know that the .gov websites are available.

nslookup www.fda.gov 8.8.8.8
Server:  google-public-dns-a.google.com
Address:  8.8.8.8

Non-authoritative answer:
Name:    a1715.dscb.akamai.net
Addresses:  2607:f7d8:801:100::40ba:2f29
          2607:f7d8:801:100::40ba:2f30
          23.3.96.168
          23.3.96.89
Aliases:  www.fda.gov
          www.fda.gov.edgesuite.net

Using our DNS forwarder:

nslookup www.fda.gov [IP address of DNS forwarder]
Server:  [FQDN of DNS forwarder]
Address:  [IP address of DNS forwarder]

Non-authoritative answer:
Name:    a1715.dscb.akamai.net
Addresses:  2607:f7d8:801:100::40ba:2f30
          2607:f7d8:801:100::40ba:2f29
          23.3.96.168
          23.3.96.89
Aliases:  www.fda.gov
          www.fda.gov.edgesuite.net

Using our DNS Server:

nslookup -d2 www.fda.gov
------------
SendRequest(), len 43
    HEADER:
        opcode = QUERY, id = 1, rcode = NOERROR
        header flags:  query, want recursion
        questions = 1,  answers = 0,  authority records = 0,  additional = 0

    QUESTIONS:
        11.1.168.192.in-addr.arpa, type = PTR, class = IN

------------
------------
Got answer (126 bytes):
    HEADER:
        opcode = QUERY, id = 1, rcode = NXDOMAIN
        header flags:  response, auth. answer, want recursion, recursion avail
        questions = 1,  answers = 0,  authority records = 1,  additional = 0

    QUESTIONS:
        11.1.168.192.in-addr.arpa, type = PTR, class = IN
    AUTHORITY RECORDS:
    ->  1.168.192.in-addr.arpa
        type = SOA, class = IN, dlen = 49
        ttl = 3600 (1 hour)
        primary name server = iss3.iss.local
        responsible mail addr = hostmaster.iss.local
        serial  = 163
        refresh = 900 (15 mins)
        retry   = 600 (10 mins)
        expire  = 86400 (1 day)
        default TTL = 3600 (1 hour)

------------
Server:  UnKnown
Address:  [server IP]

------------
SendRequest(), len 29
    HEADER:
        opcode = QUERY, id = 2, rcode = NOERROR
        header flags:  query, want recursion
        questions = 1,  answers = 0,  authority records = 0,  additional = 0

    QUESTIONS:
        www.fda.gov, type = A, class = IN

------------
DNS request timed out.
    timeout was 2 seconds.
timeout (2 secs)
SendRequest failed
------------
SendRequest(), len 29
    HEADER:
        opcode = QUERY, id = 3, rcode = NOERROR
        header flags:  query, want recursion
        questions = 1,  answers = 0,  authority records = 0,  additional = 0

    QUESTIONS:
        www.fda.gov, type = AAAA, class = IN

------------
DNS request timed out.
    timeout was 2 seconds.
timeout (2 secs)
SendRequest failed
*** Request to UnKnown timed-out

When I use `set vc` in nslookup to force the query over TCP:

Server:  UnKnown
Address:  192.168.1.11

*** UnKnown can't find www.fda.gov: Server failed

We did have IPv6 disabled, so I re-enabled it, but the problem persists. We would appreciate any advice or troubleshooting steps to resolve this issue.

We are in the United States. Here is a ping from a time when we could reach the FDA website (as mentioned, the problem is intermittent):

Pinging a1715.dscb.akamai.net [23.3.96.168] with 32 bytes of data:
Reply from 23.3.96.168: bytes=32 time=97ms TTL=57
Reply from 23.3.96.168: bytes=32 time=14ms TTL=57
Reply from 23.3.96.168: bytes=32 time=36ms TTL=57
Reply from 23.3.96.168: bytes=32 time=20ms TTL=57

Ping statistics for 23.3.96.168:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 14ms, Maximum = 97ms, Average = 41ms

Here is a ping when it is not available:

ping www.fda.gov
Ping request could not find host www.fda.gov. Please check the name and try again.

ping www.fda.gov.
Ping request could not find host www.fda.gov.. Please check the name and try again.

Others have had similar problems. We tried changing MaxCacheTTL to 30 minutes (1800 seconds), as that resolved the problem in that thread, but the problem persists for us.

We also just tried changing MaxCacheTTL to 0. That did not work either. However, we discovered that we cannot access www.paypal.com at the same times that we cannot access these .gov websites, and when we are able to access www.fda.gov, we are also able to access www.paypal.com. That indicates to me that TTL cannot be the cause, since TTL applies on a per-record basis. The fact that adjusting MaxCacheTTL the first time did not help should have been evidence enough.

We enabled detailed DNS debug logging and captured the queries for www.fda.gov. The results are fascinating, but we don't know what to do with them. It appears that the DNS server looks for the name as a subdomain of our domain: www.fda.gov.[domain].local.

3/9/2017 11:33:10 AM 448C PACKET  000000010655E8A0 UDP Rcv [server IP]    0002   Q [0001   D   NOERROR] A      (3)www(3)fda(3)gov(3)[domain](5)local(0)
UDP question info at 000000010655E8A0
  Socket = 492
  Remote addr [server IP], port 60700
  Time Query=2151068, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x0027 (39)
  Message:
    XID       0x0002
    Flags     0x0100
      QR        0 (QUESTION)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        0
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(3)[domain](5)local(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty

3/9/2017 11:33:10 AM 448C PACKET  000000010655E8A0 UDP Snd [server IP]    0002 R Q [8385 A DR NXDOMAIN] A      (3)www(3)fda(3)gov(3)[domain](5)local(0)
UDP response info at 000000010655E8A0
  Socket = 492
  Remote addr [server IP], port 60700
  Time Query=2151068, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x0064 (100)
  Message:
    XID       0x0002
    Flags     0x8583
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        1
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     3 (NXDOMAIN)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   1
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(3)[domain](5)local(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
    Offset = 0x0027, RR count = 0
    Name      "(3)[domain](5)local(0)"
      TYPE   SOA  (6)
      CLASS  1
      TTL    3600
      DLEN   40
      DATA   
        PrimaryServer: (4)servername[C027](3)[domain](5)local(0)
        Administrator: (10)hostmaster[C027](3)[domain](5)local(0)
        SerialNo     = 2735
        Refresh      = 900
        Retry        = 600
        Expire       = 86400
        MinimumTTL   = 3600
    ADDITIONAL SECTION:
      empty
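
The suffixed query above is what you would expect when a client appends its DNS suffix to a name that has no trailing dot. As a rough illustration only (not how any particular resolver is implemented), the Python sketch below lists the names such a client might try. `candidate_names` is a hypothetical helper, `example.local` stands in for the redacted `[domain].local`, and the ordering (suffixed name first, as in the log) is an assumption.

```python
def candidate_names(name, suffixes):
    """Return the query names a stub resolver might try, in order."""
    # A trailing dot marks the name as fully qualified: nothing is appended.
    if name.endswith("."):
        return [name]
    # Assumption: each suffix is tried first, then the name exactly as typed.
    tried = [f"{name}.{suffix.strip('.')}." for suffix in suffixes]
    tried.append(name + ".")
    return tried


# "example.local" is a hypothetical stand-in for the redacted [domain].local.
print(candidate_names("www.fda.gov", ["example.local"]))
print(candidate_names("www.fda.gov.", ["example.local"]))
```

Under that assumption, the NXDOMAIN for the suffixed name is harmless: it only says that no such host exists in the local zone, and the lookup for the real name follows.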

When it is working:

3/9/2017 11:33:10 AM 448C PACKET  000000010672E9F0 UDP Snd [server IP]    0004 R Q [8081   DR  NOERROR] A      (3)www(3)fda(3)gov(0)
UDP response info at 000000010672E9F0
  Socket = 492
  Remote addr [server IP], port 60702
  Time Query=2151068, Queued=2151068, Expire=2151071
  Buf length = 0x0200 (512)
  Msg length = 0x0077 (119)
  Message:
    XID       0x0004
    Flags     0x8180
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    3
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
    Offset = 0x001d, RR count = 0
    Name      "[C00C](3)www(3)fda(3)gov(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    128
      DLEN   25
      DATA   (3)www(3)fda(3)gov(7)edgekey(3)net(0)
    Offset = 0x0042, RR count = 1
    Name      "[C029](3)www(3)fda(3)gov(7)edgekey(3)net(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    3992
      DLEN   25
      DATA   (6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)
    Offset = 0x0067, RR count = 2
    Name      "[C04E](6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    20
      DLEN   4
      DATA   184.31.201.196
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty

3/9/2017 11:33:32 AM 4988 PACKET  00000001050E88F0 UDP Rcv [server IP]   9658   Q [0001   D   NOERROR] A      (3)www(3)fda(3)gov(0)
UDP question info at 00000001050E88F0
  Socket = 492
  Remote addr [server IP], port 62657
  Time Query=2151089, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x001d (29)
  Message:
    XID       0x9658
    Flags     0x0100
      QR        0 (QUESTION)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        0
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty

3/9/2017 11:40:36 AM 0F98 PACKET  0000000102B32600 UDP Snd [server IP]    23f2 R Q [8081   DR  NOERROR] A      (3)www(3)fda(3)gov(0)
UDP response info at 0000000102B32600
  Socket = 492
  Remote addr [server IP], port 55901
  Time Query=2151514, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x0184 (388)
  Message:
    XID       0x23f2
    Flags     0x8180
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     0 (NOERROR)
    QCOUNT    1
    ACOUNT    3
    NSCOUNT   9
    ARCOUNT   5
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
    Offset = 0x001d, RR count = 0
    Name      "[C00C](3)www(3)fda(3)gov(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    300
      DLEN   25
      DATA   (3)www(3)fda(3)gov(7)edgekey(3)net(0)
    Offset = 0x0042, RR count = 1
    Name      "[C029](3)www(3)fda(3)gov(7)edgekey(3)net(0)"
      TYPE   CNAME  (5)
      CLASS  1
      TTL    15195
      DLEN   25
      DATA   (6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)
    Offset = 0x0067, RR count = 2
    Name      "[C04E](6)e11872(4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    20
      DLEN   4
      DATA   23.194.99.134
    AUTHORITY SECTION:
    Offset = 0x0077, RR count = 0
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n6dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x008c, RR count = 1
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n7dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00a1, RR count = 2
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)a0dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00b6, RR count = 3
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n0dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00cb, RR count = 4
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n1dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00e0, RR count = 5
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n2dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x00f5, RR count = 6
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n3dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x010a, RR count = 7
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n4dscb[C05A](10)akamaiedge[C03D](3)net(0)
    Offset = 0x011f, RR count = 8
    Name      "[C055](4)dscb(10)akamaiedge[C03D](3)net(0)"
      TYPE   NS  (2)
      CLASS  1
      TTL    1566
      DLEN   9
      DATA   (6)n5dscb[C05A](10)akamaiedge[C03D](3)net(0)
    ADDITIONAL SECTION:
    Offset = 0x0134, RR count = 0
    Name      "[C0D7](6)n1dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    807
      DLEN   4
      DATA   69.22.155.207
    Offset = 0x0144, RR count = 1
    Name      "[C0EC](6)n2dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    3922
      DLEN   4
      DATA   69.22.155.209
    Offset = 0x0154, RR count = 2
    Name      "[C101](6)n3dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    1418
      DLEN   4
      DATA   24.143.193.180
    Offset = 0x0164, RR count = 3
    Name      "[C083](6)n6dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    3973
      DLEN   4
      DATA   23.220.96.109
    Offset = 0x0174, RR count = 4
    Name      "[C098](6)n7dscb[C05A](10)akamaiedge[C03D](3)net(0)"
      TYPE   A  (1)
      CLASS  1
      TTL    279
      DLEN   4
      DATA   23.220.96.86

When it is not working:

3/9/2017 11:50:47 AM 2988 PACKET  00000001058C3ED0 UDP Snd [server IP]    44af R Q [8281   DR SERVFAIL] A      (3)www(3)fda(3)gov(0)
UDP response info at 00000001058C3ED0
  Socket = 492
  Remote addr [server IP], port 54261
  Time Query=2152117, Queued=2152121, Expire=2152124
  Buf length = 0x0200 (512)
  Msg length = 0x001d (29)
  Message:
    XID       0x44af
    Flags     0x8182
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        0
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     2 (SERVFAIL)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   0
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(3)www(3)fda(3)gov(0)"
      QTYPE   A (1)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
      empty
    ADDITIONAL SECTION:
      empty
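
For anyone reading the `Flags` values in these log entries: they are the 16-bit flags word of the DNS header, and the fields printed under each one can be recovered with a few shifts and masks. A minimal Python sketch, assuming the standard header layout (QR, OPCODE, AA, TC, RD, RA, Z, AD, CD, RCODE); `decode_flags` and `RCODES` are names made up for this example.

```python
# Response codes that appear in the logs above, plus a few common neighbours.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def decode_flags(flags):
    """Split the 16-bit DNS header flags word into its named fields."""
    rcode = flags & 0xF
    return {
        "QR": (flags >> 15) & 1,        # 0 = question, 1 = response
        "OPCODE": (flags >> 11) & 0xF,  # 0 = standard query
        "AA": (flags >> 10) & 1,        # authoritative answer
        "TC": (flags >> 9) & 1,         # truncated
        "RD": (flags >> 8) & 1,         # recursion desired
        "RA": (flags >> 7) & 1,         # recursion available
        "Z": (flags >> 6) & 1,
        "AD": (flags >> 5) & 1,
        "CD": (flags >> 4) & 1,
        "RCODE": RCODES.get(rcode, str(rcode)),
    }


# The three response values seen in the log excerpts.
for value in (0x8583, 0x8180, 0x8182):
    print(hex(value), decode_flags(value))
```

Decoded this way, 0x8583 is an authoritative NXDOMAIN from the local zone, 0x8180 is a normal recursive answer, and 0x8182 is the SERVFAIL returned when recursion did not produce an answer.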

I discovered RAS had received an IP address that was then being reported as a DNS Name Server. I have modified those settings to remove that IP address, but the problem remains.

Below is a snapshot of the DNS properties for the Forward Zone of [domain].local. (Screenshot: Forward Zone of [domain].local)

Below is a snapshot of the DNS server properties. (Screenshot: DNS server properties)

Bobort
  • I can't see a web-site on `fda.gov` (same IP), but `www.fda.gov` (109.159.158.210) works fine. I guess there is no web server at the former address, but there may be other services on it. – AFH Feb 13 '17 at 16:30
  • Okay, I updated the question to show `www.fda.gov` instead of `fda.gov`. We still have the problem, though. – Bobort Feb 13 '17 at 16:33
  • Without seeing the problem, I can't go that much further. I have no idea why I get a completely different IP, though I don't use Google DNS. The IP I quoted came from `ping`: oddly, `nslookup` gives different addresses again, 88.221.50.177 and 88.221.50.169 (`ping fda.gov` returned the same address as you quoted, 63.80.4.10). What does your `ping` show? Do my numeric addresses work for you? You don't declare your location: you're not subject to geographic restrictions, by any chance? – AFH Feb 13 '17 at 16:46
  • When it's not working, what happens if you `ping www.fda.gov.` (don't exclude the intentional trailing backslash). – I say Reinstate Monica Mar 10 '17 at 16:25
  • Also, when it's not working, please start NSLookup with the `-d2` parameter (as in `nslookup -d2`), then post the output of a lookup attempt for `www.fda.gov`. This will produce similar output to your already-provided debugging info, but in a more easily interpreted form. – I say Reinstate Monica Mar 10 '17 at 16:30
  • If I understand right, the problem only occurs when using your own DNS server, so some more information on it and its DNS would be useful. Does the problem happen on all clients or only on the server? Note that under Windows nslookup is preferable over ping - see [my answer here](https://superuser.com/a/508057/8672) - so you could try stopping the DNS Client service to see if it changes anything. Also, AFAIK `www.fda.gov` isn't fully qualified - to be fully qualified needs `www.fda.gov.` (trailing dot) or nslookup will append primary and/or connection specific DNS suffixes to the query. – harrymc Mar 10 '17 at 21:23
  • @Twisty, ping returns the following message when I ping: Ping request could not find host www.fda.gov. Please check the name and try again. – Bobort Mar 10 '17 at 22:22
  • @Bobort Including when you include the trailing dot, correct? (I mistakenly called it a backslash in my previous comment.) – I say Reinstate Monica Mar 10 '17 at 22:43
  • @Twisty, that was confusing me. I have updated the question, but the answer remains the same. – Bobort Mar 10 '17 at 22:57
  • What DNS server(s) is your DNS server resolving against? When the problem happens, try resolving directly against them. (I'm assuming your DNS server isn't using 8.8.8.8 as you've already demonstrated resolving with that server when the problem is occurring.) – I say Reinstate Monica Mar 10 '17 at 23:35
  • See if this is enabled in your DNS server : [Secure the Server Cache Against Names Pollution](https://technet.microsoft.com/en-us/library/cc772349(v=ws.11).aspx). If enabling it helps - verify the DNS servers used by your own server (is this your ISP?). If you have Active Directory see [Allow Only Secure Dynamic Updates](https://technet.microsoft.com/en-us/library/ee649287(v=ws.10).aspx). – harrymc Mar 11 '17 at 10:06
  • @Twisty, I have done that. It is in the question. The forwarders resolve the request just fine. – Bobort Mar 13 '17 at 13:46
  • @harrymc, Secure the Server Cache Against Names Pollution is already enabled. Also, only allow Secure dynamic updates is already enabled. – Bobort Mar 13 '17 at 13:46
  • Perhaps restating the obvious, this is a problem with the local DNS server. Would you be willing to post screenshots of the following tabs of the DNS server's Properties dialog: Interfaces, Forwarders, Advanced, and Trust Anchors? Posting links to the images is fine instead of inserting them inline. – I say Reinstate Monica Mar 13 '17 at 17:06
  • Please let us know what is the problem IP. If it relates in any way to your external IP address see [this article](http://techgenix.com/you_need_to_create_a_split_dns/) about how internal solvers can be confused when the zone DNS is not set up correctly. But in any case, please communicate this problem IP. Also the largest part of your external IP address that you would feel safe in communicating here. – harrymc Mar 13 '17 at 20:05
  • @Twisty, I have added snapshots for the DNS server properties and the Forward Lookup zone for [domain].local. I blacked out the Forwarder IP addresses, but I assure you that they work just fine. – Bobort Mar 14 '17 at 13:04
  • Please ignore my above question about the rogue DNS Name Server you found - I wrongly assumed that it has returned. I think there's a non-evident problem with your DNS server, maybe one of its forwarders is poisoning the cache, but only you can check it. – harrymc Mar 14 '17 at 14:21
  • @harrymc, I did figure out what the rogue server was. It was actually from RRAS, or our VPN setup. The VPN was using DHCP without static addresses, and it just happened to pull one from that list. I managed to modify the settings, but now the VPN isn't working. That's a problem for a different time, though. – Bobort Mar 14 '17 at 20:56
  • Has the fact that the VPN isn't working changed anything as regarding the problem? – harrymc Mar 14 '17 at 22:11
  • @harrymc, I'm afraid not. The problem remains unresolved. It seems that the VPN thing is just a distraction for us. – Bobort Mar 15 '17 at 13:43
  • I have extracted the good info; hopefully this is easier for you guys to troubleshoot: https://i.gyazo.com/7ab3dc39030c80fe9f64e8bf48a63959.png – DeerSpotter Mar 15 '17 at 15:31
  • What about for troubleshooting disabling use of your forwarders and only using root hints? – I say Reinstate Monica Mar 15 '17 at 16:19
  • The SERVFAIL error means that the fully qualified domain name does exist, that the root name servers have information on it, but that the authoritative name servers are not answering queries for it. This can come from a caching server if it didn't get an answer from any of the servers that the domain is delegated to. Or from a server noticing that the domain is delegated to itself, but it doesn't have the zone loaded and the answer isn't in its cache. I think there's a problem with your zone definition records that may clear up after some timeout. Queries not thru your server do succeed. – harrymc Mar 15 '17 at 18:29
  • You need a tool to check your zone records. I don't know of such, but a quick google found [DNSDataView](http://www.nirsoft.net/utils/dns_records_viewer.html) and [DNS Health Report](https://gallery.technet.microsoft.com/scriptcenter/DNS-Health-Report-80fa9675), but I can't vouch for them. Adding the list of these records to your post may help. – harrymc Mar 15 '17 at 18:31
  • @Twisty, I disabled forwarders when I noticed your comment. So for about two days, we've been using only root hints. I have not noticed any problem since then. This leads to a question, though. When we had the problems, using nslookup [forwarder IP] resolved the domain perfectly. Does this "workaround" suggest that the problem is with the forwarders? And, if so, why did the nslookup command resolve perfectly when using the forwarder IP addresses? – Bobort Mar 17 '17 at 14:27
  • From where had you run the `nslookup [forwarder IP]` command? Was it from the DNS server, or from a client machine? – I say Reinstate Monica Mar 17 '17 at 14:29
  • @Twisty, from both, actually. I tried it from the DNS server and a client with the same results. – Bobort Mar 17 '17 at 14:49
  • Then either 1) the problem is with the forwarders [you could try using other forwarders to confirm/solve this], or 2) there's a networking issue between your network and the forwarders, or the outside world in general...though the fact the root hints seem to be working seems to cast doubt on this theory. – I say Reinstate Monica Mar 17 '17 at 14:55
  • Root hint requests always terminate with an authoritative answer, unlike forwarders which just answer whatever they have. It might be a forwarder that has the problem and your server is just collateral damage, or some mismatch of one forwarder with your zone records. If you have several forwarders, trying them one-by-one might locate it. – harrymc Mar 17 '17 at 16:29

2 Answers

Based on the fact that disabling your DNS forwarders and using only the root hint servers has eliminated the problem, it is reasonable to believe the problem is related to your forwarders. Your extensive search for a misconfiguration of your DNS server has turned up nothing. A clear explanation of why you're experiencing this problem doesn't seem forthcoming, so you may need to go with what works.

That said, you have a couple of options:

  1. Continue using the root hint servers exclusively. While using DNS Forwarders can potentially provide faster lookup times (e.g. due to being "closer" to your network and having cached records of popularly accessed sites), there's nothing wrong with using the root hints.
  2. Try different forwarders. You could use Google's DNS servers (8.8.8.8 and 8.8.4.4), Verisign's Public DNS servers (64.6.64.6 and 64.6.65.6), or pick one from a list.
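
If you do try other forwarders, it can help to query each candidate directly from the DNS server and compare the results. Here is a minimal Python sketch of that idea, using only the standard library. `probe`, `build_query`, and `rcode_name` are hypothetical helper names, and the sketch sends a plain UDP query without EDNS0, so it is an approximation of what your server actually sends, not a replacement for `nslookup`.

```python
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, xid=0x1234):
    """Build a minimal A-record query with recursion desired."""
    header = struct.pack("!HHHHHH", xid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(bytes([len(part)]) + part.encode("ascii")
                      for part in name.rstrip(".").split("."))
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def rcode_name(response):
    """Read the response code from the low 4 bits of the flags word."""
    flags = struct.unpack("!H", response[2:4])[0]
    return RCODES.get(flags & 0xF, str(flags & 0xF))


def probe(server, name="www.fda.gov", timeout=2.0):
    """Send one UDP query to `server` and report the result as a string."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            response, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    return rcode_name(response)
```

For example, `for ip in ("8.8.8.8", "8.8.4.4"): print(ip, probe(ip))` would print one result per forwarder. A forwarder that answers here but fails when your DNS server uses it points at something between the two, rather than at the forwarder itself.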
I say Reinstate Monica
  • Thank you so much for your assistance, @Twisty. For now, we will stick with the root hints exclusively. – Bobort Mar 17 '17 at 21:55
  • I greatly appreciate your involvement in this situation, but the problem appears to have returned out of thin air, so I must remove your answer as accepted. We have not made any changes since moving to root hints exclusively. We are unable to resolve www.fda.gov intermittently. I am lost on what is going on here. – Bobort Apr 20 '17 at 22:05
  • Quite disappointing the problem has returned! Looking back over the previous comments on this question I'm now suspicious that the problem is with the network between your DNS servers and the outside world. – I say Reinstate Monica Apr 21 '17 at 02:16
  • If you're interested, I've created a [chat room](http://chat.stackexchange.com/rooms/57448/chat-for-cannot-resolve-websites) where we can look into this further. *Anyone else reading this and interested in contributing is certainly welcome to do so.* – I say Reinstate Monica Apr 21 '17 at 02:34

I made a change a few months ago and wanted to confirm that it worked before answering my own question. It turns out that the problem was not the DNS server; it was the firewall. We use a Cisco ASA 5500, and it did not have EDNS0 (Extension Mechanisms for DNS) enabled. We used the workaround described in the article to resolve the problem. Basically, the idea is to raise the firewall's “Maximum Packet Length” for DNS from 512 to 4096 bytes. Apparently, the .gov servers are using DNS extensions. We haven't had issues since, and I intend to change the DNS settings back to the IP addresses of our ISP's DNS servers in the near future.
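
To make the size limit concrete, here is a minimal Python sketch (an illustration, not taken from the article) of what EDNS0 adds to a query: an OPT pseudo-record whose CLASS field advertises the largest UDP response the sender will accept. `encode_name` and `build_query` are hypothetical helper names, and 4096 is simply the value mentioned above.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, udp_payload=None, xid=0x1234):
    """Build an A-record query; add an EDNS0 OPT record if udp_payload is set."""
    arcount = 1 if udp_payload else 0
    # Header: ID, flags (RD set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    packet = struct.pack("!HHHHHH", xid, 0x0100, 1, 0, 0, arcount)
    # Question: name, QTYPE A (1), QCLASS IN (1).
    packet += encode_name(name) + struct.pack("!HH", 1, 1)
    if udp_payload:
        # OPT pseudo-record: root name, TYPE 41, CLASS = advertised UDP size,
        # TTL = 0 (extended RCODE, version, flags), RDLENGTH = 0.
        packet += b"\x00" + struct.pack("!HHIH", 41, udp_payload, 0, 0)
    return packet


plain = build_query("www.fda.gov")
edns = build_query("www.fda.gov", udp_payload=4096)
print(len(plain), len(edns))
```

The plain query is 29 bytes, matching the `len 29` requests in the nslookup output in the question. The OPT record adds 11 bytes and tells the remote server that it may send a UDP reply larger than 512 bytes, which a firewall that still enforces the 512-byte limit may then drop.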

Bobort