very strange DNS problem, (whoa , many responses) clarification
Charles Steinkuehler
charles at steinkuehler.net
Thu Apr 24 15:17:09 CDT 2003
Walker (Zachary) Tippit wrote:
> I received many responses to this email, thank you
> all.
>
> I should clarify a couple of things - I am the ISP in
> this situation. The machine failing to do the lookup
> is the primary nameserver for said ISP, hosting 1200+
> domains. My nameserver doesn't think it's doing local
> service for nbbc.edu. Other .edu lookups work
> charmingly.
> So, as far as I know, the /etc/resolv.conf and
> /etc/nsswitch.conf are setup just fine. For kicks, I
> added att.net's primary nameserver to the resolv.conf;
> It didn't change much, except now I get these results:
<snip host output>
> Without the -v switch, it just returns me to a prompt
> with no output. Strange indeed.
>
> I read somewhere that a named dump might be in order,
> so dump I did. When searching through the dump, I
> found this:
>
>
> $ORIGIN edu.
> [....]
>
> nbbc 164772 IN NS NS1.nbbc.edu.
> ;Cr=addtnl[192.55.83.32]
>
> 164772 IN NS NS2.nbbc.edu.
> ;Cr=addtnl [192.55.83.32]
>
> excuse the wordwrap. So it appears that bind knows
> where to look for this domain.
>
> In short, I am pretty sure this is the only domain I
> can't look up, but I am paranoid about there being
> others so I want to know why.. All of the usual
> solutions turned up nil.
I think I may have a hint as to what's going on, and it looks like it's
not your fault.
When doing non-recursive queries with dig (i.e. manually "walking" the DNS
database starting with the top-level servers), I can properly resolve
www.nbbc.edu.
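For anyone who wants to repeat the manual walk, here is a rough sketch of how the individual dig invocations could be assembled. The helper name norecurse_cmd is made up for illustration, and a.root-servers.net is just one possible starting point; every later server would be taken from the NS referral in the previous answer. The only dig feature relied on is the +norecurse option, which clears the recursion-desired bit.

```python
def norecurse_cmd(server, name, rtype="A"):
    """Build a dig command line for one non-recursive query (hypothetical helper)."""
    return ["dig", "+norecurse", "@" + server, name, rtype]

# Start at a root server; each later server comes from the NS referral
# returned by the previous step of the walk.
first_step = norecurse_cmd("a.root-servers.net", "www.nbbc.edu")
print(" ".join(first_step))
```

Running the printed command and then repeating it against the servers named in each referral reproduces the walk by hand, without relying on any resolver's cache.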
If I simply do a recursive query with dig to my local name server,
however, it fails on a "busy" system with lots of users, and works fine
on my local system (I run a name server that only does recursive lookups
for me and my wife). This is a pretty sure sign that the problem is
with incorrect data cached in the DNS tree.
A bit more investigation shows that the *CORRECT* IPs for the nbbc name
servers *DIFFER* from what you've got in your named dump, above. Output
from my working system:
[root@basic root]# host www.nbbc.edu
www.nbbc.edu has address 207.250.169.8
[root@basic root]# dig www.nbbc.edu
; <<>> DiG 9.2.1 <<>> www.nbbc.edu
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19749
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 4
;; QUESTION SECTION:
;www.nbbc.edu. IN A
;; ANSWER SECTION:
www.nbbc.edu. 37741 IN A 207.250.169.8
;; AUTHORITY SECTION:
nbbc.edu. 24660 IN NS ns1.nbbc.edu.
nbbc.edu. 24660 IN NS ns2.nbbc.edu.
;; ADDITIONAL SECTION:
ns1.nbbc.edu. 38029 IN A 207.250.169.111
ns1.nbbc.edu. 38029 IN A 207.250.169.11
ns2.nbbc.edu. 24660 IN A 207.250.169.112
ns2.nbbc.edu. 24660 IN A 207.250.169.12
;; Query time: 136 msec
;; SERVER: 216.171.153.129#53(216.171.153.129)
;; WHEN: Thu Apr 24 10:20:08 2003
;; MSG SIZE rcvd: 146
Note the long TTL in your cache dump, above (164772 seconds, nearly two
days), and the incorrect IP address.
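A small sketch of that comparison, using only values quoted in this thread. It assumes, as the note above does, that the bracketed address in the dump is the address the cache holds for the nbbc name servers; if the brackets mean something else in the dump format, the comparison does not apply. The variable and function names are invented for illustration.

```python
from datetime import timedelta

# Values copied from the cache dump and the dig output in this message.
dump_address = "192.55.83.32"   # bracketed address in the named dump (assumed meaning)
dump_ttl = 164772               # seconds remaining on the cached NS records
current_ns_addrs = {
    "207.250.169.111", "207.250.169.11",   # ns1.nbbc.edu
    "207.250.169.112", "207.250.169.12",   # ns2.nbbc.edu
}

def is_stale(cached, current):
    """True when the cached address is not among the currently published ones."""
    return cached not in current

print(is_stale(dump_address, current_ns_addrs))
print(timedelta(seconds=dump_ttl))  # how long the cached records would still live
```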
I suspect the folks at nbbc.edu moved their nameservers (or renumbered
their whole network), but didn't remember to crank down the TTL settings
in their zone files beforehand. As a result, they will have intermittent
name resolution until all the stale information cached in DNS servers
across the Internet expires.
It also looks like one of their nameservers may be offline, adding to
their problems:
[root@falcon named]# dig www.nbbc.edu @207.250.169.12
; <<>> DiG 9.2.1 <<>> www.nbbc.edu @207.250.169.12
;; global options: printcmd
;; connection timed out; no servers could be reached
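The same reachability check can be sketched without dig, assuming plain UDP on port 53 and a hand-built query packet. The function names and the 5 second timeout are arbitrary choices for illustration; the sketch only reports whether any reply arrives and does not validate the response.

```python
import socket
import struct

def build_query(name, qid=0x1234):
    """Encode a minimal DNS query for an A record, with the RD bit set."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question

def answers(server, name, timeout=5.0):
    """Return True if the server sends any UDP reply within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:  # covers timeouts and unreachable-network errors
            return False

# Example (needs network access), probing each address from the dig output:
# for ip in ("207.250.169.11", "207.250.169.12",
#            "207.250.169.111", "207.250.169.112"):
#     print(ip, answers(ip, "www.nbbc.edu"))
```

Probing all four addresses this way would show whether only 207.250.169.12 is silent or whether more of the listed servers are unreachable.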
--
Charles Steinkuehler
charles at steinkuehler.net