A domain is delegated to four name servers.
If I block network connectivity to one of the authoritative NSes, kresd sometimes
returns and caches SERVFAIL without trying the remaining three.
Impact: a single faulty authoritative NS breaks resolution of a domain served by multiple NSes.
How to reproduce:
knot-resolver 5.7.6-cznic.1~bookworm amd64
-- Network interface configuration
net.listen('127.0.0.1', 53, { kind = 'dns' })
net.listen('::1', 53, { kind = 'dns', freebind = true })
net.listen('127.0.0.1', 8452, { kind = 'webmgmt' })
modules = {
    'hints > iterate', -- Allow loading /etc/hosts or custom root hints
    'stats',           -- Track internal statistics
    'http',
    'cache',
}
cache.size = 100 * MB
Let's use this domain:
trafficmanager.net name server tm1.dns-tm.com.
trafficmanager.net name server tm1.edgedns-tm.info.
trafficmanager.net name server tm2.dns-tm.com.
trafficmanager.net name server tm2.edgedns-tm.info.
Intentionally block communication with tm2.dns-tm.com
(tm2.dns-tm.com has address 150.171.16.240):
# ip route add blackhole 150.171.16.240
or
table inet filter {
    chain input {
        type filter hook input priority filter;
    }
    chain forward {
        type filter hook forward priority filter;
    }
    chain output {
        type filter hook output priority filter;
        ip daddr 150.171.16.240 drop
    }
}
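Before testing kresd, it can help to confirm that only the blocked server is unreachable. The following is a minimal sketch, assuming `dig` is installed and that the server names resolve through the system resolver; the function names `trafficmanager_ns` and `check_ns` are made up for illustration. It queries each of the four name servers directly, once, with a short timeout.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of any tool.

# The four name servers of trafficmanager.net listed in this report.
trafficmanager_ns() {
    printf '%s\n' \
        tm1.dns-tm.com. \
        tm1.edgedns-tm.info. \
        tm2.dns-tm.com. \
        tm2.edgedns-tm.info.
}

# Ask each server directly (no recursion, one try, 2 s timeout) and
# report whether dig got any reply from it.
check_ns() {
    trafficmanager_ns | while read -r ns; do
        if dig +norecurse +tries=1 +time=2 "@$ns" \
               admin.exchange.trafficmanager.net A >/dev/null 2>&1; then
            echo "$ns answered"
        else
            echo "$ns no response"
        fi
    done
}
```

Running `check_ns` with the block in place should report no response only for tm2.dns-tm.com if the setup matches the description above.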
Flush kresd cache:
# echo 'cache.clear()' | socat - UNIX-CONNECT:/run/knot-resolver/control/1
Query admin.exchange.microsoft.com
(repeat the cache flush and the query a few times if needed).
If kresd tries the faulty NS first, it returns and caches SERVFAIL.
All following queries also return SERVFAIL.
Please note that when this happens, the other authoritative NSes
(tm1.dns-tm.com., tm1.edgedns-tm.info., tm2.edgedns-tm.info.)
are reachable but are not tried.
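Because the failure only shows up when the faulty NS happens to be selected first, the flush-and-query step may need several repetitions. A minimal sketch of automating that, using only the commands already shown in this report; the function names `classify_line` and `reproduce` and the default of 20 attempts are illustrative assumptions:

```shell
#!/bin/sh
# Sketch only: function names and the attempt count are illustrative.

# Classify one line of `host` output as SERVFAIL, OK, or OTHER.
classify_line() {
    case "$1" in
        *"2(SERVFAIL)"*) echo SERVFAIL ;;
        *"is an alias for"*|*"has address"*) echo OK ;;
        *) echo OTHER ;;
    esac
}

# Flush the kresd cache, query once, and print the outcome; repeat N times.
# Needs permission to use the kresd control socket.
reproduce() {
    attempts="${1:-20}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        echo 'cache.clear()' \
            | socat - UNIX-CONNECT:/run/knot-resolver/control/1 >/dev/null
        line=$(host admin.exchange.microsoft.com 127.0.0.1 \
            | grep admin | head -n 1)
        echo "attempt $i: $(classify_line "$line")"
        i=$((i + 1))
    done
}
```

Calling `reproduce 20` prints one outcome per attempt. Since the cache is cleared before every query, each attempt is an independent chance for kresd to pick the blocked server first.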
# while sleep 1; do host admin.exchange.microsoft.com 127.0.0.1 | grep admin; done
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
[...]
admin.exchange.microsoft.com is an alias for admin.exchange.trafficmanager.net.
admin.exchange.trafficmanager.net is an alias for eacafdprod-fzakegfrabbub0ab.z01.azurefd.net.
admin.exchange.microsoft.com is an alias for admin.exchange.trafficmanager.net.
admin.exchange.trafficmanager.net is an alias for eacafdprod-fzakegfrabbub0ab.z01.azurefd.net.
admin.exchange.microsoft.com is an alias for admin.exchange.trafficmanager.net.
admin.exchange.trafficmanager.net is an alias for eacafdprod-fzakegfrabbub0ab.z01.azurefd.net.
admin.exchange.microsoft.com is an alias for admin.exchange.trafficmanager.net.
[...]
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
Host admin.exchange.microsoft.com not found: 2(SERVFAIL)
^C
# curl http://localhost:8452/trace/admin.exchange.trafficmanager.net
[iterat][66079.00] 'admin.exchange.trafficmanager.net.' type 'A' new uid was assigned .01, parent uid .00
[cache ][66079.01] => no NSEC* cached for zone: trafficmanager.net.
[cache ][66079.01] => skipping zone: trafficmanager.net., NSEC, hash 0;new TTL -123456789, ret -2
[cache ][66079.01] => skipping zone: trafficmanager.net., NSEC, hash 0;new TTL -123456789, ret -2
[zoncut][66079.01] found cut: trafficmanager.net. (rank 010 return codes: DS 1, DNSKEY 1)
[resolv][66079.01] => NS is provably without DS, going insecure
[select][66079.01] => id: '03790' choosing from addresses: 2 v4 + 0 v6; names to resolve: 2 v4 + 4 v6; force_resolve: 0; NO6: IPv6 is KO
[select][66079.01] => id: '03790' choosing: 'tm2.dns-tm.com.'@'150.171.16.240#00053' with timeout 400 ms zone cut: 'trafficmanager.net.'
[resolv][66079.01] => id: '03790' querying: 'tm2.dns-tm.com.'@'150.171.16.240#00053' zone cut: 'trafficmanager.net.' qname: 'exchange.trafficmanager.net.' qtype: 'NS' proto: 'udp'
[resolv][66079.00] request failed, answering with empty SERVFAIL
[resolv][66079.01] finished in state: 8, queries: 0, mempool: 81952 B
/PM