Hi,
until now I had 3 secondaries running and a hidden primary. This ran perfectly well.
Now I'd like to add some fallback capability to deal with a potentially longer downtime of my hidden primary. I therefore added two more hidden primaries, so that each of the 3 hosts now has a hidden primary that can serve every secondary on all hosts. But: only one should be active! Zone and database data will be rsynced frequently to both inactive primaries. In case of a downtime, I will have to start one of the others to continue.
Based on my reading of https://www.knot-dns.cz/docs/3.5/html/configuration.html#secondary-slave-zo… I was under the naive impression that a configuration like ...
remote:
  - id: primaryMWN    # MWN hidden primary (running)
    address: 10.0.1.203@5333
  - id: primaryKBN    # KBN hidden primary (not running, standby)
    address: 10.0.2.203@5333
  - id: primaryEDN    # EDN hidden primary (not running, standby)
    address: 10.0.3.203@5333

template:
  - id: default
    master: [primaryMWN, primaryKBN, primaryEDN]   # queried in that order
… would work, because of:
"Note that the master option accepts a list of remotes, which are queried for a zone refresh sequentially in the specified order. When the server receives a zone change notification from a listed remote, only that remote is used for a subsequent zone transfer."
But I get error messages like:
edn.ellael.lan (ns3) knot[29856]: warning: [ellael.org.] refresh, remote primaryKBN not usable
edn.ellael.lan (ns3) knot[29856]: info: [ellael.org.] refresh, remote primaryEDN, address 10.0.3.203@5333, failed (connection reset)
edn.ellael.lan (ns3) knot[29856]: warning: [ellael.org.] refresh, remote primaryEDN not usable
edn.ellael.lan (ns3) knot[29856]: error: [ellael.org.] refresh, failed (no usable master), next retry at 2026-04-27T19:03:03+0200, expires in 1119353 seconds
edn.ellael.lan (ns3) knot[29856]: error: [ellael.org.] zone event 'refresh' failed (no usable master)
If I do use "master: primaryMWN" only, everything runs as expected.
I must have misunderstood something ...
OK, so I will have to modify all remaining secondaries' knot.conf files if disaster strikes and another primary has to take over.
BTW: I wanted to avoid a multi-primary setup as described in https://www.knot-dns.cz/docs/3.5/singlehtml/#multi-primary because I have the feeling that it is overkill for hosting only 5 domains ;-)
Are there other ways to achieve my goal? ;-)
Thanks and regards,
Michael
Hi,
Fastmail has been running Knot for a few years now. Thank you for such excellent software!
I'm new to this list, and new to the Knot codebase, but I'm an experienced C developer and have been working on Cyrus IMAPd (a mostly C codebase) for many years.
We have hundreds of thousands of domains, and currently they all have the same set of service IPs compiled into them. This has generally been fine - setting up a new server takes an hour or so to build all the domains, but we just wait until it's done then bring it into rotation.
Our current challenge -- we want to be able to transfer everything to a new IP range quickly for datacenter failover. Rebuilding every zone is too expensive for this. I looked at a few different issues and (along with Claude) figured that it wasn't much work to extend the ALIAS type to follow the pointer to another zone inside the same server and return the records from that. I have an initial pass at:
https://github.com/fastmail/knot-dns/tree/local-alias-synth
For now I've kept it as separate commits showing the evolution of the idea as I've tested it more and thought through how I want it to interact (basically any ALIAS gets substituted with the contents of the name it points to, so you can mix and match them all sorts of ways).
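To make that concrete, here is a hypothetical sketch (all names and addresses invented, and assuming the patched behavior described above):

; shared zone holding the current service IPs
ips.internal.example.   3600 IN A      192.0.2.10
ips.internal.example.   3600 IN AAAA   2001:db8::10

; a customer zone aliasing into it; queries for customer1.example
; would be answered with whatever ips.internal.example. currently
; holds, so a datacenter failover means editing only the shared zone
customer1.example.      3600 IN ALIAS  ips.internal.example.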
I'm very happy to engage on testing and modifying this code to match what the upstream project wants; or revisit the approach if this doesn't match your vision. I just need something that has these properties, and this seemed a good way to get there.
Thanks,
Bron.
Hello,
In this other use-case, described in the thread "IXFR commit time scaling", there was a reply referring to
https://www.knot-dns.cz/docs/3.5/singlehtml/#signing-threads and
https://www.knot-dns.cz/docs/3.5/singlehtml/#adjust-threads
Which made me wonder...
a] you can have an external networked HSM, which sounds promising for speeding up signing a lot...
b] nowadays there are even 128-core processors, and even multiple CPU sockets, which sounds like an immense boost for parallel processing...
c] you could combine those...
Empirical data would probably be hard to come by, but hypothetically/estimated:
what would be wise/pointless/smart/insane for speeding up the signing of large zones?
I'd expect that RAM speed is a major factor also.
What would be an ideal setup today?
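For reference, the two knobs those links describe would look something like this in knot.conf (a minimal sketch; the thread counts are illustrative, not recommendations):

policy:
  - id: fast-signing
    signing-threads: 16    # parallelizes the signature computations

template:
  - id: default
    adjust-threads: 16     # parallelizes zone adjusting after (re)signing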
--
With kind regards,
Met vriendelijke groet,
Mit freundlichen Grüß,
Leo Vandewoestijne
<***(a)dns.company>
<www.dns.company>
Hi Knot team,
I'm running Knot as an Auth secondary receiving IXFR from a BIND 9 primary.
To isolate bottlenecks I've stripped the config down as far as I know how.
Here's what I'm using.
zonefile-sync: -1
zonefile-load: none
journal-content: none
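For context, a minimal sketch of how these options sit in a zone stanza (the domain and remote names are placeholders):

zone:
  - domain: example.com
    master: bind9-primary
    zonefile-sync: -1
    zonefile-load: none
    journal-content: none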
There is no DNSSEC signing or downstream IXFR serving happening. Logs confirm that it is genuine IXFR, with no signs of any AXFR fallback.
"semantic-checks" is off, and knotd is linked against jemalloc. I'm really
trying to make this as quick as possible by avoiding the disk.
The pattern:
IXFR processing time scales roughly proportionally with total zone size,
even when the changeset is small, for example, a few hundred RRs out of
several hundred thousand.
There is what appears to be a full zone walk on every IXFR commit in the adjust logic, with single-threaded execution due to parent-before-child ordering requirements, although I'd want your confirmation before reading too much into it.
Questions:
1. With journal-content: none, does IXFR apply trigger a full in-memory
tree walk of the QP-trie, rather than an isolated incremental record-level
update? If so, is that a necessary consequence of running without a journal
to maintain state?
2. For a secondary with no NSEC/NSEC3, no wildcards, and no downstream IXFR serving, could a "lightweight secondary" mode bypass post-apply bookkeeping that might only be targeted at primaries and signers?
3. Could it rewalk only subtrees where adds or removes happen to their
ancestors, rather than the full zone? If NSEC is absent, is the prev
pointer chain actually used at query time, or can it be skipped entirely?
Our use case is secondary-only, with large zones and high-frequency updates. We're hoping there is something on the configuration or roadmap side that might help, and ultimately we're not sure whether we're just bumping up against a realistic constraint.
Thanks for the great software btw, loving it.
Thanks!
Hi all,
I just set up catalog zones for the first time. I'm using a conf file
with my list of zones. After creating a catalog zone and adding member
zones to it, I executed 'knotc reload'. The catalog zone then appeared
in the output of 'zone-status', and member zones were listed with the
catalog zone name. However, 'zone-read <catalog.zone>' showed no PTR
records. I tried 'zone-reload <catalog.zone>', updated serials on the
member zones and such, but the catalog zone remained empty until knotd
was restarted. I saw this behavior on both 3.4.4 and 3.5.2. Is this
the intended behavior? Is there a way to generate the catalog without
restarting knotd?
Thanks in advance,
Bill
Greetings,
I have tried using QUIC for zone transfers and ran into an error on a bigger zone.
The master's log displayed:
2026-01-04T17:32:19+0800 debug: [foo.] ACL, allowed, action transfer,
remote 10.0.0.147@60880 QUIC cert-key
xJKsDkUqpl6orXeTwsrDgDvgZ/PiYxOSVlOkVdn5EOU=
2026-01-04T17:32:19+0800 info: [foo.] IXFR, outgoing, remote
10.0.0.147@60880 QUIC, incomplete history, serial 2026010403, fallback
to AXFR
2026-01-04T17:32:19+0800 debug: [foo.] ACL, allowed, action transfer,
remote 10.0.0.147@60880 QUIC cert-key
xJKsDkUqpl6orXeTwsrDgDvgZ/PiYxOSVlOkVdn5EOU=
2026-01-04T17:32:19+0800 info: [foo.] AXFR, outgoing, remote
10.0.0.147@60880 QUIC, started, serial 2026010404
2026-01-04T17:32:20+0800 info: [foo.] AXFR, outgoing, remote
10.0.0.147@60880 QUIC, buffering finished, 0.87 seconds, 7390 messages,
124493148 bytes
2026-01-04T17:32:20+0800 notice: QUIC, terminated connections, outbuf
limit 1
On the slave side, I got these log entries:
2026-01-04T17:32:18+0800 info: [foo.] zone file loaded, serial 2026010403
2026-01-04T17:32:19+0800 info: [foo.] loaded, serial none -> 2026010403,
92000117 bytes
2026-01-04T17:32:19+0800 info: [foo.] refresh, remote 10.0.0.151@853,
remote serial 2026010404, zone is outdated
2026-01-04T17:32:19+0800 info: server started
(And knotd on the slave goes down without any log message.)
Thanks in advance.
My testing environment is as follows. The zone contains 1,000,000 x ( 2 NS + 2 A ) records, such as:
domain00000000 3600 NS ns1.domain00000000
3600 NS ns2.domain00000000
ns1.domain00000000 3600 A 10.0.0.1
ns2.domain00000000 3600 A 10.0.0.2
...
domain00999999 3600 NS ns1.domain00999999
3600 NS ns2.domain00999999
ns1.domain00999999 3600 A 10.0.0.1
ns2.domain00999999 3600 A 10.0.0.2
If I decrease the record count to 500,000 x ( 2 NS + 2 A ), the zone can be transferred over QUIC successfully.
With traditional TCP and TLS, the zone transfer completes without error, even for larger sizes.
The version on both master and slave is 3.5.2, installed from copr.
The OS on both sides is Rocky 9 x86_64.
Best Regards,
SUN Guonian
Greetings,
If I do not configure a "notify" statement in the zone section, I notice
that Knot DNS still sends NOTIFY messages to all servers in the NS records.
How can I disable NOTIFY messages on a server that is at the end of the zone transfer chain (e.g., a stealth or receive-only secondary)?
Best Regards,
SUN Guonian
Hello Knot DNS users,
Knot DNS has supported TCP Fast Open (when configured) in both the server and client roles for several years.
However, we have not observed any performance or other improvements from this technology so far. Since
removing it would simplify the code, I'm considering dropping support for it. Is there anyone who would
miss TFO in Knot DNS?
For better XFR efficiency between Knots, https://www.knot-dns.cz/docs/latest/singlehtml/index.html#remote-pool-limit
works much better.
Thanks,
Daniel
Greetings,
One thing I'm not sure about is exactly what happens when we run `knotc zone-ksk-submitted`.
Our parent zones don’t support CDS/CDNSKEY, so we manually update DS records and then
run `knotc zone-ksk-submitted`. It seems to me that as soon as we run it, the retirement of the outgoing
KSK starts and it is removed from the DNSKEY RRset. Is that correct?
I'd like to be sure, because as it is, I wait at least the TTL of the DS record before running zone-ksk-submitted;
if I ran it right away and knot removed the key from the DNSKEY RRset immediately, then caching resolvers
that still hold the old DS would fail to validate the zone.
The docs for knotc say:
Use when the zone's KSK rollover is in submission phase. By calling this command the user confirms manually that the parent zone contains DS record for the new KSK in submission phase and the old KSK can be retired.
Reading the docs, I would think I should run zone-ksk-submitted as soon as the new DS record has been
published in the parent, but then knot would need to know to wait for the TTL of the DS record before
removing the key.
Should I wait before running zone-ksk-submitted, or is there a config option I'm missing to tell knot
the DS TTL?
.einar
Hi, I'm having issues with ACLs, DNS updates, and multiple DNS servers.
I use DNS-01 for Let's Encrypt TLS certificates and AXFR between the servers, but for some reason acme.sh is not working for a new server, failing with "NOTAUTH".
(I give acme.sh `export KNOT_SERVER='souseiseki.middlendian.com'` so that it uses the master; that works fine on the master and another server, but for some reason on the new one there is an ACL-related failure?):
[Thu 04 Dec 2025 21:53:49 AEDT] Adding _acme-challenge.middlendian.com. 60 TXT "<snip>"
;; ->>HEADER<<- opcode: UPDATE; status: NOTAUTH; id: 42945
;; Flags: qr; ZONE: 1; PREREQ: 0; UPDATE: 0; ADDITIONAL: 1
;; ZONE SECTION:
;; middlendian.com. IN SOA
;; ADDITIONAL DATA:
;; TSIG PSEUDOSECTION:
acme_key. 0 ANY TSIG hmac-sha512. 1764845629 300 64 <snip> 42945 NOERROR 0
;; ERROR: update failed with error 'NOTAUTH'
knsupdate with the same key works against the master:
knsupdate
knsupdate> server souseiseki.middlendian.com
knsupdate> key hmac-sha512:acme_key:<snip>
knsupdate> zone middlendian.com.
knsupdate> add test.middlendian.com. 300 TXT test
knsupdate> send
knsupdate> answer
But the ACL seems to have problems, as DNS updates fail if attempted via any secondary server:
knsupdate
knsupdate> server hinaichigo.middlendian.com
knsupdate> key hmac-sha512:acme_key:<snip>
knsupdate> zone middlendian.com.
knsupdate> del test.middlendian.com TXT
knsupdate> send
;; ->>HEADER<<- opcode: UPDATE; status: NOTAUTH; id: 14970
;; Flags: qr; ZONE: 1; PREREQ: 0; UPDATE: 0; ADDITIONAL: 1
;; ZONE SECTION:
;; middlendian.com. IN SOA
;; ADDITIONAL DATA:
;; TSIG PSEUDOSECTION:
acme_key. 0 ANY TSIG hmac-sha512. 1764844872 300 64 <snip> 14970 NOERROR 0
;; ERROR: update failed with error 'NOTAUTH'
Knot on souseiseki logs an error to syslog: "ACL, denied, action update, remote 125.63.60.124@38966 TCP",
but that isn't helpful debug output, as it does not say why the ACL was denied.
IP-address matching could be a problem, but reviewing the documentation, it seems to state that IP addresses are not considered in ACLs unless listed in the ACL.
Does anyone know what the issue is, and otherwise how do I debug it?
All the servers have the same ACL key set;
/etc/knot/acme.key;
key:
  - id: acme_key
    algorithm: hmac-sha512
    secret: <snip>
souseiseki (master);
remote:
  - id: suigintou
    address: [ 180.150.27.133@53, 2403:5806:e8d0::dead:beef:cafe@53 ]
  - id: hinaichigo
    address: 125.63.60.124@53

include: "/etc/knot/acme.key"

acl:
  - id: acme_acl
    key: acme_key
    action: update

zone:
  - domain: middlendian.com
    dnssec-signing: on
    acl: acme_acl
    notify: [ suigintou, hinaichigo ]
hinaichigo (secondary);
remote:
  - id: master
    address: 144.6.197.157@53

acl:
  - id: acme_acl
    key: acme_key
    action: [update, notify]

zone:
  - domain: middlendian.com
    master: master
    acl: acme_acl
--
Kind Regards, DiffieHellman
Hi,
In our setup, we have one active signer and one backup signer. Both use
softhsm, but only the active signer does automatic key management.
There is an hourly cron job that syncs keys from active to backup signer.
It runs knotc zone-backup on the active signer, only backing up the kaspdb.
It then syncs the files over to the secondary and runs knotc zone-restore.
This has been running for a few years now without problems.
These last two weeks we’ve been performing algorithm rollovers for
some of our zones, and after we run `knotc zone-ksk-submitted nic.is`
we start seeing these errors when the zone-restore is run on the backup:
error: [nic.is.] zone event 'backup/restore' failed (already exists)
warning: [nic.is.] zone restore failed (already exists)
warning: [nic.is.] restore, key copy failed (already exists)
I searched the knot dns source code, but couldn't find where these
errors are output. Like I said, we’ve been running like this for a few
years, doing regular ZSK rollovers, and a few KSK rollovers, without
problems. There’s something about the algorithm rollover that
causes this problem with our setup.
I assume I can just delete the keys on the secondary and sync again,
but I want to understand what causes these errors so we can avoid them,
or at least document them in our process.
.einar
debian 12
knot 5
so i believe i have a server with antique data in cache. the net of a
million lies says use `kresctl clear foo` but i can not find kresctl.
clue bat please
randy
Hello,
I upgraded my signing server to Debian 13, but I have a problem with my HSM:
Oct 15 21:09:18 arrakeen knotd[29552]: error: [durel.org.] zone event 'load' failed (PKCS #11 token not available)
Oct 15 21:09:18 arrakeen knotd[29552]: error: [geekwu.org.] zone event 'load' failed (PKCS #11 token not available)
keymgr gives me the same error:
# keymgr geekwu.org list
error: failed to initialize KASP (PKCS #11 token not available)
despite hsmwiz being able to access the key:
# hsmwiz identify
Using reader with a card: Nitrokey Nitrokey HSM (DENK01067960000 ) 00 00
Version : 3.4
Config options :
User PIN reset with SO-PIN enabled
SO-PIN tries left : 15
User PIN tries left : 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Default SO-PIN: 3537363231383830 Default PIN: 648219
Now executing: pkcs15-tool --dump
Using reader with a card: Nitrokey Nitrokey HSM (DENK01067960000 ) 00 00
PKCS#15 Card [knot]:
Version : 0
Serial number : DENK0106796
Manufacturer ID: www.CardContact.de
Flags : PRN generation[...]
Public EC Key [Private Key]
Object Flags : [0x00]
Usage : [0x140], verify, derive
Access Flags : [0x02], extract
FieldLength : 384
Key ref : 0 (0x00)
Native : no
ID : 74f59bc17317bfccc5806108d84df1abd275faef
DirectValue : <present>
Knot is using this keystore:
keystore:
  - id: nitrokey
    backend: pkcs11
    config: "pkcs11:pin-value=*** /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so"
I verified that /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so still exists, and ldd doesn't report any missing dependencies.
strace shows communication with pcscd, whose logs contain these entries:
Oct 15 21:20:14 arrakeen systemd[1]: Started pcscd.service - PC/SC Smart Card Daemon.
Oct 15 21:20:20 arrakeen pcscd[33186]: 00000000 ../src/auth.c:166:IsClientAuthorized() Process 33204 (user: 134) is NOT authorized for action: access_pcsc
Oct 15 21:20:20 arrakeen pcscd[33186]: 00000071 ../src/winscard_svc.c:357:ContextThread() Rejected unauthorized PC/SC client
After a bit of digging, I found this is controlled by polkit, and added a brutal rule:
cat /etc/polkit-1/rules.d/pcsc.rules
/* -*- mode: js; js-indent-level: 4; indent-tabs-mode: nil -*- */
polkit.addRule(function(action, subject) {
    if (subject.isInGroup("pcsc")) {
        return polkit.Result.YES;
    }
});
With knot added to the pcsc group, it can access the HSM again.
Do you know of a better way to configure this?
NB: I'm using another account, as I began writing this with no DNS server running.
Regards,
--
Bastien Durel
Hi
We recently tried to upgrade to Knot 3.5.0 but ran into a problem: it appears zones added via `conf-set include` do not work until knot is reloaded.
To reduce calls to knotc when inserting a number of domains, we build a config fragment and then use `knotc conf-set include fragment.conf` to load it.
With 3.4.8 this worked fine. For example:
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock status version
3.4.8
# dig +short foo.com @10.37.129.215 SOA
# cat > /local/knot_dns/zones/foo.com.zone <<EOF
foo.com. 3600 IN SOA ( ns1.fastmaildev.com.
postmaster.fastmaildev.com.
2025091802 ;serial
86133 ;refresh
600 ;retry
1209600 ;expire
3600 ;minimum
)
foo.com. 3600 IN NS ns1.fastmaildev.com.
foo.com. 3600 IN NS ns2.fastmaildev.com.
EOF
# cat > /tmp/zone.conf <<EOF
zone:
- domain: foo.com
template: "default"
EOF
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock conf-begin
OK
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock conf-set include /tmp/zone.conf
OK
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock conf-commit
OK
# dig +short foo.com @10.37.129.215 SOA
ns1.fastmaildev.com. postmaster.fastmaildev.com. 2025091802 86133 600 1209600 3600
As you can see, immediately after the `conf-commit`, the zone can be queried via dig.
However this doesn't work in 3.5.0.
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock status version
3.5.0
# dig +short foo2.com @10.37.129.215 SOA
# cat > /local/knot_dns/zones/foo2.com.zone <<EOF
foo2.com. 3600 IN SOA ( ns1.fastmaildev.com.
postmaster.fastmaildev.com.
2025091802 ;serial
86133 ;refresh
600 ;retry
1209600 ;expire
3600 ;minimum
)
foo2.com. 3600 IN NS ns1.fastmaildev.com.
foo2.com. 3600 IN NS ns2.fastmaildev.com.
EOF
# cat > /tmp/zone.conf <<EOF
zone:
- domain: foo2.com
template: "default"
EOF
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock conf-begin
OK
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock conf-set include /tmp/zone.conf
OK
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock conf-commit
OK
# dig +short foo2.com @10.37.129.215 SOA
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock zone-status foo2.com
error: [foo2.com] (no such zone found)
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock zone-reload foo2.com
error: [foo2.com] (no such zone found)
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock zone-check foo2.com
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock reload
Reloaded
# dig +short foo2.com @10.37.129.215 SOA
ns1.fastmaildev.com. postmaster.fastmaildev.com. 2025091802 86133 600 1209600 3600
# /opt/knot/sbin/knotc -C /local/knot_dns/conf/ -s /run/knot_dns/knot_dns.sock zone-status foo2.com
[foo2.com.] role: master | serial: 2025091802
As you can see, after the `conf-commit` the zone isn't visible in knot at all, either via dig or via the knotc commands `zone-status` and `zone-reload`. However, immediately after a knot server `reload`, it does become visible.
This feels like a bug and a regression in 3.5.0 to me, or am I holding something wrong?
Rob Mueller
robm(a)fastmail.com
I am definitely interested in examples!
Reading up on groups: is it that 'group A' might represent 'customer A'
and have a specific set of primary/master nameservers, group B ==
customer B with a different primary, and so on?
(also fixed to be plain text - hope this is more legible in the archives)
--Chris
Hello DNS people,
I am exploring migrating from PowerDNS where we have a hidden primary (ns0)
and two public resolvers (ns1/ns2) using SQL replication, to instead use
Knot DNS for ns1/ns2 and Catalog zones to update them. ns0 would remain
Powerdns (frontend, zone edits for customers, etc). We are looking at
changing due to performance issues - "dns water torture" or "random
subdomain attacks" or whatever we're calling this these days.
Our test environment is more or less set up as described here:
* https://nick.bouwhuis.net/posts/2024-12-31-catalog-zones-powerdns-knot/
This is similar to the architecture described here:
*
https://indico.dns-oarc.net/event/47/contributions/1008/attachments/963/185…
(Klaus from nic.at)
For some zones, we're secondary to a customer's zone. In this case the
primary IPs are listed in PowerDNS metadata. I am trying to wrap my head
around how this could work seamlessly while keeping the same workflow -
add the zone to PowerDNS, then it gets replicated via the catalog zone to
ns1/ns2 (knot). Does anyone have this working? Secondaries are mentioned in
the PDF above, but no details are given.
The issues appear to be at least these two things:
1) How to tell ns1/ns2 (knot) which IP's are its primaries in these zones?
The only thing I can think of is a separate script to generate a knot
config file with this info - effectively the same as "back in the day" with
BIND. This completely negates the benefit of catalog zones for zones where
we are secondary. RFC 9432 does address this:
"Catalog zones on secondary name servers would have to be set up manually,
perhaps as static configuration, similar to how ordinary DNS zones are
configured when catalog zones or another automatic configuration mechanism
are not in place. "
That RFC then says you still have to keep it in the catalog anyhow - it's
not immediately clear to me how/why, nor how it could be configured per
the last sentence (manually in the knot conf) as well as in the catalog -
wouldn't this be two declarations of the same zone?
"Additionally, the secondary needs to be configured as a catalog consumer
for the catalog zone to enable processing of the member zones in the
catalog, such as automatic synchronization of the member zones for
secondary service"
2) How would NOTIFY work? Our hidden ns0 (PowerDNS) keeps a copy of the
zones, but ns1/ns2 would be notified by the actual primary, and our ns0
would become out of date. Does knot have something like also-notify to
always notify that server? This may or may not be a problem, but the zone
data would become completely stale without it. Some customers log into
our web portal to view records of their secondaries and expect them to
match.
If anyone has operational experience with this, or just a big cluebat to hit
me with, let me know.
Cheers,
Chris
Hi
We're noticing that as our list of zones gets larger (about 480k right now), adding a new zone or deleting an existing zone keeps getting slower. We always do our modifications as part of a transaction, and the time appears to be spent in the commit phase.
An example timing.
# time /opt/knot/sbin/knotc ... conf-begin
OK
real 0m0.010s
user 0m0.000s
sys 0m0.010s
# time /opt/knot/sbin/knotc ... conf-unset zone.domain example.com
OK
real 0m0.010s
user 0m0.000s
sys 0m0.010s
# time /opt/knot/sbin/knotc ... conf-commit
OK
real 0m2.330s
user 0m0.000s
sys 0m0.009s
#
As you can see, it took > 2 seconds to commit the transaction that removes just the example.com zone. Similarly, it takes > 2 seconds to commit the transaction that adds the zone back.
Given the time is real time and not sys/user, I presume knotc is waiting on knotd to complete the work. I used perf to record a CPU profile of knotd while the commit was running, but nothing hugely stuck out at me.
 10.75%  knotd  libc.so.6          [.] __memcmp_avx2_movbe
  6.03%  knotd  knotd              [.] __popcountdi2
  5.89%  knotd  knotd              [.] ns_first_leaf
  5.25%  knotd  libc.so.6          [.] pthread_mutex_lock@@GLIBC_2.2.5
  3.85%  knotd  liblmdb.so.0.0.0   [.] 0x0000000000003706
  3.72%  knotd  knotd              [.] ns_find_branch.part.0
  2.76%  knotd  knotd              [.] trie_get_try
  2.63%  knotd  liblmdb.so.0.0.0   [.] 0x00000000000069d2
  2.34%  knotd  libknot.so.14.0.0  [.] knot_dname_lf
  1.92%  knotd  liblmdb.so.0.0.0   [.] mdb_cursor_get
  1.72%  knotd  knotd              [.] create_zonedb
  1.68%  knotd  knotd              [.] twigbit.isra.0
  1.68%  knotd  knotd              [.] catalogs_generate
  1.36%  knotd  knotd              [.] twigoff.isra.0
  1.28%  knotd  knotd              [.] hastwig.isra.0
  1.28%  knotd  knotd              [.] db_code
  1.27%  knotd  libknot.so.14.0.0  [.] find_item
  1.11%  knotd  libknot.so.14.0.0  [.] knot_dname_size
  1.04%  knotd  knotd              [.] zonedb_reload
  0.99%  knotd  libc.so.6          [.] _int_free
  0.99%  knotd  liblmdb.so.0.0.0   [.] 0x0000000000003ce8
  0.96%  knotd  liblmdb.so.0.0.0   [.] memcmp@plt
  0.95%  knotd  liblmdb.so.0.0.0   [.] mdb_cursor_open
  0.88%  knotd  libc.so.6          [.] malloc
  0.88%  knotd  knotd              [.] conf_db_get
  0.87%  knotd  knotd              [.] ns_next_leaf
  0.82%  knotd  libknot.so.14.0.0  [.] iter_set
  0.75%  knotd  knotd              [.] evsched_cancel
  0.73%  knotd  libknot.so.14.0.0  [.] find
...
Our config is pretty simple, conf-export looks like:
server:
    rundir: "/local/knot_dns/run/"
    user: "nobody"
    pidfile: "/local/knot_dns/run/knot.pid"
    listen: [ ... ]

log:
  - target: "syslog"
    any: "info"

statistics:
    timer: "10"
    file: "/tmpfs/knot_dns_stats.yaml"

database:
    storage: "/local/knot_dns/data"

mod-stats:
  - id: "default"
    request-protocol: "on"
    server-operation: "on"
    request-bytes: "on"
    response-bytes: "on"
    edns-presence: "on"
    flag-presence: "on"
    response-code: "on"
    request-edns-option: "on"
    response-edns-option: "on"
    reply-nodata: "on"
    query-type: "on"
    query-size: "on"
    reply-size: "on"

template:
  - id: "default"
    global-module: "mod-stats/default"
    storage: "/local/knot_dns/zones/"

zone:
  - domain: "example.com."
    template: "default"
  ... 478,000 more domains all the same ...
Current files on disk are:
# ls -l /local/knot_dns/data/*
/local/knot_dns/data/catalog:
total 0
/local/knot_dns/data/journal:
total 0
/local/knot_dns/data/keys:
total 0
/local/knot_dns/data/timers:
total 75880
-rw-rw---- 1 root root 77697024 Jun 24 09:26 data.mdb
-rw-rw---- 1 root root 2432 Jul 17 01:05 lock.mdb
/local/knot_dns/data/timing:
total 0
This machine is not slow or constrained in any way. It's 24 core, 3.6Ghz, 64Gb, NVMe drives, etc. Load is very low (<1) with plenty of free resources.
So what I'm wondering is:
1. Is this normal? It doesn't feel right that adding/removing a single domain takes > 2 seconds, regardless of the size of the existing zone database.
2. Is there any way to improve this? Doing multiple adds/deletes at once within a transaction works, and we do that where we can, but there are cases where we can't, and I'd really like to understand why this is as slow as it is.
Thanks in advance
Rob
Hello,
I'm not sure I'm posting in the right place. Don't hesitate to tell me
if it's not.
I have begun testing Knot Resolver 6.x for a future project
(deploying a DNS resolver with blocked-domain lists).
I would like to know whether it's possible to split config.yaml into
several files (the main config in one file, the acl and views sections in
another, and the data-local section with the RPZ lists and the tags that tie
ACL lists to blocklists in a third), and if the answer is yes, how can I do it?
Thank you for your help.
Regards,
Stephane
I ultimately found that I needed to use the CH class in kdig, but I wonder how I/we could update the documentation or add an alias for "chaos".
Working Example:
dig +short @<dns server> version.bind chaos txt
dig +short @<dns server> version.bind CH txt
kdig +short @<dns server> version.bind CH txt
Problem Example:
kdig +short @<dns server> version.bind chaos txt
I had to read the source code to find the answer in a timely manner.
Hi,
for benchmarking purposes I need to find out how long it takes to sign zone
files of different sizes on different hardware. What is the best way
to determine the exact time the signing process takes?
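One hedged approach, assuming the zone is served by a running knotd (the zone name is a placeholder): trigger a re-sign with knotc in blocking mode and time it:

# -b makes knotc wait until the triggered zone event finishes
time knotc -b zone-sign example.com

The elapsed real time should approximate the signing duration; the timestamps of the signing-related log messages can serve as a cross-check.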
Thanks!
BR
Thomas
Hi,
when signing a zone file I receive this error in the log:
"zone event 're-sign' failed (not enough space provided)"
Can you tell me what the limiting factor is here?
Thanks!
BR
Thomas
Hello,
Could you please clarify whether Knot can perform a zone transfer not
from the first master listed, but from the one that sent the NOTIFY? The
masters are configured in the following order:
remote:
  - id: master
    address: [ "192.168.58.151", "192.168.58.134" ]
When a NOTIFY is sent from 192.168.58.134, the zone transfer is still
performed from 192.168.58.151.
Here are the relevant log entries:
Apr 25 19:09:25 ubuntu knotd[2065]: info: [chacha.com.] notify,
incoming, remote 192.168.58.134@32888 TCP, serial 2006
Apr 25 19:09:25 ubuntu knotd[2065]: info: [chacha.com.] refresh, remote
192.168.58.151@53, remote serial 2006, zone is outdated
Apr 25 19:09:25 ubuntu knotd[2065]: info: [chacha.com.] IXFR, incoming,
remote 192.168.58.151@53 TCP, receiving AXFR-style IXFR
Apr 25 19:09:25 ubuntu knotd[2065]: info: [chacha.com.] AXFR, incoming,
remote 192.168.58.151@53 TCP, started
Thank you for your product and for your help!
*Best regards,*
*A.A. Basov*
Hi,
this happened to me for the second time: https://dnsviz.net tells me:
| enfer-du-nord.net/CDNSKEY: The CDNSKEY RRset must be signed with a key that is represented in both the
| current DNSKEY and the current DS RRset. See RFC 7344, Sec. 4.1.
| enfer-du-nord.net/CDS: The CDS RRset must be signed with a key that is represented in both the current
| DNSKEY and the current DS RRset. See RFC 7344, Sec. 4.1.
I do not understand what that means.
#) I haven't modified my KSK for some time now
#) I did notify my parent zone about a modified list of nameservers (via registrar's web portal)
I am not absolutely sure whether the latter is the cause of these error messages.
I 'fixed' the issue by re-uploading my unmodified KSK DNSKEY (via the registrar's web portal).
Hmm, how can I fix this issue the right way?
Any hints are highly welcome,
Michael
Hi,
given that an IPv6 block (some /xy) might be delegated to me by my ISP, I began investigating Knot DNS's functionality with regard to ip6.arpa.
In doing so I stumbled over the synthrecord module and do not really understand what it is used for.
From https://www.knot-dns.cz/docs/3.4/singlehtml/index.html#synthrecord-automati…
"Records are synthesized only if the query can't be satisfied from the zone."
Please excuse my ignorance, but why would/should/must one return something other than the following for hosts not in the zone?
kbn> host 2001:dead:beef::1
Host 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.f.e.e.b.d.a.e.d.1.0.0.2.ip6.arpa not found: 3(NXDOMAIN)
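For what it's worth, my hedged reading of the docs: the module is for synthesizing answers (e.g., PTR records like dyn-2001-dead-beef--1.example.com.) instead of NXDOMAIN, aimed at large dynamic address pools where maintaining per-host records is impractical. A sketch along the lines of the documented example (prefix, origin, and TTL invented):

mod-synthrecord:
  - id: dyn-reverse
    network: 2001:dead:beef::/48
    type: reverse
    prefix: dyn-
    origin: example.com
    ttl: 400

zone:
  - domain: f.e.e.b.d.a.e.d.1.0.0.2.ip6.arpa
    module: mod-synthrecord/dyn-reverse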
Any feedback is highly appreciated, thanks.
Regards,
Michael
Hi,
This is just a possible feature request. We’re planning on using Knot for user
hosted domains. To do that we’ll have to add and remove zones dynamically,
so we’ve enabled the config db.
What surprised us is that this means that the config file isn’t used at all anymore
(except you can use it to prime the config db).
As it is, we'll have to embrace the config db, which makes our ansible playbook
more complicated. It's easy to add a config file template in ansible; it's more
complicated to implement `knotc conf-begin; knotc conf-set; knotc conf-commit` logic.
I wish knot were more like nsd, where you have the config file nsd.conf, but if
you add zones with `nsd-control addzone ….` they get added to a separate zonelist
file, which nsd reads on startup. It means we can have a static config file, but
still be able to add and delete zones dynamically.
nsd doesn't have automatic DNSSEC key management, and catalog zones in knot
are really easy to use, which is why we're going with knot for this project. I just
wanted to put it out there as an idea for the future :)
.einar
Hi Bill,
You can't mix a configuration database and a configuration file.
You can initialize a database with `knotc conf-init` or `knotc conf-import`.
Then if you start knotd, the database will be used for reading and persistent writing.
The configuration database format is a custom format using LMDB. It's not intended to be accessed externally!
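A minimal sketch of the resulting workflow (paths illustrative):

knotc conf-import /etc/knot/knot.conf    # one-time: seed the DB from a file (knotd stopped)
systemctl start knot                     # knotd now reads and writes the DB
knotc conf-export /tmp/snapshot.conf     # human-readable dump of the DB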
Daniel
On 3/29/25 19:56, Billiam Crashkopf wrote:
> Hi all,
>
> I'm running Knot 3.4.4 on FreeBSD 14.2, installed from the distribution package repository.
>
> I'm trying to understand, is there a way to configure knotd using both the configuration file and the database? For example, can I use the configuration file for directives like listen and template, then use the configuration database for storing individual zone settings? So far, it appears that knotd only uses one or the other. Furthermore, there doesn't seem to be any command to explicitly load or flush configuration from the database. If I start with the config file, I can initialize the database, but upon stopping knotd nothing seems to be saved to the database. I also don't see any documentation on tooling to inspect the database on disk. What is the file format of the database?
>
> If anybody can help me to know more about how it works I would be most appreciative.
>
> Thanks,
>
> Bill
>
Hello!
In July 2024, in the Knot DNS 3.3.8 release message, Daniel wrote:
>I would like to ask users with hardware HSMs to send us the output of `keymgr <hsm_keystore_id> keystore-test`
>This will allow us to update https://www.knot-dns.cz/docs/latest/html/appendices.html#compatible-pkcs-11…
We're now running Knot 3.4.4 against a Thales HSM (I have no details of the
actual device/model in use at this time) and I see the following data:
$ keymgr -c etc/knot.conf thales keystore-bench
Keystore id 'thales', type PKCS #11, threads 1
Algorithm Sigs/sec
RSASHA256 33
ECDSAP256SHA256 27
ED25519 n/a
ED448 n/a
My first reaction was "hmm, that's slow".
Is there a list (the URL above isn't one) of comparable results which I could show
the HSM operators, and/or is anybody willing to share their data?
Thanks & regards,
-JP
Hi Vladimir,
I appreciate your response and it's great to know you validate by default.
I apologize for posting to the wrong list.
Best,
Henry
On Wed, Mar 12, 2025 at 9:14 AM Vladimír Čunát <vladimir.cunat(a)nic.cz>
wrote:
> Hello.
> On 10/03/2025 17.01, birgelee--- via knot-dns-users wrote:
>
> This ballot requires compliance with RFCs 4035 (specifically an implementation of a "security-aware" resolver as defined in Section 4) and 6840. To the best of my knowledge Knot would be a viable choice for conforming to this ballot particularly since there is a reference to RFCs 4035 in the config documentation and 6840 implements several key features of modern DNSSEC. Given the need for documentable compliance by CAs, a statement of intended support from the Knot team would be extremely helpful.
>
> This is about resolvers apparently, so we're slightly off-topic here, as
> we have a split knot-resolver-users(a)lists.nic.cz - but I expect this
> thread to be very short.
>
> Knot Resolver *does support* modern DNSSEC validation, as described in RFC
> 4035, 6840, and some others. And we validate by default, etc.
>
> --Vladimir
>
Hi all,
I know this topic has been dead for a bit, but I did want to specifically find out whether Knot is intended to be compliant with DNSSEC RFCs 4035 and 6840. I ask because I am a computer security researcher and I do a lot of work with the CA/Browser Forum. I recently proposed a draft ballot that would mandate that all publicly-trusted web CAs validate DNSSEC:
https://github.com/cabforum/servercert/pull/571
This ballot requires compliance with RFCs 4035 (specifically an implementation of a "security-aware" resolver as defined in Section 4) and 6840. To the best of my knowledge Knot would be a viable choice for conforming to this ballot particularly since there is a reference to RFCs 4035 in the config documentation and 6840 implements several key features of modern DNSSEC. Given the need for documentable compliance by CAs, a statement of intended support from the Knot team would be extremely helpful.
Best,
Henry
https://henrybirgelee.com/
is there any guidance on using mod-rrl on a public server with a
moderate load, say 6kqps? we have rtfm, and remain unsure of
what we are doing. we want cookies, and therefore need to turn
rrl on. but with it turned on, we seem to drop a *lot* of
replies, a lot.
mod-rrl:
  - id: default
    rate-limit: 200
    slip: 2
randy
I have static zones that are regenerated continually (every few seconds), with a `knotc zone-reload <zonename>` after each regeneration.
Doing dig queries using ixfr= always used AXFR until I enabled the per-zone
zonefile-load: difference
We saw logs periodically like:
warning: [foo.bar.example.com.] failed to update zone file (operation
not permitted)
This is because the files are created by a different user than the one knotd runs as.
We are not doing automated signing nor dynamic updates.
This is with Knot 3.2.6.
1) Why would it try writing to the source zone file?
Anyways, we got rid of that warning with
zonefile-sync: -1
And the time between the zone-reload and serving the new zone data
improved too.
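Concretely, the per-zone combination now in effect looks like this (the zone name is a placeholder):

zone:
  - domain: foo.bar.example.com
    zonefile-load: difference
    zonefile-sync: -1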
I noticed while experimenting with journal-db-max-size and
journal-max-size that the data.mdb was around 80% larger than the defined
size. As I removed and added records, the number of stored differences shrank
and I couldn't do IXFRs from many versions back, which I assume is due to the
default journal-max-depth: 20. I believe this means the journal was
dropping versions and is not required to keep at least 20 versions (so not
a "minimum"). I am fine with that.
I was able to trigger errors like
error: [foo.bar.example.com.] zone event 'load' failed (not enough
space provided)
which I assume means that even after dropping old versions, the journal
still doesn't have enough space for the most recent changes.
We are experimenting with increasing journal-max-size to work around
this.
It appears that the differences are fully handled in memory before being
written to the journal file. I also see the journal file grow in large
jumps, never in small increments.
2) Since I have no need to reuse a journal file for system restart or
recovery -- I am building new zone files frequently -- is there any
way to not use a journal file and just have knotd do all of this in memory?
That way it would never write to the journal every few seconds while I
continue to offer outgoing IXFR support.
If not, maybe I will use a memory file system for the journal files.
By the way, this is for around ten zones varying from around 1000
records to over 8 million records for the larger ones. They change
at rates from a few records every few seconds to maybe a thousand
records every 30 seconds.
Thanks
I am testing IXFR against servers I did not install and don't have easy
access to. version.server. says 3.2.6. I know there have been IXFR changes
since then, per the NEWS file and git log. I don't see the same behavior
on my systems with different versions, but they are also configured
differently.
The knot.conf zones are not configured with "zonefile-load: difference",
and the response effectively contains the entire zone, as if it were AXFR
rather than just the changes. If I pass an IXFR SOA SERIAL equal to the
latest, it has no changes (the answer has only the SOA with the same serial).
I used dnspython to output the responses from IXFR queries (an IXFR
question with an authority-section SOA carrying the serial). I noticed
the output stops abruptly where "dig" doesn't stop.
So I used tcpdump many times to compare this knot, named, and my other knot.
I found an odd behavior in this knot 3.2.6 response which dig ignores
and my dnspython fails on.
After the expected records it has
1) an OPT record with the requestor's payload size (class 1232) and EDNS rcode
and flags (all-zeros TTL), then a 00 rdlength and an empty rdata field,
and flags (all zeros ttl), then 00 rdlength and 00 rdata field.
2) then 28 bytes I don't understand such as:
40 11be dc80 0000 0101 fa00 0000 01
or
40 20be dc80 0000 0102 0300 0000 01
or
40 0fe1 6a81 0000 0102 0500 0000 00
or
12 8de1 6a81 0000 0100 9200 0000 00
3) then an IXFR record
following other labels ...
0363 6f6d 00 (three characters "com" and the end of the domain name)
00 fb (IXFR record type)
00 01 (INternet class)
and then it ends there, with NO ttl, rdlength, nor rdata.
4) followed by the next label length, label, etc., with rrtype, class, ttl,
rdlength, rdata, and so on.
This odd sequence (OPT record, bytes I don't understand, partial/broken
IXFR record) may be repeated a few times. I assume these were interspersed
where IXFR's SOA records should be.
I couldn't find an RFC that suggests interspersing OPT or IXFR
records like this. I find it odd that an OPT record is in my ANSWER section.
I find it odd that the IXFR record is incomplete. And I don't know what
the other bytes in between are.
Is this recognizable to anyone?
The IXFR works fine as seen with dig, or when I use named as my
secondary, but I assume named is ignoring the junk parts too.
Hi Knots,
I use catalog zones to sync the set of zones my (hidden) master and slaves
handle. I'm trying to stop messing with zone files on my master and instead
switch exclusively to nsupdate (along with Tony Finch's nsdiff).
In my testing it seems that updating a zone after adding it via a catalog is
not possible:
$ knotc zone-status dxld.at
[dxld.at.] role: master | serial: - | catalog: dxld.catalog. | re-sign: +9D15h6m14s
Yet the update fails:
$ knsupdate -y $SECRET <<EOF
> server ns0.dxld.at.
> zone dxld.at.
> add dxld.at. 3600 IN SOA ns0.dxld.at. hostmaster.dxld.at. 1 2m 5m 1w 5m
> send
update failed: SERVFAIL
Nothing is logged with `logging: any: debug` except an "ACL, allowed, action
update".
As soon as I create the zone on the server with zone{-begin,-set,-commit},
it starts working, of course. I guess this is just not supported, but is there a
good reason? I would find it quite convenient to do all my DNS ops over
port 53 without touching ssh ;-)
Thanks,
--Daniel
Hello!
I have an issue.
Knot is configured as a secondary server, and when receiving a zone, a
"trailing data" error occurs, preventing the zone from being loaded from
the primary server.
```
Jan 30 11:03:40 hostname knotd[5407]: info: [domain.com.] refresh, remote
50788646-db98-4caa-b26e-95b30a470796, address 1.2.3.4@53, failed (trailing
data)
```
The same warning appears when using the `kdig` utility:
```bash
kdig @1.2.3.4 domain.com AXFR > /tmp/domain.com
;; WARNING: malformed reply packet (trailing data)
;; WARNING: malformed reply packet (trailing data)
```
The issue occurs specifically with large zones. If the zone requires 2
messages to be received (e.g., `Received 32720 B (2 messages, 442
records)`), one warning appears. If it requires 3 messages (e.g., `Received
49083 B (3 messages, 878 records)`), two warnings appear.
However, if I place this zone (`/tmp/domain.com`) into `/var/lib/knot` and
then execute:
```bash
knotc reload
knotc zone-refresh domain.com
```
Knot successfully loads the zone.
Unfortunately, due to confidentiality, I cannot share the contents of the
zone. Additionally, I do not have precise information about the software
installed on the primary server. However, if BIND is used as the secondary
server, there are no issues. A regular `dig` command also does not return
any errors.
Is there any way to make Knot ignore the "trailing data" error and
successfully load the zone?
Thank you for your help!
Hello,
I have an issue with a behaviour change in knot 3.4.1.
Before 3.4.1, sending a conf-abort command via knotc when there was no pending transaction properly returned an error; since 3.4.1, instead of receiving an error, the connection hangs and fails after a timeout.
Some digging shows that this is due to a change in commit 69328dd7799253978605f7dac29175945971e63f.
Instead of returning an error as it should, ctl_process skips the command processing when it does not expect a conf-abort command.
Is this a bug, or is this behaviour intended?
Just to give you some context about my use case: I wrote a daemon that uses libknot to sync the DNS configuration, and as knot does not support multiple transactions, it has to make sure there is no dangling transaction before trying to apply changes (in case the daemon crashed while applying a previous change). Until 3.4.1, it did that by simply sending a conf-abort before starting the new transaction.
Thanks
Hi Guys,
a happy new year to all of you!
For policy reasons we need to make knot use an HSM in the future. Is
anybody successfully using a cloud-based HSM service such as Google
Cloud HSM for DNSSEC signing?
Any information is helpful, thanks!
BR
Thomas
Hello,
My knot 3.4.3 gives me following notice :
notice: config, policy 'rail_policy' depends on default nsec3-salt-length=8, since version 3.5 the default becomes 0
In order to avoid problems when 3.5 arrives, I see 2 possibilities:
* add an explicit nsec3-salt-length=8 to my policy
* add an explicit nsec3-salt-length=0 to my policy and resign the
zone.
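For reference, the second option would look something like this (policy shown in isolation; other options omitted):

policy:
  - id: rail_policy
    nsec3: on
    nsec3-salt-length: 0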
From https://www.ietf.org/archive/id/draft-ietf-dnsop-nsec3-guidance-10.html#nam…
I understand that 0 should be the new setting, but what are the
risks (considering e.g. DNS caches) if I change the policy of the zone?
I only have small zones with very few dynamic changes, which I can
delay for the duration of the TTL if needed.
--
Erwan David