Hi!
I have tested with GnuTLS 3.8.4; the knotd process on the master no longer crashes.
Thank you!
Best Regards,
SUN Guonian
On 2026/1/8 16:05, Daniel Salzman via knot-dns-users wrote:
Hi,
It seems this GnuTLS fix
https://gitlab.com/gnutls/gnutls/-/commit/f979aa3d0fcf1c79accc038b6599382ef…
fixes the Knot DNS master crash over QUIC. At first sight, the fix
is included in GnuTLS 3.8.4.
Daniel
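
The version statement above can be checked mechanically. This is a minimal sketch, assuming the installed version string has the RPM form reported later in this thread (3.8.3-9.el9.0.1) and that the fix first shipped upstream in 3.8.4, which the message above only states "at first sight". The name has_quic_fix is a hypothetical helper, and a distribution may backport the fix into an older version number, which this comparison cannot detect.

```python
import re

# Assumption from the thread: the fix appears to be included in GnuTLS 3.8.4.
FIXED_IN = (3, 8, 4)

def upstream_version(rpm_version: str) -> tuple[int, ...]:
    """Extract the upstream part of a version such as '3.8.3-9.el9.0.1'."""
    upstream = rpm_version.split("-", 1)[0]
    if not re.fullmatch(r"\d+(\.\d+)*", upstream):
        raise ValueError(f"unexpected version string: {rpm_version!r}")
    return tuple(int(part) for part in upstream.split("."))

def has_quic_fix(rpm_version: str) -> bool:
    """True if the upstream version is at least FIXED_IN (backports not detected)."""
    return upstream_version(rpm_version) >= FIXED_IN

if __name__ == "__main__":
    for version in ("3.8.3-9.el9.0.1", "3.8.4"):
        print(version, has_quic_fix(version))
```

Comparing integer tuples rather than strings keeps a version such as 3.10.0 ordered after 3.8.4.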
On 1/6/26 04:33, SUN Guonian wrote:
>
> On 2026/1/5 23:10, Daniel Salzman via knot-dns-users wrote:
>> Hi,
>>
>> On 1/5/26 10:54, SUN Guonian wrote:
>>>
>>> On 2026/1/5 16:30, Daniel Salzman via knot-dns-users wrote:
>>>> Hi!
>>>>
>>>> On 1/5/26 08:06, SUN Guonian wrote:
>>>>> Increasing quic-outbuf-max-size takes effect: the transfer
>>>>> works on x86_64.
>>>>>
>>>>> But on aarch64 the behaviour is quite different:
>>>>
>>>> Are there the same number of background workers on both
>>>> architectures?
>>>
>>> The number of background workers is the default value.
>>>
>>> On x86_64, it is 2 for both master and slave.
>>>
>>> On aarch64, it is 48 on the master, 48 on slave1 (10.1.136.156), and
>>> 24 on slave2 (10.1.136.159, virtual host).
>>>
>>> After I decreased it to 2 on the master, both slaves could get the zone.
>>
>> This is interesting, because the number of background workers
>> shouldn't matter. The output buffer limit is per worker.
>>
>>>
>>>>
>>>>>
>>>>> 1. I increased quic-outbuf-max-size from the default 100M to 3200M,
>>>>> doubling it each time, and it still produces:
>>>>>
>>>>> 2026-01-05T14:47:49+0800 info: [foo.] AXFR, outgoing, remote
>>>>> 10.1.136.156@37936 QUIC, started, serial 2025123113
>>>>> 2026-01-05T14:47:49+0800 info: [foo.] AXFR, outgoing, remote
>>>>> 10.1.136.156@37936 QUIC, buffering finished, 0.27 seconds, 2956
>>>>> messages, 49797072 bytes
>>>>> 2026-01-05T14:49:13+0800 debug: QUIC, terminated inactive
>>>>> connections 1
>>>>> 2026-01-05T14:49:13+0800 debug: QUIC, terminated inactive
>>>>> connections 1
>>>>> 2026-01-05T14:49:19+0800 debug: [foo.] ACL, allowed, action
>>>>> transfer, remote 10.1.136.159@14324 QUIC cert-key
>>>>> TFg9ybqubTukNtMiFdn5jW61Y4VUPS9XmYxHsCeQ/4c=
>>>>> 2026-01-05T14:49:19+0800 info: [foo.] AXFR, outgoing, remote
>>>>> 10.1.136.159@14324 QUIC, started, serial 2025123113
>>>>> 2026-01-05T14:49:19+0800 info: [foo.] AXFR, outgoing, remote
>>>>> 10.1.136.159@14324 QUIC, buffering finished, 0.27 seconds, 2956
>>>>> messages, 49797072 bytes
>>>>> 2026-01-05T14:49:23+0800 debug: QUIC, terminated inactive
>>>>> connections 1
>>>>
>>>> Note that the reason for connection termination is different. The
>>>> client was inactive.
>>>>
>>>>>
>>>>> 2. One slave (10.1.136.156, Rocky 9) could get the whole zone;
>>>>> the other (10.1.136.159, Rocky 10) couldn't. Both are alive.
>>>>
>>>> What does the server log?
>>>
>>> on slave1,
>>>
>>> 2026-01-05T17:31:58+0800 info: [foo.] notify, incoming, remote
>>> 10.1.136.154@24037 TCP, serial 2025123116
>>> 2026-01-05T17:31:58+0800 info: [foo.] refresh, remote
>>> 10.1.136.154@853, remote serial 2025123116, zone is outdated
>>> 2026-01-05T17:31:58+0800 info: [foo.] IXFR, incoming, remote
>>> 10.1.136.154@853 QUIC, receiving AXFR-style IXFR
>>> 2026-01-05T17:31:58+0800 info: [foo.] AXFR, incoming, remote
>>> 10.1.136.154@853 QUIC, started
>>> 2026-01-05T17:32:02+0800 info: [foo.] AXFR, incoming, remote
>>> 10.1.136.154@853 QUIC, finished, remote serial 2025123116, 3.42
>>> seconds, 3695 messages, 62246340 bytes
>>> 2026-01-05T17:32:03+0800 info: [foo.] refresh, remote
>>> 10.1.136.154@853, zone updated, 5.56 seconds, serial 2025123114 ->
>>> 2025123116, expires in 2491200 seconds
>>> 2026-01-05T17:32:05+0800 info: [foo.] zone file updated, serial
>>> 2025123114 -> 2025123116
>>> 2026-01-05T17:33:13+0800 debug: stats, dumped into file
>>> '/home/gtld/knot/chroot1/var/run/stats.yaml'
>>> 2026-01-05T17:42:08+0800 info: [foo.] refresh, remote master_quic,
>>> address 10.1.136.154@853, failed (connection reset)
>>> 2026-01-05T17:42:08+0800 warning: [foo.] refresh, remote
>>> master_quic not usable
>>> 2026-01-05T17:42:08+0800 error: [foo.] refresh, failed (no usable
>>> master), next retry at 2026-01-05T17:45:28+0800, expires in 2490595
>>> seconds
>>> 2026-01-05T17:42:08+0800 error: [foo.] zone event 'refresh' failed
>>> (no usable master)
>>> 2026-01-05T17:45:30+0800 info: [foo.] refresh, remote master_quic,
>>> address 10.1.136.154@853, failed (connection reset)
>>> 2026-01-05T17:45:30+0800 warning: [foo.] refresh, remote
>>> master_quic not usable
>>> 2026-01-05T17:45:30+0800 error: [foo.] refresh, failed (no usable
>>> master), next retry at 2026-01-05T17:48:50+0800, expires in 2490393
>>> seconds
>>> 2026-01-05T17:45:30+0800 error: [foo.] zone event 'refresh' failed
>>> (no usable master)
> 2026-01-06T10:24:38+0800 debug: [foo.] ACL, allowed, action notify,
> remote 10.1.136.154@21789 TCP
> 2026-01-06T10:24:38+0800 info: [foo.] notify, incoming, remote
> 10.1.136.154@21789 TCP, serial 2025123118
> 2026-01-06T10:24:38+0800 info: [foo.] refresh, remote
> 10.1.136.154@853, remote serial 2025123118, zone is outdated
> 2026-01-06T10:24:39+0800 info: [foo.] IXFR, incoming, remote
> 10.1.136.154@853 QUIC/0-RTT, receiving AXFR-style IXFR
> 2026-01-06T10:24:39+0800 info: [foo.] AXFR, incoming, remote
> 10.1.136.154@853 QUIC/0-RTT, started
> 2026-01-06T10:24:42+0800 info: [foo.] AXFR, incoming, remote
> 10.1.136.154@853 QUIC, finished, remote serial 2025123118, 2.99
> seconds, 3695 messages, 62246340 bytes
> 2026-01-06T10:24:43+0800 info: [foo.] refresh, remote
> 10.1.136.154@853, zone updated, 4.82 seconds, serial 2025123117 ->
> 2025123118, expires in 2491200 seconds
> 2026-01-06T10:24:45+0800 info: [foo.] zone file updated, serial
> 2025123117 -> 2025123118
> 2026-01-06T10:33:13+0800 debug: stats, dumped into file
> '/home/gtld/knot/chroot1/var/run/stats.yaml'
> 2026-01-06T10:34:44+0800 info: [foo.] refresh, remote master_quic,
> address 10.1.136.154@853, failed (connection reset)
> 2026-01-06T10:34:44+0800 warning: [foo.] refresh, remote master_quic
> not usable
> 2026-01-06T10:34:44+0800 error: [foo.] refresh, failed (no usable
> master), next retry at 2026-01-06T10:38:04+0800, expires in 2490599
> seconds
> 2026-01-06T10:34:44+0800 error: [foo.] zone event 'refresh' failed
> (no usable master)
>>>
>>> on slave2,
>>>
>>> 2026-01-05T17:32:28+0800 info: server started
>>> 2026-01-05T17:32:28+0800 info: [foo.] AXFR, incoming, remote
>>> 10.1.136.154@853 QUIC, started
>>> 2026-01-05T17:32:33+0800 info: [foo.] AXFR, incoming, remote
>>> 10.1.136.154@853 QUIC, finished, remote serial 2025123116, 4.94
>>> seconds, 3695 messages, 62246340 bytes
>>> 2026-01-05T17:32:35+0800 info: [foo.] refresh, remote
>>> 10.1.136.154@853, zone updated, 7.34 seconds, serial none ->
>>> 2025123116, expires in 2491200 seconds
>>> 2026-01-05T17:32:37+0800 info: [foo.] zone file updated, serial
>>> 2025123116
>>> 2026-01-05T17:34:59+0800 debug: [foo.] ACL, allowed, action notify,
>>> remote 10.1.136.154@21413 TCP
>>> 2026-01-05T17:34:59+0800 info: [foo.] notify, incoming, remote
>>> 10.1.136.154@21413 TCP, serial 2025123116
>>> 2026-01-05T17:42:36+0800 info: [foo.] refresh, remote master_quic,
>>> address 10.1.136.154@853, failed (connection reset)
>>> 2026-01-05T17:42:36+0800 warning: [foo.] refresh, remote
>>> master_quic not usable
>>> 2026-01-05T17:42:36+0800 error: [foo.] refresh, failed (no usable
>>> master), next retry at 2026-01-05T17:45:56+0800, expires in 2490599
>>> seconds
>>> 2026-01-05T17:42:36+0800 error: [foo.] zone event 'refresh' failed
>>> (no usable master)
>>> 2026-01-05T17:45:58+0800 info: [foo.] refresh, remote master_quic,
>>> address 10.1.136.154@853, failed (connection reset)
>>> 2026-01-05T17:45:58+0800 warning: [foo.] refresh, remote
>>> master_quic not usable
>>> 2026-01-05T17:45:58+0800 error: [foo.] refresh, failed (no usable
>>> master), next retry at 2026-01-05T17:49:18+0800, expires in 2490397
>>> seconds
>>> 2026-01-05T17:45:58+0800 error: [foo.] zone event 'refresh' failed
>>> (no usable master)
> 2026-01-06T10:27:44+0800 info: [foo.] refresh, remote
> 10.1.136.154@853, remote serial 2025123118, zone is outdated
> 2026-01-06T10:27:49+0800 info: [foo.] refresh, remote master_quic,
> address 10.1.136.154@853, failed (connection reset)
> 2026-01-06T10:27:49+0800 warning: [foo.] refresh, remote master_quic
> not usable
> 2026-01-06T10:27:49+0800 error: [foo.] refresh, failed (no usable
> master), next retry at 2026-01-06T10:31:09+0800, expires in 2431693
> seconds
> 2026-01-06T10:27:49+0800 error: [foo.] zone event 'refresh' failed
> (no usable master)
> 2026-01-06T10:31:12+0800 info: [foo.] refresh, remote master_quic,
> address 10.1.136.154@853, failed (connection reset)
> 2026-01-06T10:31:12+0800 warning: [foo.] refresh, remote master_quic
> not usable
> 2026-01-06T10:31:12+0800 error: [foo.] refresh, failed (no usable
> master), next retry at 2026-01-06T10:34:32+0800, expires in 2431490
> seconds
>
> 2026-01-06T10:31:12+0800 error: [foo.] zone event 'refresh' failed
> (no usable master)
>
>
> After slave2 made its second attempt, the process on the master disappeared.
>
>>>
>>> on master,
>>>
>>> 2026-01-05T17:31:58+0800 info: server started
>>> 2026-01-05T17:31:58+0800 debug: [foo.] ACL, allowed, action
>>> transfer, remote 10.1.136.156@45612 QUIC cert-key
>>> kulz9ehQf5Ycn/+2mCicUdfTMuDXHbQEWBwg5qDi0Eo=
>>> 2026-01-05T17:31:58+0800 info: [foo.] IXFR, outgoing, remote
>>> 10.1.136.156@45612 QUIC, incomplete history, serial 2025123114,
>>> fallback to AXFR
>>> 2026-01-05T17:31:58+0800 debug: [foo.] ACL, allowed, action
>>> transfer, remote 10.1.136.156@45612 QUIC cert-key
>>> kulz9ehQf5Ycn/+2mCicUdfTMuDXHbQEWBwg5qDi0Eo=
>>> 2026-01-05T17:31:58+0800 info: [foo.] AXFR, outgoing, remote
>>> 10.1.136.156@45612 QUIC, started, serial 2025123116
>>> 2026-01-05T17:31:58+0800 info: [foo.] AXFR, outgoing, remote
>>> 10.1.136.156@45612 QUIC, buffering finished, 0.33 seconds, 3695
>>> messages, 62246340 bytes
>>> 2026-01-05T17:32:47+0800 debug: [foo.] ACL, allowed, action
>>> transfer, remote 10.1.136.159@41166 QUIC cert-key
>>> TFg9ybqubTukNtMiFdn5jW61Y4VUPS9XmYxHsCeQ/4c=
>>> 2026-01-05T17:32:47+0800 info: [foo.] AXFR, outgoing, remote
>>> 10.1.136.159@41166 QUIC, started, serial 2025123116
>>> 2026-01-05T17:32:47+0800 info: [foo.] AXFR, outgoing, remote
>>> 10.1.136.159@41166 QUIC, buffering finished, 0.38 seconds, 3695
>>> messages, 62246340 bytes
>>> 2026-01-05T17:35:18+0800 info: [foo.] notify, outgoing, remote
>>> 10.1.136.159@53 TCP, retry, serial 2025123116
>>
>> These logs describe a different situation. I see successful transfers
>> over QUIC. Note that the failed transfers were over TCP.
> 2026-01-06T10:24:33+0800 info: Knot DNS 3.5.2 starting
> 2026-01-06T10:24:33+0800 info: loaded configuration file
> '/home/gtld/knot/chroot4/etc/knot.conf', mapsize 500 MiB
> 2026-01-06T10:24:33+0800 info: using UDP reuseport, socket affinity,
> incoming TCP Fast Open
> 2026-01-06T10:24:33+0800 info: binding to interface 10.0.0.53@53
> 2026-01-06T10:24:33+0800 info: binding to interface 10.0.1.53@53
> 2026-01-06T10:24:33+0800 info: binding to interface 10.10.3.154@53
> 2026-01-06T10:24:33+0800 info: binding to interface 10.10.4.154@53
> 2026-01-06T10:24:33+0800 info: binding to interface 10.1.136.154@53
> 2026-01-06T10:24:33+0800 info: binding to QUIC interface 10.10.3.154@853
> 2026-01-06T10:24:33+0800 info: binding to QUIC interface 10.10.4.154@853
> 2026-01-06T10:24:33+0800 info: binding to QUIC interface
> 10.1.136.154@853
> 2026-01-06T10:24:33+0800 info: binding to TLS interface 10.10.3.154@853
> 2026-01-06T10:24:33+0800 info: binding to TLS interface 10.10.4.154@853
> 2026-01-06T10:24:33+0800 info: binding to TLS interface 10.1.136.154@853
> 2026-01-06T10:24:33+0800 debug: QUIC/TLS, using self-generated key
> '/home/gtld/knot/chroot4/zone/keys/quic_key.pem' with one-time
> certificate
> 2026-01-06T10:24:33+0800 info: QUIC/TLS, certificate public key
> w23/TdHAJ7kwJMrb/rz/wYCDaYBVoHv4uBc880pSgCo=
> 2026-01-06T10:24:33+0800 info: changing GID to 1000
> 2026-01-06T10:24:33+0800 info: changing UID to 1000
> 2026-01-06T10:24:33+0800 info: process not allowed to set
> capabilities, skipping
> 2026-01-06T10:24:33+0800 info: changed directory to /
> 2026-01-06T10:24:33+0800 info: loading 1 zones
> 2026-01-06T10:24:33+0800 debug: suspending zone events
> 2026-01-06T10:24:33+0800 debug: suspended zone events
> 2026-01-06T10:24:33+0800 info: [foo.] zone will be loaded
> 2026-01-06T10:24:33+0800 debug: resumed zone events
> 2026-01-06T10:24:33+0800 info: control, binding to
> '/home/gtld/knot/chroot4/var/run/knot.sock'
> 2026-01-06T10:24:33+0800 info: starting server as a daemon, PID 327676
> 2026-01-06T10:24:37+0800 info: [foo.] zone file loaded, serial
> 2025123118
> 2026-01-06T10:24:38+0800 info: [foo.] loaded, serial none ->
> 2025123118, 92000117 bytes
> 2026-01-06T10:24:38+0800 info: [foo.] notify, outgoing, remote
> 10.1.136.159@53 TCP, serial 2025123118
> 2026-01-06T10:24:38+0800 info: [foo.] notify, outgoing, remote
> 10.1.136.156@53 TCP, serial 2025123118
> 2026-01-06T10:24:38+0800 info: server started
> 2026-01-06T10:24:38+0800 debug: [foo.] ACL, allowed, action transfer,
> remote 10.1.136.156@40843 QUIC cert-key
> kulz9ehQf5Ycn/+2mCicUdfTMuDXHbQEWBwg5qDi0Eo=
> 2026-01-06T10:24:38+0800 info: [foo.] IXFR, outgoing, remote
> 10.1.136.156@40843 QUIC, incomplete history, serial 2025123117,
> fallback to AXFR
> 2026-01-06T10:24:38+0800 debug: [foo.] ACL, allowed, action transfer,
> remote 10.1.136.156@40843 QUIC cert-key
> kulz9ehQf5Ycn/+2mCicUdfTMuDXHbQEWBwg5qDi0Eo=
> 2026-01-06T10:24:38+0800 info: [foo.] AXFR, outgoing, remote
> 10.1.136.156@40843 QUIC, started, serial 2025123118
> 2026-01-06T10:24:39+0800 info: [foo.] AXFR, outgoing, remote
> 10.1.136.156@40843 QUIC, buffering finished, 0.31 seconds, 3695
> messages, 62246340 bytes
> 2026-01-06T10:24:42+0800 debug: QUIC, terminated inactive connections 1
> 2026-01-06T10:24:42+0800 debug: QUIC, terminated inactive connections 1
> 2026-01-06T10:28:03+0800 debug: [foo.] ACL, allowed, action transfer,
> remote 10.1.136.159@12443 QUIC cert-key
> TFg9ybqubTukNtMiFdn5jW61Y4VUPS9XmYxHsCeQ/4c=
> 2026-01-06T10:28:03+0800 info: [foo.] IXFR, outgoing, remote
> 10.1.136.159@12443 QUIC, incomplete history, serial 2025123116,
> fallback to AXFR
> 2026-01-06T10:28:03+0800 debug: [foo.] ACL, allowed, action transfer,
> remote 10.1.136.159@12443 QUIC cert-key
> TFg9ybqubTukNtMiFdn5jW61Y4VUPS9XmYxHsCeQ/4c=
> 2026-01-06T10:28:03+0800 info: [foo.] AXFR, outgoing, remote
> 10.1.136.159@12443 QUIC, started, serial 2025123118
> 2026-01-06T10:28:04+0800 info: [foo.] AXFR, outgoing, remote
> 10.1.136.159@12443 QUIC, buffering finished, 0.35 seconds, 3695
> messages, 62246340 bytes
>
> 2026-01-06T10:28:08+0800 debug: QUIC, terminated inactive connections 1
>
>
> According to the master's log, the transfers to both slaves are OK, each
> followed by "QUIC, terminated inactive connections 1",
>
> but slave2 doesn't get the zone.
>
>>
>>>
>>> The master still crashed:
>>>
>>> -rw-r-----. 1 root root 65663813 Jan 5 17:42
>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.323451.1767606123000000.zst
>>>
>>>>
>>>>>
>>>>> 3. The master (10.1.136.154, Rocky 9) crashed.
>>>>>
>>>>> # ls -lrt /var/lib/systemd/coredump/
>>>>>
>>>>> -rw-r-----. 1 root root 52527031 Jan 5 09:09
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.320376.1767575365000000.zst
>>>>> -rw-r-----. 1 root root 52575905 Jan 5 10:06
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.320623.1767578803000000.zst
>>>>> -rw-r-----. 1 root root 52546840 Jan 5 10:53
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.320970.1767581613000000.zst
>>>>> -rw-r-----. 1 root root 52519031 Jan 5 13:33
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.321756.1767591216000000.zst
>>>>> -rw-r-----. 1 root root 52519386 Jan 5 13:46
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.321896.1767591963000000.zst
>>>>> -rw-r-----. 1 root root 52512650 Jan 5 14:23
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.322020.1767594182000000.zst
>>>>> -rw-r-----. 1 root root 52553042 Jan 5 14:49
>>>>> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.322309.1767595778000000.zst
>>>>
>>>> Could you send me a backtrace? Something like
>>>>
>>>> coredumpctl gdb knot \
>>>> --batch \
>>>> -ex "set pagination off" \
>>>> -ex "thread apply all bt full"
>>>
>>> This command outputs:
>>>
>>> coredumpctl: unrecognized option '--batch'
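
coredumpctl parses its own options, so gdb flags such as --batch placed directly on its command line are rejected, as the error above shows. The following is only a sketch of one way to pass them through. It assumes a systemd version whose coredumpctl supports --debugger-arguments (introduced in systemd 248, as far as I know) and uses the process name knotd as the match, which is also an assumption. build_backtrace_command is a hypothetical helper that only builds and prints the command; it does not run it.

```python
import shlex

def build_backtrace_command(match: str = "knotd") -> list[str]:
    """Build a coredumpctl call that hands batch-mode options to gdb."""
    gdb_args = [
        "-batch",
        "-ex", "set pagination off",
        "-ex", "thread apply all bt full",
    ]
    return [
        "coredumpctl", "debug", match,
        # The gdb options travel as one quoted string, so coredumpctl
        # does not try to interpret them as its own options.
        "--debugger-arguments=" + shlex.join(gdb_args),
    ]

if __name__ == "__main__":
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(build_backtrace_command()))
```

A usable backtrace still needs a complete (not truncated) core file and the debug symbol packages.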
>>>
>>> I tried to get the gdb backtrace output, but failed:
>>>
>>> # coredumpctl dump > c01
>>>
>>> PID: 323451 (knotd)
>>> UID: 1000 (gtld)
>>> GID: 1000 (gtld)
>>> Signal: 11 (SEGV)
>>> Timestamp: Mon 2026-01-05 17:42:03 CST (6min ago)
>>> Command Line: /usr/sbin/knotd -c
>>> /home/gtld/knot/chroot4/etc/knot.conf -d
>>> Executable: /usr/sbin/knotd
>>> Control Group: /user.slice/user-0.slice/session-129.scope
>>> Unit: session-129.scope
>>> Slice: user-0.slice
>>> Session: 129
>>> Owner UID: 0 (root)
>>> Boot ID: eceeeff3d58f457db4014dc5f33e0fad
>>> Machine ID: 82aa697a9ae54202bb5e0ec31c510520
>>> Hostname: tiangong-01
>>> Storage:
>>> /var/lib/systemd/coredump/core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.323451.1767606123000000.zst
>>> (truncated)
>>> Size on Disk: 62.6M
>>> Message: Process 323451 (knotd) of user 1000 dumped core.
>>>
>>> Stack trace of thread 323458:
>>> #0 0x0000ffff9bce36f0 n/a (n/a + 0x0)
>>> #1 0x0000ffff9be77c7c n/a (n/a + 0x0)
>>> #2 0x0000ffff9be77c7c n/a (n/a + 0x0)
>>> #3 0x0000ffff9beaf4b8 n/a (n/a + 0x0)
>>> #4 0x0000ffff9be42d5c n/a (n/a + 0x0)
>>> #5 0x0000ffff9be4ff8c n/a (n/a + 0x0)
>>> #6 0x0000ffff9c3516a8 n/a (n/a + 0x0)
>>> #7 0x0000ffff9c35170c n/a (n/a + 0x0)
>>> #8 0x0000ffff9c35c104 n/a (n/a + 0x0)
>>> #9 0x0000ffff9c374e2c n/a (n/a + 0x0)
>>> #10 0x0000ffff9c36b630 n/a (n/a + 0x0)
>>> #11 0x0000ffff9c36b984 n/a (n/a + 0x0)
>>> #12 0x0000ffff9c36c12c n/a (n/a + 0x0)
>>> #13 0x0000ffff9c359498 n/a (n/a + 0x0)
>>> #14 0x0000aaaae3cf8244 quic_handler
>>> (/usr/sbin/knotd + 0x28244)
>>> #15 0xeee58b05aa78eb00 n/a (n/a + 0x0)
>>> ELF object binary architecture: AARCH64
>>> More than one entry matches, ignoring rest.
>>>
>>> # gdb --core c01
>>> GNU gdb (Rocky Linux) 16.3-2.el9
>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later
>>> <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.
>>> Type "show copying" and "show warranty" for details.
>>> This GDB was configured as "aarch64-redhat-linux-gnu".
>>> Type "show configuration" for configuration details.
>>> For bug reporting instructions, please see:
>>> <https://www.gnu.org/software/gdb/bugs/>.
>>> Find the GDB manual and other documentation resources online at:
>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>
>>> For help, type "help".
>>> Type "apropos word" to search for commands related to "word".
>>>
>>> warning: BFD: warning: /var/lib/systemd/coredump/c01 has a segment
>>> extending past end of file
>>>
>>> warning: Can't open file /tmp/knot-confdb.B4NsUg/lock.mdb (deleted)
>>> during file-backed mapping note processing
>>>
>>> warning: Can't open file /tmp/knot-confdb.B4NsUg/data.mdb (deleted)
>>> during file-backed mapping note processing
>>> [New LWP 323458]
>>> [New LWP 323453]
>>> [New LWP 323451]
>>> [New LWP 323454]
>>> [New LWP 323456]
>>> [New LWP 323460]
>>> [New LWP 323455]
>>> [New LWP 323461]
>>> [New LWP 323457]
>>> [New LWP 323459]
>>> [New LWP 323462]
>>> [New LWP 323463]
>>> [New LWP 323452]
>>> [New LWP 323464]
>>> [New LWP 323465]
>>>
>>> warning: failed to parse execution context from corefile: Cannot
>>> access memory at address 0xffffc31bcfe8
>>> Reading symbols from /usr/sbin/knotd...
>>> Reading symbols from
>>> /usr/lib/debug/usr/sbin/knotd-3.5.2-cznic.1.el9.aarch64.debug...
>>>
>>> warning: Error reading shared library list entry at 0x2e20613635303264
>>> Cannot access memory at address 0x6578652d6c642f80
>>> Cannot access memory at address 0x6578652d6c642f78
>>> Failed to read a valid object file image from memory.
>>> Core was generated by `/usr/sbin/knotd -c
>>> /home/gtld/knot/chroot4/etc/knot.conf -d'.
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0 0x0000ffff9bce36f0 in ?? ()
>>> [Current thread is 1 (LWP 323458)]
>>> (gdb) bt
>>> #0 0x0000ffff9bce36f0 in ?? ()
>>> Backtrace stopped: previous frame identical to this frame (corrupt
>>> stack?)
>>> (gdb)
>>>
>>
>> Unfortunately, this output is incomplete. You should increase the
>> coredump limit and install packages with debug symbols:
>> knot-debuginfo knot-libs-debuginfo
>
> -rw-r-----. 1 root root 1400373248 Jan 6 10:31
> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.327676.1767666688000000
>
>
> [root@tiangong-01 coredump]# gdb --core
> core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.327676.1767666688000000
> GNU gdb (Rocky Linux) 16.3-2.el9
> Copyright (C) 2024 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "aarch64-redhat-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word".
>
> warning: Can't open file /tmp/knot-confdb.g2kMNg/lock.mdb (deleted)
> during file-backed mapping note processing
>
> warning: Can't open file /tmp/knot-confdb.g2kMNg/data.mdb (deleted)
> during file-backed mapping note processing
> [New LWP 327685]
> [New LWP 327678]
> [New LWP 327676]
> [New LWP 327680]
> [New LWP 327679]
> [New LWP 327682]
> [New LWP 327681]
> [New LWP 327686]
> [New LWP 327684]
> [New LWP 327688]
> [New LWP 327687]
> [New LWP 327677]
> [New LWP 327683]
> [New LWP 327689]
> [New LWP 327690]
> Reading symbols from /usr/sbin/knotd...
> Reading symbols from
> /usr/lib/debug/usr/sbin/knotd-3.5.2-cznic.1.el9.aarch64.debug...
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/sbin/knotd -c
> /home/gtld/knot/chroot4/etc/knot.conf -d'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 __GI___libc_free (mem=0xa27d64bf12c00800) at malloc.c:3231
> 3231 if (chunk_is_mmapped (p)) /* release
> mmapped memory. */
> [Current thread is 1 (Thread 0xffff62c0dd00 (LWP 327685))]
> Missing rpms, try: dnf --enablerepo='*debug*' install
> libxdp-debuginfo-1.5.5-1.el9_6.aarch64
> (gdb) bt
> #0 __GI___libc_free (mem=0xa27d64bf12c00800) at malloc.c:3231
> #1 0x0000ffff84a77c7c in _gnutls_buffer_clear
> (str=str@entry=0xffff62c0bfc8) at str.c:80
> #2 0x0000ffff84aaf4b8 in _gnutls13_recv_end_of_early_data
> (session=session@entry=0xffff04032af0) at tls13/early_data.c:103
> #3 0x0000ffff84a42d5c in _gnutls13_handshake_server
> (session=session@entry=0xffff04032af0) at handshake-tls13.c:481
> #4 0x0000ffff84a4ff8c in handshake_server (session=<optimized out>)
> at handshake.c:3502
> #5 gnutls_handshake (session=session@entry=0xffff04032af0) at
> handshake.c:2898
> #6 0x0000ffff850006a8 in ngtcp2_crypto_read_write_crypto_data
> (conn=conn@entry=0xffff0402b780, encryption_level=<optimized out>,
> data=<optimized out>,
> datalen=<optimized out>) at
> contrib/libngtcp2/ngtcp2/crypto/gnutls.c:498
> #7 0x0000ffff8500070c in ngtcp2_crypto_recv_crypto_data_cb
> (conn=0xffff0402b780, encryption_level=<optimized out>,
> offset=<optimized out>,
> data=<optimized out>, datalen=<optimized out>,
> user_data=<optimized out>) at
> contrib/libngtcp2/ngtcp2/crypto/shared.c:1718
> #8 0x0000ffff8500b104 in conn_call_recv_crypto_data (conn=<optimized
> out>, encryption_level=<optimized out>, offset=<optimized out>,
> data=<optimized out>, datalen=<optimized out>) at
> contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:150
> #9 0x0000ffff85023e2c in conn_emit_pending_crypto_data
> (rx_offset=935, strm=0xffff04039dc0,
> encryption_level=NGTCP2_ENCRYPTION_LEVEL_INITIAL,
> conn=0xffff0402b780) at
> contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:5968
> #10 conn_recv_crypto.part.0.lto_priv.0 (conn=0xffff0402b780,
> encryption_level=NGTCP2_ENCRYPTION_LEVEL_INITIAL,
> crypto=0xffff04039dc0, fr=<optimized out>,
> ts=<optimized out>) at
> contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:7281
> #11 0x0000ffff8501a630 in conn_recv_crypto (ts=2855398215727159,
> fr=0xffff62c0c310, crypto=0xffff04039dc0,
> encryption_level=NGTCP2_ENCRYPTION_LEVEL_INITIAL,
> conn=0xffff0402b780) at contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:7213
> #12 conn_recv_handshake_pkt (conn=conn@entry=0xffff0402b780,
> path=path@entry=0xffff62c0cd10, pi=pi@entry=0xffff62c0cd00,
> pkt=pkt@entry=0xffff62164658 <incomplete sequence \303>,
> pktlen=pktlen@entry=1200, dgramlen=dgramlen@entry=1200,
> pkt_ts=pkt_ts@entry=2855398215727159,
> ts=ts@entry=2855398215727159) at
> contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:6902
> #13 0x0000ffff8501a984 in conn_recv_handshake_cpkt
> (conn=conn@entry=0xffff0402b780, path=0xffff62c0cd10, pi=0xffff62c0cd00,
> pkt=0xffff62164658 <incomplete sequence \303>,
> pktlen=pktlen@entry=1200, ts=ts@entry=2855398215727159)
> at contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:6987
> #14 0x0000ffff8501b12c in conn_read_handshake (conn=0xffff0402b780,
> path=<optimized out>, pi=<optimized out>, pkt=<optimized out>,
> pktlen=1200,
> ts=2855398215727159) at
> contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:10091
> #15 0x0000ffff85008498 in ngtcp2_conn_read_pkt_versioned
> (pkt_info_version=1, ts=2855398215727159, pktlen=1200, pkt=<optimized
> out>, pi=0xffff62c0cd00,
> path=0xffff62c0cd10, conn=<optimized out>) at
> contrib/libngtcp2/ngtcp2/lib/ngtcp2_conn.c:10308
> #16 knot_quic_handle (table=table@entry=0xffff04000d40,
> reply=reply@entry=0xffff62c0d060, idle_timeout=<optimized out>,
> out_conn=out_conn@entry=0xffff62c0d008) at libknot/quic/quic.c:718
> #17 0x0000aaaaba698244 in quic_handler (p_ecn=<optimized out>,
> mh_out=0xffff62164298, rx=<optimized out>, table=0xffff04000d40,
> idle_close=<optimized out>, layer=0xffff62c0d228,
> params=0xffff62c0d010) at knot/server/quic-handler.c:90
> #18 udp_mmsg_handle (ctx=0xffff62c0d228, iface=0xaaaad1487c10,
> d=0xffff62164010) at knot/server/udp-handler.c:359
> #19 0x0000aaaaba698ae4 in udp_master (thread=0xaaaad151b1d0) at
> knot/server/udp-handler.c:624
> #20 0x0000aaaaba68e330 in thread_ep (data=0xaaaad151b1d0) at
> knot/server/dthreads.c:137
> #21 0x0000ffff848d2e00 in start_thread (arg=0x10e540) at
> pthread_create.c:443
> #22 0x0000ffff8493d49c in thread_start () at
> ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
>
> (gdb)
>
>
> gnutls 3.8.3-9.el9.0.1 baseos
>
> ngtcp2 1.15.1-2.el9 epel
>
>
> Best Regards,
>
> SUN Guonian
>
>>
>> Daniel
>>
>>>
>>> Thanks!
>>>
>>>
>>> Best Regards,
>>>
>>> SUN Guonian
>>>
>>>>
>>>> Thanks!
>>>>
>>>>>
>>>>> Thanks !
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> SUN Guonian
>>>>>
>>>>> On 2026/1/4 19:05, Daniel Salzman wrote:
>>>>>> Hello!
>>>>>>
>>>>>> You probably have to increase
>>>>>>
>>>>>> https://www.knot-dns.cz/docs/latest/singlehtml/index.html#quic-outbuf-max-s…
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>> On 1/4/26 10:46, SUN Guonian via knot-dns-users wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I have tried to use QUIC for zone transfers and ran into an error
>>>>>>> with a larger zone.
>>>>>>>
>>>>>>> The master's log shows:
>>>>>>> 2026-01-04T17:32:19+0800 debug: [foo.] ACL, allowed, action
>>>>>>> transfer, remote 10.0.0.147@60880 QUIC cert-key
>>>>>>> xJKsDkUqpl6orXeTwsrDgDvgZ/PiYxOSVlOkVdn5EOU=
>>>>>>> 2026-01-04T17:32:19+0800 info: [foo.] IXFR, outgoing, remote
>>>>>>> 10.0.0.147@60880 QUIC, incomplete history, serial 2026010403,
>>>>>>> fallback to AXFR
>>>>>>> 2026-01-04T17:32:19+0800 debug: [foo.] ACL, allowed, action
>>>>>>> transfer, remote 10.0.0.147@60880 QUIC cert-key
>>>>>>> xJKsDkUqpl6orXeTwsrDgDvgZ/PiYxOSVlOkVdn5EOU=
>>>>>>> 2026-01-04T17:32:19+0800 info: [foo.] AXFR, outgoing, remote
>>>>>>> 10.0.0.147@60880 QUIC, started, serial 2026010404
>>>>>>> 2026-01-04T17:32:20+0800 info: [foo.] AXFR, outgoing, remote
>>>>>>> 10.0.0.147@60880 QUIC, buffering finished, 0.87 seconds, 7390
>>>>>>> messages, 124493148 bytes
>>>>>>> 2026-01-04T17:32:20+0800 notice: QUIC, terminated connections,
>>>>>>> outbuf limit 1
>>>>>>>
>>>>>>> On the slave side, the log shows:
>>>>>>> 2026-01-04T17:32:18+0800 info: [foo.] zone file loaded, serial
>>>>>>> 2026010403
>>>>>>> 2026-01-04T17:32:19+0800 info: [foo.] loaded, serial none ->
>>>>>>> 2026010403, 92000117 bytes
>>>>>>> 2026-01-04T17:32:19+0800 info: [foo.] refresh, remote
>>>>>>> 10.0.0.151@853, remote serial 2026010404, zone is outdated
>>>>>>> 2026-01-04T17:32:19+0800 info: server started
>>>>>>>
>>>>>>> (And knotd on the slave goes down without logging anything.)
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>>
>>>>>>> My test environment:
>>>>>>>
>>>>>>> The zone has 1,000,000 x (2 NS + 2 A) records, for example:
>>>>>>> domain00000000 3600 NS ns1.domain00000000
>>>>>>> 3600 NS ns2.domain00000000
>>>>>>> ns1.domain00000000 3600 A 10.0.0.1
>>>>>>> ns2.domain00000000 3600 A 10.0.0.2
>>>>>>> ...
>>>>>>> domain00999999 3600 NS ns1.domain00999999
>>>>>>> 3600 NS ns2.domain00999999
>>>>>>> ns1.domain00999999 3600 A 10.0.0.1
>>>>>>> ns2.domain00999999 3600 A 10.0.0.2
>>>>>>>
>>>>>>> If I decrease the number of records to 500,000 x (2 NS + 2 A),
>>>>>>> the zone is transferred over QUIC successfully.
>>>>>>>
>>>>>>> With traditional TCP and TLS, the zone transfer completes
>>>>>>> without error, even for larger zones.
>>>>>>>
>>>>>>> The version on both master and slave is 3.5.2, installed from Copr.
>>>>>>> The OS on both sides is Rocky 9 x86_64.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> SUN Guonian
>>>>>>>
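
The byte counts in the logs of this thread can be compared with the output buffer limit by simple arithmetic. The sketch below extracts the "buffering finished" size from a master log entry and compares it with quic-outbuf-max-size. It assumes the default quoted in the thread (100M), that M means 1024 * 1024 bytes, and that the limit is compared against the bytes buffered for one transfer. parse_buffered_bytes and exceeds_outbuf are hypothetical helper names.

```python
import re

# Assumption: default quic-outbuf-max-size of 100M, with M = 1024 * 1024 bytes.
DEFAULT_OUTBUF_MAX = 100 * 1024 * 1024

BUFFERED_RE = re.compile(
    r"buffering finished, [\d.]+ seconds, (\d+) messages, (\d+) bytes"
)

def parse_buffered_bytes(log_text: str) -> int:
    """Return the byte count of a 'buffering finished' entry (line wraps allowed)."""
    unwrapped = " ".join(log_text.split())  # undo the mail client's line wrapping
    match = BUFFERED_RE.search(unwrapped)
    if match is None:
        raise ValueError("no 'buffering finished' entry found")
    return int(match.group(2))

def exceeds_outbuf(buffered: int, limit: int = DEFAULT_OUTBUF_MAX) -> bool:
    """True if the buffered transfer is larger than the output buffer limit."""
    return buffered > limit

if __name__ == "__main__":
    failed = """2026-01-04T17:32:20+0800 info: [foo.] AXFR, outgoing, remote
    10.0.0.147@60880 QUIC, buffering finished, 0.87 seconds, 7390
    messages, 124493148 bytes"""
    size = parse_buffered_bytes(failed)
    print(size, exceeds_outbuf(size))
    print(62246340, exceeds_outbuf(62246340))
```

Under these assumptions the 124493148-byte transfer of the 1,000,000-name zone is above the default limit, while the 62246340-byte transfers reported later are below it, which is consistent with the "outbuf limit" notice appearing only in the first report.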