Hi!
On 1/5/26 08:06, SUN Guonian wrote:
Increasing quic-outbuf-max-size took effect: the transfer now works on x86_64.
But on aarch64 the behavior is very different.
Are there the same number of background workers on both architectures?
1. I increased quic-outbuf-max-size from the default 100M to 3200M, doubling it
each time; the master still logs:
2026-01-05T14:47:49+0800 info: [foo.] AXFR, outgoing, remote 10.1.136.156@37936 QUIC,
started, serial 2025123113
2026-01-05T14:47:49+0800 info: [foo.] AXFR, outgoing, remote 10.1.136.156@37936 QUIC,
buffering finished, 0.27 seconds, 2956 messages, 49797072 bytes
2026-01-05T14:49:13+0800 debug: QUIC, terminated inactive connections 1
2026-01-05T14:49:13+0800 debug: QUIC, terminated inactive connections 1
2026-01-05T14:49:19+0800 debug: [foo.] ACL, allowed, action transfer, remote
10.1.136.159@14324 QUIC cert-key TFg9ybqubTukNtMiFdn5jW61Y4VUPS9XmYxHsCeQ/4c=
2026-01-05T14:49:19+0800 info: [foo.] AXFR, outgoing, remote 10.1.136.159@14324 QUIC,
started, serial 2025123113
2026-01-05T14:49:19+0800 info: [foo.] AXFR, outgoing, remote 10.1.136.159@14324 QUIC,
buffering finished, 0.27 seconds, 2956 messages, 49797072 bytes
2026-01-05T14:49:23+0800 debug: QUIC, terminated inactive connections 1
Note that the reason for connection termination is different. The client was inactive.
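
For reference, the buffered sizes from the two reports can be compared against the configured limit with a quick calculation. This is only a sketch: it assumes the `M` suffix means MiB (1024 x 1024 bytes), and the helper name `fits` is made up for illustration.

```python
MIB = 1024 * 1024  # assumption: "M" in the configuration means MiB


def fits(buffered_bytes: int, limit_mib: int) -> bool:
    """Return True if the buffered transfer is within the outbuf limit."""
    return buffered_bytes <= limit_mib * MIB


# Sizes copied from the "buffering finished" log lines in this thread.
first_report = 124493148   # 2026-01-04 transfer, ended with "outbuf limit"
second_report = 49797072   # 2026-01-05 transfer, ended with "inactive connections"

print(fits(first_report, 100))    # over the default 100M
print(fits(second_report, 100))   # already under the default 100M
print(fits(second_report, 3200))  # far under the raised limit
```

Under that assumption the second transfer was never close to the buffer limit, which is consistent with the different termination reason in the log.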
2. One slave (10.1.136.156, rocky9) could get the zone fully; another
(10.1.136.159, rocky10) couldn't. Both are alive.
What does the server log?
3. The master (10.1.136.154, rocky9) crashed.
# ls -lrt /var/lib/systemd/coredump/
-rw-r-----. 1 root root 52527031 Jan 5 09:09
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.320376.1767575365000000.zst
-rw-r-----. 1 root root 52575905 Jan 5 10:06
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.320623.1767578803000000.zst
-rw-r-----. 1 root root 52546840 Jan 5 10:53
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.320970.1767581613000000.zst
-rw-r-----. 1 root root 52519031 Jan 5 13:33
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.321756.1767591216000000.zst
-rw-r-----. 1 root root 52519386 Jan 5 13:46
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.321896.1767591963000000.zst
-rw-r-----. 1 root root 52512650 Jan 5 14:23
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.322020.1767594182000000.zst
-rw-r-----. 1 root root 52553042 Jan 5 14:49
core.knotd.1000.eceeeff3d58f457db4014dc5f33e0fad.322309.1767595778000000.zst
Could you send me a backtrace? Something like
coredumpctl gdb knotd \
  -A '--batch -ex "set pagination off" -ex "thread apply all bt full"'
Thanks!
Thanks !
Best Regards,
SUN Guonian
On 2026/1/4 19:05, Daniel Salzman wrote:
> Hello!
>
> You probably have to increase https://www.knot-dns.cz/docs/latest/singlehtml/index.html#quic-outbuf-max-s…
>
> Daniel
>
> On 1/4/26 10:46, SUN Guonian via knot-dns-users wrote:
>> Greetings,
>>
>> I have tried to use QUIC for zone transfers and hit an error with a bigger zone.
>>
>> The master's log shows:
>> 2026-01-04T17:32:19+0800 debug: [foo.] ACL, allowed, action transfer, remote 10.0.0.147@60880 QUIC cert-key xJKsDkUqpl6orXeTwsrDgDvgZ/PiYxOSVlOkVdn5EOU=
>> 2026-01-04T17:32:19+0800 info: [foo.] IXFR, outgoing, remote 10.0.0.147@60880 QUIC, incomplete history, serial 2026010403, fallback to AXFR
>> 2026-01-04T17:32:19+0800 debug: [foo.] ACL, allowed, action transfer, remote 10.0.0.147@60880 QUIC cert-key xJKsDkUqpl6orXeTwsrDgDvgZ/PiYxOSVlOkVdn5EOU=
>> 2026-01-04T17:32:19+0800 info: [foo.] AXFR, outgoing, remote 10.0.0.147@60880 QUIC, started, serial 2026010404
>> 2026-01-04T17:32:20+0800 info: [foo.] AXFR, outgoing, remote 10.0.0.147@60880 QUIC, buffering finished, 0.87 seconds, 7390 messages, 124493148 bytes
>> 2026-01-04T17:32:20+0800 notice: QUIC, terminated connections, outbuf limit 1
>>
>> On the slave side, the log shows:
>> 2026-01-04T17:32:18+0800 info: [foo.] zone file loaded, serial 2026010403
>> 2026-01-04T17:32:19+0800 info: [foo.] loaded, serial none -> 2026010403, 92000117 bytes
>> 2026-01-04T17:32:19+0800 info: [foo.] refresh, remote 10.0.0.151@853, remote serial 2026010404, zone is outdated
>> 2026-01-04T17:32:19+0800 info: server started
>>
>> (And knotd on the slave goes down without logging anything.)
>>
>> Thanks in advance.
>>
>>
>> My testing environment is,
>>
>> the zone size is 1,000,000 x ( 2 NS + 2 A ), such as,
>> domain00000000 3600 NS ns1.domain00000000
>> 3600 NS ns2.domain00000000
>> ns1.domain00000000 3600 A 10.0.0.1
>> ns2.domain00000000 3600 A 10.0.0.2
>> ...
>> domain00999999 3600 NS ns1.domain00999999
>> 3600 NS ns2.domain00999999
>> ns1.domain00999999 3600 A 10.0.0.1
>> ns2.domain00999999 3600 A 10.0.0.2
>>
>> If I decrease the number of records to 500,000 x ( 2 NS + 2 A ), the zone can be transferred over QUIC successfully.
>>
>> Over traditional TCP and TLS, the zone transfer completes without error, even for larger zones.
>>
>> Both master and slave run version 3.5.2, installed from copr.
>> The OS on both sides is Rocky9 x86_64.
>>
>> Best Regards,
>> SUN Guonian
>>
>> --
>
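
To reproduce the test zone described in the original report, the records could be generated with a short script along these lines. This is a sketch, not the reporter's actual tooling: the function names `zone_records` and `write_records` and the output path are made up, the owner name is repeated on every line instead of using the blank-owner shorthand, and the SOA and apex NS records (not shown in the mail) still have to be added by hand.

```python
from typing import Iterator


def zone_records(count: int, ttl: int = 3600) -> Iterator[str]:
    """Yield 2 NS + 2 A records for each of `count` test delegations."""
    for i in range(count):
        name = f"domain{i:08d}"  # domain00000000, domain00000001, ...
        yield f"{name} {ttl} NS ns1.{name}"
        yield f"{name} {ttl} NS ns2.{name}"
        yield f"ns1.{name} {ttl} A 10.0.0.1"
        yield f"ns2.{name} {ttl} A 10.0.0.2"


def write_records(path: str, count: int) -> None:
    """Write the generated records to `path`, one record per line."""
    with open(path, "w") as f:
        for line in zone_records(count):
            f.write(line + "\n")
```

Calling `write_records("foo.records", 1_000_000)` would produce the record set for the failing case, and `500_000` the one that transferred successfully.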