Let me guess: VM3 is the one nearest to the customers? If you take it down for a while, does the memory leak occur on another VM?
Bc. Martin Doubrava
Hello to all,
First, a big thank you to all the devs who are doing a great job in providing a high-quality resolver.
I have 3 instances of knot-resolver 6 running on different POPs for an ISP, with similar setup and settings. They work in anycast: the VM closest to a customer is the one serving them. If one or two VMs fail, the traffic is automatically rerouted by anycast to the remaining ones.
VMs are all on Ubuntu 24.04.1 LTS - Linux 6.8 - x86_64.
They all worked well until Knot Resolver 6.0.9, when one of the 3 started to leak memory (curiously, only one).
Reverting to Knot Resolver 6.0.8 solved the problem for that VM; the other two were fine with Knot Resolver 6.0.9.
I have the exact same problem with Knot Resolver 6.0.10: two VMs are fine, and one VM still has the memleak.
So I have one VM on 6.0.8 (with the apt package on hold) and the other two on 6.0.10.
I cannot find what is different.
The config in /etc/knot-resolver/config.yaml is similar.
Only nsid differs between the VMs, along with the IPv6 address of the management (for Prometheus) and &private interfaces (aaaa:aaaa:aaaa::1, aaaa:aaaa:aaaa::2 and aaaa:aaaa:aaaa::3):
rundir: /run/knot-resolver
interface: aaaa:aaaa:aaaa::1@8453
storage: /var/cache/knot-resolver
# unencrypted private DNS on port 53
# unencrypted public DNS on port 53
# DNS over TLS on port 853
# DNS over HTTPS on port 443
cert-file: '/etc/knot-resolver/tls/dns.mydomain.com.fullchain.pem'
key-file: '/etc/knot-resolver/tls/dns.mydomain.com.privkey.pem'
- subnets: ['0.0.0.0/0', '::/0']
- subnets: ['127.0.0.0/8', '123.123.123.0/22', '111.111.111.111/32']
- subnets: ['::1/128', 'aaaa:aaaa::/32', 'bbbb:bbbb::/32']
dst-subnet: aaaa:aaaa:aaaa::1
- subnets: ['::1/128', 'aaaa:aaaa::/32', 'bbbb:bbbb::/32']
dst-subnet: aaaa:aaaa:aaaa::44
- subnets: ['::1/128', 'aaaa:aaaa::/32', 'bbbb:bbbb::/32']
dst-subnet: aaaa:aaaa:aaaa::64
- subtree: 10.in-addr.arpa
servers: [ 'aaaa:aaaa:ffff:ffff::1', '22.22.22.22' ]
servers: [ 'aaaa:aaaa:ffff:ffff::1', '22.22.22.22' ]
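To rule out an unnoticed difference between the configs, they could be compared mechanically. The following is only a minimal sketch: the function name `config_diff` and the sample texts are illustrative assumptions, and it compares the files as plain text (ignoring blank lines and comments) rather than parsing the YAML.

```python
import difflib


def config_diff(a_text, b_text, a_name="vm1", b_name="vm3"):
    """Return unified-diff lines between two config texts.

    Blank lines and comment-only lines are ignored, so only
    effective settings show up in the result.
    """
    def normalize(text):
        return [
            line.rstrip()
            for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        ]

    return list(
        difflib.unified_diff(
            normalize(a_text),
            normalize(b_text),
            fromfile=a_name,
            tofile=b_name,
            lineterm="",
            n=0,  # no context lines: show only what differs
        )
    )


if __name__ == "__main__":
    # Hypothetical sample texts; in practice, read each VM's config.yaml.
    vm1 = "rundir: /run/knot-resolver\ninterface: aaaa:aaaa:aaaa::1@8453\n"
    vm3 = "rundir: /run/knot-resolver\ninterface: aaaa:aaaa:aaaa::3@8453\n"
    for line in config_diff(vm1, vm3):
        print(line)
```

Anything reported beyond the expected nsid and interface address lines would be a candidate for the difference in behavior.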
Here is the memory behavior of the 3 VMs. Around 11am, I upgraded all 3 of them. Clearly, VM3 behaves differently and its used RAM keeps increasing.
I finally reverted to 6.0.8 for that VM (at the very end of the third graph) and all is fine.
Any idea of what is going on?
What do you need to help diagnose the issue?
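If per-process numbers would be useful, the growth could be quantified instead of read off the graphs. This is a minimal sketch under assumptions: the helper names `rss_kib` and `growth_per_hour` are made up for illustration, the PID of the process to watch has to be supplied by hand, and the `/proc` interface is Linux-specific.

```python
def rss_kib(pid):
    """Read the resident set size (VmRSS, in KiB) of a process from /proc."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise ValueError(f"no VmRSS entry for pid {pid}")


def growth_per_hour(samples):
    """Least-squares slope of (seconds, KiB) samples, in KiB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t * 3600


if __name__ == "__main__":
    # Hypothetical samples: (seconds since start, RSS in KiB).
    samples = [(0, 1000), (1800, 1500), (3600, 2000)]
    print(f"{growth_per_hour(samples):.0f} KiB/hour")
```

Collecting `(time, rss_kib(pid))` pairs at a fixed interval on each VM and comparing the slopes would show whether VM3 grows steadily or in steps.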
Thank you for your attention, and Best Regards,
Gabriel ROUSSEAU