Vladimir,
Huge progress!
During the night, our VM still on v6.0.8 crashed. Thankfully, exabgp did its job and the
anycast went directly to another VM.
This time, the second VM hit the memleak and rebooted (and the third took over while
the second restarted).
You can see the RAM usage increasing quite fast on VM1, then on VM2; generally, it
is cleared in time to prevent a crash, except around 6 am.

Now, the good news is that we have identified the source :)
We don’t know what the queries were, because it was DoH, but one of our customers was using
this:
https://github.com/0xERR0R/blocky
As soon as he stopped using it, no more memleaks or sawtooth graphs.
Sorry for the quick and dirty graph superposition below, but it shows the correlation:
– the blue line is the answer rate to this particular customer in pps,
– the purple line is the memory usage of the VM running knot resolver,
– before 10 am this VM was offline, so anything before is irrelevant.

The customer stopped his blocky around 10:40 (I believe he might have restarted it briefly
between 10:50 and 11:35).
So in our case, blocky was the culprit behind the knot resolver memleaks.
Other knot resolver users experiencing memleaks should check whether any requests are
coming from a blocky instance.
My next step will be to upgrade this VM and confirm that there are no memleaks anymore
even with version > 6.0.8 and libknot15.
All the Best,
Gabriel
On 3 Apr 2025 at 11:43, Vladimír Čunát via knot-resolver-users
<knot-resolver-users_at_lists_nic_cz_48qbhjm2vj0347_06aede6e(a)icloud.com> wrote:
On 02/04/2025 23.19, oui.mages_0w(a)icloud.com wrote:
So knot-resolver 6.0.8 with libknot15 seems to also trigger the memory leak I was
experiencing with knot-resolver 6.0.9+ by the unidentified traffic pattern (or whatever
is causing this).
Thanks, this is very interesting. I confirm that (for our Ubuntu 24.04 packages) libknot15
(i.e. knot 3.4) has been used exactly since 6.0.9, so the timing checks out, too. That's
just a matter of binary builds; even the latest versions can still be built with
libknot14 (3.3.x).
Have you looked into which libdnssec and libzscanner you have there? The thing is that
these two didn't change soname between knot 3.3 and 3.4, so here I see larger risks
than with libknot itself.
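One possible way to answer the question above is to look at what the resolver binary actually links against. The following is only a rough sketch: it assumes the daemon binary is called kresd and is on the PATH, that ldd is available, and the helper name linked_knot_libs is made up for illustration. It parses ldd output and keeps the entries for libknot, libdnssec and libzscanner.

```python
import re
import shutil
import subprocess

# Libraries of interest: the knot libraries discussed in this thread.
WANTED = ("libknot", "libdnssec", "libzscanner")


def linked_knot_libs(ldd_output: str) -> dict[str, tuple[str, str]]:
    """Map each wanted library name to (soname, resolved path) from ldd output."""
    found = {}
    for line in ldd_output.splitlines():
        # Typical ldd line: "<tab>libfoo.so.N => /path/to/libfoo.so.N (0x...)"
        match = re.match(r"\s*((lib[\w-]+)\.so(?:\.\d+)*)\s+=>\s+(\S+)", line)
        if match and match.group(2) in WANTED:
            found[match.group(2)] = (match.group(1), match.group(3))
    return found


if __name__ == "__main__":
    # Assumption: the resolver daemon binary is named "kresd" and is on PATH.
    kresd = shutil.which("kresd")
    if kresd is None or shutil.which("ldd") is None:
        print("kresd or ldd not found on this machine")
    else:
        result = subprocess.run(["ldd", kresd], capture_output=True, text=True)
        for name, (soname, path) in linked_knot_libs(result.stdout).items():
            print(f"{name}: {soname} -> {path}")
```

Since the soname of libdnssec and libzscanner did not change between knot 3.3 and 3.4, the soname alone will not tell the two apart; the resolved path can be passed to the package manager (for example dpkg -S on Ubuntu) to find the package and its version.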
--