On 11. 05. 19 4:39, Vladimír Čunát wrote:
On 5/10/19 9:27 PM, Christoph wrote:
Just to make sure I understood you correctly:
If the cache reaches cache.size, kresd will flush the **entire** cache?
Yes, correct. I agree it's surprising.
> How do people work around this limitation?
With sufficient memory for the cache it is very unlikely to happen, so it
has not been a practical problem up until now.
I'm curious:
How much memory are you using, and how big is your user population?
Place kresd behind a caching resolver that does cache housekeeping?
nginx -> doh-httpproxy -> knot-resolver -> unbound ?
this chain is getting longer and longer ;)
I certainly haven't heard of anyone doing something similar. It sounds
possible, but I don't think it's worth it.
That really does not make much sense. You could also double the amount
of memory and lower the probability of a cache flush significantly.
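In practice that means giving cache.size a generous value in the kresd
configuration. A rough sketch, where the config path, the 8 GB figure and
the unit name are assumptions for a typical packaged install:

  # raise the cache ceiling and restart the resolver instance(s)
  echo 'cache.size = 8 * GB' >> /etc/knot-resolver/kresd.conf
  systemctl restart kresd@1.service   # repeat for each instance you run

As far as I understand, the cache is a memory-mapped LMDB file, so a large
limit does not by itself consume that much RAM; only the pages actually
holding records do.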
Can you say anything about when this feature will be available in a
released version of kresd?
I personally can't promise anything. In any case, you can watch the
issue:
https://gitlab.labs.nic.cz/knot/knot-resolver/issues/257
We have prototype code for the garbage collector, so we might polish
it a bit and provide you with instructions on how to run it.
Testing it would tremendously help to expedite the process, because
garbage collection is very dependent on the deployment, so a first user
is exactly what we need at the moment!
Just let me know if you want to test it in your environment.
Petr Špaček @ CZ.NIC
but so far typical deployments can afford setting the limit so large that
it only fills up very rarely.
Even if it happens rarely (let's say you have enough memory for 3 weeks'
worth of traffic), that will also result in slow response times (due to
the empty cache) every 3 weeks when the cache gets cleared, if I
understood you correctly.
Yes, the few seconds after clearing will be noticeably slower (I
believe). Records of the same name and type get updated in-place (with
a short overlap, sometimes), so for each GB you'll typically need
millions of *different* records to fill it - and that's not as fast as
one might expect, because people mostly visit a not-too-huge set of
sites, and these sets tend to overlap heavily. You could try it with your
traffic and see how fast it grows (the `du` command should be accurate
until we have the GC).
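A minimal way to watch that growth, assuming the default cache directory
of packaged installs (adjust the path if yours differs); as rough
arithmetic, at a few hundred bytes per entry one GB corresponds to
something like 2-5 million distinct records:

  # check the on-disk size of the cache, then again after a day of traffic
  du -sh /var/cache/knot-resolver
  # or keep an eye on it: watch -n 60 du -sh /var/cache/knot-resolver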
Now that I think of it, zram might be usable to extend the capacity in
exchange for a bearable latency hit on rarely used records, but I haven't
tried to use it this way. Swapping to an SSD might also work, I guess; directly
placing the cache on an SSD would be more likely to cause too many
writes, though it's possible almost all will be sequential writes and
thus not too bad.
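If anyone wants to try the zram route, an untested sketch of the swap
variant (device size and compression algorithm are arbitrary picks here):

  # compressed-RAM swap; rarely used cache pages can be swapped out
  # at a modest latency cost instead of shrinking the cache
  modprobe zram
  echo lz4 > /sys/block/zram0/comp_algorithm   # if the kernel supports lz4
  echo 4G > /sys/block/zram0/disksize
  mkswap /dev/zram0
  swapon -p 100 /dev/zram0                     # prefer zram over any disk swap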
--Vladimir