Hi there,
the retry interval is already progressive, but the current scheme is
probably not well suited to this issue.
The progression is currently set to random 0-30s increments, up to a maximum of 10 minutes.
The question now is what is sane; I'd like to avoid adding another configuration option here.
To start the ball rolling, the attached patch changes the settings so that
each retry interval is double the previous one plus jitter (30s).
I think it is a good compromise: it allows a few relatively fast tries
(about 3 within the first few minutes) and grows
into hours in about 7 tries. The interval cap is set at 24 hours, which
seems reasonable as the zone can always be refreshed using knotc. The
cut-off could probably be a bit more aggressive; that's the open
question.
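
For illustration only (this is not the attached patch), here is a minimal Python sketch of the scheme as described above: double the previous interval, add up to 30s of jitter, and cap at 24 hours. The names next_interval and schedule, the uniform jitter distribution, and the starting interval of 0 are assumptions made for the example; the message does not state them.

```python
import random

MAX_INTERVAL = 24 * 60 * 60  # 24-hour cap, as described in the message
JITTER_MAX = 30              # up to 30 s of jitter, as described in the message


def next_interval(previous, rng=random):
    """Double the previous interval (seconds), add jitter, apply the cap.

    Hypothetical helper; a uniform jitter distribution is an assumption.
    """
    jitter = rng.uniform(0, JITTER_MAX)
    return min(2 * previous + jitter, MAX_INTERVAL)


def schedule(tries, initial=0.0, rng=random):
    """Return the retry intervals for `tries` consecutive failures.

    `initial` is an assumed starting interval; the message does not give one.
    """
    intervals = []
    interval = initial
    for _ in range(tries):
        interval = next_interval(interval, rng)
        intervals.append(interval)
    return intervals


if __name__ == "__main__":
    # Seeded generator so the printed schedule is reproducible.
    for n, seconds in enumerate(schedule(15, rng=random.Random(1)), start=1):
        print(f"try {n:2d}: wait {seconds:8.0f} s ({seconds / 3600:5.2f} h)")
```

Under these assumptions, with the jitter pinned at its 30s maximum the waits are 30, 90, 210, 450, 930, 1890 and 3810 seconds, so the seventh wait is the first to exceed one hour, and the 24-hour cap is reached on the twelfth try. With smaller jitter the same points are reached a try or so later.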
Marek
On 22 December 2013 22:34, Anand Buddhdev <anandb(a)ripe.net> wrote:
Hi Knot developers,
I'm testing Knot 1.4.0-rc2, which is configured with 5167 zones, all
slaves. When I start Knot, it has to bootstrap all of them. It manages
to bootstrap 4331 of them, but for the other 832, I get SERVFAIL from
the master. Knot schedules retries for them within a 5-minute period,
with some jitter. But with 832 zones, they keep coming up for AXFR
continuously, and Knot keeps trying continuously.
I'd like to request an improvement to Knot's scheduler so that it tries
failing zones less and less frequently, to avoid being stuck in a retry
cycle. How about some kind of exponential back-off with a sane maximum
of something like 24 hours?
Before anyone asks why those 832 zones are SERVFAILing, I'll tell you.
They're not under my direct control, and I can't get the operators to
fix that easily, but I'm stuck with them, so I have to deal with them.
Regards,
Anand Buddhdev
RIPE NCC