Hi Olafur and Marek,
On 29 Jan 2014, at 17:21 , Olafur Gudmundsson <ogud(a)ogud.com> wrote:
On Jan 29, 2014, at 10:54 AM, Marek Vavruša <marek.vavrusa(a)nic.cz> wrote:
> Hi Johan,
>
> On 29 January 2014 16:34, Johan Ihrén <johani(a)johani.org> wrote:
>> Hi Jan,
>>
>> On 27 Jan 2014, at 16:26 , Jan Včelák <jan.vcelak(a)nic.cz> wrote:
>>
>>> Today, CZ.NIC Labs proudly announce the Knot DNS 1.4.2.
>>
>> Congratulations!
>>
>>> There are quite a lot of changes:
>>
>> Now, that's a surprise ;-)
>>
>>> * We also fixed several problems in DNSSEC. Firstly, the 'knotc signzone'
>>> command was broken and caused a deadlock of the main server thread. It does
>>> not happen with the new version.
>>
>> And here my only comment is (as you well know by now) that I like simple
>> things, because more complicated things break in new and inventive ways. An
>> authoritative server that didn't try to sign zones would never have had a
>> deadlock like this.
>>
>> That said, I do understand that the signing stuff is brand new, and of
>> course new code has bugs, and as the bugs are found they get fixed, which
>> is good.
>>
>> But there will be more bugs in Knot because of this added complexity than
>> otherwise. And this is a concern to me.
>
> I know and I sort of get where this is coming from. I'd just like to
> say that it doesn't ring hollow. I reckon we are nearing
> 'feature-complete' (as of today), not quite there but almost. So our
> focus is shifting from features to polishing features we already have
> and improving the feature testing and this is going to be a big thing
> for the upcoming releases. I agree that some things are made too
> complicated for very little outcome, but it's a continuous effort to
> balance the demand for features and simplicity. I think neither
> extreme is good and can only hope we can find a sweet spot between
> those some time down the road.
I agree with this, I also believe there's a sweet spot somewhere in the middle. But,
as we have discussed previously, I also believe that the sweet spot is a question of,
let's call it "system functionality", i.e. the "Knot system" should be able to
provide the functionality that is required and wanted by users.
That is not necessarily the same thing as having it all in the same binary ;-)
But I really don't want to make too much of this, I'm much impressed by what you
guys have achieved in a very short time frame, and I'm much looking forward to the
polishing phase.
> A good measure is the codebase size. If
> it gets smaller with the further releases, I'm going to be a happy man
> :)
True that. We think alike ;-)
Secondly, prior to this release, the signatures were refreshed two hours
before their expiration, which was found to be extremely insufficient. With
the new release, signatures are refreshed one tenth of the signature
lifetime before their expiration. With the default configuration, the
signature lifetime is 30 days, which implies that the signatures are
refreshed three days before the expiration.
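
The arithmetic in the announcement can be checked with a few lines of Python. This is only a sketch: the function name and the "divisor" parameter are made up for illustration and are not taken from Knot DNS.

```python
from datetime import timedelta

def refresh_lead(sig_lifetime: timedelta, divisor: int = 10) -> timedelta:
    """Hypothetical helper: how long before expiration a signature is
    refreshed when the lead time is 1/divisor of the signature lifetime."""
    return sig_lifetime / divisor

# Behaviour before this release, as described in the announcement.
old_lead = timedelta(hours=2)

# New behaviour with the default 30-day signature lifetime.
new_lead = refresh_lead(timedelta(days=30))

print("old lead:", old_lead)
print("new lead:", new_lead)
```

With the 30-day default, the new lead time works out to three days, compared with the previous fixed two hours.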
In this particular area I think BIND9 has it right. To begin with, BIND9 uses 1/4 of the
signature lifetime as the default for when to resign. In addition to that there is a
configuration parameter called "resigning interval" which specifies the amount
of "remaining lifetime" in the signature before it will get resigned.
I.e. with a signature lifetime of ten days and a resigning interval of four days the zone
will get resigned every six days if nothing else changes.
This makes a lot of sense, because a fixed percentage of the signature lifetime simply
doesn't work for very long or very short lifetimes.
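
To make the numbers above concrete, here is a small Python sketch. It assumes nothing about BIND9 internals; the function and variable names are invented for illustration only.

```python
from datetime import timedelta

def resign_period(sig_lifetime: timedelta, resign_interval: timedelta) -> timedelta:
    """Hypothetical helper: time between two re-signings if nothing else
    changes, given the remaining lifetime at which a signature is redone."""
    if not timedelta(0) < resign_interval < sig_lifetime:
        raise ValueError("resign interval must be between 0 and the lifetime")
    return sig_lifetime - resign_interval

# The example from the text: ten-day lifetime, four-day resigning interval.
period = resign_period(timedelta(days=10), timedelta(days=4))

# A 1/4 default applied to a 30-day lifetime.
quarter_default = timedelta(days=30) / 4

# A fixed fraction (here 1/10) at the extremes of signature lifetime.
short_lead = timedelta(days=1) / 10    # very short lifetime
long_lead = timedelta(days=365) / 10   # very long lifetime

print(period, quarter_default, short_lead, long_lead)
```

The last two values show why a pure fraction is awkward: for a one-day lifetime the lead time shrinks to a couple of hours, while for a one-year lifetime it grows to more than a month.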
Makes sense, I think Jan might tell more about the plans for future?
In my experience, a policy that only looks at signature lifetime will lead to validation
failures in some cases, in particular when TTLs are long and/or the signature lifetime
is short.
The only "safe" policy is to refresh signatures based on the greater of:
 * longest TTL in zone + fudge for distribution time to secondaries
 * x% of signature lifetime
If the longest TTL is too large, then you need a policy based on each RRset:
 max(TTL + fudge, x% of signature lifetime) + next check for signature refreshing
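
The policy above could be sketched in Python roughly as follows. All names, and the example values for fudge, percentage and check interval, are assumptions made for illustration; they do not come from any particular signer.

```python
from datetime import timedelta

def zone_refresh_lead(longest_ttl: timedelta, fudge: timedelta,
                      sig_lifetime: timedelta, percent: int) -> timedelta:
    """Lead time before expiration at which signatures are refreshed:
    the greater of (longest TTL + fudge) and percent% of the lifetime."""
    return max(longest_ttl + fudge, sig_lifetime * percent / 100)

def rrset_refresh_lead(ttl: timedelta, fudge: timedelta,
                       sig_lifetime: timedelta, percent: int,
                       check_interval: timedelta) -> timedelta:
    """Per-RRset variant, which also allows for the time until the
    signer next checks whether anything needs refreshing."""
    return max(ttl + fudge, sig_lifetime * percent / 100) + check_interval

# Illustrative values only (assumptions, not recommendations).
fudge = timedelta(hours=1)
lifetime = timedelta(days=10)

long_ttl_lead = zone_refresh_lead(timedelta(days=3), fudge, lifetime, 10)
short_ttl_lead = zone_refresh_lead(timedelta(hours=1), fudge, lifetime, 10)
per_rrset_lead = rrset_refresh_lead(timedelta(days=3), fudge, lifetime, 10,
                                    timedelta(hours=6))

print(long_ttl_lead, short_ttl_lead, per_rrset_lead)
```

With a long TTL the TTL term dominates; with a short TTL the percentage of the signature lifetime takes over.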
I must admit that my memory is getting fuzzy (or perhaps fudgy?) on this issue, but I do
remember that we debated this intensely some years ago. Wasn't the conclusion that the
TTL should be capped to the remaining signature lifetime *by the validator* (or rather by
whoever intends to cache stuff for future use)?
I.e. the place for alignment of remaining sig life and TTL is in the validator, because
that's when they are both converted into wall clock time (which can be compared). In
the signer end, you just don't know anything about how old the data will be when it
enters someone else's cache and that will force the need for "too much fudge".
Look at it this way: assume 10 days of sig life, a max TTL of 3 days (just to force the
issue) and a SOA expire of 8 days. Freshly signed zone transfers to slave. Connection to
master is broken. Slave keeps publishing for eight days. Just before expiration validator
queries for stuff, which will have remaining sig life = 2 days and TTL = 3 days. I.e.
unless the validator/cache caps the TTL to remaining life it will keep data with expired
sigs around, which is not good.
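
The scenario above, worked through as a Python sketch. The helper name is invented, and times are expressed as offsets from the moment of signing; this is an illustration of the capping idea, not code from any validator.

```python
from datetime import timedelta

DAY = timedelta(days=1)

def capped_ttl(ttl: timedelta, remaining_sig_life: timedelta) -> timedelta:
    """Hypothetical helper: TTL a cache should use, never longer than the
    remaining signature lifetime and never negative."""
    return max(timedelta(0), min(ttl, remaining_sig_life))

sig_lifetime = 10 * DAY   # signature valid for 10 days after signing
max_ttl = 3 * DAY         # TTL of the record being cached
soa_expire = 8 * DAY      # slave keeps serving this long without the master

# The validator queries just before the slave expires the zone.
query_time = soa_expire
remaining = sig_lifetime - query_time                         # 2 days left

uncapped_until = query_time + max_ttl                         # day 11
capped_until = query_time + capped_ttl(max_ttl, remaining)    # day 10

print("uncapped cache entry lives until:", uncapped_until)
print("capped cache entry lives until:  ", capped_until)
```

Without the cap the cache entry outlives the signature by one day; with the cap it goes away exactly when the signature expires.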
My point is not that this is a good signing policy, but rather that you cannot fix this
with the "resigning interval", because the signer may keep resigning the zone
until the cows come home with no effect, because the connectivity between master and rest
of the system is broken. The validator OTOH can trivially avoid problems by capping TTLs.
Hence the validation failures that concern you (which are real) cannot be fully mitigated
by any resigning policy.
Johan