Just for the record, the cause was a malformed changeset (SOA serial FROM equal to SOA
serial TO) in the journal.
Even previous versions (2.8.x) are unable to process this changeset, so the problem is not
related to the upgrade to 2.9.0.
We will add proper handling of such a broken journal.
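To illustrate the condition described above, here is a minimal sketch of a sanity check that flags a changeset whose SOA serial FROM equals its SOA serial TO. It is only an illustration under stated assumptions: the `Changeset` type and the `find_malformed` helper are hypothetical, the serial values are made up, and none of this reflects the actual Knot DNS journal format or code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Changeset:
    """Hypothetical stand-in for a journal changeset: only the two SOA serials."""
    serial_from: int
    serial_to: int


def find_malformed(changesets: Iterable[Changeset]) -> List[Changeset]:
    """Return the changesets whose SOA serial FROM equals SOA serial TO."""
    return [cs for cs in changesets if cs.serial_from == cs.serial_to]


# Made-up example data, not taken from the affected journal.
journal = [
    Changeset(2019101501, 2019101502),
    Changeset(2019101502, 2019101502),  # malformed: FROM == TO
    Changeset(2019101502, 2019101503),
]

malformed = find_malformed(journal)
print(malformed)
```

A check of this kind would let such an entry be reported or skipped instead of being processed like a normal changeset.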
Thanks go to Lukáš for the help with debugging.
Daniel
On 10/15/19 1:53 PM, Lukáš Kocourek wrote:
So the problem is solved for me. I tried to start with just one zone, which didn’t help,
but removing the journal did. Many thanks to Daniel Salzman for his kind help; maybe he
will post some more details after analyzing our journal records.
--
Lukas Kocourek
On 2019-10-15, at 12:27, knot-dns-users-request(a)lists.nic.cz wrote:
Message: 5
Date: Tue, 15 Oct 2019 11:56:26 +0200
From: Lukáš Kocourek <kocour(a)direcat.net>
To: knot-dns-users(a)lists.nic.cz
Subject: [knot-dns-users] 2.9.0 fails to start after upgrade
Message-ID: <110451DC-3CC1-419D-9066-EF784FDDAC00(a)direcat.net>
Content-Type: text/plain; charset=utf-8
Hi,
I’ve upgraded from 2.8.4 to 2.9.0 (Ubuntu 18.04, using ppa:cz.nic-labs/knot-dns-latest)
and now knotd fails to start.
After issuing the "service knot start" command, the knotd process immediately goes
to 100% CPU, and after 3 minutes it’s killed by systemd with "Start operation timed out".
Here is the log from the service start:
Oct 15 11:37:04 ns1-u18 knotc[4972]: Configuration is valid
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: Knot DNS 2.9.0 starting
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: loaded configuration file '/etc/knot/knot.conf'
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: using reuseport for UDP
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: binding to interface X.X.X.X@53
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: binding to interface ::@53
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: loading 199 zones
[cut] - info that all zones will be loaded
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: [xxx.yyy.] zone will be loaded
[/cut]
Oct 15 11:37:04 ns1-u18 knotd[4984]: info: starting server
Oct 15 11:38:34 ns1-u18 systemd[1]: knot.service: Start operation timed out. Terminating.
Oct 15 11:40:04 ns1-u18 systemd[1]: knot.service: State 'stop-sigterm' timed out. Killing.
Oct 15 11:40:04 ns1-u18 systemd[1]: knot.service: Killing process 4984 (knotd) with signal SIGKILL.
Oct 15 11:40:04 ns1-u18 systemd[1]: knot.service: Main process exited, code=killed, status=9/KILL
Oct 15 11:40:04 ns1-u18 systemd[1]: knot.service: Failed with result 'timeout'.
Any advice on what to do?
Thank you very much
--
Lukas Kocourek
------------------------------
Message: 6
Date: Tue, 15 Oct 2019 12:27:34 +0200
From: Daniel Salzman <daniel.salzman(a)nic.cz>
To: knot-dns-users(a)lists.nic.cz
Subject: Re: [knot-dns-users] 2.9.0 fails to start after upgrade
Message-ID: <99813b76-c350-442f-35c7-2447cb07acce(a)nic.cz>
Content-Type: text/plain; charset=utf-8
Hi Lukáš,
Could you try to temporarily disable some zones? We have no idea at the moment :-(
Would it be possible to show me a snippet of your configuration?
Daniel