Hello all,
I have a question regarding cache size and it’s using. We are running
knot-resolver on RaspberryPI 4 (as secondary cache). Previously we had
256MB cache as tmpfs and hit issue that process crashed due to “No space
left on device (workdir '/var/lib/knot-resolver')” I doubled the site to
512MB, but same issue occurs. My question is why it crash even from the
logs the usage of cache is about 80 percent (same in the case with 256MB).
I taught that kres-cache-gc is taking care about its size and it does not
allow to full the cache and prevent writing to it in the way that main
process crash. Should we increase the cache size or we hit a some bug? Any
other suggestion what could cause that cache is full and not “cleared”?
Thank you for tip and have a nice day. Petr Kyselak
Config:
/etc/fstab
tmpfs /var/cache/knot-resolver tmpfs
rw,size=512m,uid=knot-resolver,gid=knot-resolver,nosuid,nodev,noexec,mode=0700
0 00
-- Cache size
cache.size = cache.fssize() - 10*MB
Logs:
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41
secs, 1038386 records, limit category is 100.
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: 0 records to be deleted
using 0.00 MBytes of temporary memory, 0 records skipped due to memory
limit.
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Deleted 0 records (0
already gone) types
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: It took 0.00 secs, 0
transactions (OK)
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Usage: 81.32% (428077056 /
526385152)
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41
secs, 1038420 records, limit category is 100.
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: 0 records to be deleted
using 0.00 MBytes of temporary memory, 0 records skipped due to memory
limit.
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Deleted 0 records (0
already gone) types
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: It took 0.00 secs, 0
transactions (OK)
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Usage: 81.32% (428077056 /
526385152)
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41
secs, 1038420 records, limit category is 100.
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: 0 records to be deleted
using 0.00 MBytes of temporary memory, 0 records skipped due to memory
limit.
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Deleted 0 records (0
already gone) types
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: It took 0.00 secs, 0
transactions (OK)
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Usage: 81.32% (428077056 /
526385152)
Jun 4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41
secs, 1038421 records, limit category is 100.
Jun 4 16:32:32 dns-cache-2 kresd[672]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[672]: [cache] clearing error, falling back
Jun 4 16:32:32 dns-cache-2 kresd[672]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get
./.cachelock; retry later
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get
./.cachelock; retry later
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get
./.cachelock; retry later
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
…
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)1.service: Main process
exited, code=killed, status=11/SEGV
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get
./.cachelock; retry later
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get
./.cachelock; retry later
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
…
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)1.service: Failed with result
'signal'.
…
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)2.service: Main process
exited, code=killed, status=11/SEGV
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get
./.cachelock; retry later
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back
…
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)2.service: Failed with result
'signal'.
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -17
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back
…
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably
overfull
Jun 4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull,
ret = -28
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)1.service: Service
RestartSec=100ms expired, scheduling restart.
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)1.service: Scheduled restart
job, restart counter is at 1.
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)2.service: Service
RestartSec=100ms expired, scheduling restart.
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)2.service: Scheduled restart
job, restart counter is at 1.
Jun 4 16:32:32 dns-cache-2 kresd[30052]: [system] error while loading
config: /usr/lib/knot-resolver/sandbox.lua:400: can't open cache path
'/var/cache/knot-resolver'; working directory '/var/lib/knot-resolver';
No
space left on device (workdir '/var/lib/knot-resolver')
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)1.service: Main process
exited, code=exited, status=1/FAILURE
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)1.service: Failed with result
'exit-code'.
Jun 4 16:32:32 dns-cache-2 kresd[30051]: [system] error while loading
config: /usr/lib/knot-resolver/sandbox.lua:400: can't open cache path
'/var/cache/knot-resolver'; working directory '/var/lib/knot-resolver';
No
space left on device (workdir '/var/lib/knot-resolver')
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)2.service: Main process
exited, code=exited, status=1/FAILURE
Jun 4 16:32:32 dns-cache-2 systemd[1]: kresd(a)2.service: Failed with result
'exit-code'.