Hello all,

 

I have a question regarding cache size and its usage. We are running knot-resolver on a Raspberry Pi 4 (as a secondary cache). Previously we had a 256 MB cache on tmpfs and hit an issue where the process crashed with “No space left on device (workdir '/var/lib/knot-resolver')”. I doubled the size to 512 MB, but the same issue occurs. My question is why it crashes even though, according to the logs, cache usage is only about 80 percent (the same was true with 256 MB). I thought that kres-cache-gc takes care of the cache size, so that the cache never fills up and writes to it never fail in a way that crashes the main process. Should we increase the cache size, or have we hit a bug? Any other suggestions as to what could cause the cache to become full and not be “cleared”?


Thank you for any tips, and have a nice day. Petr Kyselak


Config:

/etc/fstab

tmpfs        /var/cache/knot-resolver        tmpfs   rw,size=512m,uid=knot-resolver,gid=knot-resolver,nosuid,nodev,noexec,mode=0700 0 00


-- Cache size

cache.size = cache.fssize() - 10*MB
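
As a sanity check on the setting above, here is a small Python sketch of the same arithmetic. It assumes that cache.fssize() returns the full 512 MiB of the tmpfs mount and that MB means 1024*1024 bytes; both are assumptions on my side, and the function name is only illustrative. Under those assumptions the result equals the limit that kres-cache-gc reports in the logs below (526385152 bytes).

```python
MB = 1024 * 1024  # assumption: the MB constant in the kresd config is 2**20 bytes


def configured_cache_size(fs_size_bytes, reserve_bytes=10 * MB):
    """Plain-arithmetic mirror of `cache.size = cache.fssize() - 10*MB`."""
    return fs_size_bytes - reserve_bytes


tmpfs_size = 512 * MB  # size=512m from the /etc/fstab entry
limit = configured_cache_size(tmpfs_size)
print(limit)  # 526385152, the limit shown in the "Usage:" log lines
```

So the configured limit leaves only 10 MB of the tmpfs for anything besides the cache data itself.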


Logs:

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41 secs, 1038386 records, limit category is 100.

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: 0 records to be deleted using 0.00 MBytes of temporary memory, 0 records skipped due to memory limit.

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Deleted 0 records (0 already gone) types

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: It took 0.00 secs, 0 transactions (OK)

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Usage: 81.32% (428077056 / 526385152)

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41 secs, 1038420 records, limit category is 100.

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: 0 records to be deleted using 0.00 MBytes of temporary memory, 0 records skipped due to memory limit.

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Deleted 0 records (0 already gone) types

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: It took 0.00 secs, 0 transactions (OK)

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Usage: 81.32% (428077056 / 526385152)

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41 secs, 1038420 records, limit category is 100.

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: 0 records to be deleted using 0.00 MBytes of temporary memory, 0 records skipped due to memory limit.

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Deleted 0 records (0 already gone) types

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: It took 0.00 secs, 0 transactions (OK)

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Usage: 81.32% (428077056 / 526385152)

Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: Cache analyzed in 1.41 secs, 1038421 records, limit category is 100.

Jun  4 16:32:32 dns-cache-2 kresd[672]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[672]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 kresd[672]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get ./.cachelock; retry later

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get ./.cachelock; retry later

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get ./.cachelock; retry later

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@1.service: Main process exited, code=killed, status=11/SEGV

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get ./.cachelock; retry later

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get ./.cachelock; retry later

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@1.service: Failed with result 'signal'.

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@2.service: Main process exited, code=killed, status=11/SEGV

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing failed to get ./.cachelock; retry later

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@2.service: Failed with result 'signal'.

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -17

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing error, falling back

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] MDB_BAD_TXN, probably overfull

Jun  4 16:32:32 dns-cache-2 kresd[675]: [cache] clearing because overfull, ret = -28

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@1.service: Service RestartSec=100ms expired, scheduling restart.

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@1.service: Scheduled restart job, restart counter is at 1.

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@2.service: Service RestartSec=100ms expired, scheduling restart.

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@2.service: Scheduled restart job, restart counter is at 1.

Jun  4 16:32:32 dns-cache-2 kresd[30052]: [system] error while loading config: /usr/lib/knot-resolver/sandbox.lua:400: can't open cache path '/var/cache/knot-resolver'; working directory '/var/lib/knot-resolver'; No space left on device (workdir '/var/lib/knot-resolver')

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@1.service: Main process exited, code=exited, status=1/FAILURE

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@1.service: Failed with result 'exit-code'.

Jun  4 16:32:32 dns-cache-2 kresd[30051]: [system] error while loading config: /usr/lib/knot-resolver/sandbox.lua:400: can't open cache path '/var/cache/knot-resolver'; working directory '/var/lib/knot-resolver'; No space left on device (workdir '/var/lib/knot-resolver')

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@2.service: Main process exited, code=exited, status=1/FAILURE

Jun  4 16:32:32 dns-cache-2 systemd[1]: kresd@2.service: Failed with result 'exit-code'.
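
For reference, the usage figure can be re-derived from the byte counts in the kres-cache-gc "Usage:" lines above. The following is only a rough Python sketch; the regular expression is my guess at the log format based on the lines shown here, not an official format description, and the helper name is hypothetical.

```python
import re

# Assumed format, taken from the log lines above:
#   "Usage: 81.32% (428077056 / 526385152)"
USAGE_RE = re.compile(r"Usage: ([\d.]+)% \((\d+) / (\d+)\)")


def parse_usage(line):
    """Return (used_bytes, limit_bytes, percent) or None if the line does not match."""
    m = USAGE_RE.search(line)
    if m is None:
        return None
    used, limit = int(m.group(2)), int(m.group(3))
    return used, limit, 100.0 * used / limit


line = ("Jun  4 16:32:05 dns-cache-2 kres-cache-gc[548]: "
        "Usage: 81.32% (428077056 / 526385152)")
used, limit, pct = parse_usage(line)
free_mib = (limit - used) / (1024 * 1024)
print(f"{pct:.2f}% used, {free_mib:.1f} MiB below the limit")
```

By this calculation the last garbage-collector report before the crash was still more than 90 MiB below the configured limit, which is why the "overfull" errors surprise me.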