Hi Matthijs,

On 22 August 2016 at 06:21, Matthijs Mekking <matthijs@pletterpet.nl> wrote:
Hi Marek,

Thanks for your pointers, I really appreciate it.

On 15-08-16 19:27, Marek Vavruša wrote:
Hey Matthijs,

On 15 August 2016 at 06:32, Matthijs Mekking <matthijs@pletterpet.nl> wrote:

    Hi Jan,

    Thanks for your response. Some comments inline:

    On 15-08-16 14:29, Jan Včelák wrote:

        Hi Matthijs,

        processing of queries in Knot DNS is synchronous. So the UDP
        thread is
        blocked until the query processing is finished. This usually doesn't
        matter for authoritative server because the server can construct the
        response immediately.

        For dnsproxy, this is a problem. The dnsproxy module establishes
        a TCP
        connection to the resolver and waits for the answer. During that
        time
        the UDP handler thread is just idling.


    The dnsproxy AFAICS will use the same transport protocol as the
    incoming query. In our tests this is all UDP. So I don't think any
    TCP connection is established.

    Also, the query is forwarded to an authoritative server nearby, not
    a resolver, which should be a lot faster. There is still idle time
    of course, and perhaps too much.


Even if it's nearby, it's the latency that kills it. The number of
workers in Knot is roughly the number of cores, and that is also the
ceiling on proxied requests in flight. Resolvers, proxies and LBs use
coroutines, event loops or pipelines to resolve queries asynchronously,
so they can keep three orders of magnitude more requests in flight.
Someone was working on reimplementing the proxy module with per-worker
libuv loops a few months ago; I don't know what the status is now, but
that would help a lot.
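To make that concrete, here is a tiny standalone sketch (Python generators standing in for coroutines; this is purely illustrative, not Knot code): one thread parks each pending query instead of blocking on it, so it can keep a thousand in flight where a blocking worker holds exactly one.

```python
def query(i):
    """One forwarded query: 'send' it, then park until the reply arrives."""
    yield "pending"            # suspended while the upstream works
    return f"answer-{i}"       # resumed once the 'reply' is in

# One thread 'sends' 1000 queries; each parks at its yield instead of
# blocking the thread the way a synchronous worker would.
inflight = []
for i in range(1000):
    g = query(i)
    next(g)                    # runs up to the yield: query is now pending
    inflight.append(g)

# Later, as 'replies' arrive, resume and finish each pending query.
answered = 0
for g in inflight:
    try:
        next(g)
    except StopIteration:      # generator returned: query completed
        answered += 1

print(answered)                # -> 1000, all handled by a single thread
```

A real event loop would resume queries in whatever order replies arrive, but the bookkeeping is the same idea.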



        I'm afraid this couldn't be fixed easily without deep changes
        in the knotd architecture.

        Anyway, we are interested in your discoveries.


    Me too.

    Best regards,
      Matthijs


Perhaps you can try increasing the worker count to something like 100?
It should help. Honestly, if you're looking for a public-facing DNS proxy

I tried bumping the worker count, it didn't make much impact.

Bummer, looks like I was wrong.
 

with filtering and decent performance, you'll be better served by
something else. RRL (or rate limits for uncached responses) is
relatively easy to implement in the DNS firewall in Knot Resolver (have
a look at the 1.1 release post), compared to reimplementing the DNS
proxy module in the authoritative server.

I have tried the knot-resolver as a proxy, with the following policy configuration:

    modules = { 'policy' }
    policy.add(policy.all(policy.FORWARD('192.168.53.133')))

It indeed behaves much better than mod-dnsproxy: I am getting around 52K QPS (compared to 7K with the Knot dnsproxy). Still, I am trying to reach twice that.

This is on a 16-core machine, with 16 forks. I disabled the cache.

Are there any other performance tuning possibilities I should be aware of?


Best regards,
  Matthijs


Sounds about right. FORWARD treats the upstream like any other
authoritative server (it opens a new socket to get source port
randomisation, probes capabilities, falls back to TCP, etc.), which is
still somewhat expensive. It's built this way to reuse the default code
path, but it could take a shortcut and reuse a connected socket instead,
since in a simple proxy use case you don't have to worry about a man in
the middle. If you could help me benchmark it, that would be very
helpful (point anonymised metrics at my endpoint / allow a Prometheus
scraper / share a perf report).
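The difference looks roughly like this (an illustrative Python sketch against a local stand-in upstream, not the actual kresd internals):

```python
import socket

# A local stand-in 'upstream' so the sketch is self-contained.
upstream = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
upstream.bind(("127.0.0.1", 0))
upstream_addr = upstream.getsockname()

def serve_one():
    # The stand-in upstream echoes one datagram back to its sender.
    wire, client = upstream.recvfrom(512)
    upstream.sendto(wire, client)

# Per-query path (roughly what FORWARD does today): a fresh socket per
# query, so the kernel assigns a new random source port every time.
def forward_per_query(wire):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.sendto(wire, upstream_addr)
    serve_one()
    reply = s.recv(512)
    s.close()
    return reply

# Proxy shortcut: connect once and reuse the socket (and source port)
# for every query to the one fixed upstream.
proxied = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
proxied.connect(upstream_addr)

def forward_reused(wire):
    proxied.send(wire)
    serve_one()
    return proxied.recv(512)

print(forward_per_query(b"q1"), forward_reused(b"q2"))
```

Source port randomisation protects against off-path spoofing, which is why the per-query path is the safe default; the reused socket only makes sense when the upstream is a single trusted backend on a trusted network.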

I think you'll want caching at some point (with capped TTL) if you want to get close to backend speed.
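For reference, the caching variant of your config might look like this (a sketch: `cache.size` is standard, but `cache.max_ttl()` may not exist in your release, so check the kresd docs for your version first):

```lua
-- Forwarding proxy with a small cache in front of the backend.
modules = { 'policy' }
policy.add(policy.all(policy.FORWARD('192.168.53.133')))

cache.size = 100 * MB      -- keep the hot set in memory
-- In releases that support it, cap how stale a cached answer may get:
-- cache.max_ttl(5)        -- serve from cache for at most 5 seconds
```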

Best,
Marek
 


Marek





        Cheers,

        Jan

        On 15.8.2016 14:12, Matthijs Mekking wrote:

            Hi,

            I recently tested the mod-dnsproxy performance and I am
            disappointed in
            the results:

            Knot in our test setup can do ~320K QPS.

            When using our own proxy in front of knot, we take quite a
            performance hit, only able to do ~120K QPS.

            However, when configuring knot to use the mod-dnsproxy, the
            performance
            drops to ~7K QPS.

            I am planning to investigate what causes this significant
            drop, but if
            you have any insights or other measurements already I would
            love to hear
            about them.

            Best regards,
              Matthijs
            _______________________________________________
            knot-dns-users mailing list
            knot-dns-users@lists.nic.cz
            https://lists.nic.cz/cgi-bin/mailman/listinfo/knot-dns-users

