November 21, 2024

Oblivious DNS: Plugging the Internet’s Biggest Privacy Hole

by Annie Edmundson, Paul Schmitt, Nick Feamster

The recent news that Cloudflare is deploying its own recursive DNS resolver has once again raised hopes that users will enjoy improved privacy, since they can send DNS traffic encrypted to Cloudflare rather than to their ISP. In this post, we explain why this approach only moves your private data from the ISP to (yet another) third party. You might trust that third party more than your ISP, but you still have to trust them. We then present an alternative design, Oblivious DNS, that prevents you from having to make that choice at all.

The Domain Name System (DNS)

When your client turns a domain name like google.com into an IP address, it relies on a recursive DNS resolver to do so. The operator of that resolver sees both your IP address and the domains that you query.

When you (or any of your devices) access the Internet, the first step is typically to look up a domain name (e.g., “google.com”, “princeton.edu”) in the Domain Name System (DNS) to determine which Internet address to contact. The DNS is, in essence, a phone book for the Internet’s domain names.

Clients that you operate, including your browser, your smartphone, and any IoT device in your home, send a DNS query for each domain name to a so-called “recursive DNS resolver”. On a typical home network, the default recursive DNS resolver may be operated by your Internet service provider (ISP) (e.g., Comcast, Verizon). Other entities such as Google and Quad9 also operate “open” recursive resolvers that anyone can use, with the idea that these alternative recursive resolvers give users another option for resolving DNS queries other than their ISP. Such alternatives have been useful in the past for circumventing censorship.

DNS: The Internet’s Biggest Privacy Hole

DNS queries are typically sent in cleartext, and they can reveal significant information that an Internet user may want to keep private, including the websites that user is visiting, the IP address or subnet of the device that issued the initial query, and even the types of devices that a user has in his or her home network. For example, our previous research has shown that DNS lookups can be used to de-anonymize traffic from the Tor network.

Because the queries and responses are unencrypted, any third party who can observe communication between a client and a recursive resolver, or between a recursive resolver and an authoritative server, may also be able to observe various steps in the DNS resolution process.
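To make the cleartext exposure concrete, here is a minimal sketch in Go that hand-encodes a conventional DNS query in the wire format and then checks that the queried name is readable in the raw bytes. The helper names (buildQuery, leaksName) and the transaction ID are illustrative assumptions, not part of any library, and the encoder ignores details such as label-length limits.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"strings"
)

// buildQuery encodes a minimal DNS query for an A record in wire format.
// Hypothetical helper for illustration; it does not validate label lengths.
func buildQuery(id uint16, name string) []byte {
	var buf bytes.Buffer
	header := make([]byte, 12)
	binary.BigEndian.PutUint16(header[0:2], id)     // transaction ID
	binary.BigEndian.PutUint16(header[2:4], 0x0100) // flags: recursion desired
	binary.BigEndian.PutUint16(header[4:6], 1)      // one question
	buf.Write(header)
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		buf.WriteByte(byte(len(label))) // each label is length-prefixed
		buf.WriteString(label)          // and written as plain ASCII
	}
	buf.WriteByte(0)              // root label terminates the name
	buf.Write([]byte{0, 1, 0, 1}) // QTYPE = A, QCLASS = IN
	return buf.Bytes()
}

// leaksName reports whether a label of the queried name is readable
// in the raw packet bytes, as it would be to an on-path observer.
func leaksName(packet []byte, label string) bool {
	return bytes.Contains(packet, []byte(label))
}

func main() {
	pkt := buildQuery(0x1234, "princeton.edu")
	fmt.Printf("%d bytes on the wire, name visible: %v\n", len(pkt), leaksName(pkt, "princeton"))
}
```

Anyone who can capture this packet can read the domain directly; no decoding beyond skipping the 12-byte header is needed.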

Operators of recursive DNS resolvers (typically your ISP, but more generally whoever the user relies on to resolve recursive DNS queries, e.g., Google) may see individual IP addresses (which may correspond to an ISP subscriber, or perhaps an individual end-device) coupled with the fully qualified domain name that accompanies the query. Even authoritative servers can learn something about the client: extensions to DNS such as EDNS0 Client Subnet may reveal information about the user’s IP address or subnet to authoritative DNS servers higher in the DNS hierarchy.
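As a rough illustration of what a client-subnet extension can disclose, the following Go sketch truncates a client address to a prefix, which is the kind of value a recursive resolver could pass upstream. The function name clientSubnet, the /24 prefix length, and the example address (taken from the RFC 5737 documentation range) are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"net/netip"
)

// clientSubnet returns the network prefix that would be disclosed if the
// client's address were truncated to the given number of bits.
// Hypothetical helper; the prefix length is chosen by the caller.
func clientSubnet(client string, bits int) (netip.Prefix, error) {
	addr, err := netip.ParseAddr(client)
	if err != nil {
		return netip.Prefix{}, err
	}
	// Prefix keeps only the top `bits` bits and zeroes the rest.
	return addr.Prefix(bits)
}

func main() {
	subnet, err := clientSubnet("203.0.113.57", 24) // documentation address, not a real client
	if err != nil {
		panic(err)
	}
	fmt.Println("disclosed to authoritative servers:", subnet)
}
```

Even a truncated prefix narrows the client down to a small set of addresses, which is why the text treats it as a leak.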

Existing Approaches

Existing proposed standards, including DNS Query Name Minimization and DNS over TLS, protect certain aspects of user privacy.

Yet, these approaches do not prevent the operator of the recursive DNS server from learning which IP addresses are issuing queries for particular domain names—the fundamental problem with DNS privacy:

  • DNS Query Name Minimization provides a mechanism so that the DNS servers that are authoritative for different parts of the DNS name hierarchy do not learn the full DNS query. For example, a server that is authoritative for all of *.com would not necessarily learn about a query for maps.google.com, but would only learn that a client needs to resolve some name under google.com. Yet, this mechanism doesn’t prevent the recursive DNS resolver from learning the full DNS query and the IP address of the client that issued the query.
  • DNS over TLS provides a mechanism for encrypting DNS queries. Yet, even with DNS over TLS, the recursive resolver still needs to decrypt the initial query so that it can resolve the query for the client. It therefore does not prevent the recursive DNS resolver from learning the query and the IP address that sent the query.
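The effect of query name minimization can be sketched in a few lines of Go. The function below lists the name a minimizing resolver would reveal at each successive level of the hierarchy. It is a simplification: the name minimizedNames is hypothetical, and the sketch assumes a zone boundary at every label, which is not true of real zones.

```go
package main

import (
	"fmt"
	"strings"
)

// minimizedNames lists, from the top of the hierarchy downward, the query
// name a minimizing resolver would send at each level.
// Simplifying assumption: every label is its own zone.
func minimizedNames(fqdn string) []string {
	labels := strings.Split(strings.TrimSuffix(fqdn, "."), ".")
	names := make([]string, 0, len(labels))
	for i := len(labels) - 1; i >= 0; i-- {
		// Reveal one more label than the previous level saw.
		names = append(names, strings.Join(labels[i:], "."))
	}
	return names
}

func main() {
	for level, name := range minimizedNames("maps.google.com") {
		fmt.Printf("level %d sees: %s\n", level, name)
	}
}
```

Note that the function takes the full name as input: the recursive resolver performing the minimization still knows the complete query, along with the client's IP address.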

Third parties have recently been standing up new DNS resolvers that claim to respect user privacy: Quad9 (9.9.9.9) and Cloudflare’s 1.1.1.1 operate such open DNS recursive resolvers that claim to purge information about user queries. Cloudflare additionally supports DNS over HTTPS, which (like DNS over TLS) will ensure that your DNS queries are encrypted from your browser to its recursive DNS resolver.

Yet, in all of these cases, a user has no guarantee that the information an operator learns will not be retained, for operational or other purposes. Once such information is retained, of course, it may become vulnerable to other threats to user privacy, including data requests from law enforcement. In short, these services transfer the point of trust from your ISP to some other third party, but you still have to trust that third party.

Oblivious DNS

While you may have good reason to trust a provider that claims to purge all information about your DNS queries, we believe that users shouldn’t even have to make that choice in the first place.

The goal of Oblivious DNS (ODNS) is to ensure that no single party observes both the DNS query and the IP address (or subnet) that issued the query. ODNS runs as an overlay of sorts on conventional DNS; it requires no changes to any DNS infrastructure that is already deployed.

Oblivious DNS (ODNS) adds a custom stub resolver at the client to obfuscate the original query, which the authoritative server for ODNS can decrypt. But, the ODNS authoritative server never sees the IP address of the client that issued the query.

Oblivious DNS (ODNS) operates similarly to conventional DNS, but has two new components: 1) each client runs a local ODNS stub resolver, and 2) we add an ODNS authoritative zone that also operates as a recursive DNS resolver. The figure illustrates the basic approach.

When a client application initiates a DNS lookup, the client’s stub resolver obfuscates the domain that the client is requesting (via symmetric encryption), so the recursive resolver is unaware of the requested domain. The authoritative name server for ODNS separates the clients’ identities from their corresponding DNS requests, such that the name servers cannot learn who is requesting specific domains. The steps taken in the ODNS protocol are as follows:

  1. When the client generates a request for www.foo.com, its stub resolver generates a session key k, encrypts the requested domain, and appends the .odns TLD, resulting in {www.foo.com}k.odns.
  2. The client forwards this request, with the session key encrypted under the .odns authoritative server’s public key ({k}PK) in the “Additional Information” record of the DNS query to the recursive resolver, which then forwards it to the authoritative name server for .odns.
  3. The authoritative server for ODNS queries decrypts the session key with its private key and subsequently decrypts the requested domain with the session key.
  4. The authoritative server forwards a recursive DNS request to the appropriate name server for the original domain, which then returns the answer to the ODNS authoritative server.
  5. The ODNS authoritative server can thus return the answer (with both the domain and IP address encrypted) to the recursive resolver, which forwards it on to the client’s stub resolver. In turn, the stub resolver can decrypt both the domain and the IP address.
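Steps 1 through 3 can be sketched as follows. This is a sketch under stated assumptions, not the prototype's code: the post only says "symmetric encryption" and "public key", so AES-GCM for the session key, RSA-OAEP for wrapping it, base32 for turning ciphertext into DNS labels, and all function names here are choices made purely for illustration. The wrapped key is returned separately because it travels outside the query name.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"encoding/base32"
	"errors"
	"fmt"
	"strings"
)

// enc is unpadded base32, assumed here because its alphabet is DNS-safe.
var enc = base32.StdEncoding.WithPadding(base32.NoPadding)

// newServerKey generates a key pair for the ODNS authoritative server.
func newServerKey() (*rsa.PrivateKey, error) {
	return rsa.GenerateKey(rand.Reader, 2048)
}

// newGCM builds an AES-GCM cipher from session key k (assumed cipher).
func newGCM(k []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(k)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// obfuscate plays the stub resolver (steps 1 and 2): it returns
// {domain}k.odns and the session key k wrapped under the server's public key.
func obfuscate(domain string, pub *rsa.PublicKey) (string, []byte, error) {
	k := make([]byte, 16) // fresh session key
	if _, err := rand.Read(k); err != nil {
		return "", nil, err
	}
	gcm, err := newGCM(k)
	if err != nil {
		return "", nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", nil, err
	}
	ct := gcm.Seal(nonce, nonce, []byte(domain), nil) // nonce || ciphertext || tag
	wrapped, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, pub, k, nil)
	if err != nil {
		return "", nil, err
	}
	text := enc.EncodeToString(ct)
	var labels []string
	for len(text) > 63 { // a DNS label holds at most 63 characters
		labels = append(labels, text[:63])
		text = text[63:]
	}
	labels = append(labels, text, "odns")
	return strings.Join(labels, "."), wrapped, nil
}

// recoverDomain plays the ODNS authoritative server (step 3): it unwraps k
// with its private key, then decrypts the requested domain with k.
func recoverDomain(qname string, wrapped []byte, priv *rsa.PrivateKey) (string, error) {
	k, err := rsa.DecryptOAEP(sha256.New(), rand.Reader, priv, wrapped, nil)
	if err != nil {
		return "", err
	}
	text := strings.ReplaceAll(strings.TrimSuffix(qname, ".odns"), ".", "")
	ct, err := enc.DecodeString(strings.ToUpper(text)) // DNS names are case-insensitive
	if err != nil {
		return "", err
	}
	gcm, err := newGCM(k)
	if err != nil {
		return "", err
	}
	if len(ct) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	pt, err := gcm.Open(nil, ct[:gcm.NonceSize()], ct[gcm.NonceSize():], nil)
	if err != nil {
		return "", err
	}
	return string(pt), nil
}

func main() {
	priv, err := newServerKey()
	if err != nil {
		panic(err)
	}
	qname, wrapped, err := obfuscate("www.foo.com", &priv.PublicKey)
	if err != nil {
		panic(err)
	}
	domain, err := recoverDomain(qname, wrapped, priv)
	if err != nil {
		panic(err)
	}
	fmt.Println(qname, "->", domain)
}
```

The recursive resolver only ever handles the output of obfuscate, which it cannot decrypt, while the party calling recoverDomain receives the query from the recursive resolver rather than from the client. A real deployment would also have to keep the encoded name within the overall DNS name length limit.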

Other name servers see incoming DNS requests, but these only see the IP address of the ODNS recursive resolver, which effectively proxies the DNS request for the original client. These steps correspond to the following figure.

Prototype Implementation and Preliminary Evaluation

We implemented a prototype in Go to evaluate the feasibility of deploying ODNS, as well as the performance costs of using ODNS as compared to conventional DNS. We implemented an ODNS stub resolver and implemented an authoritative name server that can also issue recursive queries.

ODNS adds 10-20 milliseconds to the resolution time for an uncached DNS query.

We first compared the performance of ODNS queries to that of conventional DNS. We issued DNS queries to the Alexa Top 10,000 domains using both ODNS and conventional DNS. The CDF below shows that ODNS adds about 10-20 milliseconds to each query. Of course, in practice DNS makes extensive use of caching, and this experiment shows a worst-case scenario. We expect the overhead in practice to be much smaller.

Along these lines, we also measured how ODNS would affect a typical Internet user’s browsing experience by evaluating the overhead of a full web page load, which involves fetching the page and conducting any subsequent DNS lookups for embedded objects and resources in the page. We fetched popular web pages that have a lot of content using ODNS and compared the results to performing the same operations with conventional DNS.

Overhead for loading an entire web page is minimal.

In each group in the bar plot, the left bar shows the page load time using conventional DNS and the right bar shows the time using ODNS. We see that there is no significant difference in page load time between ODNS and conventional DNS because DNS lookups contribute minimal overhead to the entire page load process. As before, these experiments were run with a “cold cache”, and in practice we expect the overhead to be even less.

Summary and Next Steps

The past several years have seen much (warranted) concern over the privacy risks that DNS queries expose. Existing approaches that allow users to use alternative DNS resolvers are a helpful step, but in some sense they merely shift the trust from the user’s ISP to another party. We believe that a better end state is one where the user doesn’t have to place trust in the operator of any DNS recursive resolver. Towards this goal, we have built ODNS to help decouple clients’ identities from their corresponding DNS queries, and have implemented a prototype. As ongoing work, we are working on a larger-scale implementation, deployment and evaluation. Additional information on ODNS can be found at our project website. We welcome any feedback and comments. We are ready to explore opportunities for broader deployment, and we are actively seeking partners to help us deploy ODNS resolvers in operational settings.

Comments

  1. litecoinextreme says

    Nice

  2. Jack Smith says

    I use Google for DNS as I like my data in one place and do not want my data sold. Signed up for YouTube TV to keep my viewing data away from my cable provider. But it is also a fantastic service.

    • Nick Feamster says

      If you’re happy with Google having all of your data, including your DNS queries, then ODNS certainly is not for you.

  3. Shumon Huque says

    Interesting proposal, and worth considering ..

    A question about step 5 in your description of the proposal:

    “5. The ODNS server can thus return the answer to the client’s stub resolver directly, possibly over a confidential channel such as D-TLS.”

    * How does the ODNS server know the client’s address to send back the answer directly? It saw the query via the client’s recursive server.

    * If indeed the ODNS server is sending back the response directly to the client, doesn’t this defeat one of the privacy goals? The ODNS server should not be able to link queries to individual clients, I assume – otherwise you’ve just moved the privacy threat from the recursive server operator to the ODNS server operator.

    * Step 5 in your description also does not match the diagram, which shows the ODNS response being returned (re-encrypted) via the recursive server. This diagram appears to have the desired privacy properties of the proposal, so perhaps I misunderstood the text in step 5.

    Thanks!

    • Nick Feamster says

      Definitely a typo in Step 5. Several other folks also pointed this out. I’ve revised the post. Good catch!

  4. Because 1.1.1.1 is a valid IP address and actually in use nowadays, please use RFC 5737 IP addresses for documentation

    3. Documentation Address Blocks

    The blocks 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2),
    and 203.0.113.0/24 (TEST-NET-3) are provided for use in
    documentation.

    This is to prevent confusion between this project and the public Cloudflare resolver

  5. Amreesh Phokeer says

    Hi Nick, trying to understand the operational model:

    1. Is it the same party that manages the stub and the ODNS authoritative server? For e.g. taking a campus network, I suppose the campus will operate both the stub and the authoritative, having control on both would make sense, especially for key sharing. In the other case (a 3rd party operating the ODNS authoritative) then we are entrusting our privacy to that 3rd party, as the latter knows the source IP address, not ideal right?

    2. Why is D-TLS not used from Stub -> Authoritative as opposed to Authoritative -> Stub as mentioned in step 5, in other words, can we bypass completely the recursive server both ways?

    3. Would the recursive server be able to cache any response in this setup?

    • Nick Feamster says

      Amreesh, these are fantastic questions. Here’s my thinking:

      1. That’ll be an ephemeral session key picked by the client. ODNS authoritative still has no idea where/what IP address the client is coming from, since that’s headed through the recursive.

      2. Yes, D-TLS should be possible in both directions.

      3. Caching, good observation. worth studying more, but the names will be on the order of the life of the session key, which could certainly be longer than the TTL on a typical A record. Some caching benefits _across_ clients will be lost. This is something worth studying in the (forthcoming) paper. Thanks.

      Let’s follow up.

  6. Andrew McConachie says

    Why not just run your own recursive server and disable EDNS client-subnet? Seems a lot less invasive and broken than this approach.

    • Nick Feamster says

      In addition to “running your own recursive” being a larger ask than the typical device or user may be capable of or willing to do, it actually doesn’t solve the problems.

      Specifically: From what IP address would you “run your own recursive”? That IP address would be exposed to the authoritative servers, which is part of what one would like to avoid.

      • Andrew McConachie says

        “1) each client runs a local ODNS stub resolver,”

        DNSSEC-trigger would not require anything more complicated than what the proposal already asks. It’s also been out there for a while and is well tested. You can configure it to not use the DHCP learned recursive server.

        The approach requires everyone to “just trust” the .odns. authoritative servers. Besides the obvious bad design of overloading the namespace with a protocol switching pseudo-TLD this proposal creates a single point of trust. It’s also not clear how the recursive finds the .odns server. Does this proposal require a change to the DNS root-zone?

        • Nick Feamster says

          DNSSEC doesn’t solve the privacy problem that ODNS does.

          You don’t have to trust the ODNS authoritative, because (unlike in the existing models), it doesn’t see the client IP.

          No change to the root required. Any subdomain will work, as well.

  7. The recursive resolver and the ODNS server must be owned by different organizations, of course, but someone who can snoop traffic to both can match them up.

    I would think the ODNS server cannot return a result directly to the client’s stub resolver because it doesn’t have the stub’s IP address. What am I missing?

    In the limit where all queries are ODNS, the recursive server becomes a pure distributing proxy with no caching ability whatsoever; caching can only occur in the stub or the ODNS server. Would ISPs (or anyone else) still have any motivation to maintain such a server?

    • Nick Feamster says

      You’re right on Step 5 (typo; fixed).

      ISPs don’t need to have “motivation” to maintain such a server. A hosting provider could very well do it.

      The caching concern you raise does not really seem to pose a problem. A client can still cache A records just as it always would. The ODNS authoritative similarly can cache NS records, just as any other resolver would do. There is, in some cases, an additional hop in the sequence of iterative queries, of course, but that would only come into play when caches are cold. ODNS does not really affect a DNS stub resolver’s ability to cache in the same way that it does today.

  8. Brad Knowles says

    All the authoritative servers ever see is the IP address of the recursive server anyway, so how is your solution improving that picture?

    I get that you’re using public key cryptography, but how does that help?

    • Nick Feamster says

      We’re concerned about what the recursive sees. In our case, the recursive doesn’t ever see the name in the A query.

  9. The high level of this feels as though trust moves from the recursive/authoritative DNS servers to the ODNS servers. Seems like the same thing.

    • Nick Feamster says

      Nope. It separates the coupling between the query, and the IP address that issued the query.

      EDNS0 client subnet will “break” this separation by leaking a client IP all the way through to the authoritative, but some recursives like Cloudflare’s actually disable that, too. So, one could combine with the use of the CF resolver to avoid that problem.

  10. Dominik Herrmann says

    Nice. Seems somewhat similar to our approach EncDNS (ESORICS 2014): https://svs.informatik.uni-hamburg.de/publications/2014/2014-07-30-HFLF14-ESORICS14-EncDNS.pdf

  11. Eric Hellman says

    “Yet, these approaches do prevent the operator of the recursive DNS server from learning which IP addresses are issuing queries for particular domain names”
    should be
    “Yet, these approaches do NOT prevent the operator of the recursive DNS server from learning which IP addresses are issuing queries for particular domain names”