Evaluating DNS Resiliency and Responsiveness with Truncation, Fragmentation & DoTCP Fallback

Abstract—Since its introduction in 1987, the DNS has become one of the core components of the Internet. While it was designed to work with both TCP and UDP, DNS-over-UDP (DoUDP) has become the default option due to its low overhead. As new Resource Records were introduced, the sizes of DNS responses increased considerably. This growth in message size has led to more frequent truncation and IP fragmentation in recent years, and large UDP responses make DNS an easy vector for amplifying denial-of-service attacks, which can reduce the resiliency of DNS services. This paper investigates the resiliency, responsiveness, and usage of DoTCP and DoUDP over IPv4 and IPv6 for 10 widely used public DNS resolvers. These aspects are investigated from the edge and from the core of the Internet, representing the communication of the resolvers with DNS clients and with authoritative name servers, respectively. Overall, more than 14M individual measurements from 2527 RIPE Atlas probes have been analyzed, highlighting that most resolvers show similar resiliency for both DoTCP and DoUDP. Although DNS Flag Day 2020 recommended a buffer size of 1232 bytes, we find that 3 out of 10 resolvers mainly announce very large EDNS(0) buffer sizes both from the edge and from the core, which potentially causes fragmentation. In reaction to large response sizes from authoritative name servers, we find that resolvers do not fall back to the usage of DoTCP in many cases, bearing the risk of fragmented responses. As message sizes in the DNS are expected to grow further, this problem will become more urgent in the future.


I. INTRODUCTION
The Domain Name System (DNS), which is responsible for the resolution of hostnames to IP addresses, has become one of the most widely used components on the Internet. Hostnames (domain names) are organized in a tree structure that is hierarchically separated into zones. The resolution of domain names is realized by different components such as stub resolvers, recursive resolvers, and authoritative Name Servers (NSes). While authoritative NSes are responsible for the authoritative mapping of domains in a zone to their IP addresses, stub and recursive resolvers cache and deliver such information from the NSes to the clients via DNS requests [1] (RFC 1034 [2]). DNS communication supports both major transport protocols on the Internet, namely the Transmission Control Protocol (TCP) (RFC 793 [3]) and the User Datagram Protocol (UDP) (RFC 768 [4]). Due to its comparably low overhead, UDP has become the default transport protocol for DNS. The UDP message body is restricted to 512 bytes (RFC 1035 [5]). However, the increasing deployment of DNS Security Extensions (DNSSEC) and IPv6 (RFC 7766 [6]) has resulted in larger message sizes, leading to two important developments in the protocol. Firstly, DNS-over-TCP (DoTCP) was declared mandatory for hosts (RFC 5966 [7]), as it enables a larger message body by default. Secondly, Extension Mechanisms for DNS (EDNS) were introduced to augment the capabilities of the DNS protocol in terms of message size expansion (RFC 2671 [8]). With the new EDNS capability, DNS servers could continue to deliver responses as UDP datagrams even when a response was larger than 512 bytes. Stipovic et al. in [9] examine the level of EDNS compatibility for a number of public DNS servers for some popular Internet domains, exploring the behavior of contemporary DNS implementations such as Microsoft Windows 2012, 2016, and 2019 as well as Linux-based BIND. However, using too large UDP buffer sizes can cause IP fragmentation in certain networks, thereby reducing resiliency in DNS communication [10]. To avoid fragmentation, the DNS Flag Day 2020, an association of DNS software maintainers and service providers, recommended the usage of a default buffer size of 1232 bytes. DoTCP is a useful measure against fragmentation and can increase DNS resiliency by providing a fallback option. Resolvers should also avoid fragmentation by using the recommended default EDNS(0) buffer size of 1232 bytes. To this end, our paper puts forward three goals: a) to evaluate DoTCP support (both over IPv4 and IPv6) and its usage across several DNS resolvers, b) to analyze the responsiveness/latency over DoTCP and DoUDP for IPv4 and IPv6, and c) to investigate which buffer sizes are currently used in DNS traffic around the globe. In pursuit of these goals, we evaluate the behavior of the resolvers from two different vantage points. Firstly, DoTCP adoption, responsiveness, and EDNS(0) configuration are analyzed from the edge, where the interaction between recursive resolvers and DNS clients running on the RIPE Atlas probes is measured. To scope DNS requests to the edge of the network, we perform DNS queries for a domain that is likely cached by all resolvers, unlike in previous studies [11]. Secondly, the interaction of recursive resolvers with authoritative NSes is studied. To allow DNS requests to leave the edge and move into the core of the network, we provision dedicated NSes for a custom-crafted domain whose resolution is requested from the DNS resolvers. Using this methodology (see §III), we study failure rates, response times, EDNS options, and the usage of DoTCP and
DoUDP, as well as the EDNS(0) configurations both from the edge and the core (except response time analysis), which gives detailed insights into the potential resiliency of DNS communication on the Internet, as depicted in Figure 1. We perform measurements over IPv4 and IPv6 [12]. Our main findings (see §IV) are:
Resiliency from the edge: We observe that DoTCP requests (4.01%) tend to fail less often than DoUDP requests (6.3%) over IPv4. Contrarily, in the case of IPv6, we find a higher failure rate over both transport protocols (DoTCP 10%, DoUDP 9.61%). The analysis of response times for Public and Probe resolvers confirms the pattern of approximately doubled median response times for DoTCP compared to DoUDP for both IP versions. We also observe that several public DNS resolvers still lack adoption (< 3.5%) of the 1232B buffer size from the DNS Flag Day 2020 recommendation.
Resiliency from the core: We find that DoTCP requests over IPv4 exhibit failure rates of 9.09% on public resolvers, against a higher failure rate of 11.53% over IPv6. Surprisingly, we find that RIPE Atlas measurements ended successfully even after receiving a response with the TC-bit set, indicating a lack of proper fallback to DoTCP in many probes. Moreover, communication between resolvers and the authoritative NSes utilizes an EDNS(0) buffer size of 512 bytes much less frequently (IPv4 0.24%, IPv6 0.13%) than is advertised to the RIPE Atlas probes (IPv4 27.41%, IPv6 26.04%). All DNS resolvers use EDNS(0) in most cases (> 99.84%). We also see other EDNS options such as Cookie (4.80% IPv4, 7.91% IPv6) and EDNS Client Subnet (ECS) (1.81% IPv4, 1.49% IPv6) advertised by the public resolvers, while Google mostly uses ECS (14.24% IPv4, 12.53% IPv6).
DoTCP Usage Rates: We observe that when 2KB responses are received from the NSes, all resolvers that mainly use canonical scenarios (see §IV) use TCP in their last request in >95% of the cases. When 4KB responses are received, we observe that almost all resolvers use TCP in the vast majority of measurements over both IP versions (>98%). This paper builds upon our previous study [13]. We additionally add significant background information (see §II) related to the monitoring and performance evaluation of DNS queries and responses over both the TCP and UDP transport protocols. As DNS response times can be a critical metric when using DoTCP fallback, we conduct further measurements comparing DoTCP and DoUDP response times from the edge of the network (see §IV-B2). Subsequently, when evaluating from the core, we present additional insights by including a detailed analysis using traceroute and DoUDP response times for public resolvers (see §IV-B3). Additionally, we perform a deep dive by measuring the number of successful responses that do not contain any valid ANSWER sections for the DNS queries (see §IV-C4). Notably, our investigation reveals instances where RIPE Atlas measurements have terminated successfully despite receiving a response with the TC-bit set, thereby indicating a lack of proper fallback to DoTCP across multiple probes. Towards the end, we discuss the limitations of our study and highlight future research directions in §V, followed by concluding statements in §VI.

A. DNS Measurement
To measure DNS failure rates, DNS performance, and the buffer sizes used, several studies have been conducted in the last few years. Some of them are discussed here.
1) Fragmentation: With increased message sizes, DNS responses can exceed the MTU of many networks. Moura et al. in [14] analyze the fragmentation rates of DNS responses for the .nl top-level domain, showing that fewer than 10k of 2.2B observed DNS responses by authoritative NSes are fragmented. Although fragmentation is in general fairly rare in DNS communication, its consequences can negatively affect the resiliency and connectivity of Internet applications (RFC 8900 [15]). Herzberg and Shulman in [16] presented an attack allowing the spoofing of Resource Records (RRs) in fragmented DNS responses, by which attackers can hijack domains or nameservers under certain conditions. Following a similar procedure, Shulman and Waidner in [17] showed the opportunity to predict the source port used by the client. Both of these approaches belong to the class of DNS cache poisoning attacks, one of the most common and dangerous attacks on the DNS. Cache poisoning attacks are also possible when DNS messages are not fragmented [18][19][20]. The aforementioned studies, however, show the additional security risk caused by fragmented responses, which potentially exposes DNS users to several other types of attacks. Koolhaas et al. in [10] analyzed the behavior of different EDNS(0) buffer sizes in 2020. It was shown that the likelihood of a failing DNS query increases with growing buffer sizes. For a size of 1500 bytes, the default MTU of Ethernet (RFC 2464 [21]), which causes fragmentation of most large DNS messages, DNS queries to stub resolvers failed in 18.92% of cases over IPv4 and 26.16% over IPv6. As countermeasures, Cao et al. in [22] presented an "Encoding scheme against DNS Cache Poisoning Attacks" in 2017, and Berger et al. in [19] presented a way of detecting DNS cache poisoning attacks in 2019. Even though it was later shown that DNS cache poisoning attacks are also possible against DoTCP by Dai et al.
in [23], this emphasizes DoTCP's importance as a fallback option to the usage of DoUDP. Herzberg and Shulman in [17] recommend keeping the indicated buffer size less than or equal to 1500 bytes. As a consequence, Weaver et al. summarize a list of recommendations to stakeholders in the DNS ecosystem in [24]. These include the proposition for stub resolvers as well as authoritative nameservers to stick to buffer sizes of 1400B and below. The study conducted in [10] in 2020 yielded detailed recommendations for the EDNS(0) buffer size configuration of authoritative name servers and stub resolvers, dependent on the IP version and network type used. The recommendations were adopted at the DNS Flag Day 2020, claiming that "defaults in the DNS software should reflect the minimum safe size which is 1232 bytes". These aspects emphasize the need for DNS resolvers to adopt the buffer size recommendations as fast as possible. Some encrypted DNS protocol implementations such as DNS-over-TLS (RFC 7858 [25]) and DNS-over-HTTPS (RFC 8484 [26]) also counter the problems of fragmentation, as TCP is used as the transport protocol [27]. They are, however, not yet widely enough adopted to make standard DNS implementations obsolete [28][29][30]. To investigate the progress of DNS resolvers in implementing the new standards, measurements analyzing the buffer sizes used by DNS resolvers are therefore performed from different standpoints. As DoTCP support is another important requirement for DNS resolvers to avoid truncation and fragmentation, DNS failure rates over TCP and UDP are analyzed in this paper. Additionally, the DoTCP-fallback behavior of the resolvers is studied to see in which cases they make use of TCP. As response time is, furthermore, the main disadvantage of using DoTCP instead of DoUDP, we are also interested in comparing the two transports with regard to this aspect.
2) Response Times and Failure Rates: The first large-scale study on DNS performance and failure rates was performed by Danzig et al. in 1992, resulting in several important recommendations to reduce DNS traffic and latency [31]. Ten years later, "DNS Performance and the Effectiveness of Caching" analyzed DNS traffic of the MIT Laboratory for Computer Science and KAIST (Korea Advanced Institute of Science and Technology) over several months and stated a failure rate of 36% (23% timeouts, 13% other errors), as explained by Jung et al. in [32]. Ager et al. [33] compared DNS response times between the local DNS resolvers of ISPs and public resolvers like Google Public DNS and OpenDNS. It was found that local resolvers generally outperformed public resolvers, but Google and OpenDNS showed faster responses in certain cases due to ISP caching issues. Several other measurements have been undertaken to observe DNS performance over IPv4 from different standpoints [34][35][36]. Additionally, Doan et al. in [37] observed that public resolvers answered faster over IPv6 than over IPv4. In 2022, Moura et al. [14] investigated the fallback capabilities of DNS resolvers to DoTCP by manipulating the TC-bit in responses from a controlled authoritative name server. They analyzed the order of incoming requests over different transport protocols and introduced the distinction between canonical (UDP request followed by TCP request) and non-canonical scenarios. Evaluating the order of incoming requests, it was concluded that an estimated 2.7% (optimistic estimation) to 4.8% (pessimistic estimation) of the examined resolvers were incapable of falling back to DoTCP usage. In the same year, Kosek et al.
[11] conducted the first study comparing DNS response times and failure rates based on the underlying transport protocols using RIPE Atlas for ten public resolvers and the probe resolvers. 8% of the queries over UDP and TCP failed, with a very high DoTCP failure rate of 75.0% for probe resolvers. The response times of DoTCP were generally higher than those of DoUDP, with large differences. The queried domains were unique and thereby uncached by all of the participating resolvers. To get a preferably broad and unbiased comparison of the different DNS resolvers over the particular underlying protocols, we perform DNS queries both for a domain that is very likely cached on each resolver (google.com) and for uncached ones. Additionally, we make sure that the uncached domains are administered by an authoritative name server under our control. Using a cached domain allows a detailed estimation of the latencies in the direct communication between resolvers and client software, without the additional time needed for a recursive lookup. The DNS queries for uncached domains force recursive resolvers to forward them to our server. Observing the incoming requests offers the opportunity to analyze the communication between recursive resolvers and name servers in detail, for example, the usage of DoTCP on the same path.
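The canonical fallback scenario these studies examine (a DoUDP request answered with the TC-bit set, followed by a DoTCP retry) can be sketched as follows. The transport callables are hypothetical stand-ins for real socket code, injected so the fallback logic itself is testable:

```python
import struct

def is_truncated(response: bytes) -> bool:
    """Return True if the TC (truncation) bit is set in a DNS message header."""
    if len(response) < 12:
        raise ValueError("a DNS header is at least 12 bytes")
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)  # TC is bit 9 of the 16-bit flags field (RFC 1035)

def resolve_with_fallback(query: bytes, send_udp, send_tcp):
    """Canonical scenario: DoUDP first; on a truncated reply, retry over DoTCP.
    send_udp/send_tcp are placeholder transport functions, not a real client."""
    response = send_udp(query)
    if is_truncated(response):
        return send_tcp(query), "tcp"
    return response, "udp"
```

A resolver that never issues the TCP retry after a TC=1 reply corresponds to the non-canonical, fallback-incapable behavior quantified by Moura et al.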
3) EDNS Options: Van den Broek et al. [38] analyzed more than 8 million DNS queries to an authoritative name server in 2014. Around 75% of the queries used EDNS(0). Additionally, it was observed that 36% used a UDP buffer size higher than 1232 bytes, likely causing fragmentation. Measurements after the DNS Flag Day 2020 recommendations show that DNS resolvers still seem to lack adoption of the default buffer size of 1232 bytes. Based on the analysis of 164 billion queries to authoritative name servers, Moura et al. stated that many resolvers "announce either small (512 bytes) or large (4096 bytes) EDNS(0) buffer sizes, both leading to more truncation, and increasing the chances of fragmentation/packets being lost on the network". As the DNS Flag Day 2020 recommendations have not been out in the community for very long, a regular examination of the adoption rates of DNS software is reasonable and necessary.

III. METHODOLOGY
To extend the previous studies by Huber and Kosek et al. [11], we utilize the identical target resolvers for our measurements. We query the ten public resolvers listed in Table I, along with the configured probe resolvers. It is worth mentioning that, at the time of the initial measurements, Comodo Secure DNS did not have a known IPv6 address, resulting in measurements conducted solely over IPv4.

A. Probe Selection
This paper employs the RIPE Atlas measurement network to conduct the measurements. To avoid potential load issues occurring in the first two probe versions [39][40], we choose only probes of version 3 or 4 that are hosted with a home tag [41]. The chosen probes must support IPv4, IPv6, or both. A scan conducted on December 20, 2021, reveals the availability of 2527 probes with these attributes. Out of these, 1137 probes are IPv6-capable, while all of them support IPv4. These probes are distributed across 671 different Autonomous Systems (ASes) with varying densities in different regions: 70% in Europe, 18% in North America, 6% in Asia, 3% in Oceania, 1% in Africa, and 1% in South America. Before commencing the actual measurement series, which includes analyzing DNS resolvers from the edge and the core (as depicted in Figure 1), examining DoTCP usage, EDNS(0) configuration, and DoTCP fallback, we evaluated 4343 probe resolvers associated with the 2443 participating probes, resulting in an average of 1.78 resolvers per probe.

B. From the edge
For the purpose of analyzing the behavior of different DNS resolvers from the edge, we programmatically configured RIPE Atlas measurements specifically targeting the resolvers listed in Table I. These measurements encompassed DNS resolution over both IPv4 and IPv6, utilizing both TCP and UDP transport protocols. In this study, we treated the DNS resolvers as black boxes, focusing on the direct communication between DNS client programs and recursive resolvers.

C. From the core
This evaluation allows us to analyze resolver behavior when interacting with authoritative NSes (see Figure 2). In this experiment, we use uncached domain names controlled by authoritative NSes under our supervision. We analyze the resolvers' DNS configuration using two customized authoritative NSes. These NSes record incoming DNS requests, including the transport protocol and requester IP address, for later analysis. By observing the EDNS section of requests reaching the authoritative NSes, we gain insights into the resolvers' EDNS configuration and their potential usage of options like Cookie or Client Subnet.

2) DoTCP Fallback: The core measurement focuses on observing the DoTCP fallback behavior of public DNS resolvers. Large responses, consisting of 72 AAAA records (>2KB response) for one server and 145 AAAA records (>4KB response) for the other, are returned by the authoritative NSes. By including different RR types (A, AAAA, and TXT) in each measurement, we aim to investigate the resolvers' reaction to both response sizes simultaneously. Given that the previous experiment revealed resolvers requesting both NSes equally, approximately 50% of the requests are expected to receive 4KB and 2KB responses, respectively. As such large responses cannot be handled over UDP without fragmentation issues, resolvers are anticipated to fall back to using DoTCP. The analysis focuses on whether the resolvers continue to utilize UDP or switch to DoTCP, providing insights into potential resiliency risks. Multiple requests from resolvers to the authoritative NSes are expected, such as one over UDP followed by a fallback request over TCP. To accurately map incoming requests to the RIPE Atlas measurements, the domains queried by each probe are made unique using the aforementioned technique of prepending probe-specific information.
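Why 72 and 145 AAAA records exceed 2KB and 4KB can be estimated with simple wire-format arithmetic: with name compression, each AAAA record costs 28 bytes (a 2-byte owner-name pointer, TYPE, CLASS, TTL, RDLENGTH, and a 16-byte address). A back-of-envelope sketch, assuming a 24-byte encoded query name (an illustrative value, not the actual domain length used):

```python
def aaaa_response_size(n_records: int, qname_len: int = 24) -> int:
    """Rough size of a DNS response carrying n AAAA records, assuming
    name compression (2-byte owner-name pointers) for every record."""
    header = 12                  # fixed DNS header
    question = qname_len + 4     # encoded QNAME + QTYPE + QCLASS
    rr = 2 + 2 + 2 + 4 + 2 + 16  # name ptr, TYPE, CLASS, TTL, RDLENGTH, IPv6 addr
    return header + question + n_records * rr
```

Under these assumptions, 72 records come to roughly 2056 bytes (just over 2KB) and 145 records to roughly 4100 bytes (just over 4KB), matching the two response classes used in the experiment.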

IV. RESULTS
We evaluate the results of the measurement from the edge concerning failure rates, response times, and EDNS(0) buffer sizes. Afterward, we analyze the EDNS(0) configuration and DoTCP fallback from the core.

A. Probes
A comprehensive examination of all RIPE Atlas probes reveals 2527 probes possessing the desired attributes. All probes exhibit compatibility with IPv4, while 1137 probes exhibit the additional capability of conducting measurements over IPv6. The geographic distribution of the probes, as well as the locations of the authoritative name servers, is visually depicted in Figure 2. Notably, a concentrated density of probes is observed in North America and Europe, which serve as the primary origin of the RIPE Atlas community. Specifically, Europe accounts for 70% of the probes, followed by North America with 18%, Asia with 6%, Oceania with 3%, Africa with 1%, and South America with 1%. The distribution of these probes among the Autonomous Systems is detailed in Table II, highlighting the ten Autonomous Systems housing the majority of probes. The probes from Comcast, AT&T, and UUNET are exclusively situated in North America, while the remaining Autonomous Systems are primarily distributed in Europe. Furthermore, it should be noted that certain Autonomous Systems mentioned in Table II possess only a small number of IPv6-capable probes. The mapping between Autonomous System numbers and their respective names is obtained from IPtoASN (https://iptoasn.com).
Probe Resolvers: Before conducting the actual measurement, preliminary test measurements are performed to gather address information regarding the locally configured resolvers on each probe. It is important to note that the resolvers can

B. Evaluation from the edge
This section analyzes the failure rates of transport protocols at the edge to assess DNS resilience. Response times of resolvers are compared, and the performance of public resolvers is evaluated using DNS response times and traceroute round-trip times. The adoption of the DNS Flag Day recommendations by individual resolvers is analyzed through the EDNS(0) buffer sizes announced to RIPE probes.
1) Failure Rates: Following Kosek et al.'s definition in [11], failed measurements are those with no DNS response at the probe. In IPv4, public resolvers show lower failure rates for DoTCP (4.01%) compared to DoUDP (6.3%), indicating higher resiliency of DoTCP (Figure 3). However, probe resolvers present a different scenario, with DoTCP failure rates surpassing DoUDP by 74.15%. DoUDP failures are solely due to Timeouts (5000ms), while public resolvers' DoTCP failures are primarily caused by Timeouts (42.75%), READ-ERROR (33.91%), CONNECT-ERROR (23.24%), and TCP-READ (0.09%). Bad address (99.17%) is the main cause of DoTCP failures for probe resolvers. Overall, probe resolvers exhibit significantly higher DoTCP failure rates across continents. In the case of IPv6, for public resolvers, we find lower resiliency over both transport protocols (DoTCP 10%, DoUDP 9.61%). Most public resolvers exhibit failure rates between
To analyze the adoption of the DNS Flag Day 2020 recommendations by the public resolvers from the edge, we evaluate the EDNS(0) buffer sizes which the individual resolvers announce to the RIPE Atlas probes. Table III summarizes the buffer sizes that have been observed in the UDP measurements. For all resolvers except Quad9, the difference in the percentages of the announced buffer sizes between IPv4 and IPv6 is fairly low (≤3.5%). The buffer sizes advertised by Cloudflare, OpenNIC, UncensoredDNS, and Quad9 (55.47% IPv4, 62.09% IPv6) conform to the DNS Flag Day 2020 recommendation of a default buffer size of 1232B in most cases. Neustar, Comodo, OpenDNS, and Yandex mainly use 4096 bytes. In 23.55% of the Quad9 DNS responses over IPv4, EDNS(0) is not used at all, leaving clients at the default DoUDP message size limit of 512 bytes. This first view from the edge shows that several public DNS resolvers still lack adoption of the DNS Flag Day 2020 recommendations. To see whether this also holds for the communication with authoritative NSes, we conducted another experiment from the core.
2) Response Times: RIPE Atlas measures the response time (RT) of DNS requests as the duration from the initiation of the measurement until a valid DNS response is received at the probe. DoUDP enables immediate transmission of requests to resolvers, without the need to establish a TCP connection as in DoTCP. Considering the cached nature of the "google.com" domain, DoUDP requests to resolvers with efficient cache management are expected to have response times equivalent to the probe-resolver round-trip time. In contrast, DoTCP measurements involve a three-way handshake, resulting in response times roughly twice as long as DoUDP. To ensure a fair comparison, only probe-resolver pairs with successful responses over both TCP and UDP are considered. The response times depicted in Figures 5 and 6 are obtained through a two-tiered approach, calculating the median response time for each probe-resolver combination and then presenting the median over all probes.
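The two-tiered aggregation described above can be sketched in a few lines; the sample data structure (probe/resolver pairs mapping to raw RT lists) is our own illustration:

```python
from statistics import median

def two_tier_median(samples: dict[tuple[str, str], list[float]]) -> float:
    """Median-of-medians aggregation: first the median RT per probe-resolver
    pair, then the median over all pairs, so heavily-sampled probes do not
    dominate the aggregate."""
    per_pair = [median(rts) for rts in samples.values()]
    return median(per_pair)
```

This is why the figures are described as "an accumulation of probe-medians": each probe contributes exactly one value per resolver, regardless of how many raw measurements it produced.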
IPv4. The analysis of response times for Public and Probe resolvers confirms the expected pattern of approximately doubled median response times for DoTCP compared to DoUDP. Probe resolvers, due to their close physical proximity to the probing device, exhibit faster response times than Public resolvers. Among the Public DNS resolvers examined, Cloudflare, Google, and Quad9 demonstrate the lowest median response times for both transport protocols. Yandex shows relatively long response times for both DoTCP (104.5ms) and DoUDP (51.3ms). Neustar, on the other hand, exhibits the longest median DoTCP response time (1035.8ms) in this experiment, along with high DoTCP failure rates, suggesting an inadequate implementation. When comparing response times by continent, Public resolvers generally respond fastest to DoTCP and DoUDP requests from European (DoTCP 52.7ms, DoUDP 24.1ms) and North American (DoTCP 54.9ms, DoUDP 27.1ms) probes.
The increased response times observed across various continent/resolver combinations for both transport protocols can be attributed to the sparser distribution of resolver Points-of-Presence (PoPs) in those continents. This is evident as DoTCP response times are consistently approximately twice as high as DoUDP response times, indicating longer Round-Trip Times (RTTs) due to greater distances between probes and resolvers. Probe resolvers demonstrate relatively low response times for both DoTCP (6.4ms-32.2ms) and DoUDP (3.2ms-15.3ms). For Public resolvers, probes in the Orange Autonomous System (AS) exhibit the highest overall median DoTCP response time (167.2ms), with elevated DoTCP response times for the CleanBrowsing, Comodo, and OpenNIC resolvers. Overall, most ASes, primarily operating in North America and Europe, do not exhibit significant anomalies in DoTCP and DoUDP response times (Fig. 8 shows traceroute round-trip-times (RTTs) and DNS response times (RTs) as a CDF; again, the curves reflect an accumulation of probe-medians). Yandex performs poorly for requests from North American ASes (223.9-290ms) but slightly better for European ASes (71.9-103.6ms). Conversely, UncensoredDNS displays higher DoTCP response times for European ASes (117.8-214.6ms) compared to North American ASes (36.6-145.9ms). Notably, UncensoredDNS consistently exhibits higher DoUDP response times than DoTCP, including a median DoUDP response time over seven times higher for requests from UUNET.
IPv6. The majority of Public resolvers show median response times for both DoTCP and DoUDP similar to their IPv4 counterparts. However, there is a notable decrease in the median response time for DoTCP requests to Neustar (50.2ms) over IPv6. UncensoredDNS and Yandex still exhibit relatively high DoTCP response times. The improved performance of Neustar over DoTCP and UncensoredDNS over DoUDP contributes to lower overall median response times for public resolvers across both transport protocols. It is
important to mention that probe resolvers are not considered in the IPv6 analysis. The analysis by continent for Public resolvers reveals results similar to those observed in the IPv4 measurements. Notably, the DoTCP response time for Africa is more than 70ms higher in IPv6 compared to IPv4. This can be attributed to the relatively poorer DoTCP performance of the Cloudflare, Google, and OpenDNS resolvers for requests originating from Africa over IPv6. Analyzing DNS response times over IPv6 by Autonomous System reveals higher variations compared to IPv4. Generally, probes from UUNET (21.1ms) and KPN (15.6ms) receive the fastest DoTCP responses, with most Public resolvers displaying their minimal DoTCP response times for these two ASes. CleanBrowsing shows outliers in DoTCP response times for probes from Orange (191.4ms), UUNET (296.6ms), and KPN (349.5ms). UncensoredDNS shows relatively high response times for requests from all Autonomous Systems (62.2ms-214.1ms) except UUNET (20.5ms).
Takeaway: In IPv4, the Cloudflare and Google DNS resolvers demonstrate the most stable DoTCP and DoUDP response times across all continents. Other Public resolvers generally have significantly higher DoTCP response times for at least one, and often multiple, continents, particularly Africa (322.5ms) and South America (245.8ms), where they take the longest to respond. Similar patterns are observed for DoUDP response times. In the IPv6 measurements, Europe exhibits the lowest median response times for DoTCP and DoUDP, while Africa exhibits the highest. Cloudflare seems to have fewer IPv6-capable Points-of-Presence distributed in Africa, resulting in increased round-trip times (RTTs) due to long distances.
Combination: IPv4 and IPv6. Evaluating the measured DNS response times over both transport protocols and IP versions as CDFs emphasizes the results summarized above and allows their comparison from a different standpoint. Instead of the aggregated median RTs of each probe shown in Figures 5 and 6, Figure 7 shows the accumulation of all recorded probe-medians based on the transport protocol and IP version used. Figure 7a confirms the extremely high DoTCP response times of Neustar DNS over IPv4: around 80% of the DoTCP requests take more than one second to be answered. The comparably bad performance of Yandex and UncensoredDNS can be seen for all combinations of transport protocol and IP version, especially for UncensoredDNS over UDP and IPv4 (more than 90% of the requests have an RT of more than 150ms). Furthermore, the response times of CleanBrowsing over both transport protocols increase on average when IPv6 is used instead of IPv4. This observation is not reflected by the median response times, as the effect shows mainly for the slowest 25% of requests. Figure 7 also confirms that Cloudflare and Google exhibit the most stable DNS response times of all Public resolvers.

V. LIMITATIONS AND FUTURE WORK
Approximately 88% of our probe measurements are concentrated in North America and Europe, limiting the generality of our DNS resiliency observations for other regions. To address this limitation, we provide response times categorized by continent. However, it is important to note that observations for continents with fewer probes have smaller sample sizes, which hinders drawing reliable conclusions. Similarly, when analyzing response times of specific Autonomous Systems, particularly over IPv6, the sample size remains relatively low. The study of EDNS(0) options focuses on the communication between different resolvers and our custom authoritative NSes. Therefore, the usage numbers may not accurately represent the capabilities of the resolvers and their EDNS(0) options in general. The observations reveal various non-canonical sequences employed by DNS resolvers in response to large response sizes. Further investigation is required to fully understand the behavior of different resolvers, including their adjustment of announced EDNS(0) buffer sizes when receiving large responses.
While our study focused on the unencrypted DNS protocols DoUDP and DoTCP, the recently standardized encrypted DNS protocol DNS-over-QUIC (DoQ) (RFC 9250) [44][45][46][47] inherently solves fragmentation by means of the QUIC protocol (RFC 9000) [48] while also supporting increased DNS message sizes. However, DoQ adoption is currently scarce [49]; yet, DoQ is a promising candidate to supersede both DoUDP and DoTCP in the future, warranting a detailed investigation once its adoption rises.

VI. CONCLUSION
We conducted measurements analyzing DoTCP resiliency, responsiveness, and deployment from the edge and the core over IPv4 and IPv6. Additionally, the EDNS(0) configurations of ten public resolvers were studied. Issuing more than 14M individual DNS requests using 2527 globally distributed RIPE Atlas probes, we performed multiple experiments and conclude that most resolvers show similar resiliency for both DoTCP and DoUDP, while 3 out of 10 resolvers mainly announce very large EDNS(0) buffer sizes, which potentially causes fragmentation. The analysis of DoTCP and DoUDP performance revealed significant regional variations for both IP versions. Notably, requests originating from Africa or South America exhibited the highest median response times, highlighting the need for further investigation and optimization in these regions. Particularly over IPv4, Cloudflare and Google emerged as the public resolvers with the most consistent and stable response times across all continents. In reaction to large response sizes from authoritative name servers, we find that resolvers do not fall back to the usage of DoTCP in many cases, bearing the risk of fragmented responses. As message sizes in the DNS are expected to grow further, this problem will become more urgent in the future.

Fig. 1 :
Fig. 1: 2527 RIPE Atlas probes exchange DNS requests with the edge (probe and public resolvers) and with the core (authoritative NSes) using IPv4 and IPv6. Cached DNS responses are sent by the edge, while uncached DNS responses (2KB and 4KB) are sent by the core.

Fig. 2 :
Fig. 2: The global distribution of the RIPE Atlas probes participating in the experiments and the authoritative name servers developed for the measurements from the core.

Fig. 3 :
Fig. 3: Failure rates observed from the edge over IPv4. The upper part represents the DoTCP failure rates of all resolvers in total and per continent and AS. The lower part reflects the difference between the DoTCP and the DoUDP failure rates for a particular pairing (a negative value hence indicates a higher DoUDP failure rate). 'Public Resolver' summarizes the observations of all resolvers that are not probe resolvers.

Fig. 5 :
Fig. 5: Response times (RTs) observed from the edge over IPv4. The values represent the median RT over the medians of each probe. The upper part shows the results for DoTCP, the lower one the differences between DoTCP and DoUDP.

Fig. 6 :
Fig. 6: Response times (RTs) observed from the edge over IPv6. The values represent the median RT over the medians of each probe. The upper part shows the results for DoTCP, and the lower one the differences between DoTCP and DoUDP. White cells indicate that there is no data for the given pairing.

Fig. 7 :
Fig. 7: Response times of each resolver over TCP/UDP and IPv4/IPv6 as CDF. The curves present an accumulation of the median RTs of each probe.

Fig. 10 :
Fig. 10: Failure rates observed from the core over IPv4. The upper part represents the DoTCP failure rates of all resolvers in total and per continent and AS. The lower part reflects the difference between the DoTCP and the DoUDP failure rates for a particular pairing (a negative value hints at a higher DoUDP failure rate). 'Public Resolver' summarizes the observations of all resolvers that are not probe resolvers.

Fig. 11 :
Fig. 11: Failure rates observed from the core over IPv6. The upper part presents the failure rates over DoTCP, and the lower one the difference between DoTCP and DoUDP failure rates. White cells indicate that there is no data for the given pairing.

TABLE I :
The public DNS resolvers under investigation, IPv4 and IPv6 represent the resolvers' anycast addresses

TABLE II :
Probe distribution over Autonomous Systems along with the percentage of IPv4/IPv6 capable probes.
com". By choosing a heavily-cached domain, we aimed to minimize recursive resolution or unexpected errors, thereby emphasizing the communication between client programs and resolvers. The response times perceived by the probes should thus correspond to the duration of a simple UDP/TCP request and response. As mentioned earlier, the RIPE Atlas probes collected essential information during the measurements, including response times, error messages, and the UDP buffer sizes advertised by each resolver. This data was subsequently retrieved using the RIPE Atlas measurement API. To validate the collected response times and gain insights into the resolvers' distribution across the Internet, simultaneous Traceroute measurements were conducted. It is anticipated that the measured round-trip time (RTT) of a Traceroute to the resolver closely aligns with the response time of a DoUDP request when the queried domain is cached on the resolver.
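To make the "simple UDP/TCP request" concrete, the following sketch builds the wire-format query a probe would send; the domain name and transaction ID are illustrative placeholders, not the ones used in the measurements.

```python
# Sketch of the DNS wire format (RFC 1035) for a single query.
# "example.org" and the transaction ID 0x1234 are illustrative only.
import struct

def encode_qname(name):
    """Encode a domain name as length-prefixed labels, ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"

def build_query(name, txid=0x1234):
    """12-byte header (RD=1) followed by one QTYPE=A, QCLASS=IN question."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    question = encode_qname(name) + struct.pack("!HH", 1, 1)
    return header + question

pkt = build_query("example.org")
```

For a cached answer, the resolver replies to this single datagram directly, so the measured response time approximates one network round trip plus processing.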

TABLE III :
EDNS(0) buffer sizes announced to the RIPE probes by the resolvers, observed from the edge. CB: CleanBrowsing; U-DNS: UncensoredDNS.

accessible IP addresses or private ones. For instance, a probe may have Google Public DNS configured in addition to a DNS resolver exclusively accessible through its local network. Among the probes, 41.93% have public IP addresses. Of all registered resolvers, 97.03% are associated with IPv4 addresses, while the remaining 2.97% are linked to IPv6 addresses. Additionally, it is important to note that during a RIPE Atlas measurement employing the probe resolver parameter set, all probe resolvers are requested to resolve the relevant domain name. Consequently, both IPv4 and IPv6 DNS resolvers contribute to the measurement, regardless of the IP version mandated by the RIPE user.
RT/RTT ratio (quotient of the median response time and the median round-trip time for each probe) for each resolver over TCP/UDP and IPv4/IPv6 as CDF.

ratio of 1 for DoUDP and 2 for DoTCP. Figure 9 presents the RT/RTT ratios per resolver, transport protocol, and IP version.
For Neustar over DoTCP and IPv4, the observed ratio exceeds 2 for over 80% of the probes. This further highlights that the very high response times of the resolver are not due to a sparsely distributed global network of Points-of-Presence but rather to an inadequate DoTCP implementation at various PoPs. The same conclusion can be drawn for UncensoredDNS over both TCP and UDP, as shown in Figure 9. OpenDNS and Google exhibit higher RT/RTT ratios compared to other public resolvers over both transport protocols and IP versions, indicating that their relatively fast DNS response times are more attributable to a well-distributed global network than to exceptionally efficient request processing.
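The ratio itself is straightforward to compute; the sketch below (with invented timings) shows how a per-probe ratio well above the ideal DoTCP value of 2 separates processing overhead from network distance.

```python
# Sketch of the RT/RTT ratio: ideally ~1 for DoUDP (one round trip) and
# ~2 for DoTCP (handshake plus query). The timings here are invented.
from statistics import median

def rt_rtt_ratio(response_times, traceroute_rtts):
    """Quotient of the per-probe median response time and median RTT."""
    return median(response_times) / median(traceroute_rtts)

# A probe whose DoTCP answers take far more than two round trips:
ratio = rt_rtt_ratio([120.0, 130.0, 125.0], [20.0, 21.0, 19.0])
assert ratio > 2  # points to implementation overhead at the PoP
```

A low RTT combined with a high ratio (as for OpenDNS and Google above) means the resolver is nearby but comparatively slow to process the request.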

TABLE IV :
EDNS(0) buffer sizes announced to RIPE probes by the public resolvers in the measurement series from the core. Buffer sizes that are not equal to 512, 1232, or 4096 bytes are summarized in the column "other". If EDNS is not used at all, this is reflected in the column "none".

Table V.
Most resolvers exhibit a preferred AS, with Cloudflare, Google, Neustar, OpenDNS, and Yandex DNS primarily using their own ASes for over 94% of resolutions. Public resolvers universally employ DoUDP, emphasizing the importance of proper EDNS(0) buffer sizes in the core. Commonly used buffer sizes include 1400, 1410, and 1452 bytes. Additional buffer size details are displayed in Table VI.
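The buffer-size announcement itself travels in the EDNS(0) OPT pseudo-record, whose CLASS field carries the advertised UDP payload size (RFC 6891). A minimal sketch, using the DNS Flag Day 2020 recommendation of 1232 bytes:

```python
# Sketch of the EDNS(0) OPT pseudo-record (RFC 6891): root name,
# TYPE=OPT(41), CLASS=advertised UDP payload size, TTL=0, RDLEN=0.
import struct

def build_opt_record(udp_payload_size=1232):
    """Build a minimal OPT record announcing the given buffer size."""
    return b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0, 0)

def announced_buffer_size(opt_record):
    """Read the advertised payload size back out of the CLASS field."""
    (size,) = struct.unpack("!H", opt_record[3:5])
    return size

opt = build_opt_record(1232)
assert announced_buffer_size(opt) == 1232
```

A resolver announcing 1400, 1410, or 1452 bytes, as observed above, simply places that value in the CLASS field instead.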

TABLE V :
Distribution over ASes of the resolvers communicating with the authoritative name servers for the uncached domains used by the public resolvers.

TABLE VI :
EDNS(0) buffer sizes announced to the authoritative NSes. Other buffer sizes and cases in which EDNS is not used are summarized in the column "other". NOTE: CB = CleanBrowsing; U-DNS = UncensoredDNS. All values are in percent (%).

TABLE VII :
EDNS options announced to the authoritative name servers.

Valid/invalid responses: As the transport protocol used by the probes was observed not to affect the usage of DoTCP/DoUDP in the communication between resolvers and the NSes, all measurements in this experiment are carried out over DoUDP. Overall, 11,637,539 individual measurements are conducted based on unique domain names. We furthermore observe that the NS returning 2KB responses

TABLE VIII :
Percentage of measurements that were denoted as successful, but did not receive an answer section.

TABLE IX :
Failure rates of the RIPE Atlas measurements from the core when the unique domain generated for the request was never requested. NR = Never Requested.

OpenDNS consistently provided a valid answer section in most cases (97.65%), while others included an answer section in less than 64.44% of their responses. Surprisingly, RIPE Atlas measurements ended successfully even after receiving a response with the TC-bit set, indicating a lack of proper fallback to DoTCP in many probes. In Table IX these cases are taken into consideration and the respective failure rates of the RIPE Atlas measurements are presented. Additionally, there were cases where certain domains were never requested at any of the servers, contributing to a failed DNS response (21.16% IPv4, 17.27% IPv6). Other resolvers had a failure rate of over 6.53% for IPv6 requests that were not forwarded to authoritative name servers. This behavior may be attributed to some resolvers blacklisting our authoritative name server due to the receipt of large responses.

5) Canonical/non-canonical requests: We begin our analysis by classifying incoming requests as canonical and non-canonical according to Mao et al.'s work. Table X displays the resolvers' usage of different scenarios when communicating with the 2KB name server. Notably, CleanBrowsing, Cloudflare, Google, OpenDNS, and UncensoredDNS predominantly send a UDP message followed by a TCP message. As indicated in Table VI, these resolvers advertise EDNS(0) buffer sizes of 1452B or less, demonstrating the expected fallback behavior. Table X also presents the usage of different scenarios by the resolvers in response to 4KB name server replies. Quad9 demonstrates more non-canonical responses to 4KB than to 2KB responses.

6) TCP usage: To assess TCP usage, we examined the presence of DoTCP requests within the query sequence reaching the name servers. Table XI shows the DoTCP usage rates of resolvers when receiving 2KB responses. Resolvers primarily employing canonical scenarios consistently utilize TCP in their final request, including Quad9 (99.69% IPv4, 99.70% IPv6). Yandex and Comodo rarely use DoTCP with 2KB responses in the last request. When receiving 4KB responses (see Table

TABLE X :
Classification of incoming sequences of DNS queries at the 2KB and 4KB name server for each resolver.

TABLE XI :
TCP usage of DNS resolvers when 2KB and 4KB responses are received. TCP Used represents all scenarios in which TCP is used at any point in the request sequence. For Last TCP, only those sequences ending with a DoTCP request are considered.
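The truncation-driven fallback that Tables X and XI quantify can be sketched as follows; the transport callbacks are hypothetical stand-ins for real UDP/TCP exchanges, and the header bytes are illustrative.

```python
# Sketch of the DoTCP fallback the TC bit is meant to trigger
# (RFC 1035/RFC 7766): a client receiving a truncated DoUDP answer
# should retry the same query over TCP.
def tc_bit_set(response):
    """The TC flag is bit 0x0200 of the 16-bit DNS flags field."""
    flags = int.from_bytes(response[2:4], "big")
    return bool(flags & 0x0200)

def resolve(query, send_udp, send_tcp):
    """send_udp/send_tcp are hypothetical transport callbacks."""
    response = send_udp(query)
    if tc_bit_set(response):          # truncated: fall back to DoTCP
        response = send_tcp(query)
    return response

# A 12-byte header with QR=1 and TC=1 in the flags field:
truncated = b"\x12\x34\x82\x00" + b"\x00" * 8
assert tc_bit_set(truncated)
```

Probes or resolvers that skip the `tc_bit_set` branch, as observed above, accept the truncated answer and risk working with an incomplete or fragmented response.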