As widely discussed recently, observed within the ICANN community several years ago, and anticipated in the broader technical community even earlier, the introduction of a new generic top-level domain (gTLD) at the global DNS root could result in name collisions with previously installed systems. Such systems sometimes send queries to the global DNS with domain name suffixes that, under reasonable assumptions at the time the systems were designed, may not have been expected to be delegated as gTLDs. The introduction of a new gTLD may conflict with those assumptions, such that the newly delegated gTLD collides with a domain name suffix in use within an internal name space, or one that is appended to a domain name as a result of search-list processing.
As discussed in the several studies on name collisions published to date, determining which queries are at risk, and thus how to mitigate the risk, requires qualitative analysis (New gTLD Security and Stability Considerations; New gTLD Security, Stability, Resiliency Update: Exploratory Consumer Impact Analysis; Name Collisions in the DNS). Blocking a second level domain (SLD) simply on the basis that it was queried for in a past sample set runs a significant risk of false positives. SLDs that could have been delegated safely may be excluded on quantitative evidence alone, limiting the value of the new gTLD until the status of the SLD can be proven otherwise.
Similarly, not blocking an SLD on the basis that it was not queried for in a past sample set runs a comparable risk of false negatives.
A better way to deal with the risk is to treat not the symptoms but the underlying problem: that queries are being made by installed systems (or internal certificates are being employed by them) under the assumption that certain gTLDs won’t be delegated.
For several years, DNS-OARC has been collecting DNS query data “from busy and interesting DNS name servers” as part of an annual “Day-in-the-Life” (DITL) effort (an effort originated by CAIDA in 2002) that I discussed in the first blog post in this series. DNS-OARC currently offers eight such data sets, covering the queries to many but not all of the 13 DNS root servers (and some non-root data) over a two-day period or longer each year from 2006 to present. With tens of billions of queries, the data sets provide researchers with a broad base of information about how the world is interacting with the global DNS as seen from the perspective of root and other name server operators.
In order for second-level domain (SLD) blocking to mitigate the risk of name collisions for a given gTLD, it must be the case that the SLDs associated with at-risk queries occur with sufficient frequency and geographical distribution to be captured in the DITL data sets with high probability. Because it is a purely quantitative countermeasure, based only on the occurrence of a query, not the context around it, SLD blocking does not offer a model for distinguishing at-risk queries from queries that are not at risk. Consequently, SLD blocking must make a stronger assumption to be effective: that any queries involving a given SLD occur with sufficient frequency and geographical distribution to be captured with high probability.
Put another way, the DITL data set – limited in time to an annual two-day period and in space to the name servers that participate in the DITL study – offers only a sample of the queries from installed systems, not statistically significant evidence of their behavior and of which at-risk queries are actually occurring.
Throughout this series of blog posts we’ve discussed a number of issues related to security, stability and resilience of the DNS ecosystem, particularly as we approach the rollout of new gTLDs. Additionally, we highlighted a number of issues that we believe are outstanding and need to be resolved before the safe introduction of new gTLDs can occur – and we tried to provide some context as to why, all the while continuously highlighting that nearly all of these unresolved recommendations came from parties in addition to Verisign over the last several years. We received a good bit of flack from a small number of folks asking why we’re making such a stink about this, and we’ve attempted to meter our tone while increasing our volume on these matters. Of course, we’re not alone in this, as a growing list of others have illustrated, e.g., SSAC SAC059’s Conclusion, published just a little over 90 days ago, illustrates this in part:
The SSAC believes that the community would benefit from further inquiry into lingering issues related to expansion of the root zone as a consequence of the new gTLD program. Specifically, the SSAC recommends those issues that previous public comment periods have suggested were inadequately explored as well as issues related to cross-functional interactions of the changes brought about by root zone growth should be examined. The SSAC believes the use of experts with experience outside of the fields on which the previous studies relied would provide useful additional perspective regarding stubbornly unresolved concerns about the longer-term management of the expanded root zone and related systems.
In 2010, ICANN’s Security and Stability Advisory Committee (SSAC) published SAC045, a report calling attention to particular problems that may arise should a new gTLD applicant use a string that has been seen with measurable (and meaningful) frequency in queries for resolution by the root system. The queries to which they referred involved invalid top-level domain (TLD) queries (i.e., non-delegated strings) at the root level of the domain name system (DNS), queries which elicit responses commonly referred to as Name Error, or NXDomain, responses from root name servers.
Do you recall when you were a kid and you experienced for the first time an unnatural event where some other kid “stole” your name and their parents were now calling their child by your name, causing much confusion for all on the playground? And how this all made things even more complicated – or at least unnecessarily complex when you and that kid shared a classroom and teacher, or street, or coach and team, and just perhaps that kid even had the same surname as you, amplifying the issue! What you were experiencing was a naming collision (in meatspace).
For nearly all communications on today’s internet, domain names play a crucial role in providing stable navigation anchors for accessing information in a predictable and safe manner, irrespective of where you’re located or the type of device or network connection you’re using. The underpinnings of this access are made possible by the Domain Name System (DNS), a behind the scenes system that maps human-readable mnemonic names (e.g.,www.Verisign.com) to machine-usable internet addresses (e.g., 188.8.131.52). The DNS is on the cusp of expanding profoundly in places where it’s otherwise been stable for decades and absent some explicit action may do so in a very dangerous manner.
Verisign recently published a technical report on new generic top-level domain (gTLD) security and stability considerations. The initial objective of the report was to assess for Verisign’s senior management our own operational preparedness for new gTLDs, as both a Registry Service Provider for approximately 200 strings, as well as a direct applicant for 14 new gTLDs (including 12 internationalized domain name (IDN) transliterations of .com and .net). The goal was to help ensure our teams, infrastructure and processes are prepared for the pilot and general pre-delegation testing (PDT) exercises, various bits of which are underway, and the subsequent production delegations and launch of new gTLDs shortly thereafter.