As discussed in the several studies on name collisions published to date, determining which queries are at risk, and thus how to mitigate the risk, requires qualitative analysis (New gTLD Security and Stability Considerations; New gTLD Security, Stability, Resiliency Update: Exploratory Consumer Impact Analysis; Name Collisions in the DNS). Blocking a second level domain (SLD) simply on the basis that it was queried for in a past sample set runs a significant risk of false positives. SLDs that could have been delegated safely may be excluded on quantitative evidence alone, limiting the value of the new gTLD until the status of the SLD can be proven otherwise.
Similarly, not blocking an SLD on the basis that it was not queried for in a past sample set runs a comparable risk of false negatives.
A better way to deal with the risk is to treat not the symptoms but the underlying problem: that queries are being made by installed systems (or internal certificates are being employed by them) under the assumption that certain gTLDs won’t be delegated.
A query for an applied-for generic top-level domain (gTLD) provides initial evidence that an installed system may be at risk from name collisions. Depending on what data is collected, that evidence may also include one or more SLDs, the IP address of the resolver that sent the query, and other forensic information such as the full query string. This information can be a good starting point for understanding why an installed system has made certain queries, what could happen if the responses to the queries were changed, and what other queries, not in the particular sample set, could also put the installed system at risk. A comprehensive analysis requires much more than just a count of the number of queries for a given gTLD and/or SLD. It also requires a set of measurements such as those described in detail in the New gTLD Security, Stability, Resiliency Update: Exploratory Consumer Impact Analysis, incorporating the context of those queries:
- Periodicity: Do the queries repeat at a regular frequency? This can help determine whether the queries are a result of user browsing, or of an automated process that depends on a certain response.
- Affinity: Where are the queries coming from? Are they correlated with one country? One network?
- Impact: Which network protocol generated the query? The WPAD, ISATAP and DNS-SD protocols all generate DNS queries in support of internal network configuration that could result in queries to the global DNS.
The analysis in the New gTLD Security, Stability, Resiliency Update: Exploratory Consumer Impact Analysis applied these measurements to produce a qualitative “risk matrix” for applied-for gTLDs including risk vectors based on frequency of occurrence of WPAD, ISATAP, DNS-SD queries, internal name certificates, HTML references, and regional affinities, among other factors (such as queries that appear to be related to McAfee antivirus defenses).
Verisign Labs’ analysis of the query data for the .CBA suffix offers an enlightening example of how mitigation should be conducted. The data illustrated that a significant number of queries for this applied-for gTLD were originating from one large network. The research team suspected that changes in the behavior of the global DNS might put those queries at risk if the gTLD were delegated. As follow up, Verisign personnel reached out to the network operator, which has since reconfigured some of its internal systems to use a different suffix. As a result of that remediation, as is shown in the figure below, the volume of queries for .CBA observed at the A and J root servers has already begun to decline.
Additional posts in this series:
- Part 1 of 4 – Introduction: ICANN’s Alternative Path to Delegation
- Part 2 of 4 – DITL Data Isn’t Statistically Valid for This Purpose
- Part 4 of 4 – Conclusion: SLD Blocking Is Too Risky without TLD Rollback