DNS Lookup
DNS stands for Domain Name System. A distributed set of databases stores the domain names that exist online together with their corresponding IP addresses. A DNS lookup is the process of finding the IP address for a given domain name. DNS lookup time is worth measuring because DNS lookup is a key bottleneck in web crawling. Long delays are mainly caused by two factors: the latency between the user and the DNS resolving server, and the latency between the resolving servers and other nameservers. The first arises from network round-trip time (RTT), the geographical distance between client and server, network congestion, packet loss, and similar effects. The second is caused by cache misses, underprovisioning, and malicious traffic. Moreover, quantitative analysis of DNS lookup is the foundation for diagnosing slow web browsing.
Two types of machines were used in the DNS lookup experiments for comparison: the Duke CoLab virtual machine and Amazon EC2. To cover a diverse set of domains, this project includes five: “amazon.com”, “pratt.duke.edu”, “princeton.edu”, “youtube.com”, and “youku.com”. The selection was based on five assumptions: (1) the DNS lookup latency for “amazon.com” is shorter on EC2 than on the Duke CoLab virtual machine; (2) the DNS lookup latency for “pratt.duke.edu” is shorter on the Duke CoLab virtual machine than on EC2; (3) “princeton.edu” is a different edu domain, whose DNS lookup latency is shorter on the Duke CoLab virtual machine than on EC2; (4) “youtube.com” is a domestic commercial domain, whose DNS lookup latency should be shorter than that of a foreign commercial domain on both the Duke CoLab virtual machine and EC2; (5) “youku.com” is a foreign commercial domain, included for comparison with assumption (4). The author obtained several results from her monitored data.
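As an illustration of how such lookup latencies could be collected, the following is a minimal sketch and not the author's actual monitoring script. It assumes that timing the system resolver through Python's socket.getaddrinfo is an acceptable proxy for DNS lookup time; the function names time_lookup and sample_latencies and the repeats parameter are hypothetical. Because the operating system or a local resolver may cache answers, repeated samples can be much faster than the first one.

```python
import socket
import time

# The five domains named in the text.
DOMAINS = ["amazon.com", "pratt.duke.edu", "princeton.edu",
           "youtube.com", "youku.com"]


def time_lookup(domain, resolve=socket.getaddrinfo):
    """Return the time in milliseconds taken to resolve `domain`.

    Returns None if resolution fails. `resolve` is injectable so the
    timing logic can be exercised without network access (an assumption
    of this sketch, not something described in the source).
    """
    start = time.perf_counter()
    try:
        resolve(domain, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None
    return (time.perf_counter() - start) * 1000.0


def sample_latencies(domains, repeats=3, resolve=socket.getaddrinfo):
    """Time each domain `repeats` times and keep the successful samples."""
    results = {}
    for domain in domains:
        samples = [time_lookup(domain, resolve) for _ in range(repeats)]
        results[domain] = [s for s in samples if s is not None]
    return results
```

Running sample_latencies(DOMAINS) at regular intervals on each machine would produce one latency series per domain, which is the kind of data the comparison below relies on.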
First, the DNS lookup latency on the Duke CoLab virtual machine is shorter than on Amazon EC2. A possible reason is that the Duke DNS server is physically located in Durham, very close to the Duke CoLab virtual machine, while the Amazon DNS servers are located in another area that is not physically close to the EC2 instance. Second, the Duke CoLab virtual machine has its shortest DNS lookup latencies for “pratt.duke.edu” and “princeton.edu”, while Amazon EC2 has its shortest DNS lookup latency for “amazon.com”, which is a reasonable outcome. Third, both the Duke CoLab virtual machine and EC2 perform worst on the foreign web domain, because the DNS lookup of a foreign domain involves more complicated routing paths than that of a domestic domain. Furthermore, to expose periodic patterns in the data, the author presented the DNS lookup latency not only in the time domain but also through quantitative analysis in the frequency domain. The results showed peaks of slightly longer latency recurring every half hour and every hour. This is likely due to the regular updating and flushing of the DNS server.
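The frequency-domain step can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's analysis: it applies a plain discrete Fourier transform to an evenly sampled latency series and reports the periods with the largest magnitude. The helper names dominant_periods and synthetic_series are hypothetical, and the series is synthetic (one sample per minute, with invented 30-minute and 60-minute components) rather than measured data.

```python
import cmath
import math


def dominant_periods(samples, interval_minutes, top=2):
    """Return the `top` periods (in minutes) with the largest DFT magnitude.

    `samples` must be evenly spaced, `interval_minutes` apart.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant (DC) term
    peaks = []
    for k in range(1, n // 2 + 1):
        # k-th DFT coefficient, computed directly (O(n^2) overall,
        # which is acceptable for a few hundred samples).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        period = n * interval_minutes / k
        peaks.append((abs(coeff), period))
    peaks.sort(reverse=True)
    return [period for _, period in peaks[:top]]


def synthetic_series(minutes=240):
    """Made-up latency series (ms) with 30- and 60-minute components."""
    return [20.0
            + 3.0 * math.cos(2 * math.pi * t / 30)
            + 2.0 * math.cos(2 * math.pi * t / 60)
            for t in range(minutes)]
```

Calling dominant_periods(synthetic_series(), interval_minutes=1) on this synthetic input recovers periods of about 30 and 60 minutes, which mirrors how half-hourly and hourly peaks would stand out in a measured series.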