DDoS attack may have overloaded Azure DNS servers
Microsoft suffered a large-scale service interruption this week that affected users around the world, disrupting a range of its business software and services as well as its gaming services.
A preliminary investigation released by Microsoft showed that the query volume on its DNS servers rose sharply, and that this surge caused queries from legitimate users to fail.
When a legitimate user’s query fails, the client retries more often to try to restore service, which pushes the query volume even higher and sharply increases the load on the servers.
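The feedback loop described above can be sketched with a toy model. This is only an illustration under stated assumptions, not Microsoft's system: the function name, the fixed server capacity, the per-tick retry behavior and all numbers are hypothetical.

```python
def simulate_retry_storm(base_qps, capacity_qps, max_retries, steps):
    """Return the load offered to the server at each tick (toy model).

    Assumptions (all hypothetical): the server answers at most
    capacity_qps queries per tick, failures are spread evenly across
    all offered queries, and every failed query is retried in the next
    tick, up to max_retries times per original query.
    """
    history = []
    # pending[k] = queries that have already failed k + 1 times and will retry
    pending = [0.0] * max_retries
    for _ in range(steps):
        offered = base_qps + sum(pending)
        history.append(offered)
        # Fraction of offered queries the server cannot answer this tick
        fail_ratio = max(0.0, 1.0 - capacity_qps / offered) if offered > 0 else 0.0
        new_pending = [0.0] * max_retries
        if max_retries > 0:
            # Failed fresh queries become first retries
            new_pending[0] = base_qps * fail_ratio
            # Failed k-th retries become (k + 1)-th retries
            for k in range(1, max_retries):
                new_pending[k] = pending[k - 1] * fail_ratio
        pending = new_pending
    return history


if __name__ == "__main__":
    # Hypothetical numbers: demand is 1.5x what the server can answer
    for tick, load in enumerate(simulate_retry_storm(150, 100, 3, 6)):
        print(f"tick {tick}: offered load = {load:.1f}")
```

In this model, once demand exceeds capacity the retries add to the offered load on every tick, so the overload feeds itself; when demand stays below capacity, no retries are generated and the load stays flat.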
The attack appears to have been aimed at certain domains hosted on Microsoft Azure; in other words, the target was not Microsoft itself, yet the attack ended up paralyzing Microsoft’s cloud services. Microsoft 365, the Xbox Live game service, Microsoft Intune, Microsoft Teams and Exchange Server were all affected.
Microsoft did not attribute the attack to any advanced persistent threat group, but an ordinary DDoS attack is usually not enough to take Microsoft down.
The company operates a number of large data centers around the world with extremely high bandwidth, so small-scale attack traffic cannot directly paralyze its systems.
Unfortunately, this attack exposed a flaw in Microsoft’s cloud infrastructure: a defect that reduced the efficiency of the caching layers Microsoft built to absorb such surges.
The soaring query volume overloaded the Microsoft DNS service, and the client retries that followed were treated as legitimate DNS requests and entered the queue.
Under normal circumstances, Microsoft simply drops traffic that it identifies as illegitimate, so this small defect set off a chain of consequences that paralyzed the system.
Microsoft did not specify what the flaw is, saying only that it is a code defect in its DNS service. The attacker may not have expected a casual attack to have such serious consequences.
“Azure DNS servers experienced an anomalous surge in DNS queries from across the globe targeting a set of domains hosted on Azure. Normally, Azure’s layers of caches and traffic shaping would mitigate this surge. In this incident, one specific sequence of events exposed a code defect in our DNS service that reduced the efficiency of our DNS Edge caches.”
“As our DNS service became overloaded, DNS clients began frequent retries of their requests which added workload to the DNS service. Since client retries are considered legitimate DNS traffic, this traffic was not dropped by our volumetric spike mitigation systems. This increase in traffic led to decreased availability of our DNS service,” Microsoft explained in the RCA for this week’s outage.
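Why a drop in cache efficiency matters so much can be shown with simple arithmetic. The sketch below is illustrative only: the function name, the query rate and the hit ratios are hypothetical figures, not numbers from Microsoft's RCA.

```python
def backend_qps(total_qps, cache_hit_ratio):
    """Queries per second that miss the cache and reach the backend.

    cache_hit_ratio is the fraction (0.0 to 1.0) of queries answered
    directly from the edge cache.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_qps * (1.0 - cache_hit_ratio)


if __name__ == "__main__":
    # Hypothetical figures: the same incoming traffic, two cache hit ratios
    total = 1_000_000
    healthy = backend_qps(total, 0.99)
    degraded = backend_qps(total, 0.90)
    print(f"healthy cache:  {healthy:,.0f} backend queries per second")
    print(f"degraded cache: {degraded:,.0f} backend queries per second")
    print(f"backend load grows by a factor of about {degraded / healthy:.0f}")
```

Under these assumed numbers, a fall in hit ratio from 99% to 90% multiplies backend load roughly tenfold with no change in incoming traffic, which is why a defect in the caching layer can turn a manageable surge into an overload.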