
Hurricane Electric massive DDoS

In Cloud Computing, Networking on October 4, 2011 at 2:37 pm

One of our hosting providers, Linode, co-locates their servers in Hurricane Electric's (HE) Fremont data center. Early today HE got hit by a massive (there is really no other word to describe it) DDoS. It started at around 1:30 AM GMT+8 and ended around 10 AM (at least for us, with a couple of servers affected). Linode posted this update from HE:

On October 3rd we experienced a large attack against multiple core routers on a scale and in ways not previously done against us. We had various forms of attack mitigation already in place, we have added more. It was all fixable in the end, just the size and number of routers getting attacked and the figuring out what attacks were doing what to what took some time. The attack mitigation techniques we’ve added will be left in place. We are continuing to add additional layers of security to increase the resiliency of the network.

Because the attackers were changing their methods and watching how their attacks were responded to, we are not at liberty to elaborate on the nature of the security precautions taken.

This attack is interesting for a few reasons:

  1. The core routers were the target.  A typical DDoS targets a specific domain or service; by targeting HE's routers, the impact of the attack is much broader, i.e. it affected all the customers of the data center.  When Amazon was attacked, its users hardly felt any degradation in performance.  That's because the attack was against a domain, and we already know that Amazon has thousands of load-balanced servers which regularly take on the load of last-minute shopping.  This one was different: instead of attacking the servers, they attacked the core routers and switches which act as 'gateways' to the load balancers, firewalls and servers.  A core or edge router provides gateway routing and connectivity for dozens of other routers and possibly thousands of servers to the rest of the Internet; shut that router down and you've effectively made those thousands of servers inaccessible (see the sketch after this list).  The attack targeted "multiple" core routers at HE.
  2. The attack was successful.  New-generation routers usually have built-in anti-DoS features; the fact that those were all overwhelmed means that a) the volume was simply too massive (it's not really difficult to congest a pipe: ten thousand bots pushing 1 Mbps each will fill a 10 Gbps uplink) and/or b) a protocol exploit that used up a lot of CPU was involved, e.g. BGP is a frequent target.
  3. The attack was dynamic.  HE mentioned that the attackers were changing their methods in real time and watching how their attacks were being responded to.  Obviously, they're not dealing with script kiddies here.
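
To make the point in item 1 concrete, here is a minimal, purely illustrative Python sketch; the topology, names and counts are made up and are not HE's actual network.  Every downstream server is reachable from the Internet only through a handful of core routers, so knocking out those routers cuts off all the servers at once even though the servers themselves are perfectly healthy.

    # Toy model: reachability through core routers (illustrative only, hypothetical names).
    from collections import deque

    # Adjacency list: the Internet reaches the core routers, which reach the
    # aggregation switches, which reach the actual servers.
    topology = {
        "internet": ["core-1", "core-2"],
        "core-1":   ["agg-1", "agg-2"],
        "core-2":   ["agg-1", "agg-2"],
        "agg-1":    [f"server-{i}" for i in range(0, 500)],
        "agg-2":    [f"server-{i}" for i in range(500, 1000)],
    }

    def reachable(start, down=frozenset()):
        """Return every node reachable from `start`, skipping nodes that are down."""
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            if node in seen or node in down:
                continue
            seen.add(node)
            queue.extend(topology.get(node, []))
        return seen

    servers = {n for nbrs in topology.values() for n in nbrs if n.startswith("server-")}

    # Normal operation: all 1000 servers are reachable from the Internet.
    print(len(servers & reachable("internet")))                              # 1000

    # DDoS takes out the core routers: every server becomes unreachable,
    # even though none of the servers was attacked directly.
    print(len(servers & reachable("internet", down={"core-1", "core-2"})))   # 0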

I can think of several scenarios for why somebody would do this (conspiracy hat firmly in place):

  • it's a red herring: there really was a target, hosted by HE or one of its customers, but the perpetrators wish to hide that fact; or
  • somebody has an ax to grind with HE; it could be a disgruntled network engineer, it can happen; or
  • it's a proof-of-concept test, and this is the real concern.  Obviously, the attackers have figured out a way to execute the attack dynamically and massively, and considering that it took one of the most experienced (and arguably most jaded) data center operators almost 12 hours to stop the attack, something new was done.  One could also argue that the attack stopped not because HE was able to apply rules against or adapt to the attack patterns (remember, HE said the attack was evolving), but because the attackers simply decided to stop; they could have continued if they wanted to, forcing HE to find and counter yet another attack pattern.

Whoever it is, and we'll probably never know who he/she/they are, this is a very real and major concern, especially if you're in the business of hosting and service provisioning online.  Unfortunately, if this can happen to HE, it can happen to anybody.

update 2:48PM GMT+8: well, it looks like the attacker(s) simply went out to grab dinner. Linode is reporting that they're experiencing another round of 'stability issues' with their Fremont (i.e. HE) 'upstream'.

update 3:31PM GMT+8: the network has stabilized 'again' according to Linode. Just a quick clarification: apparently, it's not only HE's Fremont facility that was affected but their NY data center as well. A take-no-prisoners approach, I see.

update 6:43PM GMT+8: apparently, a similar attack, albeit a limited one, hit HE a week ago. Just a probe then; today was D-Day.

On September 28, 2011 10:20pm PDT and September 29, 2011 11:45am PDT, the Fremont 1 datacenter was subject to a DDOS targeting a core router. The attack caused OSPF and BGP reloads resulting in elevated CPU utilization and performance degradation of the router.

The incident on September 28, 2011 10:20pm PDT was identified and mitigated at 10:40pm PDT. The incident on September 29, 2011 11:45am was identified and partial mitigation was realized shortly thereafter with full containment at approximately 12:45pm PDT. All systems are fully operational at this time. We have already been in contact with the router vendor, and have obtained a new software image that addresses this type of infrastructure attack. We will be deploying the new image shortly. A maintenance notification will be sent out separately regarding this emergency maintenance.
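
The 'elevated CPU utilization' HE describes is the usual tell of a control-plane attack: OSPF and BGP are handled by the router's CPU, not by the forwarding hardware, so repeated session resets keep the CPU pegged.  A crude way to watch for this is to poll the router's CPU gauge over SNMP.  The sketch below is only an assumption of how one might do that with the standard net-snmp snmpget tool; the hostname, community string and in particular the CPU OID are placeholders that vary by vendor and must be looked up for your device.

    #!/usr/bin/env python3
    # Crude control-plane health check: poll a router's CPU gauge via SNMP and
    # warn when it crosses a threshold. Illustrative sketch only; values below
    # are placeholders, not a real router.
    import subprocess, time

    ROUTER    = "core-1.example.net"                      # placeholder hostname
    COMMUNITY = "public"                                  # placeholder read-only community
    CPU_OID   = "<vendor-specific CPU utilisation OID>"   # look this up for your platform
    THRESHOLD = 85                                        # percent; arbitrary alert level

    def cpu_percent():
        """Fetch the CPU gauge with net-snmp's snmpget and return it as an int."""
        out = subprocess.run(
            ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", ROUTER, CPU_OID],
            capture_output=True, text=True, check=True,
        )
        return int(out.stdout.strip())

    if __name__ == "__main__":
        while True:
            try:
                cpu = cpu_percent()
                if cpu >= THRESHOLD:
                    print(f"WARNING: {ROUTER} control-plane CPU at {cpu}% "
                          "- possible routing-protocol churn or DDoS")
            except (subprocess.CalledProcessError, ValueError) as exc:
                print(f"poll failed: {exc}")
            time.sleep(30)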
