A couple of hours before folks on the East Coast could see this past Wednesday's sunrise, some found themselves battling to get webpages to load, or without Internet access entirely. It'd be understandable in this situation to assume that a DDoS attack had taken place, since those have become (far too) common lately, but this partial outage had nothing to do with that. Instead, it hinged entirely on aging networking equipment.
BGP is a virtually unknown acronym to the end user, even to those who might know a little bit about general networking, but it's integral to making the Internet work. As the name Border Gateway Protocol may suggest, BGP routers sit at the borders between networks and have the job of making Internet routing as efficient as possible. To do this, each one keeps a massive, frequently updated list of routes to other networks. Given that many BGP routers are aging by this point, you might be able to understand where the problem lies.
In this particular case, the number of routes these routers had to track passed 512,000, which is more than the memory in some older machines can hold at once. Memory was exceeded, and downtime was the result. Yes, it really is that simple.
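For the curious, here is a minimal sketch of that failure mode in Python. It is a toy model, not how any real router is implemented: the class name `FixedCapacityRouteTable`, its methods, and the tiny capacity are all made up for illustration. What a real router does when its table fills up varies by vendor and model; this sketch assumes it simply refuses any route that doesn't fit.

```python
import ipaddress


class FixedCapacityRouteTable:
    """Toy model of a router that can only hold a fixed number of routes."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical limit; think 512,000 on old gear
        self.routes = {}          # network prefix -> next hop
        self.dropped = 0          # routes refused because the table was full

    def install(self, prefix, next_hop):
        """Try to add a route; return False if there is no room for it."""
        network = ipaddress.ip_network(prefix)
        if network in self.routes:
            # Updating a route we already hold needs no extra space.
            self.routes[network] = next_hop
            return True
        if len(self.routes) >= self.capacity:
            # Assumption: a full table refuses the new route outright.
            self.dropped += 1
            return False
        self.routes[network] = next_hop
        return True

    def lookup(self, address):
        """Return the next hop for the most specific matching route, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: traffic to this address goes nowhere
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# A table with room for two routes is offered three.
table = FixedCapacityRouteTable(capacity=2)
table.install("10.0.0.0/8", "peer-a")
table.install("192.0.2.0/24", "peer-b")
table.install("198.51.100.0/24", "peer-c")  # refused: the table is full

print(table.lookup("192.0.2.7"))     # a route that made it in
print(table.lookup("198.51.100.9"))  # a route that did not
```

The destinations behind the refused route become unreachable from this router even though nothing is wrong with them, which matches the patchy "some sites load, some don't" experience described above.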
The solution is also simple: equipment needs to be upgraded so that this doesn't happen again. Of course, many enterprises (and ISPs in particular) are rarely quick to upgrade aging equipment, because it's seriously expensive. While this downtime could prove to be a wake-up call for some, it's going to take many more for real change to take place. Until then, if you find yourself without an Internet connection, it might just be due to an overloaded BGP router.