#FacebookDown: What happened and what didn’t
Image by Gerd Altmann from Pixabay

Facebook on Monday suffered its longest outage since 2008, disrupting access to all of its platforms, including Instagram, WhatsApp and Oculus, and significantly affecting other online services. People quickly flocked to Twitter to discuss the six-hour service interruption and speculate on the cause.

It turns out that the cause was relatively benign and common, though the effects were felt around the world. A Facebook Engineering blog post pinpoints configuration changes as the root of the issue, which then had a “cascading effect,” with each step causing more problems.

Facebook said “you can’t find us”

Just before the outage began, Facebook sent out an update over something called BGP, or Border Gateway Protocol. BGP is like a traffic routing system for the Internet. Network operators such as Facebook and Google, other large companies, and Internet Service Providers each run what is known as an Autonomous System, identified by an assigned Autonomous System Number (ASN). Facebook’s happens to be AS32934, one of over 100,000 and growing.

BGP updates can be compared to updating a road map: each network operator announces which networks (streets) it is connected to. Internet traffic (drivers) can then use that information to determine the best path from one address to another, based on which roads connect the starting point to the destination. If a new highway opens, or a road is closed for construction, a BGP update is the way to share that information.
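To make the road map comparison concrete, here is a minimal Python sketch that finds a path between two networks by counting hops. It is a simplification, not how BGP itself is implemented: real routers also apply business policies and many other tie-breakers. The topology, the LINKS table and the shortest_as_path function are all hypothetical, and every AS number except Facebook’s AS32934 comes from the range reserved for documentation.

```python
from collections import deque

# Hypothetical map: each network (AS) lists the neighbours it connects to.
# AS64496-AS64499 are documentation-only numbers, not real operators.
LINKS = {
    "AS64496": {"AS64497", "AS64498"},
    "AS64497": {"AS64496", "AS64499"},
    "AS64498": {"AS64496", "AS64499"},
    "AS64499": {"AS64497", "AS64498", "AS32934"},
    "AS32934": {"AS64499"},
}


def shortest_as_path(links, source, destination):
    """Return a path with the fewest hops, or None if no path exists.

    Uses breadth-first search as a stand-in for route selection.
    """
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        # Sorting keeps the result deterministic when several paths tie.
        for neighbour in sorted(links.get(path[-1], ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


if __name__ == "__main__":
    print(shortest_as_path(LINKS, "AS64496", "AS32934"))
```

If the only link to a network is removed from the map, the same function returns None: the destination still exists, but no road leads to it.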

Cloudflare has published a readable but slightly more technical overview of how BGP works.

On Monday, Facebook’s configuration mistake was to withdraw all of its BGP routes. Rather than updating some of its routes, the company effectively told the Internet there was no way to reach it. In our map example, its street still existed, but it was no longer connected to anything.
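The effect of a withdrawal can be sketched with a toy routing table in Python. This is an illustration under stated assumptions, not Facebook’s configuration or a real router: the RoutingTable class is hypothetical, and 192.0.2.0/24 is a placeholder prefix reserved for documentation, not an actual Facebook address block. The lookup picks the most specific matching prefix, which is how routers generally choose among overlapping routes.

```python
import ipaddress


class RoutingTable:
    """Toy table mapping announced prefixes to the AS that originated them."""

    def __init__(self):
        self.routes = {}

    def announce(self, prefix, origin):
        # Record that `origin` says it can deliver traffic for `prefix`.
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        # Remove the route; the network itself is untouched, only the map entry.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin AS for the most specific match, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


if __name__ == "__main__":
    table = RoutingTable()
    table.announce("192.0.2.0/24", "AS32934")  # placeholder prefix
    print(table.lookup("192.0.2.10"))          # a route exists
    table.withdraw("192.0.2.0/24")
    print(table.lookup("192.0.2.10"))          # no route after the withdrawal
```

After the withdrawal the servers behind the prefix may be running normally, but the rest of the Internet has no entry telling it where to send traffic for them.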

Restoring service was reportedly made more difficult because many Facebook employees are still working from home and their internal communication tools were also unavailable. According to the New York Times, some employees at physical offices were “unable to enter buildings and conference rooms because their digital badges stopped working.”

Misinformation continues

Memes abounded on Twitter, saying “Tell your family that you can only get on Facebook if you’re vaccinated now” and “Keep Facebook offline, save democracy from disinformation.” While funny, they entirely miss the mark.

Somewhat ironically, a number of online personalities and media outlets chose to promote rumors and technically inaccurate or just plain wrong information.

Many jumped on rumors that AT&T, Verizon and T-Mobile in the United States were also suffering from outages at the same time. Those reports primarily came from a website called Downdetector. It’s a useful service for checking whether you are the only one having an issue with a website or ISP, but it does not monitor services automatically to see if they’re working. Instead, it relies on user reports submitted on its own website and posted on Twitter, so a spike in reports can reflect users blaming the wrong service rather than a real outage.

Others posted about Facebook’s “master code” being deleted. This seems to have been a bad interpretation of the fact that Facebook’s BGP routes disappeared. That misunderstanding quickly turned into conspiracy theories that Facebook’s source code and internal documents had been intentionally deleted as a result of either the Pandora Papers or the 60 Minutes interview with Frances Haugen, the whistleblower who provided the Facebook Files to the SEC and Wall Street Journal.

Those examples are relatively benign.

On the other hand, QAnon adherents, many of whom still believe that Donald Trump is President, heralded the outage as the beginning of the “10 days of darkness,” which they believe will include a media blackout and usher in a military coup, the public reinstatement of Trump, and public arrests and executions of Democrats and media personalities. The irony of posting about that total media blackout on Twitter seems to have been lost on them.

That particular theory didn’t trend on Twitter yesterday, but far-right social media site Parler did, as people urged their followers to move there. Other websites, many of which launched when Facebook, Twitter and Reddit began removing some Trump- and Q-related content, were also heavily promoted. These sites advance dangerous conspiracy theories, ranging from the use of Ivermectin as a treatment for COVID to outright antisemitic ideas.

Given all available evidence so far, the Facebook outage was not an external attack, internal sabotage or the deep state intervening. Network and server misconfigurations are easy to make and common, even in a big organization. Most interruptions like this simply don’t have the global impact that Facebook’s size and scale make possible.