As much as we all love the internet and everything it offers, we’ve also all experienced that sinking feeling when we try to access our favorite website, only to find it’s down.
If you run your own site, you know that uptime is crucial for your online success — so that sinking feeling in your chest when your own website is down is … well, even worse.
But let’s face it: even the internet giants aren’t immune to outages. Up for a cup of coffee as we stroll down memory lane together and reminisce about some of the most epic website outages of all time?
Whether they’re caused by unexpected traffic surges and server failures, DDoS attacks, or the ever-present human error, outages are never fun — but hopefully, we can all learn from the past.
Facebook, October 2021
On October 4th, 2021, Facebook experienced a massive outage that also took down other Meta services such as WhatsApp, Instagram, Messenger, and Oculus Quest (which allows VR headset Oculus users to stream TV, movies, and videos).
The outage lasted approximately six hours. It was caused by what Facebook later described as a boring error: “configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication.”
It might not sound like much, but for a social network that also provides an authentication mechanism for other companies, six hours is an eternity. Even worse, because the outage also took down the tools used to reset the routers handling traffic, employees had to manually restart all systems.
And in what sounds like one of those movies where everything goes wrong, employees could not, at first, even access the building where the data centers were located to debug the issue.
A report issued by Facebook the day after the outage explained that “this took time, because these facilities are designed with high levels of physical and system security in mind. They’re hard to get into, and once you’re inside, the hardware and routers are designed to be difficult to modify even when you have physical access to them” (as reported via MarketWatch). By the time all was said and done, the outage had cost Facebook roughly $60 million in ad revenue and $47.3 billion in lost market cap. Zuckerberg lost $6 billion of his personal wealth in the few hours Facebook was down, according to Bloomberg.
Fastly, June 8th, 2021
The name Fastly might not carry the same weight as Facebook or Amazon for most people out there, but this cloud-based Content Delivery Network (CDN) provider is behind the content delivery of major companies around the world.
When the one-hour outage hit in 2021, CNBC reported that high-traffic websites and online services such as Amazon, Reddit, The New York Times, Shopify, Twitter, Spotify, and even the UK government’s official website all went down, displayed error messages, or experienced on-and-off difficulties.
The outage was caused by a software bug within the company’s CDN configuration. Funnily enough, the bug wasn’t triggered at the server or even from the company side – it happened when a single customer changed their own setting, accidentally “waking up” the bug (already in the system but dormant) that led to the outage. According to Fastly’s summary of the event, that single, innocent change caused 85% of their network to return errors.
According to The Wolfcast, the outage “may have cost digital platforms up to $150 million in lost sales,” and that was for just one hour of offline time.
British Airways, May 28th, 2017
Downtime doesn’t only affect shopping and entertainment sites. Perhaps even scarier is the fact that it can also disrupt transportation. This happened in 2017, when an outage took down many of British Airways’ systems and operations.
When an engineer accidentally disconnected the power supply to British Airways’ data center, a major outage followed, disrupting BA’s global operations.
According to The Guardian, over 1,000 flights were grounded, terminals in London overflowed with 75,000 stranded passengers, and the booking system and baggage handling were both affected.
A month later, the British Airways owner estimated the data center outage cost the company about $102 million between lost revenue and the expense of compensating thousands of passengers, according to Data Center Knowledge.
Google, December 14th, 2020
When a giant like Google goes down, the entire world feels the impact. Google’s outage in 2020 only lasted 45 minutes, but it’s considered one of the biggest outages to ever hit the internet, The Guardian reported.
The outage took down Google services, including Gmail, Google Drive, YouTube, Google Calendar, Google Home apps, and Google Maps.
The cause of the crash? A lack of storage space in Google’s authentication tools (what Google later called “an internal storage quota issue”). When the system failed to release more space automatically, it hit an error and crashed.
Can you guess the damage that little 45-minute crash caused? Google lost $1.7M in ad revenue during the YouTube outage, according to Fox Business.
Dyn, October 21st, 2016
Between 2001 and 2017, Dyn was an Internet performance management and web application security company that handled things like data traffic management and served as a Domain Name System (DNS) provider.
When the outage happened in 2016, many major companies (everybody from Twitter and Spotify to Netflix, Airbnb, Amazon, eBay, and the PlayStation Network) were using Dyn as their DNS provider, and they all went down with it.
The cause behind the outage? One of the biggest distributed denial-of-service (DDoS) attacks to ever hit the internet. Wired called it the DDoS attack “that took down a big chunk of the Internet for most of the Eastern seaboard.” The attack overwhelmed the company’s servers using a network of malware-infected devices, exploiting vulnerabilities in basic equipment like printers and IP cameras. Later reports identified the attack as the largest of its kind in history (via The Guardian).
In the end, the Dyn outage cost the business millions in lost revenue. Although there aren’t specific numbers available about the losses, CoverLink points out that “organizations spend an average of $2.5 million recovering from DDoS attacks” and because Dyn’s outage was so widespread, it likely cost a lot more than that.
Spotify and Discord, March 8, 2022
Both Spotify and Discord suffered interruptions in service in March 2022. At the time, Mashable reported that it started with smaller issues around 1 pm, with users unable to log in and support pages glitching.
Within half an hour, things were deteriorating, with API failures and further glitches complicating matters. As The Verge reported later, it took about two hours before things started to come back online, just as Google Cloud (the service provider both Spotify and Discord operate on) announced it had its own glitch due to a malfunctioning component that required a reroute.
Twitter and Instagram, July 14, 2022
Another two-for-one outage hit Twitter and Instagram on the same day in July 2022, but with a twist: the outages weren’t actually connected.
Twitter went down for 40 minutes in the early morning of July 14th, 2022. Within minutes, over half a million users were reporting issues uploading tweets and logging into the service.
Twitter had already suffered two outages earlier that year, in February, for what the company called “a technical bug that briefly impacted how Tweets were loading,” so when it happened again in July, people were less than happy (via The Verge).
About an hour later, however, Twitter was back up and running after what was described as “some trouble with internal systems.”
Just a few hours later, CNET reported that Instagram went down too, with people reporting issues accessing the service, sending DMs, or seeing the app crash as soon as they tried to open it. And here’s some irony for you: Instagram users flocked to Twitter to report Instagram outages as soon as they started.
Instagram was up again within a couple of hours, only to suffer another major outage in October 2022. This time, it wasn’t just a question of crashes and difficulty accessing the app: accounts were also accidentally locked and suspended because of a bug. By the time Instagram was back up, many large accounts had lost millions of followers, reports Lifestyle Asia.
Amazon Web Services, 2017 and 2021
Amazon has had its share of outages over the years. And because of its size, no other company out there loses more money every time its website goes down.
Amazon Web Services (AWS) had a major outage in 2017, during which millions of cloud service and website users lost access to the sites and services that depend on it.
The main issue with Amazon going down is that Amazon’s S3 web-based storage service provides cloud services for a lot of other sites out there. So when Amazon’s web services fail, other sites go down with them. In this case, that meant everybody from Apple to Venmo to Slack suffered the consequences.
This is what happened on February 28, 2017, when a simple human error “broke the internet.” According to Data Center Knowledge, an engineer was debugging an issue when he accidentally mistyped a command. That was it: a single mistyped command took down the cloud for several hours and caused headaches for many companies. That “oops” resulted in over $150 million in losses for the companies involved.
Amazon Web Services (AWS) experienced another hours-long outage in December 2021, this time taking down Disney, Netflix, and Spotify as well. Even Alexa and iRobot reported glitches and connectivity issues.
According to CNBC, the effects even extended beyond commercial sites: many colleges in the U.S. had to cancel exams because they couldn’t access the platforms where the exams were hosted.
Even worse, by the time the outage hit on December 22nd, Amazon was still recovering from two other major outages from earlier in the month — all three were caused by power outages at one of its data centers.
Because of its widespread reach, it’s almost impossible to calculate the losses caused by the December 22nd outage, but analysts believe it could potentially have cost “at least a billion dollars in economic loss to companies that depend on AWS,” TFIR says.
If there’s anything we can learn from these examples, it’s that downtime can affect everybody. Even the big players have to deal with outages and the financial losses, reputation damage, and customer dissatisfaction that come with them.
And while there’s no doubt that some outages may be unavoidable, it’s important for any company, no matter the size, to invest in proactive measures like downtime monitoring to prevent and respond to them as quickly and effectively as possible.
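To make "downtime monitoring" a little more concrete, here is a minimal sketch of an uptime check in Python, using only the standard library. It is an illustration, not a replacement for a real monitoring service: the function names (fetch_status, is_up, check_site), the five-second timeout, the three-attempt retry, and the rule that any 2xx or 3xx status counts as "up" are all assumptions made for this example.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 503).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None


def is_up(status: Optional[int]) -> bool:
    """Assumed rule for this sketch: any 2xx or 3xx response counts as up."""
    return status is not None and 200 <= status < 400


def check_site(
    url: str,
    fetch: Callable[[str], Optional[int]] = fetch_status,
    attempts: int = 3,
) -> bool:
    """Report the site as down only if every attempt fails.

    Retrying avoids raising an alarm over a single dropped request.
    """
    return any(is_up(fetch(url)) for _ in range(attempts))
```

A scheduler (a cron job, for example) could call check_site("https://example.com") every minute and send an alert whenever it returns False. The fetch parameter exists so the check can be tested without touching the network.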