Lessons Learned: Unpacking the Longest Facebook Outage of 2021

Social media has come a long way since its beginnings as a way simply to connect with people. It is now a thriving marketplace as well. On Facebook's Q4 2020 earnings call, Mark Zuckerberg said that more than 2.6 billion people use at least one of the company's apps each day, and that over 200 million small businesses use its platforms to reach their customers. When a platform with that many users suddenly stops working, the whole world notices.

On October 4, 2021, Facebook and its sister platforms Instagram, WhatsApp, and Messenger suffered a global outage lasting nearly six hours. Billions of users were unable to access any of these platforms. Instagram showed a 5xx server error message, while the Facebook site said only that something went wrong.

Along with Facebook's social media platforms, its virtual reality platform Oculus and Workplace, its suite of business communication tools, also went down. On Oculus, the browser still worked and users could load games they had already installed, but they could not use any social features or install new games.

Following the outage, Facebook issued multiple statements acknowledging the shutdown and explaining what caused it. The outage came alongside news of a whistleblower accusing Facebook of choosing profit over safety. Together, the two events pushed down the company's share price and Mark Zuckerberg’s wealth, dropping him lower on the list of the world's wealthiest people. According to Fortune estimates, the outage cost Facebook nearly $100 million in revenue.

Details of the most prolonged Facebook outage in recent history

Facebook has revealed that configuration changes on its backbone routers disrupted its services. These routers coordinate the network traffic between its data centers. A disruption to that traffic cuts off communication between the data centers and brings services to a halt, which is why Facebook and its other services stopped functioning.

A command issued during routine maintenance to assess the availability of global backbone capacity took down the entire backbone network and disconnected all the data centers. Facebook said its systems are designed to audit and block commands like this one, but a bug in the audit tool prevented it from doing so this time. From the outside, the outage looked DNS-related, which was partially true. DNS, the internet's address book, translates website names into the IP addresses of the servers to connect to. The Border Gateway Protocol (BGP) is how networks advertise to one another which IP addresses they can reach, so that traffic from the rest of the internet can find a route to those servers.

However, with Facebook's backbone network down, its DNS servers could no longer reach the data centers, and their BGP routes were withdrawn from the internet. As a result, no one on the internet could find Facebook's servers. An onsite team was dispatched to the data centers to debug the issue and restart the systems. The process was time-consuming because the facilities are highly secured, and the hardware and routers are designed to be hard to modify even with physical access. The network came back online once the team fixed the issues, but the challenge of a sudden traffic surge remained. Facebook has explained that experience from past drills helped it bring services back at this point without causing another system-wide failure.
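
The dependency described above, where name lookups fail once the routes to the name servers disappear, can be sketched with a toy model. This is a minimal illustration in Python, not a description of how BGP or Facebook's network is actually implemented. The ToyInternet class, the hostname social.example, and the addresses (taken from ranges reserved for documentation) are all hypothetical.

```python
import ipaddress


class ToyInternet:
    """Toy model: advertised prefixes plus a single DNS name server.

    Hypothetical illustration only; real BGP and DNS are far more involved.
    """

    def __init__(self, name_server_ip):
        self.routes = set()  # prefixes currently advertised (stand-in for BGP)
        self.records = {}    # hostname -> IP address held by the name server
        self.name_server_ip = name_server_ip

    def advertise(self, prefix):
        """Announce that addresses inside `prefix` are reachable."""
        self.routes.add(ipaddress.ip_network(prefix))

    def withdraw(self, prefix):
        """Stop announcing `prefix`; its addresses become unreachable."""
        self.routes.discard(ipaddress.ip_network(prefix))

    def is_reachable(self, ip):
        """True if any advertised prefix covers the address."""
        address = ipaddress.ip_address(ip)
        return any(address in network for network in self.routes)

    def resolve(self, hostname):
        """Look up a hostname; fails if there is no route to the name server."""
        if not self.is_reachable(self.name_server_ip):
            raise LookupError(f"no route to name server for {hostname}")
        return self.records[hostname]


def demo():
    """Resolve a name, withdraw the name server's route, then try again."""
    net = ToyInternet(name_server_ip="192.0.2.53")
    net.records["social.example"] = "198.51.100.10"
    net.advertise("192.0.2.0/24")      # prefix containing the name server
    net.advertise("198.51.100.0/24")   # prefix containing the web servers
    before = net.resolve("social.example")
    net.withdraw("192.0.2.0/24")       # the route to the name server vanishes
    try:
        net.resolve("social.example")
        after = "resolved"
    except LookupError:
        after = "lookup failed"
    return before, after
```

Calling demo() resolves the name while the name server's prefix is advertised and fails after that prefix is withdrawn, even though the DNS record itself never changed. That is why the outage looked like a DNS problem from the outside.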

The Impact of the Global Facebook Outage

Facebook gave assurances that the outage was not the result of a malicious attack and that user data was safe. However, the outage also disrupted the company's internal communications: employees could not send or receive external emails, access the corporate directory, or use many other internal functions.

Here are some cybersecurity lessons to learn from the global Facebook outage:

  1. Avoid relying on a single sign-in provider for other services. For example, accounts created with Facebook login details become difficult to access if Facebook goes down.
  2. Evaluate what data you are willing to share with social media sites. Sensitive information exposed during an outage or cyberattack can cause problems in the future.
  3. One of the best cybersecurity practices is to keep a backup of all data. The Facebook outage is a reminder that data can be lost or become unreachable when any system goes down.
  4. Use a different password for each account, and consider separate email addresses as well. Reusing one password across multiple accounts makes them an easy target for hackers and cybercriminals.
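
As a rough illustration of the last lesson, the sketch below generates a separate random password for each account using Python's standard secrets module. The function names, the 20-character default, and the service names in the usage note are assumptions made for this example. In practice a password manager does this job and also stores the results securely.

```python
import secrets
import string

# Characters to draw from: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Return a random password built with a cryptographically secure source."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passwords_for(services, length=20):
    """Generate an independent password for every named service."""
    return {service: generate_password(length) for service in services}
```

For example, passwords_for(["shop", "email", "bank"]) returns a dictionary with one independently generated password per service, so a leak at one service does not expose the others.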

Many small and medium-sized businesses operate primarily through these social media platforms. This incident has highlighted their dependence on social media and the need for a more stable primary platform. Facebook, Instagram, and WhatsApp are great supplements to other channels, but for a business that relies on them alone, a global Facebook outage can bring operations to a standstill.

Cyberattacks on these social media platforms, and data breaches affecting them, can compromise sensitive business information and cause revenue loss. To future-proof themselves, businesses can set up their own website and encourage customers to purchase there. Above all, it is necessary to create proper data backups so that no piece of data exists in only one place.
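
One way to approach the backup advice is to copy each important file and then confirm that the copy matches the original. The following is a minimal sketch using only the Python standard library. The function names and the idea of a local backup folder are assumptions for illustration, and a real backup plan would also keep copies off-site and test restoring them.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy `source` into `backup_dir` and verify the copy by checksum."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies file contents and metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} failed verification")
    return target
```

Comparing checksums after the copy catches a truncated or corrupted backup at the time it is made, rather than on the day it is needed.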

Conclusion

The key takeaway from the Facebook outage is that although technology glitches will inevitably happen, how organizations respond to the issue makes all the difference. Acknowledging it publicly, explaining the details, and responding quickly help build a good reputation.

Small businesses need to take care of their infrastructure and keep it functioning smoothly. Implementing strong cybersecurity practices and complying with data protection laws are proven ways to build a robust infrastructure, increase security resilience, and prevent financial or reputational loss.