Facebook Down Following Infrastructure Change

TechCrunch explained that "Facebook is a procrastination service, but a way to chat with friends, families and sometimes important work contacts. When an outage occurs, users realize that they rely so much on uptime and service reliability."

PCMag elaborated on the significance of the event, explaining that "Those who reported being affected by outage described disruptions ranging from pictures not loading properly to the entire site being unavailable."

Huffington Post stated that "Users are reporting the outage started around 6PM ET. While some of us here at HuffPostTech are able to log on to Facebook just fine, others are reporting an error message when they try to stalk their exes/procrastinate/post selfies. We've reached out to a Facebook spokesman for more information."

Was this just a bad week, or are outages for major online entities the precursor to the Mayan Doomsday prophecy?

So What Caused the Outage This Time?

As with the Gmail outage, Facebook went down because of a change made to its infrastructure. Sure, change happens. In IT operations, change is important because it enables continuous improvement of services. In complex, dynamic ecosystems such as Facebook's IT infrastructure, change happens a lot. On any given day, infrastructure is being upgraded, patches are being installed, automated processes are altering files and system environments, and configurations are being changed manually. Sometimes these activities are performed correctly and ... sometimes they're not. When they're not, the cause may be identified only after a failure occurs.

A key component of the entire IT operations process is change and configuration management. Every change in the infrastructure, whether the deployment of new hardware or applications, the hiring of new personnel, or some other change, affects IT control and calls into question the effectiveness of IT control and change management processes. When an organization can manage change on a continuous basis, it gains the visibility necessary to ensure that its infrastructure is secure, compliant and effective.
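To make the idea of continuous change visibility a bit more concrete, here is a minimal sketch in Python of comparing a known-good configuration snapshot against the current state. It is only an illustration: the `ChangeReport` class, the `diff_snapshots` function, and every key and value in the sample snapshots are hypothetical, and nothing here reflects how Facebook or any particular product tracks changes.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeReport:
    """Differences between two configuration snapshots (hypothetical structure)."""
    added: dict = field(default_factory=dict)    # key -> new value
    removed: dict = field(default_factory=dict)  # key -> old value
    changed: dict = field(default_factory=dict)  # key -> (old value, new value)

    def is_empty(self) -> bool:
        """Return True when the two snapshots were identical."""
        return not (self.added or self.removed or self.changed)


def diff_snapshots(baseline: dict, current: dict) -> ChangeReport:
    """Compare a baseline snapshot with the current one, key by key."""
    report = ChangeReport()
    for key, new_value in current.items():
        if key not in baseline:
            report.added[key] = new_value
        elif baseline[key] != new_value:
            report.changed[key] = (baseline[key], new_value)
    for key, old_value in baseline.items():
        if key not in current:
            report.removed[key] = old_value
    return report


# Hypothetical snapshots: these settings are invented for illustration only.
baseline = {"dns.primary": "10.0.0.2", "dns.ttl": 300, "web.workers": 8}
current = {"dns.primary": "10.0.0.9", "dns.ttl": 300, "cache.enabled": True}

report = diff_snapshots(baseline, current)
print("changed:", report.changed)
print("added:", report.added)
print("removed:", report.removed)
```

Running a comparison like this on a schedule, rather than only after something breaks, is one simple way to surface an unplanned change before it turns into an outage.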

Facebook's Official Response

Facebook accounted for the outage, explaining that "Earlier today we made a change to our DNS infrastructure and that change resulted in some people being temporarily unable to reach the site. We detected and resolved the issue quickly, and we are now back to 100 percent. We apologize for any inconvenience."

IT Operations Analytics

This is not only true at Facebook. Today, IT operations teams everywhere face new levels of challenges that can no longer be handled with existing approaches. This means applying some more serious brain power to help deal with the complexity and dynamics of today's IT environments. IT Operations Analytics delivers the intelligence IT operations organizations are craving, allowing them to turn piles of IT operations data into actionable information.

In our recent webinar, we explained that "Gartner recognized IT Operations Analytics as an area on the rise with high impact on IT operations. So the analysts recognized that analytics will enable organizations to perform processes that will significantly increase revenues or cost savings for the enterprise."

Your Turn
How are YOU managing changes in your infrastructure?

About the Author
Martin Perlin