Misconfiguration Turns Upgrade into Outage for Comcast
Modern Family. Grey's Anatomy. Family Guy. This week, outages at Comcast left TVs dark in many cities across America, cutting subscribers off from TV service (and their favorite shows) and, according to some reports, from Internet service as well.
One outlet reported, "It seems like today is a bad day for Comcast subscribers who are fans of daytime TV, as the company appears to be in the midst of a large-scale television outage."
There were tons of complaints on Twitter.
What Caused This Outage?
With the evolution of data center infrastructure, new technologies are constantly being added to further optimize and secure environments. Yet this has also produced a high degree of complexity, and IT teams are sometimes left scrambling to respond to quickly changing business requirements.
According to Comcast, the outage happened as follows: "While we were deploying an upgrade to the X1 platform, we discovered an issue in the way the software that updates X1 was configured. We immediately stopped the deployment, and our engineers began working to identify the root cause and fix the issue."
The configuration issue that impacted Comcast services was widely felt. According to Ars Technica, "Outages were reported in Philadelphia, Chicago, New York, and other cities."
With configuration issues still making headlines for downtime and outages, the chronic challenges of change and configuration management really stand out. What once would have been considered a minor IT change can now quickly overwhelm today's complex systems. As this incident shows, even an infrastructure on the scale of Comcast's can be taken offline by a seemingly routine change. Ultimately, any minute misconfiguration or omission of even a single configuration parameter, authorized or not, can turn a stable system on its head, leading to an outage that harms reputation, angers customers, and carries deep financial implications.
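To make the single-parameter risk concrete, here is a minimal sketch of comparing a known-good configuration snapshot against the one about to be deployed. It is only an illustration, not a description of Comcast's tooling or of any vendor's product: the function name diff_config, the flat key/value snapshot format, and every parameter name in the sample data are assumptions made up for this example.

```python
from typing import Any, Dict, List, Tuple


def diff_config(
    baseline: Dict[str, Any], current: Dict[str, Any]
) -> Dict[str, List[Tuple[str, Any, Any]]]:
    """Compare two flat configuration snapshots key by key.

    Returns the parameters that were changed, added, or omitted,
    each as a (name, old_value, new_value) tuple.
    """
    changes: Dict[str, List[Tuple[str, Any, Any]]] = {
        "changed": [],
        "added": [],
        "omitted": [],
    }
    # Walk the union of keys so that omissions are caught as well as edits.
    for key in sorted(baseline.keys() | current.keys()):
        if key not in current:
            changes["omitted"].append((key, baseline[key], None))
        elif key not in baseline:
            changes["added"].append((key, None, current[key]))
        elif baseline[key] != current[key]:
            changes["changed"].append((key, baseline[key], current[key]))
    return changes


# Hypothetical snapshots; the parameter names are invented for illustration.
baseline = {"update_channel": "stable", "max_connections": 500, "timeout_s": 30}
current = {"update_channel": "beta", "max_connections": 500}

report = diff_config(baseline, current)
for kind, entries in report.items():
    for key, old, new in entries:
        print(f"{kind}: {key}: {old!r} -> {new!r}")
```

Running a check like this before a rollout would surface both the edited parameter and the omitted one, which are the two kinds of slip described above. Real environments have nested and far larger configurations, so this flat comparison is only a starting point.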
Today's IT Operations
One of the main goals for IT organizations is to build and maintain environments that offer the highest possible availability and best performance. In the past, IT limited the number of changes happening in their environments. Eventually, optimal performance and availability were reached in the controlled environment, and reported incidents were fixed continuously. Yet as new technologies and systems were introduced, environments evolved, building on the foundation of existing layers of infrastructure and software.
IT Operations Analytics
As IT operations face new kinds of challenges, IT Operations Analytics platforms have emerged to enrich a wide variety of IT management use cases. With change requests and changes now coming at IT at a blinding pace, IT operations teams can apply IT operations analytics tools to carry out a top-down analysis instead of reverse engineering a problem's root cause from low-level machine events and metrics.
Learn how Evolven's IT Operations Analytics solution delivers the intelligence that IT operations organizations crave, allowing them to turn piles of configuration data into actionable information.
How do YOU make sure YOUR updates roll out smoothly?