Managing Application Changes Requires More Than Just Automation
Recently, Jason Liu, CEO of UC4, wrote in Wired magazine that 'Outages Don't Need to Be a Way of Life for the Enterprise'. Exploring the issue, he writes that "Gartner cites outages as more common to cloud computing than security breaches. Both can be equally devastating to the business, so why do we accept outages as a routine part of our digital experience?"
Liu describes how downtime has become increasingly tolerated even though its pain is real: "notable outages at companies like Amazon, United Airlines and Bank of America prove that no business is invincible when it comes to service interruptions." He suggests that "the simple reality is that technology operations are increasingly dependent on good automation to deliver the expected business performance."
Liu asserts that "the most effective way to reconcile speed and control is automation. And as application changes have brought about the biggest challenge in IT, so will automating the application delivery process."
One would assume that by automating all deployments, everything will run as planned – no surprises, right?
Not exactly. This premise doesn't go far enough. There are several areas where deployment automation alone still falls short.
The Risk of Rollback
Liu holds that "Essential to avoiding outages is automatic rollback, which has been the holy grail of deployment automation for a long time. It guarantees your system will not be left unstable due to a botched upgrade process. Automatic rollback, in a way, serves as a way to mend the conflict between Dev teams and Ops team, where Dev wants to rapidly push out new applications and features, while Ops takes a more cautious approach. Automatic rollback as part of DevOps makes for a deployment model that is flexible to meet the changing demands of application upgrades and fixes."
As the article suggests, frequent changes, fixes, and improvements are inherent parts of today's systems, so automated rollback to the state prior to a faulty change can be very desirable.
However, problems are often discovered only after a good amount of time has passed following a release to production. The more time that passes, the harder it is to roll back. Users get accustomed to new features, so you can't just remove them. Business and customer data also accumulates in updated schemas, making it difficult to roll the database back to support an application rollback. And sometimes a deployed change was meant to address a critical issue, so rolling back brings that issue back.
Speed and Control: Together Again?
Liu contends, "As more organizations adopt DevOps models to establish collaborative environments amongst Development teams and IT Operations teams, naturally, there will still be pain points around speed and control."
Yet the question is how you will manage one-off changes or changes that do not follow policy. And how can you discover and identify changes that have occurred across a network? It's simple to correct a change in one system. But how can you validate all of your systems' configurations, and then update or correct any ad-hoc changes that were made? The problem is complex and difficult to resolve. Identifying configuration changes in a timely manner, before they impact the application or soon after, reduces the risk to business continuity.
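To make the validation question above a little more concrete, here is a minimal sketch of checking several systems against an approved baseline and reporting ad-hoc changes. Everything in it is an assumption made for illustration: the flat key/value view of a configuration, the helper names `find_drift` and `audit_hosts`, and the sample hosts and settings. Real environments would collect these values from the systems themselves.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified view of a configuration: flat setting -> value.
Config = Dict[str, str]
Drift = Tuple[str, str, str]  # (setting, expected, found)

MISSING = "<missing>"


def find_drift(baseline: Config, actual: Config) -> List[Drift]:
    """List every setting whose value differs from the approved baseline."""
    drift = []
    for key in sorted(set(baseline) | set(actual)):
        expected = baseline.get(key, MISSING)
        found = actual.get(key, MISSING)
        if expected != found:
            drift.append((key, expected, found))
    return drift


def audit_hosts(baseline: Config, hosts: Dict[str, Config]) -> Dict[str, List[Drift]]:
    """Return only the hosts that have drifted, with their differences."""
    report = {}
    for host, actual in hosts.items():
        drift = find_drift(baseline, actual)
        if drift:
            report[host] = drift
    return report


# Illustrative data only: app-02 carries two ad-hoc changes.
baseline = {"max_connections": "200", "tls": "on"}
hosts = {
    "app-01": {"max_connections": "200", "tls": "on"},
    "app-02": {"max_connections": "500", "tls": "on", "debug": "true"},
}

for host, drift in audit_hosts(baseline, hosts).items():
    for setting, expected, found in drift:
        print(f"{host}: {setting} expected {expected}, found {found}")
```

Comparing against the union of keys matters here: a one-off change often shows up as a setting that was added or removed, not only as a value that was edited.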
Liu asserts that "The cause of each specific outage varies but what is true across the board is that business processes, applications and computing infrastructures are growing too intertwined and dependent on each other. With all these moving parts, it's the process and process control that needs to act as the IT backbone. IT process automation. ITPA keeps track of the complex inter-dependencies between applications, infrastructure and business workflows to help identify, and even predict and work around problems before they occur."
Because of this complexity and these challenges, IT has adopted tools that automate the deployment and release process, seeking to remove the human factor. Yet configuration is still the first challenge faced with deployment automation: only repeatable activities are automated, which turns some application deployments into nightmares. IT is then left wondering what the automated platform actually does, what the impact of changing deployment assets would be, and what the actual configuration of the managed environment is.
Turbo Boost Automation with IT Operations Analytics
Managing the configuration of multiple environments is still the bane of IT Operations. Between applications, environments, and individual instances, mistakes and unauthorized changes can still happen (as we have seen with some high-profile organizations), demanding that IT Ops spend time managing configuration values. Configuration management doesn't have to be such a painful experience.
IT Operations Analytics comes as a welcome relief to an industry that has already spent millions on mega solutions, only to have its hopes and dreams shattered when implementing them. Beyond quickly and efficiently discovering the root cause of IT system performance problems, the new discipline of IT operations analytics can give IT Ops the tools to anticipate performance-impacting events that can cripple IT operations management.
IT operations analytics translates abundant detailed configuration data and frequent changes into critical decision-support information, providing actionable insights that address practical day-to-day operations questions (like, when an incident occurs, can you quickly know "what changed"?).
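As a rough illustration of the "what changed" question, the sketch below filters a log of recorded configuration changes down to those made shortly before an incident. It is not how any particular analytics product works. The `ChangeRecord` fields, the `what_changed` helper, the 24-hour default lookback, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    """One detected configuration change (hypothetical record shape)."""
    timestamp: datetime
    host: str
    item: str
    old_value: str
    new_value: str


def what_changed(
    changes: List[ChangeRecord],
    incident_time: datetime,
    lookback: timedelta = timedelta(hours=24),
) -> List[ChangeRecord]:
    """Return changes made within the lookback window before the incident,
    most recent first, since recent changes are the likeliest suspects."""
    window_start = incident_time - lookback
    recent = [c for c in changes if window_start <= c.timestamp <= incident_time]
    return sorted(recent, key=lambda c: c.timestamp, reverse=True)


# Illustrative data only.
log = [
    ChangeRecord(datetime(2013, 5, 1, 9, 0), "db-01", "schema_version", "41", "42"),
    ChangeRecord(datetime(2013, 5, 3, 22, 15), "app-02", "max_connections", "200", "500"),
    ChangeRecord(datetime(2013, 5, 4, 1, 30), "app-02", "debug", "false", "true"),
]
incident = datetime(2013, 5, 4, 2, 0)

for change in what_changed(log, incident):
    print(f"{change.timestamp} {change.host}: {change.item} "
          f"{change.old_value} -> {change.new_value}")
```

The lookback window is a judgment call. As noted earlier, problems often surface long after a release, so a longer window may be needed for slow-burning issues.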