Gartner Says Change-based Causal Analysis Makes Availability and Performance Data Actionable
Gartner recently published a report exploring how causal analysis makes availability and performance data actionable. The research brief, by Gartner Research VP Will Cappelli, is based on approximately 300 Gartner client inquiries covering IT Operations Analytics (ITOA).
10 Years, Almost No Progress
In the report, Cappelli describes how, although availability and performance data volumes have increased by an order of magnitude over the last 10 years, enterprises still find their data insufficiently actionable. He notes that "the utilization of significantly larger datasets has barely moved the needle on availability and performance process effectiveness."
The report raises doubts about how effective advances in data analysis have been, saying that, "Over the last 12 months, Gartner estimates that root causes of performance problems have taken, on average, a week to diagnose, while a meager 3% of performance incidents have been predicted. Back in 2005, the same figures stood at eight days and 2%, respectively."
He adds, "Users have begun to insist that any ITOA solution should focus not only on the ingestion, storage and access to data, but also on tools for making that data meaningful and actionable."
85% of Performance Incidents Can be Traced to Changes
The first question usually asked when performance issues come up is: "what changed?" In today's complex IT environments, the root cause of a problem often stems from an undesired change. Yet changes remain a surprisingly overlooked data source. Cappelli emphasizes this in the report, saying that IT operations needs to "focus on recorded changes to systems as the key inputs to causal analysis."
Cappelli further explains that, "There are many different types of change and all need to be taken into account to be able to establish potential root causes for any given incident. Among the most important types of change are code changes, data changes, workload changes and infrastructure topology changes. While not all performance incidents can be traced to such changes, Gartner estimates that approximately 85% of all performance incidents can be so traced."
Cappelli says that "No caused event occurs unless there has been some kind of intervention made to the system within which the event occurs. Given this insight, it makes sense to establish what changes have recently been introduced to the environment and then try to correlate variable value fluctuations to those changes."
To fully realize the value of ITOA, analytics need to focus on changes to extract actionable insights for driving operational decisions and activities.
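The approach Cappelli describes, establishing which changes were recently introduced and then relating metric fluctuations to them, can be sketched in a few lines of Python. This is a minimal illustration only: the `Change` record, the `mean_shift` and `rank_changes` helpers, and all sample data are hypothetical, and a simple before/after comparison of the metric mean stands in for a real correlation method. It is not the technique used by Gartner or by any vendor.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class Change:
    """A recorded change (hypothetical schema, for illustration)."""
    when: datetime
    kind: str          # e.g. "code", "data", "workload", "topology"
    description: str


def mean_shift(samples, change_time):
    """Mean metric value after the change minus the mean before it.

    Returns None when there are no samples on one side of the change.
    """
    before = [value for ts, value in samples if ts < change_time]
    after = [value for ts, value in samples if ts >= change_time]
    if not before or not after:
        return None
    return mean(after) - mean(before)


def rank_changes(changes, samples, incident_time, lookback):
    """Score changes inside the lookback window, largest absolute shift first."""
    window_start = incident_time - lookback
    scored = []
    for change in changes:
        if window_start <= change.when <= incident_time:
            shift = mean_shift(samples, change.when)
            if shift is not None:
                scored.append((change, shift))
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Invented example: hourly latency samples (ms) that jump at 06:00.
start = datetime(2024, 1, 1)
samples = [(start + timedelta(hours=h), 100 if h < 6 else 300) for h in range(10)]
changes = [
    Change(start + timedelta(hours=6), "code", "deploy new build"),
    Change(start + timedelta(hours=2), "topology", "add cache node"),
    Change(start - timedelta(days=3), "data", "schema migration"),
]

incident = start + timedelta(hours=9)
for change, shift in rank_changes(changes, samples, incident, timedelta(hours=12)):
    print(f"{change.when:%H:%M} {change.kind:<9} {change.description}: shift {shift:+}")
```

The change made three days earlier falls outside the 12-hour lookback window and is never scored, which mirrors the emphasis on recently introduced changes. In practice the window size and the scoring function would need to be tuned to the environment.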
Causal Analysis Makes Data Actionable
According to the research, many Gartner clients are unsatisfied with the management of system behavior.
To address this, Cappelli explains that "There are five distinct types of causal analysis that make availability and performance data actionable. While each brings a valid perspective, change-based causal analysis — particularly when combined with Bayesian causal network analysis — holds the most promise for IT operations leaders."
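To give a feel for why Bayesian reasoning pairs naturally with change data, the sketch below applies a single Bayes' rule update over candidate change types. It is far simpler than a full Bayesian causal network, and every number in it is invented: the priors, the likelihoods, and the `posterior` helper are assumptions made for illustration, not figures or methods from the report. The four change types are the ones Cappelli lists.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over mutually exclusive candidate causes.

    priors[c]      = P(cause c) before looking at the symptom
    likelihoods[c] = P(observed symptom | cause c)
    """
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("observed symptom is impossible under every candidate")
    return {cause: value / evidence for cause, value in joint.items()}


# Hypothetical priors, e.g. the share of recent changes of each type.
priors = {"code": 0.4, "data": 0.1, "workload": 0.2, "topology": 0.3}

# Hypothetical chance that each type of change produces the observed symptom.
likelihoods = {"code": 0.3, "data": 0.1, "workload": 0.5, "topology": 0.9}

ranked = sorted(posterior(priors, likelihoods).items(), key=lambda kv: kv[1], reverse=True)
for cause, probability in ranked:
    print(f"{cause:<9} {probability:.2f}")
```

With these made-up inputs, topology changes move from a 30% prior to the most probable cause once the symptom is taken into account, which illustrates how recorded changes can supply the candidates while the probabilistic model ranks them.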
Balance Agility and Stability
DevOps and other agile methodologies now allow IT to deliver changes into production at an overwhelming pace, but they also introduce real stability challenges that leave IT operations vulnerable.
Cappelli adds that "Given that most performance incidents, ultimately, result from changes and that many of those changes originate on the development side of the house, continuous communication between development and production is the surest path to quickly and effectively isolating the root cause of a performance incident."
To maintain the balance between agility and stability, Cappelli recommends: "Add causal analysis to the more often deployed pattern discovery and anomaly detection algorithms. Without this addition, it will not be possible to take action on availability and performance incidents in complex environments."
A Change-centered Approach to IT Operations Analytics
As mentioned, IT operations today faces challenges that existing approaches can no longer handle. Meeting them means retooling with a focus on changes, to cope with the complexity and dynamics of today's IT environments.
Gartner has found that "a small number of vendors focus primarily on the change-centered approach, including Evolven and Dynatrace."
The Evolven Blended Analytics solution takes a fresh approach to chronic performance and availability problems, delivering unparalleled IT operations insights. Evolven's ability to track end-to-end granular changes and correlate and analyze change information with other operational symptoms and IT context data is what differentiates the solution from other approaches and vendors.
Evolven's Blended Analytics technology collects information about changes – tracking, correlating and analyzing all changes, end-to-end from application to infrastructure at the most granular level, in order to quickly find a root cause.
The report states: "Gartner believes that a combination of Bayesian causal network analysis and change-based analysis is the most effective, since it combines a mathematically tractable way of representing correlations with a highly effective way of going straight to the source of a performance incident."
To gain a cross-silo view, Evolven Blended Analytics also assimilates, normalizes, and correlates the actual changes with symptoms, such as those associated with events, time-series data, and log files, drawn from a wide range of application performance management tools, event management capabilities, and other third-party solutions. Evolven also provides insight into context, as derived from release and deployment automation, service desk, configuration management databases (CMDBs), and application discovery and dependency mapping (ADDM) tools.