Incident Management and Investigation
Challenge: Firefighting in a High Stakes Race Against Time
In today's complex IT environments, it doesn't take much to cause a high-impact incident. A minute misconfiguration, or the omission of a single configuration parameter, can quickly lead to reputation damage, dissatisfied customers, financial losses, legal liabilities, and even a full reorganization. Productivity drops drastically as IT incident management teams are transformed into a group of 'firefighters', racing against time to stabilize high-priority crises.
Adding to the complexity is the vast number of configuration parameters that must be searched to find the root cause when an environment incident hits, which consumes both precious time and manpower. Deciding whether the incident was caused by application code, configuration, the operating system, or an infrastructure incompatibility can be very challenging. This is further compounded by the short window of time available for resolution.
To confront this, most IT organizations assemble a "war room", with a team composed of key members from development, product support, and IT operations. It is not surprising that over 85% of Operations and Infrastructure Vice Presidents identify the accurate investigation of environmental instabilities as very challenging and time-consuming.
Solution: Prevent Incidents and Cut Investigation Time
Evolven Change Monitoring takes on the dynamics and complexity of the modern data center and cloud in a way that was never available before, drilling deep to uncover the most minute misconfigurations, which are often the root causes of high-impact environment incidents.
Evolven makes Incident Management easier by analyzing environment configuration information and comparing environments with one another, or a single environment with a historical snapshot or a golden baseline. These comparisons play a critical role in preventing and investigating environment incidents.
With Evolven, incident management teams can focus their investigation of environment incidents on quickly identifying the configuration changes and differences that are the incident's root cause. To do so, they compare the problematic environment with:
- a previous, historical snapshot of the environment under investigation (taken when it was working well) – to identify the granular changes that might have triggered the incident
- the last good baseline – to identify configuration and bill-of-material drift that could have led to the incident
- a working (comparable) environment – to identify differences in environment content or configuration that could be causing the incident
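The comparisons listed above can be illustrated with a minimal sketch. It is not Evolven's implementation or API. It assumes each environment's configuration has already been flattened into a simple key/value mapping, and the function name `diff_configs` and the sample parameter names are hypothetical.

```python
from typing import Any, Dict

Config = Dict[str, Any]


def diff_configs(baseline: Config, current: Config) -> Dict[str, Dict[str, Any]]:
    """Compare two flat key/value configurations (hypothetical helper).

    Returns the parameters that were added, removed, or changed in
    `current` relative to `baseline`.
    """
    baseline_keys = baseline.keys()
    current_keys = current.keys()

    # Parameters present only in the current environment.
    added = {k: current[k] for k in current_keys - baseline_keys}
    # Parameters present only in the baseline (omitted in current).
    removed = {k: baseline[k] for k in baseline_keys - current_keys}
    # Parameters present in both, with different values: (old, new).
    changed = {
        k: (baseline[k], current[k])
        for k in baseline_keys & current_keys
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative data only: a "last good" baseline and a problematic environment.
baseline = {"db.pool.size": 50, "jvm.heap": "4g", "tls.enabled": True}
current = {"db.pool.size": 5, "jvm.heap": "4g", "cache.ttl": 300}

report = diff_configs(baseline, current)
for kind, entries in report.items():
    for name, value in sorted(entries.items()):
        print(f"{kind}: {name} = {value}")
```

The same function serves all three comparisons: pass a historical snapshot, a golden baseline, or a comparable working environment as `baseline`. An investigation would then start from the short list of differences rather than the full parameter set.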
The Incident Management process involves numerous steps and stakeholders (best defined by ITIL). Evolven provides the change and difference information needed to analyze incidents efficiently at each step of this process and for each role. For example, Tier I personnel can use the information visualized by Evolven Change Monitoring to identify change areas that could potentially trigger an incident, without diving into the change details. They can then hand off the appropriate detailed information to the relevant Tier II and Tier III specialists, who can use it to decide whether the incident was triggered by the environment. In the case of Major Incident analysis, Evolven Change Monitoring provides a single picture of the environment's bill-of-material and configuration, along with its drift and consistency. The "war room" team can use this information to quickly map the cause of the incident.
To prevent environment incidents, IT teams run Evolven's "Drift Analytics" to proactively identify undesired changes and differences before they turn into environment incidents. Customers typically run a daily scheduled comparison during the maintenance window.
Evolven Change Monitoring is powerful:
- Covers the entire IT environment – a wide variety of applications and their underlying infrastructure, including: applications, front-end servers, middleware, databases, messaging layer, operating systems, virtualization layer, and hardware
- Dives to granular level – drills down to the most granular level of the individual configuration parameter in any configuration source, including configuration files, registry, database schema, stored procedures, and reference data held in the database
- Delivers actionable information – applies powerful analytics to classify configuration changes by impact and criticality
- Realizes value quickly – relies on groundbreaking analytics to deliver initial value in hours, not days or months
Results: Minimize System Downtime
When environment incidents occur, it's a race against time to find out what happened and get it solved. Incident investigation teams are under significant pressure to identify the root cause and resolve it before the incident spirals out of control and causes a major impact on the company, its brand, and customer satisfaction, possibly with financial and legal consequences.
Being able to quickly and easily identify the critical configuration differences that could trigger an incident empowers incident teams to focus their efforts on a small group of changes, rather than branching out in many different directions and wasting valuable time on trial-and-error approaches.
Evolven Change Monitoring allows incident investigation teams to slash their investigation time and effort by a staggering 50%, hence minimizing system downtime and preventing the incident from impacting operations across the board.