Cloud Management Needs Context
Traditionally, server topology was quite static: administrators knew each server and its contents, and drove many of the updates manually. Yet to meet increasingly intense business requirements to stay competitive and agile, data centers evolved from static, slowly changing processing centers into dynamic, rapidly changing, virtualized and elastic centers that provide a myriad of services to highly distributed, global user populations.
Data center designs have been influenced by industry and technology trends such as IT service automation, converged infrastructure, server virtualization, dynamic workload management, and unified management of different technology domains. The cornerstone of this transformation is the advent and proliferation of cloud infrastructure, which separates logical services from physical infrastructure.
Cloud Turns the Data Center on Its Head
The powerful new cloud computing paradigm has changed the nature of the data center. Now users can request and receive information and services dynamically over networks from an abstracted set of resources.
This set of resources cannot be managed manually anymore. Cloud infrastructure relies strongly on automatic provisioning and deployment. As a result, there is a new set of key characteristics that impact system management:
Server Creation is Dynamic
When users request a new service through a self-service Web portal, the additional virtual servers required to support it can be created and provisioned automatically by the supporting technologies. Similarly, servers can be de-provisioned when the workload decreases.
IP Can be Different
The function and architecture of a server can no longer be deduced from its IP address. Multiple servers, allocated and de-allocated dynamically, can support the same service, and each of them gets a new IP address when it is spun up.
Server Abstraction
Virtual servers are now defined by the base image used to spin them up and a set of deployment assets (scripts, templates, etc.) that customize the base image.
Tons of Changes Less Important
Dealing with scale, rapid change, and impermanence can quickly overwhelm the people who manage cloud environments.
Critical Behavior of Cloud
The cloud-based server is managed via a set of deployment assets, rather than through direct contact as with physical servers in a traditional data center. For example, say you distribute a change in the network configuration and then monitor the performance of the system. In the cloud scenario, you don't look only at the performance of the new system configuration. Rather, you analyze system performance relative to the changes in the deployment assets that created the new configuration.
A further step away from direct control of the server is that changes flow through scripts. Cloud assets are controlled through a set of standard scripts that follow policies for security, backup, and management of sensitive data. These scripts are detailed instructions. The downside is that manual changes still happen, so scripts can fall out of sync with the desired environment configuration. Also, even when changes are deployed through the automation platform, not all servers are updated: some organizations prefer to keep existing instances untouched as long as they are stable. As a result, drift can appear across the environment. You then end up working double time to keep the scripts accurate and in sync with your target environments, or you risk the supposedly helpful scripts turning into vicious creatures that perform obsolete tasks.
So when a problem inevitably occurs, the IT ops team needs to carry out a reverse correlation to identify and understand its causes. Cloud management includes provisioning, managing, and monitoring applications on cloud infrastructures that do not require end users to know the physical location or the system that delivers the services.
System management tools for the cloud need to take a new approach: see activity in real time and identify the affected components. For example, when server CPU activity suddenly jumps by 60%, you need to know what type of server it is. If it is an application server, you should be able to identify which one and how it is configured. Even more importantly, you need to know what the configuration should be in order to investigate the reasons for the CPU jump.
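As a rough illustration of that lookup, the sketch below turns a CPU reading into context: which kind of server it is, how it is configured, and where that differs from what it should be. The inventory structure, the field names, and the instance ID are all hypothetical, and the 60% figure is treated here as a relative increase, which is only one possible reading.

```python
from dataclasses import dataclass


@dataclass
class ServerRecord:
    """Hypothetical inventory entry; field names are illustrative only."""
    role: str        # e.g. "application" or "database"
    expected: dict   # configuration the deployment assets should produce
    actual: dict     # configuration observed on the running instance


def describe_cpu_jump(instance_id, cpu_before, cpu_after, inventory, min_jump_pct=60.0):
    """Return context for a CPU jump, or None if the jump is below the threshold."""
    if cpu_before <= 0:
        raise ValueError("cpu_before must be positive to compute a relative jump")
    # Assumption: the jump is measured relative to the previous reading.
    jump_pct = (cpu_after - cpu_before) * 100.0 / cpu_before
    if jump_pct < min_jump_pct:
        return None
    record = inventory[instance_id]
    keys = sorted(record.expected.keys() | record.actual.keys())
    # Keep only the settings where the running server differs from the target.
    mismatches = {
        key: {"expected": record.expected.get(key), "actual": record.actual.get(key)}
        for key in keys
        if record.expected.get(key) != record.actual.get(key)
    }
    return {
        "instance_id": instance_id,
        "role": record.role,
        "jump_pct": round(jump_pct, 1),
        "actual_config": record.actual,
        "config_mismatches": mismatches,
    }


# Made-up sample data for one application server.
inventory = {
    "i-app-07": ServerRecord(
        role="application",
        expected={"heap_mb": 2048, "worker_threads": 16},
        actual={"heap_mb": 2048, "worker_threads": 64},
    ),
}
context = describe_cpu_jump("i-app-07", cpu_before=30.0, cpu_after=54.0, inventory=inventory)
```

Here the alert comes back labeled as an application server whose worker_threads setting differs from its expected value, which gives the investigation a starting point instead of a bare CPU number.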
Keep Your Server Connected to Your Management Tool
It is very important that the behavior of a server be connected to its configuration. As cloud-based infrastructure makes the computing infrastructure more flexible and dynamic, configuration management tools become more and more important.
In cloud-based operations, change and configuration management becomes even more relevant. Since configurations are changed programmatically (through automated mechanisms), configuration management tools matter even more in the cloud: they must capture the actual cloud configuration and tie it back to the programmatic activities that produced it. These tools now need to automate system deployments and make sure that every instance providing a specific service has a consistent configuration. They also need to automatically review configuration changes distributed to running systems.
When a configuration update is made to the running service, the base or "golden" image also needs to be updated. One of the issues with golden images is that, over time, running systems drift away from the golden image. A configuration management tool is needed to monitor changes on the running systems and answer questions such as: Was the update implemented? Has performance dropped, and if so, why? It also needs to stay on top of changes that can enter via various infrastructure components, especially since there are different tools for each cloud environment layer.
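To make the drift check concrete, here is a minimal sketch that compares the settings observed on each running instance with the settings the golden image and deployment assets are supposed to produce. The setting names, instance names, and the flat key/value model of a configuration are assumptions made for the example; a real tool would collect this data from the systems themselves.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    config: dict  # settings observed on the running instance


def find_drift(desired, instances):
    """Map each drifted instance ID to the settings that differ from the baseline."""
    report = {}
    for inst in instances:
        diffs = {}
        # Check keys from both sides so extra, manually added settings show up too.
        for key in desired.keys() | inst.config.keys():
            want = desired.get(key)
            have = inst.config.get(key)
            if want != have:
                diffs[key] = {"desired": want, "actual": have}
        if diffs:
            report[inst.instance_id] = diffs
    return report


# Hypothetical baseline and fleet.
desired = {"ntp_server": "10.0.0.1", "tls_min_version": "1.2"}
fleet = [
    Instance("web-1", {"ntp_server": "10.0.0.1", "tls_min_version": "1.2"}),
    # A stable instance that never received the latest update.
    Instance("web-2", {"ntp_server": "10.0.0.1", "tls_min_version": "1.0"}),
    # An instance changed by hand, outside the automation platform.
    Instance("web-3", {"ntp_server": "10.0.0.9", "tls_min_version": "1.2", "debug": "on"}),
]
drift = find_drift(desired, fleet)
```

The report answers "was the update implemented?" per instance: web-1 is absent because it matches the baseline, web-2 shows the update it missed, and web-3 shows the manual changes.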
Monitoring Data Needs to be Connected to Assets
The currently available tools may analyze performance, availability, configuration, etc., but they don't connect the monitoring results back to the related assets, for example by connecting CPU performance to the Puppet manifest that configures the server. Cloud-based operations mean that IT ops needs to manage increasingly disparate IT assets and resources in an integrated way while maintaining greater visibility and control over those assets.
System management tools need to correlate the collected monitoring information with the deployment assets in order to understand the context of trends and changes, which is required to manage the infrastructure effectively. This correlation should extend across all asset types (traditional physical IT assets, virtualized IT environments, and elastic cloud environments) and deliver visibility in an easily accessible solution. For example, an up-to-date, unified repository of actual configuration data correlated with the deployment assets allows IT to gain insight into the drift of the cloud environment over time, the impact of that drift, and its desirability.
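One simple way to picture this correlation is a time-window join: given a monitoring event for a service, list the deployment assets that configure that service and changed shortly before the event. The sketch below assumes a change log with asset paths, timestamps, and affected services. All names, paths, and the two-hour window are hypothetical, and real correlation would need more than timing alone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AssetChange:
    asset: str            # e.g. a manifest, script, or template path
    changed_at: datetime
    services: frozenset   # services whose servers this asset configures


@dataclass
class MetricEvent:
    service: str
    metric: str
    observed_at: datetime


def changes_preceding(event, changes, window=timedelta(hours=2)):
    """Return asset changes for the event's service inside the window, newest first."""
    earliest = event.observed_at - window
    related = [
        change for change in changes
        if event.service in change.services
        and earliest <= change.changed_at <= event.observed_at
    ]
    return sorted(related, key=lambda change: change.changed_at, reverse=True)


# Made-up change log and monitoring event.
changes = [
    AssetChange("manifests/webserver.pp", datetime(2024, 1, 10, 9, 30), frozenset({"web"})),
    AssetChange("scripts/backup.sh", datetime(2024, 1, 10, 9, 45), frozenset({"db"})),
    AssetChange("templates/network.conf", datetime(2024, 1, 10, 9, 50), frozenset({"web", "db"})),
    AssetChange("manifests/webserver.pp", datetime(2024, 1, 9, 14, 0), frozenset({"web"})),
]
event = MetricEvent("web", "cpu_percent", datetime(2024, 1, 10, 10, 0))
suspects = [change.asset for change in changes_preceding(event, changes)]
```

With this sample data, the CPU event on the web service points back to the network template and the web server manifest changed that morning, while the database-only script and the previous day's change are left out.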