OpsRamp Announces Improved Service Centricity, AIOps and Cloud Monitoring

OpsRamp, a service-centric "artificial intelligence for IT operations (AIOps)" software-as-a-service (SaaS) platform for the hybrid enterprise, has announced new topology maps, enhanced AIOps features, and new monitoring capabilities for cloud native workloads.

The new release of OpsRamp's unified platform combines Kubernetes monitoring, intelligent alert routing, and topology mapping for modern IT operations management teams. It provides greater service-centricity and context for hybrid infrastructure monitoring and management, allowing enterprise IT teams to adopt more intelligent incident management and deliver exceptional customer experiences.

Mahesh Ramachandran, VP of product management at OpsRamp, defines "service-centricity" as:

Service-centricity, or a service-centric view, shifts the focus of the digital operations team from managing the element to managing the business service. The OpsRamp AIOps solution is built to address the needs of service availability and performance through faster remediation and incident response. This helps the IT organisation by recontextualising the infrastructure environment from a collection of devices, resources, and configurations into a collection of business services which, we assert, is more manageable. It also consolidates and unifies the goals of traditional IT, DevOps, and the business into one shared vision. IT is looking at its resources like the business would - with services at its core.

The new release includes impact visibility and service context capabilities that discover topological relationships between resources at multiple levels in hybrid and multi-cloud IT stacks. The topology maps are intended to help infrastructure and operations teams understand the impact that IT resources have on each other and on end-user facing IT services. OpsRamp's topology discovery now includes applications and hypervisors. The application topology function discovers over forty popular enterprise applications and establishes topological relationships between application components and infrastructure. The hypervisor topology discovers virtual machines, hypervisor servers and clusters in VMware vSphere and KVM environments and their relationships.
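The idea of using topological relationships to see which services a failing resource affects can be illustrated with a small graph traversal. The following is a minimal sketch, not OpsRamp's implementation: the resource names, the `DEPENDS_ON` edge list, and the `impacted_by` function are all hypothetical and exist only for this example.

```python
from collections import defaultdict, deque

# Hypothetical topology: each pair is (dependent, dependency),
# e.g. "checkout-service" runs on "app-server-1".
DEPENDS_ON = [
    ("checkout-service", "app-server-1"),
    ("app-server-1", "vm-12"),
    ("vm-12", "esxi-host-3"),
    ("reporting-service", "db-server-2"),
    ("db-server-2", "vm-14"),
    ("vm-14", "esxi-host-3"),
    ("intranet-wiki", "vm-20"),
    ("vm-20", "kvm-host-1"),
]


def impacted_by(resource, edges):
    """Return every resource that depends, directly or indirectly, on `resource`."""
    # Invert the edges so we can walk from a dependency up to its dependents.
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)

    seen = set()
    queue = deque([resource])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    # Everything that would be affected if the hypervisor host failed.
    print(sorted(impacted_by("esxi-host-3", DEPENDS_ON)))
```

In this toy topology, a failure of the hypothetical `esxi-host-3` reaches two virtual machines, two servers, and two end-user services, while the wiki on the KVM host is unaffected.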

OpsRamp has also enhanced the service maps function with a new user interface that identifies the underlying resources behind an IT service outage, so that operations teams can decide on the correct course of action to restore services.

The new release introduces new capabilities for OpsQ, OpsRamp's intelligent event management engine for alert correlation, automation, and remediation. New features include auto-incident creation and routing, augmented training for inference models, and frequency-driven alert escalation.

OpsRamp's OpsQ now enables automatic incident creation and routing, using alert escalation policies to auto-assign incidents based on prior alert, incident, and notification data. Machine learning-driven alert escalation uses specific learned patterns (assignee groups, business impact, urgency, and priority) to route incident assignments for different types of alerts. OpsRamp's machine learning-based inference models correlate alerts linked by a common cause using historical alert data. OpsQ now allows users to augment these models with additional user-provided training data.

With such augmented training, IT operations teams can bootstrap OpsQ to recognise alert sequences that are uncommon in everyday operations, but important to identify when they occur. To augment the models, users can build spreadsheets (or use the sample templates provided) to escalate events to service management with predefined data across resolver groups, category, sub-category, priority, urgency and business impact. This data is then applied to OpsRamp's incident management instrumentation and to third-party incident management integrations.
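To make the shape of such user-provided data concrete, here is a rough sketch of spreadsheet rows being turned into predefined incident fields. It is a plain lookup and does not model any machine learning; the column names (including the `alert_subject` key), the sample rows, and both functions are assumptions made for illustration, not OpsRamp's template format or API.

```python
import csv
import io

# Hypothetical spreadsheet content exported as CSV. The incident fields
# mirror those named in the article; "alert_subject" is an assumed key.
SAMPLE_CSV = """alert_subject,resolver_group,category,sub_category,priority,urgency,business_impact
Disk latency high,Storage Team,Infrastructure,Storage,High,High,Significant
Certificate expiring,Security Team,Security,Certificates,Medium,Low,Minor
"""


def load_rules(csv_text):
    """Index the rows by a normalised alert subject."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["alert_subject"].strip().lower(): row for row in reader}


def incident_fields(alert_subject, rules):
    """Return the predefined incident fields for an alert, or None if unknown."""
    row = rules.get(alert_subject.strip().lower())
    if row is None:
        return None
    return {key: value for key, value in row.items() if key != "alert_subject"}


if __name__ == "__main__":
    rules = load_rules(SAMPLE_CSV)
    print(incident_fields("disk latency high", rules))
```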

OpsQ now supports policies to escalate alerts based on how often an alert has recently occurred. With frequency-based alerting, operations teams can filter out alerts that flap only occasionally and escalate alerts that flap repeatedly. The OpsRamp platform provides capabilities for multi-cloud event monitoring as well as features to discover and monitor container infrastructure supporting modern microservices architectures.
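The frequency-based escalation described above can be approximated with a sliding time window per alert. This is a minimal sketch under stated assumptions, not OpsQ's actual policy engine: the `FrequencyEscalator` class, its `threshold` and `window_seconds` parameters, and the alert keys are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class FrequencyEscalator:
    """Escalate an alert once it has fired `threshold` times within a window."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # Recent occurrence timestamps, kept separately for each alert key.
        self._recent = defaultdict(deque)

    def observe(self, alert_key, timestamp):
        """Record one occurrence and return True if the alert should escalate."""
        times = self._recent[alert_key]
        times.append(timestamp)
        # Drop occurrences that have fallen out of the sliding window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


if __name__ == "__main__":
    escalator = FrequencyEscalator(threshold=3, window_seconds=600)
    # An alert that flaps three times in ten minutes is escalated on the third.
    for t in (0, 100, 200):
        print(t, escalator.observe("link-down:switch-7", t))
```

An alert that flaps once and then stays quiet for longer than the window never reaches the threshold, which is the filtering behaviour the policy is meant to provide.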

OpsRamp can now discover and monitor Kubernetes environments across on-premises and cloud services like Azure Kubernetes Service, Google Kubernetes Engine, and Amazon Elastic Container Service for Kubernetes. DevOps teams can understand the total services (nodes and containers for each cluster, a breakdown of pods by namespace) and resource trends (CPU and memory utilisation) for each Kubernetes cluster. Key metrics related to the availability and performance of clusters, hosts, namespaces, pods and containers can be monitored.

Events are an important medium of communication for operational issues in the public cloud and are a primary source of signal in multi-cloud environments. OpsRamp can now collect, aggregate, correlate and escalate events from AWS services such as AWS Health, ECS, Redshift, Database Migration Service, and CloudWatch. With this capability, OpsRamp can serve as a single point of monitoring, management, and remediation for cloud events across multiple cloud accounts.

The new OpsRamp release also includes new patch management capabilities for patch compliance verification, synthetic transaction and SSL certificate monitoring, new integrations for monitoring open source applications, and knowledge base enhancements for easier categorisation and linking.
