Grafana Cloud Adds Incident and On-Call Management Solutions

Grafana has announced the addition of incident management and on-call support to its Grafana Cloud offering. Grafana Incident, currently in preview, generates meeting spaces, integrates with Slack, and constructs incident timelines with information pulled from Grafana dashboards. Grafana OnCall provides on-call rotation scheduling and notifications from connected monitoring systems.

Grafana Incident provides incident management functionality within Grafana Cloud and integrates with existing collaboration tooling such as Zoom, Slack, and Google Meet. When an incident is declared, dedicated communication spaces such as Slack channels, Zoom rooms, Google Docs, or Google Meet calls can be created. An integrated Slack chatbot allows users to interact with Grafana Incident from Slack, for example to create incidents, assign roles, manage tasks, and add notes.

Screenshot of incident view within Grafana Incident (source: Grafana)

The chatbot also passively monitors the conversation within the created room and extracts URLs to attach to the incident. This includes pasted URLs such as GitHub issues, pull requests, dashboards, or other external links. Based on the context of the ongoing conversation, the bot suggests dashboards that may be related to the investigation.
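As a rough illustration of the URL extraction described above, the following minimal sketch pulls links out of chat messages. It is not Grafana Incident's implementation; the regular expression, the `extract_urls` helper, and the sample messages are all assumptions made for the example.

```python
import re

# Hypothetical pattern: an http(s) URL runs until whitespace, a quote,
# or a closing bracket. Real chat platforms may wrap links differently.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(messages):
    """Return unique URLs in the order they first appear across messages."""
    seen = []
    for text in messages:
        for url in URL_PATTERN.findall(text):
            url = url.rstrip(".,;:!?")  # drop trailing sentence punctuation
            if url not in seen:
                seen.append(url)
    return seen


# Sample conversation (illustrative only).
messages = [
    "Looks related to https://github.com/example/repo/issues/42.",
    "Dashboard: https://example.grafana.net/d/abc123/latency",
    "Same issue again: https://github.com/example/repo/issues/42",
]
urls = extract_urls(messages)
```

Here the duplicate GitHub link is recorded once, so the incident would carry two attached URLs.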

Grafana Incident automatically generates a timeline view of the incident, linking context extracted by the chatbot with embedded Grafana graphs. It is also possible to track tasks and TODO items within the tool.

Now generally available, Grafana OnCall introduces on-call management within Grafana. This release includes Slack, Telegram, voice, and SMS alerting. On-call schedules are created via a calendar integration that works with any calendar solution that exposes an iCal address. The on-call rotation is scheduled within the calendar using the team's Grafana usernames and is automatically imported into Grafana OnCall.
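To make the calendar-driven approach concrete, here is a minimal sketch of reading on-call shifts from iCal text. It assumes, purely for illustration, that each event's `SUMMARY` holds a Grafana username and that times are given in UTC basic format; it ignores recurrence rules, time zones, and folded lines, and it is not how Grafana OnCall itself parses calendars.

```python
from datetime import datetime


def parse_shifts(ical_text):
    """Extract (user, start, end) data from VEVENT blocks in iCal text."""
    shifts, current = [], None
    for raw in ical_text.splitlines():
        line = raw.strip()
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            shifts.append(current)
            current = None
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            key = key.split(";", 1)[0]  # drop parameters such as ;TZID=...
            if key == "SUMMARY":
                current["user"] = value  # assumed to be a Grafana username
            elif key in ("DTSTART", "DTEND"):
                # Assumes UTC timestamps like 20220701T090000Z.
                current[key.lower()] = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
    return shifts


# Hypothetical two-person weekly rotation.
SAMPLE = """BEGIN:VCALENDAR
BEGIN:VEVENT
SUMMARY:alice
DTSTART:20220701T090000Z
DTEND:20220708T090000Z
END:VEVENT
BEGIN:VEVENT
SUMMARY:bob
DTSTART:20220708T090000Z
DTEND:20220715T090000Z
END:VEVENT
END:VCALENDAR"""

shifts = parse_shifts(SAMPLE)
```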

When an alert monitoring system is integrated with Grafana OnCall, incoming alerts create an alert group. The group sends notifications according to escalation policies defined in routes and escalation chains. Routes can use IF, ELSE IF, and ELSE logic and can be adjusted based on the type of alert. Similar alerts are automatically grouped to help reduce alert noise.
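The routing and grouping ideas above can be sketched in a few lines. This is only a conceptual model: the alert fields, the route conditions, the escalation chain names, and the grouping key are hypothetical and are not taken from Grafana OnCall.

```python
from collections import defaultdict

# Hypothetical routes, evaluated in order like IF / ELSE IF.
ROUTES = [
    (lambda alert: alert["severity"] == "critical", "page-primary-on-call"),
    (lambda alert: alert["service"] == "billing", "notify-billing-slack"),
]
DEFAULT_CHAIN = "notify-general-slack"  # the ELSE branch


def pick_escalation_chain(alert):
    """Return the chain of the first matching route, or the default."""
    for condition, chain in ROUTES:
        if condition(alert):
            return chain
    return DEFAULT_CHAIN


def group_alerts(alerts):
    """Group alerts that share a (service, name) key to reduce noise."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["service"], alert["name"])].append(alert)
    return dict(groups)


# Illustrative alerts only.
alerts = [
    {"service": "api", "name": "HighLatency", "severity": "critical"},
    {"service": "api", "name": "HighLatency", "severity": "critical"},
    {"service": "billing", "name": "QueueBacklog", "severity": "warning"},
    {"service": "web", "name": "DiskUsage", "severity": "warning"},
]
groups = group_alerts(alerts)
chains = [pick_escalation_chain(a) for a in alerts]
```

In this sketch the two identical latency alerts collapse into one group, and each alert is sent to the first route whose condition it satisfies.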

Grafana OnCall escalation chain interface (credit: Grafana)

Grafana OnCall includes an API for working with escalation chains, alert groups, scheduling, integrations, and outgoing webhooks. For example, the current alerts can be listed as follows, where {{API_URL}} and the meowmeowmeow token are placeholders:

curl "{{API_URL}}/api/v1/alerts/" \
  --request GET \
  --header "Authorization: meowmeowmeow" \
  --header "Content-Type: application/json"
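The same request could be issued from a script. The sketch below mirrors the curl call using only the Python standard library; the helper names are invented for this example, and the base URL and token passed in are placeholders that would need to be replaced with real values.

```python
import json
import urllib.request


def build_list_alerts_request(api_url, token):
    """Build (but do not send) the GET request from the curl example."""
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/api/v1/alerts/",
        method="GET",
        headers={"Authorization": token, "Content-Type": "application/json"},
    )


def list_alerts(api_url, token):
    """Send the request and decode the JSON body. Requires network access."""
    request = build_list_alerts_request(api_url, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values, as in the curl example.
request = build_list_alerts_request("https://oncall.example.com/", "meowmeowmeow")
```

Separating request construction from sending keeps the example inspectable without credentials; `list_alerts` would only succeed against a reachable OnCall instance with a valid token.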

Matvey Kukuy, senior engineering manager at Grafana, noted on Hacker News that Grafana OnCall is not just for users of Grafana.

"The idea of Grafana OnCall is to help you to group, deduplicate, route & deliver to Slack/SMS/Phone alerts from any sources. It could be a CloudWatch, DataDog, self-hosted Alertmanager, or Grafana of course. The only requirement for the alert source is to be able to generate a webhook and send it to us."
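Since the only stated requirement for an alert source is the ability to send a webhook, a custom source could be as small as the sketch below. The webhook URL and the payload fields (`title`, `message`, `state`) are assumptions for illustration; the actual URL comes from the integration being configured, and the expected payload shape depends on that integration.

```python
import json
import urllib.request


def build_alert_webhook(webhook_url, title, message, state="alerting"):
    """Build a JSON POST request carrying a single alert."""
    payload = {"title": title, "message": message, "state": state}  # hypothetical fields
    return urllib.request.Request(
        url=webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical integration URL; sending would use urllib.request.urlopen(webhook).
webhook = build_alert_webhook(
    "https://oncall.example.com/integrations/v1/webhook/EXAMPLE/",
    title="Disk usage above 90%",
    message="Host db-1 is running out of space",
)
```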

Reactions on social media to the releases were mixed. While some users welcomed the additions to the on-call and incident management space, user markbnj felt that "using a calendar integration to manage on-call schedules is the wrong approach" because the calendar is the output, not the input, in this case. User kungfufrog felt that the lack of a corresponding mobile application was a drawback, noting that a mobile app is important as it "can override DND/volume etc. on my phone so I can get woken up at night and respond to problems."

Grafana OnCall is available as a paid offering within Grafana Cloud, with a free tier also offered. The free tier includes a limited number of alerts per integration, team alerts, and API requests per API key. Grafana Incident is currently available in preview to Grafana Cloud users on both free and paid plans.
