
Monitoring as a Service

| by Carlos Sanchez on Feb 19, 2015. Estimated reading time: 2 minutes |

James Turnbull, VP of engineering at Kickstarter and author of The Docker Book, presented at both FOSDEM and Config Management Camp about monitoring, sharing his views on modern, scalable, business-oriented monitoring that is provided as a service with self-service APIs and integrated into project development.

In Turnbull's opinion, traditional monitoring focuses on the wrong things, is often archaic, and is not easily manageable. Monitoring should focus on business outcomes, in descending order of importance:

  1. Business logic.
  2. Applications.
  3. Services.
  4. Infrastructure.

Turnbull shared some figures on the current state of monitoring: 68% of the people responsible for monitoring do not monitor business logic, 90% have unmonitored failures, 80% have unactioned alerts, and 33% monitor reactively.

Traditional monitoring tools have architectures that no longer hold up. A centralized monitoring approach does not scale, and the pull model, in which a central monitor continuously polls the monitored systems for state, should be replaced with a push model, in which the target systems send events themselves.
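The push model can be illustrated with a minimal sketch in which the monitored application emits its own events instead of waiting to be polled. The StatsD-style `name:value|type` line format, the UDP transport, the default port 8125 and the metric names are assumptions chosen for illustration; the talk summary does not prescribe a wire format or a tool.

```python
import socket


def format_metric(name: str, value: float, metric_type: str = "g") -> bytes:
    """Encode one metric as a StatsD-style line: name:value|type.

    metric_type is "c" for a counter, "g" for a gauge, "ms" for a timer.
    """
    return f"{name}:{value}|{metric_type}".encode("ascii")


def push_metric(name: str, value: float, metric_type: str = "g",
                host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric to a collector over UDP (fire and forget).

    host and port are placeholders for wherever a collector listens.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_metric(name, value, metric_type), (host, port))
    finally:
        sock.close()
```

An application would call something like `push_metric("checkout.completed", 1, "c")` at the point where the event happens. Because UDP is connectionless, the application does not block or fail if no collector is listening, which is one reason this style is often chosen for instrumentation.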

Monitoring should focus on how services are performing, not merely on whether they are available.
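To make the difference between availability and performance concrete, here is a small sketch that summarizes a batch of request latencies. The function names, the nearest-rank percentile method and the 300 ms target are assumptions for illustration, not figures from the talk.

```python
import math


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def service_report(latencies_ms, error_count, target_p95_ms=300.0):
    """Summarize how a service performed, not just whether it answered."""
    total = len(latencies_ms)
    p95 = percentile(latencies_ms, 95)
    return {
        "available": total > 0,            # what a plain up/down check sees
        "p95_ms": p95,                     # what slow requests look like
        "error_rate": error_count / total,
        "meets_target": p95 <= target_p95_ms,
    }
```

With 18 requests at 100 ms and two at 900 ms and 950 ms, the service counts as available, yet the 95th percentile is 900 ms and the hypothetical target is missed. An availability check alone would not surface that.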

Fault detection is for yesterday. Metrics are king. Automation is key.

At Etsy everything is metered and graphed. Their top metrics are dollars, because that is their business.

The next step in modernizing monitoring is to make it a service, something offered to customers, who in this case are the business owners, developers, and others. Services are always customer focused. Monitoring should not be developed only when the application goes to production; it should be part of project development.

You are not the customer of your monitoring. Attach monitoring to product development. It is another feature, treated like any other development work. Change the accountability of who is responsible for monitoring. Teach the application developers to fish for themselves.

Turnbull shared that only 9% of developers do monitoring. Developers should be able to manage monitoring through self-service APIs rather than help desk tickets, and they need access to configurations and metrics:

  • Services: Present configurations, logs, and similar data to developers and business owners.
  • Consoles: Ask application developers what they want to see in the consoles, present them with tools to see their metrics.
  • Logging and reporting: Gather requirements from developers to know what they want to see in logs and reports.
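As a rough sketch of what self-service could look like from a developer's side, the class below lets a team register its own alert thresholds in code instead of filing a ticket. `MonitoringRegistry`, its methods and the metric names are entirely hypothetical; it is an in-memory stand-in, not the API of any tool mentioned in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringRegistry:
    """In-memory stand-in for a self-service monitoring API (hypothetical)."""
    alerts: dict = field(default_factory=dict)

    def register_alert(self, team: str, metric: str, threshold: float):
        """Let a team declare an alert threshold for one of its own metrics."""
        key = (team, metric)
        self.alerts[key] = threshold
        return key

    def evaluate(self, metric_values: dict):
        """Return the (team, metric) pairs whose current value exceeds the threshold."""
        return sorted(
            key
            for key, threshold in self.alerts.items()
            if metric_values.get(key[1], 0) > threshold
        )
```

A development team could call `register_alert("checkout", "checkout.p95_ms", 300)` as part of its normal code changes, so the monitoring configuration is reviewed and versioned alongside the feature it covers.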

Turnbull wrote a monitoring maturity model, describing the three major stages that organizations go through:

  • Manual: Monitoring is done manually, using checklists or simple scripts, focusing on minimizing downtime.
  • Reactive: Monitoring is mostly automatic, using tools like Nagios, with the focus on measuring availability and managing assets.
  • Proactive: Monitoring is considered core to managing infrastructure and the business, with a focus on measuring quality of service and customer experience, using tools like Nagios, Sensu and Graphite.

Turnbull is currently writing The Art of Monitoring, a hands-on introductory book on the art of modern infrastructure monitoring and metrics. The slides from both the FOSDEM and Config Management Camp talks are available on SlideShare.
