Monitoring Content on InfoQ

  • Handling Incidents and Outages

    David Mytton, CEO at Server Density, shared with the devopsdays Amsterdam 2015 crowd how the company handles incidents and outages. The process is grounded in a key set of principles: frequent public updates, exhaustive logging of the response activities, team effort and effective escalation. Server Density draws a lot of inspiration from the aviation industry, renowned for its safety procedures.

  • Monitoring Microservices and Containers: A Challenge by Adrian Cockcroft

    At GlueCon 2015, Adrian Cockcroft presented a list of rules for monitoring microservice and container-based applications. In addition to these guidelines, Cockcroft also highlighted a series of challenges for monitoring cloud-native container-based systems, and introduced his ‘Spigo/simianviz’ microservice simulation and visualisation tool.

  • Q&A on New Relic Software Analytics Improvement

    New Relic has released a set of new features for its Software Analytics Platform. Service Maps is a real-time visual map focused on services. Together with a tool for Docker monitoring, a dashboard for NoSQL databases and a unified alerts platform, the company wants to reduce complexity in modern software architecture.

  • Weaveworks Release ‘Weave Scope’ for Container and Microservice Monitoring

    Weaveworks, creators of the Weave Docker virtual networking solution, have released a pre-alpha version of 'Weave Scope', an open source developer-focused container monitoring tool. Scope automatically generates a map of containers, enabling developers to visualise, monitor, and control applications by using the information exposed to drive deployment and operational decisions.

  • Phil Calcado on Lessons Learnt During SoundCloud's Microservice Migration

    At QCon London 2015 Phil Calcado shared lessons learnt from SoundCloud’s move from a monolithic to microservices architecture, and stated that the core requirements for building a microservice platform include developing capabilities for rapid provisioning, basic monitoring and rapid application deployment.

  • Gain insight into the performance of your apps with Google Cloud Monitoring

    Google Cloud Monitoring is now available for free whilst in beta to all Google Cloud Platform customers. The service provides dashboards and alerts for cloud-powered applications, giving developers and operations staff insight into, and metrics for, their services.

  • Monitoring as a Service

    James Turnbull, VP of engineering at Kickstarter and author of The Docker Book, presented at both FOSDEM and Config Management Camp about monitoring, sharing his views on modern, scalable, business-oriented monitoring, provided as a service with self-service APIs and integrated into project development.

  • Amazon CloudWatch Supports JSON Logs and Integrates AWS CloudTrail

    Shortly after releasing the AWS CloudTrail Processing Library (CPL), Amazon Web Services has also integrated AWS CloudTrail with Amazon CloudWatch Logs to enable alarms and "notifications from CloudWatch, triggered by specific API activity captured by CloudTrail". The support for monitoring JSON-formatted logs that this implies has recently been officially released as well.

  • Atlas: Netflix's Primary Telemetry Platform

    Netflix has open sourced Atlas, part of the next-generation monitoring platform it has been working on since early 2012. The company developed Atlas to store time series data in order to provide near real-time operational insight to teams.

  • State of On-Call Survey

    VictorOps published the results of its survey on the state of on-call activities, which it claims to be the first of its kind. The survey includes data about the challenges of being on-call, the way those who are on-call get notified, the tools they use to support incident resolution, the prevalence of false alarms, the average time of each incident resolution and more.

  • Amazon CloudWatch Gains Log Monitoring and Storage

    Amazon CloudWatch recently gained log file monitoring and storage for application, operating system and custom logs, and has also enhanced its support for Microsoft Windows Server to cover a wider variety of log sources.

  • 5 years of metrics and monitoring

    In his DevOps Days Belgium talk, Lindsay Holmwood gave a retrospective on metrics and monitoring, outlined his typical metrics and monitoring pipeline, exposed some flaws in monitoring systems, and shared his view of what the future may bring in the field.

  • Lessons Learned Building Distributed Systems at Bitly

    At the Bacon Conference last May, bitly Lead Application Developer Sean O'Connor explained the most relevant lessons bitly developers learned while building a distributed system that handles 6 billion clicks per month.

  • How Etsy Deploys More Than 50 Times a Day

    Daniel Schauenberg described at QCon London how Etsy, renowned for its DevOps and Continuous Delivery practices, does 50 deploys/day. A fully automated deployment pipeline, thorough application monitoring and IRC-based collaboration are all important to achieve this rate of change while keeping risk to a minimum. Etsy has about 60 million monthly visits and 1.5 billion page views per month.

  • Discussion on Nagios Fitness for Purpose

    At a recent London DevOps meetup, Andy Sykes launched a debate on whether Nagios, a well-known application that offers monitoring and alerting services, should be replaced with a better solution. Laurie Denness, from Etsy, argued in a reply that Nagios and its ecosystem are still a great solution in the monitoring and alerting arena.
