James Turnbull, VP of engineering at Kickstarter and author of The Docker Book, presented at both FOSDEM and Config Management Camp about monitoring, sharing his views on modern, scalable, business-oriented monitoring, provided as a service with self-service APIs and integrated into project development.
Shortly after releasing the AWS CloudTrail Processing Library (CPL), Amazon Web Services has also integrated AWS CloudTrail with Amazon CloudWatch Logs to enable alarms and respective "notifications from CloudWatch, triggered by specific API activity captured by CloudTrail". The implied support for monitoring JSON-formatted logs has recently been officially released as well.
Netflix has open sourced Atlas, part of the next-generation monitoring platform it has been working on since early 2012. The company developed Atlas to store time series data in order to provide near real-time operational insight to teams.
VictorOps published the results of its survey on the state of on-call activities, which it claims to be the first of its kind. The survey includes data about the challenges of being on-call, the way those who are on-call get notified, the tools they use to support incident resolution, the prevalence of false alarms, the average time to resolve an incident and more.
To thoroughly remove waste from a process, you need flow to deliver just in time, plus mindfulness and situational awareness in the organization to handle problems with processes and built-in human intelligence. Organizations apply concepts from flow to develop what is needed when it is needed, and use pull to prevent inventories. What they also need is “Jidoka”: mindfulness and situational awareness.
Kanban is often used to manage work, but the concepts of kanban can also be used to guide a journey of change in an organization. This is a case study of an insurance company that used kanban to drive change, improving visibility and predictability and engaging its people.
Amazon CloudWatch recently gained log file monitoring and storage for application, operating system and custom logs, and has meanwhile enhanced its support for Microsoft Windows Server to cover a wider variety of log sources.
Lindsay Holmwood gave a retrospective on metrics and monitoring in his DevOps Days Belgium talk, outlined his typical metrics and monitoring pipeline, exposed some flaws in monitoring systems, and shared his view of what the future may bring in the field.
Ryan Mckergow explains various ways to set up story walls for agile teams. The post covers setting up columns and rows and selecting colors and avatars for the story wall.
At the Bacon Conference last May, bitly Lead Application Developer Sean O'Connor explained the most relevant lessons bitly developers learned while building a distributed system that handles 6 billion clicks per month.
3scale launched APITools, targeted at API consumers, in April this year. InfoQ spoke to 3scale management about the motivation and underlying technology, among other things, and came away with some interesting insights as well as news of upcoming initiatives to involve the community.
LiquidPlanner, a PPM tool, added features such as a card view to make it suitable for agile teams. InfoQ spoke to Liz Pearce, CEO of LiquidPlanner, to learn more about the tool and its functionality.
Daniel Schauenberg described at QCon London how Etsy, renowned for its DevOps and Continuous Delivery practices, does 50 deploys/day. A fully automated deployment pipeline, thorough application monitoring and IRC-based collaboration are all important to achieve this rate of change while keeping risk to a minimum. Etsy has about 60 million monthly visits and 1.5 billion page views per month.
At a recent London DevOps meetup, Andy Sykes launched a debate on whether Nagios, a well-known application that offers monitoring and alerting services, should be replaced with a better solution. Laurie Denness, from Etsy, argued in a reply that Nagios and its ecosystem are still a great solution in the monitoring and alerting arena.