Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Post-Mortems Content on InfoQ

News

  • How External IT Providers Can Adopt DevOps Practices

    IT suppliers can follow the “you build it, you run it” mantra by working in small batches, using an experimental approach to product development, and validating small product increments in production. The supplier has to find out what the client’s goal is and adopt it as its own in order to work collaboratively.

  • PayPal Engineering Teams Implement Premortem Analysis

    In a recent blog post, the PayPal engineering team described how it uses premortem analysis as part of its regular software design process. The team adopted a customized version of premortem analysis last year and reports that it has been highly beneficial. Premortem is a strategy in which a team imagines that a project has failed and then works backward to determine what could have led to the failure.

  • Why the Most Resilient Companies Want More Incidents

    According to John Egan, the incident management process is meant to be a cycle that covers not just the response, but also accounting for the root cause and updating internal processes and practices across the industry. He advises lowering the barrier to reporting incidents, holding effective incident review meetings using blameless postmortems, and giving everyone access to postmortems.

  • How to Embrace “You Build It, You Run It” with Paul Hammant at QCon London

    Paul Hammant talked at QCon London about making developers responsible for the first line of support in production; as the saying goes, “if you build it, you run it.” Hammant recommends following this practice only if proper support levels and escalation policies are defined. With those in place, companies could reduce the chances of burnout or staff quitting.

  • Blameless Post-Mortems and On-Call Gamification at 1st DevOpsDays Portugal (Day 2)

    Ten years after the first DevOpsDays conference in Ghent, the evolution of DevOps and organizations trying to adopt it was at the forefront of the first DevOpsDays conference in Portugal. On the second day, a mix of local and international speakers covered topics such as learning from incidents without blame, gamifying on-call, modern pipelines, and more.

  • Atlassian Announces Solutions for Incident Management

    Atlassian announced on September 4 that they have launched a new product called Jira Ops and that they will acquire OpsGenie. Organizations can use Jira Ops for resolving incidents and doing post-mortems to learn from them. OpsGenie adds prompt and reliable alerting to Jira Ops.

  • Psychological Safety in Post-Mortems

    Emotions often come to the fore when there is an incident; psychological safety in blameless post-mortems is essential for the learning process to happen. The post-mortem session must be fairly moderated, preferably by an outsider, giving everyone a turn to speak without criticism. Don’t start the analysis of the incident before there is a clear and common understanding of what actually happened.

  • How ING Bank Does SRE

    Janna Brummel and Robin van Zijll, from ING Netherlands, talked at the Velocity conference in London about how poor availability of their internet banking systems prompted the bank to implement an SRE culture. A centralized SRE team was set up in the Netherlands to provide tooling, consulting, and education on reliability to product teams (known internally as BizDevOps squads).

  • Post-Mortems Trends and Behaviors

    Eric Siegler presented his findings at Velocity from analyzing data from 1000 post-mortems run by 125 different organizations over a six-month period. Main trends include the prevalence of blameless post-mortems; the fact that only 1 in 100 post-mortems refers to "human error"; and that analyzing the lifecycle of incidents can provide useful insights into weaknesses in the incident response process.

  • John Willis Talks DevOps Superpatterns at DOES17 London

    John Willis, co-author of The DevOps Handbook, spoke about the emerging DevOps Superpattern at the 2017 DevOps Enterprise Summit, held June 5 and 6 in London.

  • Handling Incidents and Outages

    David Mytton, CEO at Server Density, shared with the devopsdays Amsterdam 2015 crowd how they handle incidents and outages. The process is grounded in a key set of principles: frequent public updates, exhaustive logging of the response activities, team effort, and effective escalation. Server Density draws a lot of inspiration from the aviation industry, renowned for its safety procedures.
