Incident Response Content on InfoQ

  • What Resiliency Means at Sportradar

    At this year's QCon London conference, Pablo Jensen, CTO at Sportradar, talked about the practices and procedures in place at Sportradar to ensure its systems meet expected resiliency levels. Jensen mentioned that reliability is influenced not only by technical concerns but also by organizational structure, governance, and client support, and that it requires ongoing effort to continuously improve.

  • Post-Mortems Trends and Behaviors

    Eric Siegler presented his findings at Velocity from analyzing data from 1000 post-mortems run by 125 different organizations over a six-month period. Main trends include the prevalence of blameless post-mortems; the fact that only 1 in 100 post-mortems refers to "human error"; and that analyzing the lifecycle of incidents can provide useful insights into weaknesses in the incident response process.

  • Q&A with Sanjeev Sharma on His DevOpsDays NZ Keynote

    Raf Gemmail speaks with IBM's Sanjeev Sharma about his upcoming DevOpsDays NZ closing keynote on the DevOps and SRE lessons we can learn from Apollo 13.

  • Handling Incidents and Outages

    David Mytton, CEO at Server Density, shared with the devopsdays Amsterdam 2015 crowd how the company handles incidents and outages. The process is grounded in a key set of principles: frequent public updates; exhaustive logging of the response activities; and team effort with effective escalation. Server Density draws a lot of inspiration from the aviation industry, which is renowned for its safety procedures.
