Failure Content on InfoQ

  • Anti-patterns for Handling Failure

    Oliver Hankeln shares the anti-patterns he has found for handling failure in organizations: hiding mistakes, engaging in the blame game, the arc of escalation, and cowardice. He then suggests corrective actions for each of them.

  • Using Pairing for Experimenting in Presentations

    In the closing keynote of the Agile Eastern Europe 2015 conference, Yves Hanoulle experimented with pair presenting together with his son Joppe. InfoQ interviewed Joppe and Yves Hanoulle about doing experiments, checking the safety of the environment and ways to make it safer, learning from failure, and presenting in pairs at conferences.

  • Failure Injection Testing: Controlling Failure in Production

    Netflix's Failure Injection Testing (FIT) bridges the gap between isolated testing and unmitigated chaos testing by controlling the impact of the test. FIT establishes a context that other components of Netflix's production testing and infrastructure systems interpret, adjusting the behavior of the system accordingly.

  • Mindfulness and Situational Awareness in Organizations

    To thoroughly remove waste from a process, you need flow to deliver just in time, plus mindfulness and situational awareness in the organization to handle process problems with built-in human intelligence. Organizations apply concepts from flow to develop what is needed when it is needed, and use pull to prevent inventories. What they also need is “Jidoka”: mindfulness and situational awareness.

  • How Netflix Handled the Reboot of 218 Cassandra Nodes

    Amazon performed a major maintenance update at the end of September in order to patch a security vulnerability in a Xen hypervisor affecting about 10% of their global fleet of cloud servers. This update involved the rebooting of those servers, with consequences for AWS users and the services they provide, including one of their largest clients, Netflix.

  • Avoidance of Organizational Dysfunction Leads to Scrum Masters' Failure

    Bob Marshall attributes the failure of Scrum Masters in most organizations to a lack of awareness, among those adopting Scrum, of the Scrum Master’s responsibility to tackle organizational dysfunction.

  • Leslie Lamport on Distributed Systems and Precise Thinking

    Leslie Lamport is the author of some of the most cited computer science papers and won a Turing Award in 2013 for his seminal work in distributed and concurrent systems. This is a summary of an interview that Lamport gave to Software Engineering Radio, touching on themes such as his early work in distributed systems and the importance of precise thinking in programming.

  • Fail Fast Means Learn Fast

    Failing fast and often is one of the practices encouraged for agile teams. Sander Hoogendoorn, author of the book This is Agile, discusses on his blog the importance of having a strategy that helps you decide whether to abort a project by assuming its failure at an early stage.

  • Working with Investors as a Lean Startup

    Entrepreneurs using lean startup can work with investors to raise capital for their business. Business plans from lean startups often differ from those of traditional startups, and lean startup encourages learning from failure and pivoting, which might scare off investors. Can entrepreneurs and investors together use the lean startup approach to do fundraising?

  • Attitudes for Sustainable Lean Startup Teams

    Ramli John gave an ignite talk about the minimum viable attitudes for lean startup teams at the 2013 Lean Startup Conference. According to Ramli, there are three attitudes that help teams run lean sustainably over time: humbleness, hunger and happiness.

  • How Can You Learn Early and Fast?

    Agile suggests that teams should fail fast to enable quick learning from mistakes. Learning from failure is one approach; you can also learn early and fast from successes, by experimenting, or by using a plan for knowledge acquisition.

  • Testing Resiliency at PagerDuty Without a Simian Army

    Doug Barth, from PagerDuty, talked at DevOps Days London about their approach to starting resiliency testing of their systems without dedicating a lot of automation effort up front. The goal was to quickly start learning about failure points and openly discuss how to fix them with only one hour per week of effort.

  • Learning from Failures with The Lean Startup

    The lean startup is about fast delivery of desired products to customers and increasing your understanding of customers' needs. With the lean startup, people can learn faster from failures and become better innovators. Some teachers use a lean-startup-based approach in education, which helps their students learn faster.

  • Avoiding Downtime When Cloud Services Fail

    Another AWS outage hit several large websites and their services last week. What can be done to avoid downtime? Architect for failover, not just for scale.

  • Adopting Agile in an Environment of Fear

    Agile adoption and transformation is sometimes effective, and sometimes not. Is there a common thread to the failures? Does fear have anything to do with it? And what can we expect if we start an agile adoption initiative in an environment that is full of fear?
