Failure Content on InfoQ
-
Managing Internal and External Services for a High Uptime Goal
Shobana Radhakrishnan shares best practices for implementing API integration with third-party services, managing change, and dealing with failures.
-
The Power of an Agile Mindset
Linda Rising discusses the “agile mindset”, an attitude that equates failure and problems with opportunities for learning, and shares practical suggestions to become even more agile.
-
Architecture War Stories
Stefan Tilkov shares entertaining examples of real life architectural disasters in software projects.
-
Building Resilience: How Outages Shaped Etsy's Systems
Avleen Vig presents some of the most unexpected, confusing, hilarious and face-palming events during Etsy's outages to show what can be learnt from their problems to build more resilient systems.
-
Principles of Reliable Communication & Shared State
Andy Piper describes some fundamentals of communicating reliably in an unreliable world and communication techniques used to build distributed data structures that can tolerate failures.
-
Failure: The Good Parts
Viktor Klang keynotes on the imminence of failure and the need to prepare for it, along with several ways of managing failure when it happens.
-
Evolving Culture and Values. Understanding the Tradeoffs. Growth through Failure. The Importance of Leadership and Open Communication.
Pedram Keyani discusses the importance of evolving the culture and values of an organization, dealing with tradeoffs, learning from failure, proper leadership and open communication.
-
Running an Agile Transformation using Lean Startup
Jason Little discusses how to avoid an organizational change failure when introducing Agile by leveraging principles of Lean Startup and Customer Development.
-
How Netflix Architects for Survival
Jeremy Edberg discusses how Netflix designs its systems to survive outages, network latency and random instance failure.
-
Resiliency through Failure - Netflix's Approach to Extreme Availability in the Cloud
Ariel Tseitlin discusses Netflix's failure-based suite of tools, collectively called the Simian Army, used to improve resiliency and maintain the cloud environment.
-
Keynote: System, Heal Thyself
Mike Andrews discusses architecting for failure even when you don’t know what might fail.