DevOps

From Darwin to DevOps: John Willis and Gene Kim Talk about Life after The Phoenix Project

by Helen Beal on May 23, 2018

IT Revolution recently published an audiobook containing nearly eight hours of conversation between Gene Kim and John Willis, titled Beyond the Phoenix Project – the Origins and Evolution of DevOps.

DevOps

What Resiliency Means at Sportradar

by Manuel Pais on Apr 06, 2018

At this year's QCon London conference, Pablo Jensen, CTO at Sportradar, talked about the practices and procedures Sportradar has in place to ensure its systems meet expected resiliency levels. Jensen explained that reliability depends not only on technical concerns but also on organizational structure, governance, and client support, and that it requires ongoing effort to continuously improve.

DevOps

Chaos Engineering at Twilio

by Hrishikesh Barua on Dec 25, 2017

The Twilio team describes their foray into Chaos Engineering where they use Gremlin to inject failures into their homegrown queuing system shards to test for automated recovery.

Cloud

Werner Vogels on “21st Century [Cloud] Architectures”: Availability, Reliability and Resilience

by Daniel Bryant on Dec 03, 2017

At the AWS re:Invent 2017 conference, Werner Vogels, CTO of Amazon, presented a keynote on the core concepts required for building “21st Century Architectures” in the cloud. Highlights of the talk included the emerging practices of evolutionary and “cloud native” architectures, security becoming everyone’s responsibility, and the benefits of chaos engineering.

DevOps

Serverless Challenges in Hybrid Environments

by Manuel Pais on Nov 30, 2017

Sam Newman, independent consultant and author of the book "Building Microservices", talked at the Velocity conference in London on the challenges faced when hybrid systems rely on both serverless architectures and traditional infrastructure. In particular, Newman discussed how serverless changes our notion of resiliency and how the two paradigms clash at times of high load in the system.

DevOps

Expedia's Journey toward Site Resiliency: Embracing Chaos Testing in Dev and Production at QCon SF

by Daniel Bryant on Nov 19, 2017

At QCon SF, Sahar Samiei and Willie Wheeler presented “Expedia’s Journey Toward Site Resiliency” and discussed building a community of practice around resilience testing within Expedia. The results have generally been positive: Netflix’s Chaos Monkey has been running daily in production since May 15th, and resilience tests have been added to four Tier 1 service pipelines.

Architecture & Design

Adrian Cockcroft Discusses Chaos Architecture: "Four Layers, Two Teams, and an Attitude"

by Daniel Bryant on Nov 17, 2017

At QCon San Francisco, Adrian Cockcroft presented “Chaos Architecture” and discussed the evolution of cloud native architecture and how chaos engineering can be applied to produce better and safer systems. He described effective chaos architecture and engineering as consisting of “four layers, two teams, and an attitude”.

DevOps

Designing Services for Resilience: Nora Jones Discusses Netflix Chaos Engineering at QCon SF

by Daniel Bryant on Nov 16, 2017

At QCon SF, Nora Jones presented “Designing Services for Resilience Experiments: Lessons from Netflix”. Key takeaways from the talk included: the customer experience is a priority; designing for resiliency testability is a shared responsibility; configuration changes can cause outages; and engineers should have explicit monitoring in place to detect antipatterns in configuration changes.

DevOps

Choose Your Own Adventure: Chaos Engineering at QCon New York 2017

by Pierre-Luc Maheu on Aug 22, 2017

Nora Jones, senior chaos engineer at Netflix, talked about chaos engineering at QCon New York 2017. She presented the different stages of chaos engineering adoption and shared stories from her experiences at Jet and Netflix.

Cloud

Netflix Engineer Lorin Hochstein on Chaos Monkey 2.0

by Rags Srinivas on Oct 25, 2016

Netflix made waves when it initially announced Chaos Monkey, a tool that would terminate normally healthy VM instances in production. The goal was to embrace failure and thereby increase resiliency. Rags Srinivas caught up with Lorin Hochstein at Netflix regarding the recent upgrade to Chaos Monkey.

DevOps

Chaos Monkey 2.0 Runs via Spinnaker

by Abel Avram on Oct 24, 2016

Netflix has recently made the source code of Chaos Monkey 2.0 available. The latest iteration of the resilience tool is fully integrated with Spinnaker and with event tracking systems, but SSH support has been removed.

DevOps

DevOps Days Kiel Day 2

by Manuel Pais on May 19, 2016

A roundup of the talks from the second day of DevOps Days Kiel.

Java

Google Kick-Starts Git Ketch: A Fault-Tolerant Git Management System

by Abraham Marín Pérez on Feb 02, 2016

Although development has only just started, Google has announced its first commits of Git Ketch, a multi-master Git management system that replicates information across multiple Git servers for resilience and scalability. The changes are based on JGit, a Java-based Git server, although other Git servers may be part of the multi-master cluster.


Microsoft Makes Available Their Platform for Building Microservices

by Abel Avram on Apr 30, 2015

Microsoft has announced and made available a preview of Azure Service Fabric (ASF), a cloud platform that includes a runtime and lifecycle management tools for creating, deploying, running and managing microservices. ASF microservices can be deployed on Azure or on-premises on Windows Server private or hosted clouds. Support for Linux is planned for the future.


Anti-patterns for Handling Failure

by Manuel Pais on Apr 04, 2015

Oliver Hankeln shares the anti-patterns he has found for handling failure in organizations: hiding mistakes, engaging in the blame game, the arc of escalation, and cowardice. He then suggests corrective actions for each of them.
