Netflix has recently released the source code of Chaos Monkey 2.0. The latest iteration of the resilience tool is fully integrated with Spinnaker and event tracking systems, but SSH support has been removed.
Katharina Probst and Justin Becker, engineering managers at Netflix, recently wrote an article on maintaining developer autonomy in API environments for Netflix's tech blog. The August 23 blog post "Engineering Trade-Offs and the Netflix API Re-Architecture" explores the difficulty of reconciling developer code and process ownership with multiple team-wide shared services in API environments.
Netflix's goal is to predict what you want to watch before you watch it. They do this by running a number of machine learning (ML) workflows every day. Meson is a workflow orchestration and scheduling framework that manages the lifecycle of the machine learning pipelines that build, train and validate the personalization algorithms behind the video recommendations.
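As a rough illustration of what orchestrating a build, train and validate pipeline involves, the sketch below runs workflow steps in dependency order. It is a minimal sketch and not Meson's API: `run_workflow`, the step names and the step functions are all hypothetical, and ordering is delegated to Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run each step after the steps it depends on.

    steps:        mapping of step name -> callable taking a dict of upstream results
    dependencies: mapping of step name -> set of step names it depends on
    (Both shapes are assumptions made for this sketch.)
    """
    # Include every step, even those with no declared dependencies.
    graph = {name: set(dependencies.get(name, ())) for name in steps}
    order = list(TopologicalSorter(graph).static_order())

    results = {}
    for name in order:
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = steps[name](upstream)
    return order, results


# Hypothetical three-step pipeline.
steps = {
    "build": lambda upstream: "features",
    "train": lambda upstream: "model(" + upstream["build"] + ")",
    "validate": lambda upstream: "report(" + upstream["train"] + ")",
}
dependencies = {"train": {"build"}, "validate": {"train"}}

order, results = run_workflow(steps, dependencies)
```

A production scheduler would also handle retries, parallel branches and resource allocation, none of which this sketch attempts.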
On May 11th, 2016, Pivotal announced that their latest release of Spring Cloud has reached General Availability (GA). InfoQ recently had the chance to chat with Pieter Humphrey, consulting product marketing manager at Pivotal, to gain further insight into this release and the state of their platform.
At QCon London 2016 Peter Alvaro and Kolton Andrus shared lessons learned from a fruitful collaboration between academia and industry, which ultimately resulted in the creation of a novel method for automating failure injection testing at Netflix. Core learnings included: work backwards from what you know; meet in the middle; and adapt the theory to the reality.
At the microXchg 2016 conference, held in Berlin, Rick Buskens presented “Microservice Deployment Pipelines with Spinnaker”, which discussed the collaboration between Netflix and Google on the Netflix-conceived Spinnaker continuous delivery platform. Spinnaker can be used to create build pipelines for safe and predictable deployment of microservice applications across multiple cloud providers.
Last year, the Netflix Cloud Database Engineering (CDE) team introduced Dynomite, a proxy layer that aims to turn any non-distributed database into a sharded, multi-region, replication-aware distributed database system. Netflix has now released a benchmark of Dynomite with Redis on AWS infrastructure.
Rising from the ashes of GigaOm, the tribal gathering of cloud elders that is Structure has returned, and got off to a strong start with Battery Ventures' Adrian Cockcroft presenting on the State of the Cloud and Container Ecosystems. Cockcroft paid particular attention to the impact of containers, which weren’t even a major discussion topic at the last Structure conference in 2013.
InfoQ had the opportunity to interview Daniel Jacobson about ephemeral APIs, their link to experience-based APIs and when to consider them. He also explains why generic resource-based API architectures can run into problems at scale and why he doesn’t use an API descriptor language. Finally, he describes the various tools they built to deliver those APIs, including Falcor, Scryer and Nicobar.
Today, Pivotal announced an update to Pivotal Cloud Foundry (PCF), the commercial version of a popular open-source platform for building, deploying, and running cloud-native applications. This 1.6 release gives developers native access to a subset of Spring Cloud’s Netflix OSS services, built-in support for .NET applications, beta support for Docker images, and integrated ALM tools.
Based on their experience with arbitrarily shutting down servers or simulating the shutdown of an entire data center in production, Netflix has proposed a number of principles of chaos engineering.
At QCon New York 2015, Kolton Andrus discussed Netflix’s Failure Injection Testing (FIT) platform, which allows the injection and monitoring of arbitrary failure scenarios to a targeted group of customers using the Netflix production web services. FIT allows Netflix to maintain an ‘antifragile’ programming culture, which results in the creation of systems that are resilient to failure.
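The idea of scoping an injected failure to a targeted group of customers can be sketched as follows. This is a minimal illustration under stated assumptions, not FIT's implementation: `FailureScenario`, `call_dependency`, the `"ratings-service"` injection point and the customer IDs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


class InjectedFailure(Exception):
    """Raised in place of the real dependency call when a scenario applies."""


@dataclass
class FailureScenario:
    # Hypothetical fields for this sketch; not taken from FIT.
    injection_point: str
    targeted_customers: set = field(default_factory=set)

    def applies_to(self, customer_id, injection_point):
        return (
            injection_point == self.injection_point
            and customer_id in self.targeted_customers
        )


def call_dependency(customer_id, injection_point, real_call, fallback, scenario=None):
    """Call the dependency, failing on purpose only for targeted customers."""
    try:
        if scenario is not None and scenario.applies_to(customer_id, injection_point):
            raise InjectedFailure(injection_point)
        return real_call()
    except InjectedFailure:
        # A resilient caller degrades gracefully instead of propagating the error.
        return fallback()


scenario = FailureScenario("ratings-service", {"cust-1"})

targeted = call_dependency(
    "cust-1", "ratings-service", lambda: "live ratings", lambda: "default ratings", scenario
)
untargeted = call_dependency(
    "cust-2", "ratings-service", lambda: "live ratings", lambda: "default ratings", scenario
)
```

Only the targeted customer sees the fallback path, so the blast radius of the experiment stays limited while everyone else gets the real response.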
The Netflix team has released FIDO, an open source system for automatically analysing security events. Not to be confused with the FIDO Alliance, Netflix's platform stands for Fully Integrated Defense Operation. The project's GitHub page describes FIDO as "an orchestration layer used to automate the incident response process by evaluating, assessing and responding to malware."
Adrian Cockcroft recently gave an interview to ActiveState's John Wetherill about microservices. In it he talks about how polyglot programming fits into microservices, and about the impact on him when he heard that companies such as Target and Macy's, as well as Homeland Security, had adopted that architectural approach.