Observability Content on InfoQ
-
Reducing Build Time with Observability in the Software Supply Chain
Tools commonly used in production can also be applied to gain insight into the CI/CD pipeline and reduce build time. Ben Hartshorne, engineer at honeycomb.io, gave the presentation Observability in the SSC: Seeing into Your Build System at QCon San Francisco 2019.
-
AWS CloudWatch Adds Observability Tool for Visualizing Distributed Applications
AWS released ServiceLens, a fully managed observability solution built within CloudWatch. ServiceLens is designed to visualize and analyze the health, performance, and availability of distributed applications. It is currently available in all commercial regions but requires the use of AWS X-Ray.
-
Elastic Releases New Security Suite Integrating SIEM with Endpoint Protection
Elastic recently released Elastic Endpoint Protection, a new feature for integrated security built upon Elastic’s acquisition of Endgame. With Endpoint, Elastic is combining their SIEM product and endpoint security into a single solution built on the Elastic stack.
-
Managing Microservice "Deep Systems": Q&A with Ben Sigelman
InfoQ interviewed Ben Sigelman, CEO of LightStep, about managing microservice "depth" at scale.
-
CircleCI Adds New Sumo Logic Integration to Provide Build Pipeline Analytics
CircleCI and Sumo Logic have released an integration to allow developers to view analytical data about CircleCI jobs from within a Sumo Logic dashboard. This integration is packaged using the CircleCI package management solution, Orbs. The integration includes real-time pipeline data such as number of failed builds, average run time, and job status.
-
HashiCorp Releases Consul 1.5.0 with Layer 7 Observability and Centralized Configuration
HashiCorp released version 1.5.0 of Consul, their service mesh application and key-value store. This release contains the first features on their new roadmap for Consul, including support for L7 observability and load balancing via Envoy, centralized configuration, and ACL authentication support for trusted third-party applications.
-
Observability in Testing with ElasTest
In a distributed application it is difficult to use debugging techniques common in developing non-distributed applications. Bringing production observability to your testing environment helps to find bugs, argued Francisco Gortázar at the European Testing Conference 2019. He presented ElasTest, a tool for developers to test and validate complex distributed systems using observability.
-
Recommendations When Starting with Microservices: Ben Sigelman at QCon London
During the years Ben Sigelman worked at Google, they were creating what we today call a microservices architecture. Some mistakes were made during this adoption, which he believes are being repeated today by the rest of the industry. In his presentation at QCon London 2019, Sigelman described his recommendations to avoid making these mistakes when starting with microservices.
-
Chaos Engineering Observability: Q&A with Russ Miles
In a new O’Reilly report, “Chaos Engineering Observability: Bringing Chaos Experiments into System Observability”, the author, Russ Miles, explores why he believes the topics of observability and chaos engineering “go hand in hand”. He argues that as engineers begin to run chaos experiments, they will need to be able to ask many questions about the underlying system being experimented on.
-
Three Pillars with Zero Answers: Rethinking Observability with Ben Sigelman
At KubeCon NA, held in Seattle, USA, in December 2018, Ben Sigelman presented “Three Pillars, Zero Answers: We Need to Rethink Observability” and argued that many organisations may need to rethink their approach to metrics, logging and distributed tracing.
-
Testing Complex Distributed Systems at FT.com: Sarah Wells Shares Lessons Learned
The complexity in complex distributed systems isn’t in the code; it’s between the services or functions. Testing implies balancing finding problems versus delivering value, said Sarah Wells at the European Testing Conference. Testers often have the best understanding of what the system does; they have a good hypothesis about what went wrong, and are able to validate it pretty quickly.
-
Adopting Envoy as a Service-to-Service Proxy at Reddit
Reddit introduced Envoy into their backend framework as a service-to-service proxy to support their ongoing architectural improvements. By adopting Envoy as a service-to-service Layer 4/Layer 7 proxy, they discovered significant improvements in observability, ease of adoption, and performance.
-
The Evolution of Full Cycle Developers at Netflix: Greg Burrell at QCon SF
At QCon San Francisco, Greg Burrell talked about the journey towards “full cycle developers” within the Netflix edge engineering team. Following the principle of “operate what you build”, developers within this team chose to take on more operational responsibility for their services, and were facilitated by comprehensive tooling, training and management support.
-
Shipping More Safely by Encouraging Ownership of Deployments
Many incidents happen during or right after a release, argues Charity Majors, CEO at Honeycomb. She believes that stronger ownership of the deployment process by developers will ensure it is executed regularly and reduce risk. She argues for investment in tooling, high observability during and after release, and small, frequent releases to minimize the impact caused by shipping new code.
-
Scaling Observability at Uber: Building In-House Solutions, uMonitor and Neris
Uber’s infrastructure consists of thousands of microservices supporting mobile applications, infrastructure, and internal services. To provide high observability of these services, Uber’s Observability team built two in-house monitoring solutions: uMonitor for time-series metrics-based alerting, and Neris for host-level checks and metrics.