Monitoring Content on InfoQ
-
CNCF Adds Security, Service Mesh and Tracing Projects: Docker Notary, Lyft Envoy and Uber Jaeger
The Cloud Native Computing Foundation (CNCF) has announced the addition of four new hosted projects over the past month: Docker’s Notary, The Update Framework (TUF), Lyft’s Envoy, and Uber’s Jaeger.
-
Monitoring Cloudflare's Global Network Using Prometheus
Matt Bostock’s SRECON 2017 Europe talk covers how Prometheus, a metrics-based monitoring tool, is used to monitor the globally distributed infrastructure and network of Cloudflare, a CDN, DNS and DDoS mitigation provider.
-
Amazon CloudWatch Dashboards Gains API and CloudFormation Support
Amazon Web Services (AWS) recently added programmatic creation and manipulation of CloudWatch dashboards and widgets to support use cases such as dynamic resource lifecycle tracking and consistent cross-account dashboard maintenance.
-
Amazon CloudWatch Events Gains Cross-Account Event Delivery
Amazon Web Services (AWS) recently added cross-account event delivery to Amazon CloudWatch Events to support use cases such as the tracking of events across an entire organization and the handling of events in separate accounts to implement advanced security schemes.
-
Why the JVM is a Good Choice for Serverless Computing: John Chapin Discusses AWS Lambda at QCon NY
At QCon New York, John Chapin presented “Fearless AWS Lambdas”. He argued that the JVM is a good platform on which to deploy serverless code, and provided guidance on extracting the best performance from Java-based AWS Lambda functions.
-
AWS Lambda Support Added to AWS X-Ray Distributed Tracing Service
Following the General Availability (GA) release of the AWS X-Ray distributed tracing service in April, Amazon has added AWS Lambda support for AWS X-Ray, enabling function invocations and associated metadata to be recorded, displayed graphically via the AWS Console, and analysed for debugging or fault resolution purposes.
-
Metrics Collection and Monitoring at Robinhood Engineering
The Robinhood server operations team published a series of articles describing their metrics collection, monitoring and alerting infrastructure. OpenTSDB, Grafana, Kafka and Riemann form the core of the stack. Kafka acts as a proxy layer from which the data is pushed into Riemann for stream processing of the metrics and into OpenTSDB for storage.
-
Weaveworks Adds Release Automation and Incident Management to Weave Cloud Continuous Delivery SaaS
Weaveworks has released new features for the Weave Cloud SaaS platform, which aims to simplify deployment, monitoring and management of containers and microservices. The new features include: incident management with historical audit, instant query, and customisable analytics and dashboards; release automation and point-in-time rollback for continuous delivery pipelines; and advanced Kubernetes troubleshooting.
-
DigitalOcean Adds Monitoring and Alerting Features
Cloud infrastructure provider DigitalOcean recently released capabilities for monitoring servers and sending alerts. While not novel, this free feature is indicative of growing industry attention paid to server and application insight.
-
Avoiding Alerts Overload from Microservices: Sarah Wells at QCon London
At QCon London, Sarah Wells presented “Avoiding Alerts Overload from Microservices”, and cautioned that developers and operators must fundamentally change the way they think about monitoring when building a microservice system. Key takeaways included: build a system that can be supported; focus on ‘stuff that matters’ when creating monitoring and alerts; and cultivate and improve alerts.
-
The Future of Microservices: Functional Service Design and Observability
In preparation for the upcoming microXchg conference, running 16th and 17th February in Berlin, InfoQ sat down with Uwe Friedrichsen and Adrian Cole and discussed functional service design, the new challenges with observing a distributed system, and what the future holds for the microservice architectural style.
-
Logz.io Offers Machine Learning Based Log Analysis
Logz.io offers a hosted service that performs intelligent log analysis, using machine learning to derive insights from human interactions with log data, including discussions on tech forums and public code repositories.
-
Honeycomb - A Tool for Debugging Complex Systems
Honeycomb is a tool for observing and correlating events in distributed systems. It provides a different approach from existing tools like Zipkin in that it moves away from the single-request-tracing model to a more free-form model of collecting and querying data across layers and dimensions.
-
Chaos Monkey 2.0 Runs via Spinnaker
Netflix has recently made available the source code of Chaos Monkey 2.0. The latest iteration of the resilience tool is fully integrated with Spinnaker and event tracking systems, but SSH support has been removed.
-
Continuous Deployment at Coolblue
Continuous deployment results in a higher sense of responsibility and better quality of deployments, argues Paul de Raaij, technical pathfinder at Coolblue. Coding standards prevent your code base from becoming a mess, automated inspections are great for tedious and boring checks, and manual checks are great for checking if the logic or use of code actually makes sense.