Monitoring Content on InfoQ
-
How ING Bank Does SRE
Janna Brummel and Robin van Zijll, from ING Netherlands, talked at the Velocity conference in London about how poor availability of the bank's internet banking systems prompted it to adopt an SRE culture. A centralized SRE team was set up in the Netherlands to provide tooling, consulting and education on reliability to product teams (known internally as BizDevOps squads).
-
Monitoring Microservices - A Prediction for 2018
The monitoring and distributed tracing of microservices have been a recognised challenge for a number of years. Péter Márton, CTO of RisingStack, recently wrote an article describing experiences with various approaches, including the OpenTracing initiative, in which he offers recommendations and example code and makes a prediction or two about the future.
-
Observability and the Monitoring of Cloud-Native Applications
Cindy Sridharan summarizes her thoughts on observability and its relevance in monitoring cloud native applications in her recent article. Observability is a philosophy that encompasses monitoring, log aggregation, metrics and distributed tracing to gain deeper, ad-hoc insights into a system.
-
CNCF Adds Security, Service Mesh and Tracing Projects: Docker Notary, Lyft Envoy and Uber Jaeger
The Cloud Native Computing Foundation (CNCF) has announced the addition of four new hosted projects over the past month: Docker’s Notary, The Update Framework (TUF), Lyft’s Envoy, and Uber’s Jaeger.
-
Monitoring Cloudflare's Global Network Using Prometheus
Matt Bostock’s SREcon 2017 Europe talk covers how Prometheus, a metrics-based monitoring tool, is used to monitor the globally distributed infrastructure and network of Cloudflare, a CDN, DNS and DDoS mitigation provider.
-
Amazon CloudWatch Dashboards Gains API and CloudFormation Support
Amazon Web Services (AWS) recently added programmatic creation and manipulation of CloudWatch dashboards and widgets to support use cases such as dynamic resource lifecycle tracking and consistent cross-account dashboard maintenance.
-
Amazon CloudWatch Events Gains Cross-Account Event Delivery
Amazon Web Services (AWS) recently added cross-account event delivery to Amazon CloudWatch Events to support use cases such as the tracking of events across an entire organization and the handling of events in separate accounts to implement advanced security schemes.
-
Why the JVM is a Good Choice for Serverless Computing: John Chapin Discusses AWS Lambda at QCon NY
At QCon New York, John Chapin presented “Fearless AWS Lambdas”, in which he not only argued that the JVM is a good platform on which to deploy serverless code, but also provided guidance on extracting the best performance from Java-based AWS Lambda functions.
-
AWS Lambda Support Added to AWS X-Ray Distributed Tracing Service
Following the General Availability (GA) release of the AWS X-Ray distributed tracing service in April, Amazon has added AWS Lambda support for AWS X-Ray, enabling function invocations and associated metadata to be recorded, displayed graphically via the AWS Console, and analysed for debugging or fault resolution purposes.
-
Metrics Collection and Monitoring at Robinhood Engineering
The Robinhood server operations team published a series of articles describing their metrics collection, monitoring and alerting infrastructure. OpenTSDB, Grafana, Kafka and Riemann form the core of the stack, with Kafka acting as a proxy layer from which the data is pushed into Riemann for stream processing of the metrics and into OpenTSDB for storage.
-
Weaveworks Adds Release Automation and Incident Management to Weave Cloud Continuous Delivery SaaS
Weaveworks has released new features for its Weave Cloud SaaS platform, which aims to simplify deployment, monitoring and management of containers and microservices. The new features include: incident management with historical audit, instant query, and customisable analytics and dashboards; release automation and point-in-time rollback for continuous delivery pipelines; and advanced Kubernetes troubleshooting.
-
DigitalOcean Adds Monitoring and Alerting Features
Cloud infrastructure provider DigitalOcean recently released capabilities for monitoring servers and sending alerts. While not novel, this free feature is indicative of growing industry attention paid to server and application insight.
-
Avoiding Alerts Overload from Microservices: Sarah Wells at QCon London
At QCon London, Sarah Wells presented “Avoiding Alerts Overload from Microservices”, and cautioned that developers and operators must fundamentally change the way they think about monitoring when building a microservice system. Key takeaways included: build a system that can be supported; focus on ‘stuff that matters’ when creating monitoring and alerts; and cultivate and improve alerts.
-
The Future of Microservices: Functional Service Design and Observability
In preparation for the upcoming microXchg conference, running 16th and 17th February in Berlin, InfoQ sat down with Uwe Friedrichsen and Adrian Cole and discussed functional service design, the new challenges with observing a distributed system, and what the future holds for the microservice architectural style.
-
Logz.io Offers Machine Learning Based Log Analysis
Logz.io offers a hosted service that performs intelligent log analysis, using machine learning to derive insights from human interactions with log data, including discussions on tech forums and in public code repositories.