Monitoring Content on InfoQ
-
AWS Publishes Best Practices Guide for Operational Dashboards
AWS recently added its best practices for building dashboards for operational visibility to the Amazon Builders' Library. The document describes in detail the different types of dashboards used at Amazon and discusses the design best practices used to create them.
-
Amazon CloudWatch Dashboards Supports Sharing
AWS recently introduced the ability to share Amazon CloudWatch Dashboards with users who do not have access to the AWS account. This feature opens up new use cases for dashboards, including sharing metrics and information on big screens, or embedding real-time information in public pages.
-
Observability Strategies for Distributed Systems - Lessons Learned at InfoQ Live
A good observability strategy makes it easy for teams to share their data, and uses data from across a distributed system to identify if business goals are being achieved. These were some of the ideas discussed during the InfoQ Live roundtable discussion on observability patterns for distributed systems, held on August 25.
-
Brenda - an Artificial Intelligence Team Member
Brenda uses artificial intelligence with machine learning to monitor the infrastructure, do quality assurance checks and support troubleshooting, handle alerts and communicate critical issues, and apply auto-healing. At Swiss Testing Day 2020, Sree Rama Murthy Pakkala and Collin Mendons from Swisscom will talk about Brenda, an AI/ML framework that helps their teams increase quality.
-
How Netlify’s Infrastructure Team Improved Observability While Increasing Deployment Speed
Netlify's infrastructure team shared how they made customer deployments up to 2x faster by optimizing their deployment algorithm, and how they increased observability into their systems in the process.
-
Moogsoft Adds Virtual Network Operations Centre Capability
AIOps platform vendor Moogsoft has announced the release of Moogsoft Enterprise 8.0, featuring a capability for technology teams to build a virtual Network Operations Centre (NOC). Moogsoft Enterprise consolidates monitoring tools with the intention of helping technology teams reduce noise, prioritize incidents, reduce escalations, and ensure uptime.
-
Periskop: SoundCloud's Exception Monitoring Service
SoundCloud's engineering team wrote about their exception monitoring software called Periskop, which collects and aggregates exceptions across servers and reports to a central server for analysis.
-
Grafana Labs Announces GA of Cortex v1.0 and Discusses Architectural Changes
Grafana Labs, the company behind popular open-source monitoring projects Grafana and Loki, announced the General Availability of Cortex v1.0. Cortex is a clustered Prometheus implementation that includes features such as horizontal scalability, multi-tenancy, durability, and long-term storage.
-
NGINX Releases Controller 3.0 with Major Redesign Providing Consolidated Application View
NGINX announced the release of NGINX Controller 3.0, their control-plane solution for managing the NGINX data plane. The 3.0 release is a full redesign of Controller, moving it to an "app-centric experience" that allows users to interact with the infrastructure at the application level. It includes a full configuration API, a role-based self-service portal, and a built-in certificate manager.
-
Logz.io Survey Finds Tool Sprawl and Complex Architecture Key Challenges for Observability
Logz.io released their annual survey of the DevOps industry, with the spotlight this year on observability. The key findings are that DevOps and observability tool sprawl is becoming an issue and that complex architectures present the key challenge in implementing an observability solution. For the next year, they predict greater investment in observability with a focus on distributed tracing.
-
AWS CloudWatch Adds Observability Tool for Visualizing Distributed Applications
AWS released ServiceLens, a fully managed observability solution built within CloudWatch. ServiceLens is designed to visualize and analyze the health, performance, and availability of distributed applications. It is currently available in all commercial regions but requires AWS X-Ray.
-
Amazon Announces AWS Firelens – a New Way to Manage Container Logs
Recently, Amazon announced a new log aggregation service called AWS Firelens. The service unifies log filtering and routing across all AWS container services including Amazon ECS, Amazon EKS, and AWS Fargate.
-
Heroku's Journey to Automated Continuous Deployment
Heroku's engineering team wrote about their journey from manual deployments to automated continuous deployments for Heroku Runtime, their managed environment for applications. They achieved this using Heroku primitives and a custom deployer tool.
-
Full Stack Monitoring of JVM Applications, Using Micrometer
Clint Checketts, a core committer on the Micrometer project, recently spoke at the SpringOne Platform 2019 conference about the Micrometer monitoring and alerting framework.
-
Amazon Releases the Anomaly Detection Feature for CloudWatch to General Availability
Recently, Amazon announced the general availability of the Anomaly Detection feature in Amazon CloudWatch, a monitoring and management service providing customers data and insights from AWS, hybrid, and on-premises applications and infrastructure resources.