Metrics Content on InfoQ

  • Elastic Stack 7.6 Released with Security, Performance, and Observability Improvements

    Elastic announced the release of Elastic Stack 7.6. This release contains a number of security improvements including a new SIEM detection engine and a redesigned SIEM overview dashboard page. This release also includes performance improvements to queries that are sorted by date, enhanced supervised machine learning capabilities, and support for ingesting Jaeger trace data.

  • Amazon Announces AWS Firelens – a New Way to Manage Container Logs

    Recently, Amazon announced a new log aggregation service called AWS Firelens. The service unifies log filtering and routing across all AWS container services including Amazon ECS, Amazon EKS, and AWS Fargate.

  • Managing eBay Vast Service Architecture Using Knowledge Graphs

    Knowledge graphs describe knowledge domains based on expert input, data, and machine learning algorithms. eBay is using an application/infrastructure knowledge graph to manage its vast service architecture and provide a better experience for the roughly 200M buyers visiting the site.

  • Microsoft Announces 1.0 Release of Kubernetes-Based Event-Driven Autoscaling (KEDA)

    Microsoft has announced the 1.0 version of the Kubernetes-based event-driven autoscaling (KEDA) component, an open-source project that can run in a Kubernetes cluster to provide "fine grained autoscaling (including to/from zero)" for every container. KEDA also serves as a Kubernetes Metrics Server and allows users to define autoscaling rules using a dedicated Kubernetes custom resource.

  • Scaling Graphite

    An engineering team scaled its Graphite deployment from a small cluster to one that handles millions of metrics per second. Along the way, they modified and optimized Graphite's core components: the carbon-relay, the carbon-cache, and the rendering API.

  • Q&A on Cloud Discovery Tool for Multi-Cloud Environments

    Cloud Discovery is an open-source tool from Twistlock that connects to cloud providers and builds an inventory of the infrastructure resources deployed there. It gathers resource metadata and reports it in aggregated form. The added visibility across environments also helps identify application security holes, such as resources that are missing a firewall rule.

  • Three Pillars with Zero Answers: Rethinking Observability with Ben Sigelman

    At KubeCon NA, held in Seattle, USA, in December 2018, Ben Sigelman presented “Three Pillars, Zero Answers: We Need to Rethink Observability” and argued that many organisations may need to rethink their approach to metrics, logging and distributed tracing.

  • Making Machine Learning Adoptable for Clinicians

    Dr. Alexander Scarlat explains the core tenets of machine learning in his 12-part series "Machine Learning Primer for Clinicians." Scarlat covers the defining aspects of machine learning, followed by examples that show how to measure the performance of machine learning models. The series uses animated charts in place of the math to help readers understand the machine learning concepts.

  • Grafana Adds Log Data Correlation to Time Series Metrics

    The Grafana team announced an alpha version of Loki, their logging platform that ties in with other Grafana features like metrics query and visualization. Loki adds a new client agent, promtail, and server-side components for log metadata indexing and storage.

  • Uber Open Sources Its Large Scale Metrics Platform M3

    Uber’s engineering team has open sourced M3, the metrics platform it has been using internally for some years. The platform was built to replace its Graphite-based system and provides cluster management, aggregation, collection, storage management, a distributed time series database (TSDB), and a query engine with its own query language, M3QL.

  • Enabling Continuous Delivery with a Dedicated Team

    Robin Weston describes how an external enablement team was able to introduce continuous delivery practices in an organization with high resistance to change and siloed teams. Rather than just bringing in new technology and tools, the team focused on sharing and educating teams. Practices ranged from continuous integration to following the test pyramid and reducing cycle time by identifying waste.

  • Building Observable Distributed Systems

    Today's systems are increasingly complex: microservices are distributed over the network and scale dynamically, which creates many more ways to fail, not all of which we can predict. Investing in observability gives us the ability to ask our systems questions we had never thought of before. Tools that can be used for this include metrics, tracing, and structured and correlated logging.

  • Thanos - a Scalable Prometheus with Unlimited Storage

    The Improbable engineering team open sourced Thanos, a set of components that adds high availability to Prometheus installations through cross-cluster federation, unlimited storage, and global querying across clusters.

  • Metrics Collection from Large Scale IoT Deployments at Vivint

    Vivint's engineering team built their own metrics collection platform to collect and analyze metrics from their deployed devices. The key motivation for writing their own system was to store only aggregated data and focus on its analysis, which they achieve with their Rothko project.

  • How to Measure Continuous Delivery

    Stability and throughput are the things that you can measure when adopting continuous delivery practices. These metrics can help you reduce uncertainty, make better decisions about which practices to amplify or dampen, and steer your continuous delivery adoption process in the right direction.