InfoQ Homepage DevOps Content on InfoQ
-
How OpenAI Built a Secure Windows Sandbox for Codex Agents
OpenAI details Codex Windows sandbox architecture, showing how SIDs, ACLs, restricted tokens, and dedicated sandbox accounts enable safe execution of autonomous coding tasks. The design balances isolation with real developer workflows and shows how OS security primitives must be composed for AI agents on local development environments.
-
Dropbox Introduces Nova, an Internal Platform for Running AI Coding Agents at Scale
Dropbox has unveiled Nova, an internal platform designed to orchestrate and operationalize AI coding agents across the company's engineering workflows.
-
How Netflix Maps Thousands of Microservices in Real-Time
Netflix has shared details about Service Topology. This internal system creates and updates a live dependency graph for thousands of microservices. It helps engineers see how services connect and resolve issues more quickly. The system merges three separate data sources into a single, queryable graph. It updates almost in real-time as traffic patterns shift.
-
AWS Replaces Fat-Tree Data Center Networks with Random Graph Theory, Cutting Routers by 69%
AWS disclosed that Resilient Network Graphs, a flat network architecture based on quasi-random graph theory, is now the default for most new data center builds. The design replaces fat-tree hierarchies with direct ToR-to-ToR mesh connections using passive optical ShuffleBoxes, cutting routers by 69%, boosting throughput by 33%, and reducing network power consumption by 40%.
-
Inside Google’s System for Coordinated A/B Testing across its Global Service Fleet
Google has shared details of its fleet wide large scale A/B experimentation system designed to standardize experiment assignment, exposure logging, and configuration propagation across distributed services. The approach enables consistent measurement across products, reduces experiment conflicts, and improves reliability of data driven decision making at scale.
-
OpenTelemetry Launches “Blueprints” Initiative to Simplify Enterprise Observability Adoption
OpenTelemetry has introduced a new "Blueprints" initiative aimed at reducing the growing complexity of deploying and operating observability systems at scale.
-
BadHost Vulnerability Exposes AI Agents, Evaluators, and LLM Gateways
BadHost is a high-severity authentication bypass vulnerability in the widely used Python web framework Starlette, with 325 million weekly downloads. The flaw allows attackers to use malformed HTTP Host headers to bypass path-based access controls and access sensitive AI agent infrastructure, among other systems.
-
A Trailing Slash Bypassed AWS API Gateway Authorization
A security researcher found that adding a trailing slash to AWS HTTP API paths bypassed Lambda authorizer authentication entirely, enabling unauthenticated wire transfers at a fintech. The root cause is a path normalization mismatch between HTTP API's greedy route matching and its authorization layer. The same vulnerability class appeared in gRPC-Go via CVE-2026-33186.
-
Arm Open-Sources Metis, an AI Security Framework Outperforming Traditional SAST Tools
Arm has open-sourced Metis, an agentic AI security framework designed to autonomously uncover complex software vulnerabilities. Unlike traditional pattern-based tools, Metis applies semantic reasoning to analyze cross-component dependencies and provides clear, natural language explanations for its findings.
-
Google Cloud Suspends Railway's Production Account, Causing Eight-Hour Platform-Wide Outage
Google Cloud's automated systems suspended Railway's production account without notice, triggering an eight-hour platform-wide outage affecting 3 million users. The cascade took down workloads across all providers including AWS and bare metal because Railway's control plane was hosted on GCP. Railway is demoting GCP to backup-only status.
-
AI-Assisted Migration Tool Helps Teams Move from ingress-nginx to Higress in Minutes
The Cloud Native Computing Foundation has highlighted a new AI-assisted migration approach that enabled engineers to migrate 60 ingress-nginx resources to Higress in roughly 30 minutes, demonstrating how artificial intelligence is increasingly being applied to modernize Kubernetes networking and gateway infrastructure.
-
GitHub Slashes Agent Workflow Token Spend up to 62% with Daily Audits and MCP Pruning
GitHub reports cutting token costs in agentic CI workflows by up to 62% by pruning unused MCP tools, swapping some MCP calls for gh CLI, and running daily “auditor” and “optimizer” agents. A token-usage.jsonl artefact and an Effective Tokens metric help track spend across models and spot regressions.
-
Microsoft Announces Azure Linux 4.0, Its First General-Purpose Server Linux Distribution
Microsoft announced Azure Linux 4.0 and Azure Container Linux at Open Source Summit. Azure Linux 4.0 is a Fedora-based general-purpose server distribution for Azure VMs, the first time Microsoft has offered a supported Linux beyond container hosting. Azure Container Linux is an immutable container-optimized host built on Flatcar.
-
Azure Logic Apps Adds Sandboxed Code Interpreters to Agent Workflows
Microsoft added sandboxed code interpreters to Azure Logic Apps, enabling agents within integration workflows to generate and execute Python, JavaScript, C#, and PowerShell in Hyper-V isolated sessions. Architects get full control over model selection per workflow. The capability positions Logic Apps as an agent platform for integration alongside Foundry and Copilot Studio.
-
Platform Engineering Labs Expands formae with Kubernetes Support, Native Helm Integration
Platform Engineering Labs has announced a major update to its open-source Infrastructure-as-Code platform, formae, introducing full Kubernetes support, native Helm integration, direct .tfvars compatibility, and a new public plugin hub aimed at simplifying cloud-native infrastructure management