Database Content on InfoQ
-
Uber’s Hive Federation Decentralizes 16K Datasets and 10+ PB for Zero-Downtime Analytics at Scale
Uber has decentralized its Hive data warehouse, migrating 16,000 datasets totaling over 10 petabytes using pointer-based federation. The migration ensures zero downtime, strict ACL enforcement, improved governance, and scalable, domain-specific datasets for analytics and machine learning workloads.
-
Cloudflare and ETH Zurich Outline Approaches for AI-Driven Cache Optimization
Cloudflare and ETH Zurich highlight how AI-driven crawler traffic challenges traditional caching in CDNs and databases. They propose AI-aware strategies including separate cache tiers, adaptive algorithms, and pay-per-crawl models to balance performance for human users and AI services while maintaining cache efficiency and system stability.
-
TigerFS Mounts PostgreSQL Databases as a Filesystem for Developers and AI Agents
TigerFS is a new experimental filesystem that mounts a database as a directory and stores files directly in PostgreSQL. The open source project exposes database data through a standard filesystem interface, allowing developers and AI agents to interact with it using common Unix tools such as ls, cat, find, and grep, rather than via APIs or SDKs.
-
ProxySQL Introduces Multi-Tier Release Strategy with Stable, Innovative, and AI Tracks
ProxySQL 3.0.6 was recently released, along with a new multi-tier release strategy. The Stable Tier focuses on reliability and production use, the Innovative Tier introduces newer features earlier, and the AI/MCP Tier explores future capabilities, including AI integrations.
-
Uber Launches IngestionNext: Streaming-First Data Lake Cuts Latency and Compute by 25%
Uber launches IngestionNext, a streaming-first data lake ingestion platform that reduces data latency from hours to minutes and cuts compute usage by 25%. Built on Kafka, Flink, and Apache Hudi, it supports thousands of datasets, enabling faster analytics, experimentation, and machine learning workloads globally.
-
AWS Expands Aurora DSQL with Playground, New Tool Integrations, and Driver Connectors
Amazon has announced several updates for Aurora DSQL, focusing on usability, integrations, and developer tooling. The improvements include a new interactive Aurora DSQL Playground that lets developers explore and experiment with the database directly in the browser, without registration or associated costs.
-
QCon London 2026: How to Run on Three Clouds at Once, and When Not to
Form3 runs UK bank payments across three clouds simultaneously. At QCon London, their engineers explained how they built their custom Kubernetes operators, cross-cloud DNS tricks, and distributed databases, and what happened when they tried to sell them in America. Spoiler: US customers wanted East/West failover, not triple-active multi-cloud.
-
Pinterest’s CDC-Powered Ingestion Slashes Database Latency from 24 Hours to 15 Minutes
Pinterest launched a next-generation CDC-based database ingestion framework using Kafka, Flink, Spark, and Iceberg. The system reduces data availability latency from 24+ hours to 15 minutes, processes only changed records, supports incremental updates and deletions, and scales to petabyte-level data across thousands of pipelines, optimizing cost and efficiency.
-
Databricks Introduces Lakebase, a PostgreSQL Database for AI Workloads
Databricks has recently announced the general availability of Lakebase, a serverless, PostgreSQL-based OLTP database that scales compute and storage independently. Lakebase is designed to integrate with the Databricks platform, providing a hybrid solution that combines transactional and analytical capabilities.
-
AWS Enables Lambda Function Triggers from RDS for SQL Server Database Events
In a blog post, AWS recently described an event-driven pattern for Amazon RDS for SQL Server, allowing developers to trigger Lambda functions in response to database events via CloudWatch Logs and SQS.
-
Firestore Adds Pipeline Operations with over 100 New Query Features
Google has overhauled Firestore’s query engine, introducing "Pipeline operations" that enable complex server-side aggregations and array unnesting. The update shifts Firestore Enterprise toward an optional indexing model, allowing architects to prioritize write speed and lower costs. While it brings parity with MongoDB-style aggregations, the preview currently lacks real-time and emulator support.
-
Google Introduces Managed Connection Pooling for AlloyDB
Google Cloud has launched managed connection pooling for AlloyDB for PostgreSQL, boosting client connections by 3x and transactional throughput by up to 5x. The feature simplifies operations by automating connection management and reducing latency.
-
Expired Oracle Patent Opens Fast Sorting Algorithm to Open Source Databases
A recent article reports that an Oracle patent on a fast sorting method has expired, allowing open source databases to use it freely. Mark Callaghan, the inventor behind the sorting algorithm, shows how this 20-year-old approach can speed up sorting similar data and could make database systems faster and more efficient.
-
Cloudflare Introduces Aggregations in R2 SQL for Data Analytics
Cloudflare recently announced support for aggregations in R2 SQL, a new feature that lets developers run SQL queries on data stored in R2. This enhancement expands R2 SQL beyond basic filtering and makes it more useful for analytical workloads without requiring separate data warehouse tools.
-
LangGrant Unveils LEDGE MCP Server to Enable Agentic AI on Enterprise Databases
LangGrant has launched the LEDGE MCP Server, a new enterprise platform designed to let large language models reason across complex database environments without directly accessing or exposing underlying data.