Data Content on InfoQ

News

Emerging Technologies

Evolutionary Data through Schemaboi: Achieving Forward, Backwards, and Sideways Compatibility

Drawing from the enduring adaptability of HTML and HTTP, Seph Gentle proposes embedding self-contained schemas directly into file headers, ensuring data remains readable without external definitions. His experimental format prioritises forward, backwards, and sideways compatibility, enabling data format evolution without central coordination or data loss.

Olimpiu Pop
on Jul 14, 2026
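
The idea of a schema embedded in the file header can be illustrated with a minimal sketch. This is not Schemaboi's actual format: the length-prefixed JSON header, the `fields` key, and the `write_file` and `read_file` helpers are all assumptions made for illustration. It only shows how a reader can decode records using nothing but the schema carried in the data itself.

```python
import io
import json
import struct


def write_file(schema: dict, records: list[dict]) -> bytes:
    """Serialize records behind a length-prefixed JSON schema header (hypothetical layout)."""
    header = json.dumps(schema).encode("utf-8")
    buf = io.BytesIO()
    buf.write(struct.pack(">I", len(header)))  # 4-byte big-endian header length
    buf.write(header)
    for rec in records:
        # Field order comes from the schema, so rows carry no field names.
        values = [rec.get(field["name"]) for field in schema["fields"]]
        row = json.dumps(values).encode("utf-8")
        buf.write(struct.pack(">I", len(row)))
        buf.write(row)
    return buf.getvalue()


def read_file(data: bytes) -> list[dict]:
    """Decode records using only the schema embedded in the data."""
    buf = io.BytesIO(data)
    (header_len,) = struct.unpack(">I", buf.read(4))
    schema = json.loads(buf.read(header_len))
    names = [field["name"] for field in schema["fields"]]
    records = []
    while prefix := buf.read(4):
        (row_len,) = struct.unpack(">I", prefix)
        values = json.loads(buf.read(row_len))
        records.append(dict(zip(names, values)))
    return records
```

Because the reader never consults an external definition, a file written with extra or missing fields still decodes; fields absent from a record simply come back as `None`.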
Cloud

Firestore Adds Pipeline Operations with over 100 New Query Features

Google has overhauled Firestore’s query engine, introducing "Pipeline operations" that enable complex server-side aggregations and array unnesting. The update shifts Firestore Enterprise toward an optional indexing model, allowing architects to prioritize write speed and lower costs. While it brings parity with MongoDB-style aggregations, the preview currently lacks real-time and emulator support.

Steef-Jan Wiggers
on Feb 14, 2026
AI, ML & Data Engineering

Pandas 3.0 Introduces Default String Dtype and Copy-on-Write Semantics

The pandas team has released pandas 3.0.0, a major update that changes core behaviors around string handling, memory semantics, and datetime resolution, while removing a substantial amount of deprecated functionality.

Robert Krzaczyński
on Feb 11, 2026
Culture & Methods

European Initiative for Data Sovereignty Released a Trust Framework

The Danube release of the Gaia-X trust framework provides mechanisms for the automation of compliance and supports interoperability across sectors and geographies to ensure trusted data transactions and service interactions. The Gaia-X Summit 2025 facilitated discussions on AI and data sovereignty, and presented data space solutions that support innovation across Europe and beyond.

Ben Linders
on Jan 22, 2026
DevOps

LangGrant Unveils LEDGE MCP Server to Enable Agentic AI on Enterprise Databases

LangGrant has launched the LEDGE MCP Server, a new enterprise platform designed to let large language models reason across complex database environments without directly accessing or exposing underlying data.

Craig Risi
on Jan 13, 2026
DevOps

Nexla Launches Express: a Conversational Platform for AI Data Engineering

Nexla recently introduced Express, a conversational data engineering platform designed to dramatically lower the barrier for building data pipelines for AI applications.

Craig Risi
on Nov 22, 2025
Development

Meta Open Sources OpenZL: a Universal Compression Framework for Structured Data

Meta’s OpenZL changes the way data is compressed by maximizing efficiency for structured datasets, outperforming traditional methods like Zstandard. With a universal decompressor and custom compression plans, it simplifies operational deployment while achieving superior compression ratios and speeds, making it an essential tool for modern data infrastructures.

Steef-Jan Wiggers
on Oct 28, 2025
AI, ML & Data Engineering

Vercel Introduces Drains for Unified Data Export

Vercel has released Vercel Drains, a system for exporting observability data from its platform into external services. The feature unifies logs, distributed traces, web analytics events, and performance metrics into a single streaming mechanism.

Daniel Dominguez
on Oct 04, 2025
AI, ML & Data Engineering

Hugging Face Introduces AI Sheets, a No-Code Tool for Dataset Transformation

Hugging Face has released AI Sheets, an open-source application designed to let users build, transform, and enrich datasets using AI models through a spreadsheet-like interface. The tool, available both on the Hub and for local deployment, allows users to experiment with thousands of open models, including OpenAI’s gpt-oss, without requiring code.

Robert Krzaczyński
on Sep 08, 2025
Web Development

TanStack DB Enters Beta with Reactive Queries, Optimistic Mutations, and Local-First Sync

TanStack DB, an embedded client-side database for frontend applications, has entered beta. With features like reactive queries, typed collections, and optimistic mutations, it aims to simplify state management and keep UI updates fast. The open-source project integrates with existing TanStack Query applications.

Daniel Curtis
on Aug 30, 2025
AI, ML & Data Engineering

Google Launched LangExtract, a Python Library for Structured Data Extraction from Unstructured Text

Google has introduced LangExtract, an open-source Python library designed to help developers extract structured information from unstructured text using large language models such as the Gemini models.

Daniel Dominguez
on Aug 08, 2025
AI, ML & Data Engineering

Synthetic Data Generator Simplifies Dataset Creation with Large Language Models

Hugging Face has introduced the Synthetic Data Generator, a new tool that leverages Large Language Models (LLMs) to offer a streamlined, no-code approach to creating custom datasets. The tool facilitates the creation of text classification and chat datasets through a clear and accessible process, making it usable for both non-technical users and experienced AI practitioners.

Robert Krzaczyński
on Jan 27, 2025
Culture & Methods

Setting up a Data Mesh Organization

A data mesh organization consists of producers, consumers, and the platform. According to Matthias Patzak, the mission of the platform team is to make the lives of producers and consumers simple, efficient, and stress-free. Data must be discoverable and understandable, trustworthy, and shared securely and easily across the organization.

Ben Linders
on Oct 10, 2024
Culture & Methods

Data Teams Survey: Lag in DataOps and Value Delivered

We report on Jesse Anderson's 2024 Data Teams Survey, which showed a lag in DataOps capabilities, slow LLM adoption, and a concerning decline in perceived value creation by data teams. It called out the importance of teams that span data science, engineering, and operations capabilities. We also cover Petr Janda's recent podcast on the need for more engineering rigour for parity with other teams.

Rafiq Gemmail
on Oct 09, 2024
AI, ML & Data Engineering

Anthropic Unveils Contextual Retrieval for Enhanced AI Data Handling

Anthropic has announced Contextual Retrieval, a significant advancement in AI systems' interaction with extensive knowledge bases. This technique addresses the challenge of context loss in Retrieval-Augmented Generation (RAG) systems by enriching text chunks with contextual information before embedding or indexing.

Daniel Dominguez
on Sep 25, 2024
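
The chunk-enrichment step described above can be sketched as follows. This is a simplified illustration, not Anthropic's implementation: `describe_chunk` is a hypothetical stand-in for the step where a language model would write chunk-specific context, and the word-based splitter and `ContextualChunk` type are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class ContextualChunk:
    original: str        # the raw chunk text
    contextualized: str  # context + chunk, the text that would be embedded or indexed


def describe_chunk(document_title: str, chunk: str) -> str:
    # Hypothetical stand-in: in practice a language model would generate this context.
    return f"This excerpt is from the document '{document_title}'."


def split_into_chunks(text: str, max_words: int) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def contextualize(document_title: str, text: str, max_words: int = 50) -> list[ContextualChunk]:
    """Prepend situating context to each chunk before it is embedded or indexed."""
    chunks = []
    for chunk in split_into_chunks(text, max_words):
        context = describe_chunk(document_title, chunk)
        chunks.append(ContextualChunk(chunk, f"{context} {chunk}"))
    return chunks
```

The `contextualized` field, rather than the raw chunk, is what would be passed to the embedding model or search index, so each chunk keeps a reference to the document it came from.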
