Data Management Content on InfoQ
-
Meta Open Sources OpenZL: a Universal Compression Framework for Structured Data
Meta’s OpenZL changes the way data is compressed by maximizing efficiency for structured datasets, outperforming traditional methods like Zstandard. With a universal decompressor and custom compression plans, it simplifies operational deployment while achieving superior compression ratios and speeds, making it an essential tool for modern data infrastructures.
-
Datadog Launches Monocle, a Unified Rust-Powered Real-Time Metrics Engine
Datadog has launched Monocle, a new real-time time series storage engine written in Rust. The system unifies the company’s metrics storage infrastructure, delivering higher ingestion throughput and lower query latency while reducing operational complexity. Monocle replaces several generations of storage backends, addressing concurrency challenges and scaling limits that accumulated over time.
-
Google Spanner Unifies OLTP and OLAP with Columnar Engine
Google Spanner now features a columnar engine, allowing its distributed database to handle both OLTP and OLAP workloads on a single platform. This hybrid architecture eliminates the need for separate data warehouses and ETL pipelines. The engine's columnar storage and vectorized execution accelerate analytical queries up to 200x on live data, which is especially beneficial for AI applications.
-
Microsoft Azure Storage Discovery Enters Preview with Enhanced Blob Storage Analytics
Azure Storage Discovery is a service that offers a comprehensive overview of a customer's blob storage. It surfaces insights for cost optimization and security in real time, which users can query in natural language through Azure Copilot. A single dashboard lets users analyze data trends, detect outliers, and access 18 months of historical data.
-
AWS Announces DataZone, a New Data Management Service to Govern Data
At AWS re:Invent, Amazon Web Services announced Amazon DataZone, a new data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources.
-
Developing and Evolving SaaS Infrastructures for Enterprises
SaaS companies that are focused on the enterprise market need to evolve their infrastructure to meet the security, reliability, and other IT requirements of their customers. IT admins and large customers are two important sources of requirements to drive development.
-
MLOps: Continuous Delivery of Machine Learning Systems
Developing, deploying, and keeping machine learning models productive is a complex and iterative process with many challenges. MLOps means combining the development of ML models and especially ML systems with the operation of those systems. To make MLOps work, we need to balance iterative and exploratory components from data science with more linear software engineering components.
-
AWS Announces Lower Cost Storage Classes for Amazon Elastic File System
Recently AWS announced the new Amazon Elastic File System (Amazon EFS) One Zone storage classes, which deliver the same features and benefits as the existing Amazon EFS storage classes yet reduce storage costs by 47%. With One Zone storage classes, customers can redundantly store data within a single Availability Zone (AZ).
-
DataOps and Operations-Centric Data Architecture
Eric Estabrooks from DataKitchen spoke at the Data Architecture Summit 2019 conference about how DevOps tasks should be managed for data architecture. DataOps is a collaborative data management practice and is emerging as an area of interest in the industry.
-
Bringing Intelligence to Enterprise Content Management, Google Releases Document Understanding AI
At the recent Google Cloud Next conference, Google announced a new beta machine learning service called Document Understanding AI. The service targets Enterprise Content Management (ECM) workloads by allowing customers to organize, classify, and extract key-value pairs from unstructured enterprise content using Artificial Intelligence (AI) and Machine Learning (ML).
-
Amazon Announces AWS Storage Gateway Hardware Appliance
Amazon has announced its AWS Storage Gateway hardware appliance, which provides hybrid storage between on-premises applications and AWS’ storage services. By providing a hardware appliance, Amazon offers a preconfigured solution that caches data locally while synchronizing with the cloud.
-
Is On-Premise a Better Fit for SaaS Compliance with GDPR?
The EU's GDPR has led to a debate between those who feel it is advantageous to move to an on-premise solution to best meet the requirements of the GDPR, and those who feel that achieving compliance is independent of the hosting model.
-
Microsoft Introduces New Option for Cloud Data Import
During the recent Microsoft Ignite conference, Microsoft introduced a public preview of a new option for moving large volumes of data to the cloud. Microsoft Azure Data Box provides a way to move data in a device that you can ship directly to a data center.
-
Getting the Data Needed for Data Science
Data science is about the data that you need; deciding which data to collect, create, or keep is fundamental, argues Lukas Vermeer, an experienced data science professional and Product Owner for Experimentation at Booking.com. True innovation starts with asking big questions; it then becomes apparent which data is needed to find the answers you seek.
-
Luca Olivari on Multi-Model NoSQL Database OrientDB 2.1 New Features
Multi-model NoSQL database OrientDB supports storing and managing document and graph data sets. Orient Technologies, the company behind OrientDB, announced last month the general availability of version 2.1 of the database.