Data Warehousing Content on InfoQ

News

Architecture & Design

Instacart Creates Real-Time Item Availability Architecture with ML and Event Processing

Instacart combined machine learning with event-based processing to create an architecture that provides customers with an indication of item availability in near real-time. The new solution helped to improve user satisfaction and retention by reducing order cancellations due to out-of-stock items. The team also created a multi-model experimentation framework to help enhance model quality.

Rafal Gancarz
on Feb 08, 2024
AI, ML & Data Engineering

Next Generation of Data Movement and Processing Platform at Netflix

Netflix engineering recently described in a tech blog post how it applied data mesh architecture and principles to build its next-generation data movement and processing platform, opening up more business use cases and opportunities. Data mesh is a paradigm shift in data management that enables users to easily import and use data without transporting it to a centralized location such as a data lake.

Reza Rahimi
on Aug 29, 2022
AI, ML & Data Engineering

Uber Open-Sourced Its Highly Scalable and Reliable Shuffle as a Service for Apache Spark

Uber engineering has recently open-sourced its highly scalable and reliable shuffle as a service for Apache Spark. Spark is one of the most important tools and platforms in data engineering and analytics. By default, it shuffles data on local machines, which causes challenges at very large scale. Shuffle as a service is the solution Uber developed for this problem.

Reza Rahimi
on Aug 14, 2022
Cloud

Amazon Redshift Serverless Generally Available to Automatically Scale Data Warehouse

Amazon recently announced the general availability of Redshift Serverless, an elastic option to scale data warehouse capacity. The new service allows data analysts, developers and data scientists to run and scale analytics without provisioning and managing data warehouse clusters.

Renato Losio
on Jul 23, 2022
Cloud

Google Launches a New Cross-Platform Data Storage Engine BigLake in Preview

At the recent Cloud Data Summit, Google announced the preview of BigLake, a new data lake storage engine that makes it easier for enterprises to analyze the data in their data warehouses and data lakes.

Steef-Jan Wiggers
on Apr 19, 2022
Cloud

AWS Announces the Public Preview of AWS Data Exchange for Amazon Redshift

Recently AWS announced the public preview of AWS Data Exchange for Amazon Redshift. This new feature enables customers to find and subscribe to third-party data in AWS Data Exchange to query in an Amazon Redshift data warehouse.

Steef-Jan Wiggers
on Oct 27, 2021
Cloud

Amazon Redshift Data Sharing Now Generally Available

Amazon has recently announced the general availability of the Amazon Redshift Data Sharing functionality to share live data across Amazon Redshift clusters. This allows the use of a single data warehouse cluster for multi-cluster deployments and sharing data instantly without the need to copy or move it.

Renato Losio
on Mar 20, 2021
AI, ML & Data Engineering

The Future of Data Engineering: Chris Riccomini at QCon San Francisco

At QCon San Francisco 2019, Chris Riccomini presented “The Future of Data Engineering”. The key takeaway of his talk is that the end goal of data engineering is a fully automated, decentralized data warehouse.

Steef-Jan Wiggers
on Nov 18, 2019
AI, ML & Data Engineering

Databricks Open Sources Delta Lake to Make Data Lakes More Reliable

Databricks recently announced that it is open-sourcing Delta Lake, its proprietary storage layer, to bring ACID transactions to Apache Spark and big data workloads. Databricks is the company founded by the creators of Apache Spark, and Delta Lake is already being used at several companies such as McAfee and Upwork. Delta Lake addresses the heterogeneous data problem that data lakes often have...

Alex Giamas
on May 20, 2019
AI, ML & Data Engineering

Data Workflow Management Using Airbnb's Airflow

Airbnb recently open-sourced Airflow, its own data workflow management framework. Airflow is being used internally at Airbnb to build, monitor and adjust data pipelines. Airflow’s creator, Maxime Beauchemin, and Siddharth Anand, Agari’s Data Architect and one of the framework’s early adopters, discuss Airflow, where it can be of use, and future plans.

Alex Giamas
on Sep 08, 2015

Software Defined Data Mart In The Enterprise Using Metanautix Quest

Metanautix recently announced the newest version of its product, Quest. Quest allows enterprises to build software-defined data marts that can run on virtualized servers. Designed from the ground up with security and auditability in mind, Quest can handle Big Data workloads and encapsulate them in different logical views, ready for consumption by different departments in the enterprise.

Alex Giamas
on Jun 29, 2015

Thoughtworks Technology Radar March 2012

ThoughtWorks recently published the latest update to its Technology Radar, a report produced to help technology decision makers understand emerging trends in software development techniques, tools, languages and platforms. It includes several observations of interest to Agile software development teams.

Craig Smith
on Mar 22, 2012

What’s New in SQL Server 2012 RC0

Microsoft has released SQL Server 2012 Release Candidate 0. There are many new features, including: AlwaysOn, better performance management, more reporting and visualization tools, Columnstore index, and FileTables. The product will come in 3 main editions: Standard, Business Intelligence and Enterprise.

Abel Avram
on Nov 18, 2011

Olap4j 1.0: a Java API for OLAP Servers

Business Intelligence vendor Pentaho has announced the release of olap4j 1.0, a new, common Java API for any online analytical processing (OLAP) server.

Charles Humble, Jai Hirsch
on Jun 24, 2011

Column-based Storage in SQL Server 2011

Imagine ad hoc data mining queries against a single table with 1 TB of data and 1.44 billion rows coming back in roughly a second. This is the scenario Microsoft intends to support using 32-core machines and its new column-based storage engine.

Jonathan Allen
on Mar 07, 2011
