Data Warehouse Content on InfoQ
-
The Distributed Data Mesh as a Solution to Centralized Data Monoliths
Instead of building large, centralized data platforms, corporations and data architects should create distributed data meshes.
-
Microsoft Announces Azure Synapse for Data Warehousing and Analytics
During Microsoft's annual Ignite conference the company announced a new analytics service called Azure Synapse. The service, which is a continuation of Azure SQL Data Warehouse, focuses on bringing enterprise data warehousing and big data analytics into a single service.
-
The Future of Data Engineering: Chris Riccomini at QCon San Francisco
At QCon San Francisco 2019, Chris Riccomini presented “The Future of Data Engineering”. The key takeaway of his talk is that the end goal of data engineering is a fully automated, decentralized data warehouse.
-
Databricks Open Sources Delta Lake to Make Data Lakes More Reliable
Databricks recently announced that it is open sourcing Delta Lake, its proprietary storage layer, to bring ACID transactions to Apache Spark and big data workloads. Databricks is the company founded by the creators of Apache Spark, and Delta Lake is already being used at several companies, such as McAfee and Upwork. Delta Lake addresses the heterogeneous data problem that data lakes often have...
-
William McKnight on Data Platforms and Creating a Modern Data Architecture
William McKnight gave a keynote presentation last week at the Data Architecture Summit 2018 conference on creating a modern data architecture using different data platforms.
-
Data Workflow Management Using Airbnb's Airflow
Airbnb recently open sourced Airflow, its own data workflow management framework. Airflow is used internally at Airbnb to build, monitor and adjust data pipelines. Airflow’s creator, Maxime Beauchemin, and Siddharth Anand, Agari’s Data Architect and one of the framework’s early adopters, discuss Airflow, where it can be of use, and future plans.
-
Snowflake Announces General Availability of their Cloud Data Warehouse Offering
Snowflake Computing has announced the general availability of their Snowflake Elastic Data Warehouse, a software as a service offering that provides a SQL data warehouse on top of Amazon Web Services.
-
Software Defined Data Mart In The Enterprise Using Metanautix Quest
Metanautix recently announced the newest version of its product, Quest. Quest allows enterprises to build software defined data marts that can run on virtualized servers. Designed from the ground up with security and auditability in mind, Quest can handle Big Data workloads and encapsulate them into different logical views, ready for consumption by different departments in the enterprise.
-
Implementing Agile in Data Warehouse Projects
This post talks about using an agile implementation for data warehouse projects.
-
Google unveils Mesa - Geo-Replicated Near-Realtime Scalable Data Warehouse
Google has unveiled its new data warehouse, called Mesa. Mesa is a system that scales across multiple data centers and processes petabytes of data, while being able to respond to queries in sub-second time and maintain ACID properties.
-
Teradata Offers Data Warehouse as a Service as Part of Their Cloud Strategy
Teradata has revamped its cloud offering and now offers a Data Warehouse as a Service solution. Teradata Cloud aspires to become a worthy competitor to Amazon Redshift, with a richer set of predefined libraries and a more effective way of loading data.
-
Amazon Makes Compelling Case for Hosting and Processing Your Big Data
The AWS team has announced a limited preview of Amazon Redshift, a cloud-hosted data warehouse whose cost and capabilities are poised to disrupt the industry. In addition, AWS revealed two new massive compute instance types, and a data integration tool called Data Pipeline.
-
Better Developer Experience in Version 1.5 of the Data Access Framework MetaModel
Eobjects.org's open-source Java framework MetaModel implements a unified API for the access, exploration, and query of different datastores. Eobjects.org, both a website and an open source software organization dedicated to "the development of Open Source software related to Business Intelligence and Data Warehousing", has recently published version 1.5 of MetaModel.
-
Facebook on Hadoop, Hive, HBase, and A/B Testing
The Hadoop Summit of 2010 included presentations from a number of large scale users of Hadoop and related technologies. Notably, Facebook presented a keynote and detailed information about their use of Hive for analytics. Mike Schroepfer, Facebook's VP of Engineering, delivered a keynote describing the scale of their data processing with Hadoop.
-
Mahout 0.3: Open Source Machine Learning
The need for machine-learning techniques like clustering, collaborative filtering, and categorization has steadily increased over the last decade, along with the number of solutions needing quick and efficient algorithms to transform vast amounts of raw data into relevant information. Apache Mahout 0.3 was announced in March, adding more functionality, stability and performance.