Data Analytics Content on InfoQ
-
Designing for Failure in the BBC's Analytics Platform
Last week at InfoQ Live, Blanca Garcia-Gil, principal systems engineer at the BBC, gave a session on Evolving Analytics in the Data Platform. During this session, Garcia-Gil focused on how her team prepared and designed for two types of failure - "known unknowns" and "unknown unknowns."
-
Accelerating Machine Learning Lifecycle with a Feature Store
A feature store is a core part of next-generation ML platforms that empowers data scientists to accelerate the delivery of ML applications. Mike Del Balso and Geoff Sims recently spoke at the Spark AI Summit 2020 conference about feature-store-driven ML development.
-
DataOps and Operations-Centric Data Architecture
Eric Estabrooks from DataKitchen spoke at the Data Architecture Summit 2019 conference about how DevOps tasks should be managed for data architecture. DataOps is a collaborative data management practice and is emerging as an area of interest in the industry.
-
Microsoft Announces Azure Synapse for Data Warehousing and Analytics
During Microsoft's annual Ignite conference, the company announced a new analytics service called Azure Synapse. The service, which is a continuation of Azure SQL Data Warehouse, focuses on bringing enterprise data warehousing and big data analytics into a single service.
-
Unlocking Market Data, Amazon Introduces AWS Data Exchange
In a recent blog post, Amazon introduced a new market data publisher/subscriber service called AWS Data Exchange. This service is an add-on to the existing AWS Marketplace and contains more than 1000 licensable data products from more than 80 data providers. These data feeds include both free and paid offerings that span industries such as financial services, health care, weather and mapping.
-
Databricks' Unified Analytics Platform Supports AutoML Toolkit
Databricks recently announced the Unified Data Analytics Platform, including an automated machine learning tool called AutoML Toolkit. The toolkit can be used to automate various steps of the data science workflow.
-
Los Angeles CTO Roundtable about AI and Data
The recent "Leaders in Data CTO Roundtable" in Los Angeles included discussions about an artificial intelligence (AI) framework/platform for business, data in the next five years, data software stacks, and acquiring data talent.
-
Autonomous Analytics: Driving the Future of Data in Business Analytics
Autonomous data analytics will be the driver of business analytics in the future and will be seamlessly integrated into our lives. John Thuma, from Arcadia Data, spoke at the Enterprise Data World 2019 conference in Boston about self-driving analytics.
-
Microsoft Announces New Azure Analytics Services ADLS, ADX and More
Microsoft has announced the general availability of two new Azure analytics services - Azure Data Lake Storage Gen2 (ADLS) and Azure Data Explorer (ADX). Furthermore, Microsoft also announced the preview of Azure Data Factory Mapping Data Flow.
-
Uber Introduces AresDB: GPU-Powered, Open-Source, Real-Time Analytics Engine
Uber recently introduced AresDB, an open-source real-time analytics engine leveraging an unconventional power source - graphics processing units (GPUs) - for meeting the growing demands of analysis at scale and at the same time unifying, simplifying and improving Uber’s existing solutions.
-
Agile Data Modeling for NoSQL Databases
Pascal Desmarets recently spoke at the Data Architecture Summit 2018 conference about agile modeling and best practices for NoSQL databases.
-
A Team's Transformation from Software Development to ML: Golestan Radwan at QCon NY
As companies start to add Big Data and Machine Learning initiatives to their project portfolios, they face several challenges, including their teams' transition from software engineering to data engineering and machine learning. Golestan "Sally" Radwan spoke at the QCon New York 2018 conference about her experience in leading a traditional software engineering team on a machine learning/AI journey.
-
Distributed Messaging Framework Apache Pulsar 2.0 Supports Schema Registry and Topic Compaction
The latest version of the open-source distributed pub-sub messaging framework Apache Pulsar enables companies to move “beyond batch” by acting on data in motion. Streamlio recently announced the availability of the Apache Pulsar 2.0 streaming messaging solution. The new version supports Pulsar Functions, Schema Registry and Topic Compaction.
-
eBay's Accelerator Data Processing Framework Provides Parallel Execution and Live Recommendations
eBay's Accelerator data processing framework provides parallel execution and automatic organization of source code, input data, and results. It can be used for data analysis and algorithm development, as well as in a live recommendation system.
-
PayPal's Gimel Analytics Platform Provides Unified Data API and GSQL
Romit Mehta and Deepak Chandramouli from PayPal spoke at the recent QCon.ai conference about the Gimel data analytics platform and how it can be used to commoditize data access. InfoQ spoke with Mehta and Chandramouli about the data platform and its support in the areas of security,