Big Data Content on InfoQ
-
William McKnight on Data Platforms and Creating a Modern Data Architecture
William McKnight gave a keynote presentation last week at the Data Architecture Summit 2018 conference on creating a modern data architecture using different data platforms.
-
High Volume Space Exploration Time-Series Data Storage in PostgreSQL
The European Space Agency Science Data Center (ESDC) switched to PostgreSQL with the TimescaleDB extension for its data storage. ESDC’s diverse data includes structured, unstructured, and time-series metrics running to hundreds of terabytes, and it needs to be queryable across datasets with open source tools.
-
Netflix Keystone Real-Time Stream Processing Platform
Netflix recently published a post on its tech blog discussing the design considerations and insights of Keystone, its real-time stream processing platform. Keystone has been operational since December 2015 and has grown significantly as Netflix subscribers have grown from 65 million to over 130 million in the past 3 years. This article follows on the latest state of the Keystone platform...
-
California Creates Consumer Privacy Act
California has enacted the California Consumer Privacy Act (CCPA) of 2018, which, starting on January 1, 2020, would grant consumers several rights with respect to information about them that businesses collect, store, sell, and share. This is the first legislation of its kind in the United States.
-
New York Creates Task Force to Examine Automated Decision Making
New York City has created an Automated Decision Systems Task Force to demand accountability and transparency in how algorithms are used in city government. The final report of the task force is due in December 2019. This task force is the first in the United States to study this issue.
-
A Team's Transformation from Software Development to ML: Golestan Radwan at QCon NY
As companies start to add Big Data and Machine Learning initiatives to their project portfolios, they face several challenges including the teams' transition from software engineering to data engineering and machine learning. Golestan "Sally" Radwan spoke at QCon New York 2018 Conference about her experience in leading a traditional software engineering team on a machine learning/AI journey.
-
Distributed Messaging Framework Apache Pulsar 2.0 Supports Schema Registry and Topic Compaction
The latest version of the open-source distributed pub-sub messaging framework Apache Pulsar enables companies to move “beyond batch” by acting on data in motion. Streamlio recently announced the availability of the Apache Pulsar 2.0 streaming messaging solution. The new version supports Pulsar Functions, Schema Registry, and Topic Compaction.
-
eBay's Accelerator Data Processing Framework Provides Parallel Execution and Live Recommendations
eBay's Accelerator data processing framework provides parallel execution and automatic organization of source code, input data, and results. It can be used for data analysis and algorithm development, as well as for a live recommendation system.
-
PayPal's Gimel Analytics Platform Provides Unified Data API and GSQL
Romit Mehta and Deepak Chandramouli from PayPal spoke at the recent QCon.ai conference about the Gimel data analytics platform and how it can be used to commoditize data access. InfoQ spoke with Mehta and Chandramouli about the data platform and its support in the areas of security,
-
Chile’s Energy Regulator to Adopt Blockchain
PV magazine, a publication focused on reporting on photovoltaics (solar power generation), has announced that Chile's energy regulator is set to adopt blockchain in March 2018. The regulator plans to use blockchain technology to transparently record market prices, marginal costs, fuel prices, and compliance documentation.
-
Oral Arguments before Supreme Court in Microsoft Cloud Computing Case Focus on Legal Issues
On February 27, 2018, the Supreme Court of the United States heard oral arguments in the Microsoft cloud computing case. A ruling against Microsoft could require companies based in the United States to hand over to law enforcement data stored on foreign servers. U.S.-based organizations might then be unable to provide cloud computing services to foreign countries.
-
Managing and Operating Kafka Clusters in Kubernetes
Nenad Bogojevic, platform solutions architect at Amadeus, spoke at the KubeCon + CloudNativeCon North America 2017 conference on how to run and manage Kafka clusters in a Kubernetes environment. He talked about provisioning Kafka clusters and configuring them using Kubernetes custom resources or ConfigMaps.
-
Modern Big Data Pipelines over Kubernetes
Container management technologies like Kubernetes make it possible to implement modern big data pipelines. Eliran Bivas, senior big data architect at Iguazio, spoke at the recent KubeCon + CloudNativeCon North America 2017 conference about big data pipelines and how Kubernetes can help develop them.
-
TensorFlow Lite Supports On-Device Conversational Modeling
TensorFlow Lite, the lightweight version of the open source deep learning framework TensorFlow, supports on-device conversational modeling for plugging conversational intelligence features into chat applications. The TensorFlow team recently announced the release of TensorFlow Lite, which can be used on mobile and embedded devices.
-
Leslie Miley on Bias in Big Data/ML and AI - QCon San Francisco
At QCon San Francisco, Leslie Miley gave a keynote talk in which he explained how inherent bias in data sets has affected everything from the 2016 presidential race to criminal sentencing in the United States.