Data Content on InfoQ
-
Microsoft Announces New Azure Analytics Services ADLS, ADX and More
Microsoft has announced the general availability of two new Azure analytics services: Azure Data Lake Storage Gen2 (ADLS) and Azure Data Explorer (ADX). Microsoft also announced the preview of Azure Data Factory Mapping Data Flow.
-
Microsoft Announces the General Availability of Azure Data Box Disk
In a recent blog post, Microsoft announced the general availability of Azure Data Box Disk, an SSD-based solution for offline data transfer to Azure. Microsoft also announced the public preview of Azure Data Box Blob Storage, a feature that lets customers copy data to Blob Storage on a Data Box.
-
Google Cloud Announces Transfer Appliance in Beta for Cloud Data Migrations in the EU
Google announced that Transfer Appliance, a high-capacity server that lets customers move large amounts of data to Google Cloud Platform (GCP) quickly and securely, is available in beta in the European Union (EU). Google handles Transfer Appliance data transfers into GCP within the EU, so the data does not leave the EU.
-
Bank of America - Blockchain Data Storage Patent Released
On April 12, the United States Patent and Trademark Office (USPTO) released a patent filing from Bank of America outlining its plans for a permissioned blockchain implementation that enables personal and business data sharing. Users authorize service providers to securely access their data, with access limited to the specific records each provider has been granted.
-
Baidu Releases Huge Dataset "ApolloScape" for Autonomous Vehicle Research
Baidu, the Chinese internet giant, has released ApolloScape, a massive dataset for autonomous vehicle simulation and research. ApolloScape is an order of magnitude more complex than similar open datasets. It is part of Apollo, Baidu's vehicle simulation and hardware platform. With this release, Baidu strengthens its position in the automated driving sector.
-
Chile’s Energy Regulator to Adopt Blockchain
PV magazine, a publication covering photovoltaics (solar power generation), has reported that Chile's energy regulator is set to adopt blockchain in March 2018. The regulator plans to use blockchain technology to transparently record market prices, marginal costs, fuel prices and compliance documentation.
-
Data-Driven Thinking for Continuous Improvement
Organizations need an objective way to measure performance and tie actions back to business outcomes to improve continuously. Avvo uses a data-driven decision framework with an autonomous team model and a practice of retrospectives to help people make better decisions and proposals for continuous improvement.
-
LinkedIn Ordered to Allow Scraping of Public Profile Data
A United States federal judge has ruled that Microsoft’s LinkedIn cannot block third-party web scrapers from collecting data from publicly available profiles.
-
Netflix Introduces Hollow, a Java Library for Processing In-Memory Datasets
Netflix recently introduced Hollow, a Java library and toolset for processing in-memory datasets that aren’t characterized as “big data.” A single producer publishes datasets to which many consumers have read-only access. The communication mechanism between producer and consumer propagates dataset changes in real time.
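
The single-producer, read-only-consumer pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea and not Hollow's Java API: the `Producer` and `Consumer` classes, the `publish` and `refresh` methods, and the in-memory `deltas` store are all hypothetical names chosen for this sketch.

```python
from types import MappingProxyType


class Producer:
    """Hypothetical single producer that publishes versioned deltas."""

    def __init__(self):
        self.version = 0
        self._state = {}
        self.deltas = {}  # version -> (changed items, removed keys)

    def publish(self, new_state: dict) -> int:
        # Record only what changed since the previously published state.
        changed = {k: v for k, v in new_state.items()
                   if k not in self._state or self._state[k] != v}
        removed = [k for k in self._state if k not in new_state]
        self.version += 1
        self.deltas[self.version] = (changed, removed)
        self._state = dict(new_state)
        return self.version


class Consumer:
    """Hypothetical consumer that applies deltas and exposes a read-only view."""

    def __init__(self):
        self.version = 0
        self._state = {}

    def refresh(self, producer: Producer) -> None:
        # Apply each missing delta in order until caught up with the producer.
        while self.version < producer.version:
            changed, removed = producer.deltas[self.version + 1]
            self._state.update(changed)
            for key in removed:
                self._state.pop(key, None)
            self.version += 1

    @property
    def data(self):
        # MappingProxyType rejects writes, so callers get read-only access.
        return MappingProxyType(self._state)
```

A consumer that has fallen several versions behind catches up by replaying the intermediate deltas in order, which is one simple way to keep many readers in sync with a single writer.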
-
Facebook Builds an Efficient Neural Network Model over a Billion Words
Sequence prediction with neural networks is a well-known computer science problem with a vast array of applications in speech recognition, machine translation, language modeling and other fields. Facebook AI Research scientists designed adaptive softmax, an approximation algorithm tailored for GPUs that can be used to efficiently train neural networks over vocabularies of a billion words and beyond.
-
Cloudera Announces Partnership with the Broad Institute
Cloudera announced its partnership with the Broad Institute of MIT and Harvard and detailed some of its experience with the Genome Analysis Toolkit pipeline.
-
Yahoo! Benchmarks Apache Flink, Spark and Storm
Yahoo! has benchmarked three of the main stream processing frameworks: Apache Flink, Spark and Storm.
-
UI Design: Go Out and Get Data
Chris Atherton gave the closing keynote at the GOTO Berlin 2015 conference, in which she talked about designing software. She suggests that, instead of relying on professional opinions on how software should look or work, it can be better to go out and get data from real users. InfoQ interviewed her about designing and testing user interfaces.
-
Samsung SAMI – a D3 Platform for the IoT
Samsung SAMI is a Data-driven Development (D3) platform for receiving, storing and sending data to and from IoT devices. Any device can send data in various formats, which is then normalized into JSON and stored in the cloud. Other devices can then request that data.
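
As a rough illustration of the normalization step, the sketch below converts payloads in a few input formats into a single JSON representation. The function name `normalize_payload`, the format labels and the sample field names are hypothetical and are not part of the SAMI API.

```python
import json
from urllib.parse import parse_qsl


def normalize_payload(raw: str, fmt: str) -> str:
    """Convert a device payload into a canonical JSON string.

    The format labels ("json", "csv", "form") are assumptions made for
    this sketch; they do not come from SAMI itself.
    """
    if fmt == "json":
        data = json.loads(raw)
    elif fmt == "csv":
        # Expect exactly two lines: a header row and one row of values.
        header, values = raw.strip().splitlines()
        data = dict(zip(header.split(","), values.split(",")))
    elif fmt == "form":
        # URL-encoded key=value pairs, e.g. "temp=21.5&unit=C".
        data = dict(parse_qsl(raw))
    else:
        raise ValueError(f"unsupported format: {fmt}")
    # Sorted keys give every input format the same output shape.
    return json.dumps(data, sort_keys=True)
```

Whatever format a device sends, downstream readers only ever see one shape of record.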
-
Data Quality at Prezi
For an organization to be data-driven, it's not enough to just dump mountains of data. That data needs to be accurate and meaningful. Julianna Göbölös-Szabó, data engineer at Prezi, shared how the company improved the quality of its log data. The solution involved moving from unstructured to structured data with a lightweight, contract-based approach to nudge all teams in the right direction.
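
One way to picture a lightweight, contract-based approach to structured logs is a per-event contract that lists required fields and their types, checked before an event is written. The sketch below is an assumption about what such a check could look like, not Prezi's implementation; `PAGE_VIEW_CONTRACT`, its fields and the `violations` helper are hypothetical.

```python
from typing import Any

# Hypothetical contract: each field name maps to its required type.
PAGE_VIEW_CONTRACT = {"user_id": int, "page": str, "duration_ms": int}


def violations(event: dict[str, Any], contract: dict[str, type]) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    problems = []
    for field, expected in contract.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}"
            )
    return problems
```

Reporting violations instead of raising on the first one lets a team see everything wrong with an event at once, which fits the "nudge" framing better than a hard failure.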