Data Analytics Content on InfoQ
-
Fighting Financial Fraud with Machine Learning at Airbnb
Airbnb, the online marketplace that matches people who rent out their homes with people who are looking for a place to stay, uses machine learning (ML) techniques to fight financial fraud. The company applies targeted friction to battle chargebacks while minimizing the impact on good guests using its online reservation system.
-
XebiaLabs Announce DevOps Intelligence Engine
XebiaLabs, the developer of the Continuous Delivery and DevOps tools XL Release and XL Deploy, has announced the availability of the first release of XL Impact, a goal-based, data-driven recommendation and decision-making tool for DevOps organisations. XebiaLabs claims that this is the first tool of its kind and that the capability is essential for organisations to prove DevOps performance improvements.
-
Confluent Releases KSQL, a Distributed Streaming SQL Engine for Apache Kafka
Confluent released KSQL, an interactive, distributed streaming SQL engine for Apache Kafka. KSQL supports stream processing operations such as aggregations, joins, windowing, and sessionization on topics in Apache Kafka. Confluent announced the open source streaming SQL engine at the recent Kafka Summit conference.
-
Microsoft Updates AI Services and Tools for Data Scientists and Developers
At the recent Ignite conference, Microsoft released several updates related to its Artificial Intelligence (AI) services and tools. These updates include the release of the Azure ML Experimentation service, Azure ML Model Management service, Azure ML Workbench and the general availability of Microsoft Cognitive Services.
-
Q&A with Andrew Brust of Datameer Regarding Big Data's Role in AI
Rags Srinivas talks to Datameer's Andrew Brust about the larger role of Big Data in AI and how it's operationalized with SmartAI.
-
Microsoft Updates Azure IoT Platform: Adds Connectivity, Time Series Insights and Edge Analytics
Microsoft recently made several announcements regarding its Internet of Things (IoT) capabilities within Azure. Microsoft’s news includes a new service called Azure Time Series Insights, additional connectivity platform support for OPC UA/DA, and Azure Stream Analytics support on edge devices. Microsoft also announced a new SaaS-based IoT solution called Azure IoT Central.
-
Data Preparation Pipelines: Strategy, Options and Tools
Data preparation is an important aspect of data processing and analytics use cases. Business analysts and data scientists spend about 80% of their time gathering and preparing data rather than analyzing it or developing machine learning models. Kelly Stirman spoke last week at the Enterprise Data World 2017 Conference about data preparation best practices.
-
Julien Le Dem on the Future of Column-Oriented Data Processing with Apache Arrow
Julien Le Dem, the PMC chair of the Apache Arrow project, presented at Data Eng Conf NY on the future of column-oriented data processing. Apache Arrow is an open-source standard for columnar in-memory execution. InfoQ interviewed Le Dem to find out the differences between Arrow and Parquet.
-
Microservices and Stream Processing Architecture at Zalando Using Apache Flink
Javier Lopez and Mihail Vieru spoke at the Reactive Summit 2016 Conference about a cloud-based data integration and distribution platform used for stream processing in business intelligence use cases. Their solution is based on technologies such as Flink, Kafka and Elasticsearch.
-
Stream Processing and Lambda Architecture Challenges
Lambda architecture has been a popular solution that combines batch and stream processing. Kartik Paramasivam at LinkedIn wrote about how his team addressed stream processing and Lambda architecture challenges using Apache Samza for data processing. The challenges described are the late arrival of events and the processing of duplicated messages.
-
Reactive Summit 2016 Conference: Reactive Microservices and Staging Data Pipelines
Reactive microservices, the data center scale operating system (DCOS), and staging reactive data pipelines were the highlighted topics at the Reactive Summit 2016 Conference held this week. The InfoQ team attended the conference, and this post summarizes the first day's events.
-
Data Streaming Architecture with Apache Flink
Jamie Grier recently spoke at the OSCON 2016 Conference about data streaming architecture using Apache Flink. He talked about the building blocks of data streaming applications and stateful stream processing, with code examples of Flink applications and monitoring.
-
Elephant in the Cloud - Hadoop as a Service
Hadoop and other big data technologies revolutionized the way organizations run data analytics, but organizations still face challenges with the operating costs of using these technologies for on-premises data processing. Ashish Thusoo recently spoke at the Enterprise Data World Conference about a Hadoop-as-a-service offering that helps organizations bridge these gaps.
-
Google Cloud Machine Learning and TensorFlow Alpha Release
Late last month Google released an alpha version of its TensorFlow (TF) integrated cloud machine learning service in response to a growing need to run the TensorFlow library at scale on the Google Cloud Platform (GCP). Google describes several new feature sets aimed at making TF usage scale by integrating several pieces of GCP, such as Dataproc, a managed Hadoop and Spark service.
-
Microsoft Releases Power BI Embedded Preview
Recently, at the 2016 Build event in San Francisco, Microsoft announced a change to its Power BI offering. The update gives customers and ISVs the ability to embed Power BI reports within their own applications. Microsoft is calling this service Power BI Embedded, and it is currently in preview.