Stream Processing and Lambda Architecture Challenges

by Alexandre Rodrigues on  Oct 19, 2016

Lambda architecture has been a popular solution that combines batch and stream processing. Kartik Paramasivam at LinkedIn wrote about how his team addressed stream processing and Lambda architecture challenges using Apache Samza for data processing. The challenges described are the late arrival of events and the processing of duplicated messages.
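The two challenges named above can be illustrated with a minimal Python sketch. It is not Samza code and not the approach described by LinkedIn; the function name, the window size, and the allowed-lateness value are all hypothetical, chosen only to show what deduplication and late-event handling mean in an event-time window count.

```python
from collections import defaultdict


def count_per_window(events, window_size=60, allowed_lateness=120):
    """Count unique events per event-time window (illustrative only).

    events: iterable of (event_id, event_time) tuples in arrival order.
    Duplicates (same event_id) are counted once. An event is treated as
    too late, and dropped, when the watermark has already passed the end
    of its window plus allowed_lateness.
    """
    seen_ids = set()            # ids already processed (unbounded in this sketch)
    counts = defaultdict(int)   # window start time -> number of unique events
    dropped = []                # ids of events that arrived too late
    watermark = 0               # highest event time observed so far

    for event_id, event_time in events:
        watermark = max(watermark, event_time)
        if event_id in seen_ids:
            continue  # duplicate delivery: ignore
        seen_ids.add(event_id)
        window_start = (event_time // window_size) * window_size
        if window_start + window_size + allowed_lateness <= watermark:
            dropped.append(event_id)  # window already closed for late data
            continue
        counts[window_start] += 1

    return dict(counts), dropped
```

A real stream processor would also have to bound the set of remembered ids and decide what to do with dropped events; a batch layer that later recomputes the affected windows is one option that Lambda-style designs rely on.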

Reactive Summit 2016 Conference: Reactive Microservices and Staging Data Pipelines

by Srini Penchikala on  Oct 08, 2016

Reactive microservices, data center scale operating system (DCOS), and staging reactive data pipelines were the highlighted topics at the Reactive Summit 2016 conference held this week. The InfoQ team attended the conference, and this post summarizes the first day's events.

Getting the Data Needed for Data Science

by Ben Linders on  Sep 02, 2016

Data science is about the data that you need; deciding which data to collect, create, or keep is fundamental, argues Lukas Vermeer, an experienced Data Science professional and Product Owner for Experimentation. True innovation starts with asking big questions; then it becomes apparent which data is needed to find the answers you seek.

Meson Workflow Orchestration and Scheduling Framework for Netflix Recommendations

by Srini Penchikala on  Jul 10, 2016

Netflix's goal is to predict what you want to watch before you watch it. They do this by running a number of machine learning (ML) workflows every day. Meson is a workflow orchestration and scheduling framework that manages the lifecycle of all these machine learning pipelines that build, train and validate personalization algorithms to help with the video recommendations.

Data Streaming Architecture with Apache Flink

by Srini Penchikala on  Jun 09, 2016

Jamie Grier recently spoke at the OSCON 2016 conference about data streaming architecture using Apache Flink. He talked about the building blocks of data streaming applications and stateful stream processing, with code examples of Flink applications and monitoring.

Precision Medicine Modeling Demonstration with Spark on EMR, ADAM, and the 1000 Genomes Project

by Dylan Raithel on  May 19, 2016

AWS engineers Christopher Crosbie and Ujjwal Ratan detail using Spark on EMR for precision medicine data analysis on the ADAM platform with data from the 1000 Genomes Project.

Elephant in the Cloud - Hadoop as a Service

by Srini Penchikala on  May 02, 2016

Hadoop and other big data technologies revolutionized the way organizations run data analytics, but organizations still face challenges with the operating costs of using these technologies for on-premise data processing. Ashish Thusoo recently spoke at the Enterprise Data World conference about a Hadoop-as-a-service offering that helps organizations bridge these gaps.

Google Cloud Machine Learning and TensorFlow Alpha Release

by Dylan Raithel on  Apr 18, 2016

Late last month Google released an alpha version of their TensorFlow (TF) integrated cloud machine learning service in response to a growing need to make the TensorFlow library run at scale on the Google Cloud Platform (GCP). Google describes several new feature sets for scaling TF usage by integrating several pieces of GCP, such as Dataproc, a managed Hadoop and Spark service.

Microsoft Releases Power BI Embedded Preview

by Kent Weare on  Apr 17, 2016

Recently at the 2016 Build event in San Francisco, Microsoft announced a change to their Power BI offering. The update gives customers and ISVs the ability to embed Power BI reports within their own applications. Microsoft is calling this service Power BI Embedded, and it is currently in preview.

Funnel Analysis at Twitter for Improving User Engagement

by Srini Penchikala on  Feb 25, 2016

Funnel analysis is used to analyze a sequence of events to help with user engagement on a website or a mobile application. The Data Science team at Twitter uses this concept to learn how users interact with the user interface during sign-up or tweeting, in order to improve user engagement with Twitter.
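As a rough illustration of the idea, the following Python sketch counts how many users reach each step of an ordered funnel. It is a generic example, not Twitter's implementation; the function name, the input shape, and the step names in the usage note are hypothetical.

```python
def funnel_counts(user_events, steps):
    """Return how many users reached each funnel step, in step order.

    user_events: dict mapping a user id to that user's event names,
                 listed in the order they happened.
    steps: ordered list of funnel step names.
    A user is counted at step k only after completing steps 0..k-1 in order.
    """
    reached = [0] * len(steps)
    for events in user_events.values():
        next_step = 0  # index of the step this user must complete next
        for event in events:
            if next_step < len(steps) and event == steps[next_step]:
                reached[next_step] += 1
                next_step += 1
    return reached
```

For example, with the steps `["open_signup", "enter_email", "confirm"]`, a user whose events are `["enter_email", "confirm"]` is not counted at any step, because the first step never occurred. Dividing each count by the previous one gives the step-to-step conversion rate, which shows where users drop out.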

IBM Extends its Cloud Data Analytics Services

by Abel Avram on  Feb 06, 2016

IBM has announced four new data services: Analytics Exchange, Compose Enterprise, Graph, and Predictive Analytics. IBM’s new data services are meant to enable users to analyze their own data or get access to datasets provided by IBM. While some of the services run on Bluemix, for others the data can be deployed on other clouds, including private ones.

How Airbnb Uses Net Promoter Score to Predict Guest Rebooking

by Srini Penchikala on  Feb 02, 2016

Net Promoter Score (NPS) is a customer loyalty metric used to determine the likelihood that a customer will return to a company's website or use their service again. Airbnb uses NPS extensively to measure customer loyalty, treating it as a more effective way to determine the likelihood that a customer will return to book again or recommend the company to their friends.
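The summary does not spell out how the score is computed. The sketch below assumes the commonly used NPS convention, in which respondents rate on a 0-10 scale, ratings of 9-10 count as promoters and 0-6 as detractors; it says nothing about how Airbnb processes its own survey data, and the function name is hypothetical.

```python
def net_promoter_score(ratings):
    """Compute NPS from a list of 0-10 survey ratings.

    Assumed convention: promoters rate 9-10, passives 7-8, detractors 0-6.
    NPS = percentage of promoters minus percentage of detractors,
    so the result ranges from -100 to 100.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)
```

With the ratings `[10, 9, 8, 7, 6, 3, 10, 9, 5, 10]` there are five promoters and three detractors out of ten responses, giving a score of 20.0.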

Yahoo! Benchmarks Apache Flink, Spark and Storm

by Abel Avram on  Dec 23, 2015

Yahoo! has benchmarked three of the main stream processing frameworks: Apache Flink, Spark and Storm.

Data Science in F# using FsLab: Interview with Tomas Petricek

by Pierre-Luc Maheu on  Dec 23, 2015

FsLab, a collection of F# open source libraries for doing Data Science, was released earlier this year. InfoQ reached out to Tomas Petricek, creator of the project, to get more details.

IBM Brings Watson to IoT

by Abel Avram on  Dec 16, 2015

IBM has inaugurated the IoT Global Headquarters and will use the Watson technology to analyze and interpret IoT data.

Marketing and all content copyright © 2006-2016 C4Media Inc. hosted at Contegix, the best ISP we've ever worked with.