Big Data Content on InfoQ

Articles


AI, ML & Data Engineering

Data Leadership Book Review and Interview

The book Data Leadership, by Anthony Algmin, covers how data leaders should manage and govern the data management programs in their organizations. Data leadership is how organizations choose to apply their energy and resources toward creating data capabilities to influence their business.

Srini Penchikala, Anthony Algmin
on Jul 25, 2020

Java

Apache Arrow and Java: Lightning Speed Big Data Transfer

Apache Arrow puts forward a cross-language, cross-platform, columnar in-memory data format. It is designed to eliminate the need for data serialization and reduce the overhead of copying.

Joris Gillis
on May 23, 2020

Culture & Methods

Data Analytics in the World of Agility

Is it all about customer-centric business, or is there any data left? Can we integrate data analytics and customer empathy? This article explores how we can move towards a more customer-centric business and what information we require in order to understand the most valuable thing we have: our customer.

Almudena Rodriguez Pardo
on Sep 06, 2019

AI, ML & Data Engineering

Stream Processing Anomaly Detection Using Yurita Framework

In this article, author Guy Gerson discusses Yurita, a stream processing anomaly detection framework developed at PayPal. The framework is based on Spark Structured Streaming.

Guy Gerson
on Jul 10, 2019

AI, ML & Data Engineering

Real-Time Data Processing Using Redis Streams and Apache Spark Structured Streaming

Structured Streaming, introduced with Apache Spark 2.0, delivers a SQL-like interface for streaming data. Redis Streams enables Redis to consume, hold, and distribute streaming data between multiple producers and consumers. In this article, author Roshan Kumar walks through how to process streaming data in real time using Redis Streams and Apache Spark Structured Streaming.

Roshan Kumar
on May 13, 2019

AI, ML & Data Engineering

Conquering the Challenges of Data Preparation for Predictive Maintenance

Predictive maintenance (PdM) applications aim to apply machine learning (ML) to Industrial Internet of Things (IIoT) datasets in order to reduce occupational hazards, machine downtime, and other costs. In this article, the author addresses some of the data preparation challenges faced by industrial practitioners of ML, along with solutions for data ingest and feature engineering related to PdM.

Ian Downard
on Jan 04, 2019

AI, ML & Data Engineering

Analytics Zoo: Unified Analytics + AI Platform for Distributed Tensorflow, and BigDL on Apache Spark

This article describes how Analytics Zoo can help real-world users build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark, easy-to-use abstractions such as transfer learning and Spark ML pipeline support, and built-in deep learning models and reference use cases.

Jason Dai
on Dec 11, 2018

AI, ML & Data Engineering

Sentiment Analysis: What's with the Tone?

Sentiment analysis is widely applied in voice of the customer (VOC) applications. In this article, the authors discuss NLP-based sentiment analysis using both machine learning (ML) and lexicon-based approaches with the KNIME data analysis tools.

Rosaria Silipo, Kathrin Melcher
on Nov 27, 2018

AI, ML & Data Engineering

Spark Application Performance Monitoring Using Uber JVM Profiler, InfluxDB and Grafana

In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark-based applications using Uber JVM Profiler, the InfluxDB database, and the Grafana data visualization tool.

Amit Baghel
on Nov 18, 2018

AI, ML & Data Engineering

Natural Language Processing with Java - Second Edition: Book Review and Interview

The book Natural Language Processing with Java - Second Edition covers natural language processing (NLP) and the various tools developers can use in their applications. Technologies discussed in the book include Apache OpenNLP and Stanford NLP. InfoQ spoke with co-author Richard Reese about the book and how NLP can be used in enterprise applications.

Srini Penchikala
on Oct 10, 2018

AI, ML & Data Engineering

Democratizing Stream Processing with Apache Kafka® and KSQL - Part 2

In this article, author Robin Moffatt shows how to use Apache Kafka and KSQL to build data integration and processing applications with the help of an e-commerce sample application. Three use cases are discussed: customer operations, operational dashboard, and ad-hoc analytics.

Robin Moffatt
on Sep 07, 2018

AI, ML & Data Engineering

How to Choose a Stream Processor for Your App

Choosing a stream processor for your app can be challenging given the many options available, and the best choice depends on the individual use case. In this article, the authors discuss a stream processor reference architecture, key features required by most streaming applications, and optional features that can be selected based on specific use cases.

Miyuru Dayarathna, Srinath Perera
on Aug 21, 2018
