AI, ML & Data Engineering Content on InfoQ

Articles

AI, ML & Data Engineering

Predicting Movie Ratings: NLP Tools is What Film Studios Need

In this article, the author discusses how to use Natural Language Processing (NLP) techniques to predict movie ratings from data shared on social media platforms. Sentiment analysis of movie reviews can also be used to classify movies into genres and to improve movie recommendation systems.

Tatsiana Levdikova
on May 13, 2017
AI, ML & Data Engineering

Pascal Desmarets on NoSQL Data Modeling Best Practices

NoSQL databases are specialized to store different types of data, such as key-value, document, column-family, time-series, graph, and IoT data. Pascal Desmarets talks about how data modeling in NoSQL databases differs from modeling in relational databases.

Srini Penchikala
on May 01, 2017
AI, ML & Data Engineering

Virtual Panel: Data Science, ML, DL, AI and the Enterprise Developer

InfoQ caught up with experts in the field to demystify the different topics surrounding AI and to explain how enterprise developers can leverage them today to make their solutions more intelligent.

Rags Srinivas
on Apr 27, 2017
AI, ML & Data Engineering

From Alibaba to Apache: RocketMQ’s Past, Present, and Future

Feng Jia and Wang Xiaorui share the core distributed systems principles behind RocketMQ, Alibaba's distributed messaging and data streaming platform, now open sourced through the Apache Software Foundation.

Feng Jia, Wang Xiaorui
on Apr 21, 2017
Development

Key Takeaway Points and Lessons Learned from QCon London 2017

This year was the 11th for QCon London; it was also our largest London event to date. Including our 140 speakers we had 1435 team leads, architects, and project managers attending 112 technical sessions across 18 concurrent editorial tracks and 16 in-depth workshops.

Abel Avram
on Apr 18, 2017
Java

Want to Know What’s in a GC Pause? Go Look at the GC Log!

Sometimes a superficial analysis of application performance can wrongly point the finger at the garbage collector. A proper GC log analysis can lead us past the “blame the collector” game. When this happens, we can make amazing discoveries that improve the performance and stability of our applications.

Kirk Pepperdine
on Apr 17, 2017
AI, ML & Data Engineering

Building Pipelines for Heterogeneous Execution Environments for Big Data Processing

The Pipeline61 framework supports building data pipelines that span heterogeneous execution environments. It reuses the existing code of jobs already deployed in different environments and provides version control and dependency management to address typical software engineering issues. A real-world case study shows its effectiveness.

Liming Zhu, Qinghua Lu, Daniel Sun, Sherif Sakr, Dongyao Wu, Xiwei Xu
on Mar 31, 2017
Java

Introducing Reladomo - Enterprise Open Source Java ORM, Batteries Included!

Goldman Sachs is widely known as a leader in investment banking, but it is very much a leading technology firm as well. Reladomo is the primary Java ORM used at GS, and it is now open source. In this article, GS Technology Fellow Mohammad Rezaei takes us on a deep dive into Reladomo.

Mohammad Rezaei
on Mar 28, 2017
AI, ML & Data Engineering

There's No AI (Artificial Intelligence) without IA (Information Architecture)

Artificial intelligence (AI) is increasingly hyped by everyone, from well-funded startups to well-known software brands. In this article the author describes the need for high-quality, structured data before AI technologies can be of use to organizations and their customers.

Seth Earley
on Mar 25, 2017
AI, ML & Data Engineering

Big Data Processing Using Apache Spark - Part 6: Graph Data Analytics with Spark GraphX

In this article, author Srini Penchikala discusses the Apache Spark GraphX library, which is used for graph data processing and analytics. The article includes sample code for graph algorithms such as PageRank, Connected Components, and Triangle Counting.

Srini Penchikala
on Mar 14, 2017
AI, ML & Data Engineering

Three Experts on Big Data Engineering

Clemens Szyperski (Microsoft), Martin Petitclerc (IBM), and Roger Barga (Amazon Web Services) answer three questions: What major challenges do you face when building scalable, big data systems? How do you address these challenges? Where should the research community focus its efforts to create tools and approaches for building highly reliable, scalable, big data systems?

Roger Barga, Clemens Szyperski, Martin Petitclerc
on Mar 12, 2017
AI, ML & Data Engineering

Data Preprocessing vs. Data Wrangling in Machine Learning Projects

This article compares alternative techniques for preparing data, including extract-transform-load (ETL) batch processing, streaming ingestion, and data wrangling. It also discusses how these techniques relate to visual analytics, and best practices for how different user roles, such as the Data Scientist or Business Analyst, should work together to build analytic models.

Kai Wähner
on Mar 05, 2017
