AI, ML & Data Engineering Content on InfoQ
-
Full Stack Web Development Using Neo4j
When building a web application, there are a lot of choices for the database. In this article, the author discusses why the Neo4j graph database is a good choice of data store for your web application if your data model contains a lot of connected data and relationships.
-
Shaping Big Data Through Constraints Analysis
In this article, author Carlos Bueno describes a method for analyzing constraints on the shape and flow of data in systems. He talks about factors useful for system analysis, such as working set and average transaction sizes, request and update rates, consistency, locality, computation, and latency. He also discusses the big data architecture details of two use cases: movie streaming and face recognition.
-
Big Data Processing with Apache Spark - Part 2: Spark SQL
Spark SQL, part of the Apache Spark big data framework, is used for structured data processing and allows running SQL-like queries on Spark data. In this article, Srini Penchikala discusses the Spark SQL module and how it simplifies running data analytics through a SQL interface. He also talks about the new features in Spark SQL, like DataFrames and JDBC data sources.
-
Highly Distributed Computations Without Synchronization
Synchronization of data across systems is expensive and impractical when running systems at scale. Traditional approaches for performing computations or information dissemination are not viable. In this article, Basho Sr. Software Engineer Chris Meiklejohn explores the basic building blocks for crafting deterministic applications that guarantee convergence of data without synchronization.
-
Big Data Processing with Apache Spark – Part 1: Introduction
Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics. In this article, Srini Penchikala talks about how the Apache Spark framework helps with big data processing and analytics through its standard API. He also discusses how Spark compares with traditional MapReduce implementations such as Apache Hadoop.
-
Building a Mars Rover Application with DynamoDB
DynamoDB is a NoSQL database service that aims to be easily managed, so you don't have to worry about administrative burdens such as operating and scaling. This article shows how to use Amazon DynamoDB to create a Mars Rover application. You can use the same concepts described in this post to build your own web application.
-
Apache Ignite GridGain Incubator Project - Q&A Interview with Nikita Ivanov
GridGain announced that its In-Memory Data Fabric has been accepted into the Apache Incubator program as Apache Ignite. InfoQ spoke with Nikita Ivanov about their product becoming part of Apache.
-
Interview with Alex Holmes, author of “Hadoop in Practice, Second Edition”
The new book “Hadoop in Practice, Second Edition” by Alex Holmes provides deep insight into the Hadoop ecosystem, covering a wide spectrum of topics: data organization, layouts, and serialization; data processing, including MapReduce and big data patterns; special structures and their use in simplifying big data processing; and SQL on Hadoop data.
-
Matt Schumpert on Datameer Smart Execution
Datameer, a big data analytics application for Hadoop, introduced Datameer 5.0 with Smart Execution, which dynamically selects the optimal compute framework at each step in the big data analytics process. InfoQ spoke with Matt Schumpert from the Datameer team about the new product and how it helps with big data analytics needs.
-
Stats Anomalies Detector
This article describes the general outline of the Stats Anomalies Detector we developed at MyHeritage and provides a detailed explanation of how to enhance the code (which will be available soon on the MyHeritage GitHub) to meet your company’s needs.
-
Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics
The book “Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics” by Brenda L. Dietrich, Emily C. Plachy, and Maureen F. Norton is a collection of experiences from analytics practitioners at IBM. InfoQ spoke with the authors about the lessons learned from the book, IBM's arsenal of big data technologies, and the future of analytics.
-
Let Me Graph That For You
In this article on graph databases, author Ian Robinson discusses the problems that graph databases aim to solve. He also talks about the data, storage, and query models for managing variably structured, densely connected data.