Big Data Content on InfoQ
-
Apache Ignite GridGain Incubator Project - Q&A Interview with Nikita Ivanov
GridGain announced that its In-Memory Data Fabric has been accepted into the Apache Incubator program as Apache Ignite. InfoQ spoke with Nikita Ivanov about the product becoming part of Apache.
-
Interview with Alex Holmes, author of “Hadoop in Practice. Second Edition”
The new book “Hadoop in Practice. Second Edition” by Alex Holmes provides deep insight into the Hadoop ecosystem. It covers a wide spectrum of topics: data organization, layouts and serialization; data processing, including MapReduce and big data patterns; special structures and their use in simplifying big data processing; and SQL on Hadoop data.
-
Matt Schumpert on Datameer Smart Execution
Datameer, a big data analytics application for Hadoop, introduced Datameer 5.0 with Smart Execution, which dynamically selects the optimal compute framework at each step in the big data analytics process. InfoQ spoke with Matt Schumpert from the Datameer team about the new product and how it helps with big data analytics needs.
-
Stats Anomalies Detector
This article outlines the Stats Anomalies Detector we developed at MyHeritage and explains in detail how to extend the code (which will be available soon on the MyHeritage GitHub) to meet your company’s needs.
-
Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics
The book Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics, by Brenda L. Dietrich, Emily C. Plachy, and Maureen F. Norton, collects the experiences of analytics practitioners at IBM. InfoQ spoke with the authors about the lessons learned from the book, IBM's arsenal of Big Data technologies, and the future of analytics.
-
Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse
This article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from.
-
Nikita Ivanov on GridGain’s In-Memory Accelerator for Hadoop
GridGain recently announced the In-Memory Accelerator for Hadoop, which brings the benefits of in-memory computing to Hadoop-based applications. It includes two components: an in-memory file system and a MapReduce implementation. InfoQ spoke with Nikita Ivanov, CTO of GridGain, about the architecture of the product.
-
Introducing Spring XD, a Runtime Environment for Big Data Applications
Spring XD (eXtreme Data) is Pivotal’s Big Data play. It joins Spring Boot and Grails as part of the execution portion of the Spring IO platform. Whilst Spring XD makes use of a number of existing Spring projects, it is a runtime environment rather than a library or framework: it comprises a bin directory with servers that you start up and interact with via a shell.
-
MLConf NYC 2014 Highlights
MLConf took place in NYC on April 11th: a full day packed with talks on Machine Learning and Big Data, featuring speakers from many prominent companies.
-
Lambda Architecture: Design Simpler, Resilient, Maintainable and Scalable Big Data Solutions
Lambda Architecture proposes a simpler, elegant paradigm designed to store and process large amounts of data. In this article, author Daniel Jebaraj presents the motivation behind the Lambda Architecture and reviews its structure with the help of a sample Java application.
-
Embedded Analytics and Statistics for Big Data
This article provides an overview of tools and libraries available for embedded data analytics and statistics, both stand-alone software packages and programming languages with statistical capabilities. The authors also discuss how to combine and integrate these embedded analytics technologies to handle big data.
-
Big Data Analytics for Security
In this article, the authors discuss the role of big data and Hadoop in the security analytics space and show how to use MapReduce to process data efficiently for security use cases such as Security Information and Event Management (SIEM) and fraud detection.