InfoQ Homepage AI, ML & Data Engineering Content on InfoQ
-
Apache MetaModel – Providing Uniform Data Access Across Various Data Stores
MetaModel, an Apache Incubator project, is a Java library for browsing, querying, and updating various types of data stores in a uniform, programmatic way. Supported stores include traditional SQL databases, less conventional sources such as CSV or Excel files, and the more modern NoSQL stores.
-
Cassandra Mythology
In this article, author Jonathan Ellis addresses common concerns about using the Apache Cassandra NoSQL database, covering architecture, deployment and configuration, performance, the query language (CQL), and database maturity.
-
Jepsen: Testing the Partition Tolerance of PostgreSQL, Redis, MongoDB and Riak
Distributed systems are characterized by exchanging state over high-latency or unreliable links. The system must be robust to both node and network failure if it is to operate reliably. However, not all systems satisfy the safety invariants we'd like. In this article, we'll explore some of the design considerations of distributed databases, and how they respond to network partitions.
-
Graph Databases - Book Review and Interview
The book "Graph Databases" covers graph-based NoSQL database technology and the options available for storing "connected data" in real-world applications. InfoQ spoke with co-authors Ian Robinson and Jim Webber about the book, the role of graph databases in the NoSQL database space, and what’s coming next for graph databases.
-
Mike Barlow on Real-Time Big Data Analytics
The white paper "Real-Time Big Data Analytics: Emerging Architecture", authored by Mike Barlow, covers big data analytics and how real-time big data analytics (RTBDA) differs from traditional analytics. InfoQ spoke with Mike about the current state of real-time big data analytics and emerging trends in the big data space, such as decision science.
-
Interview and Video Review: Working with Big Data: Infrastructure, Algorithms, and Visualizations
Paul Dix leads a practical exploration of Big Data in this video training series. The first five lessons span multiple server systems, with a focus on the end-to-end processing of large quantities of XML data from real Stack Exchange posts. He completes the training with a lesson on developing visualizations for gaining insights from macro-level analysis of Big Data.
-
The Datomic Information Model
Rich Hickey, the author of Clojure, explains the information model of Datomic, a new database designed as a composition of simple services that combines the capabilities of an RDBMS with the scalability of NoSQL.
-
Interview and Book Review: NoSQL Distilled
InfoQ spoke with both authors of the book, Pramod Sadalage and Martin Fowler, about the NoSQL database space and the emerging trends in NoSQL.
-
Hadoop Virtual Panel
In this virtual panel, InfoQ talks to several Hadoop vendors and users about their views on the current and future state of Hadoop and what matters most for Hadoop’s further adoption and success.
-
The Architecture of Datomic
Rich Hickey, the author of Clojure, explains the architecture of Datomic, a new database designed as a composition of simple services that combines the capabilities of an RDBMS with the scalability of NoSQL.
-
Julien Nioche on Apache Nutch 2 Features and Product Roadmap
Version 2 of Apache Nutch, the open source web-search framework, supports large-scale crawling, a link-graph database, and HTML parsing. InfoQ spoke with Julien Nioche, VP of the Apache Nutch project, about the framework's new features and its future roadmap.
-
Blueprint for a Big Data Solution
In his new article, Jonathan Natkins explains how to use components of Apache Hadoop, including Flume, Hive, and Oozie, to implement a typical data management system. He also gives a practical example of such an architecture, used to measure the influence of Twitter users.