Database Content on InfoQ
-
Graph Computing at Scale
Matthias Broecheler discusses graph computing and introduces the Aurelius graph cluster, which enables graph computing at scale by building on distributed systems such as Cassandra, HBase, and Hadoop.
-
R for Big Data
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
-
Search for the Holy Grail (and test it once found)
Baruch Sadogursky surveys and compares the search and testing tools available to Grails developers.
-
Real-World Datomic: An Experience Report
Craig Andera explains Datomic from the perspective gained in implementing and optimizing a real-world production system, detailing the Datomic indexing process.
-
Tracking Millions of Ganks in Near Real Time
Garrett Eardley explores how Riot Games is using Riak for their stats system, discussing why they chose Riak, the data model and indexes, and strategies for working with eventually consistent data.
-
REEF: Retainable Evaluator Execution Framework
Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.
-
Deploying Machine Learning and Data Science at Scale
Nick Kolegraff discusses common problems, an architecture that supports every phase of data science, and how to start a data science initiative, sharing lessons from Accenture, Best Buy, and Rackspace.
-
Spanner - Google's Distributed Database
Sebastian Kanthak details how Spanner relies on GPS and atomic clocks to provide two of its innovative features: lock-free strong reads and global snapshots consistent with external events.
-
Working with Databases and Groovy
Paul King shows how to work with databases in Groovy, covering datasets, GMongo, Neo4j, raw JDBC, Groovy-SQL, CRUD, Hibernate, caching, Spring Data technologies, and more.
-
Functional Programming for Optimization Problems with City of Palo Alto Open Data
Paco Nathan reviews an example data analysis application, written in Cascalog and used for a recommender system based on City of Palo Alto Open Data.
-
Add ALL the Things: Abstract Algebra Meets Analytics
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
-
Big Data Platform as a Service at Netflix
Jeff Magnusson details some of Netflix's key services: Franklin, Sting, and Lipstick.