Distributed Systems Content on InfoQ
-
Large Scale Map-Reduce Data Processing at Quantcast
Ron Bodkin presents the architecture Quantcast uses to process hundreds of terabytes of data daily with Hadoop on dedicated systems, covering the applications, the types of data processed, and the infrastructure.
-
Development at the Speed and Scale of Google
Ashish Kumar explains how Google keeps the source code of over 2,000 projects in a single code trunk containing hundreds of millions of lines of code, with more than 5,000 developers accessing the same repository.
-
Availability, the Cloud and Everything
Joe Williams discusses how distributed systems, cloud computing, and configuration management affect a system’s availability. He illustrates with a database service built on CouchDB, Erlang, Chef, and EC2.
-
Global Software Delivery with Distributed Agile
Matthew Simons and Steven Boswell argue that distributed software development is a strategic capability for a company, and present a framework and Agile practices for building such an environment.
-
Test-Driven Development of Asynchronous Systems
Nat Pryce shows how he dealt with the flickering tests, false positives, and slow, messy tests that arise when testing asynchronous systems end to end.
-
Social Networks: Getting Distributed Web Services Done with NoSQL
Lars George and Fabrizio Schmidt present Germany’s largest social networks (Schuelervz, Studivz and Meinvz), their initial architecture, why it didn’t work, and how they fixed it with a NoSQL solution.
-
Embracing Concurrency At Scale
Justin Sheehy explains the principles behind concurrent distributed systems: no global state, no ACID but rather BASE, no RPC but protocols over APIs, prepare for failure, degradation, measurement.
-
Horizontal Scalability via Transient, Shardable, and Share-Nothing Resources
Adam Wiggins details how memcached, CouchDB, Hadoop, Redis, Varnish, RabbitMQ, and Erlang apply the transient, shardable, and share-nothing principles to achieve horizontal scalability.
-
Facebook’s Petabyte Scale Data Warehouse using Hive and Hadoop
Ashish Thusoo and Namit Jain explain how Facebook analyzes 12 TB of new compressed data every day with the help of Hive, an open source data warehousing framework built on Hadoop.
-
RPC and its Offspring: Convenient, Yet Fundamentally Flawed
Steve Vinoski covers the history of RPC, standardization, distributed objects, CORBA, DCOM, Java, SOAP, WS-*, flaws in RPC, REST vs RPC philosophy, Erlang reliability and concurrency.
-
Hypertable - An Open Source, High Performance, Scalable Database
This presentation discusses Hypertable, an open source, high performance, distributed database modeled after Google's Bigtable. Doug offers a comprehensive discussion of all aspects of Hypertable.
-
Erlang - software for a concurrent world
How do you program a multicore computer? Easy - do it in Erlang. Joe introduces Erlang, the ideas of Concurrency Oriented Programming, and commercial applications written in Erlang.