Posted by Stefan Tilkov on Jun 25, 2007 04:05 PM
In a blog post, Microsoft’s Dare Obasanjo shared his notes on “MapReduce, BigTable, and Other Distributed System Abstractions for Handling Large Datasets”, a session given by Google’s Jeff Dean at the Google Seattle Conference on Scalability. According to Dare, the talk covered the three main elements of Google’s massively scalable architecture: GFS, the Google File System; MapReduce, an infrastructure for processing large datasets in parallel; and BigTable, Google’s distributed store for structured data.
The report contains some fascinating details about Google’s infrastructure. About GFS:
There are currently over 200 GFS clusters at Google, some of which have over 5000 machines. They now have pools of tens of thousands of machines retrieving data from GFS clusters that run as large as 5 petabytes of storage with read/write throughput of over 40 gigabytes/second across the cluster.
On MapReduce:
A developer only has to write their specific map and reduce operations for their data sets which could run as low as 25 - 50 lines of code while the MapReduce infrastructure deals with parallelizing the task and distributing it across different machines, handling machine failures and error conditions in the data, optimizations such as moving computation close to the data to reduce I/O bandwidth consumed, providing system monitoring and making the service scalable across hundreds to thousands of machines.
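To make the division of labour concrete, here is a minimal sketch of what the developer-written part could look like, using word counting as the example. It is written in Python for illustration only and is not Google's API: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical. The toy driver runs in a single process, whereas the real infrastructure handles distribution, failures, and data locality.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


# Hypothetical user-written map operation: one input record in,
# zero or more (key, value) pairs out.
def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    for word in text.lower().split():
        yield word, 1


# Hypothetical user-written reduce operation: combine every value
# emitted for a single key into one result.
def reduce_counts(word: str, counts: Iterable[int]) -> int:
    return sum(counts)


def run_mapreduce(
    records: Iterable[Tuple[str, str]],
    map_fn: Callable[[str, str], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Toy single-process stand-in for the framework:
    run map over every record, group values by key, then reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            grouped[out_key].append(out_value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


docs = [("d1", "the quick brown fox"), ("d2", "the lazy dog the end")]
word_counts = run_mapreduce(docs, map_words, reduce_counts)
```

Only the two small functions at the top are specific to the problem. Everything the driver does, and everything a real cluster adds on top of it, is the framework's responsibility.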
Concerning BigTable:
BigTable is not a relational database. It does not support joins nor does it support rich SQL-like queries. Instead it is more like a multi-level map data structure. It is a large scale, fault tolerant, self managing system with terabytes of memory and petabytes of storage space which can handle millions of reads/writes per second. BigTable is now used by over sixty Google products and projects as the platform for storing and retrieving structured data.
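The "multi-level map" idea can be sketched as a nested dictionary. The following is an in-memory toy in Python, not BigTable's actual interface: the class `ToyTable` and its methods are hypothetical, and the third map level (a timestamp per stored version) is an assumption taken from Google's published description of BigTable rather than from the notes quoted above.

```python
from collections import defaultdict
from typing import Dict, Optional


class ToyTable:
    """In-memory nested map: row key -> column name -> timestamp -> value.

    Hypothetical illustration only; no distribution, persistence,
    or fault tolerance.
    """

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Dict[int, str]]] = defaultdict(
            lambda: defaultdict(dict)
        )

    def put(self, row: str, column: str, value: str, timestamp: int) -> None:
        # Each write adds a version under its timestamp (assumed level).
        self._rows[row][column][timestamp] = value

    def get(self, row: str, column: str) -> Optional[str]:
        # Look up by key only: there are no joins and no query language.
        versions = self._rows.get(row, {}).get(column, {})
        if not versions:
            return None
        return versions[max(versions)]  # most recent version wins


table = ToyTable()
table.put("com.example/index.html", "contents", "<html>v1</html>", timestamp=1)
table.put("com.example/index.html", "contents", "<html>v2</html>", timestamp=2)

latest = table.get("com.example/index.html", "contents")
missing = table.get("no-such-row", "contents")
```

Every access goes through a row key and a column name, which is why operations such as joins have no natural place in this model.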
For those who want to try these ideas out on their own, the Apache Lucene Hadoop subproject might be a good place to start. It contains an implementation of MapReduce as well as HDFS, a GFS-like distributed file system.