Hadoop is definitely the platform of choice for Big Data analysis and computation. But as data Volume, Variety and Velocity increase, Hadoop as a batch processing framework cannot cope with the requirement for real-time analytics. Spark, Storm and the Lambda Architecture can help bridge the gap between batch and event-based processing.
Mobile Backend as a Service provider AnyPresence continues to hone its chops, launching the fifth update to its self-titled platform geared for the enterprise. Co-founder Rich Mendis provides some insights for InfoQ readers…
Twitter has open-sourced Storm, its distributed, fault-tolerant, real-time computation system, on GitHub under the Eclipse Public License 1.0. Storm is the real-time processing system developed by BackType, which is now under the Twitter umbrella.
FlightCaster recently open-sourced Crane, a tool for distributing and remotely controlling Clojure instances, currently specialized for EC2. Incanter is a Clojure library and tool that makes R-like statistical computations easy with Clojure. Also: the build and dependency management tool Leiningen 1.0 is now available.
With the multiplicity of existing remoting mechanisms, it is often necessary to build clients so that protocols can be swapped or added with little or no impact on the client’s implementation. A new framework, CRISPY, provides support for such implementations.
Tim Bray of Sun Microsystems writes about the Fallacies of Distributed Computing. He observes that despite their profound implications for designing distributed systems, “you don’t often find them coming up in conversations about building big networked systems”.
Recently, an early release draft of a Distributed OSGi requirements and design document has been published, along with a reference implementation as part of Apache CXF. In a new article, Eric Newcomer writes about the current status of distributed OSGi and explains the reasons for standardizing it in the first place, and its significance to the OSGi specification and community.
Google caused a stir by releasing Protocol Buffers, a binary serialization format. We take a look at what exactly Protocol Buffers are and what alternatives are available, such as ASN.1 or Facebook's Thrift.
In this interview from QCon San Francisco 2007, Randy Shoup discusses the architecture of eBay. Topics discussed include eBay's architectural principles, horizontal and vertical partitioning, ACID vs. BASE, handling data inconsistency, distributed caching, updating eBay on the fly, architectural and coding standards, eBay's search infrastructure, grid computing, and SOA.
We get more and more cores in our CPUs, but does our software run linearly faster? In most cases, no. We've hit a trend change when it comes to faster CPUs: we'll get more and more cores, but each core will be slower as the number of cores increases. In his talk, Joe Armstrong introduces Erlang and the ideas of Concurrency Oriented Programming, which is one way to solve the problem.
The MapReduce design pattern for distributing data processing was introduced by Google in 2004, initially with a C++ implementation. A new Ruby implementation named Skynet has now been released by Adam Pisoni. InfoQ had the chance to catch up with Adam about its features and how it compares to an existing Ruby implementation called Starfish.
MPI, the Message Passing Interface, is the standard for distributed programming of the kind used in supercomputers, and implementations can be found for FORTRAN, C, and C++. There are several projects in the works to bring that power to .NET. Today we look at two of them.