Cloudera recently released the latest version of its software distribution, CDH5. Almost 20 months have passed since the last major version, CDH4, which seems like ages in the Big Data world. We take a look at the new features this release brings and the future direction of Cloudera after the latest round of investment from Intel and Google Ventures.
Hadoop 2.4.0 was recently released with several enhancements to both HDFS and YARN. These include support for Access Control Lists, native support for rolling upgrades, full HTTPS support for HDFS, automatic failover of YARN, and other operational improvements.
We should build more loosely coupled systems to achieve properties like robustness, resilience and scalability, Udi Dahan emphasizes in a recent presentation. He discusses how we can model our systems using more event-driven and asynchronous patterns, and some of the challenges developers face when introducing these principles and patterns into development.
The social-networking company AddThis open-sourced Hydra under the Apache License, version 2.0, in a recent announcement. Hydra grew from an in-house platform created to process semi-structured social data as live streams and to do efficient query processing on those data sets.
Spark users can now use a new Big Data platform provided by intelligence company Atigeo, which bundles most of the UC Berkeley stack into a unified framework optimized for low-latency data processing that can provide significant improvements over more traditional Hadoop-based platforms.
The 2014 Edition of The Shallot - the online magazine which conducts deep analysis of the state of the information technology industry - has been released.
The New York Times R&D Lab has released streamtools, a general-purpose, graphical tool for dealing with streams of data, under the Apache 2 license.
Over the past year or so we've started to hear about Microservices as a potentially new architectural style. Recently Thoughtworks' Martin Fowler and James Lewis wrote an article defining Microservices. However, Steve Jones takes issue with the general theme and much of that article, believing that there is little new here and that this is just a Service Oriented Delivery approach.
Independently of each other, Richard Warburton in a presentation and Mark Seemann in a blog post both talk about object-orientation and the SOLID design principles from a functional programming perspective.
Espresso Logic has added RESTful endpoints for SQL stored procedures to its DBaaS offering.
According to a new Forrester report, Hadoop’s momentum is unstoppable. Its usage in the enterprise is continuously growing due to its ability to offer companies new ways to store, process, analyze, and share big data. The report takes a look at Hadoop vendors and ranks them.
Recently, Spark graduated from the Apache incubator. Spark claims up to 100x speed improvements over Apache Hadoop on in-memory datasets, gracefully falling back to a 10x speed improvement for on-disk processing. Based on Scala, it can run SQL queries and be used directly in R. It provides machine learning, graph database capabilities and other features discussed further in the article.
There are both commonalities and differences in the architectural principles and coding styles of Akka Actors and Java EE 7 Enterprise JavaBeans, specifically stateless session beans and JMS message-driven beans, Dr Gerald Loeffler concludes in a recent introductory talk explaining and comparing the three approaches from a high-level concurrency view.
Version 2.1 of the CQRS framework Axon supports annotations and ordering of event handlers, and brings a new conflict resolution mechanism together with performance improvements. The recently released version also adds compatibility with OSGi.