The NASA Center for Climate Simulation (NCCS) is using Apache Hadoop for high-performance data analytics. Glenn Tamkin of the NASA team recently spoke at the ApacheCon conference and shared details of the platform they built for climate data analysis with Hadoop.
Big data vendors Hortonworks, IBM, and Pivotal recently announced that their Hadoop-based platform products will use the common Open Data Platform (ODP). They announced the open platform, which includes Apache Hadoop 2.6 (HDFS, YARN, and MapReduce) and Apache Ambari software, at the recent Hadoop Summit Europe conference.
Standardizing on common models for business objects that are exchanged within an enterprise, e.g. Customer, Order and Product, together with the attributes and associations they have, might seem compelling, but for Stefan Tilkov this creation of Canonical Data Models (CDMs) is a horrible idea which he strongly advises against.
After three developer previews, six release candidates and over 1,500 closed tickets, the Apache Software Foundation has announced version 1.0 of Apache HBase, a NoSQL database in the Hadoop ecosystem. After more than seven years of active development, the team behind HBase felt that the project had matured and stabilized enough to warrant a 1.0 version.
A service is a logical construct that owns a business capability and is made up of internal autonomous components, or microservices, that together fulfil the responsibilities of the service, Jeppe Cramon suggests, continuing a previous series of blog posts that clarify his view on building services around business capabilities and bounded contexts.
Structuring data as a stream of events is an idea appearing in many areas and is the ideal way of storing data. Aggregating a read model from these events is an ideal way to present data to a user, Martin Kleppmann claims when describing the fundamental ideas behind Stream Processing, Event Sourcing and Complex Event Processing (CEP).
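The idea of storing events and deriving a read model from them can be illustrated with a minimal sketch. The shopping-cart domain, the `Event` class, the event names and the `build_read_model` function are all hypothetical, chosen only for illustration; they do not come from Kleppmann's talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact appended to the log (hypothetical schema)."""
    kind: str      # "item_added" or "item_removed" (assumed event names)
    cart_id: str
    item: str
    quantity: int


def build_read_model(events):
    """Fold the event stream into a per-cart view of current contents."""
    carts = defaultdict(lambda: defaultdict(int))
    for e in events:
        if e.kind == "item_added":
            carts[e.cart_id][e.item] += e.quantity
        elif e.kind == "item_removed":
            carts[e.cart_id][e.item] -= e.quantity
            # Drop items whose quantity has fallen to zero or below.
            if carts[e.cart_id][e.item] <= 0:
                del carts[e.cart_id][e.item]
    return {cart_id: dict(items) for cart_id, items in carts.items()}


# The log is the source of truth; the read model is derived from it.
EVENT_LOG = [
    Event("item_added", "cart-1", "book", 2),
    Event("item_added", "cart-1", "pen", 1),
    Event("item_removed", "cart-1", "book", 1),
    Event("item_added", "cart-2", "lamp", 1),
    Event("item_removed", "cart-1", "pen", 1),
]

if __name__ == "__main__":
    print(build_read_model(EVENT_LOG))
```

Because the log is never modified, the read model can be discarded and rebuilt at any time, or a different view can be aggregated from the same events.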
Matt Ranney, Chief Systems Architect at Uber, gave an overview of the company's dispatch system, which is responsible for matching Uber's drivers and riders. Ranney explained the driving forces that led to a rewrite of this system. He described the architectural principles that underpin it, several of the algorithms implemented, and why Uber decided to design and implement its own RPC protocol.
After living with microservices for three years at Gilt, we can see advantages in team ownership, boundaries defined by APIs and complex problems broken down. Challenges still exist in tooling, integration environments and monitoring, Yoni Goldberg explained in a presentation at the QCon London conference describing the challenges they encountered moving to a microservices architecture.
Microservices are conceptually too big; they conflate optimizing for organisational and technical factors, but solutions to problems of each type may not fit together very well, Phil Wills, senior architect at The Guardian, explained in a presentation at the QCon London conference promoting thinking about independent services and single responsibility applications, rather than microservices.
The goal of software is to sustainably minimize lead time to positive business impact; everything else is detail, Dan North claimed in a presentation at the QCon London conference describing ways of reasoning about code and how this leads him to an architecture style that may fit microservices.
When designing and building Halo 4, the next version in a video game series, a new solution was created based on the Actor model implemented by the Orleans framework, Caitie McCaffrey explained in a presentation at the QCon London conference about the work of designing and building the services supporting the new game.
Pivotal recently released Spring XD 1.1 GA with new features including stream processing with Reactor, RxJava, Spark Streaming and Python. Additionally, support for Kafka, batching and compression with RabbitMQ, and container group management when running on YARN is now included.
The assumption that a large system must have a single environment, often with a one-to-one mapping between a project’s scope and the system built, is challenged today, Stefan Tilkov explains when looking into ways to split a large system into smaller parts and comparing the characteristics of systems, applications and microservices.
Google last week announced the release of an open source MapReduce framework for C, called MR4C, that allows developers to run native code in the Hadoop framework. MR4C brings together the performance and flexibility of natively developed algorithms with the scalability and throughput provided by the Hadoop execution framework.
Pivotal has decided to open source core components of their Big Data Suite and has announced the Open Data Platform, an initiative promoting open source and standardization for Big Data.