Using Hunk+Hadoop as a Backend for Splunk

by Jonathan Allen on  Sep 22, 2015

Splunk can now store archived indexes on Hadoop. At the cost of performance, this offers a 75% reduction in storage costs without losing the ability to search the data. And with the new adapters, Hadoop tools such as Hive and Pig can process the Splunk-formatted data.

Splunk .conf 2015 Keynote

by Jonathan Allen on  Sep 22, 2015

Splunk opened their big data conference with an emphasis on “making machine data accessible, usable, and valuable to everyone”. This is a shift from their original focus: indexing arbitrary big data sources. Reasonably happy with their ability to process data, they want to ensure that developers, IT staff, and normal people have a way to actually use all of the data their company is collecting.

Lessons Learned Working with Distributed Systems

by Jan Stenberg on  Aug 13, 2015

Preparing for problems like partial failure is the best thing you can do when working with distributed systems, Vaughn Vernon explains in a conversation with InfoQ. He refers to a blog post by Jeff Hodges, noting its down-to-earth approach and practical advice, such as designing for partial availability and using capped exponential backoff to restore full operation when dependencies are unavailable.
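
As a rough illustration of capped exponential backoff, here is a minimal Python sketch. It is not taken from Jeff Hodges' post: the function names, the default values, and the choice of ConnectionError as the retryable failure are all assumptions made for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(base: float, cap: float, attempts: int) -> List[float]:
    """Delay before each retry: base * 2**n, never exceeding cap."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]


def call_with_backoff(
    operation: Callable[[], T],
    base: float = 0.1,      # assumed first delay, in seconds
    cap: float = 5.0,       # assumed upper bound on any single delay
    attempts: int = 5,      # assumed number of retries after a failure
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, sleeping for a capped, doubling delay after each failure."""
    for delay in backoff_delays(base, cap, attempts):
        try:
            return operation()
        except ConnectionError:
            # The dependency looks unavailable: wait, then try again.
            sleep(delay)
    # Final try after the last delay; any error now propagates to the caller.
    return operation()
```

With base=1.0 and cap=8.0, the delays grow as 1, 2, 4, 8 seconds and then stay at 8. Production implementations commonly add random jitter to each delay so that many clients do not retry in lockstep; that is omitted here for brevity.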

GameAnalytics Open-Source Erlang Scheduler for Distributed Tasks

by Sergio De Simone on  Aug 05, 2015

GameAnalytics, maker of a free analytics platform, has recently open sourced gascheduler, an Erlang library that provides a generic scheduler for parallel execution of distributed tasks. InfoQ has spoken to Chris de Vries, one of gascheduler’s creators.

A Critical Look at CQRS

by Jan Stenberg on  Jul 20, 2015

Viewed in a larger architectural context, Command Query Responsibility Segregation (CQRS) is one of several available architectural styles. There are database technologies that solve the same problems in a simpler way, Udi Dahan states while looking into ways of approaching CQRS. And when CQRS is really needed, there is a way to fulfil many of its goals with fewer moving parts.

ELIoT: Distributed Programming for the Internet of Things

by Sergio De Simone on  Jul 16, 2015

ELIoT (Extensible Language for the Internet of Things) is a simple and small programming language aiming to make distributed programming easier. A program in ELIoT may appear as a single program, but it actually runs on different computers, so that, for example, a variable or function declared on one computer can be used transparently on another.

DDD, Events and Microservices

by Jan Stenberg on  Jun 29, 2015

To make microservices awesome, Domain-Driven Design (DDD) is needed: the same mistakes that were made 5-10 years ago and solved by DDD are being made again in the context of microservices, David Dawson claimed in his presentation at this year’s DDD Exchange conference in London.

Twitter Has Replaced Storm with Heron

by Abel Avram on  Jun 12, 2015

Twitter has replaced Storm with Heron, which provides up to 14 times more throughput and up to 10 times lower latency on a word count topology, and has helped them reduce the hardware needed to a third.

Parquet Becomes Top-Level Apache Project

by Jérôme Serrano on  Jun 11, 2015

Apache Parquet, the open-source columnar storage format for Hadoop, recently graduated from the Apache Software Foundation Incubator and became a top-level project. Initially created by Cloudera and Twitter in 2012 to speed up analytical processing, Parquet is now openly available for Apache Spark, Apache Hive, Apache Pig, Impala, native MapReduce, and other key components of the Hadoop ecosystem.

Stefan Tilkov: Skip the Monolith, Start with Microservices

by Jan Stenberg on  Jun 10, 2015

In recent months Martin Fowler, among others, has claimed that a microservices architecture should always start with a monolith, but Stefan Tilkov is convinced this is wrong: building a well-structured monolith with cleanly separated modules that may later be pulled apart into microservices is extremely hard, if not impossible, in most cases.

MemSQL 4 Database Supports Community Edition, Geospatial Intelligence and Spark Integration

by Srini Penchikala on  May 30, 2015

The latest version of MemSQL, an in-memory database with support for transactions and analytics, includes a new Community Edition for free use by organizations. MemSQL 4, released last week, also supports integration with Apache Spark, Hadoop Distributed File System (HDFS), and Amazon S3.

Glenn Tamkin on Applying Apache Hadoop to NASA's Big Climate Data

by Srini Penchikala on  May 06, 2015

The NASA Center for Climate Simulation (NCCS) is using Apache Hadoop for high-performance data analytics. Glenn Tamkin of the NASA team recently spoke at ApacheCon and shared the details of the platform they built for climate data analysis with Hadoop.

Hortonworks, IBM and Pivotal to Support Open Data Platform in Their Big Data Solutions

by Srini Penchikala on  Apr 24, 2015

Big data vendors Hortonworks, IBM, and Pivotal recently announced that their Hadoop-based platform products will use the common Open Data Platform (ODP). They made the announcement at the recent Hadoop Summit Europe conference. The open platform includes Apache Hadoop 2.6 (HDFS, YARN, and MapReduce) and Apache Ambari software.

Apache HBase Hits 1.0

by Benjamin Darfler on  Apr 07, 2015

After three developer previews, six release candidates, and over 1500 closed tickets, the Apache Software Foundation has announced version 1.0 of Apache HBase, a NoSQL database in the Hadoop ecosystem. After more than 7 years of active development, the team behind HBase felt that the project had matured and stabilized enough to warrant a 1.0 version.

A Service is a Logical Construct Built by Microservices

by Jan Stenberg on  Mar 31, 2015

A service is a logical construct that owns a business capability and is made up of internal autonomous components, or microservices, that together fulfil the responsibilities of the service, Jeppe Cramon suggests, continuing a previous series of blog posts that clarify his view on building services around business capabilities and bounded contexts.

