The AWS team has announced a limited preview of Amazon Redshift, a cloud-hosted data warehouse whose cost and capabilities are poised to disrupt the industry. In addition, AWS revealed two new massive compute instance types and a data integration tool called Data Pipeline.
Big Data services available from cloud computing vendors such as Amazon, Microsoft and Google are uncovering interesting trends and opportunities.
During the PASS Summit 2012, a technical conference for SQL Server professionals, Microsoft announced Hekaton, an in-memory row-based data management system targeted at transaction processing (TP) workloads. Besides the advertised increase in TP speeds of up to 10x for old applications and up to 50x for new optimized ones, Microsoft touts Hekaton as being fully integrated into SQL Server.
Several new Hadoop-based frameworks were announced during this year's O’Reilly Strata Conference + Hadoop World 2012 in New York last week.
Precog has recently announced a Big Data warehousing and analysis service that takes care of data capture, storage, transformation, analysis and visualization, as well as the infrastructure on which it runs. The service leaves various access points open via RESTful APIs, enabling developers and data scientists to control the entire process.
Drill, a new Apache project, aims to support real-time interactive analysis of large-scale (terabyte-sized) data sets.
Amazon Glacier is a new service from Amazon Web Services (AWS) that provides extremely low-cost, durable storage for archive-ready data. This service targets organizations that want to retain large, infrequently used data sets but don’t want to maintain a local storage infrastructure.
Cap Gemini's Steve Jones has recently written an article arguing that thinking about solutions to problems is now treated as less important than jumping on the latest hype bandwagon. Although he uses REST and Big Data as examples, he believes the problem goes beyond any single technology and that eventually IT will no longer belong to IT people.
In their presentation posted at InfoQ, systems and data architects Ben Stopford, Farzad Pezeshkpour and Mark Atwell show how RBS leveraged new technologies in its architectures while facing difficult challenges such as regulation, competition and tighter budgets. They also had to cope with stringent technical challenges, for instance in efficiency and scalability.
Azavea, a Philadelphia-based company that provides products for geographical data, has published GeoTrellis, an open source geographic data processing engine for high performance applications, under the GNU GPL v3 license.
A new open source project, Dempsy, adds one more option for people trying to do real-time processing of big data. Comparable to Storm and S4, Dempsy is most applicable to near real-time stream processing where latency is more important than guaranteed delivery.
In development since 2010 by Rich Hickey and the Relevance team, Datomic offers some new approaches to database architecture. Leveraging current trends in cloud and storage, it provides strong transactions, a rich query API and read scaling.
LinkedIn engineering releases SenseiDB 1.0.0, a NoSQL database focused on high update rates and complex semi-structured search queries, already used in production by LinkedIn in its search-related pages (e.g. People/Company search).
Version 2.0 of Hazelcast, a Java-based caching, clustering and data distribution solution, has recently been released. As part of this, the product is now offered in both commercial Enterprise and free open-source Community Editions.
VMware have announced the availability of Spring Hadoop, which integrates the Spring Framework and the Apache Hadoop platform.