Scalability versus distributed transactional semantics is no longer a trade-off, according to Google's research on Spanner. Spanner's features include non-blocking reads, lock-free read-only transactions, and atomic schema changes across a globally replicated relational database. The central idea that tackles the latency issues of distributed transactions is the explicit exposure of clock uncertainty.
Apache's new project, Drill, aims to support real-time interactive analysis of large-scale (terabyte-sized) data sets.
Cap Gemini's Steve Jones recently wrote an article arguing that thinking about solutions to problems is now treated as less important than jumping on the latest hype bandwagon. Although he uses REST and Big Data as examples, he believes the problem goes beyond any single technology and that eventually IT will no longer belong to IT people.
In their presentation posted at InfoQ, systems and data architects Ben Stopford, Farzad Pezeshkpour and Mark Atwell show how RBS leveraged new technologies in its architectures while facing difficult challenges such as regulation, competition and tighter budgets. They also had to cope with stringent technical demands, for instance around efficiency and scalability.
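To make the idea of exposing clock uncertainty concrete, here is a minimal sketch, not Spanner's actual implementation. It assumes a clock that reports an interval rather than a single instant, and a writer that waits until its chosen commit timestamp is certainly in the past before making the write visible. The names `UncertainClock`, `TimeInterval`, `commit_wait` and the `epsilon` error bound are hypothetical and chosen for illustration only.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    """A time reading with explicit uncertainty: the true time lies in [earliest, latest]."""
    earliest: float
    latest: float


class UncertainClock:
    """Hypothetical clock that exposes its error bound (epsilon, in seconds)."""

    def __init__(self, epsilon, source=time.time):
        self.epsilon = epsilon
        self._source = source

    def now(self):
        t = self._source()
        return TimeInterval(t - self.epsilon, t + self.epsilon)

    def after(self, timestamp):
        # True only when the timestamp has definitely passed on every clock.
        return self.now().earliest > timestamp


def commit_wait(clock, sleep=time.sleep):
    """Pick a commit timestamp, then wait out the uncertainty before returning it."""
    # Choose a timestamp no earlier than any possible value of "now".
    commit_ts = clock.now().latest
    # Block until the timestamp is guaranteed to be in the past.
    while not clock.after(commit_ts):
        sleep(max(clock.epsilon / 2, 0.001))
    return commit_ts
```

With an error bound of a few milliseconds, `commit_wait(UncertainClock(0.005))` blocks for roughly twice that bound, so the cost of the guarantee is proportional to how uncertain the clock is.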
Want to try out Hadoop with the Microsoft Stack and figure out what capabilities this brings to you? We point to some resources that can help.
VMware have announced the availability of Spring Hadoop, which integrates the Spring Framework and the Apache Hadoop platform.
In his new article “MapReduce Patterns, Algorithms, and Use Cases”, Ilya Katsov gives a systematic view of the different MapReduce patterns, algorithms and techniques that can be found on the web or in scientific articles along with several practical use case studies.
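As an illustration of what a basic MapReduce pattern looks like, the following is a minimal in-memory sketch of counting by key. It is not taken from the article and does not use Hadoop; the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical and only mimic the three stages a real framework would run across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


def run_job(documents):
    """Run map, shuffle and reduce sequentially over a list of documents."""
    emitted = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(emitted)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `run_job(["big data", "Big ideas"])` returns a dictionary of word counts, with `big` counted twice because the mapper lowercases each word.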
After six years of gestation, the big data framework Apache Hadoop 1.0.0 was recently released. Core features in the release include Kerberos authentication, support for Apache HBase and a RESTful API to HDFS. InfoQ spoke with Arun Murthy, VP of Apache Hadoop, about the new release.
Baidu Technical Salon is a regular offline community event hosted by Baidu and planned and run by InfoQ. Topics have included cloud computing, mobile Internet, big data, log analysis and other currently popular subjects. This article reviews Baidu's support for the technical community through the Technical Salon, the community's feedback on these events, and a brief plan for 2012.
HPCC Systems, which is part of LexisNexis, is launching its Thor Data Refinery Cluster on Amazon EC2 this week. HPCC Systems is an enterprise-grade, open source Big Data analytics platform capable of ingesting vast amounts of data and transforming, linking and indexing that data, with parallel processing power spread across the nodes.
eBay presented a keynote at Hadoop World, describing the architecture of its completely rebuilt search engine, Cassini, slated to go live in 2012. It indexes all the content and user metadata to produce better rankings and refreshes indexes hourly. It is built using Hadoop for hourly index updates and HBase to provide random access to item information.
Hortonworks, a company created in June 2011 by Yahoo! and Benchmark Capital, has announced a Technical Preview Program for its Data Platform, which is based on Hadoop. The company employs many of the core Hadoop contributors and intends to provide support and training.
A new post by Joe McKendrick outlines Hadoop's ability to significantly simplify enterprise SOA implementations through improved data access services built on a common enterprise data platform.
Companies rely more and more on big data when making their decisions. Amazon, Cloudera, and IBM have announced their Hadoop-as-a-Service offerings, while Microsoft promises to do the same next year.
Microsoft announced that the next version of SQL Server, known by the codename "Denali", will be called SQL Server 2012. It will feature the big data capabilities of Apache Hadoop and Power View, a touch-based business intelligence tool.