Databricks, the company behind Apache Spark, has announced a new addition to the Spark ecosystem called Spark SQL. Spark SQL is separate from Shark and does not use Hive under the hood. InfoQ reached out to Reynold Xin and Michael Armbrust, software engineers at Databricks, to learn more about Spark SQL.
Cloudera recently released the latest version of its software distribution, CDH5. It arrives almost 20 months after the last major version, CDH4, which seems like ages in the Big Data world. We take a look at the new features this release brings and at Cloudera's future direction after the latest round of investment from Intel and Google Ventures.
Microsoft announced the availability of the Windows Management Framework V5 Preview, which includes Windows PowerShell OneGet, a package manager in the spirit of yum and apt-get; a set of cmdlets to manage network switches; and some refinements to Windows PowerShell Desired State Configuration (DSC).
A report on why agile works for Australia’s most progressive organizations, including ANZ, Bankwest, Commonwealth Bank, NAB, Suncorp, Allianz, SunSuper and many more, and on their journey to DevOps and continuous delivery.
The social-networking company AddThis open-sourced Hydra under the Apache License, Version 2.0, in a recent announcement. Hydra grew from an in-house platform created to process semi-structured social data as live streams and to run efficient queries on those data sets.
The new version of Azure brings with it enhanced options for private networks, virtual private networks, and multi-region load balancing.
Spark users can now use a new Big Data platform provided by intelligence company Atigeo, which bundles most of the UC Berkeley stack into a unified framework optimized for low-latency data processing that can provide significant improvements over more traditional Hadoop-based platforms.
After thirteen years of development and evolution, JSR-107 (JCACHE) has been finalized.
Docker Inc., the company behind Docker, introduced a range of new services, including its first paid offering: private repositories. Docker index, Docker.io's public registry, now also offers webhooks, triggers and links for trusted builds and e-mail notifications.
Apache released HBase 0.98, which primarily addresses convergence with Apache Accumulo through cell-based security while resolving over 230 JIRA issues. The new security features are modeled after those in Accumulo.
DevOps tool provider Vagrant announced significant features in their 1.5 release, including a public image repository and the ability to share access to running environments. The Vagrant Cloud is meant to simplify the discovery and distribution of complete development environments. Vagrant Share lets developers collaborate with others by exposing HTTP or SSH access to these virtual environments.
Processing extremely large graphs has been and remains a challenge, but recent advances in Big Data technologies have made this task more practical. Tapad, a startup based in NYC focused on cross-device content delivery, has made graph processing the heart of its business model, using Big Data technologies to scale to terabytes of data.
Daniel Schauenberg described at QCon London how Etsy, renowned for its DevOps and Continuous Delivery practices, does 50 deploys/day. A fully automated deployment pipeline, thorough application monitoring and IRC-based collaboration are all important to achieve this rate of change while keeping risk to a minimum. Etsy has about 60 million monthly visits and 1.5 billion page views per month.
ZeroTurnaround was born in Estonia in 2006. It was founded by Jevgeni Kabanov and aimed to solve Java's core problem: the redeployment bottleneck. Since then, the company has launched two products, JRebel and LiveRebel, and started two community efforts, RebelLabs and vJUG. For an insider's perspective on ZeroTurnaround, I interviewed its CEO.