Twitter has open sourced its MapReduce streaming framework, Summingbird. Available under the Apache 2 license, Summingbird is a large-scale data processing system that lets developers run the same code in batch mode (Hadoop/MapReduce-based), in stream mode (Storm-based), or in a combination of the two, called hybrid mode.
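To illustrate the idea of writing logic once and running it in batch, stream, or hybrid mode, here is a minimal Python sketch. It does not use Summingbird's actual API; the names `count_words` and `merge_counts` are hypothetical, and combining batch and stream results through an associative merge is an assumption made for illustration.

```python
from collections import Counter
from typing import Iterable


def count_words(lines: Iterable[str]) -> Counter:
    """Core logic, written once: count the words in a collection of lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(a: Counter, b: Counter) -> Counter:
    """Associative merge, so partial results can be combined in any grouping."""
    return a + b


# "Batch mode": process the whole historical data set at once.
historical = ["storm and hadoop", "hadoop jobs"]
batch_result = count_words(historical)

# "Stream mode": process events one at a time, folding into a running total.
recent_events = ["storm topology", "hadoop"]
stream_result: Counter = Counter()
for event in recent_events:
    stream_result = merge_counts(stream_result, count_words([event]))

# "Hybrid mode": combine the batch view with the more recent stream view.
hybrid_result = merge_counts(batch_result, stream_result)
```

Because the merge is associative, the hybrid result equals what a single pass over all the data would produce, which is what makes it safe to split the work between a batch layer and a streaming layer in this sketch.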
2013 has been rich in announcements of new programs, degrees, and grants for aspiring data scientists and Big Data practitioners.
Chef Sugar is an extension to Chef that offers DSL methods to make recipes more readable. Seth Vargo, Chef Sugar's author, recently wrote about his motivations for creating it, illustrating them with examples. InfoQ interviewed Seth to learn more about his views on syntactic sugar and the benefits of a plug-in architecture in the context of Chef.
Rackspace, a cloud computing platform provider and co-founder of OpenStack, is introducing its new DevOps Automation Service. The service offers managed support for infrastructure and workflow automation, monitoring and log aggregation, and source control for infrastructure code.
In 2011 Trevor Eckhart found logs on his device that he believed were associated with Carrier iQ data. Our response at the time, which has since been confirmed by a detailed FTC investigation, is that the data collection logs were associated with and used by the manufacturer of the device, not Carrier iQ. They were not Carrier iQ logs.
Jez Humble and Gene Kim, prominent figures of the DevOps movement, are working with Puppet Labs on the 2013 DevOps Survey Of Practice. The survey's goal is to better understand which IT practices drive an organization to high performance, building upon the 2012 DevOps Survey. The survey will close on the 15th of January and everyone is invited to participate.
The PowerShell team released a new set of Desired State Configuration (DSC) resources, packaged in five modules: xWebAdministration, xComputerManagement, xPSDesiredStateConfiguration, xNetworking, and xHyperV. This release aims to encourage the PowerShell community to author more DSC resources. With this release it also becomes possible to create a web server from scratch using only DSC resources.
Mitchell Hashimoto, creator of Vagrant, gave a talk last month at Velocity Conf London about his vision for a “FutureOps” with immutable infrastructures and built-in failure recovery.
Arun Kejariwal, from Twitter, talked at Velocity Conf London last month about forecasting algorithms used at Twitter to proactively predict system resource needs as well as business metrics such as number of users or tweets. Given the dynamic nature of their data stream, they found that a refined ARIMA model works well once data is cleansed, including removal of outliers.
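The cleanse-then-forecast workflow described above can be sketched in a few lines of Python. This is not the refined ARIMA model from the talk: it swaps in a median-based outlier filter and a plain AR(1) fit by least squares, the simplest autoregressive member of the ARIMA family. The function names, the 3.5 threshold, and the sample data are all assumptions made for illustration.

```python
import statistics
from typing import List


def replace_outliers(series: List[float], threshold: float = 3.5) -> List[float]:
    """Replace points whose modified z-score exceeds `threshold` with the median."""
    med = statistics.median(series)
    mad = statistics.median([abs(x - med) for x in series])
    if mad == 0:
        return list(series)  # no spread to measure against; leave data as-is
    return [med if 0.6745 * abs(x - med) / mad > threshold else x for x in series]


def forecast_ar1(series: List[float], steps: int = 1) -> List[float]:
    """Fit x[t] = c + phi * x[t-1] by least squares and forecast `steps` ahead."""
    xs, ys = series[:-1], series[1:]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = cov_xy / var_x if var_x else 0.0
    c = mean_y - phi * mean_x
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Hypothetical metric with one obvious spike (100) that would distort the fit.
raw = [10, 11, 10, 12, 11, 100, 12, 11]
cleaned = replace_outliers(raw)
next_values = forecast_ar1(cleaned, steps=3)
```

Replacing outliers with the global median is a crude choice that suits a roughly stationary series; data with a strong trend or seasonality would call for a local or model-based replacement instead.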
Trifacta, a data analysis services platform, recently received VC investment to advance its efforts to make data wrangling easier for data analysts. The goal is to collect, cleanse, and munge data in a fraction of the time and effort it currently takes.
Chaitanya Mishra, from Facebook, spoke at Velocity Conf London last month about the approach taken to scale Facebook’s Android app from a web view interface to a full-fledged native app. To achieve this transition, each product team took ownership of its features on Android. A core integration team runs regression tests and focuses on optimizing the app as a whole rather than individual features.
Qubole, a managed Hadoop-as-a-Service offering, is now available on Google Compute Engine (GCE). Until now Qubole was available only on Amazon's AWS, and this announcement comes just a few days after Google released GCE into general availability.
The MapReduce paradigm is not always ideal when dealing with large computationally intensive algorithms. A small team of entrepreneurs is building a product called ParallelX to solve that bottleneck by harnessing the power of GPUs to give Hadoop jobs a significant boost.
Mirage OS is a ‘cloud operating system’ that seeks to avoid security vulnerabilities and bloat by facilitating the creation of single purpose virtual appliances. Applications are developed in the OCaml functional programming language and compiled into standalone ‘unikernels’ that run directly on the Xen hypervisor.
Want to do DevOps automation in a Microsoft world? Typically that meant using Microsoft-provided tools like PowerShell and System Center instead of the popular open source tools that have been slow to support the full Microsoft product stack. That’s beginning to change as developers and system administrators can now use tools like Puppet to provision and manage resources in Windows Azure.