Hadoop Content on InfoQ

  • Hadoop-as-a-Service from Amazon, Cloudera, Microsoft and IBM

    Companies rely more and more on big data when making their decisions. Amazon, Cloudera, and IBM have announced their Hadoop-as-a-Service offerings, while Microsoft promises to do the same next year.

  • 'Denali' No More: SQL Server 2012 Announced, Focuses on BI and Big Data

    Microsoft announced that the next version of SQL Server, known by the codename "Denali", will be called SQL Server 2012. It will feature the big data capabilities of Apache Hadoop and Power View, a touch-based business intelligence tool.

  • Twitter Storm: Open Source Real-time Hadoop

    Twitter has open-sourced Storm, its distributed, fault-tolerant, real-time computation system, on GitHub under the Eclipse Public License 1.0. Storm is the real-time processing system developed by BackType, which is now under the Twitter umbrella.

  • MapR Releases Commercial Distributions based on Hadoop

    MapR Technologies released a big data toolkit based on Apache Hadoop, with its own distributed storage alternative to HDFS. The software is commercial, with both a free edition, M3, and a paid edition, M5. M5 includes snapshots and mirroring for data, Job Tracker recovery, and commercial support. MapR's M5 edition will form the basis of EMC Greenplum's upcoming HD Enterprise Edition.

  • Yahoo Hadoop Spinout Hortonworks Announces Plans

    Yahoo spun out its core Hadoop team, forming a new company, Hortonworks. CEO Eric Baldeschwieler presented the company's vision of easing adoption of Hadoop and making core engineering improvements for availability, performance, and manageability. Hortonworks will sell support, training, and certification, primarily indirectly through partners.

  • Hadoop Futures at Structure Big Data: DataStax Brisk, EMC, and MapR

    DataStax described Brisk, their new Hadoop distribution that stores data in Cassandra; EMC published an ad that promised big news about Hadoop and Greenplum on May 9th; and GigaOm claimed that MapR Technologies is building a proprietary version of Hadoop. DataStax told InfoQ there are production Cassandra clusters of 700 nodes, storing hundreds of terabytes and doing 200,000 writes per second.

  • Hadoop Redesign for Upgrades and Other Programming Paradigms

    Yahoo recently announced and presented a redesign of the core map-reduce architecture for Hadoop to allow for easier upgrades, larger clusters, and fast recovery, and to support programming paradigms in addition to Map-Reduce. The new design is quite similar to the open source Mesos cluster management project; both Yahoo and Mesos commented on the differences and opportunities.

  • JasperSoft 4 Released with Big Data Support

    JasperSoft announces reporting support for Hadoop and leading NoSQL databases.

  • Membase and Cloudera Announce Integration

    Membase and Cloudera announced integration of the Membase NoSQL database and Cloudera's Distribution for Hadoop, the distributed map-reduce and storage system, allowing for bidirectional data replication between the systems.

  • Cloudera Enterprise Released: Interview with Charles Zedlewski

    Cloudera recently announced Cloudera Enterprise, a commercial bundling of Hadoop and a dozen other supporting open source projects. InfoQ interviewed Product Manager Charles Zedlewski for more detail about what this means for conventional enterprises and the future face of Hadoop.

  • LinkedIn's Data Infrastructure

    Jay Kreps of LinkedIn presented some informative details of how they process data at the recent Hadoop Summit. Kreps described how LinkedIn crunches 120 billion relationships per day and blends large scale data computation with high volume, low latency site serving.

  • Facebook on Hadoop, Hive, HBase, and A/B Testing

    The Hadoop Summit of 2010 included presentations from a number of large scale users of Hadoop and related technologies. Notably, Facebook presented a keynote and detailed information about their use of Hive for analytics. Mike Schroepfer, Facebook's VP of Engineering, delivered a keynote describing the scale of their data processing with Hadoop.

  • Amazon Elastic MapReduce Updates from Hadoop Summit 2010

    The Hadoop Summit of 2010 included a keynote from Peter Sirota, General Manager of Amazon Elastic MapReduce (EMR), which is a hosted Hadoop offering from Amazon that includes web-based management tools.

  • Yahoo! Updates from Hadoop Summit 2010

    The Hadoop Summit of 2010 started off with a vuvuzela blast from Blake Irving, Chief Product Officer for Yahoo. Yahoo delivered keynote addresses that outlined the scale of their use, technical directions for their contributions, and architectural patterns in how they apply the technology.

  • Mahout 0.3: Open Source Machine Learning

    The need for machine-learning techniques like clustering, collaborative filtering, and categorization has steadily increased over the last decade, along with the number of solutions needing quick and efficient algorithms to transform vast amounts of raw data into relevant information. Apache Mahout 0.3 was announced in March, adding more functionality, stability, and performance.
