The MapReduce paradigm is not always ideal when dealing with large computationally intensive algorithms. A small team of entrepreneurs is building a product called ParallelX to solve that bottleneck by harnessing the power of GPUs to give Hadoop jobs a significant boost.
This post presents the results of a Hortonworks survey of over 500 Hadoop Summit 2013 attendees on how they use Hadoop, and an interview with David McJannet on Hadoop trends today.
With Facebook recently releasing Presto as open source, the already crowded SQL-in-Hadoop market just became a tad more intricate. A number of open source tools are competing for the attention of developers: Hortonworks Stinger initiative around Hive, Apache Drill, Apache Tajo, Cloudera’s Impala, Salesforce’s Phoenix (for HBase) and now Facebook’s Presto.
On each day of the 3-day conference at the inviting environs offered at the Hyatt there was a jam-packed schedule of speakers, exhibits and activities that made for some difficult decisions as to which tracks and what happening to attend.
New version of Cascading released this week incorporates Hadoop 2 support and includes Cascading Lingual - an open source project that provides a comprehensive ANSI SQL interface for accessing Hadoop-based data
Hadoop 2 is now Generally Available, with YARN bringing ability to build data-processing applications that work natively in Hadoop. We spoke to Rohit Bakhshi, product manager at Hortonworks, about YARN and what it means for Hadoop users.
Big Data analytics startup QuantCell Research has announced the release of the first public beta of what they are positioning as their "Big Data" spreadsheet.
In his new whitepaper, Best Practices for Amazon EMR, Parviz Deyhim outlines the best practices in using AWS EMR including moving data to AWS, strategies for collecting, compressing, aggregating the data, and common architectural patterns for setting up and configuring Amazon EMR clusters for processing.
Concurrent, Inc., the enterprise Big Data application platform company, today announced Pattern, a machine learning based on an industry standard called PMML which allows analytics frameworks such as SAS, R, Microstrategy, Oracle, etc., to export predictive models and run them on Hadoop clusters
The recently released Windows Azure updates include support for Hadoop service, HTML5/JS, CORS, PhoneGap including Mercurial, Dropbox, CodePlex and Bitbucket deployment integration.
Datastax Enterprise 3.0 was announced last month with several Enterprise security features for a cluster using Cassandra, Hadoop and Solr. InfoQ caught up with Robin Schumacher, VP of Products at DataStax to learn more.
Concurrent, Inc., the enterprise Big Data application platform company, today announced Lingual, an open source project enabling fast and simple Big Data application development on Apache Hadoop using SQL.
EMC Greenplum has announced Pivotal HD, a new Hadoop distribution including a fully compliant SQL MPP database running on HDFS and being “hundreds of times faster than Hive”.
Hortonworks’ new Stinger initiative joins Apache Drill and Cloudera Impala in competition for the best real-time Hadoop implementation.
Oracle’s key-value database, known simply as “Oracle NoSQL Database” has hit version 2.0. Oracle NoSQL Database is essentially a distributed frontend for Berkeley DB, but it offers much more than that. Support for SQL queries, both absolute and eventual consistency, and the option to reduce storage space using Avro schemas sets it apart.