Hadoop Content on InfoQ
-
Big Data Hadoop Solutions, State of Affairs in Q1/2014
According to a new Forrester report, Hadoop’s momentum is unstoppable. Its enterprise usage keeps growing because it offers companies new ways to store, process, analyze, and share big data. The report reviews Hadoop vendors and ranks them.
-
Spark Officially Graduates From Apache Incubator
Spark recently graduated from the Apache Incubator. It claims up to 100x speed improvements over Apache Hadoop on in-memory datasets, falling back gracefully to a 10x improvement for on-disk workloads. Built on Scala, it can run SQL queries and be used directly in R. It also provides machine learning and graph capabilities, among other features discussed in the article.
-
Elasticsearch 1.0.0 released
Elasticsearch released version 1.0.0 of its self-titled, open-source analytics tool. Elasticsearch is a distributed search engine which allows for real-time data analysis in big-data environments. The new version comes with various functional enhancements and changes to the API to make Elasticsearch more intuitive and powerful to use.
-
Google Improves Hadoop Performance with New Cloud Storage Connector
With a new connector, Hadoop can now run directly against Google Cloud Storage instead of its default distributed file system. This results in lower storage costs, less data replication, and a simpler overall process.
-
New Education Opportunities for Data Scientists
2013 has been rich in announcements for new programs, degrees and grants for aspiring data scientists and Big Data practitioners.
-
Hadoop-as-a-Service Provider Qubole Now Runs on Google Compute Engine
Qubole, a managed Hadoop-as-a-Service offering, is now available on Google Compute Engine (GCE). Until now, Qubole was available only on Amazon's AWS. The announcement comes just a few days after Google released GCE into general availability.
-
Hadoop Jobs on GPU with ParallelX
The MapReduce paradigm is not always ideal for large, computationally intensive algorithms. A small team of entrepreneurs is building a product called ParallelX to address that bottleneck by harnessing the power of GPUs to give Hadoop jobs a significant boost.
-
A Survey and Interview on How Hadoop Is Used Today
This post presents the results of a Hortonworks survey of over 500 Hadoop Summit 2013 attendees on how they use Hadoop, and an interview with David McJannet on Hadoop trends today.
-
Open Source SQL-in-Hadoop Solutions: Where Are We?
With Facebook recently releasing Presto as open source, the already crowded SQL-in-Hadoop market just became a tad more intricate. A number of open source tools are competing for the attention of developers: Hortonworks' Stinger initiative around Hive, Apache Drill, Apache Tajo, Cloudera’s Impala, Salesforce’s Phoenix (for HBase) and now Facebook’s Presto.
-
A Few Highlights from QConSF 2013 - Part 1 of 2
Each day of the 3-day conference at the Hyatt offered a jam-packed schedule of speakers, exhibits and activities, which made for some difficult decisions about which tracks and events to attend.
-
YARN Brings New Capabilities To Hadoop
Hadoop 2 is now generally available, with YARN bringing the ability to build data-processing applications that run natively in Hadoop. We spoke to Rohit Bakhshi, product manager at Hortonworks, about YARN and what it means for Hadoop users.
-
QuantCell Research Announces First Public Beta of their Java-Aware Big-Data Spreadsheet
Big Data analytics startup QuantCell Research has announced the release of the first public beta of what they are positioning as their "Big Data" spreadsheet.
-
Concurrent Releases Pattern, a Machine Learning DSL for Hadoop
Concurrent, Inc., the enterprise Big Data application platform company, today announced Pattern, a machine learning DSL based on an industry standard called PMML, which allows analytics frameworks such as SAS, R, Microstrategy, Oracle, etc., to export predictive models and run them on Hadoop clusters.
-
Windows Azure Updated with Hadoop, HTML5/JS, CORS, PhoneGap, Mercurial and Dropbox
The recently released Windows Azure updates include support for a Hadoop service, HTML5/JS, CORS, and PhoneGap, as well as deployment integration with Mercurial, Dropbox, CodePlex and Bitbucket.
-
Greenplum Pivotal HD Combines the Strengths of SQL and Hadoop
EMC Greenplum has announced Pivotal HD, a new Hadoop distribution that includes a fully compliant SQL MPP database running on HDFS, which the company describes as “hundreds of times faster than Hive”.