Google's BigQuery Gaining Momentum
BigQuery, Google's hosted (SaaS) query service, appears to be gaining momentum. It lets developers query large-scale columnar data structures in the cloud. Developers can load data into BigQuery by uploading it to Google Cloud Storage (the equivalent of Amazon's S3) or by streaming it into the platform, and then run OLAP-style queries using a SQL-like query language.
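To give a feel for what an OLAP-style query looks like, here is a minimal sketch in Python. It does not talk to BigQuery: it uses the standard library's sqlite3 module as a local stand-in, and the table name `pageviews`, its columns, and the sample rows are all hypothetical. The point is only the shape of the query, which groups rows and aggregates over them.

```python
import sqlite3


def run_olap_style_query(rows):
    """Aggregate page views per country with a GROUP BY query.

    sqlite3 is only a local stand-in here; the table and column
    names are hypothetical and not taken from BigQuery.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pageviews (country TEXT, views INTEGER)")
    conn.executemany("INSERT INTO pageviews VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT country, COUNT(*) AS visits, SUM(views) AS total_views "
        "FROM pageviews "
        "GROUP BY country "
        "ORDER BY total_views DESC"
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (country, page views in one visit)
sample_rows = [("AU", 3), ("US", 5), ("AU", 4), ("DE", 2), ("US", 1)]
summary = run_olap_style_query(sample_rows)
for country, visits, total_views in summary:
    print(country, visits, total_views)
```

A query of this kind touches only the columns it names, which is one reason columnar storage tends to suit aggregate workloads.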
Practitioners are increasingly sharing their hands-on experience with BigQuery. For example, Graham Polley of Shine Technologies reported:
We decided to run our own tests, laying down the gauntlet to BQ and using a data set with 1.5 billion rows. Things were about to get very interesting – could Google’s sales pitch of "being able to interactively analyse massive data sets with billions of rows" really do what it said it could? It could and we were impressed. In fact, very impressed. Even when not using cached results (cached results can be toggled on and off) we experienced consistent results in the 20-25 second range for grinding through our massive data set of 1.5 billion rows using relatively complex queries to aggregate the data.
BigQuery can be used stand-alone, but it also integrates with other services, such as Google Apps Script or Google Analytics. Concerning the latter, Jonathan Weber (Data Evangelist at LunaMetrics) wrote an informative piece in which he states:
First, BigQuery export is available only for Google Analytics Premium customers. You can have the BigQuery export turned on through your Premium account manager. Note that there are costs for both data storage and processing in BigQuery, but GA Premium users get a $500/month credit to use toward those charges. In many cases, that $500 will take you a long way. For reference, I took a look at one of our Premium customers using BigQuery. Their site has about 6M visits and 50M pageviews per month. Data has been exporting since September, and this month their storage charges will be about $12.86.
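To put those figures in proportion, the following sketch works through the arithmetic using only the two numbers quoted above: the $500 monthly credit and the $12.86 storage charge. Query processing charges are not given in the quote, so they are left out here; the function name `remaining_credit` is purely illustrative.

```python
MONTHLY_CREDIT_USD = 500.00   # GA Premium credit, from the quote
STORAGE_CHARGE_USD = 12.86    # example customer's storage charge, from the quote


def remaining_credit(credit, *charges):
    """Return the credit left after subtracting the given charges."""
    return round(credit - sum(charges), 2)


# Processing charges are unknown, so only storage is subtracted.
left = remaining_credit(MONTHLY_CREDIT_USD, STORAGE_CHARGE_USD)
share = STORAGE_CHARGE_USD / MONTHLY_CREDIT_USD
print(f"{left:.2f} USD left; storage uses {share:.1%} of the credit")
```

On these numbers, storage consumes under 3% of the monthly credit, leaving the rest for query processing.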
While BigQuery is available only as a cloud-based service, Dremel, the underlying technology that powers it, has inspired many open-source SQL-on-Hadoop solutions such as Apache Drill and Impala.