Snowflake Announces General Availability of their Cloud Data Warehouse Offering

Snowflake Computing has announced the general availability of their Snowflake Elastic Data Warehouse, a software as a service offering that provides a SQL data warehouse on top of Amazon Web Services.

In a post from October 2014, Curt Monash explains that the service "is built from scratch (as opposed, to for example, being based on PostgreSQL or Hadoop)" and "is columnar and append-only, as has become common for analytic RDBMS". He notes that "Data is stored in compressed 16 megabyte files on Amazon S3, and pulled into Amazon EC2 servers for query execution on an as-needed basis". Additionally, while "Snowflake has no indexes ... it does have zone maps, aka data skipping", which allow it to skip files that are not needed to answer a query.
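
The data-skipping idea can be illustrated with a minimal sketch. It assumes each file records the minimum and maximum value of a column, so a range query only reads files whose range overlaps the predicate. The `DataFile` class, the `files_to_scan` function, and the sample values are hypothetical and do not reflect Snowflake's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataFile:
    """Hypothetical stand-in for one stored file holding values of a single column."""
    name: str
    values: List[int]

    @property
    def zone(self) -> Tuple[int, int]:
        # The "zone map" entry: the smallest and largest value in this file.
        return (min(self.values), max(self.values))


def files_to_scan(files: List[DataFile], low: int, high: int) -> List[DataFile]:
    """Keep only files whose [min, max] range overlaps the query range [low, high]."""
    return [f for f in files if f.zone[0] <= high and f.zone[1] >= low]


files = [
    DataFile("f1", [1, 5, 9]),
    DataFile("f2", [10, 14, 19]),
    DataFile("f3", [20, 25, 30]),
]

# A query for values between 12 and 22 never needs to read f1.
selected = files_to_scan(files, 12, 22)
selected_names = [f.name for f in selected]
```

In this sketch the check uses only the per-file metadata, so files that are skipped never have to be fetched from storage.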

Snowflake's strengths stem from three core system features. First, Snowflake is a fully managed SaaS offering, which reduces the operational burden to near zero. While services like Amazon's Redshift have greatly reduced the burden of creating a data warehouse, there is still operational overhead in managing and scaling Redshift on an ongoing basis.

Second, Snowflake is built to support a combination of structured and semi-structured data. For instance, it can ingest data in JSON, XML, or Avro format, all of which support nested and repeated data types. This allows Snowflake to move beyond the typical data warehouse use cases and encroach on Hadoop and other semi-structured use cases.

Finally, the elasticity of the service brings a new and interesting pricing model to the data warehouse market. Pricing is based on data storage size and per-hour compute usage. If compute is not needed (say, during overnight hours), you can simply scale down the compute until it is required again. Redshift provides similar functionality using snapshot and restore, but a restore can take a significant amount of time to copy the data back to the Redshift hosts. By contrast, Snowflake can spin up much more quickly since it copies data to the hosts as needed.
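
To make the pricing model concrete, here is a rough sketch of the arithmetic, assuming a bill made up of a storage charge plus a per-hour compute charge. The rates and sizes below are invented for illustration only and are not Snowflake's actual prices.

```python
def monthly_cost(storage_tb: float, storage_rate: float,
                 compute_hours: float, compute_rate: float) -> float:
    """Storage charge plus per-hour compute charge for one month."""
    return storage_tb * storage_rate + compute_hours * compute_rate


# Hypothetical figures: 10 TB stored at $30 per TB, compute at $2 per hour.
STORAGE_TB = 10
STORAGE_RATE = 30.0
COMPUTE_RATE = 2.0

# Compute running around the clock for a 30-day month.
always_on = monthly_cost(STORAGE_TB, STORAGE_RATE, 24 * 30, COMPUTE_RATE)

# Compute scaled down overnight, running 12 hours a day.
daytime_only = monthly_cost(STORAGE_TB, STORAGE_RATE, 12 * 30, COMPUTE_RATE)

savings = always_on - daytime_only
```

Under these assumed numbers, the storage charge is unchanged and only the compute portion shrinks when the warehouse is scaled down.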

In a separate announcement, Snowflake also disclosed $45 million in new funding from Altimeter Capital, Redpoint Ventures, Sutter Hill Ventures and Wing Ventures. This builds upon their previous funding round in October 2014, when Snowflake raised a total of $26 million from Redpoint Ventures, Sutter Hill Ventures and Wing Ventures.
