Google Launches a New Cross-Platform Data Storage Engine BigLake in Preview

At its recent Data Cloud Summit, Google announced the preview of BigLake, a new data lake storage engine that makes it easier for enterprises to analyze the data in their data warehouses and data lakes.

With BigLake, users get fine-grained access controls and performance acceleration across BigQuery and multi-cloud data lakes on AWS and Azure. The service also ensures that data is uniformly accessible and secure across Google Cloud and open-source engines.

BigLake lets users extend BigQuery to multi-cloud data lakes and open formats like Parquet and ORC while maintaining fine-grained security controls, all without new infrastructure. Users can also keep a single copy of their data and apply consistent access rules across their analytics engines of choice, including Google Cloud services and open-source technologies like Spark, Presto, Trino, and TensorFlow. Finally, through the integration with Dataplex, users get unified governance and administration at scale.


Source: https://cloud.google.com/blog/products/data-analytics/unifying-data-lakes-and-data-warehouses-across-clouds-with-biglake
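As an illustration of extending BigQuery to open-format files, the following is a minimal sketch that builds the kind of DDL statement used to define a BigLake table over Parquet files in Google Cloud Storage. The statement shape follows BigQuery's `CREATE EXTERNAL TABLE ... WITH CONNECTION` syntax as we understand it; the helper name `biglake_table_ddl` and all project, dataset, connection, and bucket names are hypothetical and should be checked against the current documentation before use.

```python
from typing import Sequence


def biglake_table_ddl(
    table: str,
    connection: str,
    uris: Sequence[str],
    fmt: str = "PARQUET",
) -> str:
    """Build a CREATE EXTERNAL TABLE statement for a connection-backed table.

    All identifiers passed in are placeholders chosen for illustration.
    """
    # Quote each URI as a SQL string literal and join them into an array body.
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"WITH CONNECTION `{connection}`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uri_list}]);"
    )


# Hypothetical names: replace with a real project, dataset, connection, and bucket.
ddl = biglake_table_ddl(
    table="my_project.my_dataset.sales",
    connection="my_project.us.my_connection",
    uris=["gs://my-bucket/sales/*.parquet"],
)
print(ddl)
```

The helper only produces the SQL text; running it would still require submitting the statement through the BigQuery console or a client library with suitable permissions.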

Policy tags let users specify security on BigLake tables at the table, row, or column level. For BigLake tables defined over Google Cloud Storage, fine-grained security is enforced consistently across Google Cloud and supported open-source engines using BigLake connections. For BigLake tables defined on Amazon S3 and Azure Data Lake Storage Gen2, BigQuery Omni enforces the security restrictions, enabling regulated multi-cloud analytics. Big data enthusiast Christian Laurer pointed out the benefits in a Medium article:

It’s a big benefit that you don’t have to duplicate your data in two different environments and create data silos. You can also support your data governance because, with BigLake, you can also assign rights to the data.
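The row-level restrictions mentioned above can be sketched in the same way. BigQuery expresses row-level security through row access policies, and this example assumes the same `CREATE ROW ACCESS POLICY` statement applies to a BigLake table. The helper name `row_access_policy_ddl`, the policy name, the table, the grantee, and the filter column are all hypothetical.

```python
from typing import Sequence


def row_access_policy_ddl(
    policy: str,
    table: str,
    grantees: Sequence[str],
    filter_expr: str,
) -> str:
    """Build a CREATE ROW ACCESS POLICY statement.

    Grantees use IAM-style member strings such as 'user:someone@example.com'.
    """
    # Quote each grantee as a SQL string literal.
    grantee_list = ", ".join(f"'{g}'" for g in grantees)
    return (
        f"CREATE ROW ACCESS POLICY {policy}\n"
        f"ON `{table}`\n"
        f"GRANT TO ({grantee_list})\n"
        f"FILTER USING ({filter_expr});"
    )


# Hypothetical example: analysts in one group may only see rows for one region.
policy_sql = row_access_policy_ddl(
    policy="us_only",
    table="my_project.my_dataset.sales",
    grantees=["group:us-analysts@example.com"],
    filter_expr="region = 'US'",
)
print(policy_sql)
```

Because the policy is attached to the table rather than to a copy of the data, it fits the single-copy approach described in the quote above.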

Google is not the only cloud provider offering a lakehouse product, which combines a data lake with a data warehouse. Databricks pioneered the space with its Delta Lake offering. Others include AWS with its own solution and companies in the open data ecosystem, such as Dremio and Starburst.

Tony Baer, principal at dbInsight LLC, stated in a tweet:

Google #BigLake and related announcements hit a core theme I've been hammering at this year: cloud providers need connective tissue to take burden of integration off the shoulders of their customers.

More details on BigLake are available in the documentation, and pricing details can be found on the pricing page.
