Amazon Q Data Integration in AWS Glue Simplifies Data Transformation on AWS

Recently, AWS announced the preview of a new feature for AWS Glue, enabling customers to use natural language for authoring and troubleshooting data integration jobs. With Amazon Q data integration in AWS Glue, developers can provide a description of their data integration workload, and the service will generate an ETL script.

Powered by Amazon Bedrock, the managed service for generative AI, the new chat experience for AWS Glue brings natural language processing to ETL, with the goal of simplifying the authoring and troubleshooting of data integration jobs. Irshad Buchh, principal advisor at AWS, explains:

You can describe your data integration workload and Amazon Q will generate a complete ETL script. You can troubleshoot your jobs by asking Amazon Q to explain errors and propose solutions. Amazon Q provides detailed guidance throughout the entire data integration workflow. Amazon Q helps you learn and build data integration jobs using AWS Glue. Amazon Q can help you connect to common AWS sources such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon DynamoDB.

Introduced in preview at re:Invent, Amazon Q serves as a generative AI-powered assistant to help developers and customers "solve problems, generate content, and take action." Designed to simplify ETL pipeline development, AWS Glue is a serverless data integration service that facilitates the discovery, preparation, and aggregation of data for analytics, machine learning, and application development.

The recent announcement highlights that a prompt like:

Write a Glue ETL job that reads from Redshift, 
drops null fields, and writes to S3 as parquet files.

returns a description of the necessary steps together with generated Python code, as illustrated in the following screenshot. The code can then be customized and copied into the script editor or a notebook.

Source: AWS console
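
For readers without access to the preview, a script for a prompt like the one above might look roughly like the following. This is a hand-written sketch using the public AWS Glue PySpark API, not the output Amazon Q actually produced. The cluster URL, table name, bucket paths, IAM role, and the `run_job` wrapper are all hypothetical placeholders.

```python
# Hypothetical placeholders: none of these values come from the article.
REDSHIFT_OPTIONS = {
    "url": "jdbc:redshift://example-cluster:5439/dev",
    "dbtable": "public.example_table",
    "redshiftTmpDir": "s3://example-bucket/tmp/",
    "aws_iam_role": "arn:aws:iam::123456789012:role/example-role",
}
S3_OUTPUT_PATH = "s3://example-bucket/output/"


def run_job(argv):
    """Read from Redshift, drop null fields, write Parquet files to S3."""
    # Imported lazily: the awsglue modules are only available inside
    # the AWS Glue job runtime, not in a plain Python installation.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import DropNullFields
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Extract: load the Redshift table into a DynamicFrame.
    source = glue_context.create_dynamic_frame.from_options(
        connection_type="redshift",
        connection_options=REDSHIFT_OPTIONS,
    )

    # Transform: drop fields whose values are null in every record.
    cleaned = DropNullFields.apply(frame=source)

    # Load: write the result to S3 in Parquet format.
    glue_context.write_dynamic_frame.from_options(
        frame=cleaned,
        connection_type="s3",
        connection_options={"path": S3_OUTPUT_PATH},
        format="parquet",
    )
    job.commit()
```

Inside a Glue job, the script would end with a call such as `run_job(sys.argv)` (after `import sys`); the call is left out here so the sketch can be loaded outside the Glue runtime.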

In recent years, the process of extract, transform, and load (ETL) has gained importance as a way to manage structured and unstructured data from various sources, including marketing, customer, and sensor data, improving business intelligence and analytics. Bala Balakumar, director at Waka Online NZ, comments:

Amazon Q is so powerful in terms of working across disparate data sources. Now with AWS Glue integration of Amazon Q together with Zero ETL, we have data insights at our fingertips!

Amazon Q Data Integration is not the only recent enhancement for AWS Glue: Glue Data Catalog supports the creation, management, and access control of multi-engine SQL views, and Glue Data Quality provides anomaly detection and insights and tracks how data changes over time. Finally, just before re:Invent, the cloud provider added the Glue serverless Spark UI and observability metrics.

Amazon Q Data Integration is available in every region where the AI-powered assistant is currently supported. As per the documentation, data is transmitted to and stored in an AWS Region in the US regardless of where customers use the generative AI–powered assistant.