AWS Releases Fully-Managed Data Lake for CloudTrail Logs

AWS announced the release of CloudTrail Lake, a fully-managed data lake for storing and analyzing CloudTrail logs. CloudTrail Lake can aggregate logs across regions and accounts. Once in the lake, the logs can be queried using SQL syntax.

CloudTrail Lake provides a single location for aggregating logs from multiple regions and multiple accounts. Multiple lakes can be created to isolate logs by region if desired. Multi-account support is only available for accounts under the same AWS Organization. This improves upon the query functionality in CloudTrail event history, which can only pull from a single region within a single account and cannot query on multiple attributes.

The logs are stored immutably with a default retention period of seven years. The retention period is adjustable anywhere between seven days and seven years. It is currently possible to collect both management events and data events. Management events include all control plane operations such as configuring security, registering devices, configuring rules for routing data, and setting up logging. Data events are not logged by CloudTrail by default and cover data plane operations such as S3 API activity on buckets and objects, Lambda function execution activity, or DynamoDB object-level API activity.
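The retention and aggregation options above map onto parameters of the CloudTrail `create_event_data_store` API. The following is a minimal sketch that assembles and validates those parameters before any call is made. The helper name `build_event_data_store_request` and the store name are hypothetical, and the upper bound of 2557 days is an assumption about how "seven years" translates to days; check the current API reference before relying on it.

```python
from typing import Any, Dict

MIN_RETENTION_DAYS = 7
# Assumption: seven years expressed in days, allowing for two leap days.
MAX_RETENTION_DAYS = 7 * 365 + 2  # 2557


def build_event_data_store_request(
    name: str,
    retention_days: int = MAX_RETENTION_DAYS,
    multi_region: bool = True,
    organization: bool = False,
) -> Dict[str, Any]:
    """Return keyword arguments for a create_event_data_store call.

    Hypothetical helper: it only builds and validates the request,
    it does not contact AWS.
    """
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention_days must be between {MIN_RETENTION_DAYS} "
            f"and {MAX_RETENTION_DAYS}, got {retention_days}"
        )
    return {
        "Name": name,
        "RetentionPeriod": retention_days,
        # Collect events from all regions rather than only the current one.
        "MultiRegionEnabled": multi_region,
        # Collect events from every account in the AWS Organization.
        "OrganizationEnabled": organization,
    }
```

With boto3 installed and credentials configured, the result could be passed along as `boto3.client("cloudtrail").create_event_data_store(**request)`.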

Once stored within the lake, the logs can be queried using standard SQL syntax. As the data lake is immutable, only SELECT queries are permitted. The service includes a number of sample queries that can be used as templates. For example, to show all the recorded API activity for a specific IAM access key, you could use the following sample query:

SELECT eventTime, eventName, userIdentity.principalId
FROM <event-data-store-ID>
WHERE userIdentity.accessKeyId like 'AKIAXZUQIC6XEVCJJFM7'

Note that the event data store ID is used as the table name within the query. Another sample query will show any security group changes after a certain time:

SELECT eventname, useridentity.username, sourceIPAddress, eventtime, element_at(requestParameters, 'groupId') as SecurityGroup, element_at(requestParameters, 'ipPermissions') as ipPermissions
FROM <event-data-store-ID>
WHERE (element_at(requestParameters, 'groupId') like '%sg-%')
and eventtime > '2017-11-01T00:00:00Z'
order by eventtime asc;
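Because the event data store ID serves as the table name, a query string usually has to be assembled per store. The sketch below shows one way to do that for the access key query while rejecting unexpected input. The function name `access_key_activity_query` is hypothetical, and the validation patterns rest on two assumptions: that event data store IDs are UUID-shaped (36 hexadecimal characters and hyphens) and that access key IDs consist of uppercase letters and digits.

```python
import re

# Assumption: event data store IDs are UUID-shaped.
_EDS_ID_PATTERN = re.compile(r"[0-9a-fA-F-]{36}")
# Assumption: access key IDs contain only uppercase letters and digits.
_ACCESS_KEY_PATTERN = re.compile(r"[A-Z0-9]+")


def access_key_activity_query(event_data_store_id: str, access_key_id: str) -> str:
    """Build the sample query listing API activity for one access key."""
    # Both values are interpolated into the SQL text, so validate them first.
    if not _EDS_ID_PATTERN.fullmatch(event_data_store_id):
        raise ValueError("unexpected event data store ID format")
    if not _ACCESS_KEY_PATTERN.fullmatch(access_key_id):
        raise ValueError("unexpected access key ID format")
    return (
        "SELECT eventTime, eventName, userIdentity.principalId "
        f"FROM {event_data_store_id} "
        f"WHERE userIdentity.accessKeyId like '{access_key_id}'"
    )
```

Validating before interpolation is a deliberate choice here: the values end up inside the statement text itself, so anything that does not match the expected shape is refused rather than escaped.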

The query editor lists all the available event properties that can be queried. Within the console, queries can be saved for future access. A log of recent queries is also available for review.

The release also includes a number of CLI commands for creating, querying, and working with CloudTrail Lake. For example, the command aws cloudtrail list-event-data-stores will show all event data stores within a given account. Queries can be started via the CLI as well using aws cloudtrail start-query. The status of a query can be obtained via describe-query, and if the run was successful, the results can be retrieved from get-query-results.
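The start, describe, and get-results sequence has direct counterparts in the SDK. The following sketch polls a query until it finishes and then collects all result pages. It assumes a boto3-style CloudTrail client is passed in (for example `boto3.client("cloudtrail")`), and it assumes `describe_query` and `get_query_results` accept an `EventDataStore` argument, as they did at release. The function name `run_lake_query`, the polling interval, and the poll limit are illustrative choices, not part of the service.

```python
import time
from typing import Any, Dict, List

# Statuses after which a query will not produce results.
TERMINAL_FAILURES = {"FAILED", "CANCELLED", "TIMED_OUT"}


def run_lake_query(
    client: Any,
    event_data_store: str,
    statement: str,
    poll_seconds: float = 2.0,
    max_polls: int = 150,
) -> List[Any]:
    """Start a CloudTrail Lake query, wait for it, and return all result rows."""
    query_id = client.start_query(QueryStatement=statement)["QueryId"]

    for _ in range(max_polls):
        status = client.describe_query(
            EventDataStore=event_data_store, QueryId=query_id
        )["QueryStatus"]
        if status == "FINISHED":
            break
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    else:
        # The loop ran out of polls without seeing a terminal status.
        raise TimeoutError(f"query {query_id} did not finish in time")

    rows: List[Any] = []
    kwargs: Dict[str, Any] = {"EventDataStore": event_data_store, "QueryId": query_id}
    while True:
        page = client.get_query_results(**kwargs)
        rows.extend(page.get("QueryResultRows", []))
        token = page.get("NextToken")
        if not token:
            return rows
        # Request the next page of results.
        kwargs["NextToken"] = token
```

Passing the client in as a parameter keeps the function independent of credentials and makes it straightforward to exercise with a stub.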

CloudTrail Lake is available within most regions in AWS and can be enabled via the CloudTrail console, by SDK, or via the AWS CLI. More details on using CloudTrail Lake can be found in the documentation.
