Recap of AWS re:Invent 2023: Amazon Q, Frugal Architectures, Database Upgrades

The 12th edition of re:Invent has just ended in Las Vegas. As expected, artificial intelligence was a key topic of the conference, with Amazon Bedrock and Amazon Q, a new type of generative AI-powered assistant, the main focus of Adam Selipsky's keynote.

While most of the announcements targeted new technologies, support for IBM Db2 on RDS shows that the cloud provider still pays attention to the lift and shift of legacy and enterprise deployments. In a keynote focused on deployment costs as a non-functional requirement, Werner Vogels promoted frugality, encouraged innovation, and stressed:

The most dangerous phrase in the English language is: "We have always done it this way."

Below is a review of the main announcements impacting computing, database, storage, machine learning, cost optimization, and development.


Computing

As for EC2, a key update was the preview for memory-optimized R8g instances powered by the fourth-generation Graviton processor. The Graviton4 instances will be equipped with up to 96 Neoverse V2 cores, 2 MB of L2 cache per core, and 12 DDR5-5600 channels. In an unusual reference to competitors, Selipsky said:

We are now on our fourth generation in just five years. Other cloud providers have not even delivered on their first server processors yet.

EC2 high memory U7i instances are now in preview: powered by custom fourth-generation Intel Xeon Scalable Processors, they provide 896 vCPUs and from 16 to 32 TB of memory. They support up to 128 EBS volumes and 25 Gbps per network flow. To ease the challenge of selecting from over 700 instances, Compute Optimizer introduces customizable rightsizing recommendations, including a 32-day lookback option.

Lambda can now scale 12 times faster when handling high-volume requests. External endpoints and testing of task states are now available in Step Functions, the visual workflow service to build distributed applications. HTTPS endpoints use EventBridge connections to manage the authentication credentials for the target. Marcia Villalba, principal developer advocate at AWS, writes:

Using Step Functions HTTPS endpoints, you can directly integrate with popular payment platforms while ensuring that your users' credit cards are only charged once and errors are handled automatically.
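
As a rough illustration of the HTTPS endpoint integration described above, the following sketch builds a state machine definition in Python and prints it as JSON. The `arn:aws:states:::http:invoke` resource and the `ApiEndpoint`, `Method`, and `Authentication.ConnectionArn` fields follow the Step Functions HTTP Task format; the connection ARN, the payment URL, the `Idempotency-Key` header, and the retry settings are hypothetical values chosen for the example, not something taken from the announcement.

```python
import json

# Hypothetical placeholders: neither the connection nor the endpoint exists.
CONNECTION_ARN = "arn:aws:events:us-east-1:123456789012:connection/payments/example"
PAYMENT_ENDPOINT = "https://api.example.com/v1/charges"


def build_definition(connection_arn: str, endpoint: str) -> dict:
    """Return a one-state Amazon States Language definition that calls an HTTPS API."""
    charge_state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::http:invoke",
        "Parameters": {
            "ApiEndpoint": endpoint,
            "Method": "POST",
            # Credentials are read from the EventBridge connection, not the workflow.
            "Authentication": {"ConnectionArn": connection_arn},
            # Assumption: the payment API deduplicates on this header, so a
            # retried call reuses the execution name and does not charge twice.
            "Headers": {"Idempotency-Key.$": "$$.Execution.Name"},
            "RequestBody": {"amount.$": "$.amount", "currency.$": "$.currency"},
        },
        # Retry failed calls with exponential backoff (illustrative values).
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }
        ],
        "End": True,
    }
    return {"StartAt": "ChargeCard", "States": {"ChargeCard": charge_state}}


if __name__ == "__main__":
    print(json.dumps(build_definition(CONNECTION_ARN, PAYMENT_ENDPOINT), indent=2))
```

Whether a card is charged only once still depends on the payment platform honoring an idempotency key; the retry block alone only handles the error side.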

Generative AI and Machine Learning

Amazon Q, a new type of assistant powered by generative artificial intelligence, was the main announcement of Selipsky's keynote. Currently in preview, Q brings assistance to different AWS services to simplify cloud development: network troubleshooting assistance, conversational Q&A capability, optimization of EC2 instance selection, and troubleshooting errors in the console are some examples of Q's promised capabilities.

The community feedback has been mixed, and not everyone is convinced by the answers in the console, with Ben Kehoe highlighting that Q's suggestions about IAM users are what AWS wants customers to stop doing. The pricing for developers is a concern as well, but many of the capabilities are available without any charge during the preview.

Redshift announced Q generative SQL in Redshift Query Editor, while feature development capability in CodeCatalyst and the integration of Q with QuickSight are supposed to convert a human prompt to an actionable plan and provide quicker data insights. A more detailed review of Amazon Q has been published this week on InfoQ.

There were also announcements for SageMaker: SageMaker Canvas, the visual interface to build machine learning models, now supports foundation models from Bedrock and SageMaker JumpStart. SageMaker Clarify can now assist in evaluating and selecting foundation models based on accuracy, robustness, factual knowledge, bias, and toxicity.

LLMs have recently experienced remarkable growth across various applications. To address the new market, SageMaker launched a new version of Large Model Inference DLC with TensorRT-LLM support. Furthermore, the machine learning platform added new inference capabilities to deploy one or more FMs on the same endpoint: users can now control the allocation of accelerators and memory for each model. Jeremy Daly, CEO and founder of Ampt, notes:

Amazing to hear Werner point out the fact that LLMs (aka new AI) are NOT the AI silver bullet. We have lots of old AI tools (e.g. machine learning and deep learning) that are still invaluable!

Two new Titan multimodal foundation models were announced by Swami Sivasubramanian: the preview of Titan Image Generator and Titan Multimodal Embeddings. Furthermore, Titan Text Lite and Titan Text Express are generally available. To simplify Bedrock usage, the cloud provider introduced automatic evaluation and human evaluation options for foundation models.

Bedrock now provides access to Anthropic's latest model, Claude 2.1, providing a 200K token context window, reduced rates of hallucination, and improved accuracy over long documents. However, users noticed that the prices for Claude significantly increased during the conference. Announced in preview earlier this year, Agents for Amazon Bedrock now has improved orchestration control and better visibility into the chain-of-thought reasoning. There is also the option to customize models and build applications for specific domains using fine-tuning and continued pre-training.

One of the early announcements of the conference was the ability to build generative AI apps using Step Functions and Bedrock, with two new API actions: InvokeModel, to run inferences for text, image, and embedding models, and CreateModelCustomizationJob, an integration that creates a job to customize a base model.
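
A minimal sketch of what an InvokeModel task might look like in a workflow definition, again built as a Python dictionary. The `arn:aws:states:::bedrock:invokeModel` resource with `ModelId` and `Body` parameters reflects the Step Functions integration for Bedrock; the model ID, the Titan-style `inputText` body, the generation settings, and the state name are assumptions made for the example.

```python
import json

# Assumption: a Titan text model is enabled in the account and Region.
MODEL_ID = "amazon.titan-text-express-v1"


def build_summarize_workflow(model_id: str) -> dict:
    """Return a one-state workflow that sends the input prompt to a Bedrock model."""
    invoke_state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": model_id,
            # The body shape is model-specific; this one follows Titan text models.
            "Body": {
                "inputText.$": "$.prompt",
                "textGenerationConfig": {"maxTokenCount": 512, "temperature": 0},
            },
        },
        "End": True,
    }
    return {"StartAt": "Summarize", "States": {"Summarize": invoke_state}}


if __name__ == "__main__":
    print(json.dumps(build_summarize_workflow(MODEL_ID), indent=2))
```

An execution started with an input such as `{"prompt": "Summarize this release note ..."}` would pass the prompt to the model through the `$.prompt` path.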

Glue Data Quality is now in preview: adding anomaly detection and insights, the feature lets customers detect anomalies automatically and track how data changes over time.


Storage

The primary storage announcement was an S3 storage class for processing data in AI/ML training and financial modeling: the S3 Express One Zone is a high-performance, single-AZ storage class that delivers single-digit millisecond data access, improving data access speeds and reducing request costs. Yan Cui, AWS Serverless Hero, warns:

One thing a lot of people have missed about the new S3 express one-zone class is that, although it's 50% cheaper for requests, it's also ~7x more expensive for storage. This doesn't make it less appealing. It just means there are clear trade-offs you need to consider.
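
The trade-off Cui describes can be made concrete with a little arithmetic. The sketch below takes only the two ratios from the quote (requests at half the price, storage at roughly seven times the price); the baseline prices are illustrative assumptions, not quoted figures, and real pricing varies by Region and request type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    storage_per_gb_month: float  # USD per GB stored per month
    per_1k_requests: float       # USD per 1,000 requests


def monthly_cost(pricing: Pricing, stored_gb: float, requests: float) -> float:
    """Storage cost plus request cost for one month."""
    return stored_gb * pricing.storage_per_gb_month + requests / 1000 * pricing.per_1k_requests


def break_even_requests_per_gb(standard: Pricing, express: Pricing) -> float:
    """Monthly requests per stored GB above which the express class is cheaper."""
    extra_storage = express.storage_per_gb_month - standard.storage_per_gb_month
    saved_per_request = (standard.per_1k_requests - express.per_1k_requests) / 1000
    return extra_storage / saved_per_request


# Assumed baseline for illustration; only the 7x and 0.5x ratios come from the quote.
STANDARD = Pricing(storage_per_gb_month=0.023, per_1k_requests=0.005)
EXPRESS = Pricing(
    storage_per_gb_month=STANDARD.storage_per_gb_month * 7,
    per_1k_requests=STANDARD.per_1k_requests * 0.5,
)

if __name__ == "__main__":
    threshold = break_even_requests_per_gb(STANDARD, EXPRESS)
    print(f"Break-even: about {threshold:,.0f} requests per GB per month")
```

Under these assumed prices, the express class only pays off for data that is requested tens of thousands of times per stored GB each month, which matches the request-heavy workloads it targets.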

In separate announcements, Mountpoint for Amazon S3 and EMR now support S3 Express One Zone. To help manage data lake permissions, S3 Access Grants integrates with identity providers such as Active Directory, or with IAM principals, to map identities in directories to datasets in S3.

A new storage class for Elastic File System has been added for long-lived data, with EFS Archive offering similar performance to EFS Infrequent Access at a lower price. Failback support for EFS replication allows changes to be synchronized between EFS file systems when performing DR workflows. Finally, EFS now supports up to 250K read IOPS and 50K write IOPS per file system.

Among other announcements targeting enterprise deployments, FlexGroup volume management for FSx for NetApp ONTAP is now available, as well as FSx for ONTAP scale-out file systems. FSx for ONTAP handles Multi-AZ file systems in Shared VPC participant accounts and FSx for OpenZFS has support for on-demand data replication across file systems.

AWS Backup added support for restore testing, a functionality to perform automated and periodic restore tests. EBS Snapshots Archive is now available with Backup, supporting the transition of infrequently accessed EBS snapshots to lower-cost archive storage.


Databases

Compatible with Redis and Memcached, ElastiCache Serverless is a new option that "instantly" scales capacity based on application traffic patterns. According to the cloud provider, the service scales without downtime by allowing the cache to scale up and initiating a scale-out in parallel to meet capacity needs. The product received mixed feedback, with Corey Quinn, chief cloud economist at The Duckbill Group, being among those questioning the minimum metered data storage, the scaling algorithm, and the price of about 90 USD per GB per month. He concludes:

You really, really do not want to use this as your production datastore.

The first announcement of Peter DeSantis' keynote, Aurora Limitless Database, is a feature of Aurora supporting automated horizontal scaling to process millions of write transactions per second. Shards are PostgreSQL DB instances, allowing for parallel processing and achieving higher write throughput. Aurora Limitless Database is currently in private preview and only supports PostgreSQL.

In a surprising legacy database commitment, RDS now supports Db2, both Standard Edition and Advanced Edition running version 11.5.

Multiple "zero-ETL integrations" were announced during the conference, including the preview of OpenSearch Service integration with S3, to query operational logs. DynamoDB integration with OpenSearch Service is now available: relying on OpenSearch Ingestion to synchronize the data, the feature lets developers perform full-text search, fuzzy search, auto-complete, and vector search on DynamoDB data.

Redshift Serverless introduced a preview of AI-driven scaling and optimizations, with the cloud data warehouse learning workload patterns and adapting capacity according to data volume changes, concurrent users, and query complexity.

SQS FIFO queues now support up to 70K transactions per second and dead letter queue redrive to handle messages that are not consumed after a specific number of retries.
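
For readers unfamiliar with how the retry threshold is expressed, here is a small sketch of the queue attributes involved. `FifoQueue` and `RedrivePolicy` (with `deadLetterTargetArn` and `maxReceiveCount`) are standard SQS queue attributes; the ARN and the threshold of five receives are placeholders for the example.

```python
import json

# Hypothetical dead-letter queue; a FIFO source queue needs a FIFO dead-letter queue.
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:orders-dlq.fifo"


def build_fifo_queue_attributes(dlq_arn: str, max_receive_count: int) -> dict:
    """Return attributes for a FIFO queue that dead-letters repeatedly failed messages."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    if not dlq_arn.endswith(".fifo"):
        raise ValueError("a FIFO queue requires a FIFO dead-letter queue")
    return {
        "FifoQueue": "true",
        # SQS expects the redrive policy as a JSON-encoded string.
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
        ),
    }


if __name__ == "__main__":
    print(build_fifo_queue_attributes(DLQ_ARN, max_receive_count=5))
```

A dictionary like this could be passed as the `Attributes` argument when creating the queue; moving messages back out of the dead-letter queue is a separate step, performed from the console or with the SQS StartMessageMoveTask API.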

To address the popularity of vector embeddings, AWS announced the general availability of vector search for DocumentDB and vector engine for OpenSearch Serverless. Channy Yun, principal developer advocate at AWS, writes:

You can now store, update, and search billions of vector embeddings with thousands of dimensions in milliseconds. The highly performant similarity search capability of vector engine enables generative AI-powered applications to deliver accurate and reliable results with consistent milliseconds-scale response times.
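
To illustrate what a similarity search over embeddings computes, the following is a brute-force sketch in plain Python using cosine similarity. It is not how the managed vector engines work internally (they rely on approximate indexes to reach the scale described above); the document names and the tiny two-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Score every stored vector against the query and return the k best matches."""
    scored = [(cosine_similarity(query, vector), key) for key, vector in index.items()]
    scored.sort(reverse=True)
    return [(key, score) for score, key in scored[:k]]


if __name__ == "__main__":
    # Made-up embeddings; real ones have hundreds or thousands of dimensions.
    documents = {"doc-a": [1.0, 0.0], "doc-b": [0.0, 1.0], "doc-c": [1.0, 1.0]}
    for name, score in top_k([1.0, 0.1], documents, k=2):
        print(name, round(score, 3))
```

The linear scan costs one comparison per stored vector, which is why engines that serve billions of embeddings trade exactness for approximate nearest-neighbor indexes.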

Neptune Analytics is an analytics database for data scientists and application developers who need to analyze large amounts of graph data.

Business applications

The WorkSpaces Thin Client is a small device that provides secure access to virtual desktops on AWS. The client supports WorkSpaces, WorkSpaces Web, and AppStream 2.0, with different options for managing user identities and credentials using Active Directory. Jonathan Rau, VP and distinguished engineer at Query, comments:

Only works for main regions but it does work with AppStream too. I do not know many folks doing permanent DaaS/VDI anymore but it is cool to see some innovation still happening at EUC.

Announced in preview last summer, HealthScribe, a HIPAA-eligible service to generate clinical documentation, is now generally available. Targeting manufacturing, retail, and healthcare workloads, B2B Data Interchange is a managed service to automate the transformation of Electronic Data Interchange (EDI) documents into common data representations such as JSON and XML.

Vision system data is a capability in preview for IoT FleetWise, the service that lets automotive companies collect, organize, and process vehicle data.

Monitoring and Security

CloudWatch introduced natural language query generation for logs and metrics, allowing developers to generate new queries from a description and refine existing ones. The monitoring platform added a log class called Infrequent Access that halves the ingestion price, reducing monitoring costs for deployments that do not need advanced features like Live Tail, alarming, or data protection. CloudWatch Logs now provides automated pattern analytics and anomaly detection: discussing observability improvements, Danilo Poccia, chief evangelist (EMEA) at AWS, explains:

When looking at the results of a log query, you can use patterns to help you find the needle in the haystack. You can also quickly compare the results with a previous period to see what changed and find if there is a new pattern that was not there yesterday or last week.

CloudWatch Application Signals (in preview) automatically correlates telemetry across metrics, traces, logs, real user monitoring, and synthetic monitoring for automatic instrumentation of applications. Finally, CloudWatch added the option to create and manage external sources to consolidate hybrid, multi-cloud, and on-premises metrics. Jeff Barr, vice president and chief evangelist at AWS, writes:

When I first heard about this new feature, I thought, "Wait, I can do that with PutMetricData, what is the big deal?" Quite a bit, as it turns out. PutMetricData stores the metrics in CloudWatch, but this cool new feature fetches them on demand, directly from the source.

While Control Tower added 65 controls to help customers meet digital sovereignty requirements, new data visualizations, filtering, and customization enhancements are available in the Security Hub dashboard, allowing customers to more easily focus on risks, and reducing the need to export data to BI tooling. Inspector introduced new open-source plugins and an API to assess container images for software vulnerabilities at build time.

EC2 Runtime Monitoring is a GuardDuty feature that improves threat detection, adding visibility into on-host operating system–level activities and providing container-level context into detected threats.

Application Load Balancer now supports Automatic Target Weights (ATW), which uses a new routing algorithm to optimize traffic sent to each target based on health information, like 5XX errors and connection errors. With mutual authentication for Application Load Balancer, customers can now offload to the load balancer the authentication of clients that present X.509 certificates. Aaron Walker, technology director at base2Service, comments:

Been waiting for this one for a VERY long time. It's time to delete more undifferentiated heavy lifting.

The general availability of logs in the AWS Distro for OpenTelemetry was unveiled a few days before the conference.

Architecture, Coding, and Productivity Tools

One of the last announcements at the conference was myApplications, a new application-centric option in the console to manage and monitor applications running on AWS. It acts on specific resources in the relevant services, such as CloudWatch for application performance or Security Hub for security findings. Luc van Donkersgoed, principal engineer at PostNL, comments:

I think myApplications is interesting, and it's a good idea for many teams to enable it. Understanding your application (cost, behavior, incidents) is essential for any production workload, and myApplications is lowering the barrier to achieving it.

During the week before re:Invent, AWS announced the preview of CodeWhisperer for the command line, currently available on macOS only, and during the conference added AI-powered code remediation, IaC support, and integration with Visual Studio. Irshad Buchh, principal solutions architect at AWS, explains:

Since its launch, Amazon CodeWhisperer has identified hard-to-find security vulnerabilities with built-in security scans. It now provides generative AI-powered code suggestions to help remediate identified security and code quality issues.

The preview of the generative AI capability Console-to-Code helps move from prototyping in the console to deploying production code. The ability to generate code for console actions gained some positive reactions in the community.

Q Code Transformation promises to simplify upgrading and modernizing existing application code: according to AWS, Q can upgrade Java applications from versions 8 and 11 to version 17, and it will soon support migrating Windows-based .NET Framework applications to cross-platform .NET. Matthew Wilson, VP and distinguished engineer at Amazon, explains:

Amazon Q Code Transformation uses OpenRewrite to accelerate Java upgrades for customers (...) Before the internal development of Amazon Q Code Transformation started, teams at Amazon had been getting excited about how OpenRewrite could make the task of updating software at scale less toilsome. I'm thankful that it exists and has a strong community.

CloudFormation now supports Git management of stacks, enabling developers to synchronize their stacks from a CloudFormation template stored in a remote Git repository. Cui comments:

I guess now you can do away with your CI/CD pipeline and just have CloudFormation deploy your stack straight from your repo. Seemed useful at first, but then I'd still want to run tests, etc. in my pipeline. So this is probably not something I will actually use, except in REALLY simple, infra-only cases.

Lens Catalog for the AWS Well-Architected Tool is a central lens repository for customers looking to explore, review, and implement the latest cloud best practices.

Route 53 Application Recovery Controller added zonal autoshift to automatically shift traffic away from an AZ when AWS identifies a potential failure and shift it back once the failure is resolved.

re:Post Private is a private version of re:Post that offers a managed knowledge service for enterprises.

Cost Optimization

The cloud provider introduced Cost Optimization Hub, a feature that helps customers consolidate cost optimization recommendations across member accounts and regions. Finally, there is a new API to query the Free Tier usage programmatically. Alex DeBrie, independent consultant and AWS Data Hero, comments:

Very interesting — could see some helpful tooling built on that. And perhaps a bridge to a more beginner-friendly Free Tier experience.


Vogels wrote some tech predictions for 2024 suggesting that generative AI becomes culturally aware and AI assistants redefine developer productivity:

AI assistants will evolve from basic code generators into teachers and tireless collaborators that provide support throughout the software development lifecycle. They will explain complex systems in simple language, suggest targeted improvements, and take on repetitive tasks, allowing developers to focus on the parts of their work that have the most impact.

During the event, AWS updated an article with the main announcements, and the five keynotes along with over 400 sessions are already available on YouTube.

