AWS Releases Next Generation of Amazon OpenSearch Serverless

Amazon Web Services has recently announced the general availability of the next generation of Amazon OpenSearch Serverless. Its redesigned architecture enables resource provisioning 20 times faster than the previous serverless architecture, true scale-to-zero capability, and up to 60% lower cost than a cluster provisioned for peak load.

The company says it is positioning the service as a building block for developing agentic AI applications, with dedicated integrations with AI-integrated development environments such as Cursor and Kiro, and new skills for connecting to and managing OpenSearch Serverless resources. Users can start creating new collections using the web console, the AWS SDK, and the AWS CLI, with support for AWS CloudFormation coming soon.

Amazon OpenSearch Serverless is a fully managed service that allows engineers to operate and scale both text and vector search engines in the AWS Cloud. It is based on OpenSearch, the open-source search and observability suite.

As part of that positioning, OpenSearch Serverless provides native integrations with AI development platforms like Vercel and Kiro. AWS has also contributed dedicated skills to the OpenSearch Agent Skills, enabling developers to provision and manage OpenSearch resources from popular AI-assisted coding platforms like Claude Code, Cursor, and Codex.

AWS is also extending support for OpenSearch Serverless in Vercel. Developers building AI agent applications can use it to create new serverless collections or connect to existing ones directly from the Vercel console.

In a blog post, Sohaib Katariwala, senior specialist solutions architect at AWS, Arjun Nambiar, senior analytics and AI specialist solutions architect at AWS, and Raj Ramasubbu, product manager with Amazon OpenSearch Service, describe how AWS reworked the service to achieve these improvements. The authors introduce two named architectures: Classic, which refers to the architecture of existing collections, and NextGen, which is the default for new collections and the one that benefits from the improvements.

Amazon OpenSearch Serverless architecture

The new shared storage layer in the NextGen architecture decouples compute, referred to as OpenSearch Capacity Units (OCU), from storage. It makes OCUs stateless, which has two practical consequences: fast provisioning and efficient scale down.

Provisioning is fast because OCUs do not need to bootstrap a local disk: the shared storage is mounted directly on the OCU, which can start serving requests in seconds.

Scale down is efficient because idle capacity can be released without impacting user data, since the data does not live in the OCU.

The new architecture also introduces two new endpoint formats under the on.aws domain. Both use AWS PrivateLink, which allows the creation of virtual private cloud (VPC) endpoints for internal access from the user’s VPC or on-premises infrastructure.

The per-collection endpoint (<collection-id>.aoss.<region>.on.aws) works the same as before, providing access to one collection per endpoint.

The new per-account regional endpoint (<account-id>.aoss.<region>.on.aws) instead gives users access to all collections through a single hostname; users specify the target collection using either the x-amz-aoss-collection-id or the x-amz-aoss-collection-name header. This endpoint enables better network resource management, such as a single connection pool and a single Transport Layer Security (TLS) session.
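To make the difference between the two endpoint formats concrete, the following Python sketch builds the hostname and the routing header for each case. The function names and sample identifiers are hypothetical; only the hostname patterns and the two header names come from the announcement. Request signing and the HTTP call itself are deliberately left out.

```python
from typing import Dict, Optional


def collection_endpoint(collection_id: str, region: str) -> str:
    """Per-collection hostname: one endpoint per collection."""
    return f"{collection_id}.aoss.{region}.on.aws"


def account_endpoint(account_id: str, region: str) -> str:
    """Per-account regional hostname shared by all collections."""
    return f"{account_id}.aoss.{region}.on.aws"


def routing_headers(collection_id: Optional[str] = None,
                    collection_name: Optional[str] = None) -> Dict[str, str]:
    """Header that selects the target collection on the per-account endpoint.

    Requiring exactly one identifier is an assumption of this sketch,
    not a rule stated in the announcement.
    """
    if (collection_id is None) == (collection_name is None):
        raise ValueError("pass either collection_id or collection_name")
    if collection_id is not None:
        return {"x-amz-aoss-collection-id": collection_id}
    return {"x-amz-aoss-collection-name": collection_name}
```

With this shape, a client could keep one connection pool for account_endpoint(...) and switch collections per request by changing only the header, which is the network benefit the per-account endpoint is meant to provide.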

Collection groups, introduced in February 2026, take a more central role in creating and managing NextGen collections. The generation, either Classic or NextGen, is set only at the group level and applies to all collections created in the group. Users can also use collection groups to share compute capacity across multiple collections, which can reduce costs for smaller workloads.

Users can create NextGen collections via the console, the AWS SDK, or the AWS CLI. AWS says support for AWS CloudFormation is coming soon.

In the console, AWS offers a simplified Express create method with sensible defaults, in addition to the standard one.

Amazon OpenSearch Serverless sits between traditional search platforms and newer AI-focused data stores. Its closest competitor is Elasticsearch Serverless, which provides similar managed search and analytics capabilities. PostgreSQL with pgvector offers a simpler database-centric approach for teams already invested in Postgres, while specialist vector databases such as Pinecone focus on high-performance similarity search for AI and RAG workloads. Together, these options represent different trade-offs between search functionality, operational simplicity, and AI optimisation.

Creating a collection with the AWS SDK or the AWS CLI does not benefit from the console's simplified experience: it requires creating the collection group first, then the collection.

# Create the collection group first; the generation is set at the group level.
aws opensearchserverless create-collection-group \
  --name articles-cg \
  --generation NEXTGEN \
  --standby-replicas ENABLED \
  --capacity-limits "minIndexingCapacityInOCU=0,maxIndexingCapacityInOCU=4,minSearchCapacityInOCU=0,maxSearchCapacityInOCU=2"

# Then create the collection inside that group.
aws opensearchserverless create-collection \
  --name articles-vectors \
  --type VECTORSEARCH \
  --collection-group-name articles-cg
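The capacity limits string is easy to get wrong by hand, for example by setting a minimum above its maximum. As a minimal sketch, the hypothetical Python helper below builds the value for --capacity-limits and rejects inconsistent bounds before the CLI is called. The key names are assumed to follow the min/max pattern for Indexing and Search capacity; the helper is illustrative and not part of any AWS tooling.

```python
def capacity_limits(min_indexing: int, max_indexing: int,
                    min_search: int, max_search: int) -> str:
    """Build the shorthand value for --capacity-limits.

    Raises ValueError when a minimum is negative or exceeds its maximum.
    Key names are an assumption of this sketch.
    """
    pairs = (("indexing", min_indexing, max_indexing),
             ("search", min_search, max_search))
    for label, low, high in pairs:
        if low < 0 or low > high:
            raise ValueError(
                f"{label}: min {low} must be between 0 and max {high}")
    return (f"minIndexingCapacityInOCU={min_indexing},"
            f"maxIndexingCapacityInOCU={max_indexing},"
            f"minSearchCapacityInOCU={min_search},"
            f"maxSearchCapacityInOCU={max_search}")
```

For example, capacity_limits(0, 4, 4, 2) raises a ValueError because the search minimum is above the search maximum, while a minimum of 0 expresses the scale-to-zero case.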

On social media platforms, users welcome the introduction of scale-to-zero, highlighting how its absence was one of the main pain points when using the service for small use cases.

This is huge, until now we had to use solutions like algolia to have true serverless search dbs.

Now we can start using opensearch for small apps too.

Others warn that scale-to-zero comes with trade-offs, such as cold starts, and that teams have to assess how these affect their applications.

[…] Lower idle costs and better multi-tenancy, but teams should still plan for cold starts and initialization latency

The next generation of Amazon OpenSearch Serverless is available in all commercial AWS regions where Amazon OpenSearch Serverless is already available. Users are charged for the compute used, measured in OCUs, for search, indexing, and GPU acceleration. Storage is charged separately per GB-month.
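The billing model described above can be sketched as simple arithmetic. The Python function below is a hypothetical estimator: it assumes compute is metered in OCU-hours, that GPU acceleration may carry its own rate, and that all prices are supplied by the caller. No actual AWS prices are implied.

```python
def estimate_monthly_cost(search_ocu_hours: float,
                          indexing_ocu_hours: float,
                          gpu_ocu_hours: float,
                          storage_gb_month: float,
                          price_per_ocu_hour: float,
                          price_per_gpu_ocu_hour: float,
                          price_per_gb_month: float) -> float:
    """Sum compute and storage charges for one month.

    All prices are placeholders passed in by the caller; metering in
    OCU-hours and a separate GPU rate are assumptions of this sketch.
    """
    compute = (search_ocu_hours + indexing_ocu_hours) * price_per_ocu_hour
    gpu = gpu_ocu_hours * price_per_gpu_ocu_hour
    storage = storage_gb_month * price_per_gb_month
    return compute + gpu + storage
```

Under these assumptions, a collection that sits idle for a whole month with zero OCU-hours would be billed for storage only, which is where scale-to-zero is expected to matter most for small workloads.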