AWS Releases Enhancements to AI Services for NLP, Speech-to-Text Transcription, and Image Detection

Amazon Web Services (AWS) released new features for three of its AI services: Amazon Comprehend, Amazon Rekognition, and Amazon Transcribe.

Amazon Comprehend is a fully managed natural-language processing (NLP) service that can perform a variety of common tasks on unstructured text data, such as identifying the names of people and places, determining sentiment, or detecting topics. Comprehend can process large collections of documents stored in Amazon S3. For compliance reasons, many companies store their documents in S3 in encrypted form; NLP, though, requires the decrypted text data. S3 can manage the encryption keys itself, but some companies opt to use Amazon's Key Management Service (KMS) to encrypt the data with their own keys. A new feature of Comprehend integrates with KMS so that data encrypted this way can be decrypted for processing without a separate manual step.
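The article does not describe how the KMS integration is configured, so the following is only a rough sketch of what an asynchronous Comprehend job request might look like. It assumes the boto3 `start_entities_detection_job` call, an IAM role that has already been granted permission to use the key, and the optional `KmsKeyId` field of `OutputDataConfig` for encrypting results. The helper name, bucket URIs, role ARN, and key ID are all hypothetical.

```python
def build_entities_job_request(input_s3_uri, output_s3_uri, role_arn,
                               kms_key_id=None, language_code="en"):
    """Build keyword arguments for an asynchronous entity-detection job.

    Assumption: role_arn names a role that can read the input bucket and
    use the KMS key; reading encrypted input depends on that role.
    """
    request = {
        "InputDataConfig": {
            "S3Uri": input_s3_uri,
            "InputFormat": "ONE_DOC_PER_FILE",
        },
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        "LanguageCode": language_code,
    }
    if kms_key_id is not None:
        # Optional: ask Comprehend to encrypt the job output with this key.
        request["OutputDataConfig"]["KmsKeyId"] = kms_key_id
    return request


# Hypothetical values, for illustration only.
request = build_entities_job_request(
    "s3://example-docs/input/",
    "s3://example-docs/output/",
    "arn:aws:iam::123456789012:role/ExampleComprehendRole",
    kms_key_id="1234abcd-12ab-34cd-56ef-1234567890ab",
)
```

With credentials configured, the dictionary could be passed on as `boto3.client("comprehend").start_entities_detection_job(**request)`; the sketch stops short of making the call so that it runs without an AWS account.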

Amazon Transcribe is an automatic speech recognition (ASR) service that converts speech audio into text transcripts. The service allows users to define custom vocabularies, which are lists of words that the system might not recognize by default, such as domain-specific phrases or acronyms. With this release, users can describe the pronunciation of each word in a custom vocabulary more precisely, using either its International Phonetic Alphabet (IPA) representation or a "sounds like" pronunciation written in the source language's own spelling system. Users can also specify how Transcribe outputs the words; for example, "street" might be output as "St."
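As a rough illustration of the vocabulary fields described above, the sketch below writes a tab-separated table with one row per word. The column names (`Phrase`, `IPA`, `SoundsLike`, `DisplayAs`) and the rule that a row carries either an IPA or a "sounds like" pronunciation, but not both, are assumptions about the table format rather than details given in the article. The helper name and the second example row are made up.

```python
COLUMNS = ("Phrase", "IPA", "SoundsLike", "DisplayAs")


def vocabulary_table(entries):
    """Render vocabulary entries as tab-separated text with a header row."""
    lines = ["\t".join(COLUMNS)]
    for entry in entries:
        if entry.get("IPA") and entry.get("SoundsLike"):
            # Assumption: the two pronunciation hints are alternatives.
            raise ValueError(f"{entry['Phrase']}: give IPA or SoundsLike, not both")
        # Missing fields are left empty so every row has four columns.
        lines.append("\t".join(entry.get(col) or "" for col in COLUMNS))
    return "\n".join(lines) + "\n"


table = vocabulary_table([
    # The article's example: the spoken word "street" is output as "St."
    {"Phrase": "street", "DisplayAs": "St."},
    # Hypothetical "sounds like" hint for a domain-specific term.
    {"Phrase": "Kubernetes", "SoundsLike": "koo-ber-net-ees"},
])
print(table)
```

Keeping empty cells for unused columns means every row has the same width, which makes the file easy to check by eye or with a spreadsheet before uploading it.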

Amazon Rekognition is a service for analyzing images and video to detect objects, faces, and text. Amazon announced the fifth update to its facial detection and analysis models, with improved accuracy of "gender identification, emotion detection...and attributes such as 'EyesOpen'".
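To show where these attributes might surface for a developer, here is a small sketch that summarizes one face record. It assumes the response shape of the boto3 `detect_faces` call when all attributes are requested, where each entry in `FaceDetails` carries `Gender`, `EyesOpen`, and a list of `Emotions` with confidence scores. The helper name is hypothetical, and the sample record is hand-written for illustration, not real service output.

```python
def summarize_face(face):
    """Pick out the gender, eyes-open flag, and most confident emotion."""
    emotions = face.get("Emotions", [])
    top = max(emotions, key=lambda e: e["Confidence"], default=None)
    return {
        "gender": face.get("Gender", {}).get("Value"),
        "eyes_open": face.get("EyesOpen", {}).get("Value"),
        "top_emotion": top["Type"] if top else None,
    }


# Hand-written sample in the assumed response shape (not real output).
sample_face = {
    "Gender": {"Value": "Female", "Confidence": 99.1},
    "EyesOpen": {"Value": True, "Confidence": 97.4},
    "Emotions": [
        {"Type": "CALM", "Confidence": 12.0},
        {"Type": "HAPPY", "Confidence": 85.5},
    ],
}
print(summarize_face(sample_face))
```

Against a live account, the records would come from something like `boto3.client("rekognition").detect_faces(Image=..., Attributes=["ALL"])["FaceDetails"]`, with the image argument filled in by the caller.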

These new features are available in all regions where the services are supported, and at no additional charge.
