How Amazon Prime Video Delivers 99.999% Availability While Reducing Costs

Amazon Prime Video created a highly available live video streaming architecture by combining redundant components to achieve the five-nines of availability that they require for their platform. The company used in-depth monitoring to mitigate issues before any customer impact was observed. They also optimized the deployment topology and video encoding to reduce costs while ensuring optimal video quality for users.

Prime Video began operating live video streaming in 2015 and gradually expanded its offering with over 1,000 TV stations spanning news, sports, and entertainment. Ben Forman, the head of live channels architecture at Prime Video, summarises the expectations on the availability of Prime Video live streaming services:

We operate in partnership with over 30 broadcasters and broadcast services providers, those partners deliver live video into Prime Video 24 hours a day, seven days a week, and 365 days a year. [...] Traditional broadcast television has 100 years of innovation in reliability; and viewers all over the world expect this when they switch on their TVs. So, the most important thing for us at Prime Video is that this experience, of the TV always being on, is the same for Prime Video customers watching live TV services, delivered through our applications.

The architecture of the live streaming platform combines AWS Media Services components. AWS Elemental MediaConnect is a high-quality transport service for live video. AWS Elemental MediaLive is a live video processing service that can encode video content for broadcast and streaming. AWS Elemental MediaPackage prepares and distributes video content to various connected devices in multiple formats.

Live Streaming Platform Architecture (Source: Prime Video Technology Blog)

Prime Video hosts its video processing and distribution stack in two separate regions for each external source and utilizes Direct Connect and Transit Gateway to enable network connectivity between partner devices and the platform. Each partner delivers two source signals (primary and backup) to each region, enabling failover if one becomes unavailable. Media processing and network components are arranged sequentially in each region.

Components arranged in series have an overall availability of A = Ax * Ay, while components arranged in parallel have A = 1 - (1 - Ax) * (1 - Ay). Given that AWS Elemental services offer a 99.9% SLA, the overall availability of the media components of Prime Video's platform, with three services in series in each of two regions, works out to A = 1 - (1 - 99.9% * 99.9% * 99.9%) * (1 - 99.9% * 99.9% * 99.9%) = 99.999%. By deploying redundant processing/distribution stacks in separate regions, Prime Video is able to deliver five-nines of availability using services that offer lower SLAs.

Overall Availability Calculations (Source: Prime Video Technology Blog)
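
The calculation above can be reproduced in a few lines of Python. This is a minimal sketch rather than Prime Video's own tooling: the function names `series` and `parallel` are illustrative, and the formulas assume that component failures are independent of one another.

```python
def series(*availabilities):
    """Availability of components chained in series: every one must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Availability of redundant components: at least one must be up.

    Assumes failures are independent (an assumption of the formula itself).
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down


SERVICE_SLA = 0.999  # the 99.9% SLA quoted for AWS Elemental services

# Three media services in series within one region.
region = series(SERVICE_SLA, SERVICE_SLA, SERVICE_SLA)

# Two identical regional stacks running in parallel.
platform = parallel(region, region)

print(f"single region: {region:.4%}")
print(f"two regions:   {platform:.4%}")
```

A single regional stack falls below the 99.9% of any one service, because the series product compounds the risk; it is the second region that lifts the combined figure past five-nines.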

Even with an architecture that can deliver high availability, it’s essential to actively monitor the health and performance of all its components so that the team can react quickly if any key metric falls short of its expected level, which could otherwise lead to degraded quality of service. Collecting metrics also makes it possible to observe overall trends and to report on SLOs (Service Level Objectives).
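
To put the five-nines target in perspective for SLO reporting, the sketch below converts an availability objective into an allowed-downtime budget and checks a measured downtime against it. The function names and the sample downtime figures are hypothetical and are not taken from Prime Video; the period is assumed to be a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(slo, period_minutes=MINUTES_PER_YEAR):
    """Downtime budget implied by an availability objective over a period."""
    return (1.0 - slo) * period_minutes


def measured_availability(downtime_minutes, period_minutes=MINUTES_PER_YEAR):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_minutes / period_minutes


def meets_slo(downtime_minutes, slo, period_minutes=MINUTES_PER_YEAR):
    """True if the measured availability is at or above the objective."""
    return measured_availability(downtime_minutes, period_minutes) >= slo


FIVE_NINES = 0.99999
budget = allowed_downtime_minutes(FIVE_NINES)
print(f"yearly downtime budget at five-nines: {budget:.2f} minutes")

# Hypothetical downtime totals, for illustration only.
for downtime in (3.0, 10.0):
    print(downtime, "minutes down ->", meets_slo(downtime, FIVE_NINES))
```

At five-nines the yearly budget is only a little over five minutes, which is why issues need to be detected and mitigated before they reach customers.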

The Prime Video team works to reduce infrastructure costs in several ways: choosing the AWS regions that host each media processing/distribution stack based on the location of the partner's signal source, using the AWS backbone network to deliver data reliably, and picking up the signal within AWS where possible. They also tune the infrastructure footprint and configurations and optimize adaptive bitrate encoding based on data science analysis.
