Amazon Introduces the Predictive Scaling Feature to EC2 Instances

In a recent blog post, Amazon announced that it has made Auto Scaling for EC2 instances more powerful by adding a predictive scaling feature. With this feature, customers can create a scaling plan without having to adjust autoscaling manually over time.

In 2009, Amazon added scaling capabilities such as Auto Scaling and Elastic Load Balancing to EC2, allowing instance fleets to respond automatically to rapid changes in traffic demand. Building on the evolution of machine learning in the cloud, Amazon now adds well-trained models that predict customers' expected traffic and EC2 usage; this addition is named "predictive scaling". Jeff Barr, chief evangelist for AWS, stated in the blog post:

The model needs at least one day of historical data to start making predictions; it is re-evaluated every 24 hours to create a forecast for the next 48 hours.

The predictions are based on the customer's own usage and on millions of data points that Amazon itself draws from tens of thousands of EC2 instances with different runtimes. Amazon uses these data points to build a sophisticated Recurrent Neural Network (RNN) that can, for example, predict the average CPU utilization of the EC2 fleet.

Customers can enable predictive scaling through a three-step wizard in which they choose the resources they want to observe and scale. The first step is to open the Auto Scaling Console and search for scalable resources. Next, select an EC2 Auto Scaling group, assign the group a name, pick a scaling strategy, and leave both Enable predictive scaling and Enable dynamic scaling selected.
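
For readers who prefer to script the setup rather than click through the wizard, the same choices can be expressed as a request payload. The sketch below only builds a dictionary shaped like the AWS Auto Scaling Plans CreateScalingPlan request; it does not call AWS. The helper name, the tag-based resource lookup, the CPU metrics, and the default target of 50 percent are illustrative assumptions, not details from the blog post, and the field names should be checked against the current API reference before use.

```python
def build_scaling_plan_request(plan_name, asg_name, tag_key, tag_value,
                               min_capacity, max_capacity, target_cpu=50.0):
    """Build kwargs shaped like a CreateScalingPlan request (hypothetical helper).

    Assumption: resources are discovered by tag and scaled on CPU metrics.
    """
    return {
        "ScalingPlanName": plan_name,
        # Which resources the plan should look at (assumed: a tag filter).
        "ApplicationSource": {
            "TagFilters": [{"Key": tag_key, "Values": [tag_value]}],
        },
        "ScalingInstructions": [{
            "ServiceNamespace": "autoscaling",
            "ResourceId": f"autoScalingGroup/{asg_name}",
            "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
            # Dynamic scaling: target tracking on average CPU utilization.
            "TargetTrackingConfigurations": [{
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization",
                },
                "TargetValue": target_cpu,
            }],
            # Predictive scaling: the load metric used for the forecast.
            "PredefinedLoadMetricSpecification": {
                "PredefinedLoadMetricType": "ASGTotalCPUUtilization",
            },
            # Leave both predictive and dynamic scaling enabled.
            "PredictiveScalingMode": "ForecastAndScale",
            "DisableDynamicScaling": False,
        }],
    }


request = build_scaling_plan_request("demo-plan", "web-asg", "app", "web", 2, 10)
# With credentials configured, the payload could be submitted via boto3:
#   boto3.client("autoscaling-plans").create_scaling_plan(**request)
```

Keeping the payload construction separate from the API call makes it easy to review or unit-test the plan before anything is created in an account.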


Barr explains in the blog post that predictive scaling works by forecasting load and scheduling minimum capacity, while dynamic scaling uses "target tracking" to converge a designated CloudWatch metric on a specific target. The two models work well together because dynamic scaling operates on top of the minimum capacity that predictive scaling has already scheduled. The forecast can be based on one of three predefined metrics or on a custom metric, and customers can fine-tune predictive scaling to their own needs. Finally, when the plan is ready, the learning and prediction process can begin. Through the console, customers can observe the forecast for the chosen metrics.
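
How the two models combine can be illustrated with a small calculation. The following is a simplified sketch, not Amazon's actual algorithm: it assumes load spreads evenly across instances, so the capacity needed to hit the target is proportional to the ratio of the observed metric to the target, and it treats the forecast as a floor under the dynamically computed capacity. All function and parameter names are hypothetical.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value):
    """Capacity that would bring the average metric back to the target.

    Assumption: the metric is a per-instance average (e.g. CPU %) and total
    load is spread evenly, so capacity scales with metric / target.
    """
    return math.ceil(current_capacity * metric_value / target_value)


def desired_capacity(forecast_min, current_capacity, metric_value,
                     target_value, max_capacity):
    """Combine the scheduled (forecast) minimum with dynamic scaling."""
    dynamic = target_tracking_capacity(current_capacity, metric_value, target_value)
    # Never drop below the forecast minimum, never exceed the plan maximum.
    return min(max(forecast_min, dynamic), max_capacity)


# Quiet period before a predicted spike: the forecast floor of 6 wins,
# even though 3 instances would satisfy the 50% CPU target right now.
quiet = desired_capacity(forecast_min=6, current_capacity=4,
                         metric_value=30.0, target_value=50.0, max_capacity=20)

# Unexpected extra load: dynamic scaling pushes capacity above the floor.
busy = desired_capacity(forecast_min=6, current_capacity=6,
                        metric_value=80.0, target_value=50.0, max_capacity=20)
```

In the quiet case the result is 6 and in the busy case it is 10, which shows why the scheduled minimum and target tracking complement each other: the forecast pre-provisions for the expected spike, and dynamic scaling covers whatever the forecast missed.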


According to Barr, there are a few considerations about predictive scaling:

  • Timing – Once the initial set of predictions have been made and the scaling plans are in place, the plans are updated daily, and forecasts are made for the following two days.
  • Cost – You can use predictive scaling at no charge, and may even reduce your AWS charges.
  • Resources – We are launching with support for EC2 instances, and plan to support other AWS resource types over time.
  • Applicability – Predictive scaling is a great match for websites and applications that undergo periodic traffic spikes. It is not designed to help in situations where spikes in load are not cyclic or predictable.
  • Long-Term Baseline – Predictive scaling maintains the minimum capacity based on historical demand; this ensures that any gaps in the metrics won’t cause an inadvertent scale-in.
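
The timing described above implies that consecutive forecasts overlap. A minimal sketch of that arithmetic follows, assuming the first forecast is produced as soon as one day of history exists (the blog post only says at least one day is needed); the constant and function names are hypothetical.

```python
from datetime import datetime, timedelta

WARMUP = timedelta(days=1)      # history needed before the first forecast
REFRESH = timedelta(hours=24)   # the forecast is re-evaluated daily
HORIZON = timedelta(hours=48)   # each forecast covers the following two days


def forecast_windows(first_data_at, count):
    """Return (start, end) pairs for the first `count` forecasts."""
    first_forecast = first_data_at + WARMUP
    return [
        (first_forecast + i * REFRESH, first_forecast + i * REFRESH + HORIZON)
        for i in range(count)
    ]


windows = forecast_windows(datetime(2018, 11, 20), 3)
# Each new forecast starts 24 hours before the previous one ends.
overlap = windows[0][1] - windows[1][0]
```

Because the horizon is twice the refresh interval, each refresh still leaves a full day of already-forecast capacity ahead of it.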

The predictive scaling feature for EC2 instances will be available initially in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Singapore) regions. For more details on EC2 Auto Scaling pricing, see the pricing page.
