In a recent blog post, Google announced the introduction of the Accelerator-Optimized VM (A2) family on Google Compute Engine, based on the NVIDIA Ampere A100 Tensor Core GPU. A2 provides up to 16 GPUs in a single VM and is the first A100-based offering in the public cloud.
Google designed the A2 family of VMs to boost training and inference performance for its customers. The A2 features the NVIDIA A100 Tensor Core GPU, based on the new NVIDIA Ampere architecture. According to the blog post, the A100 offers up to 20 times the compute performance of the previous-generation GPU and comes with 40 GB of high-performance HBM2 GPU memory. In addition, A2 VMs come with up to 96 Intel Cascade Lake vCPUs, optional Local SSDs for workloads requiring faster data feeds to the GPUs, and up to 100 Gbps of networking.
For more demanding workloads, A2 offers the a2-megagpu-16g instance with 16 A100 GPUs, providing a total of 640 GB of GPU memory and 1.3 TB of system memory, all connected through NVSwitch with up to 9.6 TB/s of aggregate bandwidth.
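Once the family reaches general availability, provisioning an A2 VM should follow the same pattern as other GPU machine types on Compute Engine. The following is a minimal sketch, assuming gcloud support for the a2-megagpu-16g machine type and the public Deep Learning VM image family common-cu110; exact names, zones, and flags may differ while A2 remains in alpha.

# Sketch: create an A2 VM with 16 A100 GPUs (machine type and image family assumed, subject to change)
gcloud compute instances create my-a2-vm \
    --zone us-central1-a \
    --machine-type a2-megagpu-16g \
    --image-family common-cu110 \
    --image-project deeplearning-platform-release \
    --boot-disk-size 200GB \
    --maintenance-policy TERMINATE \
    --restart-on-failure

As with other GPU-attached machine types, the host maintenance policy must be set to TERMINATE, since GPU instances cannot be live-migrated.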
Note that A2 also offers smaller configurations, allowing customers to match their need for GPU compute power. Customers can choose between five configurations, from one to 16 GPUs, with two different CPU- and networking-to-GPU ratios. Furthermore, each GPU can be partitioned into as many as seven GPU instances, owing to Ampere's multi-instance GPU (MIG) capability.
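MIG partitioning is managed from inside the guest with NVIDIA's standard tooling rather than through a Compute Engine API. As a rough sketch, assuming the usual nvidia-smi workflow on an A100 with a recent driver (the 1g.5gb profile ID shown here can vary by driver version), splitting one GPU into seven instances looks roughly like this:

# Sketch: partition GPU 0 into seven MIG instances (profile IDs assumed; check 'nvidia-smi mig -lgip' first)
sudo nvidia-smi -i 0 -mig 1                            # enable MIG mode (may require a GPU reset)
sudo nvidia-smi mig -i 0 -cgi 19,19,19,19,19,19,19 -C  # create seven 1g.5gb GPU instances plus compute instances
nvidia-smi -L                                          # list the resulting MIG devices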
Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA, wrote in a recent company blog post about the availability of the A100 on GCP:
In cloud data centers, A100 can power a broad range of compute-intensive applications, including AI training and inference, data analytics, scientific computing, genomics, edge video analytics, 5G services, and more.
With the A2 VM family, Google further expands its portfolio of predefined and custom VMs, ranging from compute-optimized to accelerator-optimized machines. Moreover, the company continues to compete with other cloud vendors such as Microsoft, which recently released new general-purpose and memory-optimized VM families on various Intel chipsets (AVX-512), and AWS, which recently released EC2 Inf1 instances based on its Inferentia chips. Many of these new VM types are targeted at customers with AI and machine-learning workloads.
Holger Mueller, principal analyst and vice president at Constellation Research Inc., told InfoQ:
The battle for cloud leadership is primarily fought in the AI battle, and that is all about getting the AI load of enterprises attracted to each vendor's cloud. In the middle are platform vendors like NVidia, that provide a cross-cloud platform and on-premise option. So with Google bringing the newest Nvidia platform to its Google Cloud, it makes it easier for CxOs to move AI workloads across on-premises and to the (Google) cloud.
He added:
With Google being the #3 vendor, it has to be more open and more creative at attracting load - and this is another example of the Google strategy. In contrast, the larger AWS and Azure strategy is still to move to cloud proprietary compute architectures for AI loads. CxOs need to be aware that lock-in is still a desirable outcome for most technology vendors and needs to balance the risks between convenience, speed and lock-in.
Currently, the A2 VM family is in alpha, and customers can request access by signing up. Google states that public availability and pricing information will come later in the year. Finally, the company also announced forthcoming NVIDIA A100 support for Google Kubernetes Engine, Cloud AI Platform, and other services.