Microsoft Announces the General Availability of NDm A100 v4 Series Virtual Machines

Recently, Microsoft announced the general availability (GA) of a new virtual machine (VM) series in Azure, the NDm A100 v4 series, featuring NVIDIA A100 Tensor Core 80 GB GPUs. The series targets high-performance computing (HPC) and is designed to deliver performance, scalability, and cost efficiency for real-world HPC workloads.

Earlier, the company released the ND A100 v4 series, featuring NVIDIA A100 Tensor Core GPUs, each equipped with 40 GB of HBM2 memory. The new NDm A100 v4 series doubles per-GPU memory to 80 GB and adds a 30 percent increase in GPU memory bandwidth. Sherry Wang, senior program manager, Azure HPC and AI, stated in an Azure blog post on the new series:

The high-memory NDm A100 v4 series brings AI-Supercomputer power to the masses by creating opportunities for all businesses to use it as a competitive advantage. Cutting-edge AI customers are using both 40 GB ND A100 v4 VMs and 80 GB NDm A100 v4 VMs at scale for large-scale production AI and machine learning workloads, and seeing impressive performance and scalability, including OpenAI for research and products.

The NDm A100 v4 series starts with a single virtual machine (VM) with eight NVIDIA Ampere A100 80 GB Tensor Core GPUs. NDm A100 v4-based deployments can scale up to thousands of GPUs with 1.6 Tb/s of interconnect bandwidth per VM: each GPU has its own HDR 200 Gb/s InfiniBand link, which enables fast connections across thousands of GPUs in Azure.
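The per-VM figures follow from the per-GPU numbers quoted above. As a rough check, the short Python sketch below multiplies them out. The constant and function names are illustrative, not part of any Azure or NVIDIA API, and it assumes decimal units (1 Tb/s = 1000 Gb/s).

```python
# Figures taken from the article; all names here are illustrative only.
GPUS_PER_VM = 8            # eight A100 GPUs per NDm A100 v4 VM
LINK_GBPS_PER_GPU = 200    # one HDR InfiniBand link per GPU
MEMORY_GB_PER_GPU = 80     # A100 80 GB variant


def vm_interconnect_tbps(gpus=GPUS_PER_VM, link_gbps=LINK_GBPS_PER_GPU):
    """Aggregate InfiniBand bandwidth per VM in Tb/s (assumes 1 Tb = 1000 Gb)."""
    return gpus * link_gbps / 1000


def vm_gpu_memory_gb(gpus=GPUS_PER_VM, memory_gb=MEMORY_GB_PER_GPU):
    """Total GPU memory per VM in GB."""
    return gpus * memory_gb


if __name__ == "__main__":
    print(f"Interconnect per VM: {vm_interconnect_tbps()} Tb/s")
    print(f"GPU memory per VM:   {vm_gpu_memory_gb()} GB")
```

Eight 200 Gb/s links account for the 1.6 Tb/s per-VM figure, and eight 80 GB GPUs give 640 GB of GPU memory per VM.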

Source: https://azure.microsoft.com/en-us/blog/microsoft-expands-its-aisupercomputer-lineup-with-general-availability-of-the-latest-80gb-nvidia-a100-gpus-in-azure-claims/

Microsoft remains committed to delivering VMs for high-end deep learning training and tightly coupled scale-up and scale-out HPC workloads. In a tech community blog post, Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA, said:

The convergence of HPC and AI is a revolution, bringing dramatic acceleration to every kind of simulation and advancing fields across science and industry. The Azure NDm A100 v4 instance combines the power of NVIDIA GPU acceleration and NVIDIA InfiniBand networking to enable researchers to make new discoveries faster and advance state-of-the-art science.

Lastly, other leading public cloud providers, AWS and Google, also offer a wide selection of instance types with varying storage, CPU, memory, and networking capacity to support different workloads. AWS offers VMs for HPC, including the recently released G5 instances, and Google provides accelerator-optimized (A2) VMs based on the NVIDIA Ampere A100 Tensor Core GPU.
