
Microsoft Debuts Custom Chips for Cloud and AI: Azure Maia AI Accelerator and Azure Cobalt CPU

During the recent Ignite conference, Microsoft introduced two custom-designed chips for their cloud infrastructure: Microsoft Azure Maia AI Accelerator (Athena), optimized for artificial intelligence (AI) tasks and generative AI, and Microsoft Azure Cobalt CPU, an Arm-based processor tailored to run general-purpose compute workloads on the Microsoft Cloud.

The Azure Maia AI and Azure Cobalt chips will arrive in Microsoft's datacenters in 2024, initially powering the company's own services such as Microsoft Copilot and the Azure OpenAI Service.

The Maia AI chip is manufactured on a 5-nanometer TSMC process and has 105 billion transistors. It's designed specifically for the Azure hardware stack and will power some of the largest internal AI workloads running on Microsoft Azure, according to a press release from the company.

At the same time, Azure Cobalt is a 128-core chip built on an Arm Neoverse CSS design and customized for Microsoft. Wes McCullough, corporate vice president of hardware product development at Microsoft, said in a press release:

Choosing Arm technology was a key element in Microsoft’s sustainability goal. It aims to optimize "performance per watt" throughout its datacenters, which essentially means getting more computing power for each unit of energy consumed.
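The "performance per watt" metric mentioned above is simply useful work delivered divided by power consumed. As a minimal sketch, the figures below are hypothetical and for illustration only, not Microsoft's actual numbers:

```python
# Illustrative only: the throughput and power figures below are hypothetical,
# not measurements of any Microsoft, Arm, or x86 hardware.
def perf_per_watt(throughput_ops: float, power_watts: float) -> float:
    """Return computing work delivered per unit of power (operations per watt)."""
    return throughput_ops / power_watts

# Comparing two hypothetical server configurations:
baseline = perf_per_watt(throughput_ops=1.0e12, power_watts=250)  # 4.0e9 ops/W
arm_node = perf_per_watt(throughput_ops=1.2e12, power_watts=200)  # 6.0e9 ops/W

print(f"Hypothetical Arm node delivers {arm_node / baseline:.1f}x "
      f"the performance per watt of the baseline")
```

At datacenter scale, even a modest improvement in this ratio compounds across thousands of servers, which is why it figures in Microsoft's sustainability goal.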

Microsoft has long built its own servers and racks to drive down costs; by adding its own silicon, the company can also influence cooling efficiency and optimize server capacity. The latter effort is driven by Microsoft's goal of becoming carbon negative by 2030.

Microsoft's competitors Google and AWS already have their own silicon for AI workloads. For instance, Google's Tensor Processing Unit, announced at Google I/O in 2016, is used in its data centers to support the TensorFlow framework for machine-learning applications such as neural networks. AWS, meanwhile, introduced its Arm-based Graviton chip and Inferentia AI processor in 2018, and announced Trainium, a chip for training models, in 2020.

One respondent, Aromasin, commented in a Hacker News thread:

Note that Microsoft has been using "custom chips" for years; they've just been on FPGAs, not ASICs. They've developed IP to accelerate a whole bunch of processes, so it's not like they've suddenly come around to a magic sauce that Amazon and Google have been doing this whole time. It would surprise me if over half of the new chip is just based on RTL from their FPGA design.

The only thing that's changed is that they're scaling like crazy now and can justify overhead that comes with designing ASICs versus using off the shelf parts.

Microsoft also partners closely with silicon providers AMD (AMD MI300X accelerated VMs) and Nvidia (NVIDIA H200 Tensor Core GPU). However, unlike AMD and Nvidia, Microsoft will not let customers buy servers containing its custom chips.

Scott Guthrie, executive vice president of Microsoft’s Cloud and AI Group, tweeted:

Microsoft has reimagined our infrastructure with an end-to-end systems approach to meet our customers' unique AI and cloud needs. With the launch of our new AI Accelerator, Azure Maia, and cloud native CPU, Azure Cobalt, alongside our continued partnerships with silicon providers, we can now provide even more choice and performance.

Lastly, Microsoft intends to broaden the range of choices in the future, with ongoing development efforts focused on creating second-generation iterations for both the Azure Maia AI Accelerator and the Azure Cobalt CPU.
