GPU Content on InfoQ

News

AI, ML & Data Engineering

Microsoft Launches Open-Source Phi-3.5 Models for Advanced AI Development

Microsoft launched three new open-source AI models in its Phi-3.5 series: Phi-3.5-mini-instruct, Phi-3.5-MoE-instruct, and Phi-3.5-vision-instruct. Available under a permissive MIT license, these models offer developers powerful tools for various tasks, including reasoning, multilingual processing, and image and video analysis.

Robert Krzaczyński
on Aug 31, 2024
AI, ML & Data Engineering

Meta's Research SuperCluster for Real-Time Voice Translation AI Systems

A recent article from Engineering at Meta reveals how the company is building its Research SuperCluster (RSC) infrastructure, which is used to advance real-time voice translation, language processing, computer vision, and augmented reality (AR).

Vinod Goje
on Aug 21, 2024
AI, ML & Data Engineering

NVIDIA Announces Next-Generation AI Superchip Blackwell

NVIDIA recently announced its next-generation GPU architecture, Blackwell. Blackwell is the largest GPU ever built, with over 200 billion transistors, and can train large language models (LLMs) up to 4x faster than previous-generation hardware.

Anthony Alford
on Apr 09, 2024
AI, ML & Data Engineering

Nvidia Announces Robotics-Oriented AI Foundational Model

At its recent GTC 2024 event, Nvidia announced a new foundational model to build intelligent humanoid robots. Dubbed GR00T, short for Generalist Robot 00 Technology, the model will understand natural language and be able to observe human actions and emulate human movements.

Sergio De Simone
on Apr 05, 2024
AI, ML & Data Engineering

Meta Unveils 24k GPU AI Infrastructure Design

Meta recently announced the design of two new AI computing clusters, each containing 24,576 GPUs. The clusters are based on Meta's Grand Teton hardware platform, and one cluster is currently used by Meta for training their next-generation Llama 3 model.

Anthony Alford
on Apr 02, 2024
AI, ML & Data Engineering

NVIDIA Introduces Metropolis Microservices for Jetson to Run AI Apps at the Edge

NVIDIA has expanded its Metropolis Microservices cloud-based AI solution to run on the NVIDIA Jetson IoT embedded platform, including support for video streaming and AI-based perception.

Sergio De Simone
on Feb 08, 2024
DevOps

LeftoverLocals May Leak LLM Responses on Apple, Qualcomm, and AMD GPUs

Security firm Trail of Bits disclosed a vulnerability allowing malicious actors to recover data from GPU local memory on Apple, Qualcomm, AMD, and Imagination GPUs. Dubbed LeftoverLocals, the vulnerability affects any application using the GPU, including Large Language Models (LLMs) and machine learning (ML) models.

Sergio De Simone
on Jan 25, 2024
AI, ML & Data Engineering

AWS Unveils Gemini, a Distributed Training System for Swift Failure Recovery in Large Model Training

AWS and Rice University have introduced Gemini, a new distributed training system designed to improve failure recovery when training large-scale deep learning models. According to the research paper, Gemini uses CPU memory to achieve much faster failure recovery, overcoming the obstacles of high recovery costs and constrained checkpoint storage capacity.

Daniel Dominguez
on Nov 10, 2023
AI, ML & Data Engineering

Microsoft Releases DeepSpeed-FastGen for High-Throughput Text Generation

Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference and is based on the Dynamic SplitFuse technique. The system currently supports several model architectures.

Andrew Hoblitzell
on Nov 07, 2023
Development

Python-Like Numerical Computation Library MatX Brings Transforms as Operators and Other Features

Developed by Nvidia for its own GPUs, MatX is a C++ library that aims to bring near-native performance to numerical computing using a high-level syntax close to that of Python's SciPy or MATLAB. Its latest release brings a number of new features, including the ability to use transforms as operators and new operators such as upsample, downsample, and pwelch.

Sergio De Simone
on Oct 23, 2023
DevOps

Google Cloud Ops Agent Can Now Monitor Nvidia GPUs

Google Cloud announced that Ops Agent, the agent for collecting telemetry from Compute Engine instances, can now collect and aggregate metrics from NVIDIA GPUs on VMs.

Claudio Masolo
on Oct 17, 2023
Cloud

Azure Previews ND H100 V5 Virtual Machines to Accelerate Generative AI

Azure recently announced the preview of ND H100 v5 virtual machines, which integrate the latest Nvidia H100 Tensor Core GPUs and support Quantum-2 InfiniBand networking. According to Microsoft, the new option will offer AI developers improved performance and scaling across thousands of GPUs.

Renato Losio
on Apr 08, 2023
AI, ML & Data Engineering

AWS and NVIDIA to Collaborate on Next-Gen EC2 P5 Instances for Accelerating Generative AI

AWS and NVIDIA announced the development of a highly scalable, on-demand AI infrastructure that is specifically designed for training large language models and creating advanced generative AI applications. The collaboration aims to create the most optimized and efficient system of its kind, capable of meeting the demands of increasingly complex AI tasks.

Daniel Dominguez
on Mar 24, 2023
AI, ML & Data Engineering

NVIDIA Kubernetes Device Plug-in Brings Temporal GPU Concurrency

As of the v12 release, the Nvidia GPU device plug-in framework supports time-sliced sharing between CUDA workloads on Kubernetes. This feature aims to prevent under-utilization of GPUs and make it easier to scale applications by leveraging concurrently executing CUDA contexts.

Sabri Bolkar
on Dec 19, 2022
Development

Asahi Linux Gets Alpha GPU Drivers on Apple Silicon

After two years of work to reverse engineer the Apple Silicon GPU instruction set and implement the kernel driver, Asahi Linux has released an alpha-quality GPU driver that is already good enough to run a smooth desktop experience and some games, according to Asahi developers Alyssa Rosenzweig and Asahi Lina.

Sergio De Simone
on Dec 11, 2022
