-
Microsoft Introduces Vector Data Abstractions Library for .NET
On October 29th, 2024, Microsoft released the Microsoft.Extensions.VectorData.Abstractions library for .NET in preview. It makes it easier to integrate .NET solutions with the Semantic Kernel AI SDK, using abstractions over concrete AI implementations and models.
-
Meta AI Introduces Thought Preference Optimization Enabling AI Models to Think before Responding
Researchers from Meta FAIR, the University of California, Berkeley, and New York University have introduced Thought Preference Optimization (TPO), a new method aimed at improving the response quality of instruction fine-tuned LLMs.
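The core loop can be sketched as follows. This is a minimal illustration of how TPO-style preference pairs might be constructed, not Meta's implementation: `judge_score` and the `<R>` thought/response marker are invented stand-ins for a judge model and the prompt format; the key idea is that the judge scores only the response, so the hidden thought is optimized indirectly.

```python
# Sketch: build a (chosen, rejected) preference pair for TPO-style training.
# The judge sees only the response part; the thought is optimized indirectly.

def split_thought(sample: str) -> tuple[str, str]:
    """Split a sampled generation into (thought, response) on an illustrative marker."""
    thought, _, response = sample.partition("<R>")
    return thought.strip(), response.strip()

def build_preference_pair(samples: list[str], judge_score) -> tuple[str, str]:
    """Rank full samples by judging only their responses; best/worst become the pair."""
    scored = sorted(samples, key=lambda s: judge_score(split_thought(s)[1]), reverse=True)
    return scored[0], scored[-1]

# Toy judge (prefers longer responses) just to make the sketch runnable.
pair = build_preference_pair(
    ["I should list steps. <R> Short answer.",
     "Let me reason carefully. <R> A longer, more complete answer."],
    judge_score=len,
)
```

The resulting pair would then feed a standard preference-optimization step (e.g. DPO) over the full thought-plus-response texts.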
-
PostgreSQL 17 Released with Improved Vacuum Process and Performance Gains
The PostgreSQL Global Development Group recently announced the general availability of PostgreSQL 17, the latest version of the popular open-source database. This release focuses on performance improvements, including a new memory management implementation for vacuum, storage access optimizations, and enhancements for high-concurrency workloads.
-
Meta Spirit LM Integrates Speech and Text in New Multimodal GenAI Model
Presented in a recent paper, Spirit LM enables the creation of pipelines that mix spoken and written text, integrating speech and text in the same multimodal model. According to Meta, their novel approach, based on interleaving text and speech tokens, makes it possible to circumvent the inherent limitations of prior solutions that use distinct pipelines for speech and text.
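The interleaving idea can be illustrated with a tiny sketch: text tokens and discretized speech units are merged into one token stream with modality markers, so a single model attends across both. The marker strings and token names below are invented for illustration, not Meta's actual vocabulary.

```python
# Sketch: flatten (modality, tokens) segments into one interleaved token stream.

def interleave(segments: list[tuple[str, list[str]]]) -> list[str]:
    stream = []
    for modality, tokens in segments:
        stream.append(f"[{modality.upper()}]")  # modality switch marker
        stream.extend(tokens)
    return stream

seq = interleave([
    ("text", ["the", "cat"]),
    ("speech", ["unit_41", "unit_7"]),  # speech as discrete acoustic units
    ("text", ["sat"]),
])
# seq is a single sequence mixing both modalities
```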
-
PyTorch 2.5 Release Includes Support for Intel GPUs
The PyTorch Foundation recently released PyTorch version 2.5, which contains support for Intel GPUs. The release also includes several performance enhancements, such as the FlexAttention API, TorchInductor CPU backend optimizations, and a regional compilation feature which reduces compilation time. Overall, the release contains 4095 commits since PyTorch 2.4.
-
RAG-Powered Copilot Saves Uber 13,000 Engineering Hours
Uber recently detailed how it built Genie, an AI-powered on-call copilot designed to improve the efficiency of on-call support engineers. Genie leverages Retrieval-Augmented Generation (RAG) to provide accurate real-time responses and significantly enhance the speed and effectiveness of incident response. Since its launch, Genie has answered over 70,000 questions, saving 13,000 engineering hours.
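The RAG loop behind a copilot like Genie can be sketched in a few lines: embed the documents, retrieve the closest ones for a question, and place them in the prompt. This toy version uses bag-of-words cosine similarity as a stand-in for learned embeddings; a production system like Genie uses a real embedding model and an LLM, and the document strings here are invented.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())  # bag-of-words stand-in for an embedding

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "restart the ingestion service with the deploy tool",
    "rotate credentials when the auth token expires",
    "the cafeteria menu changes on mondays",
]
question = "how do I restart the ingestion service"
context = retrieve(question, docs, k=1)
# Augment the generation prompt with the retrieved context:
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\nQ: {question}"
```

Grounding the answer in retrieved documents is what lets such a copilot stay accurate on internal, fast-changing runbook content.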
-
Rhymes AI Unveils Aria: Open-Source Multimodal Model with Development Resources
Rhymes AI has introduced Aria, an open-source multimodal native Mixture-of-Experts (MoE) model capable of processing text, images, video, and code effectively. In benchmarking tests, Aria has outperformed other open models and demonstrated competitive performance against proprietary models such as GPT-4o and Gemini-1.5.
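The Mixture-of-Experts routing that underlies a model like Aria can be illustrated numerically: a gating network scores the experts for each input, and only the top-k experts are actually evaluated. The dimensions, expert count, and top-2 choice below are illustrative, not Aria's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 8, 4, 2
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # toy expert weights
gate_w = rng.normal(size=(d, n_experts))                       # toy gating network

def moe_forward(x: np.ndarray) -> np.ndarray:
    logits = x @ gate_w                    # gating score per expert
    top = np.argsort(logits)[-k:]          # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.normal(size=d))        # only k of n_experts are evaluated
```

This sparsity is why MoE models can carry a large total parameter count while keeping per-token compute close to that of a much smaller dense model.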
-
Stable Diffusion 3.5 Improves Text Rendering, Image Quality, Consistency, and More
Stability AI has released Stable Diffusion 3.5 Large, its most powerful text-to-image generation model to date, and Stable Diffusion 3.5 Large Turbo, with special emphasis on customizability, efficiency, and flexibility. Both models come with a free licensing model for non-commercial and limited commercial use.
-
AI and ML Tracks at QCon San Francisco 2024 – a Deep Dive into GenAI & Practical Applications
At QCon San Francisco 2024, explore two AI/ML-focused tracks highlighting real-world applications and innovations. Learn from industry experts on deploying LLMs, GenAI, and recommendation systems, gaining practical strategies for integrating AI into software development.
-
Meta Optimizes Data Center Sustainability with Reinforcement Learning
In a recent blog post, Meta describes how its engineers use reinforcement learning (RL) to optimize environmental controls in Meta's data centers, reducing energy consumption and water usage while addressing broader challenges such as climate change.
-
Microsoft Unveils Azure Cobalt 100-Based Virtual Machines: Enhanced Performance and Sustainability
Microsoft's Azure Cobalt 100 VMs are now generally available. They deliver up to 50% improved price performance with energy-efficient Arm architecture. Tailored for diverse workloads, these VMs offer various configurations, including general-purpose and memory-optimized options. Their release supports sustainable computing, aligning with Microsoft's commitment to lower carbon footprints.
-
Microsoft Launches Azure Confidential VMs with NVIDIA Tensor Core GPUs for Enhanced Secure Workloads
Microsoft's Azure has launched the NCC H100 v5 virtual machines, now equipped with NVIDIA Tensor Core GPUs, enhancing secure computing for high-performance workloads. These VMs leverage AMD EPYC processors for robust data protection, making them ideal for tasks like AI model training and inferencing, while ensuring a trusted execution environment for sensitive applications.
-
Distill Your LLMs and Surpass Their Performance: spaCy's Creator at InfoQ DevSummit Munich
In her presentation at the inaugural InfoQ Dev Summit Munich, Ines Montani built on her talk from QCon London earlier this year, providing the audience with practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller, faster components that can be run and maintained in-house.
-
University Researchers Publish Analysis of Chain-of-Thought Reasoning in LLMs
Researchers from Princeton University and Yale University published a case study of Chain-of-Thought (CoT) reasoning in LLMs which shows evidence of both memorization and true reasoning. They also found that CoT can work even when examples given in the prompt are incorrect.
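The few-shot Chain-of-Thought setup the study analyzes can be sketched as prompt construction: each exemplar pairs a question with worked reasoning before the final answer, and the model is left to continue in the same pattern. The exemplar content below is invented for illustration; the study's finding that CoT can work even with incorrect exemplars concerns what happens when the reasoning strings in such prompts contain errors.

```python
# Sketch: assemble a few-shot Chain-of-Thought prompt.

def cot_prompt(examples: list[tuple[str, str, str]], question: str) -> str:
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")  # the model continues with its own reasoning
    return "\n\n".join(parts)

prompt = cot_prompt(
    [("What is 3 + 4 * 2?", "4 * 2 = 8, and 3 + 8 = 11.", "11")],
    "What is 5 + 6 * 2?",
)
```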
-
Microsoft and Tsinghua University Present DIFF Transformer for LLMs
Researchers from Microsoft AI and Tsinghua University have introduced a new architecture called the Differential Transformer (DIFF Transformer), aimed at improving the performance of large language models. This model enhances attention mechanisms by refining how models handle context and minimizing distractions from irrelevant information.
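The differential attention idea can be sketched numerically: two softmax attention maps are computed from two sets of query/key projections, and their difference, scaled by a learned factor, attends to the values, canceling common-mode "noise" attention shared by both maps. The shapes and the fixed lambda below are illustrative, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(q1, k1, q2, k2, v, lam=0.5):
    d = q1.shape[-1]
    a1 = softmax(q1 @ k1.T / np.sqrt(d))   # first attention map
    a2 = softmax(q2 @ k2.T / np.sqrt(d))   # second attention map
    return (a1 - lam * a2) @ v             # the difference attends to the values

rng = np.random.default_rng(1)
n, d = 4, 8
out = diff_attention(*(rng.normal(size=(n, d)) for _ in range(5)))
```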