News

AI, ML & Data Engineering

Meta AI Introduces Thought Preference Optimization Enabling AI Models to Think before Responding

Researchers from Meta FAIR, the University of California, Berkeley, and New York University have introduced Thought Preference Optimization (TPO), a new method aimed at improving the response quality of instruction-fine-tuned LLMs.

Daniel Dominguez
on Nov 04, 2024
AI, ML & Data Engineering

PostgreSQL 17 Released with Improved Vacuum Process and Performance Gains

The PostgreSQL Global Development Group recently announced the general availability of PostgreSQL 17, the latest version of the popular open-source database. This release focuses on performance improvements, including a new memory management implementation for vacuum, storage access optimizations, and enhancements for high-concurrency workloads.

Renato Losio
on Nov 01, 2024
AI, ML & Data Engineering

Meta Spirit LM Integrates Speech and Text in New Multimodal GenAI Model

Presented in a recent paper, Spirit LM enables the creation of pipelines that mix spoken and written text, integrating speech and text in a single multimodal model. According to Meta, its novel approach, based on interleaving text and speech tokens, makes it possible to circumvent the inherent limitations of prior solutions that use distinct pipelines for speech and text.

Sergio De Simone
on Oct 31, 2024
AI, ML & Data Engineering

PyTorch 2.5 Release Includes Support for Intel GPUs

The PyTorch Foundation recently released PyTorch version 2.5, which adds support for Intel GPUs. The release also includes several performance enhancements, such as the FlexAttention API, TorchInductor CPU backend optimizations, and a regional compilation feature that reduces compilation time. Overall, the release contains 4095 commits since PyTorch 2.4.

Anthony Alford
on Oct 29, 2024
Architecture & Design

RAG-Powered Copilot Saves Uber 13,000 Engineering Hours

Uber recently detailed how it built Genie, an AI-powered on-call copilot designed to improve the efficiency of on-call support engineers. Genie leverages Retrieval-Augmented Generation (RAG) to provide accurate real-time responses and significantly enhance the speed and effectiveness of incident response. Since its launch, Genie has answered over 70,000 questions, saving 13,000 engineering hours.

Eran Stiller
on Oct 29, 2024
AI, ML & Data Engineering

Rhymes AI Unveils Aria: Open-Source Multimodal Model with Development Resources

Rhymes AI has introduced Aria, an open-source, multimodal-native Mixture-of-Experts (MoE) model capable of processing text, images, video, and code. In benchmarking tests, Aria has outperformed other open models and demonstrated competitive performance against proprietary models such as GPT-4o and Gemini-1.5.

Robert Krzaczyński
on Oct 28, 2024
AI, ML & Data Engineering

Stable Diffusion 3.5 Improves Text Rendering, Image Quality, Consistency, and More

Stability AI has released Stable Diffusion 3.5 Large, its most powerful text-to-image generation model to date, and Stable Diffusion 3.5 Large Turbo, with special emphasis on customizability, efficiency, and flexibility. Both models come with a free license for non-commercial and limited commercial use.

Sergio De Simone
on Oct 25, 2024
AI, ML & Data Engineering

AI and ML Tracks at QCon San Francisco 2024 – a Deep Dive into GenAI & Practical Applications

At QCon San Francisco 2024, explore two AI/ML-focused tracks highlighting real-world applications and innovations. Learn from industry experts on deploying LLMs, GenAI, and recommendation systems, gaining practical strategies for integrating AI into software development.

Artenisa Chatziou
on Oct 25, 2024
DevOps

Meta Optimizes Data Center Sustainability with Reinforcement Learning

In a recent blog post, Meta describes how its engineers use reinforcement learning (RL) to optimize environmental controls in Meta’s data centers, reducing energy consumption and water usage while addressing broader challenges such as climate change.

Claudio Masolo
on Oct 25, 2024
Cloud

Microsoft Unveils Azure Cobalt 100-Based Virtual Machines: Enhanced Performance and Sustainability

Microsoft's Azure Cobalt 100 VMs are now generally available. Built on an energy-efficient Arm architecture, they deliver up to 50% better price-performance. Tailored for diverse workloads, these VMs offer various configurations, including general-purpose and memory-optimized options. Their release supports sustainable computing, aligning with Microsoft's commitment to a lower carbon footprint.

Steef-Jan Wiggers
on Oct 24, 2024
Cloud

Microsoft Launches Azure Confidential VMs with NVIDIA Tensor Core GPUs for Enhanced Secure Workloads

Microsoft Azure has launched the NCC H100 v5 virtual machines, equipped with NVIDIA Tensor Core GPUs, enhancing secure computing for high-performance workloads. These VMs leverage AMD EPYC processors for robust data protection, making them well suited to tasks like AI model training and inferencing while ensuring a trusted execution environment for sensitive applications.

Steef-Jan Wiggers
on Oct 23, 2024
AI, ML & Data Engineering

Distill Your LLMs and Surpass Their Performance: spaCy's Creator at InfoQ DevSummit Munich

In her presentation at the inaugural edition of InfoQ Dev Summit Munich, Ines Montani built on the talk she gave earlier this year at QCon London. She offered the audience practical solutions for using the latest state-of-the-art models in real-world applications and for distilling their knowledge into smaller, faster components that can be run and maintained in-house.

Olimpiu Pop
on Oct 23, 2024
AI, ML & Data Engineering

University Researchers Publish Analysis of Chain-of-Thought Reasoning in LLMs

Researchers from Princeton University and Yale University published a case study of Chain-of-Thought (CoT) reasoning in LLMs that shows evidence of both memorization and true reasoning. They also found that CoT can work even when the examples given in the prompt are incorrect.

Anthony Alford
on Oct 22, 2024
AI, ML & Data Engineering

Microsoft and Tsinghua University Present DIFF Transformer for LLMs

Researchers from Microsoft AI and Tsinghua University have introduced a new architecture called the Differential Transformer (DIFF Transformer), aimed at improving the performance of large language models. This model enhances attention mechanisms by refining how models handle context and minimizing distractions from irrelevant information.

Daniel Dominguez
on Oct 20, 2024
AI, ML & Data Engineering

OpenAI Releases Swarm, an Experimental Open-Source Framework for Multi-Agent Orchestration

Recently released as an experimental tool, Swarm lets developers explore how multiple agents can coordinate with one another to execute tasks using routines and handoffs.

Sergio De Simone
on Oct 20, 2024
