InfoQ Homepage AI, ML & Data Engineering Content on InfoQ
-
AI and ML Tracks at QCon San Francisco 2024 – a Deep Dive into GenAI & Practical Applications
At QCon San Francisco 2024, explore two AI/ML-focused tracks highlighting real-world applications and innovations. Learn from industry experts on deploying LLMs, GenAI, and recommendation systems, gaining practical strategies for integrating AI into software development.
-
Meta Optimizes Data Center Sustainability with Reinforcement Learning
In a recent blog post, Meta describes how its engineers use reinforcement learning (RL), to optimize environmental controls in Meta’s data centers, reducing energy consumption and water usage while addressing broader challenges such as climate change.
-
Microsoft Unveils Azure Cobalt 100-Based Virtual Machines: Enhanced Performance and Sustainability
Microsoft's Azure Cobalt 100 VMs are now generally available. They deliver up to 50% improved price performance with energy-efficient Arm architecture. Tailored for diverse workloads, these VMs offer various configurations, including general-purpose and memory-optimized options. Their release supports sustainable computing, aligning with Microsoft's commitment to lower carbon footprints.
-
Microsoft Launches Azure Confidential VMs with NVIDIA Tensor Core GPUs for Enhanced Secure Workloads
Microsoft's Azure has launched the NCC H100 v5 virtual machines, now equipped with NVIDIA Tensor Core GPUs, enhancing secure computing for high-performance workloads. These VMs leverage AMD EPYC processors for robust data protection, making them ideal for tasks like AI model training and inferencing, while ensuring a trusted execution environment for sensitive applications.
-
Distill Your LLMs and Surpass Their Performance: spaCy's Creator at InfoQ DevSummit Munich
In her presentation at the inaugural edition of InfoQ Dev Summit Munich, Ines Montani built on top of the presentation she had earlier this year at QCon London and provided the audience with practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components that you can run and maintain in-house.
-
University Researchers Publish Analysis of Chain-of-Thought Reasoning in LLMs
Researchers from Princeton University and Yale University published a case study of Chain-of-Thought (CoT) reasoning in LLMs which shows evidence of both memorization and true reasoning. They also found that CoT can work even when examples given in the prompt are incorrect.
-
Microsoft and Tsinghua University Present DIFF Transformer for LLMs
Researchers from Microsoft AI and Tsinghua University have introduced a new architecture called the Differential Transformer (DIFF Transformer), aimed at improving the performance of large language models. This model enhances attention mechanisms by refining how models handle context and minimizing distractions from irrelevant information.
-
OpenAI Releases Swarm, an Experimental Open-Source Framework for Multi-Agent Orchestration
Recently released as an experimental tool, Swarm aims to allow developers to investigate how they can have multiple agents coordinate with one another to execute tasks using routines and handoffs.
-
Vertex AI in Firebase Aims to Simplify the Creation of Gemini-powered Mobile Apps
Currently available in beta, the Vertex AI SDK for Firebase enables the creation of apps that go beyond the simple chat model and text prompting. Google has just made available a colab to help developers through the steps required to integrate it into their apps.
-
Google Publishes LLM Self-Correction Algorithm SCoRe
Researchers at Google DeepMind recently published a paper on Self-Correction via Reinforcement Learning (SCoRe), a technique for improving LLMs' ability to self-correct when solving math or coding problems. Models fine-tuned with SCoRe achieve improved performance on several benchmarks compared to baseline models.
-
OpenAI Launches Public Beta of Realtime API for Low-Latency Speech Interactions
OpenAI launched the public beta of the Realtime API, offering developers the ability to create low-latency, multimodal voice interactions within their applications. Additionally, audio input/output is now available in the Chat Completions API, expanding options for voice-driven applications. Early feedback highlights limited voice options and response cutoffs.
-
NVIDIA Unveils NVLM 1.0: Open-Source Multimodal LLM with Improved Text and Vision Capabilities
NVIDIA unveiled NVLM 1.0, an open-source multimodal large language model (LLM) that performs strongly on both vision-language and text-only tasks. NVLM 1.0 shows improvements in text-based tasks after multimodal training, standing out among current models. The model weights are now available on Hugging Face, with the training code set to be released shortly.
-
OpenAI Developer Day 2024 (SF) Announces Real-Time API, Vision Fine-Tuning, and More
On October 1, 2024, OpenAI SF DevDay unveiled innovative features, including a Real-Time API enabling instant voice interactions and function calling. Enhanced model distillation and vision fine-tuning empower developers to customize AI for diverse applications. Upcoming events in London and Singapore will further expand these capabilities.
-
Setting up a Data Mesh Organization
A data mesh organization: producers, consumers, and the platform. According to Matthias Patzak, the mission of the platform team is to make the lives of the producer and consumers simple, efficient and stress free. Data must be discoverable and understandable, trustworthy, and shared securely and easily across the organization.
-
Hugging Face Upgrades Open LLM Leaderboard v2 for Enhanced AI Model Comparison
Hugging Face has recently released Open LLM Leaderboard v2, an upgraded version of their benchmarking platform for large language models. Hugging Face created the Open LLM Leaderboard to provide a standardized evaluation setup for reference models, ensuring reproducible and comparable results.