InfoQ Homepage Large language models Content on InfoQ
-
Meta MobileLLM Advances LLM Design for On-Device Use Cases
With MobileLLM, Meta researchers aim to show that, for smaller models, quality is not a direct product of how many billions parameters they have; rather, it is the result of carefully designing their architecture. To prove their point, they coupled deep and thin architectures with embedding sharing and grouped-query attention mechanisms to improve accuracy over prior state-of-the-art models.
-
Microsoft Introduces Vector Data Abstractions Library for .NET
On October 29th 2024, Microsoft released Microsoft.Extensions.VectorData.Abstractions library for .NET in preview. It makes it easier to integrate .NET solutions with the AI Semantic Kernel SDK, using abstractions over concrete AI implementations and models.
-
Meta AI Introduces Thought Preference Optimization Enabling AI Models to Think before Responding
Researchers from Meta FAIR, the University of California, Berkeley, and New York University have introduced Thought Preference Optimization (TPO), a new method aimed at improving the response quality of instruction-fine tuned LLMs.
-
Meta Spirit LM Integrates Speech and Text in New Multimodal GenAI Model
Presented in a recent paper, Spirit LM enables the creation of pipelines that mixes spoken and written text to integrate speech and text in the same multimodal model. According to Meta, their novel approach, based on interleaving text and speech tokens, makes it possible to circumvent the inherent limitations of prior solutions that use distinct pipelines for speech and text.
-
Stable Diffusion 3.5 Improves Text Rendering, Image Quality, Consistency, and More
Stability AI has released Stable Diffusion 3.5 Large, its most powerful text-to-image generation model to date, and Stable Diffusion 3.5 Large Turbo, with special emphasis on customizability, efficiency, and flexibility. Both models come with a free licensing model for non commercial and limited commercial use.
-
AI and ML Tracks at QCon San Francisco 2024 – a Deep Dive into GenAI & Practical Applications
At QCon San Francisco 2024, explore two AI/ML-focused tracks highlighting real-world applications and innovations. Learn from industry experts on deploying LLMs, GenAI, and recommendation systems, gaining practical strategies for integrating AI into software development.
-
Distill Your LLMs and Surpass Their Performance: spaCy's Creator at InfoQ DevSummit Munich
In her presentation at the inaugural edition of InfoQ Dev Summit Munich, Ines Montani built on top of the presentation she had earlier this year at QCon London and provided the audience with practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components that you can run and maintain in-house.
-
University Researchers Publish Analysis of Chain-of-Thought Reasoning in LLMs
Researchers from Princeton University and Yale University published a case study of Chain-of-Thought (CoT) reasoning in LLMs which shows evidence of both memorization and true reasoning. They also found that CoT can work even when examples given in the prompt are incorrect.
-
Microsoft and Tsinghua University Present DIFF Transformer for LLMs
Researchers from Microsoft AI and Tsinghua University have introduced a new architecture called the Differential Transformer (DIFF Transformer), aimed at improving the performance of large language models. This model enhances attention mechanisms by refining how models handle context and minimizing distractions from irrelevant information.
-
Vertex AI in Firebase Aims to Simplify the Creation of Gemini-powered Mobile Apps
Currently available in beta, the Vertex AI SDK for Firebase enables the creation of apps that go beyond the simple chat model and text prompting. Google has just made available a colab to help developers through the steps required to integrate it into their apps.
-
Google Publishes LLM Self-Correction Algorithm SCoRe
Researchers at Google DeepMind recently published a paper on Self-Correction via Reinforcement Learning (SCoRe), a technique for improving LLMs' ability to self-correct when solving math or coding problems. Models fine-tuned with SCoRe achieve improved performance on several benchmarks compared to baseline models.
-
NVIDIA Unveils NVLM 1.0: Open-Source Multimodal LLM with Improved Text and Vision Capabilities
NVIDIA unveiled NVLM 1.0, an open-source multimodal large language model (LLM) that performs strongly on both vision-language and text-only tasks. NVLM 1.0 shows improvements in text-based tasks after multimodal training, standing out among current models. The model weights are now available on Hugging Face, with the training code set to be released shortly.
-
Hugging Face Upgrades Open LLM Leaderboard v2 for Enhanced AI Model Comparison
Hugging Face has recently released Open LLM Leaderboard v2, an upgraded version of their benchmarking platform for large language models. Hugging Face created the Open LLM Leaderboard to provide a standardized evaluation setup for reference models, ensuring reproducible and comparable results.
-
PayPal Adds GenAI Support with LLMs to Its Cosmos.AI MLOps Platform
PayPal extended its MLOps platform Cosmos.AI to support the development of generative AI applications using large language models (LLMs). The company incorporated support for vendor, open-source, and self-tuned LLMs and provided capabilities around retrieval-augmented generation (RAG), semantic caching, prompt management, orchestration, and AI application hosting.
-
University of Chinese Academy of Sciences Open-Sources Multimodal LLM LLaMA-Omni
Researchers at the University of Chinese Academy of Sciences (UCAS) recently open-sourced LLaMA-Omni, an LLM that can operate on both speech and text data. LLaMA-Omni is based on Meta's Llama-3.1-8B-Instruct LLM and outperforms similar baseline models while requiring less training data and compute.