Large Language Models: Content on InfoQ
-
Recap of Google I/O 2024: Gemini 1.5, Project Astra, AI-powered Search Engine
Google recently hosted its annual developer conference, Google I/O 2024, where numerous announcements were made regarding Google’s apps and services. As anticipated, AI was a focal point of the event, being incorporated into almost all Google products. Here is a summary of the major announcements from the event.
-
Google Brings Gemini Nano to Chrome to Enable On-Device Generative AI
At its Google I/O 2024 developer conference, Google announced it is working to make support for on-device large language models a reality by bringing the smallest of its Gemini models, Gemini Nano, to Chrome.
-
OpenAI Announces New Flagship Model GPT-4o
OpenAI recently announced the latest version of their GPT AI foundation model, GPT-4o. GPT-4o is faster than the previous version of GPT-4 and has improved capabilities in handling speech, vision, and multilingual tasks, outperforming all models except Google's Gemini on several benchmarks.
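For illustration, here is a minimal call to GPT-4o through the OpenAI Python SDK, combining text and an image in a single message; the image URL is a placeholder:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# GPT-4o accepts mixed text and image content in one user message.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```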
-
AI Lab Extension Allows Podman Desktop Users to Experiment with LLMs Locally
One year after its 1.0 release, Podman Desktop announced the Podman AI Lab plugin, which promises to help developers start working with Large Language Models (LLMs) on their own machines. Podman AI Lab streamlines LLM workflows with features such as generative AI exploration, a built-in recipe catalogue, curated models, local model serving, an OpenAI-compatible API, code snippets, and playground environments.
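Because the local service speaks the OpenAI wire protocol, existing client code can be pointed at it by overriding the base URL. A minimal sketch; the port and model name below are assumptions to be read off the AI Lab service details:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local Podman AI Lab service.
# The port and model name are assumptions; check the service details
# shown in Podman Desktop for the actual values on your machine.
client = OpenAI(base_url="http://localhost:35000/v1", api_key="unused")

completion = client.chat.completions.create(
    model="granite-7b-lab",  # hypothetical name; pick a model from the catalog
    messages=[{"role": "user", "content": "What does local model serving buy me?"}],
)
print(completion.choices[0].message.content)
```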
-
Apple Open-Sources One Billion Parameter Language Model OpenELM
Apple released OpenELM, a Transformer-based language model. OpenELM uses a layer-wise scaling strategy to allocate parameters more efficiently across layers, and it outperforms similarly-sized models while requiring fewer tokens for training.
-
Meta Releases Llama 3 Open-Source LLM
Meta AI released Llama 3, the latest generation of their open-source large language model (LLM) family. The model is available in 8B and 70B parameter sizes, each with a base and an instruction-tuned variant. Llama 3 outperforms other LLMs of the same parameter size on standard LLM benchmarks.
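As a quick sketch, the instruction-tuned 8B variant can be run with the Hugging Face transformers pipeline, assuming a recent transformers version and access to the gated weights:

```python
import torch
from transformers import pipeline

# Assumes access to the gated meta-llama repo has been granted and
# `huggingface-cli login` has been run.
generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Recent transformers versions accept chat-style messages directly.
messages = [{"role": "user", "content": "Explain instruction tuning in one sentence."}]
print(generator(messages, max_new_tokens=64)[0]["generated_text"][-1]["content"])
```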
-
The Devoxx Genie IntelliJ Plugin Provides Access to Local or Cloud-Based LLMs
Devoxx Genie, a 100% Java-based JetBrains IntelliJ IDEA plugin, uses local or cloud-based Large Language Models (LLMs) to generate unit tests and to explain, review, and improve source code.
-
Ines Montani at QCon London: Economies of Scale Can’t Monopolise the AI Revolution
During her presentation at QCon London, Ines Montani, co-founder and CEO of explosion.ai (the maker of spaCy), stated that economies of scale are not enough to create monopolies in the AI space and that open-source techniques and models will allow everybody to keep up with the “Gen AI revolution”.
-
Enhancing Developer Experience for Creating Artificial Intelligence Applications
For one company, large language models were a breakthrough in artificial intelligence (AI): developers could craft prompts and call APIs without needing AI science expertise. To enhance the developer experience of building applications and tools, the company defined and established principles around simplicity, immediate accessibility, security and quality, and cost efficiency.
-
Google Text Embedding Model Gecko Distills Large Language Models for Improved Performance
Gecko is a text embedding model that Google created by distilling knowledge from large language models into a general-purpose embedder. Gecko is trained with a novel approach on a variety of tasks, including document retrieval, semantic similarity, and classification, and aims to be both broadly general-purpose and highly performant.
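Gecko's name matches Vertex AI's textembedding-gecko model family; assuming those models expose it, here is a minimal similarity sketch (the project ID and version tag are placeholders):

```python
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="us-central1")  # placeholders
model = TextEmbeddingModel.from_pretrained("textembedding-gecko@003")

query = "How was the Gecko embedding model trained?"
docs = ["Gecko distills knowledge from LLMs.", "The weather is sunny today."]
vectors = [np.array(e.values) for e in model.get_embeddings([query] + docs)]

# Cosine similarity between the query and each document.
q = vectors[0]
for text, v in zip(docs, vectors[1:]):
    score = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
    print(f"{score:.3f}  {text}")
```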
-
Large Language Models for Code by Loubna Ben Allal at QCon London
At QCon London, Loubna Ben Allal discussed Large Language Models (LLMs) for code. She walked through the lifecycle of code completion models, which consists of pre-training on vast codebases, fine-tuning, and continuous adaptation. She focused in particular on open-source models, which are available through platforms like Hugging Face.
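For example, an open code completion model from the BigCode project can be queried locally via transformers; the model choice here is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# An open code model hosted on Hugging Face, chosen for illustration.
checkpoint = "bigcode/starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# Code completion: the model continues the function body.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```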
-
Navigating LLM Deployment: Tips, Tricks and Techniques by Meryem Arik at QCon London
At QCon London, Meryem Arik discussed deploying Large Language Models (LLMs). While initial proofs of concept benefit from hosted solutions, scaling demands self-hosting to cut costs, enhance performance with tailored models, and meet privacy and security requirements. She emphasized understanding deployment limits, quantization for efficiency, and optimizing inference to fully use GPU resources.
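As a sketch of the quantization point, a 7B model can be loaded in 4-bit NF4 precision via bitsandbytes, roughly quartering its memory footprint; the model ID is an example:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization: weights are stored in 4 bits and dequantized
# to bfloat16 on the fly during compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",  # example model; any causal LM works
    quantization_config=bnb_config,
    device_map="auto",
)
print(f"~{model.get_memory_footprint() / 1e9:.1f} GB in GPU memory")
```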
-
Databricks Announces DBRX, an Open Source General Purpose LLM
Databricks launched DBRX, a new open-source large language model (LLM) that aims to redefine the standards of open models and outperform well-known competitors on industry benchmarks.
-
Apple Researchers Detail Method to Combine Different LLMs to Achieve State-of-the-Art Performance
Many large language models (LLMs), both closed and open source, have become available recently, leading to models that combine language with other modalities, known as Multimodal LLMs (MLLMs). Yet few, if any, of them disclose the design choices that went into their creation, say Apple researchers, who have distilled principles and lessons for designing state-of-the-art (SOTA) MLLMs.
-
Researchers Open-Source LLM Jailbreak Defense Algorithm SafeDecoding
Researchers from the University of Washington, the Pennsylvania State University, and the Allen Institute for AI have open-sourced SafeDecoding, a technique for protecting large language models (LLMs) against jailbreak attacks. SafeDecoding outperforms baseline jailbreak defenses without incurring significant computational overhead.
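The core idea is to steer decoding with a safety-tuned "expert" copy of the model. The sketch below only illustrates that distribution-blending idea; the published algorithm adds candidate-token selection and applies the blend only to the first decoding steps, so treat this as an approximation, not the paper's method:

```python
import torch

def blend_next_token_probs(base_logits, expert_logits, alpha=2.0):
    # p = p_base + alpha * (p_expert - p_base): alpha > 1 amplifies the
    # safety expert's preferences relative to the base model.
    p_base = torch.softmax(base_logits, dim=-1)
    p_expert = torch.softmax(expert_logits, dim=-1)
    blended = p_base + alpha * (p_expert - p_base)
    # Amplification can push some probabilities below zero; clamp and
    # renormalize before sampling.
    blended = blended.clamp_min(0.0)
    return blended / blended.sum(dim=-1, keepdim=True)
```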