Defensible Moats: Unlocking Enterprise Value with Large Language Models at QCon San Francisco
In a recent presentation at QCon San Francisco, Nischal HP discussed the challenges enterprises face when building LLM-powered applications using APIs alone. These challenges include data fragmentation, the absence of a shared business vocabulary, data privacy concerns, and diverse objectives among stakeholders.
-
The Challenges of Producing Quality Code When Using AI-Based Generalistic Models
Using generalist AI models for very specific tasks such as generating code can cause problems. Code produced by AI is like code from an author you don't know: it may not match your standards and quality expectations. Creating specialised or dedicated models can be a way out.
-
Practical Advice for Retrieval Augmented Generation (RAG), by Sam Partee at QCon San Francisco
At the recent QCon San Francisco conference, Sam Partee, principal engineer at Redis, gave a talk about Retrieval Augmented Generation (RAG). He discussed Generative Search, which combines large language models (LLMs) with vector databases to improve information retrieval, and covered several innovative tricks such as Hypothetical Document Embeddings (HyDE) and semantic caching.
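The core HyDE idea can be sketched in a few lines: instead of embedding the raw query, have an LLM generate a hypothetical answer document, embed that, and retrieve by vector similarity. This is a toy illustration, not Partee's implementation: `embed` is a stand-in bag-of-words embedding and `fake_llm` is a stand-in for a real LLM call.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; a real system would call an
    # embedding model and get a dense float vector instead.
    return Counter(text.lower().replace(".", "").replace(",", "").split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hyde_search(query, docs, generate):
    # HyDE: embed an LLM-generated hypothetical answer rather than
    # the raw query, then rank documents by vector similarity.
    hypothetical = generate(query)
    qvec = embed(hypothetical)
    return max(docs, key=lambda d: cosine(qvec, embed(d)))

docs = [
    "Redis supports vector similarity search over embeddings.",
    "Large language models generate text one token at a time.",
]
# Stand-in for an LLM call: returns a plausible answer text.
fake_llm = lambda q: "Redis can store embeddings and run vector similarity search."
print(hyde_search("Which database offers vector search?", docs, fake_llm))
# prints the Redis document
```

The hypothetical document tends to share more vocabulary (and, with real embeddings, more semantics) with relevant documents than a short query does, which is what makes the trick effective.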
-
Generative AI Service AWS Bedrock Now Generally Available
After announcing Bedrock in preview last April, Amazon is now making its fully managed service for generative AI applications generally available.
-
Multi-Modal LLM NExT-GPT Handles Text, Images, Videos, and Audio
The NExT Research Center at the National University of Singapore (NUS) recently open-sourced NExT-GPT, an "any-to-any" multi-modal large language model (LLM) that can handle text, images, videos, and audio as input or output. NExT-GPT is based on existing pre-trained models and only required updating 1% of its total parameters during training.
-
Hugging Face's Guide to Optimizing LLMs in Production
When it comes to deploying Large Language Models (LLMs) in production, the two major challenges originate from the huge amount of parameters they require and the necessity of handling very long input sequences to represent contextual information. Hugging Face has documented a list of techniques to tackle those hurdles based on their experience serving such models.
-
Abu Dhabi Releases Largest Openly-Available Language Model Falcon 180B
The Abu Dhabi government's Technology Innovation Institute (TII) released Falcon 180B, currently the largest openly-available large language model (LLM). Falcon 180B contains 180 billion parameters and outperforms GPT-3.5 on the MMLU benchmark.
-
AI, ML, Data Engineering News Roundup: Stable Chat, Vertex AI, ChatGPT and Code Llama
The most recent update, which covers developments through September 4, 2023, highlights significant announcements and accomplishments in the fields of artificial intelligence, machine learning, and data science. Developments from Stability AI, Google, OpenAI, and Meta were among this week's significant stories.
-
Weekly Update on Large Language Models: PointLLM, WALL-E, AskIt, and Jais
This week's compilation covers advanced research, inventive applications, and notable unveilings in the realm of Large Language Models (LLMs) during the week starting September 4th, 2023.
-
Google Cloud Unveils AlloyDB AI: Transforming PostgreSQL with Advanced Vector Embeddings and AI
During the recent Google Cloud Next, Google announced AlloyDB AI in preview as an integral part of AlloyDB for PostgreSQL. It allows developers to build generative AI applications that combine large language models (LLMs) with their real-time operational data, through built-in, end-to-end support for vector embeddings.
-
Meta Open-Sources Code Generation LLM Code Llama
Meta recently open-sourced Code Llama, a code generation LLM which is based on the Llama 2 foundation model and carries the same community license. Code Llama was fine-tuned on 500B tokens of code and is available in three model sizes ranging up to 34B parameters. In evaluations on code-generation benchmarks, the model outperformed all other open-source models and is comparable to ChatGPT.
-
Google Expands Vertex AI Search and Conversation Capabilities
At its Google Cloud Next conference, Google officially introduced new capabilities for its enterprise AI platform, Vertex AI, which aim to enable more advanced user workflows, among other things.
-
How Spotify Improved its LLM Chatbot in Sidekick
While using a Large Language Model chatbot opens the door to innovative solutions, Spotify engineer Ates Goral argues that crafting a user experience that feels as natural as possible requires specific efforts to reduce latency.
-
Stability AI Launches Open Source Chatbot Stable Chat
Stability AI, makers of the image generation AI Stable Diffusion, recently launched Stable Chat, a web-based chat interface for their open-access language model Stable Beluga. At the time of its release, Stable Beluga was the best-performing open large language model (LLM) on the HuggingFace leaderboard.
-
MetaGPT Leverages Human Collaboration Techniques for Multi-Agent-Based Software Engineering
MetaGPT, created by a team of researchers from Chinese and US universities, is a new LLM-based meta-programming framework that aims to enable collaboration in multi-agent systems. It leverages human procedural knowledge to enhance robustness, reduce errors, and engineer software solutions for complex tasks.