Large Language Models Content on InfoQ
-
Databricks Unveils Lakehouse AI and MosaicML Acquisition at Data + AI Summit
The Data and AI company Databricks recently unveiled Lakehouse AI, a suite of tools for building and governing generative AI models, including large language models (LLMs), within the Databricks platform. Among the tools is LakehouseIQ, a "knowledge engine" that uses AI to understand a company's unique data, culture, and language in order to improve natural language interfaces such as chatbots.
-
Google's Speech AI AudioPaLM Performs Translation with Voice Transfer
Researchers at Google announced AudioPaLM, a large language model (LLM) that performs text-to-speech (TTS), automatic speech recognition (ASR), and speech-to-speech translation (S2ST) with voice transfer. AudioPaLM is based on the PaLM-2 LLM and outperforms OpenAI's Whisper on translation benchmarks.
-
UC Berkeley Researchers Open-Source API-Calling Language Model Gorilla
Researchers from UC Berkeley and Microsoft have open-sourced Gorilla, a large language model (LLM) that can write code to call APIs. In experiments measuring generated code accuracy, Gorilla outperforms several baseline models, including GPT-4.
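Per the project's documentation, Gorilla can be queried through an OpenAI-compatible endpoint, so the standard openai client can be pointed at a Gorilla inference server. The sketch below follows that pattern; the server URL is a placeholder and the model identifier is an assumption taken from the project's examples, so check the Gorilla repository for current values.

```python
import openai

# Point the openai 0.x client at a Gorilla inference server.
# The base URL is a placeholder and the model id is an assumption;
# consult the Gorilla repository for the current endpoint and model names.
openai.api_key = "EMPTY"  # the demo server does not require a real key
openai.api_base = "http://<gorilla-server>:8000/v1"

completion = openai.ChatCompletion.create(
    model="gorilla-7b-hf-v1",
    messages=[{
        "role": "user",
        "content": "I would like to translate English text to French.",
    }],
)

# Gorilla replies with code that calls a concrete API (for example a
# Hugging Face translation pipeline) rather than a free-form answer.
print(completion.choices[0].message.content)
```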
-
Microsoft Guidance Offers Language for Controlling Large Language Models
Microsoft recently introduced Guidance, a domain-specific language designed to give developers finer control over modern large language models. The framework integrates generation, prompting, and logical control into a single workflow. Regex pattern guides let developers enforce output formats while still allowing prompts to be completed naturally.
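Below is a minimal sketch in the 2023-era Guidance template syntax. It assumes a local Hugging Face model, since regex-guided generation needs token-level control; the model choice and field names are purely illustrative.

```python
import guidance

# Regex-guided generation needs token-level control, so this uses a local
# Hugging Face model; "gpt2" is only an illustrative choice.
guidance.llm = guidance.llms.Transformers("gpt2")

# A Guidance program interleaves literal text with {{gen}} calls; the regex
# pattern constrains the generated age to digits.
program = guidance("""The following is a short character profile in JSON.
{
    "name": "{{gen 'name' stop='"'}}",
    "age": {{gen 'age' pattern='[0-9]+' stop=','}}
}""")

result = program()
print(result["name"], result["age"])
```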
-
Google's PaLM-E Combines Vision and Language AI for Robot Control
Researchers from Google's Robotics team recently announced PaLM-E, a combination of their PaLM and Vision Transformer (ViT) models designed for controlling robots. PaLM-E handles multimodal input data from robotic sensors and outputs text commands to control the robot's actuators. Besides performing well on several robotics tasks, PaLM-E also outperforms other models on the OK-VQA benchmark.
-
QCon New York 2023 Panel Discussion: Navigating the Future - LLM in Production
The recent QCon New York conference featured a panel discussion titled "Navigating the Future: LLM in Production." Among the key takeaways: LLMs are following two trends, closed models behind APIs and open-source models, and organizations adopting LLMs will need to think deeply about testing and evaluating the models themselves, with a strong emphasis on risk mitigation.
-
Voxel51 Open-Sources Computer Vision Dataset Assistant VoxelGPT - Q&A with Jason Corso
Voxel51 recently open-sourced VoxelGPT, an AI assistant that interfaces with GPT-3.5 to produce Python code for querying computer vision datasets. InfoQ spoke with Jason Corso, co-founder and CSO of Voxel51, who shared their lessons and insights gained while developing VoxelGPT.
-
Nvidia's NeMo Guardrails Enhances Safety in Generative AI Applications
Nvidia's new NeMo Guardrails package for large language models (LLMs) helps developers mitigate LLM risks such as harmful or offensive content and access to sensitive data by providing an essential layer of protection in an increasingly AI-driven landscape.
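Guardrails are written in Colang, a small modeling language for conversational flows, and loaded alongside a model configuration. The minimal sketch below shows the Python entry points; the model settings and flow definitions are illustrative, so consult the NeMo Guardrails documentation for exact options.

```python
from nemoguardrails import LLMRails, RailsConfig

# Model configuration (YAML) plus a Colang flow that deflects political topics.
# Both snippets are illustrative; see the NeMo Guardrails docs for details.
yaml_content = """
models:
  - type: main
    engine: openai
    model: text-davinci-003
"""

colang_content = """
define user ask about politics
  "what do you think about the president?"

define bot refuse to answer politics
  "I'm sorry, I prefer not to discuss political topics."

define flow politics
  user ask about politics
  bot refuse to answer politics
"""

config = RailsConfig.from_content(yaml_content=yaml_content,
                                  colang_content=colang_content)
rails = LLMRails(config)

response = rails.generate(messages=[
    {"role": "user", "content": "What do you think about the president?"}
])
print(response["content"])
```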
-
Google Announces State-of-the-Art PaLM 2 Language Model Powering Bard
Google DeepMind recently announced PaLM 2, a large language model (LLM) powering Bard and over 25 other product features. PaLM 2 significantly outperforms the previous version of PaLM on a wide range of benchmarks, while being smaller and cheaper to run.
-
Minecraft Welcomes Its First LLM-Powered Agent
Researchers from Caltech, Stanford, the University of Texas, and NVIDIA have collaboratively developed and released Voyager, an LLM-powered agent that uses GPT-4 to play Minecraft. Voyager learns continuously, retains what it has learned, and demonstrates exceptional in-game skill.
-
InfraCopilot, a Conversational Infrastructure-as-Code Editor
Klotho announced InfraCopilot, an infrastructure-as-code (IaC) editor with natural language processing capabilities. Users can describe their infrastructure needs in a chat, and InfraCopilot translates those ideas into a low-level architecture. Users can then iterate with incremental high-level and low-level architecture changes.
-
Microsoft Open-Sources 13 Billion Parameter Language and Vision Chatbot LLaVA
Researchers from Microsoft, the University of Wisconsin–Madison, and Columbia University have open-sourced Large Language and Vision Assistant (LLaVA). LLaVA is based on a CLIP image encoder and a LLaMA language decoder, is fine-tuned on a synthetic instruction-following dataset, and achieved state-of-the-art accuracy on the ScienceQA benchmark.
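Architecturally, LLaVA maps the CLIP encoder's image features into the language model's token-embedding space through a learned projection. The PyTorch sketch below illustrates that connector in isolation; the layer sizes are assumptions matching typical CLIP ViT-L and LLaMA-13B dimensions, not LLaVA's actual training code.

```python
import torch
import torch.nn as nn

class VisionLanguageConnector(nn.Module):
    """Projects CLIP image features into the language model's embedding space.

    The sizes are illustrative: roughly 1024 for CLIP ViT-L/14 patch features
    and 5120 for the 13B LLaMA hidden dimension.
    """

    def __init__(self, vision_dim: int = 1024, text_dim: int = 5120):
        super().__init__()
        self.proj = nn.Linear(vision_dim, text_dim)

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from the CLIP encoder
        return self.proj(image_features)

# The projected "visual tokens" are concatenated with text-token embeddings
# and fed to the language decoder as one sequence.
connector = VisionLanguageConnector()
visual_tokens = connector(torch.randn(1, 256, 1024))
print(visual_tokens.shape)  # torch.Size([1, 256, 5120])
```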
-
Google Previews Studio Bot, a Coding Bot for Android Development
At Google I/O 2023, Google previewed Studio Bot, an AI-powered coding bot integrated into the latest version of Android Studio, codenamed Hedgehog. Studio Bot aims to help developers generate code, write unit tests, and fix errors.
-
Running Large Language Models Natively on Mobile and Laptops
MLC LLM is a new open-source project that aims to enable the deployment of large language models natively on a variety of hardware platforms and applications. It also includes a framework to optimize model performance for each specific use case.
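As a rough illustration, the project has offered a Python chat wrapper alongside its native apps. The sketch below assumes the early mlc_chat ChatModule API and an already-compiled quantized model; the package, class, and model names have shifted across releases and should be treated as assumptions.

```python
# Assumes the early mlc_chat Python wrapper and a model already compiled or
# downloaded with MLC LLM; the package, class, and model id are assumptions
# that may differ in current releases.
from mlc_chat import ChatModule

cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1")

# Generation runs entirely on local hardware (Metal, Vulkan, or CUDA,
# depending on how the model was compiled).
output = cm.generate(prompt="Explain what a language model is in one sentence.")
print(output)
```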
-
Efficiently Applying LLMs to Transform Semi-Structured Data
LLMs can be an effective, if expensive, way to generate structured data from semi-structured data. A team of Stanford and Cornell researchers claims to have found a technique that cuts inference cost by a factor of 110 while improving inference quality.
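The savings come from doing the expensive LLM reasoning once, for example by having the model synthesize a small extraction function from a few sample documents, and then running that cheap function over the whole corpus instead of prompting the model per document. The sketch below illustrates that amortization pattern with a hypothetical ask_llm helper; it is a simplification of the idea, not the researchers' actual code.

```python
def ask_llm(prompt: str) -> str:
    """Hypothetical helper: send a prompt to an LLM and return its reply."""
    raise NotImplementedError("wire this up to your LLM provider of choice")

def synthesize_extractor(sample_docs: list[str], field: str) -> str:
    # One expensive LLM call produces reusable Python extraction code...
    prompt = (
        f"Write a Python function extract(doc: str) that returns the '{field}' "
        "value from documents like these:\n\n" + "\n---\n".join(sample_docs)
    )
    return ask_llm(prompt)

def run_extractor(code: str, docs: list[str]) -> list[str]:
    # ...which is then applied to every document without further LLM calls.
    namespace: dict = {}
    exec(code, namespace)  # review generated code before executing it
    extract = namespace["extract"]
    return [extract(doc) for doc in docs]
```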