Large language models Content on InfoQ
-
Abu Dhabi Releases Largest Openly-Available Language Model Falcon 180B
The Abu Dhabi government's Technology Innovation Institute (TII) released Falcon 180B, currently the largest openly-available large language model (LLM). Falcon 180B contains 180 billion parameters and outperforms GPT-3.5 on the MMLU benchmark.
-
AI, ML, Data Engineering News Roundup: Stable Chat, Vertex AI, ChatGPT and Code Llama
The most recent update, which covers developments through September 4, 2023, highlights significant announcements and accomplishments in artificial intelligence, machine learning, and data science. Developments from Stability AI, Google, OpenAI, and Meta were among this week's significant stories.
-
Weekly Update on Large Language Models: PointLLM, WALL-E, AskIt, and Jais
This roundup compiles advanced research, inventive applications, and notable unveilings in the realm of Large Language Models (LLMs) from the week starting September 4th, 2023.
-
Google Cloud Unveils AlloyDB AI: Transforming PostgreSQL with Advanced Vector Embeddings and AI
During the recent Google Cloud Next, Google announced AlloyDB AI in preview as an integral part of AlloyDB for PostgreSQL, allowing developers to build generative (gen) Artificial Intelligence (AI) applications leveraging large language models (LLMs) with their real-time operational data through built-in, end-to-end support for vector embeddings.
-
Meta Open-Sources Code Generation LLM Code Llama
Meta recently open-sourced Code Llama, a code generation LLM which is based on the Llama 2 foundation model and carries the same community license. Code Llama was fine-tuned on 500B tokens of code and is available in three model sizes ranging up to 34B parameters. In evaluations on code-generation benchmarks, the model outperformed all other open-source models and is comparable to ChatGPT.
-
Google Expands Vertex AI Search and Conversation Capabilities
At its Google Cloud Next conference, Google officially introduced new capabilities for its enterprise AI platform, Vertex AI, which aim to enable more advanced user workflows, among other things.
-
How Spotify Improved its LLM Chatbot in Sidekick
While using a Large Language Model chatbot opens the door to innovative solutions, Spotify engineer Ates Goral argues that crafting the user experience so it is as natural as possible requires some specific efforts in order to reduce latency.
-
Stability AI Launches Open Source Chatbot Stable Chat
Stability AI, makers of the image generation AI Stable Diffusion, recently launched Stable Chat, a web-based chat interface for their open-access language model Stable Beluga. At the time of its release, Stable Beluga was the best-performing open large language model (LLM) on the HuggingFace leaderboard.
-
MetaGPT Leverages Human Collaboration Techniques for Multi-Agent-Based Software Engineering
Created by a team of researchers from Chinese and US universities, MetaGPT is a new LLM-based meta programming framework aiming to enable collaboration in multi-agent systems by leveraging human procedural knowledge to enhance robustness, reduce errors, and engineer software solutions for complex tasks.
-
LMSYS Org Releases Chatbot Arena and LLM Evaluation Datasets
Large Model Systems Organization (LMSYS Org) recently released Chatbot Arena, a comparison platform for large language models (LLMs), where users can pick the better response from a pair of chatbots. LMSYS also released a dataset containing conversations from the Arena as well as a dataset of human annotations of results from evaluating LLMs on the MT-Bench benchmark.
-
Semantic Kernel LLM Java SDK Now Available, Simplifying GenAI Integration
Microsoft has announced the availability of its Semantic Kernel software development kit (SDK) for Java, designed to mesh Large Language Models (LLMs) with popular programming languages, extending support beyond C# and Python.
-
Researchers Publish Attack Algorithm for ChatGPT and Other LLMs
Researchers from Carnegie Mellon University (CMU) have published LLM Attacks, an algorithm for constructing adversarial attacks on a wide range of large language models (LLMs), including ChatGPT, Claude, and Bard. The attacks are generated automatically and are successful 84% of the time on GPT-3.5 and GPT-4, and 66% of the time on PaLM-2.
-
Meta Open Sources New AI Model Llama 2
Meta is open-sourcing its large language model, Llama 2. The model’s code and weights are being made available free of charge for both research and commercial use. Llama 2 is the result of the expanded partnership between Meta and Microsoft, with the latter being the preferred partner for the new model.
-
LangChain - Working with Large Language Models, Made Easy
LangChain is a framework that simplifies working with large language models (LLMs) such as OpenAI GPT-4 or Google PaLM by providing abstractions for common use cases. It supports both JavaScript and Python.
-
GitHub Details Key Prompt Engineering Practices Used to Build Copilot
Prompt engineering is key to creating effective LLM-based applications and does not require a PhD in machine learning or generative AI, say GitHub engineers Albert Ziegler and John Berryman, who also shared the lessons they learned developing GitHub Copilot.