InfoQ Homepage Neural Networks Content on InfoQ

News

RSS Feed

Newer Older

AI, ML & Data Engineering

Meta Releases Llama 3 Open-Source LLM

Meta AI released Llama 3, the latest generation of their open-source large language model (LLM) family. The model is available in 8B and 70B parameter sizes, each with a base and instruction-tuned variant. Llama3 outperforms other LLMs of the same parameter size on standard LLM benchmarks.

Anthony Alford
on May 07, 2024
AI, ML & Data Engineering

OpenAI Releases New Fine-Tuning API Features

OpenAI announced the release of new features in their fine-tuning API. The features will give model developers more control over the fine-tuning process and better insight into their model performance.

Anthony Alford
on Apr 30, 2024
AI, ML & Data Engineering

Stability AI Releases 3D Model Generation AI Stable Video 3D

Stability AI recently released Stable Video 3D (SV3D), an AI model that can generate 3D mesh object models from a single 2D image. SV3D is based on the Stable Video Diffusion model and produces state-of-the-art results on 3D object generation benchmarks.

Anthony Alford
on Apr 23, 2024
AI, ML & Data Engineering

Google Trains User Interface and Infographics Understanding AI Model ScreenAI

Google Research recently developed ScreenAI, a multimodal AI model for understanding infographics and user interfaces. ScreenAI is based on the PaLI architecture and achieves state-of-the-art performance on several tasks.

Anthony Alford
on Apr 16, 2024
AI, ML & Data Engineering

NVIDIA Announces Next-Generation AI Superchip Blackwell

NVIDIA recently announced their next generation GPU architecture, Blackwell. Blackwell is the largest GPU ever built, with over 200 billion transistors, and can train large language models (LLMs) up to 4x faster than previous generation hardware.

Anthony Alford
on Apr 09, 2024
AI, ML & Data Engineering

Meta Unveils 24k GPU AI Infrastructure Design

Meta recently announced the design of two new AI computing clusters, each containing 24,576 GPUs. The clusters are based on Meta's Grand Teton hardware platform, and one cluster is currently used by Meta for training their next-generation Llama 3 model.

Anthony Alford
on Apr 02, 2024
AI, ML & Data Engineering

Researchers Open-Source LLM Jailbreak Defense Algorithm SafeDecoding

Researchers from the University of Washington, the Pennsylvania State University, and Allen Institute for AI have open-sourced SafeDecoding, a technique for protecting large language models (LLMs) against jailbreak attacks. SafeDecoding outperforms baseline jailbreak defenses without incurring significant computational overhead.

Anthony Alford
on Mar 26, 2024
AI, ML & Data Engineering

Vesuvius Challenge Winners Use AI to Read Ancient Scroll

The Vesuvius Challenge recently announced the winners of their 2023 Grand Prize. The winning team used an ensemble of AI models to read text from a scroll of papyrus that was buried in volcanic ash nearly 2,000 years ago.

Anthony Alford
on Mar 19, 2024
AI, ML & Data Engineering

RWKV Project Open-Sources LLM Eagle 7B

The RWKV Project recently open-sourced Eagle 7B, a 7.52B parameter large language model (LLM). Eagle 7B is trained on 1.1 trillion tokens of text in over 100 languages and outperforms other similarly-sized models on multilingual benchmarks.

Anthony Alford
on Mar 12, 2024
AI, ML & Data Engineering

Amazon Announces One Billion Parameter Speech Model BASE TTS

Amazon Science recently published their work on Big Adaptive Streamable TTS with Emergent abilities (BASE TTS). BASE TTS supports voice-cloning and outperforms baseline TTS models when evaluated by human judges. Further, Amazon's experiments show that scaling model and data size improves the subjective quality of the model's output.

Anthony Alford
on Mar 05, 2024
AI, ML & Data Engineering

Google Announces 200M Parameter AI Forecasting Model TimesFM

Google Research announced TimesFM, a 200M parameter Transformer-based foundation model for time-series forecasting. TimesFM is trained on nearly 100B data points and has zero-shot forecasting performance comparable to or better than supervised-learning models.

Anthony Alford
on Feb 27, 2024
AI, ML & Data Engineering

Google Renames Bard to Gemini

Google announced that their Bard chatbot will now be called Gemini. The company also announced the launch of Gemini Advanced, the largest version of their Gemini language model, along with two new mobile apps for interacting with the model.

Anthony Alford
on Feb 20, 2024
AI, ML & Data Engineering

Google Announces Multi-Modal Gemini 1.5 with Million Token Context Length

One week after announcing Gemini 1.0 Ultra, Google announced additional details about its next generation model, Gemini 1.5. The new iteration comes with an expansion of its context window and the adoption of a "Mixture of Experts" (MoE) architecture, promising to make the AI both faster and more efficient. The new model also includes expanded multimodal capabilities.

Andrew Hoblitzell
on Feb 17, 2024
AI, ML & Data Engineering

MIT Researchers Use Explainable AI Model to Discover New Antibiotics

Researchers from MIT's Collins lab used an explainable deep-learning model to discover chemical compounds which could fight the MRSA bacteria. The model uses graph algorithms to identify chemical compounds which are likely to have antibiotic properties. Additional models predict whether or not the chemicals would be harmful to humans.

Anthony Alford
on Feb 13, 2024
AI, ML & Data Engineering

OpenAI Releases New Embedding Models and Improved GPT-4 Turbo

OpenAI recently announced the release of several updates to their models, including two new embedding models and updates to GPT-4 Turbo and GPT-3.5 Turbo. The company also announced improvements to their free text moderation tool and to their developer API management tools.

Anthony Alford
on Feb 06, 2024

Newer News

Older News

InfoQ Software Architects' Newsletter

News