Deep Learning Content on InfoQ
-
Amazon's AlexaTM 20B Model Outperforms GPT-3 on NLP Benchmarks
Researchers at Amazon Alexa AI have announced Alexa Teacher Model (AlexaTM 20B), a 20-billion-parameter sequence-to-sequence (seq2seq) language model that exhibits state-of-the-art performance on 1-shot and few-shot NLP tasks. AlexaTM 20B outperforms GPT-3 on the SuperGLUE and SQuADv2 benchmarks with less than 1/8 the number of parameters.
-
Meta Develops Dataset Pruning Technique for Scaling AI Training
Researchers from Meta AI and Stanford University have developed a metric for pruning AI datasets that improves how training error scales with dataset size, from a power-law decay to an exponential decay. The metric uses self-supervised learning and performs comparably to existing metrics that require far more compute.
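The self-supervised metric works by clustering example embeddings and scoring each example by its distance to the nearest cluster centroid. Below is a toy sketch of that idea; the random matrix standing in for real encoder embeddings, the cluster count, and the keep fraction are all illustrative, not the paper's settings.

```python
# Toy sketch of a self-supervised pruning metric (assumptions: embeddings
# come from any pretrained encoder; cluster count and keep fraction are
# arbitrary stand-ins, not the paper's settings).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 64))  # stand-in for encoder outputs

# Cluster the unlabeled embeddings to get self-supervised "prototypes".
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(embeddings)

# Score each example by distance to its nearest prototype:
# small distance = prototypical/easy, large distance = hard.
distances = kmeans.transform(embeddings).min(axis=1)

# Keep the hardest examples (the paper finds the right policy depends on
# dataset size; with abundant data, hard examples matter most).
keep_fraction = 0.7
threshold = np.quantile(distances, 1 - keep_fraction)
pruned = embeddings[distances >= threshold]
print(f"kept {len(pruned)} of {len(embeddings)} examples")
```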
-
Meta's Genomics AI ESMFold Predicts Protein Structure 6x Faster Than AlphaFold2
Meta AI Research recently announced ESMFold, an AI model for predicting protein structure from a gene sequence. ESMFold is built on a 15B-parameter Transformer model and achieves accuracy comparable to other state-of-the-art models with an order-of-magnitude speedup in inference time.
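ESMFold was later made available through Meta's fair-esm package; assuming that release, a minimal structure-prediction call looks like the following (it needs a CUDA GPU and the package's folding dependencies).

```python
# Minimal ESMFold inference sketch (assumes: pip install fair-esm, plus the
# package's folding dependencies, and a CUDA GPU for the 15B backbone).
import torch
import esm

model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

# An example protein sequence (amino acids).
sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"

with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)  # returns a PDB-format string

with open("prediction.pdb", "w") as f:
    f.write(pdb_string)
```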
-
PrefixRL: Nvidia's Deep-Reinforcement-Learning Approach to Design Better Circuits
Nvidia has developed PrefixRL, a reinforcement-learning (RL) approach to designing parallel-prefix circuits that are smaller and faster than those designed by state-of-the-art electronic-design-automation (EDA) tools.
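PrefixRL itself is not public. To make the size/depth trade-off the agent optimizes concrete, the toy sketch below builds two classic prefix structures, a serial (ripple) chain and a Sklansky tree, counts their nodes (an area proxy) and levels (a delay proxy), and checks that both compute correct prefix sums; everything here is illustrative, not Nvidia's code.

```python
# Toy comparison of parallel-prefix structures (illustrative only; PrefixRL
# searches this size/depth design space with RL, which is not shown here).

def serial_prefix(n):
    """Ripple chain: one combine per level; minimal size, maximal depth."""
    return [(i - 1, i, level) for level, i in enumerate(range(1, n), start=1)]

def sklansky_prefix(n):
    """Sklansky tree: log2(n) depth, but more nodes and higher fanout."""
    ops, span, level = [], 1, 1
    while span < n:
        for block in range(0, n, 2 * span):
            pivot = block + span - 1  # last node of the block's left half
            for dst in range(block + span, min(block + 2 * span, n)):
                ops.append((pivot, dst, level))
        span, level = span * 2, level + 1
    return ops

def run(ops, xs):
    """Execute a prefix network with + as the associative operator."""
    vals = list(xs)
    for src, dst, _ in sorted(ops, key=lambda op: op[2]):
        vals[dst] = vals[src] + vals[dst]
    return vals

n = 32
xs = list(range(1, n + 1))
expected = [sum(xs[: i + 1]) for i in range(n)]
for name, ops in [("serial", serial_prefix(n)), ("sklansky", sklansky_prefix(n))]:
    assert run(ops, xs) == expected  # both compute correct prefix sums
    size, depth = len(ops), max(level for _, _, level in ops)
    print(f"{name}: size={size} nodes (area), depth={depth} levels (delay)")
```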
-
Meta Open-Sources 200-Language Translation AI NLLB-200
Meta AI recently open-sourced NLLB-200, an AI model that can translate between any of more than 200 languages. NLLB-200 is a 54.5B-parameter Mixture-of-Experts (MoE) model that was trained on a dataset containing more than 18 billion sentence pairs. On benchmark evaluations, NLLB-200 outperforms other state-of-the-art models by up to 44%.
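NLLB-200 checkpoints were published on the Hugging Face Hub; assuming the distilled 600M-parameter variant and the transformers library, a translation call looks roughly like this (language codes follow the FLORES-200 convention, e.g. eng_Latn, fra_Latn).

```python
# Sketch: English -> French with a distilled NLLB-200 checkpoint
# (assumes: pip install transformers sentencepiece; model id as published).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("AI translation now covers over 200 languages.",
                   return_tensors="pt")
# Force the decoder to start in the target language (fra_Latn = French).
tokens = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(tokens, skip_special_tokens=True)[0])
```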
-
Google AI Open-Sourced a New ML Tool for Conceptual and Subjective Queries over Images
Google AI open-sourced Mood Board Search, a new ML-powered tool for subjective or conceptual queries over images. Mood Board Search lets users define conceptual and subjective queries, such as "peaceful" or "beautiful", over collections of images.
-
BigScience Releases 176B Parameter AI Language Model BLOOM
The BigScience research workshop released BigScience Large Open-science Open-access Multilingual Language Model (BLOOM), an autoregressive language model based on the GPT-3 architecture. BLOOM was trained on data from 46 natural languages and 13 programming languages and is the largest publicly available open multilingual model.
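BLOOM's weights are downloadable from the Hugging Face Hub. The full model is far too large for a single GPU, so the sketch below samples from one of the smaller published variants (bigscience/bloom-560m) as a stand-in.

```python
# Sketch: sampling from a small BLOOM variant via transformers
# (assumes: pip install transformers; the full 176B checkpoint requires
# hundreds of GB of memory, so a 560M sibling stands in here).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "BigScience is an open research collaboration that"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                         top_p=0.9, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```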
-
Google's Image-Text AI LIMoE Outperforms CLIP on ImageNet Benchmark
Researchers at Google Brain recently trained Language-Image Mixture of Experts (LIMoE), a 5.6B parameter image-text AI model. In zero-shot learning experiments on ImageNet, LIMoE outperforms CLIP and performs comparably to state-of-the-art models while using fewer compute resources.
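LIMoE itself has not been released; as an illustration of the zero-shot classification protocol it is evaluated on, here is the same procedure with the public CLIP checkpoint (the image path and label prompts are placeholders).

```python
# Sketch: CLIP-style zero-shot classification, the protocol LIMoE is
# evaluated on (assumes: pip install transformers pillow; "cat.jpg" is
# a placeholder for any local image).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)  # image-text similarity
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```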
-
PyTorch 1.12 Release Includes Accelerated Training on Macs and New Library TorchArrow
The PyTorch team announced the release of version 1.12 of the open-source deep-learning framework, which includes support for GPU-accelerated training on Apple silicon Macs and a new data preprocessing library, TorchArrow, as well as updates to other libraries and APIs.
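The Apple-silicon support is exposed as a new "mps" device that follows the same selection pattern as CUDA:

```python
# PyTorch 1.12: GPU-accelerated training on Apple silicon via the MPS backend.
import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)
loss = model(x).sum()
loss.backward()  # gradients computed on the Metal GPU when available
print(f"trained one step on: {device}")
```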
-
Google AI Developed a Language Model to Solve Quantitative Reasoning Problems
Google AI developed Minerva, a deep-learning language model that solves quantitative reasoning problems. The researchers trained the model on a large dataset of scientific and mathematical content containing symbolic expressions, achieving state-of-the-art results on STEM reasoning benchmarks.
-
OpenAI Releases Minecraft-Playing AI VPT
Researchers from OpenAI have open-sourced Video PreTraining (VPT), a semi-supervised learning technique for training game-playing agents. In a zero-shot setting, VPT performs tasks that agents cannot learn via reinforcement learning (RL) alone, and with fine-tuning it is the first AI to craft a diamond pickaxe in Minecraft.
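VPT's recipe has three stages: train an inverse-dynamics model (IDM) on a small set of contractor-labeled video, use it to pseudo-label a large pool of unlabeled gameplay video, then behavior-clone a causal policy on the pseudo-labels. Below is a schematic of those stages with stand-in tensors and tiny MLPs; nothing here is OpenAI's actual code or data.

```python
# Schematic of the VPT recipe with stand-in data and tiny MLPs
# (assumes PyTorch; none of this is OpenAI's actual code or data).
import torch
from torch import nn

OBS_DIM, ACT_DIM = 64, 8

# Stage 1: inverse-dynamics model (IDM) -- predict the action between two
# frames. VPT's IDM is a large non-causal video model; an MLP stands in.
idm = nn.Sequential(nn.Linear(2 * OBS_DIM, 128), nn.ReLU(),
                    nn.Linear(128, ACT_DIM))
labeled_pairs = torch.randn(512, 2 * OBS_DIM)   # contractor-labeled frames
labeled_actions = torch.randint(0, ACT_DIM, (512,))
opt = torch.optim.Adam(idm.parameters(), lr=1e-3)
for _ in range(200):
    loss = nn.functional.cross_entropy(idm(labeled_pairs), labeled_actions)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: pseudo-label a much larger pool of unlabeled video with the IDM.
unlabeled_pairs = torch.randn(10_000, 2 * OBS_DIM)
with torch.no_grad():
    pseudo_actions = idm(unlabeled_pairs).argmax(dim=-1)

# Stage 3: behavior cloning -- a *causal* policy maps the current frame to
# the pseudo-labeled action; this is the model VPT later fine-tunes with RL.
policy = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(),
                       nn.Linear(128, ACT_DIM))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
frames = unlabeled_pairs[:, :OBS_DIM]
for _ in range(200):
    loss = nn.functional.cross_entropy(policy(frames), pseudo_actions)
    opt.zero_grad(); loss.backward(); opt.step()
```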
-
Adobe Researchers Open-Source Image Captioning AI CLIP-S
Researchers from Adobe and the University of North Carolina (UNC) have open-sourced CLIP-S, an image-captioning AI model that produces fine-grained descriptions of images. In evaluations with captions generated by other models, human judges preferred those generated by CLIP-S a majority of the time.
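CLIP-S fine-tunes its captioner against a CLIP-based reward, image-text similarity, rather than n-gram overlap with reference captions. A minimal version of that scoring signal using the public CLIP checkpoint as a stand-in (the image path and caption are placeholders):

```python
# Sketch: a CLIP-similarity reward for a candidate caption, standing in for
# the reward CLIP-S is fine-tuned against (assumes: pip install transformers
# pillow; "photo.jpg" and the caption are placeholders).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")
caption = "a man in a red jacket riding a bicycle down a wet street"

inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
reward = (img * txt).sum()  # cosine similarity: higher = better-grounded caption
print(f"CLIP reward: {reward.item():.3f}")
```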
-
Stanford University Open-Sources Controllable Generative Language AI Diffusion-LM
Researchers at Stanford University have open-sourced Diffusion-LM, a non-autoregressive generative language model that allows for fine-grained control of the model's output text. When evaluated on controlled text generation tasks, Diffusion-LM outperforms existing methods.
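Diffusion-LM operates in the continuous space of word embeddings: the forward process adds Gaussian noise to a sentence's embeddings, and generation denoises and then rounds each vector back to its nearest vocabulary embedding. A numpy toy of just the forward-noising and rounding steps (the vocabulary, dimensions, and noise levels are invented):

```python
# Toy: forward noising and rounding in embedding space (illustrative only;
# the vocabulary, dimensions, and noise levels are all invented).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "a", "mat"]
E = rng.normal(size=(len(vocab), 16))  # toy word-embedding table

def noise(x0, alpha_bar):
    """Forward process: x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I)."""
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1 - alpha_bar) * rng.normal(size=x0.shape)

def round_to_tokens(x):
    """Map each (possibly noisy) vector to its nearest vocabulary embedding."""
    dists = ((x[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    return [vocab[i] for i in dists.argmin(-1)]

x0 = E[[vocab.index(w) for w in ["the", "cat", "sat"]]]
for alpha_bar in (0.99, 0.5, 0.05):  # light -> heavy noise
    print(alpha_bar, round_to_tokens(noise(x0, alpha_bar)))
```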
-
DeepMind Trains 80 Billion Parameter AI Vision-Language Model Flamingo
DeepMind recently trained Flamingo, an 80B-parameter vision-language model (VLM). Flamingo combines separately pre-trained vision and language models and outperforms all other few-shot learning models on 16 vision-language benchmarks. Flamingo can also chat with users, answering questions about input images and videos.
-
Google's New Imagen AI Outperforms DALL-E on Text-to-Image Generation Benchmarks
Researchers from Google's Brain Team have announced Imagen, a text-to-image AI model that can generate photorealistic images of a scene given a textual description. Imagen outperforms DALL-E 2 on the COCO benchmark, and unlike many similar models, its text encoder is pre-trained only on text data.
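One ingredient the Imagen paper leans on is classifier-free guidance with unusually large guidance weights. The core step mixes the diffusion model's conditional and unconditional noise predictions; in the sketch below, random tensors stand in for the U-Net outputs.

```python
# Classifier-free guidance step (schematic; random tensors stand in for the
# diffusion U-Net's conditional and unconditional noise predictions).
import torch

def guided_eps(eps_uncond, eps_cond, w):
    """eps = eps_uncond + w * (eps_cond - eps_uncond); w > 1 strengthens
    alignment with the text prompt at some cost to sample diversity."""
    return eps_uncond + w * (eps_cond - eps_uncond)

eps_uncond = torch.randn(1, 3, 64, 64)  # prediction without the text prompt
eps_cond = torch.randn(1, 3, 64, 64)    # prediction with the text prompt
print(guided_eps(eps_uncond, eps_cond, w=7.5).shape)
```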