Deep Learning Content on InfoQ
-
Meta Open-Sources Computer Vision Foundation Model DINOv2
Meta AI Research open-sourced DINOv2, a foundation model for computer vision (CV) tasks. DINOv2 is pretrained on a curated dataset of 142M images and can be used as a backbone for several tasks, including image classification, video action recognition, semantic segmentation, and depth estimation.
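As a sketch of what using DINOv2 as a backbone looks like, the snippet below loads a pretrained ViT-S/14 checkpoint through the torch.hub entry point documented in the facebookresearch/dinov2 repository and extracts a global image embedding; the 384-dimensional output assumes the ViT-S/14 variant.

```python
import torch

# Load a pretrained DINOv2 ViT-S/14 backbone via the torch.hub entry point
# documented in the facebookresearch/dinov2 README (downloads weights on first use).
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

# Dummy batch; input height and width must be multiples of the 14-pixel patch size.
images = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    embeddings = model(images)  # global image embedding, shape (1, 384) for ViT-S/14

print(embeddings.shape)
```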
-
Google's Universal Speech Model Performs Speech Recognition on Hundreds of Languages
Google Research announced Universal Speech Model (USM), a 2B-parameter automatic speech recognition (ASR) model trained on over 12M hours of speech audio. USM can recognize speech in over 100 languages, including low-resource languages, and achieves new state-of-the-art performance on several benchmarks.
-
Stability AI Open-Sources 7B Parameter Language Model StableLM
Stability AI released two sets of pre-trained model weights for StableLM, a suite of large language models (LLMs). The models are trained on 1.5 trillion text tokens and are licensed for commercial use under CC BY-SA 4.0.
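For readers who want to try the weights, a minimal generation sketch using the Hugging Face transformers library follows; the model ID stabilityai/stablelm-base-alpha-7b is assumed from the release announcement and is worth verifying against the hub.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID assumed from the StableLM release on the Hugging Face hub.
model_id = "stabilityai/stablelm-base-alpha-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Deep learning is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```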
-
Meta's Toolformer Uses APIs to Outperform GPT-3 on Zero-Shot NLP Tasks
Meta AI Research announced Toolformer, a language model that learns to call APIs to help solve natural language processing (NLP) tasks. Toolformer automatically annotates a training dataset with API calls; the model fine-tuned on that dataset can outperform the much larger GPT-3 model on several zero-shot NLP tasks.
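Meta did not release an official implementation, but the core mechanic is easy to illustrate: the model emits inline API calls in its output text, and a post-processor executes them and splices the results back in. The sketch below is purely illustrative; the [Tool(args) -> result] syntax approximates the notation in the paper, and the Calculator tool is a toy stand-in.

```python
import re

# Toy tool registry; a real system would route to calculators, search, QA, etc.
TOOLS = {
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only
}

# Matches an unanswered inline call such as "[Calculator(3 * 7) -> ]".
CALL_PATTERN = re.compile(r"\[(\w+)\((.*?)\)\s*->\s*\]")

def execute_api_calls(text: str) -> str:
    """Run each inline API call and splice its result back into the text."""
    def run(match):
        tool, arg = match.group(1), match.group(2)
        result = TOOLS[tool](arg)
        return f"[{tool}({arg}) -> {result}]"
    return CALL_PATTERN.sub(run, text)

print(execute_api_calls("The total is [Calculator(3 * 7) -> ]."))
# The total is [Calculator(3 * 7) -> 21].
```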
-
Twitter Open-Sources Recommendation Algorithm
Twitter recently open-sourced several components of their system for recommending tweets for a user's Twitter timeline. The release includes the code for several of the services and jobs that run the algorithm, as well as code for training machine learning models for embedding and ranking tweets.
-
Google Uses AutoML to Discover More Efficient AI Training Algorithm
Researchers at Google have open-sourced EvoLved sIgn mOmeNtum (Lion), an optimization algorithm for training neural networks, which was discovered using an automated machine learning (AutoML) evolutionary algorithm. Models trained with Lion can achieve better accuracy on several benchmarks than models trained with other optimizers, while requiring fewer compute cycles to converge.
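The discovered update rule itself is compact. The NumPy sketch below follows the pseudocode published in the paper: the parameter update uses only the sign of an interpolated momentum term, which is what saves memory and compute relative to Adam. The hyperparameter defaults here are illustrative, not prescriptive.

```python
import numpy as np

def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.01):
    """One Lion update, following the rule published in the paper.

    w: parameters, g: gradient, m: momentum buffer (EMA of gradients).
    """
    # The update direction is only the *sign* of an interpolated momentum,
    # which makes Lion cheaper than Adam (one buffer, no second moment).
    update = np.sign(beta1 * m + (1.0 - beta1) * g)
    w = w - lr * (update + weight_decay * w)  # decoupled weight decay
    m = beta2 * m + (1.0 - beta2) * g         # momentum tracks the gradient EMA
    return w, m
```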
-
Meta AI’s Large Language Model with 10x Fewer Parameters
Meta AI recently released a new large language model called Large Language Model Meta AI (LLaMA) that outperforms foundation models such as GPT-3 and is competitive with PaLM, despite having 10 times fewer parameters. LLaMA performs better on language tasks such as question answering (Natural Questions), common-sense reasoning, and mathematical reasoning.
-
Microsoft Open-Sources Weather Forecasting Deep Learning Model ClimaX
Researchers from Microsoft's Autonomous Systems and Robotics Research group have open-sourced ClimaX, a deep learning foundation model for weather and climate modeling. ClimaX can be fine-tuned for a variety of prediction tasks and performs as well as or better than state-of-the-art models on several benchmarks.
-
Zero-Copy In-Memory Sharing of Large Distributed Data: V6d
Vineyard (v6d), a zero-copy in-memory data manager maintained as a CNCF sandbox project, provides distributed operators for sharing immutable data within or across cluster nodes. V6d is particularly relevant for deep network training on large sharded datasets, such as those used to train large language models and graph models.
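A minimal sketch of the sharing workflow with the vineyard Python client follows; the IPC socket path shown is the conventional default and will vary by deployment.

```python
import numpy as np
import vineyard

# Connect to a running vineyardd instance over its IPC socket
# (path shown is the conventional default; adjust for your deployment).
client = vineyard.connect("/var/run/vineyard.sock")

# Put a payload into shared memory; other processes on the node can then
# fetch it by object ID with zero-copy reads instead of reserializing it.
object_id = client.put(np.random.rand(1024, 1024))
shard = client.get(object_id)
print(shard.shape)
```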
-
DeepMind Open-Sources AI Interpretability Research Tool Tracr
Researchers at DeepMind have open-sourced TRAnsformer Compiler for RASP (Tracr), a compiler that translates programs written in the RASP domain-specific language into the weights of a transformer neural network. Tracr is intended for research in mechanistic interpretability of transformer AI models such as GPT-3.
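As an example of what the compiler consumes and produces, the snippet below is adapted from the Tracr README: a RASP program that computes the sequence length at every position is compiled into concrete transformer weights and then run on a token sequence.

```python
from tracr.rasp import rasp
from tracr.compiler import compiling

# RASP program that outputs the sequence length at every position
# (the introductory example from the Tracr README).
def make_length():
    all_true = rasp.Select(rasp.tokens, rasp.tokens, rasp.Comparison.TRUE)
    return rasp.SelectorWidth(all_true)  # counts positions each token attends to

# Compile the program into an actual transformer model.
model = compiling.compile_rasp_to_model(
    make_length(),
    vocab={"a", "b", "c", "d"},
    max_seq_len=5,
    compiler_bos="BOS",
)

out = model.apply(["BOS", "a", "b", "c"])
print(out.decoded)  # expected: ["BOS", 3, 3, 3]
```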
-
Stanford Researchers Develop Brain-Computer Interface for Speech Synthesis
Researchers from Stanford University have developed a brain-computer interface (BCI) for synthesizing speech from signals captured in a patient's brain and processed by a recurrent neural network (RNN). The prototype system can decode speech at 62 words per minute, 3.4x faster than previous BCI methods.
-
Carnegie Mellon Researchers Develop AI Model for Human Detection via WiFi
Researchers from the Human Sensing Laboratory at Carnegie Mellon University (CMU) have published a paper on DensePose From WiFi, an AI model which can detect the pose of multiple humans in a room using only the signals from WiFi transmitters. In experiments on real-world data, the algorithm achieves an average precision (AP) of 87.2 at a 50% IoU threshold.
-
Unsupervised Object Detection and Semantic Segmentation Using Deep Learning
Meta AI released CutLER, a state-of-the-art zero-shot unsupervised object detector that improves detection performance by over 2.7x on 11 benchmark datasets spanning domains such as video frames, paintings, and sketches. The model's simplicity makes it compatible with a range of object-detection architectures across different domains.
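The released models follow the Detectron2 format, so inference should look like the standard Detectron2 predictor pattern sketched below. The config and checkpoint paths are placeholders for files from the CutLER repository, whose own setup instructions take precedence.

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Sketch of standard Detectron2 inference; CutLER publishes Detectron2-format
# configs and checkpoints. Paths below are placeholders, not real file names.
cfg = get_cfg()
cfg.merge_from_file("path/to/cutler_config.yaml")    # placeholder
cfg.MODEL.WEIGHTS = "path/to/cutler_checkpoint.pth"  # placeholder

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("image.jpg"))
print(outputs["instances"].pred_boxes)  # class-agnostic boxes for detected objects
```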
-
Microsoft Open-Sources AI Prompt Optimization Toolkit LMOps
Microsoft Research open-sourced LMOps, a collection of tools for improving text prompts used as input to generative AI models. The toolkit includes Promptist, which optimizes a user's text input for text-to-image generation, and Structured Prompting, a technique for including more examples in a few-shot learning prompt for text generation.
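Promptist is also published as a standalone checkpoint on the Hugging Face hub, so a quick way to try it is via the transformers library. The microsoft/Promptist model ID and the "Rephrase:" input format below follow its model card and are worth verifying against the LMOps repository.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Promptist is a GPT-2-based prompt optimizer; the model ID and input format
# here follow its Hugging Face model card (verify against the LMOps repo).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("microsoft/Promptist")

plain_prompt = "a cat sitting on a windowsill"
inputs = tokenizer(plain_prompt + " Rephrase:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```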
-
DeepMind Announces Minecraft-Playing AI DreamerV3
Researchers from DeepMind and the University of Toronto announced DreamerV3, a reinforcement learning (RL) algorithm for training AI models across many different domains. Using a single set of hyperparameters, DreamerV3 outperforms other methods on several benchmarks and can train an AI to collect diamonds in Minecraft without human instruction.