Deep Learning Content on InfoQ
-
BigCode Project Releases Permissively Licensed Code Generation AI Model and Dataset
The BigCode Project recently released The Stack, a 6.4TB dataset containing de-duplicated source code from permissively licensed GitHub repositories which can be used to train code generation AI models. BigCode also released SantaCoder, a 1.1B parameter code generation model trained on The Stack. SantaCoder outperforms similar open-source code generation models.
-
3D Point Cloud Object Generation from Text Prompts Using Diffusion Models
OpenAI recently released Point-E, a method for generating 3D objects from text prompts that takes less than two minutes on a single GPU, compared with alternative methods that can take several GPU-hours. Point-E is based on diffusion models, the same class of generative model behind GLIDE and Stable Diffusion.
-
Google AI Unveils Muse, a New Text-to-Image Transformer Model
Google AI released a research paper on Muse, a new text-to-image generation model based on masked generative Transformers. Muse produces images of quality comparable to rival models such as DALL-E 2 and Imagen, while generating them significantly faster.
-
Deep Learning Pioneer Geoffrey Hinton Publishes New Deep Learning Algorithm
Geoffrey Hinton, professor at the University of Toronto and engineering fellow at Google Brain, recently published a paper on the Forward-Forward algorithm (FF), a technique for training neural networks that uses two forward passes of data through the network, instead of backpropagation, to update the model weights.
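The core idea can be illustrated with a toy NumPy sketch. This is a simplified interpretation of the Forward-Forward rule, not Hinton's exact formulation: each layer runs two forward passes, one on "positive" (real) data and one on "negative" (corrupted) data, and locally adjusts its weights so that its "goodness" (sum of squared activations) rises above a threshold for positive inputs and falls below it for negative inputs. The toy positive/negative inputs are stand-ins, not a real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

class FFLayer:
    """One layer trained locally with a Forward-Forward-style rule (sketch).

    Each layer pushes its "goodness" (sum of squared activations) above a
    threshold for positive data and below it for negative data. No gradients
    flow between layers, so no backpropagation is needed.
    """
    def __init__(self, n_in, n_out, lr=0.1, threshold=2.0):
        self.W = rng.normal(0.0, 0.1, (n_in, n_out))
        self.lr, self.threshold = lr, threshold

    def forward(self, x):
        # Length-normalize the input so a layer cannot simply reuse the
        # previous layer's goodness.
        xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return np.maximum(xn @ self.W, 0.0)  # ReLU activations

    def train_step(self, x_pos, x_neg):
        # Two forward passes: one on positive data, one on negative data.
        for x, positive in ((x_pos, True), (x_neg, False)):
            xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
            h = np.maximum(xn @ self.W, 0.0)
            goodness = (h ** 2).sum(axis=1)
            p = 1.0 / (1.0 + np.exp(-(goodness - self.threshold)))
            # Gradient of log sigmoid(+/-(goodness - threshold)) w.r.t. h
            coef = (1.0 - p) if positive else -p
            self.W += self.lr * (xn.T @ (coef[:, None] * 2.0 * h))
        return self.forward(x_pos), self.forward(x_neg)

# Toy stand-ins for real ("positive") and corrupted ("negative") examples.
x_pos = np.abs(rng.normal(size=(8, 20)))
x_neg = -np.abs(rng.normal(size=(8, 20)))
layers = [FFLayer(20, 16), FFLayer(16, 16)]
for _ in range(300):
    hp, hn = x_pos, x_neg
    for layer in layers:
        hp, hn = layer.train_step(hp, hn)
```

After training, the layers' goodness is high for the positive inputs and near zero for the negative ones, which is the separation the two forward passes are meant to produce.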
-
Google Publishes Technique for AI Language Model Self-Improvement
Researchers at Google and University of Illinois at Urbana-Champaign (UIUC) have published a technique called Language Model Self-Improved (LMSI), which fine-tunes a large language model (LLM) on a dataset generated by that same model. Using LMSI, the researchers improved the performance of the LLM on six benchmarks and set new state-of-the-art accuracy records on four of them.
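The data-generation step of LMSI can be sketched as follows. The model answers unlabeled questions several times with chain-of-thought prompting, the majority-voted answer is taken as a pseudo-label (self-consistency), and only the rationales reaching that answer are kept for fine-tuning. The `sample_model` callable below is a hypothetical stand-in for sampling one chain-of-thought answer from the LLM; it is not an API from the paper.

```python
from collections import Counter

def self_training_set(sample_model, questions, n_samples=8):
    """Build a self-generated fine-tuning set (sketch of the LMSI recipe).

    `sample_model` is a hypothetical callable: question -> (rationale, answer),
    drawing one chain-of-thought sample from the LLM. For each unlabeled
    question, sample several answers, take the majority-voted answer as the
    pseudo-label, and keep only rationales that reach that answer.
    """
    dataset = []
    for question in questions:
        samples = [sample_model(question) for _ in range(n_samples)]
        majority, _ = Counter(ans for _, ans in samples).most_common(1)[0]
        dataset.extend(
            (question, rationale, answer)
            for rationale, answer in samples
            if answer == majority
        )
    # The same model is then fine-tuned on these self-generated examples.
    return dataset
```

With a toy sampler that returns answers A, B, A for one question, the majority vote is A, so two of the three sampled rationales survive as training examples.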
-
Researchers Publish Survey of Algorithmically-Efficient Deep Learning
Researchers from Lawrence Livermore National Laboratory and MosaicML have published a survey of over 200 papers on algorithmically-efficient deep learning. The survey includes a taxonomy of methods to speed up training as well as a practitioner's guide for mitigating training bottlenecks.
-
ML.NET 2.0 Release Contains New NLP APIs and AutoML Updates
Microsoft announced the release of ML.NET 2.0, the open-source machine learning framework for .NET. The release contains several updated natural language processing (NLP) APIs, including Tokenizers, Text Classification, and Sentence Similarity, as well as improved automated ML (AutoML) features.
-
Meta's CICERO AI Wins Online Diplomacy Tournament
Meta AI Research recently open-sourced CICERO, an AI that can beat most humans at Diplomacy, a strategy game that requires coordinating plans with other players. CICERO combines chatbot-like dialogue capabilities with strategic reasoning, and recently placed first in an online Diplomacy tournament against human players.
-
NVIDIA Kubernetes Device Plug-in Brings Temporal GPU Concurrency
With the v12 release, the NVIDIA GPU device plug-in framework added support for time-sliced sharing of GPUs between CUDA workloads on Kubernetes. This feature aims to prevent under-utilization of GPU units and to make it easier to scale applications by leveraging concurrently executing CUDA contexts.
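Time-slicing is enabled by supplying a sharing configuration to the device plug-in; a minimal illustrative fragment (replica count chosen arbitrarily here) looks like this, with each physical GPU advertised to Kubernetes as multiple schedulable replicas:

```yaml
# Illustrative device plug-in config: expose each GPU as 4 time-sliced replicas.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 4
```

Pods then request `nvidia.com/gpu` as usual, but up to four workloads share each physical device via time-sliced CUDA contexts rather than each holding a whole GPU.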
-
Wayve's End-to-End Deep Learning Model for Self-Driving Cars
Wayve released MILE (Model-based Imitation Learning), a state-of-the-art end-to-end model that jointly learns a world model and a vehicle driving policy from offline simulation data collected in CARLA, enabling autonomous driving without HD maps.
-
Microsoft Open-Sources Agricultural AI Toolkit FarmVibes.AI
Microsoft Research recently open-sourced FarmVibes.AI, a suite of ML models and tools for sustainable agriculture. FarmVibes.AI includes data processing workflows for fusing multiple sets of spatiotemporal and geospatial data, such as weather data and satellite and drone imagery.
-
Google's Code-as-Policies Lets Robots Write Their Own Code
Researchers from Google's Robotics team have open-sourced Code-as-Policies (CaP), a robot control method that uses a large language model (LLM) to generate robot-control code that achieves a user-specified goal. CaP uses a hierarchical prompting technique for code generation that outperforms previous methods on the HumanEval code-generation benchmark.
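The hierarchical prompting idea can be sketched in a few lines: if the generated code calls a function that is not yet defined, the LLM is recursively prompted to define it. The `llm` callable and the prompt strings below are hypothetical stand-ins, not the actual CaP prompts, and real robot perception/control primitives would be passed in via `known`.

```python
import ast

def generate_policy_code(llm, instruction, known=frozenset()):
    """Sketch of hierarchical code-prompting (assumed interface, not CaP's
    actual prompts). `llm` is a hypothetical callable mapping a natural-
    language prompt to Python source. Any function the generated code calls
    but does not define (and that is not a known primitive, e.g. a robot
    perception/control API) is itself generated by a recursive call.
    """
    code = llm(instruction)
    tree = ast.parse(code)
    defined = {n.name for n in ast.walk(tree)
               if isinstance(n, ast.FunctionDef)}
    called = {n.func.id for n in ast.walk(tree)
              if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)}
    for name in sorted(called - defined - set(known)):
        # Recursively ask the LLM to define the missing helper, then
        # prepend its definition so the final program is self-contained.
        helper = generate_policy_code(llm, f"define {name}", known | defined)
        code = helper + "\n" + code
        defined.add(name)
    return code
```

With a canned stand-in for the LLM, a top-level function that calls undefined helpers ends up bundled with recursively generated definitions for each of them, yielding an executable program.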
-
Salesforce Open-Sources Language-Vision AI Toolkit LAVIS
Salesforce Research recently open-sourced LAnguage-VISion (LAVIS), a unified library for deep-learning language-vision research. LAVIS supports more than 10 language-vision tasks on 20 public datasets and includes pre-trained model weights for over 30 fine-tuned models.
-
Meta Announces Next Generation AI Hardware Platform Grand Teton
Meta recently announced Grand Teton, their next-generation hardware platform for AI training. Grand Teton features several improvements over the previous generation, including 2x the network bandwidth and 4x the host-to-GPU bandwidth.
-
Alpa: Automating Model Sharding for Distributed Deep Learning
A new open-source library called Alpa aims to automate distributed training and serving of large deep networks. It provides a compiler that automatically combines existing model-parallelism strategies and optimizes the use of computing resources for a given network architecture.