AI, ML & Data Engineering Content on InfoQ
-
Researchers Publish Survey of Algorithmically-Efficient Deep Learning
Researchers from Lawrence Livermore National Laboratory and MosaicML have published a survey of over 200 papers on algorithmically-efficient deep learning. The survey includes a taxonomy of methods to speed up training as well as a practitioner's guide for mitigating training bottlenecks.
-
Meta Releases data2vec 2.0, a High-Efficiency Self-Supervised Model
Meta has released data2vec 2.0, a self-supervised algorithm that learns in the same way from three different modalities: speech, vision, and text. It achieves the same accuracy as other computer vision models while being 16x faster. The code and pretrained models have also been shared with other researchers.
-
ML.NET 2.0 Release Contains New NLP APIs and AutoML Updates
Microsoft announced the release of ML.NET 2.0, the open-source machine learning framework for .NET. The release contains several updated natural language processing (NLP) APIs, including Tokenizers, Text Classification, and Sentence Similarity, as well as improved automated ML (AutoML) features.
-
OpenAI Unveils a Powerful, Cost-Effective, and User-Friendly Embedding Model
OpenAI is introducing text-embedding-ada-002, a cutting-edge embedding model that combines the capabilities of five previous models for text search, text similarity, and code search. This new model outperforms the previous most capable model, Davinci, on most tasks, while being significantly more cost-effective at 99.8% lower pricing.
-
How Twitter Automated Data Quality Check Process
Twitter Engineering recently shared a blog post on how it architected and developed a data quality automation platform. Twitter digests and creates thousands of datasets for different data products and applications, so the natural next step is to ensure the quality of that data by adding automation on top of it. In this news post, we explore the architecture in more detail.
-
Meta's CICERO AI Wins Online Diplomacy Tournament
Meta AI Research recently open-sourced CICERO, an AI that can beat most humans at the strategy game Diplomacy, a game that requires coordinating plans with other players. CICERO combines chatbot-like dialogue capabilities with strategic reasoning, and recently placed first in an online Diplomacy tournament against human players.
-
AWS Makes it Simpler to Share ML Models and Notebooks with Amazon SageMaker JumpStart
AWS announced that it is now easier to share machine learning artifacts like models and notebooks with other users using SageMaker JumpStart. Amazon SageMaker JumpStart is a machine learning hub that helps users accelerate their journey into the world of machine learning.
-
NVIDIA Kubernetes Device Plug-in Brings Temporal GPU Concurrency
Starting with the v12 release, the NVIDIA GPU device plug-in framework supports time-sliced sharing between CUDA workloads on Kubernetes. This feature aims to prevent under-utilization of GPUs and make it easier to scale applications by leveraging concurrently executing CUDA contexts.
-
AWS Announces Clean Rooms for Secure Collaboration with Analytics Data
During the recent re:Invent conference, AWS announced the preview of Clean Rooms for analytics data. The new service provides safe environments where multiple customers can securely share and analyze data while controlling how the data is used, reducing the risk of sharing personal data.
-
OpenAI Releases Conversational AI Model ChatGPT
OpenAI released ChatGPT, a conversational AI model based on their GPT-3.5 language model (LM). ChatGPT is fine-tuned using Reinforcement Learning from Human Feedback (RLHF) and includes a moderation filter to block inappropriate interactions.
-
Wayve's End-to-End Deep Learning Model for Self-Driving Cars
Wayve released a state-of-the-art end-to-end model that learns a world model and a vehicular driving policy from CARLA simulation data, enabling cars to drive autonomously without HD maps. Wayve’s new Model-based Imitation Learning (MILE) is a machine-learning model, specifically an imitation learning architecture, that learns a model of the world and a driving policy during offline training.
-
Meta MultiRay Allows Efficiency on Large-Scale AI Models
Meta developed MultiRay, a platform for cost-effectively running state-of-the-art machine learning models. MultiRay lets multiple models run on the same input so that they share the majority of the running cost, with only a small additional cost per model.
-
Microsoft Open-Sources Agricultural AI Toolkit FarmVibes.AI
Microsoft Research recently open-sourced FarmVibes.AI, a suite of ML models and tools for sustainable agriculture. FarmVibes.AI includes data processing workflows for fusing multiple sets of spatiotemporal and geospatial data, such as weather data and satellite and drone imagery.
-
Recap of AWS re:Invent 2022
After a virtual-only event in 2020 and a reduced-size 2021 edition, re:Invent was back last week in Las Vegas with over 50,000 attendees for the 11th edition. During multiple sessions and keynotes at the largest AWS yearly conference, the cloud provider announced new services and features, with the focus more on business solutions and data options than new building blocks.
-
AWS Announces the General Availability of Amazon Omics
At re:Invent, AWS announced the general availability of Amazon Omics, a managed service for storing, analyzing, and processing genomic, transcriptomic, and other omics data. The service is designed for healthcare and life science organizations to enhance patient care and advance scientific research.