AI, ML & Data Engineering Content on InfoQ
-
AI Listens by Seeing as Well
Meta AI released a self-supervised speech recognition model that also uses video and achieves 75% better accuracy than current state-of-the-art models for a given amount of data. The new model, Audio-Visual Hidden Unit BERT (AV-HuBERT), uses audiovisual features to improve on models based only on hearing speech. The visual features are based on lip-reading, similar to what humans do.
-
Meta and AWS to Collaborate on PyTorch Adoption
Meta and AWS will work together to improve the performance of applications running PyTorch on AWS and to accelerate how developers build, train, deploy, and operate artificial intelligence and machine-learning models.
-
Facebook Open-Sources Two Billion Parameter Multilingual Speech Recognition Model XLS-R
Facebook AI Research (FAIR) open-sourced XLS-R, a cross-lingual speech recognition (SR) AI model. XLS-R is trained on 436K hours of speech audio from 128 languages, an order of magnitude more than the largest previous models, and outperforms the current state-of-the-art on several downstream SR and translation tasks.
-
MLCommons Announces Latest MLPerf Training Benchmark Results
Engineering consortium MLCommons recently announced the results of the latest round of their MLPerf Training benchmark competition. Over 158 AI training job performance metrics were submitted by 14 organizations, with the best results improving up to 2.3x compared to the previous round.
-
Google Trains 280 Billion Parameter AI Language Model Gopher
Google subsidiary DeepMind announced Gopher, a 280-billion-parameter AI natural language processing (NLP) model. Based on the Transformer architecture and trained on a 10.5TB corpus called MassiveText, Gopher outperformed the current state-of-the-art on 100 of 124 evaluation tasks.
-
Microsoft Open-Sources Distributed Machine Learning Library SynapseML
Microsoft announced the release of SynapseML, an open-source library for creating and managing distributed machine learning (ML) pipelines. SynapseML runs on Apache Spark, provides a language-agnostic API abstraction over several datastores, and integrates with several existing ML technologies, including Open Neural Network Exchange (ONNX).
-
DeepMind Releases Weather Forecasting AI Deep Generative Models of Rainfall
DeepMind open-sourced a dataset and trained model snapshot for Deep Generative Models of Rainfall (DGMR), an AI system for short-term precipitation forecasts. In evaluations conducted by 58 expert meteorologists comparing it to other existing methods, DGMR was ranked first in accuracy and usefulness in 89% of test cases.
-
Azure Space Introduces Azure Orbital in Preview and New Geospatial Capabilities
Microsoft recently announced new satellite connectivity and geospatial capabilities for Azure Space. The cloud provider introduced the preview of Azure Orbital, a ground-station-as-a-service offering that provides communication with and control of satellites, and added geospatial and data analytics partnerships with Esri, Blackshark.ai, and Orbital Insight.
-
AWS Launches SageMaker Studio Lab, Free Tool to Learn and Experiment with Machine Learning
AWS has introduced SageMaker Studio Lab, a free service to help developers learn machine-learning techniques and experiment with the technology. SageMaker Studio Lab provides users with all of the basics to get started, including a JupyterLab IDE, model training on CPUs and GPUs, and 15 GB of persistent storage.
-
MIT Researchers Investigate Deep Learning's Computational Burden
A team of researchers from MIT, Yonsei University, and the University of Brasilia has launched a new website, Computer Progress, which analyzes the computational burden from over 1,000 deep learning research papers. Data from the site show that computational burden is growing faster than the expected rate, suggesting that algorithms still have room for improvement.
-
Hazelcast Announces a New Unified Platform with Version 5.0
Hazelcast, the distributed computation and storage platform, has announced the release of the Hazelcast Platform version 5.0. This new platform unifies the existing products Hazelcast IMDG and Hazelcast Jet. InfoQ spoke about this new release with John DesJardins, CTO at Hazelcast.
-
Chip Huyen on Streaming-First Infrastructure for Real-Time ML
At the recent QCon Plus online conference, Chip Huyen gave a talk on continual machine learning titled "Streaming-First Infrastructure for Real-Time ML." Some key takeaways included the advantages of a streaming-first infrastructure for real-time and continual machine learning, the benefits of real-time ML, and the challenges of implementing real-time ML.
-
Get Consistent Access to Third-Party APIs with AWS Data Exchange for APIs
During the recent AWS re:Invent in Las Vegas, AWS announced AWS Data Exchange for APIs. This new capability enables customers to find, subscribe to, and use third-party API products from providers on AWS Data Exchange.
-
AMD Introduces Its Deep-Learning Accelerator Instinct MI200 Series GPUs
In its recent Accelerated Data Center Premiere Keynote, AMD unveiled its MI200 accelerator series: the Instinct MI250X and the slightly lower-end Instinct MI250 GPUs. Designed with the CDNA-2 architecture and TSMC’s 6nm FinFET lithography, the high-end MI250X provides 47.9 TFLOPS of peak double-precision performance and memory that will allow training larger deep networks by minimizing model sharding.
-
Katharine Jarmul on Machine Learning at the Edge
At the recent QCon Plus online conference, Katharine Jarmul gave a talk on federated machine learning titled "Machine Learning at the Edge." She covered several federated ML architectures and use cases, discussed pros and cons of federated ML, and presented tips on how to decide whether federated ML is a good solution for a given problem.