Deep Learning Content on InfoQ
-
Meta Open-Sources Multilingual Translation Foundation Model SeamlessM4T
Meta recently open-sourced Massively Multilingual & Multimodal Machine Translation (SeamlessM4T), a multilingual translation AI that can translate both speech audio and text data across nearly 100 languages. SeamlessM4T is trained on 1 million hours of audio data and outperforms the current state-of-the-art speech-to-text translation model.
-
AWS Unveils Multi-Model Endpoints for PyTorch on SageMaker
AWS has introduced Multi-Model Endpoints for PyTorch on Amazon SageMaker. The feature is intended to give users more flexibility and efficiency when deploying machine learning models.
-
Ai4 2023 Panel Discussion: Generative AI in Business and Society
The recent Ai4 conference featured a panel discussion titled "Generative AI in Business and Society." Some key takeaways are that generative AI offers many opportunities for operational efficiency and product personalization, that companies need to balance privacy concerns with personalization, and that they need to understand how generative AI is used across their organization.
-
Ai4 2023 Summary Day Two: AI Legal Issues, AI in Education & Deploying AI
Day Two of the Ai4 2023 conference was held on August 9th, 2023, at the MGM Grand hotel in Las Vegas, Nevada. This two-day event is organized by Fora Group and includes tracks focused on various industries, including automotive, financial, healthcare, and government. The day began with six mainstage presentations from leaders in AI technology.
-
Meta's Voicebox Outperforms State-of-the-Art Models on Speech Synthesis
Meta recently announced Voicebox, a speech generation model that can perform text-to-speech (TTS) synthesis in six languages, as well as edit and remove noise from speech recordings. Voicebox is trained on over 50k hours of audio data and outperforms previous state-of-the-art models on several TTS benchmarks.
-
AI, ML, Data Engineering News Round up: Claude 2, Stable Doodle, CM3leon, Llama 2, Azure and xAI
The most recent update, covering developments from July 17th, 2023, showcases significant progress and announcements in the fields of data science, machine learning, and artificial intelligence. This week's focus centers on Anthropic, Stability AI, Microsoft, Meta and xAI.
-
Berkeley Open-Sources AI Image-Editing Model InstructPix2Pix
Researchers from the Berkeley Artificial Intelligence Research (BAIR) Lab have open-sourced InstructPix2Pix, a deep-learning model that follows human instructions to edit images. InstructPix2Pix was trained on synthetic data and outperforms a baseline AI image-editing model.
-
EU AI Act: the Regulatory Framework on the Usage of Machine Learning in the European Union
Following the first publication in 2021 of the proposal on the operation of machine learning applications, negotiations on the legislation started in the EU Council on June 14th. The EU countries are expected to reach an agreement by the end of 2023. The EU AI Act takes a risk-based approach and aims to avoid disproportionate prescriptions when the regulations are applied.
-
OpenAI Introduces Superalignment to Address Rogue Superintelligent AI
OpenAI announced the formation of a specialized Superalignment team with the objective of preventing the emergence of rogue Superintelligent AI. OpenAI highlighted the need to align AI systems with human values and emphasized the importance of proactive measures to prevent potential harm.
-
Meta's Open-Source Massively Multilingual Speech AI Handles over 1,100 Languages
Meta AI open-sourced the Massively Multilingual Speech (MMS) model, which supports automatic speech recognition (ASR) and text-to-speech synthesis (TTS) in over 1,100 languages and language identification (LID) in over 4,000 languages. MMS can outperform existing models and covers nearly 10x the number of languages.
-
Meta Open-Sources Computer Vision Foundation Model DINOv2
Meta AI Research open-sourced DINOv2, a foundation model for computer vision (CV) tasks. DINOv2 is pretrained on a curated dataset of 142M images and can be used as a backbone for several tasks, including image classification, video action recognition, semantic segmentation, and depth estimation.
-
Google's Universal Speech Model Performs Speech Recognition on Hundreds of Languages
Google Research announced Universal Speech Model (USM), a 2B parameter automatic speech recognition (ASR) model trained on over 12M hours of speech audio. USM can recognize speech in over 100 languages, including low-resource languages, and achieves new state-of-the-art performance on several benchmarks.
-
Stability AI Open-Sources 7B Parameter Language Model StableLM
Stability AI released two sets of pre-trained model weights for StableLM, a suite of large language models (LLMs). The models are trained on 1.5 trillion text tokens and are licensed for commercial use under CC BY-SA 4.0.
-
Meta's Toolformer Uses APIs to Outperform GPT-3 on Zero-Shot NLP Tasks
Meta AI Research announced Toolformer, a language model that learns to call APIs to help solve natural language processing (NLP) tasks. Toolformer automatically annotates a training dataset which is used to fine-tune the model and can outperform the much larger GPT-3 model on several zero-shot NLP tasks.
-
Meta Open-Sourced AI Tool to Animate Child and Amateur Drawings of Human Figure
Based on joint research by Meta AI Research, Tencent America, MIT CSAIL, and Carnegie Mellon, Meta released Animated Drawings, an AI-based tool to create animations from hand-drawn human-like characters.