AI, ML & Data Engineering Content on InfoQ
-
Stanford University Open-Sources Controllable Generative Language AI Diffusion-LM
Researchers at Stanford University have open-sourced Diffusion-LM, a non-autoregressive generative language model that allows for fine-grained control of the model's output text. When evaluated on controlled text generation tasks, Diffusion-LM outperforms existing methods.
-
Amazon Released Incremental Training Feature in SageMaker JumpStart
AWS recently released a new feature in SageMaker JumpStart, part of its SageMaker machine-learning (ML) service, for incrementally retraining previously trained models on expanded datasets. With this feature, developers can fine-tune their models for better performance in production with a couple of clicks. It is one of a series of efforts to add more automation to SageMaker JumpStart.
-
Microsoft Limits Public Access to AI-Powered Facial Analysis Features
Microsoft recently announced that it is phasing out public access to AI-powered Facial Analysis features in several Azure services.
-
GitHub Copilot Adopts Paid Model, Still Free for Some Open-Source Maintainers and Students
After almost one year in technical preview, GitHub Copilot is now ready for prime time for students and individual developers, says GitHub, while companies and larger organizations could get access to it before the end of the year.
-
DeepMind Trains 80 Billion Parameter AI Vision-Language Model Flamingo
DeepMind recently trained Flamingo, an 80B parameter vision-language model (VLM) AI. Flamingo combines separately pre-trained vision and language models and outperforms all other few-shot learning models on 16 vision-language benchmarks. Flamingo can also chat with users, answering questions about input images and videos.
-
Microsoft Launches New Storage Optimized VMs with Lasv3 and Lsv3
Microsoft recently announced the general availability (GA) of new storage-optimized Azure Virtual Machines (VMs). The Lasv3 and Lsv3 series are designed to run workloads requiring high throughput and IOPS, including big data applications, SQL and NoSQL databases, distributed file systems, and data analytics engines.
-
Google's New Imagen AI Outperforms DALL-E on Text-to-Image Generation Benchmarks
Researchers from Google's Brain Team have announced Imagen, a text-to-image AI model that can generate photorealistic images of a scene given a textual description. Imagen outperforms DALL-E 2 on the COCO benchmark and, unlike many similar models, uses a text encoder pre-trained only on text data.
-
Microsoft Launches the Public Preview of Dynatrace for Azure as a SaaS Solution in Their Marketplace
Microsoft recently announced Dynatrace for Azure, a natively integrated software as a service (SaaS) solution from Dynatrace, available in preview in the Azure Marketplace.
-
Microsoft's New Simulation Framework FLUTE Accelerates Federated Learning Algorithm Development
Microsoft Research has recently released Federated Learning Utilities and Tools for Experimentation (FLUTE), a new simulation framework to accelerate federated learning ML algorithm development. The main goal of federated learning is to train complex machine-learning models over massive amounts of data without the need to share that data in a centralized location.
-
A New Microsoft Platform in Town: the Microsoft Intelligent Data Platform
Microsoft recently introduced a new platform called the Microsoft Intelligent Data Platform that fully integrates its database, analytics, and governance offerings. The new platform encompasses everything from what is already available in the Azure Data space (Azure Data Factory, Azure Data Explorer, etc.) to the Synapse Analytics products, Power BI, and the newly rebranded Purview data governance service.
-
Meta Open-Sources 175 Billion Parameter AI Language Model OPT
Meta AI Research released Open Pre-trained Transformer (OPT-175B), a 175B parameter AI language model. The model was trained on a dataset containing 180B tokens and exhibits performance comparable with GPT-3, while requiring only 1/7th of GPT-3's training carbon footprint.
-
Amazon Elastic MapReduce Now Generally Available as a Serverless Offering
AWS recently announced that Amazon Elastic MapReduce (EMR) Serverless is generally available (GA). The offering is a serverless deployment option for customers to run big data analytics applications using open-source frameworks like Apache Spark and Hive without configuring, managing, and scaling clusters or servers.
-
Google Introduces New AI Features in Workspace
Google’s latest AI developments are aimed at assisting employees in focusing on what matters, collaborating securely, and strengthening human relationships across all work modes and locations.
-
Allen Institute for AI Open-Sources AI Model Inspection Tool LM-Debugger
The Allen Institute for AI (AI2) open-sourced LM-Debugger, an interactive tool for interpreting and controlling the output of language model (LM) predictions. LM-Debugger supports any HuggingFace GPT-2 model and allows users to intervene in the text generation process by dynamically modifying updates in the hidden layers of the model's neural network.
-
New GraphWorld Tool Accelerates Graph Neural-Network Benchmarking
Google AI has recently released GraphWorld, a tool to accelerate performance benchmarking in the area of graph neural networks (GNNs). GraphWorld is a configurable framework for generating graphs with a variety of structural properties, such as different node degree distributions and Gini index.