Reinforcement Learning Content on InfoQ

News

DevOps

Railway Highlights the Importance of Logs, Metrics, Traces, and Alerts for Diagnosing System Failure

Railway’s engineering team published a comprehensive guide to observability, explaining how developers and SRE teams can use logs, metrics, traces, and alerts together to understand and diagnose production system failures.

Craig Risi
on Jan 28, 2026
AI, ML & Data Engineering

Google Introduces TranslateGemma Open Models for Multilingual Translation

Google has released TranslateGemma, a set of open translation models based on the Gemma 3 architecture, offering 4B, 12B, and 27B parameter variants designed to support machine translation across 55 languages and to run on platforms ranging from mobile and edge devices to consumer hardware and cloud accelerators.

Daniel Dominguez
on Jan 28, 2026
AI, ML & Data Engineering

Prime Intellect Releases INTELLECT-2: a 32B Parameter Model Trained via Decentralized Reinforcement

Prime Intellect has released INTELLECT-2, a 32 billion parameter language model trained using fully asynchronous reinforcement learning across a decentralized network of compute contributors. Unlike traditional centralized model training, INTELLECT-2 is developed on a permissionless infrastructure where rollout generation, policy updates, and training are distributed and loosely coupled.

Robert Krzaczyński
on May 21, 2025
AI, ML & Data Engineering

HuatuoGPT-o1: Advancing Complex Medical Reasoning with AI

Researchers from The Chinese University of Hong Kong, Shenzhen, and the Shenzhen Research Institute of Big Data have introduced HuatuoGPT-o1, a medical large language model (LLM) designed to improve reasoning in complex healthcare scenarios.

Robert Krzaczyński
on Jan 14, 2025
DevOps

Meta Optimizes Data Center Sustainability with Reinforcement Learning

In a recent blog post, Meta describes how its engineers use reinforcement learning (RL) to optimize environmental controls in its data centers, reducing energy consumption and water usage while addressing broader challenges such as climate change.

Claudio Masolo
on Oct 25, 2024
AI, ML & Data Engineering

NVIDIA Open-Sources Robot Learning Framework Orbit

A team of researchers from NVIDIA, ETH Zurich, and the University of Toronto open-sourced Orbit, a simulation-based robot learning framework. Orbit includes wrappers for four learning libraries, a suite of benchmark tasks, and simulation for several robot platforms, as well as interfaces for deploying trained agents on physical robots.

Anthony Alford
on Mar 07, 2023
AI, ML & Data Engineering

Netflix’s New Algorithm Offers Optimal Recommendation Lists for Users with Finite Time Budget

Netflix developed a new machine learning algorithm based on reinforcement learning that creates an optimal list of recommendations given a user's finite time budget. Recommendation systems often ignore the fact that users have limited time to make a decision.

Claudio Masolo
on Sep 14, 2022
AI, ML & Data Engineering

PrefixRL: Nvidia's Deep-Reinforcement-Learning Approach to Design Better Circuits

Nvidia has developed PrefixRL, an approach based on reinforcement learning (RL) to designing parallel-prefix circuits that are smaller and faster than those designed by state-of-the-art electronic-design-automation (EDA) tools.

Claudio Masolo
on Aug 04, 2022
AI, ML & Data Engineering

OpenAI Releases Minecraft-Playing AI VPT

Researchers from OpenAI have open-sourced Video PreTraining (VPT), a semi-supervised learning technique for training game-playing agents. In a zero-shot setting, VPT performs tasks that agents cannot learn via reinforcement learning (RL) alone, and with fine-tuning is the first AI to craft a diamond pickaxe in Minecraft.

Anthony Alford
on Jul 12, 2022
AI, ML & Data Engineering

DeepMind Trains AI Controller for Nuclear Fusion Research Device

Researchers at Google subsidiary DeepMind and the Swiss Plasma Center at EPFL have developed a deep reinforcement learning (RL) AI that creates control algorithms for tokamak devices used in nuclear fusion research. The system learned control policies while interacting with a simulator, and when used to control a real device was able to achieve novel plasma configurations.

Anthony Alford
on May 10, 2022
AI, ML & Data Engineering

Allen Institute Launches Updated Embodied AI Challenge

The Allen Institute for AI (AI2) has announced the 2022 version of its AI2-THOR Rearrangement Challenge. The challenge requires competitors to design an autonomous agent that can move objects in a virtual room. This version brings several improvements, including a new dataset and faster training using the latest release of the AI2-THOR simulation platform.

Anthony Alford
on Mar 15, 2022
AI, ML & Data Engineering

University Researchers Develop Brain-Computer Interface for Robot Control

Researchers from École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and the University of Texas at Austin (UT) have developed a brain-computer interface (BCI) that allows users to modify a robot manipulator's motion trajectories. The system uses inverse reinforcement learning (IRL) and can learn a user's preferences using fewer than five demonstrations.

Anthony Alford
on Feb 01, 2022
AI, ML & Data Engineering

Joanneum Research Releases Robot AI Platform Robo-Gym Version 1.0.0

Joanneum Research's Institute for Robotics and Mechatronics has released version 1.0.0 of robo-gym, an open-source framework for developing reinforcement learning (RL) AI for robot control. The release includes a new obstacle avoidance environment, support for all Universal Robots cobot models, and improved code quality.

Anthony Alford
on Jul 27, 2021
AI, ML & Data Engineering

DeepMind's Agent57 Outperforms Humans on All Atari 2600 Games

Researchers at Google's DeepMind have produced a reinforcement-learning (RL) system called Agent57 that has scored above the human benchmark on all 57 Atari 2600 games in the Arcade Learning Environment. Agent57 is the first system to outperform humans on even the hardest games in the suite.

Anthony Alford
on May 05, 2020
