Artificial Intelligence Content on InfoQ
-
Google DeepMind Launches Gemini 2.5 Computer Use Model to Power UI-Controlling AI Agents
Google DeepMind has recently released the Gemini 2.5 Computer Use model, a specialized variant of its Gemini 2.5 Pro system designed to enable AI agents to interact directly with graphical user interfaces. The new model allows developers to build agents that can click, type, scroll, and manipulate interactive elements on web pages.
-
IBM Releases Granite-Docling-258M, a Compact Vision-Language Model for Precise Document Conversion
IBM Research has recently introduced Granite-Docling-258M, a new open-source vision-language model (VLM) designed for high-fidelity document-to-text conversion while preserving complex layouts, tables, equations, and lists.
-
11 Sessions Not to Miss at QCon San Francisco 2025
As QCon San Francisco (Nov 17-21, 2025) approaches, the conference's program committee and track hosts are sharing their top picks from this year's lineup. Their selections span a wide range of topics, from AI-accelerated development and platform engineering to resilience patterns and career growth, all with QCon's signature focus on real-world case studies and lessons learned.
-
Thinking Machines Releases Tinker API for Flexible Model Fine-Tuning
Thinking Machines has released Tinker, an API for fine-tuning open-weight language models. The service is designed to reduce infrastructure overhead for developers, providing managed scheduling, GPU allocation, and checkpoint handling. By abstracting away cluster management, Tinker allows fine-tuning through simple Python calls.
-
Microsoft Announces Open-Source Agent Framework to Simplify AI Agent Development
According to official blog posts from the company's development teams, Microsoft has announced the preview release of Microsoft Agent Framework, an open-source software development kit intended to simplify building and deploying AI agents for developers at all skill levels.
-
Microsoft Tests Microfluidic Cooling for Next-Generation AI Chips
Microsoft has announced progress on a new chip cooling approach that could help address one of the biggest bottlenecks in scaling AI infrastructure: heat. The company’s researchers have successfully demonstrated in-chip microfluidic cooling, a system that channels liquid coolant directly into etched grooves on the back of silicon chips.
-
DeepMind Releases Gemini Robotics-ER 1.5 for Embodied Reasoning
Google DeepMind introduced Gemini Robotics-ER 1.5, a new embodied reasoning model for robotic applications. The model is available in preview through Google AI Studio and the Gemini API.
-
xAI Releases Grok 4 Fast with Lower Cost Reasoning Model
xAI has introduced Grok 4 Fast, a new reasoning model designed for efficiency and lower cost.
-
OpenAI Releases GPT-5-Codex Optimized for Complex Code Refactoring and Code Reviews
OpenAI has released GPT-5-Codex, a model optimized for code refactoring and code review. It is reported to operate autonomously for over 7 hours and to achieve 51.3% accuracy on complex tasks. The model adapts its reasoning to the task at hand and is intended to produce high-quality, tested code while minimizing noise in developer workflows.
-
Replit Introduces Agent 3 for Extended Autonomous Coding and Automation
Replit has introduced Agent 3, its latest autonomous software agent built to extend the use of AI in programming and workflow automation. Unlike earlier coding assistants that provide small pieces of help through autocomplete or single-step code generation, Agent 3 is designed to carry out tasks over an extended period of time.
-
Temporal and OpenAI Launch AI Agent Durability with Public Preview Integration
Temporal has unveiled a public preview integration with the OpenAI Agents SDK, introducing durable execution capabilities to AI agent workflows built using OpenAI's framework.
-
Kaggle Introduces Game Arena to Benchmark AI Models in Strategic Games
Kaggle, in collaboration with Google DeepMind, has introduced Kaggle Game Arena, a platform designed to evaluate artificial intelligence models by testing their performance in strategy-based games.
-
From Black Box to Blueprint: Thoughtworks Uses Generative AI to Extract Legacy System Functionality
Thoughtworks consultants used generative AI to reverse engineer legacy systems that lacked source code. Using Gemini 2.5 Pro, they produced validated "blueprints" of system functionality in just two weeks. The pilot suggests that AI can substantially reduce the time and risk of modernizing opaque systems, provided that speed is balanced with validation.
-
Introducing the MCP Registry
The Model Context Protocol (MCP) ecosystem now includes a public registry for server discovery and a secure gateway for agent interactions. The recently launched MCP Registry and the Linux Foundation's Agentgateway project aim to streamline how engineering teams manage AI tools, with a focus on collaboration and security.
-
Hugging Face Releases FinePDFs: a 3-Trillion-Token Dataset Built from PDFs
Hugging Face has unveiled FinePDFs, the largest publicly available corpus built entirely from PDFs. The dataset spans 475 million documents in 1,733 languages, totaling roughly 3 trillion tokens. At 3.65 terabytes in size, FinePDFs introduces a new dimension to open training datasets by tapping into a resource long considered too complex and expensive to process.