Artificial Intelligence Content on InfoQ
-
IBM Cloud Code Engine Serverless Fleets with GPUs for High-Performance AI and Parallel Computing
IBM Cloud Code Engine’s new Serverless Fleets revolutionizes how enterprises tackle compute-intensive tasks. Harnessing integrated GPU support, it simplifies the execution of large-scale workloads with a fully managed, pay-as-you-go model. This efficient platform eliminates operational complexities, enabling developers to focus on innovation while ensuring cost-effectiveness and scalability.
-
Hugging Face Introduces RTEB, a New Benchmark for Evaluating Retrieval Models
Hugging Face unveils the Retrieval Embedding Benchmark (RTEB), a pioneering framework to assess embedding models' real-world retrieval accuracy. By merging public and private datasets, RTEB narrows the "generalization gap," ensuring models perform reliably across critical sectors. Now live and inviting collaboration, RTEB aims to set a community standard in AI retrieval evaluation.
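Benchmarks like RTEB report retrieval-accuracy scores over ranked results. As a rough illustration of the kind of metric involved (this is a generic NDCG@k sketch over a toy example, not RTEB's actual evaluation harness):

```python
import math

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Normalized discounted cumulative gain for one query (binary relevance)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)              # rank is 0-based
        for rank, doc in enumerate(ranked_ids[:k])
        if doc in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(r + 2) for r in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

# A model that ranks the relevant document first scores higher
# than one that buries it further down the list.
print(ndcg_at_k(["d1", "d2", "d3"], {"d1"}, k=3))            # 1.0
print(round(ndcg_at_k(["d2", "d3", "d1"], {"d1"}, k=3), 3))  # 0.5
```

Averaging such per-query scores across a dataset yields the single leaderboard number a benchmark reports; keeping some datasets private, as RTEB does, is meant to keep that number honest.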
-
Testing Organizations Widely Adopt Agentic AI, but Leadership Lags in Understanding
Nearly all software testing teams are either using or plan to use agentic AI, but many leaders admit they lack a clear grasp of testing realities, according to a recent survey of 400 testing executives and engineering leaders.
-
10 AI-Related Standout Sessions at QCon San Francisco 2025
Join us at QCon San Francisco 2025 (Nov 17–21) for a three-day deep dive into the future of software development, exploring AI’s transformative impact. As a program committee member, I’m excited to showcase tracks that tackle real-world challenges, featuring industry leaders and sessions on AI, LLMs, and engineering mindsets. Don’t miss out!
-
Paper2Agent Converts Scientific Papers into Interactive AI Agents
Stanford's Paper2Agent framework revolutionizes research by transforming static papers into interactive AI agents that execute analyses and respond to queries. Leveraging the Model Context Protocol, it simplifies reproducibility and enhances accessibility, empowering users with dynamic, autonomous tools for deeper scientific exploration and understanding.
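The Model Context Protocol that Paper2Agent builds on is JSON-RPC 2.0 under the hood: invoking a tool is a `tools/call` request. A minimal sketch of such a message (the tool name and arguments below are hypothetical, not taken from the paper):

```python
import json

# MCP tool invocations are JSON-RPC 2.0 requests with method "tools/call".
# "run_analysis" and its arguments are made-up examples of what a
# paper-derived agent might expose.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_analysis",
        "arguments": {"dataset": "example.csv"},
    },
}
payload = json.dumps(request)
print(payload)
```

Because the protocol is standardized, any MCP-capable client can drive the resulting agent without knowing how the underlying paper's code is organized.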
-
Genkit Extension for Gemini CLI Brings Framework-Aware AI Assistance to the Terminal
Introducing Google's Genkit Extension for Gemini CLI: a groundbreaking tool that delivers framework-aware AI assistance directly to the terminal. Streamline your Genkit application development with context-aware code generation, debugging, and best practices—all without leaving the command line. Unleash productivity and innovation in building generative AI applications.
-
Claude Sonnet 4.5 Tops SWE-Bench Verified, Extends Coding Focus beyond 30 Hours
Anthropic's Claude Sonnet 4.5, its most advanced coding model, excels in task performance and safety, achieving a 98.7% safety score and improving real-world coding capabilities. Enhanced reasoning skills allow for sustained multi-step tasks, with notable user gains reported. A drop-in replacement for its predecessor, the model balances capability and security for users.
-
Google DeepMind Introduces CodeMender, an AI Agent for Automated Code Repair
Google DeepMind has introduced CodeMender, a new AI-driven agent designed to detect, fix, and secure software vulnerabilities automatically. The project builds on recent advances in reasoning models and program analysis, aiming to reduce the time developers spend identifying and patching security issues.
-
OpenAI DevDay 2025 Introduces GPT-5 Pro API, AgentKit, and More
At OpenAI's DevDay 2025, AgentKit was unveiled alongside the GPT-5 Pro API and Sora 2, enabling interactive software experiences directly within ChatGPT. This shift toward "apps inside ChatGPT" opens conversations to collaboration and commercialization. Enhanced self-hosting options and robust SDKs empower developers and streamline workflows, positioning OpenAI at the forefront of AI innovation.
-
QCon AI New York 2025 Schedule Published, Highlights Practical Enterprise AI
The QCon AI New York 2025 schedule is now live for the Dec 16-17 event. Focused on moving AI from PoC to production, the program offers a practical roadmap for senior engineers and tech leaders, addressing the real-world challenges of building, scaling, and deploying reliable, enterprise-grade AI systems.
-
GitHub Introduces New Embedding Model to Improve Code Search and Context
GitHub has introduced a new embedding model for Copilot, now integrated into Visual Studio Code. The model is designed to improve how Copilot understands programming context, retrieves relevant code, and suggests completions.
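Embedding-based code retrieval of this kind reduces to nearest-neighbor search over vectors: the query and every candidate snippet are embedded, and the closest snippets by cosine similarity are surfaced as context. A toy sketch of that idea (the 3-d vectors here are fabricated; a real model's embeddings are high-dimensional):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Fabricated "embeddings" of code snippets, keyed by snippet name.
index = {
    "parse_json": [0.9, 0.1, 0.0],
    "sort_list":  [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of the user's query

best = max(index, key=lambda name: cosine(query, index[name]))
print(best)  # parse_json
```

The quality gains from a new embedding model come from how well semantically related query and code vectors land near each other, not from the search step itself.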
-
Google DeepMind Launches Gemini 2.5 Computer Use Model to Power UI-Controlling AI Agents
Google DeepMind has recently released the Gemini 2.5 Computer Use model, a specialized variant of its Gemini 2.5 Pro system designed to enable AI agents to interact directly with graphical user interfaces. The new model allows developers to build agents that can click, type, scroll, and manipulate interactive elements on web pages.
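Computer-use agents of this kind typically run an observe-act loop: the model receives a screenshot, proposes a UI action, the client executes it and reports the new state back. A simplified, model-free sketch of that loop (the action names and scripted "model" are illustrative, not the Gemini API's):

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # e.g. "click", "type", "done" (illustrative names)
    payload: dict

def fake_model(observation: str) -> Action:
    """Stand-in for the model: scripted decisions based on what it 'sees'."""
    if "login page" in observation:
        return Action("type", {"field": "username", "text": "demo"})
    return Action("done", {})

def run_agent(observation: str, max_steps: int = 5) -> list:
    trace = []
    for _ in range(max_steps):
        action = fake_model(observation)
        trace.append(action.kind)
        if action.kind == "done":
            break
        # A real client would execute the action against the browser
        # and capture a fresh screenshot; here we just update a string.
        observation = "form filled"
    return trace

print(run_agent("login page"))  # ['type', 'done']
```

The `max_steps` cap reflects a real design concern with UI agents: the loop must terminate even when the model never declares itself done.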
-
IBM Releases Granite-Docling-258M, a Compact Vision-Language Model for Precise Document Conversion
IBM Research has recently introduced Granite-Docling-258M, a new open-source vision-language model (VLM) designed for high-fidelity document-to-text conversion while preserving complex layouts, tables, equations, and lists.
-
11 Sessions Not to Miss at QCon San Francisco 2025
As QCon San Francisco (Nov 17-21, 2025) approaches, the conference's program committee and track hosts are sharing their top picks from this year's lineup. Their selections span a wide range of topics, from AI-accelerated development and platform engineering to resilience patterns and career growth, all with QCon's signature focus on real-world case studies and lessons learned.
-
Thinking Machines Releases Tinker API for Flexible Model Fine-Tuning
Thinking Machines has released Tinker, an API for fine-tuning open-weight language models. The service is designed to reduce infrastructure overhead for developers, providing managed scheduling, GPU allocation, and checkpoint handling. By abstracting away cluster management, Tinker allows fine-tuning through simple Python calls.
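The "simple Python calls" pattern common to managed fine-tuning services looks roughly like the sketch below. Every class and method name here is hypothetical, standing in for whatever Tinker's actual SDK exposes; the point is only that scheduling, GPU allocation, and checkpoints live behind the client:

```python
# Hypothetical shapes only: Tinker's real SDK may differ entirely.
class FineTuneClient:
    """Stub of a managed fine-tuning client. In a real service the
    backend, not the user, handles scheduling, GPUs, and checkpoints."""

    def __init__(self, base_model: str):
        self.base_model = base_model
        self.steps = 0

    def train_step(self, batch: list) -> float:
        self.steps += 1
        return 1.0 / self.steps   # pretend loss that decreases

    def save_checkpoint(self) -> str:
        return f"{self.base_model}-step{self.steps}"

client = FineTuneClient("example-open-weights-8b")
for batch in ([{"prompt": "hi", "completion": "hello"}],) * 3:
    loss = client.train_step(batch)
print(client.save_checkpoint())  # example-open-weights-8b-step3
```

The appeal of abstracting cluster management this way is that the same training loop runs unchanged whether the job lands on one GPU or many.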