OpenAI Content on InfoQ
-
OpenAI Launches BrowseComp to Benchmark AI Agents' Web Search and Deep Research Skills
OpenAI has released BrowseComp, a new benchmark designed to test AI agents' ability to locate difficult-to-find information on the web. The benchmark contains 1,266 challenging problems that require agents to persistently navigate through multiple websites to retrieve entangled information.
-
Google DeepMind Shares Approach to AGI Safety and Security
Google DeepMind has released a new paper outlining its approach to safety and security in the development of artificial general intelligence (AGI). AGI refers to AI systems that are as capable as humans at most cognitive tasks.
-
OpenAI Releases Improved Image Generation in GPT-4o
OpenAI released a new version of GPT-4o with native image generation capability. The model can modify uploaded images or create new ones from prompts. It maintains consistency across multiple turns when refining images and generates text within images more accurately.
-
OpenAI Introduces New Speech Models for Transcription and Voice Generation
OpenAI has introduced new speech-to-text and text-to-speech models in its API, focusing on improving transcription accuracy and offering more control over AI-generated voices. These updates aim to enhance automated speech applications, making them more adaptable to different environments and use cases.
-
OpenAI Launches New API, SDK, and Tools to Develop Custom Agents
OpenAI has announced the new Responses API, the Agents SDK, and observability tools to address the challenges of creating production-ready agents, such as building custom orchestration and handling prompt iteration across complex, multi-step tasks.
-
OpenAI Introduces Software Engineering Benchmark
OpenAI has introduced the SWE-Lancer benchmark to evaluate the capabilities of advanced AI language models on real-world freelance software engineering tasks.
-
Perplexity Unveils Deep Research: AI-Powered Tool for Advanced Analysis
Perplexity has introduced Deep Research, an AI-powered tool designed for conducting in-depth analysis across various fields, including finance, marketing, and technology. The system automates the research process by performing multiple searches, analyzing extensive sources, and synthesizing findings into structured reports within minutes.
-
OpenAI Cancels o3 Release and Announces Roadmap for GPT 4.5, 5
OpenAI is restructuring its AI strategy to focus solely on GPT-5, consolidating capabilities like reasoning, voice synthesis, and deep research into one unified model. This shift aims to simplify product offerings and enhance user experience, with tiered subscription levels for varying intelligence. As competition heats up, the success of GPT-5 will be pivotal for OpenAI’s future.
-
OpenAI Releases Operator, an AI Agent for Web-Based Tasks
OpenAI released a research preview of Operator, an AI agent that can use a web browser to perform tasks on a user's behalf. Operator achieves new state-of-the-art performance on the WebArena and WebVoyager benchmarks.
-
OpenAI Releases Reasoning Model o3-mini, Faster and More Accurate Than o1
OpenAI released OpenAI o3-mini, their latest reasoning LLM. o3-mini is optimized for STEM applications and outperforms the full o1 model on science, math, and coding benchmarks, with lower response latency than o1-mini.
-
OpenAI Features New o3-mini Model on Microsoft Azure OpenAI Service
OpenAI has launched the advanced o3-mini model via Microsoft Azure, enhancing AI applications with improved cost efficiency, faster performance, and adjustable reasoning capabilities. Designed for complex tasks, it supports structured outputs and backward compatibility. With widespread access, the o3-mini empowers developers to drive innovation across various industries.
-
OpenAI Launches Deep Research: Advancing AI-Assisted Investigation
OpenAI has launched Deep Research, a new agent within ChatGPT designed to conduct in-depth, multi-step investigations across the web. Initially available to Pro users, with plans to expand access to Plus and Team users, Deep Research automates time-consuming research by retrieving, analyzing, and synthesizing online information.
-
Hugging Face Expands Serverless Inference Options with New Provider Integrations
Hugging Face has integrated four serverless inference providers (Fal, Replicate, SambaNova, and Together AI) directly into its model pages. These providers are also integrated into Hugging Face's client SDKs for JavaScript and Python, allowing users to run inference on various models with minimal setup.
-
OpenAI Introduces ChatGPT Gov for U.S. Government Agencies
OpenAI has launched ChatGPT Gov, a version of its AI-powered chatbot designed specifically for U.S. government agencies. This tailored deployment provides federal, state, and local agencies with access to OpenAI’s latest AI models while allowing them to maintain control over security, privacy, and compliance.
-
AMD and Johns Hopkins Researchers Develop AI Agent Framework to Automate Scientific Research Process
Researchers from AMD and Johns Hopkins University have developed Agent Laboratory, an artificial intelligence framework that automates core aspects of the scientific research process. The system uses large language models to handle literature reviews, experimentation, and report writing, producing both code repositories and research documentation.