AI Architecture Content on InfoQ
-
OpenAI’s Stargate Project Aims to Build AI Infrastructure in Partner Countries Worldwide
OpenAI has announced a new initiative called "OpenAI for Countries" as part of its Stargate project, aiming to help nations develop AI infrastructure based on democratic principles. This expansion follows the company's initial $500 billion investment plan for AI infrastructure in the United States.
-
DeepSeek Launches Prover-V2 Open-Source LLM for Formal Math Proofs
DeepSeek has released DeepSeek-Prover-V2, a new open-source large language model specifically designed for formal theorem proving in Lean 4. The model builds on a recursive theorem proving pipeline powered by the company's DeepSeek-V3 foundation model.
-
Scaling Financial Operations: Uber’s GenAI-Powered Approach to Invoice Automation
Uber recently described a GenAI-powered invoice processing system that halved manual effort, cut handling time by 70%, and delivered 25–30% cost savings. By leveraging GPT-4 and a modular platform called TextSense, Uber improved data accuracy by 90%, enabling globally scalable, efficient, and highly automated financial operations.
-
AWS Promotes Responsible AI in the Well-Architected Generative AI Lens
AWS announced the availability of the new Well-Architected Generative AI Lens, focused on providing best practices for designing and operating generative AI workloads. The lens is aimed at organizations delivering robust and cost-effective generative AI solutions on AWS. The document offers cloud-agnostic best practices, implementation guidance and links to additional resources.
-
QCon London 2025 Day 2: the Form of AI, Securing AI Assistants, WASM Components in FaaS
The 19th annual QCon London conference took place at the Queen Elizabeth II Conference Centre in London, England. This three-day event, organized by C4Media, consists of presentations by expert practitioners. Day Two, scheduled on April 8th, 2025, included a keynote address by Savannah Kunovsky and presentations from five conference tracks.
-
Google Releases Open-Source Agent Development Kit for Multi-Agent AI Applications
At Google Cloud Next 2025, Google announced the Agent Development Kit (ADK), an open-source framework aimed at simplifying the development of intelligent, multi-agent applications. The toolkit is designed to support developers across the entire lifecycle of agentic systems — from logic design and orchestration to debugging, evaluation, and deployment.
-
AMD’s Gaia Framework Brings Local LLM Inference to Consumer Hardware
AMD has released Gaia, an open-source project allowing developers to run large language models (LLMs) locally on Windows machines with AMD hardware acceleration. The framework supports retrieval-augmented generation (RAG) and includes tools for indexing local data sources. Gaia is designed to offer an alternative to LLMs hosted on a cloud service provider (CSP).
-
UC Berkeley's Sky Computing Lab Introduces Model to Reduce AI Language Model Inference Costs
UC Berkeley's Sky Computing Lab has released Sky-T1-32B-Flash, an updated reasoning language model that addresses the common issue of AI overthinking. The model, developed through the NovaSky (Next-generation Open Vision and AI) initiative, "slashes inference costs on challenging questions by up to 57%" while maintaining accuracy across mathematics, coding, science, and general knowledge domains.
-
OpenAI Features New o3-mini Model on Microsoft Azure OpenAI Service
OpenAI's o3-mini model is now available through Microsoft Azure OpenAI Service, offering improved cost efficiency, faster performance, and adjustable reasoning effort. Designed for complex tasks, it supports structured outputs and maintains backward compatibility, giving developers broad access to the model across a range of industries.
-
AMD and Johns Hopkins Researchers Develop AI Agent Framework to Automate Scientific Research Process
Researchers from AMD and Johns Hopkins University have developed Agent Laboratory, an artificial intelligence framework that automates core aspects of the scientific research process. The system uses large language models to handle literature reviews, experimentation, and report writing, producing both code repositories and research documentation.
-
Amazon Bedrock Introduces Multi-Agent Systems (MAS) with Open Source Framework Integration
Amazon Web Services has released a multi-agent collaboration capability for Amazon Bedrock, introducing a framework for deploying and managing multiple AI agents that collaborate on complex tasks. The system enables specialized agents to work together under a supervisor agent's coordination, addressing challenges developers face with agent orchestration in distributed AI systems.
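The supervisor pattern described above can be sketched in a few lines. This is a hypothetical illustration of supervisor-coordinated routing, not the Amazon Bedrock API; the agent names and keyword-based routing rule are assumptions made for the example (a real supervisor agent would typically use an LLM to classify and delegate tasks).

```python
from typing import Callable

# Two specialist agents; in Bedrock these would be separately deployed agents.
def billing_agent(task: str) -> str:
    return f"billing handled: {task}"

def support_agent(task: str) -> str:
    return f"support handled: {task}"

# Registry mapping a specialty to its agent.
AGENTS: dict[str, Callable[[str], str]] = {
    "billing": billing_agent,
    "support": support_agent,
}

def supervisor(task: str) -> str:
    """Route a task to the matching specialist, falling back to a default.

    A production supervisor would use an LLM to classify the task and could
    fan out to several agents and merge their results.
    """
    for specialty, agent in AGENTS.items():
        if specialty in task.lower():
            return agent(task)
    return support_agent(task)  # default specialist for unmatched tasks
```

The value of the pattern is that orchestration logic lives in one place: specialists stay narrow and testable, while the supervisor owns delegation and error handling.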
-
Azure AI Agent Service in Public Preview: Automation of Routine Tasks
Unveiled at Ignite, Microsoft's Azure AI Agent Service, now in public preview, lets developers build and scale AI agents. With secure integration, flexible use cases, and support for multiple frameworks, it automates workflows across platforms such as Teams and Excel.
-
Microsoft Introduces Magentic-One, a Generalist Multi-Agent System
Microsoft has announced the release of Magentic-One, a new generalist multi-agent system designed to handle open-ended tasks involving web and file-based environments. This system aims to assist with complex, multi-step tasks across various domains, improving efficiency in activities such as software development, data analysis, and web navigation.
-
OSI Releases New Definition for Open Source AI, Setting Standards for Transparency and Accessibility
The Open Source Initiative (OSI) released Version 1.0 of its Open Source AI Definition (OSAID) after two years of development with contributions from global experts. The OSAID sets criteria defining open-source AI, aiming to bring clarity to the concept and establish benchmarks for transparency and accessibility in AI.
-
RAG-Powered Copilot Saves Uber 13,000 Engineering Hours
Uber recently detailed how it built Genie, an AI-powered on-call copilot designed to improve the efficiency of on-call support engineers. Genie leverages Retrieval-Augmented Generation (RAG) to provide accurate real-time responses and significantly enhance the speed and effectiveness of incident response. Since its launch, Genie has answered over 70,000 questions, saving 13,000 engineering hours.
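The core RAG loop behind a copilot like Genie can be sketched minimally: embed the question, rank documents by similarity, and build a grounded prompt for the LLM. Everything here is illustrative; the corpus, the bag-of-words "embedding", and the function names are assumptions for the sketch (production systems use neural encoders and a vector database), not Uber's implementation.

```python
from math import sqrt

# Toy corpus standing in for internal runbooks (hypothetical data).
DOCS = {
    "doc1": "restart the payment service after a failed deploy",
    "doc2": "rotate credentials when the auth token expires",
}

def embed(text: str) -> dict[str, int]:
    """Crude bag-of-words vector; real systems use a neural text encoder."""
    vec: dict[str, int] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the ids of the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(DOCS[d])), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Assemble a prompt that grounds the LLM's answer in retrieved context."""
    context = "\n".join(DOCS[d] for d in retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQ: {question}"
```

The grounding step is what makes answers auditable: because the prompt carries the retrieved passages, an on-call engineer can check which runbook the response was drawn from.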