AI, ML & Data Engineering Content on InfoQ
-
Hugging Face Publishes Guide on Efficient LLM Training across GPUs
Hugging Face has published the Ultra-Scale Playbook: Training LLMs on GPU Clusters, an open-source guide that explores in detail the methods and technologies used to train LLMs across GPU clusters.
-
Google Cloud Launches Gen AI Toolbox for Databases
Google Cloud has announced the public beta launch of Gen AI Toolbox for Databases, an open-source server developed in collaboration with LangChain. This new tool is designed to help developers seamlessly integrate production-grade, agent-based generative AI applications with databases while ensuring secure access, scalability, and observability.
-
IBM Granite 3.2 Brings New Vision Language Model, Chain of Thought Reasoning, Improved TimeSeries
IBM has introduced its new Granite 3.2 multi-modal and reasoning model. Granite 3.2 features experimental chain-of-thought reasoning capabilities that significantly improve on its predecessor's performance, a new vision language model (VLM) that outperforms larger models on several benchmarks, and smaller models for more efficient deployments.
-
Microsoft Releases BioEmu-1: a Deep Learning Model for Protein Structure Prediction
Microsoft Research has introduced BioEmu-1, a deep-learning model designed to predict the range of structural conformations that proteins can adopt. Unlike traditional methods that provide a single static structure, BioEmu-1 generates structural ensembles, offering a broader view of protein dynamics.
-
GitHub Copilot Extensions Integrate IDEs with External Services
Now generally available, GitHub Copilot Extensions allow developers to use natural language to query documentation, generate code, retrieve data, and execute actions on external services without leaving their IDEs. Besides using public extensions from companies like Docker, MongoDB, Sentry, and many more, developers can create their own extensions to work with internal libraries or APIs.
-
Google DeepMind’s AlphaGeometry2 AI Achieves Gold-Medal Math Olympiad Performance
Google DeepMind's AlphaGeometry2 (AG2) AI model solved 84% of the geometry problems from the last 25 years of International Math Olympiads (IMO), surpassing the average performance of human gold medalists.
-
Perplexity Unveils Deep Research: AI-Powered Tool for Advanced Analysis
Perplexity has introduced Deep Research, an AI-powered tool designed for conducting in-depth analysis across various fields, including finance, marketing, and technology. The system automates the research process by performing multiple searches, analyzing extensive sources, and synthesizing findings into structured reports within minutes.
-
AWS Reduces Latency and Costs for Key/Value Datastores with AZ Affinity Routing and GLIDE Valkey
AWS recently introduced Availability Zone (AZ) awareness in version 1.2 of the open-source Valkey General Language Independent Driver for Enterprise (GLIDE) client library. With AZ affinity routing for the open-source key/value datastore, developers can reduce latency and costs by directing requests to replicas within the same AZ as the client.
-
Google Gemini's Long-term Memory Vulnerable to a Kind of Phishing Attack
AI security hacker Johann Rehberger described a prompt injection attack against Google Gemini that can modify its long-term memories using a technique he calls delayed tool invocation. The researcher described the attack as a form of social engineering or phishing triggered when the user interacts with a malicious document.
-
OmniHuman-1: Advancing AI-Generated Human Animation
OmniHuman-1, an advanced AI-driven human video generation model, has been introduced, marking a significant leap in multimodal animation technology. OmniHuman-1 enables the creation of highly lifelike human videos using minimal input, such as a single image and motion cues like audio or video.
-
Latin America Launches Latam-GPT to Improve AI Cultural Relevance
Latin America is advancing in the development of artificial intelligence with the creation of Latam-GPT, a language model designed to better represent the history, culture, and linguistic diversity of the region.
-
Meta Introduces LLM-Powered Tool for Software Testing
Meta has unveiled the Automated Compliance Hardening (ACH) tool, a mutation-guided, LLM-based test generation system. Designed to enhance software reliability and security, ACH generates faults in source code and subsequently creates tests to detect and address these issues.
-
UC Berkeley's Sky Computing Lab Introduces Model to Reduce AI Language Model Inference Costs
UC Berkeley's Sky Computing Lab has released Sky-T1-32B-Flash, an updated reasoning language model that addresses the common issue of AI overthinking. The model, developed through the NovaSky (Next-generation Open Vision and AI) initiative, "slashes inference costs on challenging questions by up to 57%" while maintaining accuracy across mathematics, coding, science, and general knowledge domains.
-
OpenAI Cancels o3 Release and Announces Roadmap for GPT 4.5, 5
OpenAI is restructuring its AI strategy to focus solely on GPT-5, consolidating capabilities like reasoning, voice synthesis, and deep research into one unified model. This shift aims to simplify product offerings and enhance user experience, with subscription tiers that offer different levels of intelligence. As competition heats up, the success of GPT-5 will be pivotal for OpenAI’s future.
-
OpenAI Releases Operator, an AI Agent for Web-Based Tasks
OpenAI released a research preview of Operator, an AI agent that can use a web browser to perform tasks on a user's behalf. Operator achieves new state-of-the-art performance on the WebArena and WebVoyager benchmarks.