Benchmark Content on InfoQ

News

AI, ML & Data Engineering

AlphaWrite: Improving AI Narratives through Evolution

AlphaWrite is a new framework that brings structure and measurable improvement to AI-generated creative writing. Developed by Toby Simonds, it uses an evolutionary process to iteratively improve storytelling quality at inference time.

Robert Krzaczyński
on Jun 21, 2025
AI, ML & Data Engineering

Mistral AI Releases Magistral, Its First Reasoning-Focused Language Model

Mistral AI has released Magistral, a new model family built for transparent, multi-step reasoning. Available in open and enterprise versions, it supports structured logic, multilingual output, and traceable decision-making.

Robert Krzaczyński
on Jun 16, 2025
AI, ML & Data Engineering

Meta Introduces V-JEPA 2, a Video-Based World Model for Physical Reasoning

Meta has introduced V-JEPA 2, a new video-based world model designed to improve machine understanding, prediction, and planning in physical environments. The model extends the Joint Embedding Predictive Architecture (JEPA) framework and is trained to predict outcomes in embedding space using video data.

Robert Krzaczyński
on Jun 13, 2025
AI, ML & Data Engineering

Google Releases LMEval, an Open-Source Cross-Provider LLM Evaluation Tool

LMEval aims to help AI researchers and developers compare the performance of different large language models. Designed to be accurate, multimodal, and easy to use, LMEval has already been used to evaluate major models in terms of safety and security.

Sergio De Simone
on May 31, 2025
AI, ML & Data Engineering

Mistral Unveils Medium 3: Enterprise-Ready Language Model

Mistral AI has unveiled Mistral Medium 3, a mid-sized language model aimed at enterprises seeking a balance between cost-efficiency, strong performance, and flexible deployment options. The model is now available through Mistral’s platform and Amazon SageMaker, with further releases planned for IBM WatsonX, Azure AI Foundry, Google Cloud Vertex AI, and NVIDIA NIM.

Robert Krzaczyński
on May 16, 2025
AI, ML & Data Engineering

OpenAI Introduces GPT‑4.1 Family with Enhanced Performance and Long-Context Support

OpenAI has released a new family of language models—GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano—available via its API. The models improve on GPT‑4o and GPT‑4.5 across several technical benchmarks and introduce support for up to 1 million tokens of context.

Robert Krzaczyński
on May 12, 2025
AI, ML & Data Engineering

OpenAI Launches BrowseComp to Benchmark AI Agents' Web Search and Deep Research Skills

OpenAI has released BrowseComp, a new benchmark designed to test AI agents' ability to locate difficult-to-find information on the web. The benchmark contains 1,266 challenging problems that require agents to persistently navigate through multiple websites to retrieve entangled information.

Vinod Goje
on May 04, 2025
AI, ML & Data Engineering

Google DeepMind Introduces QuestBench to Evaluate LLMs in Solving Logic and Math Problems

Google DeepMind’s QuestBench benchmark evaluates whether LLMs can pinpoint the single, crucial question needed to solve logic, planning, or math problems. The DeepMind team recently published an article on QuestBench, a set of underspecified reasoning tasks that can each be solved by asking at most one question.

Srini Penchikala
on Apr 22, 2025
AI, ML & Data Engineering

Radical AI Releases TorchSim: a PyTorch-Native Engine for Next-Generation Atomistic Simulations

Radical AI has announced the release of TorchSim, a next-generation atomistic simulation engine built natively in PyTorch and designed for the MLIP (machine-learned interatomic potentials) era.

Robert Krzaczyński
on Apr 11, 2025
AI, ML & Data Engineering

Meta AI Releases Llama 4: Early Impressions and Community Feedback

Meta has officially released the first models in its new Llama 4 family—Scout and Maverick—marking a step forward in its open-weight large language model ecosystem. Designed with a native multimodal architecture and a mixture-of-experts (MoE) framework, these models aim to support a broader range of applications, from image understanding to long-context reasoning.

Robert Krzaczyński
on Apr 07, 2025
AI, ML & Data Engineering

Google Introduces Gemini 2.5 Pro with Improved Reasoning and Coding Capabilities

Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing. The model is ranked first on LMArena, a benchmark for human preference in AI responses, and achieves strong results in math, science, and logic-based tasks. It also features a 1 million token context window, with plans to expand to 2 million.

Robert Krzaczyński
on Mar 28, 2025
AI, ML & Data Engineering

Google DeepMind Enhances AMIE for Long-Term Disease Management

Google DeepMind has extended the capabilities of its Articulate Medical Intelligence Explorer (AMIE) beyond diagnosis to support longitudinal disease management. The system is now designed to assist clinicians in monitoring disease progression, adjusting treatments, and adhering to clinical guidelines across multiple patient visits.

Robert Krzaczyński
on Mar 07, 2025
AI, ML & Data Engineering

Mistral AI Introduces Saba: Regional Language Model for Arabic and South Indian Languages

Mistral AI has introduced Mistral Saba, a 24-billion-parameter language model designed to improve AI performance in Arabic and several Indian-origin languages, particularly South Indian languages like Tamil.

Robert Krzaczyński
on Mar 06, 2025
AI, ML & Data Engineering

Perplexity Unveils Deep Research: AI-Powered Tool for Advanced Analysis

Perplexity has introduced Deep Research, an AI-powered tool designed for conducting in-depth analysis across various fields, including finance, marketing, and technology. The system automates the research process by performing multiple searches, analyzing extensive sources, and synthesizing findings into structured reports within minutes.

Robert Krzaczyński
on Feb 24, 2025
AI, ML & Data Engineering

OmniHuman-1: Advancing AI-Generated Human Animation

OmniHuman-1, an advanced AI-driven human video generation model, has been introduced, marking a significant leap in multimodal animation technology. OmniHuman-1 enables the creation of highly lifelike human videos using minimal input, such as a single image and motion cues like audio or video.

Robert Krzaczyński
on Feb 20, 2025
