Large language models Content on InfoQ

News

AI, ML & Data Engineering

Neptune Combines AI‑Assisted Infrastructure as Code and Cloud Deployments

Now available in beta, Neptune is a conversational AI agent designed to act like an AI platform engineer, handling the provisioning, wiring, and configuration of the cloud services needed to run a containerized app. Neptune is both language- and cloud-agnostic, with support for AWS, GCP, and Azure.

Sergio De Simone
on Dec 22, 2025
AI, ML & Data Engineering

Meta Details GEM Ads Model Using LLM-Scale Training, Hybrid Parallelism, and Knowledge Transfer

Meta released details about its Generative Ads Model (GEM), a foundation model designed to improve ads recommendation across its platforms. The model addresses core challenges in recommendation systems (RecSys) by processing billions of daily user-ad interactions where meaningful signals such as clicks and conversions are very sparse.

Vinod Goje
on Dec 22, 2025
Java

TornadoVM 2.0 Brings Automatic GPU Acceleration and LLM Support to Java

The TornadoVM project recently reached version 2.0, a major milestone for the open-source project that aims to provide a heterogeneous hardware runtime for Java. The project automatically accelerates Java programs on multi-core CPUs, GPUs, and FPGAs. This release is likely to be of particular interest to teams developing LLM solutions on the JVM.

Ben Evans
on Dec 17, 2025
AI, ML & Data Engineering

Meta's Optimization Platform Ax 1.0 Streamlines LLM and System Optimization

Now stable, Ax is an open-source platform from Meta designed to help researchers and engineers apply machine learning to complex, resource-intensive experimentation. Over the past several years, Meta has used Ax to improve AI models, accelerate machine learning research, tune production infrastructure, and more.

Sergio De Simone
on Dec 16, 2025
AI, ML & Data Engineering

AlphaEvolve Enters Google Cloud as an Agentic System for Algorithm Optimization

Google Cloud announced the private preview of AlphaEvolve, a Gemini-powered coding agent designed to discover and optimize algorithms for complex engineering and scientific problems. The system is now available through an early access program on Google Cloud, targeting use cases where traditional brute-force or manual optimization methods struggle due to vast search spaces.

Robert Krzaczyński
on Dec 15, 2025
Development

Magika 1.0: Smarter, Faster File Detection with Rust and AI

Google has just released version 1.0 of Magika, a substantial rewrite of its open-source file type detection system. The new version leverages AI to support a broader range of file types and is built in Rust for maximum speed and security.

Sergio De Simone
on Dec 12, 2025
AI, ML & Data Engineering

Replit Introduces New AI Integrations for Multi-Model Development

Replit has introduced Replit AI Integrations, a feature that lets users select third-party models directly inside the IDE and automatically generate the code needed to run inference.

Daniel Dominguez
on Dec 09, 2025
DevOps

NVIDIA Dynamo Addresses Multi-Node LLM Inference Challenges

Serving Large Language Models (LLMs) at scale is complex. Many modern LLMs now exceed the memory and compute capacity of a single GPU or even a single multi-GPU node. As a result, inference workloads for models with 70B+ or 120B+ parameters, or for pipelines with large context windows, require multi-node, distributed GPU deployments.

Claudio Masolo
on Dec 04, 2025
Cloud

Arm Launches AI-Powered Copilot Assistant to Migrate Workflows to Arm Cloud Compute

At the recent GitHub Universe 2025 developer conference, Arm unveiled the Cloud migration assistant custom agent, a tool designed to help developers automate, optimize, and accelerate the migration of their x86 cloud workflows to Arm infrastructure.

Sergio De Simone
on Dec 03, 2025
AI, ML & Data Engineering

Memori Expands into a Full-Scale Memory Layer for AI Agents across SQL and MongoDB

Memori is an open-source memory system that gives AI agents structured, long-term memory using standard SQL databases and MongoDB. It integrates into existing frameworks, enabling efficient data extraction and retrieval without vendor lock-in. Memori's modular design targets reliability and scalability for developers building intelligent systems.

Robert Krzaczyński
on Dec 03, 2025
Mobile

Google's New LiteRT Accelerator Supercharges AI Workloads on Snapdragon-powered Android Devices

Google has introduced a new accelerator for LiteRT, called Qualcomm AI Engine Direct (QNN), to enhance on-device AI performance on Qualcomm-powered Android devices equipped with Snapdragon 8 SoCs. The accelerator delivers significant gains, offering up to a 100x speedup over CPU execution and 10x over GPU.

Sergio De Simone
on Nov 30, 2025
AI, ML & Data Engineering

Google Launches Agent Development Kit for Go

Google has added support for the Go language to its Agent Development Kit (ADK), enabling Go developers to build and manage agents in an idiomatic way that leverages the language's strong concurrency and typing features.

Sergio De Simone
on Nov 25, 2025
AI, ML & Data Engineering

Google Brings Colab Integration to Visual Studio Code

Google has announced the availability of a new Visual Studio Code extension that connects local notebooks to a Colab runtime. This allows developers to unify their previously separate local development setup and web-based Colab environment.

Sergio De Simone
on Nov 24, 2025
AI, ML & Data Engineering

AnyLanguageModel: Unified API for Local and Cloud LLMs on Apple Platforms

Developers on Apple platforms often face a fragmented ecosystem when using language models. Local models via Core ML or MLX offer privacy and offline capabilities, while cloud services like OpenAI, Anthropic, or Google Gemini provide advanced features. AnyLanguageModel, a new Swift package, simplifies integration by offering a unified API for both local and remote models.

Robert Krzaczyński
on Nov 24, 2025
AI, ML & Data Engineering

Olmo 3 Release Provides Full Transparency into Model Development and Training

The Allen Institute for AI has unveiled Olmo 3, an open-source language model family that empowers developers with full access to the model lifecycle, from training datasets to checkpoints. Featuring reasoning-focused variants and robust tools for post-training modifications, Olmo 3 promotes transparency, experimentation, and community collaboration, driving innovations in AI.

Robert Krzaczyński
on Nov 22, 2025
