AI, ML & Data Engineering Content on InfoQ

AI, ML & Data Engineering

Nexa AI Unveils Omnivision: a Compact Vision-Language Model for Edge AI

Nexa AI unveiled Omnivision, a compact vision-language model tailored for edge devices. By significantly reducing image tokens from 729 to 81, Omnivision lowers latency and computational requirements while maintaining strong performance in tasks like visual question answering and image captioning.

Robert Krzaczyński
on Dec 03, 2024

Physical Intelligence Unveils Robotics Foundation Model Pi-Zero

Physical Intelligence recently announced π0 (pi-zero), a general-purpose AI foundation model for robots. Pi-zero is based on a pre-trained vision-language model (VLM) and outperforms other baseline models in evaluations on five robot tasks.

Anthony Alford
on Dec 03, 2024

AWS Reveals Multi-Agent Orchestrator Framework for Managing AI Agents

AWS has introduced Multi-Agent Orchestrator, a framework designed to manage multiple AI agents and handle complex conversational scenarios. The system routes queries to the most suitable agent, maintains context across interactions, and integrates seamlessly with a variety of deployment environments, including AWS Lambda, local setups, and other cloud platforms.

Daniel Dominguez
on Dec 02, 2024

Microsoft Introduces Magentic-One, a Generalist Multi-Agent System

Microsoft has announced the release of Magentic-One, a new generalist multi-agent system designed to handle open-ended tasks involving web and file-based environments. This system aims to assist with complex, multi-step tasks across various domains, improving efficiency in activities such as software development, data analysis, and web navigation.

Daniel Dominguez
on Nov 30, 2024

QCon SF 2024 - Ten Reasons Your Multi-Agent Workflows Fail

At QCon SF 2024, Victor Dibia from Microsoft Research explored the complexities of multi-agent systems powered by generative AI. Highlighting common pitfalls like inadequate prompts and poor orchestration, he shared strategies for enhancing reliability and scalability. Dibia emphasized the need for meticulous design and oversight to unlock the full potential of these innovative systems.

Andrew Hoblitzell
on Nov 29, 2024

Epoch AI Unveils FrontierMath: A New Frontier in Testing AI's Mathematical Reasoning Capabilities

Epoch AI, in collaboration with over 60 mathematicians from leading institutions worldwide, has introduced FrontierMath, a new benchmark designed to evaluate AI systems' capabilities in advanced mathematical reasoning.

Vinod Goje
on Nov 28, 2024

Mistral AI Releases Two Small Language Models, Les Ministraux

Mistral AI recently released Ministral 3B and Ministral 8B, two small language models that are collectively called les Ministraux. The models are designed for local inference applications and outperform other comparably sized models on a range of LLM benchmarks.

Anthony Alford
on Nov 28, 2024

QCon SF 2024 - Scaling Large Language Model Serving Infrastructure at Meta

At QCon SF 2024, Ye (Charlotte) Qi of Meta tackled the complexities of scaling large language model (LLM) infrastructure, highlighting the "AI Gold Rush" challenge. She emphasized efficient hardware integration, latency optimization, and production readiness, alongside Meta's innovative approaches like hierarchical caching and automation to enhance AI performance and reliability.

Andrew Hoblitzell
on Nov 26, 2024

QCon SF 2024 - Incremental Data Processing at Netflix

Jun He gave a talk at QCon SF 2024 titled Efficient Incremental Processing with Netflix Maestro and Apache Iceberg. He showed how Netflix used the system to reduce processing time and cost while improving data freshness.

Anthony Alford
on Nov 25, 2024

LLaVA-CoT Shows How to Achieve Structured, Autonomous Reasoning in Vision Language Models

Chinese researchers fine-tuned Llama-3.2-11B to improve its ability to solve multimodal reasoning problems by going beyond the direct-response or chain-of-thought (CoT) approaches to reason step by step in a structured way. Named LLaVA-CoT, the new model outperforms its base model and proves better than larger models, including Gemini-1.5-pro, GPT-4o-mini, and Llama-3.2-90B-Vision-Instruct.

Sergio De Simone
on Nov 24, 2024

Microsoft Announces General Availability of Fabric API for GraphQL

Microsoft has launched Fabric API for GraphQL, moving the data access layer from public preview to general availability (GA). This release introduces several enhancements, including support for Azure SQL and Fabric SQL databases, saved credential authentication, detailed monitoring tools, and integration with CI/CD workflows.

Robert Krzaczyński
on Nov 24, 2024

Vercel Expands AI Toolkit with AI SDK 4.0 Update

Vercel has announced version 4.0 of its open-source AI SDK, a toolkit for building AI applications in JavaScript and TypeScript. The update introduces key features like PDF support, computer use integration, and a new xAI Grok API.

Daniel Dominguez
on Nov 24, 2024

QCon SF 2024 - Why ML Projects Fail to Reach Production

Wenjie Zi of Grammarly addressed the high failure rates in machine learning at QCon SF 2024, revealing challenges from misaligned business goals to poor data quality. She advocated for a "fail fast" approach and robust MLOps infrastructure, emphasizing that learning from failures can drive success. Clear objectives and rigorous practices are essential for effective implementation.

Andrew Hoblitzell
on Nov 22, 2024

QCon SF 2024 - Scale Batch GPU Inference with Ray

At QCon SF 2024, Cody Yu presented how Anyscale’s Ray can handle scaling out batch inference more effectively. The problems Ray can assist with include scaling to large datasets (hundreds of GBs or more), ensuring reliability across spot and on-demand instances, managing multi-stage heterogeneous compute, and balancing trade-offs between cost and latency.

Andrew Hoblitzell
on Nov 22, 2024

Techniques and Trends in AI-Powered Search by Faye Zhang at QCon SF

At QCon SF 2024, Faye Zhang gave a talk titled Search: from Linear to Multiverse, covering three trends and techniques in AI-powered search: multi-modal interaction, personalization, and simulation with AI agents.

Anthony Alford
on Nov 22, 2024
