AI, ML & Data Engineering Content on InfoQ
-
LLaVA-CoT Shows How to Achieve Structured, Autonomous Reasoning in Vision Language Models
Chinese researchers fine-tuned Llama-3.2-11B to improve its ability to solve multimodal reasoning problems, going beyond direct-response or chain-of-thought (CoT) approaches to reason step by step in a structured way. Named LLaVA-CoT, the new model outperforms its base model and proves better than larger models, including Gemini-1.5-pro, GPT-4o-mini, and Llama-3.2-90B-Vision-Instruct.
-
Microsoft Announces General Availability of Fabric API for GraphQL
Microsoft has launched Fabric API for GraphQL, moving the data access layer from public preview to general availability (GA). This release introduces several enhancements, including support for Azure SQL and Fabric SQL databases, saved credential authentication, detailed monitoring tools, and integration with CI/CD workflows.
-
Vercel Expands AI Toolkit with AI SDK 4.0 Update
Vercel has announced version 4.0 of AI SDK, its open-source toolkit for building AI applications in JavaScript and TypeScript. The update introduces key features like PDF support, computer use integration, and a new xAI Grok API.
-
QCon SF 2024 - Why ML Projects Fail to Reach Production
Wenjie Zi of Grammarly addressed the high failure rates in machine learning at QCon SF 2024, revealing challenges from misaligned business goals to poor data quality. She advocated for a "fail fast" approach and robust MLOps infrastructure, emphasizing that learning from failures can drive success. Clear objectives and rigorous practices are essential for effective implementation.
-
QCon SF 2024: Scale Batch GPU Inference with Ray
At QCon SF 2024, Cody Yu presented how Anyscale’s Ray can more effectively handle scaling out batch inference. The problems Ray can assist with include scaling to large datasets (hundreds of GBs or more), ensuring reliability with spot and on-demand instances, managing multi-stage heterogeneous compute, and managing tradeoffs between cost and latency.
-
Techniques and Trends in AI-Powered Search by Faye Zhang at QCon SF
At QCon SF 2024, Faye Zhang gave a talk titled "Search: from Linear to Multiverse", covering three trends and techniques in AI-powered search: multi-modal interaction, personalization, and simulation with AI agents.
-
Aurora Limitless: AWS Introduces New PostgreSQL Database with Automated Horizontal Scaling
AWS has announced the general availability of Amazon Aurora PostgreSQL Limitless Database, a relational database designed to provide automated horizontal scaling. This new option can handle millions of write transactions per second and manage petabytes of data, all within a single database environment.
-
QCon SF: Mandy Gu on Using Generative AI for Productivity at Wealthsimple
Mandy Gu spoke at QCon SF 2024 about how Wealthsimple, a Canadian fintech company, uses Generative AI to improve productivity. Her talk focused on the development and evolution of their GenAI tool suite and how Wealthsimple crossed the "Trough of Disillusionment" to achieve productivity gains.
-
Timescale Bolsters AI-Ready PostgreSQL with pgai Vectorizer
Timescale recently expanded its PostgreSQL AI offerings with pgai Vectorizer. This update enables developers to create, store, and manage vector embeddings alongside relational data without the need for external tools or additional infrastructure.
-
QCon SF: Large Scale Search and Ranking Systems at Netflix
Moumita Bhattacharya spoke at QCon SF 2024 about state-of-the-art search and ranking systems. She gave an overview of the typical structure of these systems and followed with a deep dive into how Netflix created a single combined model to handle both tasks.
-
QCon SF: Using Metaflow to Support Diverse ML Systems at Netflix
At QCon SF 2024, David Berg and Romain Cledat gave a talk about how Netflix uses Metaflow, an open-source framework, to support a variety of ML systems. The pair gave an overview of Metaflow's design principles and illustrated several of Netflix's use cases, including media processing, content demand modeling, and meta-models for explaining models.
-
How Uber Sped up SQL-based Data Analytics with Presto and Express Queries
Uber uses Presto, an open-source distributed SQL query engine, to provide analytics across several data sources, including Apache Hive, Apache Pinot, MySQL, and Apache Kafka. To improve its performance, Uber engineers explored the advantages of treating quick queries, a.k.a. express queries, differently from other queries and found they could improve both Presto utilization and response times.
-
Google Introduces Gemini AI Features to Android Studio
Google has released a set of updates to Gemini in Android Studio, aiming to enhance developer productivity through AI-powered features. This release is designed to bring AI to every stage of the development lifecycle, including AI-assisted coding, refactoring, generating documentation, analyzing and testing code, and suggesting fixes.
-
Meta Releases NotebookLlama: Open-Source PDF to Podcast Toolkit
Meta has released NotebookLlama, an open-source toolkit designed to convert PDF documents into podcasts, providing developers with a structured, accessible PDF-to-audio workflow. As an open-source alternative to Google’s NotebookLM, NotebookLlama guides users through a four-step process that converts PDF text into audio content.
-
GitHub Universe 2024 Unveils AI Innovations and Developer-Centric Tools
GitHub Universe 2024 unveiled groundbreaking updates emphasizing developer autonomy and AI capabilities. With multi-model support for Copilot, the introduction of AI-driven GitHub Spark, enhanced security features, and improved workflows in popular IDEs, GitHub aims to democratize coding and empower developers, regardless of skill level, to harness the full potential of artificial intelligence.