-
LLaVA-CoT Shows How to Achieve Structured, Autonomous Reasoning in Vision Language Models
Chinese researchers fine-tuned Llama-3.2-11B to improve its ability to solve multimodal reasoning problems by going beyond the direct-response or chain-of-thought (CoT) approaches to reason step by step in a structured way. Named LLaVA-CoT, the new model outperforms its base model and proves better than larger models, including Gemini-1.5-pro, GPT-4o-mini, and Llama-3.2-90B-Vision-Instruct.
-
Microsoft Announces General Availability of Fabric API for GraphQL
Microsoft has launched Fabric API for GraphQL, moving the data access layer from public preview to general availability (GA). This release introduces several enhancements, including support for Azure SQL and Fabric SQL databases, saved credential authentication, detailed monitoring tools, and integration with CI/CD workflows.
-
Vercel Expands AI Toolkit with AI SDK 4.0 Update
Vercel has announced version 4.0 of its open-source AI SDK toolkit designed for building AI applications in JavaScript and TypeScript. The update introduces key features like PDF support, computer use integration, and a new xAI Grok API.
-
First Google Axion Processor Now Available: Claims Best Performance in Cloud Market
Google has announced the general availability of its C4A virtual machines, marking the debut of Axion-based instances. The cloud provider claims these instances deliver up to 10% better price-performance compared to the latest Arm-based alternatives from competitors, including Amazon Graviton4.
-
Netflix Rolls Out Service-Level Prioritized Load Shedding to Improve Resiliency
Netflix extended its prioritized load-shedding implementation to the individual service level to further improve system resilience. The approach uses cloud capacity more efficiently by shedding low-priority requests only when necessary instead of maintaining separate clusters for failure isolation.
-
QCon SF 2024 - Why ML Projects Fail to Reach Production
Wenjie Zi of Grammarly addressed the high failure rates in machine learning at QCon SF 2024, revealing challenges from misaligned business goals to poor data quality. She advocated for a "fail fast" approach and robust MLOps infrastructure, emphasizing that learning from failures can drive success. Clear objectives and rigorous practices are essential for effective implementation.
-
QCon SF 2024: Scale Batch GPU Inference with Ray
At QCon SF 2024, Cody Yu presented how Anyscale’s Ray can handle scaling out batch inference more effectively. Problems Ray can help with include scaling to large datasets (hundreds of GBs or more), ensuring reliability with spot and on-demand instances, managing multi-stage heterogeneous compute, and managing tradeoffs between cost and latency.
-
Techniques and Trends in AI-Powered Search by Faye Zhang at QCon SF
At QCon SF 2024, Faye Zhang gave a talk titled "Search: from Linear to Multiverse", covering three trends and techniques in AI-powered search: multi-modal interaction, personalization, and simulation with AI agents.
-
WildFly 34 Adds Preview of Jakarta EE 11 and Support for Jakarta Data
The WildFly community announced the release of WildFly 34, emphasizing the significant changes made to WildFly Preview, including support for Jakarta Data 1.0, MicroProfile REST Client 4.0, and MicroProfile Telemetry 2.0. Other minor updates include Hibernate ORM 6.6.x, Hibernate Search 7.2, and FasterXML Jackson 2.17.
-
Aurora Limitless: AWS Introduces New PostgreSQL Database with Automated Horizontal Scaling
AWS has announced the general availability of Amazon Aurora PostgreSQL Limitless Database, a relational database designed to provide automated horizontal scaling. This new option can handle millions of write transactions per second and manage petabytes of data, all within a single database environment.
-
DevProxy 0.22 Improves API Permission Checks
Microsoft has released version 0.22 of DevProxy, an API simulation command-line tool. The new version improves logging and detects minimal permissions without the need for Azure API Center.
-
Spring Framework 6.2 and Spring Boot 3.4 Improve Containers, Actuators Ahead of New 2025 Generations
Broadcom released Spring Framework 6.2 and Spring Boot 3.4, keeping the Java 17 and Jakarta EE 9 baselines. Spring Boot 3.4 adds structured logging, adds container images to Docker Compose and Testcontainers, and improves container building and actuators. Broadcom announced Spring Framework 7 and Spring Boot 4 for 2025 with Java 17 and Jakarta EE 11. InfoQ spoke to Juergen Hoeller and Sébastien Deleuze.
-
How to Delight Your Developers with User-Centric Platforms and Practices
By focusing on the users, platform development teams can ensure that they build a platform that tackles the true needs of developers, Ana Petkovska said at QCon London. In her talk, "Delight Your Developers with User-Centric Platforms & Practices", she shared what their Developer Experience (DevEx) group looks like and what products and services they provide.
-
ASP.NET Core 9: Enhancements in Static Asset Handling, Blazor, SignalR, and OpenAPI Support
Microsoft has released .NET 9, which includes ASP.NET Core 9. This latest release focuses on optimizing static asset handling, refining Blazor's component interaction, enhancing SignalR's observability and performance, and streamlining API documentation through built-in OpenAPI support.
-
QCon SF: Mandy Gu on Using Generative AI for Productivity at Wealthsimple
Mandy Gu spoke at QCon SF 2024 about how Wealthsimple, a Canadian fintech company, uses Generative AI to improve productivity. Her talk focused on the development and evolution of their GenAI tool suite and how Wealthsimple crossed the "Trough of Disillusionment" to achieve productivity.