AI, ML & Data Engineering Content on InfoQ
-
Google Trains User Interface and Infographics Understanding AI Model ScreenAI
Google Research recently developed ScreenAI, a multimodal AI model for understanding infographics and user interfaces. ScreenAI is based on the PaLI architecture and achieves state-of-the-art performance on several tasks.
-
Java News Roundup: JobRunr 7.0, Introducing the Commonhaus Foundation, Payara Platform, Devnexus
This week's Java roundup for April 8th, 2024, features news highlighting: JobRunr 7.0; introducing the Commonhaus Foundation; the April 2024 edition of Payara Platform; JEP 473, Stream Gatherers (Second Preview), and JEP 469, Vector API (Eighth Incubator), Proposed to Target for JDK 23; and Devnexus 2024.
-
QCon London: Lessons Learned from Building LinkedIn’s AI/ML Data Platform
At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. He specifically delved into Venice DB, the NoSQL data store used for feature persistence. The presenter shared the lessons learned from evolving and operating the platform, including cluster management and library versioning.
-
Large Language Models for Code by Loubna Ben Allal at QCon London
At QCon London, Loubna Ben Allal discussed Large Language Models (LLMs) for code. She walked through the lifecycle of code completion models, which consists of pre-training on vast codebases, fine-tuning, and continuous adaptation. She focused on open-source models, which are powered by platforms like Hugging Face.
-
NVIDIA Announces Next-Generation AI Superchip Blackwell
NVIDIA recently announced its next-generation GPU architecture, Blackwell. Blackwell is the largest GPU ever built, with over 200 billion transistors, and can train large language models (LLMs) up to 4x faster than previous-generation hardware.
-
Navigating LLM Deployment: Tips, Tricks and Techniques by Meryem Arik at QCon London
At QCon London, Meryem Arik discussed deploying Large Language Models (LLMs). While initial proofs of concept benefit from hosted solutions, scaling demands self-hosting to cut costs, enhance performance with tailored models, and meet privacy and security requirements. She emphasized understanding deployment limits, quantization for efficiency, and optimizing inference to fully use GPU resources.
-
Java News Roundup: New JEP Candidates, Project Bisbane, Ktor Plugin Repository, JDKUpdater
This week's Java roundup for April 1st, 2024, features news highlighting: new JEP candidates, namely JEP 469, Vector API (Eighth Incubator), JEP 473, Stream Gatherers (Second Preview), and JEP 474, ZGC: Generational Mode by Default; Project Bisbane; and the introduction of the Ktor Plugin Repository and JDKUpdater.
-
Microsoft Announces Garnet: a New Open-Source Cache-Store and Redis Alternative
Microsoft Research has recently announced Garnet, an open-source cache-store designed to accelerate applications and services. Garnet uses the RESP wire protocol, which makes it compatible with existing Redis clients, and is presented as a faster alternative to other cache-stores.
-
Nvidia Announces Robotics-Oriented AI Foundational Model
At its recent GTC 2024 event, Nvidia announced a new foundational model to build intelligent humanoid robots. Dubbed GR00T, short for Generalist Robot 00 Technology, the model will understand natural language and be able to observe human actions and emulate human movements.
-
KubeCon EU Keynotes: a Call to Action to Innovate Responsibly with Generative AI
The KubeCon EU morning keynotes were a veritable call to action encouraging the cloud-native community's involvement in building the scalable infrastructure needed by generative AI. This call was balanced with encouragement to make a cloud-native platform’s “golden path” green and sustainable, ensuring that any innovation is also responsible.
-
Meta Unveils 24k GPU AI Infrastructure Design
Meta recently announced the design of two new AI computing clusters, each containing 24,576 GPUs. The clusters are based on Meta's Grand Teton hardware platform, and one cluster is currently used by Meta for training their next-generation Llama 3 model.
-
Java News Roundup: Jakarta Data and Jakarta NoSQL Milestones, Class-File API Targeted for JDK 23
This week's Java roundup for March 25th, 2024, features news highlighting: JEP 466, Class-File API (Second Preview), targeted for JDK 23; milestone releases of Jakarta Data and Jakarta NoSQL specifications; the second release candidate for JobRunr 7.0.0; and point releases for Spring projects, Quarkus, Helidon and LangChain4j.
-
Reddit Migrates Media Metadata from S3 and Other Systems into AWS Aurora Postgres
Reddit consolidated its media metadata storage into a new architecture using AWS Aurora Postgres. Previously, the company sourced media metadata from various systems, including directly from AWS S3. The new solution simplifies media metadata retrieval and handles 100k+ requests per second with latency below 5ms (p90).
-
Databricks Announces DBRX, an Open Source General Purpose LLM
Databricks launched DBRX, a new open-source large language model (LLM) that aims to redefine the standards of open models and outperform well-known competitors on industry benchmarks.
-
Transactional Serverless Computing: PostgreSQL Creator Announces DBOS Cloud
The creators of DBOS have recently introduced DBOS Cloud, a transactional serverless application platform tailored for TypeScript developers. With all state information stored in a highly available DBMS, this new platform assures transactional serverless computing, offering reliable execution alongside so-called "time travel" capabilities.