AI, ML & Data Engineering Content on InfoQ
-
Unpacking How Ads Ranking Works @ Pinterest: Aayush Mudgal at QCon San Francisco
At QCon San Francisco, Aayush Mudgal gave a talk on Pinterest's ad ranking strategy. Pinterest performs both candidate retrieval and ranking, drawing on user interaction data and on what users are currently viewing. Neural networks create embeddings for ads and users, so that ads whose embeddings lie close to a user's embedding are expected to be relevant. Models are trained and deployed daily.
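The idea that ads "close to the user" in embedding space are the relevant ones can be illustrated with a small sketch. This is not Pinterest's implementation: the toy three-dimensional vectors, the ad ids, the function names, and the choice of cosine similarity as the closeness measure are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(user_embedding, ad_embeddings, k=2):
    """Return the ids of the k ads whose embeddings are closest to the user's."""
    scored = [
        (cosine_similarity(user_embedding, vec), ad_id)
        for ad_id, vec in ad_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [ad_id for _, ad_id in scored[:k]]

# Hypothetical embeddings; real ones would come from trained neural networks.
user = [0.9, 0.1, 0.0]
ads = {
    "ad_hiking_boots": [0.8, 0.2, 0.1],
    "ad_sofa": [0.0, 0.1, 0.9],
    "ad_tent": [0.7, 0.0, 0.3],
}

top = retrieve_top_k(user, ads, k=2)
print(top)
```

In a two-stage setup like the one described, a retrieval step of this kind would only narrow the candidate set; a separate ranking model would then order the surviving ads.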
-
OpenAI Announces ChatGPT Voice and Image Features
OpenAI recently announced new voice and image features for ChatGPT. A new backend model, GPT-4V, will handle image inputs, and an updated DALL-E model will be integrated to generate images. In addition, users of the mobile ChatGPT app will be able to hold voice conversations with the chatbot.
-
Generative AI Service AWS Bedrock Now Generally Available
After announcing Bedrock last April in preview, Amazon is now making its fully-managed service for generative AI apps generally available.
-
Confluent Announces Apache Flink on Confluent Cloud in Open Preview
Confluent recently announced the open preview of Apache Flink on Confluent Cloud as a fully-managed service for stream processing. The company claims that the managed service will make it easier for companies to filter, join, and enrich data streams with Flink.
-
Allegro Uses Control Theory for Workload Balancing in its Apache Kafka PubSub Platform
Allegro, the largest eCommerce platform in Poland, implemented dynamic workload balancing in Hermes, its open-source publish-subscribe message broker, built on top of Apache Kafka. The new workload balancing algorithm achieves more uniform resource utilization and lower infrastructure costs.
-
Multi-Modal LLM NExT-GPT Handles Text, Images, Videos, and Audio
The NExT Research Center at the National University of Singapore (NUS) recently open-sourced NExT-GPT, an "any-to-any" multi-modal large language model (LLM) that can handle text, images, videos, and audio as input or output. NExT-GPT is based on existing pre-trained models and only required updating 1% of its total parameters during training.
-
Java News Roundup: JDK 21, GraalVM for JDK 21, Apache Pinot 1.0, Eclipse Epicyro 3.0
This week's Java roundup for September 18th, 2023, features news from OpenJDK, JDK 22, JDK 21, GraalVM, Corretto, Liberica, Epicyro 3.0, Pinot 1.0, and releases for: Spring Boot; Spring Integration; Spring Batch; Spring Cloud Dataflow; Spring Security; Spring GraphQL; Spring Authorization Server; Spring Apache Pulsar; Spring Modulith; Quarkus; Open Liberty; Micronaut; Hibernate; OpenXava; Gradle.
-
Hugging Face's Guide to Optimizing LLMs in Production
When it comes to deploying Large Language Models (LLMs) in production, the two major challenges stem from the huge number of parameters they require and the need to handle very long input sequences to represent contextual information. Hugging Face has documented a list of techniques to tackle these hurdles, based on its experience serving such models.
-
High Performance Functions in Rust on RDS PostgreSQL
AWS announced the general availability of the Rust procedural language handler, PL/Rust, for Amazon Relational Database Service (RDS) instances running versions 13 and 14 of PostgreSQL. This builds on the previous release in May 2023 that enabled the functionality only for instances running PostgreSQL version 15.
-
Meta Open-Sources Multilingual Translation Foundation Model SeamlessM4T
Meta recently open-sourced Massively Multilingual & Multimodal Machine Translation (SeamlessM4T), a multilingual translation AI that can translate both speech audio and text data across nearly 100 languages. SeamlessM4T is trained on 1 million hours of audio data and outperforms the current state-of-the-art speech-to-text translation model.
-
Cloudflare One Data Protection Suite for Data Security across Web, Private, and SaaS Applications
Cloudflare recently announced its One Data Protection Suite, a unified set of advanced security solutions designed to protect data across every environment: web, private, and SaaS applications. The company states the suite is powered by Cloudflare’s Security Service Edge (SSE), allowing customers to streamline compliance in the cloud and to mitigate data exposure and loss of source code.
-
JCP EC Industry Experts Reveal Their Favorite JDK 21 Feature at Special Oracle Event in NYC
At a special event hosted by the New York Java Special Interest Group and Garden State Java User Group at BNY Mellon in New York City, industry experts from the Java Community Process (JCP) Executive Committee participated in a panel discussion to reveal their favorite features from the upcoming release of JDK 21. Included in the festivities was a celebration of the 25th anniversary of the JCP.
-
Abu Dhabi Releases Largest Openly-Available Language Model Falcon 180B
The Abu Dhabi government's Technology Innovation Institute (TII) released Falcon 180B, currently the largest openly-available large language model (LLM). Falcon 180B contains 180 billion parameters and outperforms GPT-3.5 on the MMLU benchmark.
-
AWS Unveils Multi-Model Endpoints for PyTorch on SageMaker
AWS has introduced Multi-Model Endpoints for PyTorch on Amazon SageMaker. This latest development promises to revolutionize the AI landscape, offering users more flexibility and efficiency when deploying machine learning models.
-
AI, ML, Data Engineering News Roundup: Stable Chat, Vertex AI, ChatGPT and Code Llama
The most recent roundup, which covers developments through September 4, 2023, highlights significant announcements and accomplishments in the fields of artificial intelligence, machine learning, and data science. Developments from Stability AI, Google, OpenAI, and Meta were among this week's significant stories.