AI, ML & Data Engineering Content on InfoQ
-
Google Releases Hive-BigQuery Open-Source Connector
Google recently announced the general availability of the Hive-BigQuery Connector, simplifying integration and migrations between Apache Hive and Google BigQuery. The open-source connector is a Hive storage handler that enables Hive to interact with BigQuery's storage layer.
-
Microsoft Introduces the Public Preview of Vector Search Feature in Azure Cognitive Search
At its annual Inspire conference, Microsoft announced the public preview of vector search in Azure Cognitive Search, a new capability for indexing, storing, and retrieving vector embeddings from a search index. It is intended for building applications powered by large language models.
-
Meta AI Reveals CM3leon, an Advanced Text-to-Image Generative Model
Meta AI has introduced CM3leon, a multimodal model that generates both text and images. According to Meta, it is the first model of its type to use a recipe adapted from text-only language models, and it delivers strong results with high computational efficiency.
-
Microsoft Azure Managed Lustre for HPC and AI Workloads Now Generally Available
Microsoft recently announced the general availability (GA) of Azure Managed Lustre, a managed file system for high-performance computing (HPC) and AI workloads.
-
Introduction to Mojo Programming Language
Mojo is a newly introduced programming language that combines the simplicity of Python with the speed and memory safety of Rust. It is at an early stage of development and offers users an online playground to explore its features. Mojo targets data science and machine learning, aiming to provide a fast alternative to Python. There are plans to open-source it gradually.
-
Berkeley Open-Sources AI Image-Editing Model InstructPix2Pix
Researchers from the Berkeley Artificial Intelligence Research (BAIR) Lab have open-sourced InstructPix2Pix, a deep-learning model that follows human instructions to edit images. InstructPix2Pix was trained on synthetic data and outperforms a baseline AI image-editing model.
-
EU AI Act: the Regulatory Framework on the Usage of Machine Learning in the European Union
Following the first publication in 2021 of the proposal on the operation of machine learning applications, negotiations on the final form of the legislation started in the EU Council on June 14th. The EU countries are expected to reach an agreement by the end of 2023. The EU AI Act takes a risk-based approach and aims to avoid disproportionate requirements when the regulation is applied.
-
Databricks Unveils Lakehouse AI and MosaicML Acquisition at Data + AI Summit
The data and AI company Databricks recently unveiled Lakehouse AI, a suite of tools for building and governing generative AI models, including large language models (LLMs), within the Databricks platform. Among the tools is LakehouseIQ, a "knowledge engine" that uses AI to understand a company's unique data, culture, and language in order to improve natural language interfaces like chatbots.
-
Yelp Rebuilds Corrupted Cassandra Cluster Using Its Data Streaming Architecture
Yelp created a solution to sanitize data from a corrupted Apache Cassandra cluster using its data streaming architecture. The team explored many options for addressing the data corruption but ultimately had to move the data into a new cluster, removing corrupted records in the process.
-
Google Releases Cloud SQL Enterprise Plus for MySQL and PostgreSQL
Google Cloud recently announced Cloud SQL Enterprise Plus, a new edition of its managed database service for MySQL and PostgreSQL. The new edition provides performance optimizations for read and write operations, improved machine types and configurations, and an integrated SSD-backed data cache option.
-
AWS Introduces New Clickstream Analytics on AWS Solution for Mobile and Web Applications
AWS recently announced Clickstream Analytics on AWS, an end-to-end solution to collect, ingest, analyze, and visualize clickstream data from organizations’ web and mobile applications.
-
AI Assistant Comes to ReSharper
JetBrains released an AI-powered version of ReSharper, its developer productivity extension for Microsoft Visual Studio. The new version, ReSharper 2023.2, is the first to include AI-powered development assistance.
-
Instacart Creates a Self-Serve Apache Flink Platform on Kubernetes
Instacart moved its Apache Flink workloads from AWS EMR to Kubernetes to meet the high demand for Flink-based data processing within the organization, as EMR had become problematic for many teams with different requirements. As a result, the company made the platform easier to use and reduced its operational and infrastructure costs.
-
Google's Speech AI AudioPaLM Performs Translation with Voice Transfer
Researchers at Google announced AudioPaLM, a large language model (LLM) that performs text-to-speech (TTS), automated speech recognition (ASR), and speech-to-speech translation (S2ST) with voice transfer. AudioPaLM is based on the PaLM-2 LLM and outperforms OpenAI's Whisper on translation benchmarks.
-
Descaling for Delivery and Using AI to Enhance Software Development: Learnings from QCon New York
The track "Optimizing Teams for Fast Flow - Surviving in the Post-agile Aftermath" at QCon New York 2023 included two morning talks that covered replacing an agile process with engineering and conversational software delivery using AI.