AI, ML & Data Engineering Content on InfoQ
-
A New Service from the Microsoft and Oracle Partnership: Oracle Database Service for Microsoft Azure
Recently, Microsoft and Oracle announced the general availability (GA) of Oracle Database Service for Microsoft Azure, a new service that allows Microsoft Azure customers to provision, access, and monitor enterprise-grade Oracle Database services in Oracle Cloud Infrastructure (OCI).
-
BigScience Releases 176B Parameter AI Language Model BLOOM
The BigScience research workshop released BigScience Large Open-science Open-access Multilingual Language Model (BLOOM), an autoregressive language model based on the GPT-3 architecture. BLOOM is trained on data from 46 natural languages and 13 programming languages and is the largest publicly available open multilingual model.
-
Meta Hopes to Increase Accuracy of Wikipedia with New AI Model
Meta AI's research and advancements team developed a neural-network-based system, called SIDE, that can scan hundreds of thousands of Wikipedia citations at once and check whether they truly support the corresponding claims. Wikipedia is a multilingual free online encyclopedia written and maintained by volunteers through open collaboration and a wiki-based editing system.
-
AWS Announced Synthetic Data Generation for SageMaker Ground Truth
AWS announced that users can now create labeled synthetic data with Amazon SageMaker Ground Truth. SageMaker Ground Truth is a data labeling service that simplifies labeling data and lets you choose between human annotators from third-party suppliers, Amazon Mechanical Turk, or your own private workforce.
-
Java News Roundup: JDK 19 in RDP2, Oracle Critical Patch Update, TornadoVM on M1, Grails CVE
This week's Java roundup for July 18th, 2022, features news from Oracle, JDK 18, JDK 19, JDK 20, Spring Boot and Spring Security milestone and point releases, Spring for GraphQL 1.0.1, Liberica JDK updates, Quarkus 2.10.3, CVE in Grails, JobRunr 5.1.6, JReleaser maintenance, Apache Tomcat 9.0.65 and 10.1.0-M17, TornadoVM on Apple M1 and the JBNC conference.
-
Amazon Redshift Serverless Generally Available to Automatically Scale Data Warehouse
Amazon recently announced the general availability of Redshift Serverless, an elastic option to scale data warehouse capacity. The new service allows data analysts, developers and data scientists to run and scale analytics without provisioning and managing data warehouse clusters.
-
Shopify’s Practical Guidelines from Running Airflow for ML and Data Workflows at Scale
Shopify engineering shared, in a company blog post, its experience scaling and optimizing Apache Airflow for running ML and data workflows. The post offers practical solutions to the challenges the team faced, such as slow file access, insufficient control over DAGs, irregular traffic levels, and resource contention among workloads.
-
Obituary: Alex Blewitt
It is with great sadness that we announce that InfoQ editor Dr. Alex Blewitt has unexpectedly passed away.
-
Google's Image-Text AI LIMoE Outperforms CLIP on ImageNet Benchmark
Researchers at Google Brain recently trained Language-Image Mixture of Experts (LIMoE), a 5.6B parameter image-text AI model. In zero-shot learning experiments on ImageNet, LIMoE outperforms CLIP and performs comparably to state-of-the-art models while using fewer compute resources.
-
PyTorch 1.12 Release Includes Accelerated Training on Macs and New Library TorchArrow
The PyTorch open-source deep-learning framework announced the release of version 1.12, which includes support for GPU-accelerated training on Apple silicon Macs and a new data preprocessing library, TorchArrow, as well as updates to other libraries and APIs.
-
Google AI Developed a Language Model to Solve Quantitative Reasoning Problems
Google AI developed a deep learning language model called Minerva that can solve quantitative mathematical problems. The researchers built a state-of-the-art model by training on a large dataset that contains quantitative reasoning with symbolic expressions. The final model can solve quantitative problems on STEM reasoning tasks.
-
MLGO Framework Brings Machine Learning to Compiler Optimizations
Google’s new Machine Learning Guided Optimization (MLGO) is an industrial-grade general framework for integrating machine-learning (ML) techniques systematically in a compiler and in particular in LLVM. Compiling faster and smaller code can significantly reduce the operational cost of large data-center applications.
-
OpenAI Releases Minecraft-Playing AI VPT
Researchers from OpenAI have open-sourced Video PreTraining (VPT), a semi-supervised learning technique for training game-playing agents. In a zero-shot setting, VPT performs tasks that agents cannot learn via reinforcement learning (RL) alone, and with fine-tuning is the first AI to craft a diamond pickaxe in Minecraft.
-
Google's BigQuery Introduces Column-Level Encryption Functions and Dynamic Masking of Information
Google recently released new features for its SaaS data warehouse BigQuery, including column-level encryption functions and dynamic masking of information. Specifically, dynamic masking of information can be used for real-time transactions, whereas column-level encryption provides additional security for data at rest or in motion where real-time usability is not required.
-
LinkedIn Open-Sourced Its Feature Store to Evangelize Productive Machine Learning
LinkedIn Engineering recently open-sourced its feature store Feathr, which helps engineers develop machine learning products by simplifying feature management and usage in production. It defines features, computes them for training and inference purposes, and makes them discoverable by other machine learning developers.