AI, ML & Data Engineering Content on InfoQ
-
PayPal Standardizes on Apache Airflow and Apache Gobblin for Its Next-Gen Data Movement Platform
PayPal recently described how it standardized on Apache Airflow and Apache Gobblin to implement its next-gen data movement platform. In a blog post, PayPal engineers detail how the existing data movement platform had grown into a complex, unmanageable ecosystem of many tools and platforms, and how they shifted to a new implementation.
-
Pinterest Describes an Architecture for Efficient Retrieval of Hierarchical Documents
In a recent blog post, Pinterest engineers describe how they implemented an efficient two-stage retrieval architecture for hierarchical documents in a home-grown search engine. They accomplished this by combining index flattening, index normalization, and index denormalization.
-
AWS Announces Amazon Aurora Supports PostgreSQL 12
AWS has recently announced that Amazon Aurora, a MySQL- and PostgreSQL-compatible relational database built for the cloud, now supports PostgreSQL major version 12.
-
Kaggle Publishes 2020 State of Machine Learning and Data Science Report
Kaggle has published a report on the State of Machine Learning and Data Science for 2020. The report is based on survey responses from over two thousand users currently employed as data scientists. It notes that the "vast majority" of data scientists are under 35 years of age, two-thirds have a graduate degree, and most have fewer than 10 years of coding experience.
-
OpenAI Announces GPT-3 Model for Image Generation
OpenAI has trained a 12B-parameter AI model based on GPT-3 that can generate images from textual descriptions. A description can specify many independent attributes, including the position of objects and the image perspective, and the model can also synthesize combinations of objects that do not exist in the real world.
-
AWS Announces Enhanced Console Experience and New v2 APIs for Amazon Lex
AWS recently announced updates to Amazon Lex, a service for building conversational interfaces into any application using voice and text. The service now has an enhanced management console and new v2 APIs, including a continuous streaming capability.
-
Using Language and Developer Friendly Data Structures with Couchbase
Couchbase APIs have evolved to provide programming-language-friendly data structures, making them easier for programmers to incorporate into their programs. Some examples highlight how to use these data structures with the Couchbase Python SDK.
-
Confluent Announces Strategic Alliance with Microsoft
Confluent, the company founded by the original creators of Apache Kafka, recently announced a new strategic alliance with Microsoft to enable a more integrated experience between Confluent Cloud and the Azure platform.
-
Facebook Open-Sources Multilingual Speech Recognition Deep-Learning Model
Facebook AI Research (FAIR) open-sourced Cross-Lingual Speech Recognition (XLSR), a multilingual speech recognition AI model. XLSR is trained on 53 languages and outperforms existing systems when evaluated on common benchmarks.
-
Microsoft Research Develops a New Vision-Language System: VinVL
Microsoft Research recently developed a new object-attribute detection model for image encoding, which they named VinVL (Visual features in Vision-Language).
-
AWS Introduces HealthLake and Redshift ML in Preview
AWS introduced preview releases of the Amazon HealthLake service and a feature for Amazon Redshift called Redshift ML during re:Invent 2020 in December. Amazon HealthLake is a data lake service that helps healthcare, health insurance, and pharmaceutical companies derive value from their data with the help of NLP. Redshift ML gives Redshift users a gateway into SageMaker.
-
Microsoft Introduces Azure Health Bot
Microsoft recently introduced Azure Health Bot, an evolution of Microsoft Healthcare Bot that is becoming an Azure service with added functionality. Built for developing virtual healthcare assistants, Azure Health Bot combines medical databases with natural language capabilities.
-
TensorFlow 2.4 Release Includes CUDA 11 Support and API Updates
The TensorFlow project announced the release of version 2.4.0 of the deep-learning framework, featuring support for CUDA 11, cuDNN 8, and NVIDIA's Ampere GPU architecture, as well as new strategies and profiling tools for distributed training. Other API updates include mixed-precision in Keras and a NumPy frontend.
-
AI Models from Google and Microsoft Exceed Human Performance on Language Understanding Benchmark
Research teams from Google and Microsoft have recently developed natural language processing (NLP) AI models that have scored higher than the human baseline on the SuperGLUE benchmark. SuperGLUE scores a model on several natural language understanding (NLU) tasks, including question answering and reading comprehension.
-
Medium Describes "Rex" - a Go-Based Recommendation Service
In a recent blog post, Medium describes how it built a recommendation service named "Rex." The original recommendation service was part of the Node.js monolith, and it could only rank 150 stories. However, Medium wanted this service to rank hundreds of thousands of stories per user in under a second. So, they decided to build an entirely new, separate service using Go.