AI, ML & Data Engineering Content on InfoQ
-
Amazon Updates Transcribe with Automatic Redaction of Personally Identifiable Information Feature
Amazon Transcribe is an automatic speech recognition (ASR) service, allowing customers to add speech-to-text capabilities to their applications. Recently, the public cloud provider made a significant update to the service with an automatic redaction of Personally Identifiable Information (PII) feature.
-
TensorFlow Quantum Joins Quantum Computing and Machine Learning
TensorFlow Quantum (TFQ) brings Google's quantum computing framework Cirq and TensorFlow together to enable the creation of quantum machine learning (ML) models.
-
Facebook Research Develops AI System for Music Source Separation
Facebook Research recently released Demucs, a new deep-learning-powered system for music source separation. Demucs outperforms previously reported results based on human evaluations of overall quality of sound after separation.
-
Deep Learning Accelerates Scientific Simulations up to Two Billion Times
Researchers from several physics and geology laboratories have developed Deep Emulator Network SEarch (DENSE), a technique for using deep learning to perform scientific simulations in fields ranging from high-energy physics to climate science. Compared to previous simulators, the results from DENSE achieved speedups ranging from 10 million to 2 billion times.
-
Google Expands Open Images Dataset and Adds New Localized Narratives Annotation
Google AI has just released a new version (V6) of its photo dataset Open Images, which now includes an entirely new type of annotation called localized narratives. These multimodal descriptions of images incorporate synchronized voice, text, and mouse trace annotations that provide more in-depth training data for what is already one of the largest open-source annotated image datasets.
-
Microsoft Ships Preview of Cluster-Friendly Cloud Disks
Storage is one of the more mature services in the public cloud, but rarely supports traditional clustered systems. To attract those on-premises workloads, the Microsoft Azure team released a preview of Azure Shared Disks, a block storage option for attaching managed disks to multiple virtual machines.
-
Splice Machine Data Platform 3.0 Supports Kubernetes Managed Service and New ML Manager
The latest version of the distributed SQL data platform Splice Machine supports a new Kubernetes managed service, a new version of Machine Learning Manager (v2.0), and automatic in-database model deployment.
-
Oracle Cloud Now Offers Data Science and Machine Learning Services
Oracle recently announced the availability of its Cloud Data Science Platform, a native service on Oracle Cloud Infrastructure (OCI) designed to let teams of data scientists collaborate on the development, deployment, and maintenance of machine learning models.
-
Spotify Open-Sources Terraform Module for Kubeflow ML Pipelines
Spotify has open-sourced their Terraform module for running machine-learning pipeline software Kubeflow on Google Kubernetes Engine (GKE). By switching their in-house ML platform to Kubeflow, Spotify engineers have achieved faster time to production and are producing 7x more experiments than on the previous platform.
-
MIT CSAIL TextFooler Framework Tricks Leading NLP Systems
A team of researchers at the MIT Computer Science & Artificial Intelligence Lab (CSAIL) recently released a framework called TextFooler which successfully tricked state-of-the-art NLP models (such as BERT) into making incorrect predictions.
-
PyTorch 1.4 Release Introduces Java Bindings, Distributed Training
PyTorch, Facebook's open-source deep-learning framework, announced the release of version 1.4. This release, which will be the last version to support Python 2, includes improvements to distributed training and mobile inference and introduces support for Java.
-
Boosting Apache Spark with GPUs and the RAPIDS Library
At the 2019 Spark AI Summit Europe conference, NVIDIA software engineers Thomas Graves and Miguel Martinez hosted a session on Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS Library. InfoQ recently talked with Jim Scott, head of developer relations at NVIDIA, to learn more about accelerating Apache Spark with GPUs and the RAPIDS library.
-
GitHub Releases ML-Based "Good First Issues" Recommendations
GitHub shipped an updated version of its good first issues feature, which combines a machine learning (ML) model that identifies easy issues with a hand-curated list of issues that project maintainers have labeled "easy". New and seasoned open source contributors can use this feature to find and tackle easy issues in a project.
-
Porting a Go-Based Face Detection Library to Wasm: Q&A with Endre Simo
Endre Simo, senior software developer and open-source contributor to a few popular image-processing projects, ported the Pigo face-detection library from Go to browsers with WebAssembly. The port illustrates WebAssembly's current performance potential for running heavyweight desktop applications in a browser context.
-
Algorithmia Adds GitHub Integration to Machine Learning Platform
Algorithmia, an AI model management automation platform for data scientists and machine learning (ML) engineers, now integrates with GitHub.