AI, ML & Data Engineering Content on InfoQ
-
Facebook Research Develops AI System for Music Source Separation
Facebook Research recently released Demucs, a new deep-learning-powered system for music source separation. Demucs outperforms previously reported results based on human evaluations of overall quality of sound after separation.
-
Deep Learning Accelerates Scientific Simulations up to Two Billion Times
Researchers from several physics and geology laboratories have developed Deep Emulator Network SEarch (DENSE), a technique for using deep learning to perform scientific simulations in fields ranging from high-energy physics to climate science. Compared to previous simulators, DENSE achieved speedups ranging from 10 million to 2 billion times.
-
Google Expands Open Images Dataset and Adds New Localized Narratives Annotation
Google AI has just released a new version (V6) of their photo dataset Open Images, which now includes an entirely new type of annotation called localized narratives. These multimodal descriptions of images incorporate synchronized voice, text, and mouse trace annotations that provide more in-depth training data for what is already one of the largest open-source annotated image datasets.
-
Microsoft Ships Preview of Cluster-Friendly Cloud Disks
Storage is one of the more mature services in the public cloud, but rarely supports traditional clustered systems. To attract those on-premises workloads, the Microsoft Azure team released a preview of Azure Shared Disks, a block storage option for attaching managed disks to multiple virtual machines.
-
Splice Machine Data Platform 3.0 Supports Kubernetes Managed Service and New ML Manager
The latest version of the distributed SQL data platform Splice Machine supports a new Kubernetes managed service, a new version of Machine Learning Manager (v2.0), and automatic in-database model deployment.
-
Oracle Cloud Now Offers Data Science and Machine Learning Services
Oracle recently announced the availability of its Cloud Data Science Platform, a native service on Oracle Cloud Infrastructure (OCI). The software is designed to let teams of data scientists collaborate on the development, deployment and maintenance of machine learning models.
-
Spotify Open-Sources Terraform Module for Kubeflow ML Pipelines
Spotify has open-sourced their Terraform module for running machine-learning pipeline software Kubeflow on Google Kubernetes Engine (GKE). By switching their in-house ML platform to Kubeflow, Spotify engineers have achieved faster time to production and are producing 7x more experiments than on the previous platform.
-
MIT CSAIL TextFooler Framework Tricks Leading NLP Systems
A team of researchers at the MIT Computer Science & Artificial Intelligence Lab (CSAIL) recently released a framework called TextFooler which successfully tricked state-of-the-art NLP models (such as BERT) into making incorrect predictions.
-
PyTorch 1.4 Release Introduces Java Bindings, Distributed Training
PyTorch, Facebook's open-source deep-learning framework, announced the release of version 1.4. This release, which will be the last version to support Python 2, includes improvements to distributed training and mobile inference and introduces support for Java.
-
Boosting Apache Spark with GPUs and the RAPIDS Library
At the 2019 Spark AI Summit Europe conference, NVIDIA software engineers Thomas Graves and Miguel Martinez hosted a session on Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS Library. InfoQ recently talked with Jim Scott, head of developer relations at NVIDIA, to learn more about accelerating Apache Spark with GPUs and the RAPIDS library.
-
GitHub Releases ML-Based "Good First Issues" Recommendations
GitHub shipped an updated version of its good first issues feature, which combines a machine learning (ML) model that identifies easy issues with a hand-curated list of issues that project maintainers have labeled "easy". New and seasoned open source contributors can use this feature to find and tackle easy issues in a project.
-
Porting a Go-Based Face Detection Library to Wasm: Q&A with Endre Simo
Endre Simo, senior software developer and open-source contributor to a few popular image-processing projects, ported the Pigo face-detection library from Go to browsers with WebAssembly. The port illustrates the performance potential of WebAssembly today to run heavy-weight desktop applications in a browser context.
-
Algorithmia Adds GitHub Integration to Machine Learning Platform
Algorithmia, an AI model management automation platform for data scientists and machine learning (ML) engineers, now integrates with GitHub.
-
Microsoft Open-Sources Project Petridish for Deep-Learning Optimization
A team from Microsoft Research and Carnegie Mellon University has open-sourced Project Petridish, a neural architecture search algorithm that automatically builds deep-learning models that are optimized to satisfy a variety of constraints. Using Petridish, the team achieved state-of-the-art results on the CIFAR-10 benchmark with only 2.2M parameters and five GPU-days of search time.
-
Jenkins Creator Launches ML Startup in Continuous Risk-Based Testing
Jenkins creator Kohsuke Kawaguchi has started Launchable, a startup using machine learning to identify risk-based tests. Testing thought leader Wayne Ariola also writes about the need for a continuous testing approach, where targeted risk-based tests help provide confidence for continuous delivery.