AI, ML & Data Engineering Content on InfoQ
-
Spotify Open-Sources Terraform Module for Kubeflow ML Pipelines
Spotify has open-sourced its Terraform module for running Kubeflow, the machine-learning pipeline software, on Google Kubernetes Engine (GKE). By switching their in-house ML platform to Kubeflow, Spotify engineers have achieved faster time to production and are producing 7x more experiments than on the previous platform.
-
MIT CSAIL TextFooler Framework Tricks Leading NLP Systems
A team of researchers at the MIT Computer Science & Artificial Intelligence Lab (CSAIL) recently released a framework called TextFooler, which successfully tricked state-of-the-art NLP models (such as BERT) into making incorrect predictions.
-
PyTorch 1.4 Release Introduces Java Bindings, Distributed Training
PyTorch, Facebook's open-source deep-learning framework, announced the release of version 1.4. This release, which will be the last version to support Python 2, includes improvements to distributed training and mobile inference and introduces support for Java.
-
Boosting Apache Spark with GPUs and the RAPIDS Library
At the 2019 Spark AI Summit Europe conference, NVIDIA software engineers Thomas Graves and Miguel Martinez hosted a session titled "Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS Library." InfoQ recently talked with Jim Scott, head of developer relations at NVIDIA, to learn more about accelerating Apache Spark with GPUs and the RAPIDS library.
-
GitHub Releases ML-Based "Good First Issues" Recommendations
GitHub shipped an updated version of its "good first issues" feature, which combines a machine learning (ML) model that identifies easy issues with a hand-curated list of issues that project maintainers have labeled "easy." New and seasoned open-source contributors can use this feature to find and tackle easy issues in a project.
-
Porting a Go-Based Face Detection Library to Wasm: Q&A with Endre Simo
Endre Simo, senior software developer and open-source contributor to a few popular image-processing projects, ported the Pigo face-detection library from Go to browsers with WebAssembly. The port illustrates the performance potential of WebAssembly today to run heavy-weight desktop applications in a browser context.
-
Algorithmia Adds GitHub Integration to Machine Learning Platform
Algorithmia, an AI model management automation platform for data scientists and machine learning (ML) engineers, now integrates with GitHub.
-
Microsoft Open-Sources Project Petridish for Deep-Learning Optimization
A team from Microsoft Research and Carnegie Mellon University has open-sourced Project Petridish, a neural architecture search algorithm that automatically builds deep-learning models that are optimized to satisfy a variety of constraints. Using Petridish, the team achieved state-of-the-art results on the CIFAR-10 benchmark with only 2.2M parameters and five GPU-days of search time.
-
Jenkins Creator Launches ML Startup in Continuous Risk-Based Testing
Jenkins creator Kohsuke Kawaguchi has started Launchable, a startup that uses machine learning to identify risk-based tests. Testing thought leader Wayne Ariola also writes about the need for a continuous testing approach, where targeted risk-based tests help provide confidence for continuous delivery.
-
Compliance and the California Privacy Act - the Empire Strikes Back
On January 1, 2020, the California Consumer Privacy Act (CCPA) came into effect. Many companies have not complied with the law, and the long-term effects of the legislation are unclear.
-
Google Open-Sources Reformer Efficient Deep-Learning Model
Researchers from Google AI recently open-sourced the Reformer, a more efficient version of the Transformer deep-learning model. Using a hashing trick for attention calculation and reversible residual layers, the Reformer can handle text sequences up to 1 million words while consuming only 16GB of memory on a single GPU accelerator.
-
The Distributed Data Mesh as a Solution to Centralized Data Monoliths
Instead of building large, centralized data platforms, corporations and data architects should create distributed data meshes.
-
Microsoft Open-Sources ONNX Acceleration for BERT AI Model
Microsoft's Azure Machine Learning team recently open-sourced their contribution to the ONNX Runtime library for improving the performance of the natural language processing (NLP) model BERT. With the optimizations, the model's inference latency on the SQuAD benchmark improved 17x.
-
Apple Acquires Edge-Focused AI Startup Xnor.ai
Apple has acquired Xnor.ai, a Seattle-based startup that builds AI models that run on edge devices, for approximately $200 million.