Machine Learning Content on InfoQ
-
Amazon Comprehend Announces the Reduction of the Minimum Requirements for Entity Recognition
Amazon has lowered the minimum requirements for training a recognizer with plain-text CSV annotation files, a result of recent advances in the models powering Amazon Comprehend. You now need only three documents and 25 annotations per entity type to create a custom entity recognition model.
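The new minimums can be checked before submitting a training job. The following is a minimal sketch, not part of Amazon Comprehend: the function name `check_annotations` and the column names `File` and `Type` are assumptions about how the annotation CSV is laid out, and the document minimum is interpreted here as a total across the file.

```python
import csv
import io
from collections import defaultdict

MIN_DOCUMENTS = 3     # minimum number of documents, per the announcement
MIN_ANNOTATIONS = 25  # minimum annotations per entity type, per the announcement


def check_annotations(csv_text, type_column="Type", file_column="File"):
    """Summarize an annotation CSV against the stated minimums.

    The column names are assumptions; adjust them to match the real file.
    Returns a dict with the entity types that have too few annotations
    and a flag saying whether enough distinct documents are referenced.
    """
    counts = defaultdict(int)
    documents = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row[type_column]] += 1
        documents.add(row[file_column])

    too_few = {t: n for t, n in counts.items() if n < MIN_ANNOTATIONS}
    return {
        "types_below_minimum": too_few,
        "document_count": len(documents),
        "enough_documents": len(documents) >= MIN_DOCUMENTS,
    }
```

Calling `check_annotations(open("annotations.csv").read())` on a hypothetical file returns the entity types that still need more annotations, which is usually quicker than waiting for a training job to reject the input.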
-
Ant Group Open Sources Privacy-Preserving Computation Framework
Alibaba financial arm Ant Group has open sourced SecretFlow, its privacy-preserving framework, with a specific focus on data analysis and machine learning.
-
Meta Hopes to Increase Accuracy of Wikipedia with New AI Model
Meta AI's research team developed a neural-network-based system called SIDE that can scan hundreds of thousands of Wikipedia citations at once and check whether they truly support the corresponding claims. Wikipedia is a multilingual free online encyclopedia written and maintained by volunteers through open collaboration and a wiki-based editing system.
-
AWS Announced Synthetic Data Generation for SageMaker Ground Truth
AWS announced that users can now create labeled synthetic data with Amazon SageMaker Ground Truth. SageMaker Ground Truth is a data labeling service that simplifies labeling and gives you the option of using human annotators through third-party suppliers, Amazon Mechanical Turk, or your own private workforce.
-
Shopify’s Practical Guidelines from Running Airflow for ML and Data Workflows at Scale
In a company blog post, Shopify Engineering shared its experience scaling and optimizing Apache Airflow for running ML and data workflows. The team described practical solutions to the challenges it faced, including slow file access, insufficient control over DAGs, irregular traffic levels, and resource contention among workloads.
-
MLGO Framework Brings Machine Learning in Compiler Optimizations
Google’s new Machine Learning Guided Optimization (MLGO) is an industrial-grade general framework for systematically integrating machine-learning (ML) techniques into a compiler, in particular LLVM. Compiling faster and smaller code can significantly reduce the operational cost of large data-center applications.
-
LinkedIn Open-Sourced Its Feature Store to Evangelize Productive Machine Learning
LinkedIn Engineering recently open-sourced its feature store, Feathr, which helps engineers develop machine learning products by simplifying feature management and usage in production. It defines features, computes them for training and inference, and makes them discoverable by other machine learning developers.
-
Amazon Unveils ML-Powered Coding Assistant CodeWhisperer
Amazon launched CodeWhisperer, an ML-powered coding companion that provides code recommendations based on developers' natural-language comments and their code in the integrated development environment. The service is intended to increase developer productivity.
-
AWS and Microsoft Working Together on PyWhy, the New Home of Causal ML Library DoWhy
AWS and Microsoft have jointly established PyWhy, a new GitHub organization, to integrate AWS algorithms into DoWhy, a causal ML library from Microsoft that has moved to PyWhy.
-
Amazon Released Incremental Training Feature in SageMaker JumpStart
AWS recently released a new feature in SageMaker JumpStart (part of SageMaker, the AWS machine learning service) for incrementally retraining machine-learning (ML) models on expanded datasets. With this feature, developers can fine-tune their models for better performance in production with a couple of clicks. It is one of a series of efforts to add more automation to SageMaker JumpStart.
-
GitHub Copilot Adopts Paid Model, Still Free for Some Open-Source Maintainers and Students
After almost one year in technical preview, GitHub Copilot is now prime time-ready for students and individual developers, says GitHub, while companies and larger organizations could get access to it before the end of the year.
-
Microsoft's New Simulation Framework FLUTE Accelerates Federated Learning Algorithm Development
Microsoft Research has recently released Federated Learning Utilities and Tools for Experimentation (FLUTE), a new simulation framework to accelerate federated learning ML algorithm development. The main goal of federated learning is to train complex machine-learning models over massive amounts of data without the need to share that data in a centralized location.
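The idea of training without centralizing data can be illustrated with federated averaging (FedAvg), a widely used aggregation rule. The sketch below is illustrative only and is not FLUTE's API: the function name `federated_average` and the flat-list representation of model weights are assumptions made for brevity. Each client trains locally and sends back only its weights and example count, and the server combines them.

```python
def federated_average(client_updates):
    """Combine client model weights into a single global model.

    client_updates is a list of (weights, num_examples) pairs, where
    weights is a list of floats of equal length for every client.
    Clients with more local examples get proportionally more influence.
    Raw training data never leaves the client; only weights are shared.
    """
    total_examples = sum(n for _, n in client_updates)
    if total_examples == 0:
        raise ValueError("no training examples reported by any client")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_examples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must report the same number of weights")
        share = num_examples / total_examples
        for i, w in enumerate(weights):
            averaged[i] += w * share
    return averaged


# Two hypothetical clients: the second holds three times as much data.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

A simulation framework typically repeats this step over many rounds, sampling a subset of clients each time, which is where most of the experimentation cost lies.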
-
Amazon Rekognition Introduces Streaming Video Events
AWS recently announced the general availability of Streaming Video Events, a new feature of Amazon Rekognition to provide real-time alerts on live video streams.
-
New GraphWorld Tool Accelerates Graph Neural-Network Benchmarking
Google AI has recently released GraphWorld, a tool to accelerate performance benchmarking in the area of graph neural networks (GNNs). GraphWorld is a configurable framework to generate graphs with a variety of structural properties like different node degree distributions and Gini index.
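To make the Gini index concrete, here is a minimal sketch that computes it over a graph's node degrees, assuming the index is used here as a measure of how unevenly edges are spread across nodes. The function name `degree_gini` and the edge-list input format are hypothetical and are not taken from GraphWorld.

```python
def degree_gini(edges, num_nodes):
    """Gini index of the node degree distribution of an undirected graph.

    edges is a list of (u, v) pairs with node ids in range(num_nodes).
    Returns 0.0 when every node has the same degree; values closer to 1
    mean a few nodes hold most of the edges.
    """
    degrees = [0] * num_nodes
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1

    total = sum(degrees)
    if total == 0:
        return 0.0  # no edges: treat the distribution as perfectly equal

    degrees.sort()
    n = num_nodes
    # Standard closed form for sorted values x_1 <= ... <= x_n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum((i + 1) * d for i, d in enumerate(degrees))
    return 2 * weighted / (n * total) - (n + 1) / n


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # every node has degree 2
star = [(0, 1), (0, 2), (0, 3)]           # one hub, three leaves
```

Here `degree_gini(cycle, 4)` is 0 because all degrees are equal, while the star graph scores higher because its hub concentrates the edges.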
-
TensorFlow DTensor: Unified API for Distributed Deep Network Training
The recently released TensorFlow v2.9 introduces DTensor, a new API for model-, data-, and space-parallel (aka spatially tiled) deep network training. DTensor aims to decouple sharding directives from the model code by providing higher-level utilities to partition the model and batch parameters between devices.