-
Unsupervised Object Detection and Semantic Segmentation Using Deep Learning
Meta AI released CutLER, a state-of-the-art zero-shot unsupervised object detector that improves detection performance by over 2.7 times on 11 benchmark datasets spanning domains such as video frames, paintings, and sketches. The model's simplicity makes it compatible with different object-detection architectures.
-
AWS Announces DataZone, a New Data Management Service to Govern Data
At AWS re:Invent, Amazon Web Services announced Amazon DataZone, a new data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources.
-
Amazon SageMaker Clarify Now Supports Online Explainability for ML Predictions
Amazon is announcing that Amazon SageMaker Clarify now supports online explainability, providing explanations for the individual predictions of machine learning models in near real time on live endpoints.
-
AWS DataSync Discovery Preview Edition Supports Automated Data Collection and Storage Recommendation
Amazon is announcing the public preview of AWS DataSync Discovery. This new feature of AWS DataSync enables users to better understand on-premises storage usage through automated data collection and analysis, quickly identify data to migrate, and evaluate recommended AWS Storage services for data.
-
Amazon Announces the Improvement of ML Models to Better Identify Sensitive Data on Amazon Macie
Amazon is announcing a new capability to create allow lists in Amazon Macie. Allow lists can now specify text or text patterns that you do not want Macie to report as sensitive data. Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect sensitive data in AWS.
-
Amazon Launches What-If Analyses for Machine Learning Forecasting Service Amazon Forecast
Amazon is announcing that Amazon Forecast, its time-series forecasting service based on machine learning, can now run what-if analyses to determine how different business scenarios affect demand estimates. What-if analysis is a business technique for simulating hypothetical scenarios and stress-testing planning assumptions by recording potential outcomes.
-
Amazon Comprehend Announces the Reduction of the Minimum Requirements for Entity Recognition
Amazon is announcing that it has lowered the minimum requirements for training a recognizer with plain-text CSV annotation files, as a result of recent advances in the models powering Amazon Comprehend. You now need only three documents and 25 annotations per entity type to create a custom entity recognition model.
-
AWS Announced Synthetic Data Generation for SageMaker Ground Truth
AWS announced that users can now create labeled synthetic data with Amazon SageMaker Ground Truth. SageMaker Ground Truth is a data labeling service that simplifies labeling data and gives you the option of using human annotators through third-party suppliers, Amazon Mechanical Turk, or your own private workforce.
-
Google's BigQuery Introduces Column-Level Encryption Functions and Dynamic Masking of Information
Google recently released new features for its SaaS data warehouse BigQuery, including column-level encryption functions and dynamic masking of information. Dynamic masking can be used for real-time transactions, whereas column-level encryption provides additional security for data at rest or in motion where real-time usability is not required.
-
Microsoft's New Simulation Framework FLUTE Accelerates Federated Learning Algorithm Development
Microsoft Research has recently released Federated Learning Utilities and Tools for Experimentation (FLUTE), a new simulation framework to accelerate federated learning ML algorithm development. The main goal of federated learning is to train complex machine-learning models over massive amounts of data without the need to share that data in a centralized location.
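To make the idea concrete, here is a minimal sketch of federated averaging in plain Python. It is not FLUTE's API. The function names (`local_update`, `federated_average`), the one-parameter linear model, and the toy client data are all assumptions chosen for illustration. Each simulated client trains on its own private data, and only the resulting model weight is sent back to be averaged.

```python
from typing import List, Tuple

# Hypothetical toy setup: each client privately holds (x, y) pairs for y = w * x.
Dataset = List[Tuple[float, float]]


def local_update(weight: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one client's private data."""
    w = weight
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(global_w: float, clients: List[Dataset], rounds: int = 50) -> float:
    """Average client weights each round, weighted by local dataset size."""
    for _ in range(rounds):
        # Only the updated weight and the sample count leave each client.
        updates = [(local_update(global_w, data), len(data)) for data in clients]
        total = sum(n for _, n in updates)
        global_w = sum(w * n for w, n in updates) / total
    return global_w


# Three simulated clients whose data all follow y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]
w = federated_average(0.0, clients)
print(round(w, 3))
```

The server in this sketch never sees the raw `(x, y)` pairs, which is the property the summary above describes. A real simulation framework would add concerns this toy omits, such as client sampling, communication, and privacy mechanisms.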
-
Meta AI’s New Data Set to Accelerate Renewable Energy Catalyst Discovery for Hydrogen Fuel
Meta AI recently announced that it will soon release an entirely new data set for green hydrogen fuel ML modeling and simulation, focused on oxide catalysts for the oxygen evolution reaction (OER), a critical chemical reaction used in green hydrogen fuel production via wind and solar energy.
-
Microsoft Introduces Open Data for Social Impact Framework
Microsoft recently introduced the Open Data for Social Impact Framework, a guide to help organizations put data to work to get new insights, make better decisions, and improve efficiency while tackling pressing social issues. The framework includes a five-step roadmap that organizations can use to get started.
-
Orchestrate Operations, Validations, and Approvals on Data Entities with Azure Purview Workflows
Microsoft recently announced the preview of Azure Purview Workflows, which let customers orchestrate the create, update, and delete operations, validation, and approval of data entities using repeatable business processes.
-
Get Consistent Access to Third-Party APIs with AWS Data Exchange for APIs
During the recent AWS re:Invent in Las Vegas, the company announced AWS Data Exchange for APIs. This new capability enables customers to find, subscribe to, and use third-party API products from providers on AWS Data Exchange.
-
Apache Spark Brings Pandas API with Version 3.2
The Apache Spark team has integrated the Pandas API into the product's latest release, version 3.2. With this change, dataframe processing can be scaled across multiple nodes in a cluster or multiple processors on a single machine using the PySpark execution engine.