Meta AI’s New Data Set to Accelerate Renewable Energy Catalyst Discovery for Hydrogen Fuel
Meta AI recently announced that it will soon release an entirely new data set for green hydrogen fuel ML modeling and simulation, focused on oxide catalysts for the oxygen evolution reaction (OER), a critical chemical reaction used in green hydrogen fuel production via wind and solar energy.
-
Microsoft Introduces Open Data for Social Impact Framework
Microsoft recently introduced the Open Data for Social Impact Framework, a guide to help organizations put data to work to get new insights, make better decisions, and improve efficiency while tackling pressing social issues. The framework includes a five-step roadmap that organizations can use to get started.
-
Orchestrate Operations, Validations, and Approvals on Data Entities with Azure Purview Workflows
Microsoft recently announced the preview of Azure Purview Workflows, which let customers orchestrate the create, update, and delete operations, as well as the validation and approval, of data entities using repeatable business processes.
-
Get Consistent Access to Third-Party APIs with AWS Data Exchange for APIs
During the recent AWS re:Invent in Las Vegas, the company announced the AWS Data Exchange for APIs. This new capability enables customers to find, subscribe to, and use third-party API products from providers on AWS Data Exchange.
-
Apache Spark Brings Pandas API with Version 3.2
The Apache Spark team has integrated the Pandas API into the product's 3.2 release. With this change, dataframe processing can be scaled across multiple nodes in a cluster or multiple processors in a single machine using the PySpark execution engine.
-
Cloudera Announces the General Availability of Cloudera DataFlow for the Public Cloud
The enterprise data cloud company Cloudera recently announced the general availability (GA) of Cloudera DataFlow for the Public Cloud, a cloud-native service for data flows to process hybrid streaming workloads on the Cloudera Data Platform (CDP).
-
Microsoft Renames Its Azure for FHIR API to Azure Healthcare APIs
Microsoft recently announced that it is renaming its Cloud for Healthcare's Azure API for Fast Healthcare Interoperability Resources (FHIR) to "Azure Healthcare APIs." In addition to the renaming, the company has expanded its support for healthcare data to include patient health data via FHIR, medical imaging data via DICOM, and medical device data via the Azure IoT Connector for FHIR.
-
Perceiver: One Neural-Network Model for Multiple Input Data Types
Google’s DeepMind recently released a state-of-the-art deep-learning model called Perceiver that receives and processes multiple kinds of input data, ranging from audio to images, similar to how the human brain perceives multimodal data. Perceiver can receive and classify multiple input data types, namely point clouds, audio, and images.
-
The Journey from Monolith to Microservices at GitHub: QCon Plus Q&A
GitHub needed to fundamentally rethink how they did software development due to all of the different cultures, norms, and technology stacks that their teams brought to the table. They are migrating toward a microservices architecture that enables different teams and systems and technologies to work harmoniously together.
-
Google Announces a New, More Services-Based Architecture Called Runner V2 to Dataflow
Google Cloud Dataflow is a fully managed service for executing Apache Beam pipelines within the Google Cloud Platform (GCP). In a recent blog post, Google announced a new, more services-based architecture for Dataflow called Runner v2, which will include multi-language support for all of its language SDKs.
-
The Distributed Data Mesh as a Solution to Centralized Data Monoliths
Instead of building large, centralized data platforms, corporations and data architects should create distributed data meshes.
-
Data Science at the Intersection of Emerging Technologies
Kirk Borne, principal data scientist at Booz Allen Hamilton, gave a keynote presentation at this year’s Oracle Code One Conference on how the connection between emerging technologies, data, and machine learning is transforming data into value. Emerging technological innovations such as AI, robotics, and computer vision are enabled by data and create value from data.
-
The Future of Data Engineering: Chris Riccomini at QCon San Francisco
At QCon San Francisco 2019, Chris Riccomini presented “The Future of Data Engineering”. The key takeaway of his talk is that the end goal of data engineering is a fully automated, decentralized data warehouse.
-
Lessons Learned from Innovating at Google: Frame the Problem, Use Data, and Define the MVP
The truly great, innovative, useful ideas come mostly from two sources: your target users, and people working in the organization - not necessarily those with a "product manager" hat. Experimentation can help us to materialize ideas into actual products and technology. Framing the problem, using data, and defining the MVP can help us to increase the chance of success in innovation.
-
Microsoft Announces Public Preview of Azure Data Share
Microsoft has announced the public preview of Azure Data Share, which provides capabilities to share data with users within an organization, as well as with other organizations. Microsoft positions the service primarily as a big data tool, though it can also be used to share individual files.