Data Analytics Content on InfoQ
-
Overcoming Data Scarcity and Privacy Challenges with Synthetic Data
In this article, the author discusses the importance of using synthetic data in data analytics projects, especially in financial institutions, to address data scarcity and, more importantly, data privacy.
-
COVID-19 and Mining Social Media - Enabling Machine Learning Workloads with Big Data
In this article, author Adi Polak discusses how to enable machine learning workloads with big data to query and analyze COVID-19 tweets and understand social sentiment towards COVID-19.
-
Scalable Cloud Environment for Distributed Data Pipelines with Apache Airflow
In this article, author Lena Hall discusses how to use Apache Airflow to define and execute distributed data pipelines, with an example of the workflow framework running on Kubernetes on the Azure cloud platform.
-
Easy Interpretation of a Logistic Regression Model with Delta-p Statistics
Delta-p statistics is an easier means of communicating results to a non-technical audience than the plain coefficients of a logistic regression model. In this article, authors Maarit Widmann and Alfredo Roccato discuss how to predict credit eligibility using a solution based on Delta-p statistics.
-
Data Leadership Book Review and Interview
The book Data Leadership, by Anthony Algmin, covers how data leaders should manage and govern the data management programs in their organizations. Data leadership is how organizations choose to apply their energy and resources toward creating data capabilities to influence their business.
-
Innovation Startups Modeling Agile Culture
Innovation is not only about the most advanced technology; management and processes are the new frontier of startup innovation. Combining the power of data with the importance of people to deliver business intelligence is key today. The result is not the only thing that matters; how you achieve it matters more. To be agile is to adapt to today's market.
-
Is Edge Computing a Thing?
Edge computing is definitely a thing, but the computing need not occur at the edge. Instead, what is needed is the ability to compute (anywhere) on streaming data from large numbers of dynamically changing devices in the edge environment. This in turn demands an architectural pattern for stateful, distributed computing.
-
How to Use Redis TimeSeries with Grafana for Real-Time Analytics
In this article, author Roshan Kumar discusses how a purpose-built database like RedisTimeSeries can be used to manage time-series data. He also shows how to visualize this data in a Grafana dashboard.
-
Azure Data Lake Analytics and U-SQL
In this article, the author shows how to use U-SQL, a big data query and processing language, on the Azure Data Lake Analytics platform. U-SQL combines the concepts and constructs of both SQL and C#: the simplicity and declarative nature of SQL with the programmatic power of C#, including rich types and expressions.
-
Data Analytics in the World of Agility
Is it all about customer-centric business, or is there any data left? Can we integrate data analytics and customer empathy? This article explores how we can move towards a more customer-centric business and what information we require in order to understand the most valuable thing we have: our customer.
-
Real-Time Data Processing Using Redis Streams and Apache Spark Structured Streaming
Structured Streaming, introduced with Apache Spark 2.0, delivers a SQL-like interface for streaming data. Redis Streams enables Redis to consume, hold and distribute streaming data between multiple producers and consumers. In this article, author Roshan Kumar walks us through how to process streaming data in real time using Redis and Apache Spark Streaming technologies.
-
The Data Science Mindset: Six Principles to Build Healthy Data-Driven Organizations
In this article, business and technical leaders will learn methods to assess whether their organization is data-driven and benchmark its data science maturity. They will learn how to use the Healthy Data Science Organization Framework to nurture a data science mindset within the organization.