AI, ML & Data Engineering Content on InfoQ

Articles

AI, ML & Data Engineering

Understanding and Applying Correspondence Analysis

Customer segments, personality profiles, social classes, and age generations are examples of effective references to larger groups of people sharing similar characteristics. Correspondence analysis (CA) is a multivariate analysis technique that projects categorical data into a numeric feature space that captures most of the variability in the data in fewer dimensions.

Maarit Widmann Alfredo Roccato
on Feb 23, 2023
Culture & Methods

How I Contributed as a Tester to a Machine Learning System: Opportunities, Challenges and Learnings

Have you ever wondered about systems based on machine learning? In those systems, testing often takes a backseat, and when it is done, it is mostly done by the developers themselves. A tester’s role is not clearly defined. Testers usually struggle to understand ML-based systems and to explore what contributions they can make. This article describes one tester’s journey of assuring the quality of ML-based systems.

Shivani Gaba
on Feb 16, 2023
AI, ML & Data Engineering

Understanding and Debugging Deep Learning Models: Exploring AI Interpretability Methods

ML interpretability refers to a user's ability to explain decisions made by an ML system. Interpretability increases confidence in the model, reduces bias, and ensures that the model is compliant and ethical. In this article, author Andrew Hoblitzell discusses several methods of ML interpretability and dives deep into Local Interpretable Model-Agnostic Explanations (LIME) and Shapley Values.

Andrew Hoblitzell
on Feb 10, 2023
AI, ML & Data Engineering

Design Pattern Proposal for Autoscaling Stateful Systems

In this article, Rogerio Robetti discusses the challenges in auto-scaling stateful storage systems and proposes an opinionated design solution to automatically scale up (vertical) and scale out (horizontal) from a single node to several nodes in a cluster with minimal configuration and operator intervention.

Rogerio Robetti
on Jan 25, 2023
Development

InfoQ Software Trends Report: Major Trends in 2022 and What to Watch for in 2023

2022 was another year of significant technological innovations and trends in the software industry and communities. The InfoQ podcast co-hosts met last month to discuss the major trends from 2022, and what to watch for in 2023. This article is a summary of the 2022 software trends podcast.

Daniel Bryant Wesley Reisz Thomas Betts Shane Hastie Srini Penchikala
on Jan 13, 2023
AI, ML & Data Engineering

DynamoDB Data Transformation Safety: from Manual Toil to Automated and Open Source

Data transformation remains a continuous challenge in engineering and is often built upon manual toil. The open source utility Dynamo Data Transform was built to simplify data transformation for DynamoDB-based systems and to build safety and guardrails into it. It started as a robust manual framework that was then automated and open sourced. This article discusses the challenges of data transformation.

Guy Braunstain
on Nov 23, 2022
AI, ML & Data Engineering

Create Your Distributed Database on Kubernetes with Existing Monolithic Databases

The next challenge for databases is to run them on Kubernetes to become cloud neutral. However, they are more difficult to manage than the application layer, since Kubernetes is designed for stateless applications. Apache ShardingSphere is an ecosystem that transforms any database into a distributed database system and enhances it with sharding, elastic scaling, encryption features, and more.

Trista Pan
on Nov 16, 2022
AI, ML & Data Engineering

Apache DolphinScheduler in MLOps: Create Machine Learning Workflows Quickly

In this article, the author discusses the data pipeline and workflow scheduler Apache DolphinScheduler and how it performs ML tasks using Jupyter and MLflow components.

Zhou Jieguang
on Oct 14, 2022
AI, ML & Data Engineering

Migrating Netflix's Viewing History from Synchronous Request-Response to Async Events

In a web-based service, a slowdown in request processing can eventually make your service unavailable. Chances are, not all requests need to be processed right away. Some of them just need an acknowledgement of receipt. Have you ever asked yourself: “Would I benefit from asynchronous processing of requests? If so, how would I make such a change in a live, large-scale, mission-critical system?”

Sharma Podila
on Sep 12, 2022
AI, ML & Data Engineering

How to Migrate an Oracle Database to MySQL Using AWS Database Migration Service

Data migration efforts are typically taken up for database consolidation, cost considerations, or migrating on-prem databases to a cloud platform. In this article, author Deepak Vohra discusses the details of migrating a local database to a MySQL database in the cloud, using AWS Database Migration Service.

Deepak Vohra
on Sep 07, 2022
AI, ML & Data Engineering

AutoML: the Promise vs. Reality According to Practitioners

Automating machine learning projects is a noble goal, but true end-to-end automation is not available yet. As a collection of tools, AutoML capabilities have proven value but need to be vetted more thoroughly. Findings from a qualitative study of AutoML users suggest the future of automation for ML and AI rests on our ability to realize the potential of AutoMLOps.

Doris Xin
on Aug 29, 2022
Architecture & Design

Business Systems Integration is about to Get a Whole Lot Easier

A new breed of integration software is arising that syncs business data into a simplified data hub and then syncs that data to the destination system. The benefit of this integration pattern is that it reduces the number of manual transformations required (often to zero) and makes it easier to write manual transformations when you have to.

Doug Hudgeon
on Aug 24, 2022
