AI, ML & Data Engineering Content on InfoQ
-
Accelerating Deep Learning on the JVM with Apache Spark and NVIDIA GPUs
In this article, the authors discuss how to combine Deep Java Library (DJL), Apache Spark 3, and NVIDIA GPU computing to simplify deep learning pipelines while improving performance and reducing costs. They also compare the performance of this solution on GPU versus CPU hardware, using Amazon EMR and the NVIDIA RAPIDS Accelerator.
-
Deep Diving into EF Core: Q&A with Jeremy Likness
Entity Framework (EF) Core is a cross-platform, extensible, open-source object-database mapper for .NET. Since its first release in 2016, EF Core has evolved into its current form: a powerful and lightweight .NET ORM. InfoQ interviewed Jeremy Likness, program manager for .NET Data at Microsoft, to learn more about EF Core and what to expect from its next release later this year.
-
Why a Serverless Data API Might Be Your Next Database
In this article, author Pieter Humphrey discusses database as a service (DBaaS) and serverless data APIs for cloud-based data management.
-
Indestructible Storage in the Cloud with Apache BookKeeper
At Salesforce, we required a storage system that could work with two kinds of streams: one for write-ahead logs and one for data. The two streams, however, have competing requirements. As pioneers in cloud computing, we also needed our storage system to be cloud-aware, because availability and durability requirements keep increasing.
-
The Perfect Pair: Digital Twins and Predictive Maintenance
Businesses are moving towards predictive maintenance models built on digital twins that mirror their real-life counterparts. In this article, the author looks at digital twins and provides an example of how to build one.
-
How Optimizing MLOps Can Revolutionize Enterprise AI
In this article, author Monte Zweben discusses data science architecture, containerization, and how new solutions like Feature Store can help with the full lifecycle of machine learning processes.
-
Agile Development Applied to Machine Learning Projects
Machine learning is a powerful new tool, but how does it fit into your agile development? Applying agile to ML development poses a few challenges that teams new to the space need to be prepared for, from new roles such as data scientists to concerns about reproducibility and dependency management.
-
Saga Orchestration for Microservices Using the Outbox Pattern
The outbox pattern, implemented via change data capture, is a proven approach for addressing the concern of data exchange between microservices. The saga pattern, as demonstrated in this article, is useful for data updates that span multiple microservices.
-
How to Build Interactive Data Visualizations for Python with Bokeh
In this article, the author shows how to use Bokeh, a powerful Python tool, to create data visualizations with custom charts.
-
The Future of Data Engineering
Chris Riccomini examines the current and future states of the art in data pipelines, data streaming, and data warehousing. He presents a six-stage evolution that data ecosystems follow, from a simple monolith to a complex data-microwarehouse architecture, as the data engineers who manage them solve problems and clarify their roles as infrastructure engineers rather than data stewards.
-
The Evolution of Precomputation Technology and its Role in Data Analytics
In this article, author Yang Li discusses the importance of precomputation techniques in databases, OLAP and data cubes, and some of the trends in using precomputation in big data analytics.
-
Performance Tuning Techniques of Hive Big Data Table
In this article, author Sudhish Koloth discusses how to tackle performance problems when using Hive Big Data tables.