AI, ML & Data Engineering Content on InfoQ
-
Scaling a Distributed Stream Processor in a Containerized Environment
This article presents our experience of scaling a distributed stream processor in Kubernetes. The stream processor should provide support for maintaining the optimal level of parallelism. However, adding more resources incurs additional cost and does not guarantee performance improvements. Instead, the stream processor should identify the resources it actually requires and scale accordingly.
-
Conquering the Challenges of Data Preparation for Predictive Maintenance
Predictive maintenance (PdM) applications aim to apply machine learning (ML) to IIoT datasets in order to reduce occupational hazards, machine downtime, and other costs. In this article, the author addresses some of the data preparation challenges faced by industrial practitioners of ML and the solutions for data ingest and feature engineering related to PdM.
-
Book Review: Optimizing Java
InfoQ reviewed the book Optimizing Java, a comprehensive, in-depth look at performance tuning in the Java programming language written by Java industry experts Ben Evans, James Gough and Chris Newland. InfoQ spoke to the authors for more insights on their experiences, learnings and obstacles in authoring this book.
-
Key Takeaway Points and Lessons Learned from QCon San Francisco 2018
This year, around 1,600 attendees descended on the Hyatt Regency in San Francisco for the twelfth annual QCon. Software engineers, architects, and project managers from a wide range of industries, including some prominent Bay Area companies, attended 99 technical sessions across 6 concurrent tracks, 13 ask-me-anything sessions with speakers, 18 in-depth workshops, and 8 facilitated open spaces.
-
InfoQ’s 2018, and What We Expect to See in 2019
We take a look back at what we saw on InfoQ in 2018, and think about what the next year might bring.
-
The 2018 InfoQ Editors’ Recommended Reading List: Part Two
As part of our core values of sharing knowledge, the InfoQ editors were keen to capture and share our book and article recommendations for 2018, so that others can benefit from them too. In this second part, we share the final batch of recommendations.
-
What Machine Learning Can Learn from DevOps
The fact that machine learning development focuses on hyperparameter tuning and data pipelines does not mean that we need to reinvent the wheel or look for a completely new way. According to Thiago de Faria, DevOps lays a strong foundation: culture change to support experimentation, continuous evaluation, sharing, abstraction layers, observability, and working in products and services.
-
Analytics Zoo: Unified Analytics + AI Platform for Distributed Tensorflow, and BigDL on Apache Spark
In this article, we describe how Analytics Zoo can help real-world users build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark, easy-to-use abstractions such as transfer learning and Spark ML pipeline support, and built-in deep learning models and reference use cases.
-
Back to the Future with Relational NoSQL
This article outlines some of the consistency issues NoSQL databases have with distributed transactions, showing how FaunaDB has solved the problems using the Calvin protocol and a virtual clock.
-
Sentiment Analysis: What's with the Tone?
Sentiment analysis is widely applied in voice of the customer (VOC) applications. In this article, the authors discuss NLP-based sentiment analysis with machine learning (ML) and lexicon-based approaches, using KNIME data analysis tools.
-
Spark Application Performance Monitoring Using Uber JVM Profiler, InfluxDB and Grafana
In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark based applications using technologies like Uber JVM Profiler, the InfluxDB database, and the Grafana data visualization tool.
-
Seth James Nielson on Blockchain Technology for Data Governance
Seth James Nielson recently hosted a tutorial workshop at the Data Architecture Summit 2018 conference about blockchain technology and its impact on data architecture and data governance.