AI, ML & Data Engineering Content on InfoQ
-
Key Takeaway Points and Lessons Learned from QCon New York 2018
This year, at the seventh annual QCon New York, we had a total of 143 speakers across 117 sessions, workshops, AMAs, Open Spaces and mini-workshops. Topics included containers and orchestration, machine learning, ethics, modern user interfaces, microservices, blockchain, empowered teams, modern Java, DevEX, Serverless, chaos and resilience, Go, Rust, Elixir, and security.
-
Understanding Software System Behaviour with ML and Time Series Data
David Andrzejewski presented "Understanding Software System Behaviour with ML and Time Series Data". This article summarizes his presentation and gives an overview of what to look out for: know the traditional approaches to time series, know how to handle missing values, and be aware of possible seasonality in your data. Be careful about the threshold you set for anomaly detection.
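As a rough illustration of two of those points (missing values and threshold choice), here is a minimal sketch in plain Python. It is not taken from the presentation: the forward-fill strategy, the z-score rule, the function names, and the sample latency values are all assumptions made for this example, and seasonality is not handled.

```python
import statistics

def fill_missing(values):
    """Forward-fill None entries with the last observed value (assumed strategy)."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last  # stays None only if the series starts with a gap
        filled.append(v)
        last = v
    return filled

def flag_anomalies(values, threshold=3.0):
    """Return indices whose absolute z-score exceeds the threshold."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    stdev = statistics.pstdev(observed)
    if stdev == 0:
        return []  # a constant series has no outliers under this rule
    return [
        i for i, v in enumerate(values)
        if v is not None and abs(v - mean) / stdev > threshold
    ]

# Hypothetical latency samples in milliseconds, with two gaps.
latency_ms = [100, 102, None, 101, 99, 100, 250, 101, None, 100]
series = fill_missing(latency_ms)

strict = flag_anomalies(series, threshold=2.0)
lenient = flag_anomalies(series, threshold=3.0)
print(strict, lenient)
```

With these ten samples the spike at index 6 is flagged at a threshold of 2.0 but not at 3.0, because a single outlier inflates the standard deviation it is measured against. That is one reason the threshold deserves care.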
-
Data Citizens: Why We All Care about Data Ethics
Data citizens are impacted by the models, methods, and algorithms created by data scientists, but they have limited agency to affect the tools which are acting on them.
-
Can People Trust the Automated Decisions Made by Algorithms?
The use of automated decision making is increasing. These algorithms can produce results that are incomprehensible or socially undesirable. How can we determine the safety of algorithms in devices if we cannot understand them? Public fears about the inability to foresee adverse consequences have impeded technologies such as nuclear energy and genetically modified crops.
-
Democratizing Stream Processing with Apache Kafka and KSQL - Part 1
In this article, author Michael Noll discusses stream processing with KSQL, the streaming SQL engine for Apache Kafka. Topics covered include the challenges of stateful stream processing and how KSQL addresses them, and how KSQL helps to bridge the worlds of streams and databases through streams and tables.
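The relationship between streams and tables can be sketched in a few lines of plain Python. This is only a conceptual illustration, not KSQL syntax or the article's own code: the `materialize` helper and the sample events are hypothetical. It treats a stream as an ordered sequence of key/value events and a table as the latest value seen for each key.

```python
def materialize(stream):
    """Fold a stream of (key, value) events into a table of latest values per key."""
    table = {}
    for key, value in stream:
        table[key] = value  # a later event for the same key overwrites the earlier one
    return table

# Hypothetical stream of user-location updates, in arrival order.
events = [
    ("alice", "Paris"),
    ("bob", "Lima"),
    ("alice", "Rome"),
]

current_locations = materialize(events)
print(current_locations)
```

Replaying the same events always rebuilds the same table, while each change to the table corresponds to one event in the stream. That is the sense in which the two views describe the same data.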
-
Columnar Databases and Vectorization
In this article, author Siddharth Teotia discusses the Dremio database, which is based on Apache Arrow and has vectorization capabilities.
-
Back to the Future: Demystifying Hindsight Bias
Input data in enterprise AI has more nuances than in consumer AI or academia. The Achilles’ heel in this domain is Hindsight Bias. In layman's terms, it is like Marty McFly (from Back to the Future) traveling to the future, getting his hands on the Sports Almanac, and using it to bet on the games of the present. Mayukh Bhaowal from Salesforce Einstein explains how to counteract it.
-
Q&A on the Book Software Wasteland
Almost all Enterprise Information Systems now cost vastly more to implement than they should. When you have hundreds or thousands of complex applications, you are stuck in the Application Centric Quagmire. In the book Software Wasteland, Dave McComb explores what is causing application development waste and how visualizing the cost of change and becoming data-centric can help to reduce it.
-
Virtual Panel: Microservices Communication and Governance Using Service Mesh
Service mesh is a dedicated infrastructure layer for handling service-to-service communication and offers a platform to connect, manage, and secure microservices. InfoQ spoke with subject matter experts in the service mesh area to learn more about why service mesh frameworks have become critical components of cloud native architectures.
-
Polyglot Persistence Powering Microservices
At Netflix, the cloud database engineering team is responsible for providing several flavors of data persistence as a service to microservice development teams. Roopa Tangirala explained how her team has created self-service tools that help developers easily implement the appropriate data store for each project's needs.
-
Migrating Batch ETL to Stream Processing: A Netflix Case Study with Kafka and Flink
At QCon New York, Shriya Arora presented “Personalising Netflix with Streaming Datasets” and discussed the trials and tribulations of a recent migration of a Netflix data processing job from the traditional approach of batch-style ETL to stream processing using Apache Flink.
-
Exploring the Fundamentals of Stream Processing with the Dataflow Model and Apache Beam
At QCon San Francisco 2016, Frances Perry and Tyler Akidau presented “Fundamentals of Stream Processing with Apache Beam”, and discussed Google's Dataflow model and its associated implementation, Apache Beam.