AI, ML & Data Engineering Content on InfoQ
-
Building Data Pipelines in Python
Marco Bonzanini discusses the process of building data pipelines and all the steps necessary to prepare data, focusing on data plumbing and going from prototype to production.
-
Policing the Stock Market with Machine Learning
Cliff Click talks about SCORE, a solution for doing Trade Surveillance using H2O, Machine Learning, and a whole lot of domain expertise and data munging.
-
Stream Processing & Analytics with Flink @Uber
Danny Yuan discusses how Uber builds its next-generation stream processing system to support real-time analytics as well as complex event processing.
-
Building a Data Science Capability from Scratch
Victor Hu covers the challenges, both technical and cultural, of building a data science team and capability in a large, global company.
-
Data Cleansing and Understanding Best Practices
Casey Stella talks about discovering missing values, values with skewed distributions and likely errors within data, as well as a novel approach to finding data interconnectedness.
-
Predictability in ML Applications
Claudia Perlich presents scenarios in which the combination of different and highly informative features can have a significantly negative overall impact on the usefulness of predictive modeling.
-
SQL Server on Linux: Will it Perform or Not?
Slava Oks talks about SQL Server’s history and high-level architecture, and dives into the core of the I/O Manager, Memory Manager, and Scheduler. Topics include lessons learned and experiences behind the scenes.
-
The State of AI
Jim McHugh keynotes on the current state of artificial intelligence.
-
Data Driven Products Now!
Dan McKinley discusses how Etsy is using data to validate its ideas and prototypes, turning some into real products.
-
ScyllaDB: Achieving No-Compromise Performance
Avi Kivity discusses ScyllaDB and the many design decisions behind it, from the programming language and programming model through low-level details and up to the advanced cache design, and more.
-
Fundamentals of Stream Processing with Apache Beam
Frances Perry and Tyler Akidau discuss Apache Beam, out-of-order stream processing, and how Beam’s tools for reasoning simplify complex tasks.
-
Data Science in the Cloud @StitchFix
Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.