AI, ML & Data Engineering Content on InfoQ
-
Wall St. Derivative Risk Solutions Using Geode
Andre Langevin and Mike Stolz discuss how Geode forms the core of many Wall Street derivative risk solutions which provide cross-product risk management at speeds suitable for automated hedging.
-
The Joy of Analysis Development
Hilary Parker discusses the history of analysis development tools, the current state of the art, and why it is important for data scientists and analysts to understand programming principles.
-
Machine Learning Fast and Slow
Suman Deb Roy talks about some of Betaworks’ internal data tools and platform, product-specific solutions, and the best practices they learned when machine learning has to drive a startup's road map.
-
Exploring Wikipedia with Apache Spark: A Live Coding Demo
Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.
-
Adaptive Availability for Quality of Service
Theo Schlossnagle talks about lessons learned in building an always-on distributed time-series database with aggressive quality of service guarantees, and techniques for dealing with bad machines.
-
Ingest & Stream Processing - What Will You Choose?
Pat Patterson and Ted Malaska talk about current and emerging data processing technologies, and the various ways of achieving "at least once" and "exactly once" timely data processing.
-
Structuring Data for Self-Serve Customer Insights
Jim Porzak discusses creating an analyst-ready data mart that is complete at different levels of abstraction and models customer decision points in order to understand customers.
-
Journey from Data Integration to Data Science
Michael Wise discusses the journey from having data integrated across an organization to employing data science to make good use of it.
-
The Joy of Not Coding
Jeroen Janssens discusses several tricks that help polyglot programmers mix and match different languages and tools in a project.
-
Applying Big Data
Graeme Seaton discusses the drivers behind Big Data initiatives and how to approach them using the vast amounts of data available.
-
Apache Beam: The Case for Unifying Streaming APIs
Andrew Psaltis talks about Apache Beam, which aims to provide a unified stream processing model for defining and executing complex data processing, data ingestion, and integration workflows.
-
Monitoring and Troubleshooting Real-Time Data Pipelines
Alan Ngai and Premal Shah discuss best practices for monitoring distributed real-time data processing frameworks and how DevOps can gain control and visibility over these data pipelines.