AI, ML & Data Engineering Content on InfoQ
-
Data Science for Hire Ed
Gloria Lau describes some of the products built for the higher education sector, the data standardization process, determining school similarity and identifying notable alumni.
-
Machine Learning & Recommender Systems at Netflix Scale
Xavier Amatriain discusses the machine learning algorithms and architecture behind Netflix's recommender systems, as well as offline experiments and online A/B testing.
-
R for Big Data
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
-
Deploying Machine Learning and Data Science at Scale
Nick Kolegraff discusses common problems, an architecture that supports all phases of data science, and how to start a data science initiative, sharing lessons from Accenture, Best Buy, and Rackspace.
-
Working with Databases and Groovy
Paul King shows how to work with databases in Groovy, covering datasets, GMongo, Neo4j, raw JDBC, Groovy-SQL, CRUD, Hibernate, caching, Spring Data technologies, etc.
-
Functional Programming for Optimization Problems with City of Palo Alto Open Data
Paco Nathan reviews an example data analysis application, written in Cascalog, that is used for a recommender system based on City of Palo Alto Open Data.
-
Add ALL the Things: Abstract Algebra Meets Analytics
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
-
Big Data Platform as a Service at Netflix
Jeff Magnusson details some of Netflix's key services: Franklin, Sting and Lipstick.
-
Stream Processing: Philosophy, Concepts, and Technologies
Dan Frank discusses stream data processing and introduces NSQ – Bitly’s open source queuing system – and other new technologies used for communication between streaming programs.
-
"Big Data" Agile Analytics
Ken Collier discusses Agile Analytics, a combination of sophisticated analytics techniques, lean learning principles, agile delivery methods, and "big data" technologies.
-
High Speed Smart Data Ingest into Hadoop
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time, high-speed data ingest into Hadoop.
-
Making the Internet a Better Place: Scaling AppNexus
Mike Nolet shares lessons learned from scaling AppNexus and architectural details of its system, which processes 30 TB/day: Hadoop, DNS built in GSLB and Keepalived, and real-time data streaming built in C.