Big Data Content on InfoQ
-
Machine Learning Netflix Style with Xavier Amatriain
Xavier Amatriain discusses how Netflix uses specialized roles, including those of the Data Scientist and Machine Learning Engineer, to deliver valuable data at the right time to Netflix's customer base through a mixture of offline, online, and nearline data processes. Xavier also discusses what it takes to become a Machine Learning Engineer and how to gain real experience in the field.
-
Eva Andreasson on Hadoop, the Hadoop Ecosystem, Impala
Eva Andreasson explains the various Hadoop technologies and how they interact, real-time queries with Impala, the Hadoop ecosystem including Hue, Oozie, YARN, and much more.
-
Big Data's Role in Etsy's Product Development
Etsy's approach to big data has been to give the entire organization visibility into the different sources of data generated by its product, as well as access to the experts who know how to use that data. Nell Thomas explains her role at Etsy and how Etsy's view of big data has shaped its product's evolution.
-
The Larger Purpose of Big Data with Pavlo Baron
Big Data means more than just the size of a dataset. Pavlo Baron explains different ways of applying Big Data concepts in various situations: from analytics, to delivering content, to medical applications. His larger vision for Big Data ranges from specialized Data Scientists, to learning Decision Support Systems, to helping mankind itself.
-
Erik Meijer on Big Data, Types of Data Stores and Reactive Programming
Erik Meijer explains the various aspects needed to categorise data stores, how reactive programming fits in with databases, the return to data transformation, denotational semantics, and much more.
-
Eli Collins on Hadoop
Eli Collins discusses Cloudera's CDH4 release, which tasks are well suited for Hadoop, Hadoop and MapReduce vs SQL, the state of Hadoop, and much more.
-
Stuart Halloway on Datomic, Clojure, Reducers
Stuart Halloway explains Datomic, programming transactional behavior with Datomic, Datalog and logic programming, programming with values, Clojure Reducers and much more.
-
Max Sklar on Machine Learning at Foursquare
Max Sklar talks about machine learning at Foursquare, the use of Bayesian statistics and other methods to build Foursquare's recommendation system, and much more.
-
Big Data Architecture at LinkedIn
In this interview at QCon London, LinkedIn’s Sid Anand discusses the problems they face when serving high-traffic, high-volume data. Sid explains how they’re moving some use cases from Oracle to gain headroom, and lifts the hood on their open source search and data replication projects, including Kafka, Voldemort, Espresso and Databus.
-
Hadoop and NoSQL in a Big Data Environment
Ron Bodkin of Big Data Analytics discusses early adoption of Hadoop, NoSQL and big data technologies. He discusses common patterns and explains how developers can write low-level primitives to optimize MapReduce functions. Other topics include Hive, Pig, multi-tenancy, and security.
-
All things Hadoop
In this interview, Ted Dunning talks about Hadoop, its current usage and its future. He explains the reasons for Hadoop's success and makes recommendations on how to start using it.
-
Costin Leau on Spring Data, Spring Hadoop and Data Grid Patterns
In this interview, recorded at the JavaOne 2011 conference, Spring Hadoop project lead Costin Leau talks about the current state and upcoming features of the Spring Data and Spring Hadoop projects. He also talks about the Caching and Data Grid architecture patterns.