Ron Bodkin on Big Data and Analytics
Ron Bodkin discusses big data architecture, real-time analytics, batch processing, map-reduce, and data science.

Hilary Mason, interviewed by Ryan Slobojan, discusses the engineering behind bit.ly and the use of machine learning in its system architecture. Hilary also talks about bit.ly's use of MySQL and MongoDB to manage terabytes of information about users and clicks, and the implications for performing real-time anthropological analysis of the human condition.
Hilary Mason presents the history of machine learning, covering some of the most significant developments of the last two decades, especially the fundamental math and algorithmic tools employed. She also shows how bit.ly uses machine learning to discover various statistical information about its users.
Corporations are increasingly using social media to learn more about what their customers are saying about their products. This presents unique challenges as unstructured content needs analytic techniques to interpret the sentiment embodied in the blog posts. InfoQ caught up with Subramanian Kartik to learn more about the blog sentiment analysis project his team worked on.
In a recent news article, the Massachusetts Institute of Technology introduced a technology for automatically remembering connections between objects. The system determines how objects in a large software project interact, so it can tell latecomers which objects they will need to design certain types of functions.
Ravi Kannan of Microsoft Research has been named winner of the 2011 Knuth Prize by ACM SIGACT (Special Interest Group on Algorithms and Computation Theory). According to the press announcement, he receives the prize for his work on influential algorithmic techniques aimed at solving long-standing computational problems.
InfoQ interviewed David Smith, VP of Community for Revolution Analytics, at the Strata big data conference. Revolution provides commercial extensions for the open source R statistics package and announced the R Enterprise v4.2 Suite, along with tools to help SAS users migrate to R.
The need for machine-learning techniques like clustering, collaborative filtering, and categorization has steadily increased over the last decade, along with the number of solutions needing quick and efficient algorithms to transform vast amounts of raw data into relevant information. Apache Mahout 0.3 was announced in March, adding more functionality, stability, and performance.