Twitter’s engineering group, known for open source contributions ranging from streaming MapReduce to the front-end framework Bootstrap, recently announced that it is open sourcing an algorithm that can efficiently recommend content. LinkedIn also open sourced a machine learning library of its own, ml-ease. In this article we present the algorithms and what they mean for the open source community.
Splunk’s user conference has drawn to a close. After three days and over 160 sessions covering security, operations, business intelligence, and even the Internet of Things, one central theme kept reappearing: the key to Big Data is machine learning.
Earlier this month Nvidia released cuDNN, a set of optimized low-level primitives to boost the processing speed of deep neural networks (DNN) on CUDA-compatible GPUs. The company intends to help developers harness the power of graphics processing units for deep learning applications.
Microsoft recently announced Azure ML, a cloud-based machine learning platform that helps predict future events based on past performance. Microsoft has been using machine learning for years in Bing, Xbox, and other products, but this is the first time that its internal technologies have been consumerized and deployed as cloud services. Ersatz Labs is also trying to build a PaaS for machine learning.
Domino, a Platform-as-a-Service for data science, enables people to do analytical work using languages such as Python or R in the cloud (EC2).
Spark recently graduated from the Apache incubator. Spark claims speed improvements of up to 100x over Apache Hadoop on in-memory datasets, gracefully falling back to a 10x improvement for on-disk workloads. Based on Scala, it can run SQL queries and be used directly in R. It also provides machine learning, graph database capabilities, and other features discussed further in the article.
The year 2013 has been rich in announcements of new programs, degrees, and grants for aspiring data scientists and Big Data practitioners.
Neural networks have long been an interesting field of research for exploring concepts in machine learning (a branch of artificial intelligence). Dr. James McCaffrey of Microsoft Research recently gave an engaging introductory talk on neural networks, complete with working demo code, for those looking to learn more about them.
Concurrent, Inc., the enterprise Big Data application platform company, today announced Pattern, a machine learning project based on an industry standard called PMML, which allows analytics frameworks such as SAS, R, MicroStrategy, and Oracle to export predictive models and run them on Hadoop clusters.
ThoughtWorks's latest "Technology Radar" focuses on mobile, accessible analytics, simple architectures, reproducible environments, and data persistence done right.
Corporations are increasingly using social media to learn more about what their customers are saying about their products. This presents unique challenges as unstructured content needs analytic techniques to interpret the sentiment embodied in the blog posts. InfoQ caught up with Subramanian Kartik to learn more about the blog sentiment analysis project his team worked on.
In a recent news article, the Massachusetts Institute of Technology introduced a technology for automatically remembering connections between objects. The system determines how objects in a large software project interact, so it can tell latecomers which objects they will need in order to design certain types of functions.
Ravi Kannan of Microsoft Research has been named winner of the 2011 Knuth Prize, awarded by ACM SIGACT (Special Interest Group on Algorithms and Computation Theory). According to the press announcement, he receives the prize for his work on influential algorithmic techniques aimed at solving long-standing computational problems.
At the Strata big data conference, InfoQ interviewed David Smith, VP of Community for Revolution Analytics. Revolution provides commercial extensions for the open source R statistics package and has announced the R Enterprise v4.2 Suite, along with tools to help SAS users migrate to R.
The need for machine-learning techniques like clustering, collaborative filtering, and categorization has steadily increased over the last decade, along with the number of solutions needing quick and efficient algorithms to transform vast amounts of raw data into relevant information. Apache Mahout 0.3 was announced in March, adding more functionality, stability, and performance.
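
For readers unfamiliar with collaborative filtering, the following is a minimal sketch of the general idea in plain Python. It is not the API or implementation of the project announced above. The `RATINGS` data, the function names, and the choice of cosine similarity between users are all assumptions made purely for illustration.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. Not taken from any real dataset.
RATINGS = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 4.0, "B": 2.0, "D": 5.0},
    "carol": {"B": 4.0, "C": 5.0, "D": 1.0},
}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse rating dicts (0.0 if no shared items)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[item] * v[item] for item in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated, using similarity-weighted
    averages of the ratings given by other users."""
    own = ratings[user]
    weighted_sum = {}
    weight_total = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(own, other_ratings)
        if sim <= 0.0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in own:
                continue  # only score items the user has not rated yet
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sim * rating
            weight_total[item] = weight_total.get(item, 0.0) + sim
    predicted = {item: weighted_sum[item] / weight_total[item] for item in weighted_sum}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Calling `recommend("alice", RATINGS)` returns a list of `(item, predicted_rating)` pairs for the items Alice has not rated. Production systems typically distribute this computation across a cluster, which this sketch makes no attempt to do.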