The InfoQ eMag: Introduction to Machine Learning
Machine learning has long powered many products we interact with daily—from "intelligent" assistants like Apple's Siri and Google Now, to recommendation engines like Amazon's that suggest new products to buy, to the ad ranking systems used by Google and Facebook.
More recently, machine learning has entered the public consciousness because of advances in "deep learning"—these include AlphaGo's defeat of Go grandmaster Lee Sedol and impressive new products around image recognition and machine translation.
While much of the press around machine learning has focused on achievements that were not previously possible, the full range of machine learning methods—from traditional techniques that have been around for decades to more recent approaches with neural networks—can be deployed to solve many important (but perhaps more prosaic) problems that businesses face. Examples of these applications include, but are by no means limited to, fraud prevention, time-series forecasting, and spam detection.
InfoQ has curated a series of articles for this introduction to machine learning eMagazine covering everything from the very basics of machine learning (what are typical classifiers and how do you measure their performance?), to production considerations (how do you deal with changing patterns in data after you’ve deployed your model?), to newer techniques in deep learning. After reading through this series, you should be ready to start on a few machine learning experiments of your own.
The Introduction to Machine Learning eMag includes:
- Introduction to Machine Learning with Python - We begin with the basics, using a concrete problem to frame the discussion: how can we detect credit card fraud using machine learning? We’ll discuss feature encoding, various types of models (logistic regression, decision trees, and random forests), and measures of model performance (precision, recall, and ROC curves). We’ll build models using popular open source libraries available for Python and include essentially all the code you’ll need to develop similar models yourself.
- Practicing Machine Learning with Optimism - Using machine learning to solve real-world problems often presents challenges that weren't initially considered during the development of the machine learning method. In the next article, Alyssa Frazee addresses a few examples of such issues: how do you obtain confidence intervals around uncertain estimates, how do you update and retrain your models when the models themselves are changing the world (and the data you have available), and how do you explain the seemingly black-box decisions that models make?
- Anomaly Detection for Time Series Data with Deep Learning - We take a detour from traditional machine learning techniques and problems to introduce deep learning, which is built on neural networks: models named for their loose resemblance to the connections between neurons in the brain. Tom Hanlon discusses the various types of neural networks (feed-forward, convolutional, and recurrent) and describes how to build a recurrent neural network that detects anomalies in time series data. To make the discussion concrete, Tom uses Deeplearning4j, a popular open-source deep-learning library for the JVM, in his examples.
- Real-World, Man-Machine Algorithms - In this article, Edwin Chen and Justin Palmer talk about the end-to-end flow of developing machine learning models. Kaggle competitions may lead you to believe that the hard part of machine learning is just in the algorithm tuning, but in reality there are a host of problems to address before and after the algorithmic part: where do you get the labels for your data? And how do you address changes in those labels over time? Approaches to these and similar issues in model “lifecycle” management are discussed.
- Book Review: Andrew McAfee and Erik Brynjolfsson's The Second Machine Age - As machine learning becomes increasingly prevalent, society will have to address the impact the technology has on workers who might be displaced. In their book The Second Machine Age, Andrew McAfee and Erik Brynjolfsson discuss some of the potential effects that artificial intelligence and related technologies will have, particularly on economic inequality, and propose policy interventions to mitigate the negative impact.
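For readers new to the performance measures named in the first article, here is a minimal sketch of how precision and recall are computed for a binary fraud classifier. It is not taken from the eMag; the function name `precision_recall` and the toy labels are hypothetical, with 1 standing for "fraud" and 0 for "legitimate".

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud, flagged as fraud
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate, flagged as fraud
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud, missed
    # Guard against division by zero when nothing is flagged or nothing is fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 4 fraudulent and 4 legitimate transactions.
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of the transactions flagged as fraud, how many really were?", while recall answers "of the real fraud cases, how many were caught?". In fraud detection the two usually trade off against each other as the decision threshold moves.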
InfoQ eMags are professionally designed, downloadable collections of popular InfoQ content - articles, interviews, presentations, and research - covering the latest software development technologies, trends, and topics.