The InfoQ eMag: Getting a Handle on Data Science
Making best use of data is fast becoming a critical skillset, not only for professional data scientists but also for software developers in general, whether moving into that specialist field or simply wanting to have data science as one tool in the arsenal. As a developer, how can you start to get a handle on data, choose the best tools and libraries, navigate through the maths and stats, and get useful results?
This eMag looks at data science from the ground up, across technology selection, assembling raw and unstructured data, statistical thinking, machine learning basics, and the ethics of applying these new weapons.
Getting a Handle on Data Science includes:
- Solving Business Problems with Data Science - Enterprises are increasingly realising that many of their most pressing business problems could be tackled with the application of a little data science. This article, the first in a series, looks at the foundations of a successful business-orientated data science project.
- Selecting Big Data and Data-Science Technologies at a Large Financial Organisation - Adopting Big Data and Data Science technologies into an organisation is a transformative project similar to an agile transformation and with many similar challenges. In this article, the author describes such a project for a FTSE100 financial services company.
- Getting Started with Machine Learning - A quick introduction to the machine learning field, exploring both supervised and unsupervised approaches.
- From Raw Data to Data Science: Adding Structure to Unstructured Data to Support Product Development - With unstructured database technologies like Cassandra, MongoDB and even JSON storage in Postgres, unstructured data has become remarkably easy to store and to process. Software and data engineers alike can succeed in a world (mostly) free from data modelling, which is no longer a prerequisite to collecting data or extracting value from it.
- Data Science up and down the Ladder of Abstraction - Although Clojure lacks the extensive toolbox and analytic community of the most popular data science languages, R and Python, it provides a powerful environment for developing statistical thinking and for practicing effective data science.
- Book Review: Cathy O’Neil’s Weapons of Math Destruction - “Big Data has plenty of evangelists, but I’m not one of them,” writes Cathy O’Neil, a blogger (mathsbabe.org) and former quantitative analyst at the hedge fund DE Shaw who became sufficiently disillusioned with her hedge fund modelling that she joined the Occupy movement.
InfoQ eMags are professionally designed, downloadable collections of popular InfoQ content - articles, interviews, presentations, and research - covering the latest software development technologies, trends, and topics.