AI, ML & Data Engineering Content on InfoQ
-
The InfoQ Podcast: Cathy O'Neil on Pernicious Machine Learning Algorithms and How to Audit Them
In this week's podcast, InfoQ’s editor-in-chief Charles Humble talks to data scientist Cathy O’Neil. Topics discussed include her book “Weapons of Math Destruction,” predictive policing models, the teacher value-added model, approaches to auditing algorithms, and whether government regulation of the field is needed.
-
Spark GraphX in Action Book Review and Interview
The book “Spark GraphX in Action” from Manning Publications, authored by Michael Malak and Robin East, provides tutorial-based coverage of Spark GraphX, the graph data processing library in the Apache Spark framework. InfoQ spoke with the authors about the book and the Spark GraphX library, as well as the overall Spark framework and what's coming up in graph data processing and analytics.
-
Book Review: Cathy O’Neil’s Weapons of Math Destruction
“Big Data has plenty of evangelists, but I’m not one of them,” writes Cathy O’Neil, a blogger (mathbabe.org) and former quantitative analyst at the hedge fund DE Shaw who became sufficiently disillusioned with her hedge fund modelling that she joined the Occupy movement.
-
Chris Fregly on the PANCAKE STACK Workshop and Data Pipelines
InfoQ interviews Chris Fregly, organizer of the 4000+ member Advanced Spark and TensorFlow Meetup, about the PANCAKE STACK workshop, Spark, and building data pipelines for machine learning.
-
Christine Doig on Data Science as a Team Discipline
Christine Doig spoke at this year's OSCON Conference about data science as a team discipline and how to navigate the data science Python ecosystem. InfoQ spoke with Christine about the challenges data science teams need to address to be more effective.
-
Virtual Panel: Current State of NoSQL Databases
NoSQL databases have been around for several years now and have become a common choice for storing and managing semi-structured and unstructured data. These databases offer advantages in terms of linear scalability and better performance for both data writes and reads. InfoQ spoke with four panelists to get different perspectives on the current state of NoSQL databases.
-
Key Takeaway Points and Lessons Learned from QCon New York 2016
The fifth annual QCon New York was the biggest yet, bringing together over 800 team leads, architects, project managers, and engineering directors. In total, over 140 practitioner-speakers presented 79 full-length technical sessions and 16 in-depth tutorials, providing deep insights into real-world architectures and state-of-the-art software development practices from a practitioner’s perspective.
-
What the JIT!? Anatomy of the OpenJDK HotSpot VM
If you've ever wondered what happens when your bytecode executes, join former Oracle G1GC performance lead Monica Beckwith in her guided tour of just-in-time (JIT) compilation and runtime optimizations in the OpenJDK HotSpot VM.
-
Big Data Analytics with Spark Book Review and Interview
The book “Big Data Analytics with Spark”, authored by Mohammed Guller, provides a practical guide to learning the Apache Spark framework for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. InfoQ spoke with the author about the book and development tools for big data applications.
-
Configure Once, Run Everywhere: Decoupling Configuration and Runtime
Configuration is one of the most widely used cross-cutting concerns in application development. Apache Tamaya is a new incubator project that brings standardized property management to Java.
-
Martin Van Ryswyk on DataStax Enterprise Graph Database
DataStax recently announced a new product called DataStax Graph for storing graph data models. It's based on the open source Titan graph database and uses the Gremlin query language from the Apache TinkerPop framework. InfoQ spoke with Martin Van Ryswyk about the new product.
-
Big Data Processing with Apache Spark - Part 4: Spark Machine Learning
In this fourth installment of the Apache Spark article series, author Srini Penchikala discusses machine learning concepts and the Spark MLlib library for running predictive analytics, using a sample application.