  • A Quick Primer on Isolation Levels and Dirty Reads

    by Jonathan Allen on  Oct 07, 2016

    Recently MongoDB found itself at the top of Reddit again when developer David Glasser learned the hard way that MongoDB performs dirty reads by default. In this article we will explain what isolation levels and dirty reads are and how they are implemented in popular databases.

  • Traffic Data Monitoring Using IoT, Kafka and Spark Streaming

    by Amit Baghel on  Sep 28, 2016

    The Internet of Things (IoT) is an emerging disruptive technology and a topic of growing interest. One area of IoT application is connected vehicles. In this article we'll use Apache Spark and Kafka to analyse and process data from IoT-connected vehicles and send the processed data to a real-time traffic monitoring dashboard.

  • Big Data Processing with Apache Spark - Part 5: Spark ML Data Pipelines

    by Srini Penchikala on  Sep 24, 2016

    With support for machine learning data pipelines, the Apache Spark framework is a great choice for building a unified use case that combines ETL, batch analytics, streaming data analysis, and machine learning. In this fifth installment of the Apache Spark article series, author Srini Penchikala discusses the Spark ML package and how to use it to create and manage machine learning data pipelines.

Spark GraphX in Action Book Review and Interview

Posted by Srini Penchikala on  Sep 12, 2016

InfoQ spoke with the authors of the book Spark GraphX in Action about the Apache Spark framework and what's coming up in the area of graph data processing and analytics.

Introduction to SQL Server Containers

Posted by Paul Stanton on  Sep 08, 2016

Containers are just around the corner for the Windows community, and this article takes a closer look at using SQL Server containers.

Chris Fregly on the PANCAKE STACK Workshop and Data Pipelines

Posted by Dylan Raithel on  Aug 29, 2016

InfoQ interviews Chris Fregly, organizer of the 4000+ member Advanced Spark and TensorFlow Meetup, about the PANCAKE STACK workshop, Spark, and building data pipelines for machine learning.

Christine Doig on Data Science as a Team Discipline

Posted by Srini Penchikala on  Aug 26, 2016

Christine Doig spoke at the OSCON conference about data science as a team discipline and how to navigate the Python data science ecosystem. InfoQ spoke with Christine about the challenges data science teams face.

Starcounter vs. ORM and DDD

Posted by Kostiantyn Cherniavskyi on  Aug 10, 2016

Kostiantyn Cherniavskyi looks at some of the issues surrounding the object-relational impedance mismatch and how many of them can be solved with hybrid databases such as Starcounter.

Virtual Panel: Current State of NoSQL Databases

Posted by Srini Penchikala on  Aug 02, 2016

NoSQL databases have been around for several years and have become a preferred choice for managing unstructured data. InfoQ spoke with four panelists about the current state of NoSQL databases.

Big Data Analytics with Spark Book Review and Interview

Posted by Srini Penchikala on  Jun 23, 2016

Big Data Analytics with Spark, authored by Mohammed Guller, provides a practical guide for learning Apache Spark. InfoQ and the author discuss the book and development tools for big data applications.

Everything Is “Lock-In”: Focus on Switching Costs

Posted by Richard Seroter on  Jun 08, 2016

It makes no difference how hard you try: some form of lock-in is unavoidable. What matters most is understanding the layers of lock-in, and how to assess and reduce your switching costs.

Martin Van Ryswyk on DataStax Enterprise Graph Database

Posted by Srini Penchikala on  May 17, 2016

DataStax recently announced DataStax Graph to support graph data models. InfoQ spoke with Martin Van Ryswyk from the DataStax team about the new product.