With unstructured database technologies like Cassandra, MongoDB and even JSON storage in Postgres, unstructured data has become remarkably easy to store and to process. Software and data engineers alike can succeed in a world (mostly) free from data modelling, which is no longer a prerequisite to collecting data or extracting value from it.
Adopting Big Data and Data Science technologies in an organisation is a transformative project, similar to an agile transformation and with many of the same challenges. In this article, the author describes such a project at an FTSE 100 financial services company.
In many cases the repository pattern is an apparently unnecessary layer around the underlying data access technology. But once you have a repository in place, many new opportunities become available.
This article will focus on the basic functionality of the repository pattern and how that functionality would be implemented using three different styles of ORM.
Yahoo uses Hadoop for a range of use cases in big data and machine learning. InfoQ spoke with Peter Cnudde about how Yahoo leverages big data technologies.
In this article we will explain what isolation levels and dirty reads are and how they are implemented in popular databases.
The Internet of Things (IoT) is an emerging technology, and connected vehicles are one of its application areas. In this article, we'll use Spark and Kafka to process and analyse data from IoT-connected vehicles.
In this fifth installment of the Apache Spark article series, author Srini Penchikala discusses the Spark ML package and how to use it to create and manage machine learning data pipelines.
InfoQ spoke with the authors of the book Spark GraphX in Action about the Apache Spark framework and what's coming up in the area of graph data processing and analytics.
Containers are just around the corner for the Windows community, and this article takes a closer look at using SQL Server containers.
InfoQ interviews Chris Fregly, organizer of the 4000+ member Advanced Spark and TensorFlow Meetup, about the PANCAKE STACK workshop, Spark, and building data pipelines for machine learning.
Christine Doig spoke at the OSCON conference about data science as a team discipline and how to navigate the Python data science ecosystem. InfoQ spoke with Christine about the challenges data science teams face.