Database Content on InfoQ
-
Cybertron: Pushing the Limit on I/O Reduction in Data-Parallel Programs
The authors introduce Cybertron, a new tool for reducing I/O operations in data-parallel programs through a constraint-based encoding.
-
Full-text Search: Basics and Challenges
Itamar Syn-Hershko discusses full-text search: what it is, how it works, how to improve relevance ranking, how to tackle multilingual search, and the challenges of doing all this with Lucene and Elasticsearch.
-
How to Train Your Docker Cloud
Andrew Kennedy talks about the reasons for creating a Docker cloud and how Clocker was born.
-
Understanding Cloud, Big Data, Mobile and Security – Do They Play Nicely Together?
Colin Mower discusses the challenges of using Cloud, Big Data, Mobile and Security together, and how they can be combined to achieve business value.
-
Beating the Traffic Jam Using Embedded Devices, OPC-UA, Akka and NoSQL
Kristoffer Dyrkorn presents the experiences gained by the Norwegian Public Roads Administration in building a new infrastructure for road traffic measurements.
-
Groovy Vampires: Combining Groovy, REST, NoSQL, and More
Ken Kousen discusses combining various technologies: Groovy, Ratpack, MongoDB, Grails, REST.
-
A Taste of Random Decision Forests on Apache Spark
Sean Owen introduces Spark, Scala and random decision forests, and demonstrates the process of analyzing a real-world data set with them.
-
Analyzing Social Networks with F#
Evelina Gabasova explains how to run a social network analysis on Twitter and how to use data science tools to find out more about followers.
-
Don’t Let Data Gravity Crush Your Infrastructure
Dave McCrory explains what Data Gravity is, how it affects performance and portability, and why these effects are amplified as data volumes grow.
-
How SoundCloud Uses Cassandra
Emily Green takes a look at how SoundCloud uses Cassandra, describing a couple of Cassandra instances from the point of view of the products and functionality they support.
-
Customer Insight, from Data to Information
Thore Thomassen shares experience in combining structured data in a DWH with unstructured data in NoSQL, and in using parallel data warehouse appliances to boost analytical capabilities.
-
Efficient Data Storage for Analytics with Parquet 2.0
Julien Le Dem discusses the advantages of a columnar data layout, specifically the features and design choices Apache Parquet uses to achieve its goals of interoperability, space efficiency and query efficiency.