Our physical world is about to become digitally enabled. According to various predictions, for example by Gartner or Cisco, many billions of IoT devices will go online and constantly gather data in the coming years. We got in touch with Wayne Carter and Ali LeClerc of Couchbase to discuss how Couchbase Mobile is ready for the upcoming era of the Internet of Things.
Machine learning is about making data-driven decisions or predictions based on existing data. Apache Spark and its machine learning library MLlib offer several algorithms useful for developing scalable machine learning applications. InfoQ spoke with Nick Pentreath, author of the book Machine Learning with Spark, about data science and machine learning topics.
In this article, the third installment of the Apache Spark series, author Srini Penchikala discusses the Apache Spark Streaming framework for processing real-time streaming data, using a log analytics sample application.
In this article, Dr. Josiah Carlson, author of the book “Redis in Action”, explains how to use Redis and sorted sets with hashes for time series analysis.
In this article, the author discusses survival prediction for colorectal cancer as a multi-class classification problem and how to solve it using Apache Spark's MLlib Java API.
Data Lake-as-a-Service provides big data processing in the cloud for business outcomes in a cost-effective way. InfoQ spoke with Lovan Chetty and Hannah Smalltree from Cazena about how these solutions work.
Neo Technology, the company behind the graph NoSQL database Neo4j, recently released version 2.3 of the database and also announced the openCypher initiative. InfoQ spoke with Philip Rathle about it.
In this article, author Dan Macklin discusses the transition to a Riak NoSQL and Erlang-based architecture coupled with Convergent Replicated Data Types (CRDTs), and the lessons learned from that transition.
In this article, the author discusses a bioinformatics software-as-a-service (SaaS) product built as a public data warehousing and analytical platform for mass spectrometry data.
A new Eclipse Oozie plugin significantly simplifies the implementation of Oozie processes by letting developers define them graphically. This article introduces the plugin and provides an example of its usage.
ColumnStore indexes can offer performance improvements over traditional tables, but they aren’t always faster. Aleksandr Shavlyuga explores the power and limitations of SQL Server’s ColumnStore Indexes.
In this article, author Carlos Bueno discusses strategies for estimating server capacity for big data projects and initiatives, with the help of two case studies.