AI, ML & Data Engineering Content on InfoQ
-
Cassandra CLI Internals Using JArchitect
Cassandra CLI is a useful tool for Cassandra administrators. It is a good example of how to implement a Cassandra client, and understanding its internals helps when developing custom Cassandra clients or extending the CLI tool itself. In this article, the author explores the Cassandra CLI architecture, using the JArchitect tool and the CQLinq language to analyze its code base.
-
Don’t jump the SQL ship just yet
The SQL language has been evolving steadily over the last two decades. At the same time, the verbosity caused by the JDBC API in Java client code and the lack of first-class SQL support within the Java language have led to the introduction of ORMs such as Hibernate, which was later standardised into JPA and the Criteria API. If SQL and JPA are diverging, where will our data interaction patterns go?
-
Building a RESTful Web Service with Spring Boot to Access Data in an Aerospike Cluster
Spring Boot allows you to build Spring based applications with little effort on your part. Aerospike is a distributed and replicated in-memory database that is ACID compliant. This article will take you through creating a simple RESTful web service with Spring Boot and Aerospike.
-
Agility, Big Data, and Analytics
How do you bring agility into big data analytics? Learn what makes analytics different from application development, and how to adapt agile principles and practices to the nuances of analytics. Examine how the disciplines of data science and software development complement one another, and how they intersect in an agile project environment.
-
Costin Leau on Elasticsearch, BigData and Hadoop
Elasticsearch is an open source, distributed real-time search and analytics engine for the cloud. The first milestone of elasticsearch-hadoop, 1.3.M1, was released last month. InfoQ spoke with Costin Leau about Elasticsearch and how it integrates with Hadoop and other Big Data technologies.
-
Preparing for Your First MongoDB Deployment: Capacity Planning and Monitoring
In this article, author Mat Keep discusses deployment best practices for MongoDB databases, with a focus on capacity planning and monitoring. He also covers topics such as hardware selection, key metrics for monitoring, and when it is time to add shards.
-
How to Make Your In-memory NoSQL Datastores Enterprise-Ready
In this article, author Yiftach Schoolman outlines how to overcome the top seven challenges associated with managing in-memory NoSQL datastores in the cloud. He discusses challenges such as availability, consistency during and after network splits, data durability, scalability, and ops overhead.
-
How to Provide SQL Access to NoSQL Type Data using Multi-Record Type
In this article, author Randal Hoff shows how to use the Multi-Record Type pattern to provide both NoSQL and SQL access to c-treeACE data that combines multiple schemas in a single table.
-
Building Scalable Applications in .NET: Introducing the FatDB Distributed Computing Platform
Justin Weiler introduces FatDB, a NoSQL DB and a distributed platform built on Mission Oriented Architecture meant to abstract and generalize the essential characteristics of enterprise applications.
-
Exploring the Architecture of the NuoDB Database, Part 2
In Part 2 of this article the author takes a look at how the transaction system is implemented, the role of the administrative layer, how all components work together and what to expect in the future.
-
Exploring the Architecture of the NuoDB Database, Part 1
In Part 1 of this article, the author introduces NuoDB and covers some of its main features: a 3-tiered architecture, nodes that act as equal peers, Atoms (the fundamental data unit), and the versioning and concurrency system used to handle data update conflicts and implement consistency.
-
Spoilt for Choice – How to choose the right Big Data / Hadoop Platform?
In his new article, Kai Wähner compares several alternatives for installing a version of Hadoop and realizing big data processes. He compares distributions and tooling from Apache and many other vendors, including Cloudera, HortonWorks, MapR, Amazon, IBM, Oracle, and Microsoft. He also describes the pros and cons of each distribution and provides a decision tree for choosing the most appropriate one.