Martin Fowler on Software Design in the 21st Century

Schemaless data structures are not well understood, and it is important to weigh their advantages and disadvantages when using them in NoSQL databases. At a recent company event, Martin Fowler talked about Schemaless Data Structures and about NoSQL & Consistency.

Schemaless Data Structures:

Being schemaless is often seen as a big advantage of NoSQL databases. Martin believes that the area is not well understood and describes different aspects of schemalessness, as well as the advantages and disadvantages of using schemaless data structures.

The main point is that even a schemaless structure still has a schema. To query the data and find information you have to understand the data, and that understanding is an Implicit Schema: a definition of the data that lives somewhere else, e.g. in code. In contrast, the schema in a relational database, where only correct data is accepted, is an Explicit Schema.
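As a rough illustration of an implicit schema, the sketch below reads documents from a schemaless collection. The order documents and field names are hypothetical and do not come from the talk. The point is only that the reading code has to assume a shape that the store never checked.

```python
# Hypothetical order documents as they might sit in a schemaless store.
orders = [
    {"id": 1, "customer": "Ann", "price": 25.0, "quantity": 2},
    # Same meaning, different field names: the store accepted it anyway.
    {"id": 2, "customer": "Bob", "cost": 10.0, "qty": 3},
]


def order_total(order: dict) -> float:
    # The field names used here are the implicit schema: nothing but this
    # code says that an order has "price" and "quantity".
    return order["price"] * order["quantity"]


def safe_totals(docs: list) -> tuple:
    """Return totals for readable documents and the ids of unreadable ones."""
    totals, rejected = [], []
    for doc in docs:
        try:
            totals.append(order_total(doc))
        except KeyError:
            # The document does not match what the code silently expects.
            rejected.append(doc["id"])
    return totals, rejected
```

The second document is rejected only when it is read, not when it is written, which is the kind of late surprise an explicit schema would have caught up front.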

Martin ends the discussion by claiming that most of the time "Implicit Schema == Bad Thing", preferring an explicit schema to get a clear statement of what the data looks like, although there are a few cases where schemalessness is useful. He also states that a schema does not need to be a fixed storage schema; it can take the form of a contract, e.g. a data access layer or an XML schema.
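One way to picture a schema as a contract is a small data access layer in front of a schemaless store. The sketch below is an assumption-laden example, not something shown in the talk: the Customer fields, the CustomerRepository class and the in-memory dictionary standing in for the database are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Explicit contract for a customer record (hypothetical fields)."""
    customer_id: int
    name: str
    email: str


class CustomerRepository:
    """Hypothetical data access layer over an in-memory schemaless store."""

    REQUIRED = {"name", "email"}

    def __init__(self) -> None:
        self._docs = {}  # stands in for a document database

    def save(self, customer: Customer) -> None:
        # Writes always go through the contract, so stored documents match it.
        self._docs[customer.customer_id] = {
            "name": customer.name,
            "email": customer.email,
        }

    def load(self, customer_id: int) -> Customer:
        doc = self._docs[customer_id]
        missing = self.REQUIRED - set(doc)
        if missing:
            # Documents written by some other path are checked on read.
            raise ValueError(f"document {customer_id} is missing {sorted(missing)}")
        return Customer(customer_id, doc["name"], doc["email"])
```

The storage itself stays schemaless; the clear statement of what the data looks like lives in the Customer type and in the repository that enforces it.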

NoSQL and Consistency:

In this talk Martin looks at two aspects of consistency in NoSQL databases.

Logical Consistency deals with keeping data consistent when working in one database. For most NoSQL databases (graph databases being one exception), the use of aggregates (a concept from Domain-Driven Design in which a cluster of related objects is stored together as one unit) is an obvious way of avoiding inconsistency.
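The aggregate idea can be sketched as follows, assuming a hypothetical key-value store in which a single put replaces one whole aggregate. The order structure and the helper names are made up for illustration.

```python
import copy

# Hypothetical key-value store: each key holds one whole aggregate.
store = {}


def put(key: str, aggregate: dict) -> None:
    # A single write replaces the entire aggregate.
    store[key] = copy.deepcopy(aggregate)


def get(key: str) -> dict:
    # Return a copy so callers cannot change stored data without a put.
    return copy.deepcopy(store[key])


def add_line(order_id: str, product: str, price: float, quantity: int) -> None:
    """Add a line item and update the total inside one aggregate write."""
    order = get(order_id)
    order["lines"].append({"product": product, "price": price, "quantity": quantity})
    order["total"] = sum(l["price"] * l["quantity"] for l in order["lines"])
    put(order_id, order)  # lines and total are stored together
```

Because the line items and the total live in the same aggregate, no reader of this store can see a new line with a stale total. The sketch does not address concurrent writers to the same key.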

While describing Replication Consistency, where copies of the same data are kept in several places, Martin introduces the CAP theorem. With data already replicated over the network, he simplifies it into a choice between consistency and availability. He emphasizes that this is not a technical issue; it is a business choice whether being consistent or being available is the top priority.
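The choice between consistency and availability can be made concrete with a toy model. Everything below is a hypothetical sketch rather than a real replication protocol: two replicas share a single bookable room, and the `prefer` parameter stands for the business decision.

```python
class Replica:
    """One copy of the data: who holds the single hypothetical room."""

    def __init__(self) -> None:
        self.booked_by = None


def book(local: Replica, remote: Replica, guest: str,
         partitioned: bool, prefer: str) -> bool:
    """Try to book via `local`; `prefer` is "consistency" or "availability"."""
    if partitioned:
        if prefer == "consistency":
            # Cannot check the other replica, so refuse the request.
            return False
        # Availability: answer from local state only and risk a conflict.
        if local.booked_by is None:
            local.booked_by = guest
            return True
        return False
    # Network is healthy: both replicas are checked and updated together.
    if local.booked_by is None and remote.booked_by is None:
        local.booked_by = remote.booked_by = guest
        return True
    return False
```

With `prefer="consistency"` a partition means lost bookings; with `prefer="availability"` two guests can each book the same room on different replicas, and the conflict has to be resolved later. Which outcome is less harmful depends on the business, not on the database.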

Martin ended with a talk on the value of software design and technical debt.

 
