Martin Fowler on Software Design in the 21st Century

Schemaless data structures are not well understood, and it is important to weigh their advantages and disadvantages when using them in NoSQL databases. At a recent company event, Martin Fowler talked about Schemaless Data Structures and about NoSQL & Consistency.

Schemaless Data Structures:

Being schemaless is often seen as a big advantage of NoSQL databases. Martin believes the area is not well understood, and he describes different aspects of schemalessness as well as the advantages and disadvantages of using schemaless data structures.

The main point is that even with a schemaless structure you still have a schema. To query the data and find information you have to understand its shape, and that understanding is an Implicit Schema: a definition of the data that lives outside the database, e.g. in code. In contrast, the schema in a relational database, where only data that conforms to it is accepted, is an Explicit Schema.
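The difference can be illustrated with a minimal Python sketch. It is not taken from the talk; the `orders` documents, the field names, and the `total_revenue` function are hypothetical, and an in-memory SQLite table stands in for a relational database.

```python
import sqlite3

# Schemaless side: a plain list of dicts accepts any shape of document.
orders = [
    {"id": 1, "customer": "ann", "total": 20.0},
    {"id": 2, "customer": "bob", "amount": 15.0},  # different field name, still accepted
]

def total_revenue(docs):
    # This code assumes every document has a "total" field.
    # That assumption is the implicit schema.
    return sum(d["total"] for d in docs)

implicit_error = None
try:
    total_revenue(orders)
except KeyError as exc:
    # The mismatch only shows up when the data is read.
    implicit_error = f"missing field {exc}"

# Explicit side: the table definition states the shape, and the
# database rejects a row that does not conform at write time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, total REAL NOT NULL)"
)
conn.execute("INSERT INTO orders VALUES (1, 'ann', 20.0)")

explicit_error = None
try:
    conn.execute("INSERT INTO orders (id, customer) VALUES (2, 'bob')")
except sqlite3.IntegrityError as exc:
    explicit_error = str(exc)
```

In the schemaless case the malformed document is stored without complaint and the failure surfaces later in the reading code; in the explicit case the write itself is refused.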

Martin ends the discussion by claiming that most of the time "Implicit Schema == Bad Thing", preferring an explicit schema because it gives a clear statement of what the data looks like, although there are a few cases where schemalessness is useful. He also states that a schema does not need to be a fixed storage schema; it can take the form of a contract, e.g. a data access layer or an XML schema.
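One way to read "schema as a contract" is a data access layer that is the only code allowed to touch the schemaless store. The sketch below is an illustration under that assumption; `Customer` and `CustomerStore` are hypothetical names, and a dict stands in for the database.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:
    # The dataclass is the explicit statement of what a customer looks like.
    id: int
    name: str
    email: str

class CustomerStore:
    """Hypothetical data access layer over a schemaless key-value store."""

    def __init__(self):
        self._docs = {}  # stand-in for the underlying schemaless database

    def save(self, customer):
        # Only objects that satisfy the contract are written.
        if not isinstance(customer, Customer):
            raise TypeError("CustomerStore only accepts Customer objects")
        self._docs[customer.id] = {"name": customer.name, "email": customer.email}

    def load(self, customer_id):
        # Readers get the contract type back, never a raw document.
        doc = self._docs[customer_id]
        return Customer(id=customer_id, name=doc["name"], email=doc["email"])

store = CustomerStore()
store.save(Customer(id=1, name="Ann", email="ann@example.com"))
loaded = store.load(1)
```

The storage itself stays schemaless, but every reader and writer goes through one place that states the shape of the data.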

NoSQL and Consistency:

In this talk Martin looks at two aspects of consistency in NoSQL databases.

Logical Consistency deals with keeping data consistent within a single database. For most NoSQL databases (graph databases being one exception), using aggregates (a concept from Domain-Driven Design in which a cluster of related objects is stored together as one unit) is an obvious way of avoiding inconsistency.
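A small sketch of the aggregate idea, assuming an order with its line items as the aggregate. `AggregateStore` and `add_line` are hypothetical, and a single dict assignment stands in for the one atomic write that an aggregate-oriented database would perform.

```python
import copy

class AggregateStore:
    """Hypothetical aggregate-oriented store: one key, one whole aggregate."""

    def __init__(self):
        self._docs = {}

    def put(self, key, aggregate):
        # The whole aggregate is replaced in one step, so a reader never
        # sees new line items paired with an old total.
        self._docs[key] = copy.deepcopy(aggregate)

    def get(self, key):
        return copy.deepcopy(self._docs[key])

def add_line(order, sku, qty, price_cents):
    # Line items and the derived total change together inside the aggregate.
    order["lines"].append({"sku": sku, "qty": qty, "price_cents": price_cents})
    order["total_cents"] = sum(l["qty"] * l["price_cents"] for l in order["lines"])
    return order

store = AggregateStore()
order = {"id": "o1", "lines": [], "total_cents": 0}
add_line(order, "book", 2, 1500)
add_line(order, "pen", 1, 200)
store.put("o1", order)
loaded = store.get("o1")
```

If the order and its line items were instead stored as separate records, keeping them in step would need a multi-record transaction, which many NoSQL databases do not offer.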

While describing Replication Consistency, where copies of the same data are kept in several places, Martin introduces the CAP theorem. Because the data is already replicated over a network, he simplifies the theorem into a choice between consistency and availability. He emphasizes that this is not a technical issue; it is a business choice whether being consistent or being available is the top priority.
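The choice can be made concrete with a toy model of two replicas. This is a simplified sketch, not a real replication protocol; `Replica`, `Cluster`, and the `prefer` setting are hypothetical names.

```python
class Replica:
    def __init__(self):
        self.data = {}

class Cluster:
    """Toy two-replica cluster with a configurable priority."""

    def __init__(self, prefer):
        self.prefer = prefer          # "consistency" or "availability"
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False      # True when the replicas cannot talk

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.prefer == "consistency":
                # Refuse the write: replicas stay identical, but the
                # system is unavailable for writes during the partition.
                return False
            # Accept the write locally: the system stays available,
            # but the replicas now disagree until they are reconciled.
            replica.data[key] = value
            return True
        # No partition: the write reaches both replicas.
        replica.data[key] = value
        other.data[key] = value
        return True

cp = Cluster(prefer="consistency")
cp.partitioned = True
cp_accepted = cp.write(cp.a, "stock", 4)

ap = Cluster(prefer="availability")
ap.partitioned = True
ap_accepted = ap.write(ap.a, "stock", 4)
```

Which branch is right depends on what a refused write or a temporarily divergent replica costs the business, which is why the setting is a parameter rather than something the code can decide.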

Martin ended with a talk discussing the value of software design and technical debt.

 


Community comments

  • Schema is a means to an end

    by Ralf Westphal

    It seems the debate is whether schemas are "good" or "bad" in an absolute way.
    But schemas are just means to an end.

    That means you need to know what the (non-functional) requirements are... and then decide if more or less schema is good for a certain purpose.

    Schemas are contracts to enable collaboration of different parties in space and time.
    Schemas are performance optimizers.
    Schemas are... what else?

    Fixing a schema too early leads to all sorts of costs due to schema changes. Or even prevents change.
    Fixing a schema too late leads to loss of performance/scalability/security etc.

    So the most important question to me seems: Can you easily change the rigidity of the schema of certain data?
    Is the decision for more or less schema reversible?
