
Obscuring Complexity


Key Takeaways

  • If done well, Model-Driven Software Development can partially obscure some complexity, but you will have to treat the generated source code as build artifacts and take ownership of the templates. Maintaining code-generating templates is a kind of meta-programming that most developers are not used to doing.
  • Twelve-Factor applications can truly achieve lower complexity, but only when integrated with mature or stable (i.e. boring) data stores.
  • You can lower complexity in microservice orchestration by building some limited, cross-cutting intelligence into the connectors, but be careful: too much intelligence creates the opposite effect.
  • You can write smaller applications when using a heavyweight framework, but beware that efficiency may suffer, that these kinds of services are harder to tune for performance, and that it may take longer to debug certain kinds of issues.
  • With reactive programming, you end up trading backend complexity for frontend complexity.
     

One of the most important things that software architects do is manage the complexity of their systems in order to mitigate release disruption while maintaining sufficient feature velocity. Systems can be simplified, but only by so much. Modern software must meet a lot of sophisticated demands in order to be competitive. When we cannot reduce complexity, we try to hide or shift it.

Software architects tend to manage that complexity with the following time-honored strategies:

  • They can decrease the size of their applications by either reusing generic frameworks or by using programmatic code generators.
  • They make it easier to scale out their systems by keeping close tabs on application statefulness.
  • They design systems that degrade gracefully under increasing load or partial outage.
  • Finally, they normalize workloads by moving to eventually consistent systems.

Let’s go into more detail on these different strategies, and the historical context under which each strategy was formed, in order to better understand the advantages and disadvantages to each approach.

For each strategy, I mention complementary example technologies for Java, JavaScript, Python, and .NET developers. These lists are by no means complete, so please accept my apologies if I don’t mention your favorites.

It’s Only a Model

MDSD, or model-driven software development, increases feature velocity because the developers save time by writing less boilerplate code. Instead, a model is specified, and a generator combines the model with templates of the boilerplate code to generate code that the developers used to have to write by hand.

My first exposure to MDSD was back in the days of CASE (computer-aided software engineering). It resurfaced when UML (unified modeling language) was at its peak. The issue with MDSD back then was that it was being pitched to generate all the code, meaning the type of model needed to capture all possible requirements was so complex that it was easier just to write the code.

MDSD is making a resurgence due to a technology called Swagger (new versions of the modeling specification are now curated by the OpenAPI Initiative), where you specify a model just for what your APIs look like. The model and a set of templates are input to a generator, which outputs boilerplate code that surfaces the APIs. There are separate templates for code that produce and consume the API being modeled, and starter templates are available online for just about any relevant technology stack.

For example, examine the Swagger templates for Spring Boot, which generate the REST controllers, Jackson-annotated request and response POJOs, and various application boilerplate code. It is up to the developers to add the logic and implementation details (e.g. database access) needed to satisfy the requirements for each API. To avoid having engineers modify the Swagger-generated files directly, use either inheritance or dependency injection.
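
Here is a minimal sketch of that approach, assuming the generator was configured to emit API interfaces that hand-written code can implement; PetApi and Pet are hypothetical placeholders for whatever your own model declares:

    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.RestController;

    // Hand-written code lives in its own file and implements the generated
    // interface, so regenerating the boilerplate never overwrites your logic.
    // PetApi and Pet stand in for hypothetical generated types.
    @RestController
    public class PetApiController implements PetApi {

        @Override
        public ResponseEntity<Pet> getPetById(Long petId) {
            // Your implementation details (e.g. database access) go here.
            Pet pet = lookupInDatabase(petId);
            return pet == null ? ResponseEntity.notFound().build() : ResponseEntity.ok(pet);
        }

        private Pet lookupInDatabase(Long petId) {
            // Placeholder for real data access.
            return new Pet().id(petId).name("example");
        }
    }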

How can MDSD obscure the complexity of your application code? It is tricky but it can be done. The generator outputs the code that implements the API resources, so the developers don't have to worry about coding that. However, if you use the generator as a one-time code wizard and commit the output to your version-controlled source code repository (e.g. git), then all you did was save some initial coding time. You didn't really hide anything, since the developers will have to study and maintain the generated code.

To truly obscure the complexity of this code, you have to commit the model into your version-controlled source code repository, but not the generated source code. You need to generate that output source from the model every time you build the code. You will need to add that generator step to all your build pipelines. Maven users will want to configure the swagger-codegen-maven-plugin in their pom file. That plugin is a module in the swagger-codegen project.
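
A minimal sketch of that pipeline step might look like the following pom.xml fragment; the version property, spec path, and output directory are placeholders for your own project:

    <plugin>
      <groupId>io.swagger</groupId>
      <artifactId>swagger-codegen-maven-plugin</artifactId>
      <version>${swagger-codegen.version}</version> <!-- placeholder property -->
      <executions>
        <execution>
          <goals>
            <goal>generate</goal>
          </goals>
          <configuration>
            <!-- the committed model, the only Swagger artifact in version control -->
            <inputSpec>${project.basedir}/src/main/resources/api.yaml</inputSpec>
            <!-- use the Spring templates discussed above -->
            <language>spring</language>
            <!-- generated sources land under target/ and are rebuilt on every build -->
            <output>${project.build.directory}/generated-sources/swagger</output>
          </configuration>
        </execution>
      </executions>
    </plugin>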

What if you do have to make changes to the generated source code? That is why you will have to assume ownership of the templates and also commit them to your version-controlled source code repository. These are mustache templates, which look like the code to be generated, with curly-brace-delimited substitution parameters and decision-branch logic sprinkled throughout. Template programming is a form of meta-programming that is quite advanced.
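
To give a feel for what owning the templates means, here is an illustrative fragment in the mustache style (not copied from the real Spring templates); the section names are representative of the substitution parameters and branch logic you would be maintaining:

    {{#operations}}
    public interface {{classname}} {
    {{#operation}}
        ResponseEntity<{{returnType}}> {{operationId}}({{#allParams}}{{dataType}} {{paramName}}{{#hasMore}}, {{/hasMore}}{{/allParams}});
    {{/operation}}
    }
    {{/operations}}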

In the end, the best you can hope for with MDSD is that it can obscure complexity for your junior developers, but at the expense of having your senior developers support those templates.

Model-Driven Software Development

Pros:
  • Smaller application size
  • Increased feature velocity

Cons:
  • More complicated build pipelines
  • Templates must be maintained

On the Origin of Computer Systems

In 2011, the folks at Heroku published a curated collection of best practices for writing modern, cloud native, service-oriented software. They called it the Twelve-Factor App. For a better understanding of why these twelve factors truly reduce complexity, we briefly review the history of how computer systems have evolved from simple, single machine setups, to complex clusters of connected virtual machines in a software defined network.

For a long time, applications were designed to run on a single computer. If you wanted the app to handle more requests, then you had to scale the app up by installing it on a bigger computer. Systems evolved into two-tier applications where hundreds of users would run a specialized client program on their desktops that connected to a database program running on a single server.

The next step in the evolution of computing was three-tier systems where the client programs connected to an application server that would access the database server. Web applications replaced client-server applications because it was easier to deploy the client portion (assuming that everyone's computer had a modern web browser installed on it) and you could accommodate more users connected to the system. Scaling up (replacing one computer with a bigger computer) became less attractive than scaling out (expanding from one computer to many computers). In order to handle the additional user load, that single app server was scaled out to a cluster of computers running behind a load balancer. Database servers could be scaled out by techniques known as sharding (for writes) and replication (for reads). Back then, all of these servers were deployed either on the premises of the company that used them, or in a rented data center.

For about thirty years, the most viable option for database software was relational databases, also known as SQL databases because application code communicates with them via commands written in the structured query language. There are many great relational databases available to choose from. MySQL, MariaDB, and PostgreSQL are popular open source databases. Oracle and MS SQL Server are popular proprietary databases.

A proliferation of other options has become available in the past decade or so. There is now a category known as NoSQL databases, which includes wide column databases such as Cassandra, key-value databases such as Aerospike, document databases such as MongoDB, graph databases such as Neo4j, and inverted indexes such as Elasticsearch. Even more recently, multi-model and distributed databases have gained some popularity. With multi-model databases, you can call both SQL and NoSQL APIs on a single database installation. Distributed databases handle sharding and replication without any additional complexity in the application code. YugaByte and Cosmos DB are both multi-model and distributed.

With the advent of the cloud, companies no longer had to employ engineers who knew how to assemble and cable in racks of computers or sign five-year lease agreements with computer manufacturers and managed hosting service providers. In order to truly realize these economies of scale, the computers were virtualized and became more ephemeral. Software had to be redesigned to more easily accommodate all these changes in how it was deployed.

Typical deployment where two microservices and a database are scaled out.

Applications that properly follow these twelve factors easily handle this proliferation of hardware with minimal complexity. Let's focus on factors 6 (processes), 8 (concurrency) and 9 (disposability).

You will be able to scale out your application more easily if the app is designed to execute on many stateless processes. Otherwise, all you can do easily is scale up. That is what factor 6 is all about. It is okay to cache data in an external cluster in order to speed up average latency or to protect the underlying database(s) from getting overwhelmed, but the cache should never contain any data that isn’t already in the database(s). You can lose the cache at any time without losing any actual data.
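
As a sketch of what factor 6 looks like in code (the class and interface names here are hypothetical, not from any particular library), the cache only ever holds copies of rows that already live in the database, so losing the cache costs latency, never data:

    import java.time.Duration;
    import java.util.Optional;

    public class ProfileService {

        // Hypothetical abstractions over an external cache cluster and the database.
        public interface RemoteCache {
            <T> Optional<T> get(String key, Class<T> type);
            void put(String key, Object value, Duration ttl);
        }
        public interface ProfileRepository {
            Optional<Profile> findById(String id);
        }
        public record Profile(String id, String displayName) { }

        private final RemoteCache cache;
        private final ProfileRepository database;

        public ProfileService(RemoteCache cache, ProfileRepository database) {
            this.cache = cache;
            this.database = database;
        }

        public Optional<Profile> findProfile(String id) {
            // 1. Try the external cache; a miss is never an error.
            Optional<Profile> cached = cache.get(id, Profile.class);
            if (cached.isPresent()) {
                return cached;
            }
            // 2. Fall back to the database, the only place the data truly lives.
            Optional<Profile> fromDb = database.findById(id);
            // 3. Populate the cache with a copy that can be lost at any time.
            fromDb.ifPresent(profile -> cache.put(id, profile, Duration.ofMinutes(10)));
            return fromDb;
        }
    }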

Factor 8 is about concurrency. These processes should be grouped in clusters such that each cluster of processes handles the same type of requests. Your software will be much simpler if these processes do not share any state other than through the database(s). If these processes share internal state, then they have to know about each other, which makes it harder and more complicated to scale out by adding more processes to the cluster.

Your application will be more responsive to changes in load and robust to destabilization if each process can quickly initialize and gracefully terminate. This is factor 9, disposability. Dynamic scaling gives you the ability to quickly and automatically add more processes to handle increased load, but that works only if each process doesn’t need to take a long time to start up before it is ready to accept requests. Sometimes a system will destabilize, and the quickest way to resolve the outage is to restart all processes, but that works only if each process can terminate quickly without losing any data.
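
Here is a hedged sketch of the termination half of factor 9: stop accepting new work, drain what is in flight, then release resources. The HttpServerHandle interface is hypothetical; real frameworks expose equivalent hooks under different names:

    public final class GracefulShutdown {

        // Hypothetical handle exposing the hooks a real embedded server would provide.
        public interface HttpServerHandle {
            void stopAcceptingConnections();
            void awaitInFlightRequests(java.time.Duration timeout);
            void releaseResources();
        }

        public static void install(HttpServerHandle server) {
            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                server.stopAcceptingConnections();                               // drain, don't drop
                server.awaitInFlightRequests(java.time.Duration.ofSeconds(20));  // finish in-flight work
                server.releaseResources();                                       // close pools, flush logs
            }));
        }
    }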

You will avoid many bugs and brittleness if you architect your systems in such a way that the stream of inbound requests is handled by many processes running concurrently. These processes can, but do not have to be, multi-threaded. These processes should be able to start up quickly and shut down gracefully. Most importantly, these processes should be stateless and share nothing.

Obviously, there is not a lot of demand for applications that cannot remember anything, so where does the state go? The answer is in the database, but database applications are software too. Why is it okay for databases to be stateful, when it is not okay for applications to be stateful? We have already covered that applications need to be able to deliver on a fairly fast feature velocity. The same is not true for database software. It takes a lot of engineering time, thought, and effort to get statefulness done right under high load. Once you get there, you don't want to make a lot of big changes, because stateful software is very complex and easy to break.

As mentioned earlier, there is a large proliferation of database technologies, many of which are new and relatively untested. You can get some degree of usefulness out of a stateless application after only a couple of engineering months of effort. Some or most of those engineers can be recent graduates with little professional experience. A stateful application is completely different. I wouldn't bet on any database technology that didn't have at least two decades of engineering effort in it. (That’s engineering decades, not calendar decades.) Those engineers have to be seasoned professionals who are very smart and have lots of distributed computing experience and monster computer science chops. If you use an untested or immature database engine, then you will end up introducing additional complexity into your application in order to work around the bugs and limitations of the immature database. Once the bugs in the database get fixed, you will have to re-architect your application to remove the now unnecessary complexity.

Innovating Your Technology Stack

Pros:
  • Adopting new technology that is stateless can be fun and affords a competitive advantage with little risk.

Cons:
  • Adopting new technology that is stateful is very risky.
  • It will most likely increase complexity for your apps, instead of decreasing it.

It is a Series of Tubes after All

As systems evolved from a single application to clusters of interconnected applications and databases, a body of knowledge was cataloged to advise on the most effective ways that these applications can interact with each other. In the early 2000s, a book on enterprise integration patterns (or EIP) was published that more formally captured this body of knowledge.

Back then, a style of service interaction known as service-oriented architecture became popular. In SOA, applications communicated with each other through an enterprise service bus (ESB) that was also programmed to manipulate the messages and route them based on configuration rules that closely followed EIP.

Workflow engines are a similar technology, based on Petri nets, that is more business-focused. They were sold on the premise that non-engineers could write the rules, but never truly delivered on that promise.

These approaches introduced a lot of unnecessary and unappreciated complexity, which caused them to fall out of favor. Configuration grew into a complex tangle of interconnected rules that became very resistant to change over time. Why is this? It’s the same issue as getting MDSD to model all requirements. Programming languages may require more engineering knowledge than modeling languages, but they are also more expressive. It’s a lot easier to write or understand a small snippet of code that handles an EIP requirement than to author a large and complicated BPMN model specification. Both Camel (an Apache project) and Mulesoft (acquired by Salesforce in 2018) are ESBs that attempt to simplify their respective technologies. I hope that they succeed.

The reaction to ESB / workflow-flavored SOA became known as MSA, or microservice architecture. In 2014, James Lewis and Martin Fowler summed up the differences between MSA and SOA. With SOA, you had dumb endpoints and smart pipes. With MSA, you had smart endpoints and dumb pipes. Complexity was reduced, but perhaps by too much. Such systems were brittle and non-resilient (i.e. easily destabilized) during times of partial failure or degraded performance. There was also a lot of duplication across the separate microservices, each of which had to implement the same cross-cutting concerns, such as security. This is true (although to a lesser degree) even if each implementation simply embeds the same shared library.

What followed was the introduction of API gateways and service meshes, both of which are enhanced versions of layer 7 load balancers. The term “layer 7” is a reference to the OSI or Open Systems Interconnection model that was introduced back in the 80s.

When calls from the Internet or intranet are intended for microservices on the backend, they pass through an API gateway, which handles features like authentication, rate limiting, and request logging, removing those requirements from each individual microservice.

Calls from any microservice to any other microservice pass through a service mesh, which handles such concerns as bulkheading and circuit breaking. When requests to a service time out too frequently, the service mesh immediately fails future calls (for a while) instead of attempting to make the actual calls. This prevents the unresponsive service from causing the dependent services to also become unresponsive because all of their threads are stuck waiting on the original unresponsive service. This behavior is similar to a bulkhead on a ship preventing a flood from spreading beyond one compartment. With circuit breaking, the service mesh immediately fails calls (for a while) to a service that has been failing most of its calls in the recent past. The rationale for this strategy is that the failing service has become overwhelmed, and preventing calls to it will give it a chance to recover.
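
To make the circuit-breaking behavior concrete, here is a minimal, illustrative breaker; real service meshes and resiliency libraries add sliding windows, concurrency limits, and per-route configuration on top of this basic idea:

    import java.time.Duration;
    import java.time.Instant;

    public class CircuitBreaker {
        private final int failureThreshold;
        private final Duration openFor;
        private int consecutiveFailures = 0;
        private Instant openedAt = null;

        public CircuitBreaker(int failureThreshold, Duration openFor) {
            this.failureThreshold = failureThreshold;
            this.openFor = openFor;
        }

        public synchronized boolean allowRequest() {
            if (openedAt == null) {
                return true;                                   // circuit closed, call proceeds
            }
            if (Instant.now().isAfter(openedAt.plus(openFor))) {
                openedAt = null;                               // cool-down over, try again
                consecutiveFailures = 0;
                return true;
            }
            return false;                                      // fail fast for a while
        }

        public synchronized void recordSuccess() {
            consecutiveFailures = 0;
            openedAt = null;
        }

        public synchronized void recordFailure() {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = Instant.now();                      // trip the breaker
            }
        }
    }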

Deploying an API gateway and a service mesh.

API gateways and service meshes make microservices more resilient without introducing any additional complexity in the microservice code itself. However, they increase operational costs due to the additional responsibility for maintaining the health of the API gateway and/or service mesh.

MSA vs SOA

Pros:
  • For EIP, code is simpler than configuration.
  • API gateways reduce duplication when implementing cross-cutting concerns.
  • Service meshes increase resiliency.

Cons:
  • ESBs make it harder to understand systems and predict behavior.
  • Systems that use workflow engines are more likely to become resistant to change over time.
  • API gateways and service meshes introduce additional operational costs.

March of the Frameworks

Another way to reduce the amount of code that developers have to write is to use an application framework. A framework is just a library of general-purpose routines that implement functionality common to all applications. Parts of the framework load first and end up calling your code later.

As I mentioned earlier, relational databases were originally developed in the mid-70s and were so useful that they remained popular throughout the technology trends described earlier. They are still popular today, but using them in the web application world introduces a lot of complexity. Connections to relational databases are stateful and long-lived, yet typical web requests are stateless and short-lived. The net result is that multi-threaded services have to deal with this mismatch using a technique known as connection pooling. Single-threaded applications are less efficient in this respect; therefore they have to depend more on sharding and replication.
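
Here is a hedged sketch of connection pooling using HikariCP (one widely used JDBC pool, chosen here only as an example the article does not itself name); the JDBC URL, credentials, and pool size are placeholders you would tune for your own workload:

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class PooledDatabaseAccess {
        private final HikariDataSource pool;

        public PooledDatabaseAccess() {
            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:postgresql://localhost:5432/appdb"); // placeholder
            config.setUsername("app");                                   // placeholder
            config.setPassword("secret");                                // placeholder
            config.setMaximumPoolSize(10); // long-lived connections shared by all threads
            this.pool = new HikariDataSource(config);
        }

        public int countUsers() throws SQLException {
            // Each short-lived web request borrows a connection and returns it to the pool.
            try (Connection conn = pool.getConnection();
                 PreparedStatement stmt = conn.prepareStatement("SELECT COUNT(*) FROM users");
                 ResultSet rs = stmt.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }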

Object-Oriented Programming became quite popular during the client-server era, and has maintained its popularity since. Relational data does not fit into the object-oriented structure very easily, so object-relational mapping frameworks were developed in an attempt to obscure this kind of complexity. Popular ORM frameworks include Hibernate, SQLAlchemy, LoopBack, and Entity Framework.
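
As a small example of the mapping these frameworks obscure, here is a minimal JPA-annotated entity (JPA is the standard API that Hibernate implements; newer versions use the jakarta.persistence package instead of javax.persistence). The table and column names are illustrative:

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Maps a plain Java object onto a relational table row.
    @Entity
    @Table(name = "customers")
    public class Customer {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        @Column(name = "full_name", nullable = false)
        private String fullName;

        protected Customer() { }              // no-arg constructor required by the JPA spec

        public Customer(String fullName) {
            this.fullName = fullName;
        }

        public Long getId() { return id; }
        public String getFullName() { return fullName; }
    }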

In the early days of web application development, everything was built in what later became known as the monolith. The graphical user interface or GUI (basically browser rendered HTML, CSS, and JavaScript) was generated server-side. Patterns such as MVC (model view controller) were used to coordinate GUI rendering with data access, business rules, etc. There are actually many variations on MVC, but, for the purpose of this article, I am lumping them all into the same category as MVC. MVC is still around, and popular modern MVC frameworks include Play, Meteor, Django and ASP.NET.

Over time, these kinds of applications became large and cumbersome; so large that their behavior was hard to understand or predict. Making changes to the application was risky, and releasing new versions was disruptive because it was hard to test and verify correctness of these overly complex systems. A lot of engineering time was spent rapidly fixing the buggy code that got deployed without proper vetting. When you are forced to fix something quickly, you don't have the time to come up with the best solution, causing poor quality code to slip in. The intention is to replace this poor quality code with good quality code later on.

The answer was to split up the monolith into multiple components or microservices that could be released separately. The GUI code was all moved over to what is now called SPA (single page applications) as well as native mobile apps. Data access and business rules were kept server-side and split up into multiple services. Popular microservice frameworks include Flask and Express; Spring Boot and Dropwizard, both of which embed a servlet container, are the most popular choices for Java developers.

Microservice frameworks were originally simple, easy to learn, and exhibited behavior that was easy to understand and predict. Applications built on these lightweight frameworks got big over time due to the above-mentioned complexity factors. The bigger an application becomes, the more it resembles a monolith. When they weren’t splitting up big microservices into smaller ones, architects started looking for ways to reduce application size by hiding the related complexity in the framework. Using opinionated software, annotation-based design patterns, and replacing code with configuration reduced the number of lines of code in the applications, but made the frameworks more heavyweight.

Applications that use a heavyweight framework tend to have fewer lines of code and enjoy a faster feature velocity, but there are downsides to this form of obscured complexity. By their very nature, frameworks are more general-purpose than applications, which means that it takes significantly more code to do the same work. Though you have less custom application code, the actual executable, which includes the relevant framework code, is much larger. This means that it will take longer for the application to start up as all this extra code gets loaded into memory. All that extra unseen code also means that the stack traces (that get written to the application log whenever an unexpected exception gets thrown) will be a lot longer. A bigger stack trace takes an engineer more time to read and understand when debugging. 

At its best, performance tuning can be a bit of a black art. It can take a lot of trial and error to reach the right combination of connection pool sizes, cache expiration durations, and connection timeout values. This becomes even more daunting when you don't see the code that you are trying to tune. These frameworks are open source, so you could study the code but most developers don't.

Lightweight vs Heavyweight Frameworks

Pros:
  • Lightweight frameworks are easier to debug and tune.
  • Heavyweight frameworks increase feature velocity and lower release disruption.
  • Applications have less duplicated code.

Cons:
  • Lightweight frameworks require the developers to write more code.
  • Heavyweight frameworks take longer to start up and shut down.
  • Using a heavyweight framework usually means accepting the black-box code within it.

Eventual Consistency

Instead of synchronously processing each API request immediately, reactive systems asynchronously pass messages around to their internal subsystems in order to eventually process each API request.

It's hard to say when reactive programming was first introduced. The Reactive Manifesto was published in July 2013, but there were plenty of precursors to that. The pubsub, or publish-subscribe, pattern was first introduced in the mid-80s. Complex event processing, or CEP, briefly experienced some popularity in the 90s. The first article that I saw on SEDA, or staged event-driven architecture, was published near the end of 2001. Event sourcing is a recent variation on the theme of reactive programming. Reactive systems can be coded in the pubsub style or as message flows in domain scripting languages that resemble functional programming.

When a reactive programming system is distributed across multiple computers, there is usually (but not always) a message broker involved. Some of the more popular brokers for this are Kafka, RabbitMQ, and ActiveMQ. Recently, the Kafka folks have released a client-side library called Kafka Streams.
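
A minimal sketch of handing work to a broker instead of processing it synchronously, using the Kafka producer API; the broker address and topic name are placeholders:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class OrderEventPublisher {
        private final KafkaProducer<String, String> producer;

        public OrderEventPublisher() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");           // placeholder address
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            this.producer = new KafkaProducer<>(props);
        }

        public void publishOrderSubmitted(String orderId, String payloadJson) {
            // The API call can return once the event is handed to the broker;
            // downstream subsystems consume it and reach consistency eventually.
            producer.send(new ProducerRecord<>("order-events", orderId, payloadJson));
        }
    }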

Typical deployment for a distributed, fully reactive system.

ReactiveX is a very popular reactive framework with libraries for many different programming languages. For Java programmers, there are Spring Integration, Spring Cloud Data Flow, Vert.x, and Akka.

Here is how architects use reactive programming to obscure complexity. Calls to microservices become asynchronous, which means that whatever was asked of the API doesn't have to be done when the calls return. This is also known as eventual consistency. This makes those microservices more resilient to partial outages or degraded database performance without introducing much additional complexity. You don't have to worry about the caller timing out and resubmitting while the original transaction is still running. If some resource is not available, then just wait until it becomes available again. I will admit that it can be a challenge for junior developers to debug reactive programs (especially if coded in the pubsub style), but this is mostly because they are unfamiliar with this paradigm.

So, where did the complexity go? There is a lot of complexity in modern message brokers, but you are most likely going to just use one of those and not have to write your own. Like any technology, they have their own caveats, but their limitations are very reasonable.

For application development, the complexity was moved to the frontend. Eventual consistency might be wonderful for backend systems, but it is terrible for humans. You might not care when your vacation pictures reach all your friends in your social network, but if you are an enterprise customer negotiating an interconnected, multi-stage order, then you will want to know precisely when each part of your order gets submitted, validated, approved, scheduled, and eventually fulfilled.

In order for the GUI to accommodate that very human psychological need, it will need to notify the user when the work requested of the backend actually completes. Since the API call isn't synchronous, the frontend will have to find out some other way. Polling the API for status updates does not scale well. That means that the web browser or mobile device will need to use a stateful, long-lived connection by which it can receive updates from the backend without any prompting. In the old days, you could extend XMPP servers to do this. For modern web browsers, there is good support for websockets and server-sent events. Spring WebFlux, socket.io, and SignalR are three popular libraries that permit server-side services to communicate with client-side JavaScript in this manner.

Web browsers enforce limits on such connections, so the client application will need to share the same connection for receiving all notifications. Because most load balancers close idle connections, the application must account for that by occasionally sending keep-alive messages. Mobile devices are notorious for intermittent connection failures, requiring reconnection logic in the client software. Also, there must be some mechanism by which the client application can associate each notification (there may be more than one) with the original API call. There still has to be some mechanism to determine the status of previous API calls for when the user returns to the application after being away.
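
Below is a hedged sketch of such a notification channel using server-sent events with Spring WebFlux, including a periodic keep-alive comment so idle load balancers do not close the stream; the endpoint path, correlation scheme, and interval are assumptions, not a prescribed design:

    import org.springframework.http.MediaType;
    import org.springframework.http.codec.ServerSentEvent;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;
    import reactor.core.publisher.Flux;
    import reactor.core.publisher.Sinks;

    import java.time.Duration;

    @RestController
    public class NotificationController {

        // Backend components publish status updates into this sink.
        private final Sinks.Many<String> updates = Sinks.many().multicast().onBackpressureBuffer();

        public void publish(String correlationId, String status) {
            // A simple "correlationId:status" payload lets the client tie the
            // notification back to the original API call (assumed convention).
            updates.tryEmitNext(correlationId + ":" + status);
        }

        @GetMapping(path = "/notifications", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
        public Flux<ServerSentEvent<String>> notifications() {
            Flux<ServerSentEvent<String>> events = updates.asFlux()
                    .map(msg -> ServerSentEvent.builder(msg).build());
            // Periodic comment-only events keep the connection from going idle.
            Flux<ServerSentEvent<String>> keepAlive = Flux.interval(Duration.ofSeconds(30))
                    .map(i -> ServerSentEvent.<String>builder().comment("keep-alive").build());
            return Flux.merge(events, keepAlive);
        }
    }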

Reactive Systems and Eventual Consistency

Pros:
  • Reactive systems are more responsive and resilient.
  • Reactive systems may decrease complexity on the backend, especially for data-intensive applications.

Cons:
  • Reactive systems may increase complexity on the frontend in order for the application to be emotionally satisfying to its users.
  • Highly scalable, distributed reactive systems increase operational costs with the adoption of a message broker.

Conclusion

From the early days of the first mainframes to the present day with the cloud, systems have grown in complexity, and software architects have found new ways to manage that complexity. When possible, reducing complexity without sacrificing capability is the best course of action. Twelve-Factor Apps have great advice on how to do that. With EIP, reactive systems and eventual consistency, you might think that you are reducing complexity when you are actually just pushing it around to another part of the system. Sometimes you just have to hide the complexity, and there are plenty of model-based generators, frameworks and connectors to help you do that, but there are both advantages and disadvantages to that approach. As we learned with Twelve-Factor Apps and reactive systems, nothing increases complexity like statefulness, so be very wary and conservative when adding or increasing statefulness in your applications.

However they reduce, hide, or redistribute it, software architects will continue to manage complexity in order to keep delivering quality software more quickly in a world with ever-increasing demands for more functionality, capability, capacity, and efficiency.

About the Author

Glenn Engstrand is a software architect at Adobe, Inc. His focus is working with engineers to deliver scalable, server-side, Twelve-Factor-compliant application architectures. Engstrand was a breakout speaker at Adobe's internal Advertising Cloud developer's conference in 2018 and 2017, and at the 2012 Lucene Revolution conference in Boston. He specializes in breaking monolithic applications up into microservices and in deep integration with Real-Time Communications infrastructure.
 


Community comments

  • Examples of statefulness

    by kimsia sim,

    Thanks for the great write up and pro and con the various approaches.

    Simplified a lot of concepts that have confounded me for a while.

    Can I ask that you give some examples of statefulness? Perhaps even an example where adding statefulness is unavoidable and one where adding is avoidable.

  • Re: Examples of statefulness

    by Glenn Engstrand,

    That is a good question. Thanks for asking. For the set up to this example of statefulness, review paragraph 8 from the section on Twelve Factor Apps (titled “On the Origin of Computer Systems”) in this blog. That paragraph introduces factor 6 then goes on to explain a common scenario where access to a relational database is fronted by a remote in-memory key value store acting as a cache.

    Let’s say that you have a service that surfaces a synchronous API that follows this pattern. So far, the service itself is stateless so all is well and good in terms of managing complexity. You start getting complaints about API calls, that return a lot of data, being slow. You profile those calls to learn that most of the time is spent deserializing those large object hierarchies from the external cache cluster. You might be tempted to solve this problem by caching these large object hierarchies locally in the same JVM as the service itself. That means no deserialization blocking the thread that is servicing each API call. But wait, what if the next request goes to a different node? You use something like Hazelcast to asynchronously replicate all of these locally cached objects to the other nodes in the cluster. That seems like a simple enough solution but now your service is stateful.

    After that, you start fielding complaints about API calls returning old, out-of-date data whenever the network partitions or the number of nodes in the cluster for the service gets large. You will also notice a higher cache miss ratio putting more pressure on your database. These are all solvable problems but the solutions end up adding a lot of complexity to your application code.

    When would something like this be unavoidable? If your APIs had real-time constraints that mandate that they return the data in so short a period of time that any calls to external caches or databases would take too long. Everything has to already be local to each node in the cluster by the time that the request gets received.

  • Re: Examples of statefulness

    by kimsia sim,

    Thanks for the detailed answer, Glenn

    There’s a lot of “what” and “how to” Information these days regarding microservices which is why your pro and con analysis is so useful by adding nuances.

    I have a “when” question that requires similar nuanced answer.

    Suppose a small team of 1-4 start off with a simple CRUD app with a traditional database like MySQL or Postgres. This is considered to be monolithic app.

    *When* would it be right to start moving towards microservices? And what would be the first move?

    Would it be when the traffic is “high”? How would you define “high”?

    Would it be when certain tasks start to become Long running?

    Eg of Long running tasks such as a task that requires updating hundreds of data rows and updating aggregation stats like total, average, and count. Or a task that requires exporting out hundreds of thousands of data rows with multiple joins to an excel file that typically lasts minutes per request.

    Would the first move be adding queue?

    From what i see most times, people don’t immediately jump into microservices architecture they kind of evolve their way there as complexity increases over time.

    I deliberately chose the above scenario because they are probably more small teams out there than large ones.

    Thanks

  • Re: Examples of statefulness

    by Glenn Engstrand,

    Another excellent question. You are correct in that consensus wisdom says to start with a monolith first then move to microservices once the pain of the monolith (see paragraph 5 under the section titled “March of the Frameworks” above) becomes too great.

    Why start with a monolith? In a greenfield project, there is a lot of discovery in the problem domain. Otherwise, there would be no business opportunity for starting the project. Nobody needs yet another content management system. That means you will need to do a lot of refactoring over time, including how the APIs are defined. That refactoring is much easier to accomplish when all the code is in the same project repository. That is actually one of the most compelling arguments for adopting the particular software development strategy known as monorepo.

    When should you transition to microservices? It is not about high load as a well designed monolith can be scaled out. It is not about long running requests as a monolith can be reactive. It is more about maintaining feature velocity when you have highly disruptive releases. If your application is too big to release with confidence, then split it up so that each separately releasable microservice can be released with confidence. See the section in this blog From Monolith to Microservices titled "If It Isn’t Broken, Then Don’t Fix It"

  • Thanks

    by nano_sprite IQ,

    A refreshing read. Was great to have an impartial exploration of the path that led us to where we are and the pros and cons of the different architectural approaches. Look forward to reading more of your thoughts.

  • Re: Examples of statefulness

    by kimsia sim,

    I like you answer my “when” question based on feature velocity.

    It’s unexpected and an education for me.

    I guess my follow up question to that is do you recommend a quantitative way to say this is roughly when microservices needs to be considered?

    Qualitative is fine as well. As clear a signal as possible without being too far wrong is ideal

  • Re: Examples of statefulness

    by kimsia sim,

    Apologies I meant to ask other than your original criteria of 2 yes to the questions listed in the “if it isn’t broken don’t fix it section”

    would you change that criteria? Would you add more quantitative measures?

  • Re: Thanks

    by Glenn Engstrand,

    I'm glad that you liked it.

  • Re: Examples of statefulness

    by Glenn Engstrand,

    I believe that there is some merit to view your monolith as tech debt once it gets big and complex enough to slow feature velocity. My question for you is this. What is the quantitative trigger for paying down tech debt?

  • Re: Examples of statefulness

    by kimsia sim,

    Just realized that my reply didn't go thru yesterday. In case you get duplicated replies, just pick one.

    For me I guess the quantitative trigger to pay down tech debt is

    1. when something (e.g. add a CRUD of model M1 meant for users of role R1) that used to be done in X amount of time now takes 2X time or even more
    2. when situations like 1 is not isolated to 1 case, but many cases (e.g. another CRUD of another model M2 for users of role R1 or another CRUD of same model M1 but for users of role R2)

    When both situations 1 & 2 occur, I guess that's time to look at architecture or paying down technical debt or both.

    Am I making sense?

  • Re: Examples of statefulness

    by kimsia sim,

    Another possible quantitative trigger I just realize is slow build time.

    Does that count?

  • There's more wrong with reactive systems

    by Bas Groot,

    I fully support the idea that complexity can only move, grow or explode, and hardly be reduced. Reactive UI frameworks may not be of the "move complexity" but rather the "explosive" category, due to 2 phenomena not discussed here.

    1) When asynchronous code gets more complex, the number of possible pathways to inconsistent and wrong views that do not update anymore or even stop working, grows exponentially. Factors like user error, client bugs and unforeseen sequences makes the area of potential failures a lot larger. You need a lot of error-catching to cover the worst. Full reloads are a catch-all, but reactive frameworks trade off short code and snappy execution for a long initialization phase, making a reload way too slow.

    2) Reactive frameworks litter a UI data structure with "handlers". For a developer writing reactive code, it's hard to predict how many handlers you get and where these end up. It can multiply into very wasteful, highly recursive execution while code looks simple and clean.

    In bigger UIs displaying large amounts of data (like web applications for professional users), I've seen the combination of these two leading to UIs that neither want to perform nor want to become reliable, no matter what you do.

  • Re: Examples of statefulness

    by Glenn Engstrand,

    Indeed. Another way of saying the same thing is this. Consider splitting up the monolith into microservices (or other forms of paying down tech debt) when enough people are asking the following question. Why does it take so long to code and debug even the simplest of changes?

  • Re: Examples of statefulness

    by Glenn Engstrand,

    Slow build time counts towards determining when to break up the monolith into microservices. Here is another. When it takes a significant portion of the sprint to vet the next release.

  • Notes on transition to microservices

    by kimsia sim,

    > When should you transition to microservices?

    Since we on the topic of transitioning to microservices, i thought of sharing this article which talks about the how. martinfowler.com/articles/break-monolith-into-m...

    Specifically what to decouple and how to decouple.

  • Great article overall!

    by Anit Shrestha Manandhar,

    I did not expect the article to move towards architectures from the perspective of a model, but it was interesting to see the transition and the relation Glenn had created. Though still not crystal clear in their relation, overall it was an excellent read! Maybe another go-through might help for me.

    Thank you.
