
A Skeptic’s Guide to Software Architecture Decisions

Key Takeaways

  • Clearly articulating and testing assumptions helps teams make better decisions by gathering better information.
  • Every requirement, including Quality Attribute Requirements (QARs) that drive the architectural design, represents a hypothesis about value. One of our goals, when taking a skeptical approach, is to make these hypotheses explicit and to consciously design experiments that specifically test the value of the requirements.
  • Entire requirements need not be implemented to test assumptions; selective implementation of critical parts of an application can help teams test assumptions at a minimal cost.
  • Greater awareness of the implicit architectural decisions a team is making, and forcing those decisions to be made explicitly, can help development teams make better decisions using the empirical data they gain from their sprints/iterations.
  • Being skeptical is not a sign of disrespect; it’s a sign of sincere respect for the complexity of delivering excellent outcomes to customers in a chaotic world.

It is popular to believe that attitude is the key to success, and there is a lot of truth in that. As Henry Ford observed,

"Whether you think you can, or you think you can’t - you’re right."

If you don’t believe you can do something, and most certainly if you don’t try, you will likely never achieve it; that much seems obvious.

And yet, mere belief in oneself only goes so far; preparation and planning play roles, too.

So does doubt of a specific kind, and by this we mean skepticism. More specifically, philosophical skepticism is the belief that knowing the truth about something is impossible - at least, in the case of software, without evidence.

Skepticism is, in our experience, an architectural superpower that helps you see through false assumptions before you have followed them too far - before they have cost you too much time and created so much work that you’ll never completely recover.

But more about that in a moment.

Figure 1: Phrases that tell you that you may have assumptions you need to test

"Excessive positivism" can lead to blindness

The power of positive thinking is real, and yet, when taken too far, it can result in an inability, or even unwillingness, to see outcomes that don’t conform to rosy expectations. More than once, we’ve encountered managers who, when confronted with data that showed a favorite feature of an important stakeholder had no value to customers, ordered the development team to suppress that information so the stakeholder would not "look bad."

Even when executives are not telling teams to do things the team suspects are wrong, pressure to show positive results can discourage teams from asking questions that might indicate they are not on the right track. Or the teams may experience confirmation bias and ignore data that does not reinforce the decision they "know" to be correct, dismissing data as "noise."

Such might be the case when a team is trying to understand why a much-requested feature has not been embraced by customers, even weeks after its release. The team may believe the feature is hard to find, perhaps requiring a UI redesign. However, the problem may be that the new feature does not solve the user’s problem, and they shouldn’t discard that explanation until they investigate further. If they simply assume that the feature is valuable, they may embark on an expensive, fruitless UI redesign when that’s not the problem.

The Titanic had received warnings from other ships that there was drifting ice in the area where she was sailing, but Captain Edward Smith decided to keep steaming at full speed, which was standard practice at the time. Had he questioned the beliefs that ice posed little danger to large vessels and that the Titanic was unsinkable, the deadliest peacetime sinking of an ocean liner might have been averted.

Skepticism is not the same as negativism

People often use the word "skepticism" as a synonym for "negativism," with negativism defined as "an attitude of mind marked by skepticism, especially about nearly everything affirmed by others." A skeptical attitude can drift into negativism, especially when amplified by cynicism. Still, we think skepticism has its place when dealing with complex challenges, i.e., those in which there are more unknowns than knowns.

What we mean when we use the term skepticism is the philosophical kind of skepticism we mentioned earlier. When dealing with complex challenges, especially early on, we know very little - not only about the solution but even, and sometimes especially, about the problem. In these situations, philosophical skepticism can help us see through mistaken assumptions and overcome our cognitive biases to discover better solutions to real problems. This is why we think of skepticism as an architectural superpower: it’s like x-ray vision that helps us see the world as it is, not as we wish it to be.

The concept of skepticism has similar roots to empiricism, which forms the foundation of agile approaches; the word skepticism comes from the Greek word σκέψη (sképsi̱), which means thought or investigation, and it’s the investigation part that is important to software architecture.

What does skeptical software architecture look like?

As explained in a previous article, architecting and designing modern software applications is fundamentally exploratory. Teams building today’s applications face unprecedented technical challenges every day while also providing customers with new ways of solving new and different problems. This continuous exploration means that the architectural design can’t be determined up-front.

Software architectural designs are driven by Quality Attribute Requirements (QARs), not by functional requirements. Not considering QARs in the initial iterations often creates issues when the software system is deployed beyond an initial pilot phase with a small number of users. Every requirement, including QARs that drive the architectural design, represents a hypothesis about value. One of our goals when taking a skeptical approach is to make these hypotheses explicit and to consciously design experiments that specifically test the value of the requirements.

Unfortunately, QARs are often not well-defined. Vague requirements such as "the system must be fast" or "the system must be scalable" are not very helpful in designing architecture. Neither are over-inflated scalability QARs such as "the new system must be able to handle at least 10 times the current transaction volumes," as those requirements are often based on unrealistic expectations from the system stakeholders. Being skeptical about such requirements is important, as they may lead to over-engineering the system and building unnecessary capabilities. It is also important to validate the value of requirements; research shows that most requirements are not actually useful, and sorting the useful ones from the rest requires experimentation.

Teams need not implement an entire requirement to determine its value; it may be sufficient to build just enough of the system to run a test that validates (or refutes) the hypothesis behind the requirement. Similarly, a team does not have to build the entire solution to assess the critical assumptions on which the solution rests. Merely identifying assumptions isn’t enough; teams need to test them, which means including instrumentation in whatever they build so the assumptions can be measured. Finally, it is important that the team run tests that could actually fail, or that can be used to determine when failure would occur.
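
As a hypothetical illustration (the endpoint, threshold, and load level below are our own assumptions, not drawn from a real project), a vague QAR such as "the system must be fast" can be restated as a measurable hypothesis and checked with a small, instrumented test that is allowed to fail:

```python
# Minimal sketch: restate "the system must be fast" as a testable hypothesis, e.g.
# "95th percentile response time stays under 300 ms with 50 concurrent users."
# The URL, threshold, and load level are illustrative assumptions, not real requirements.
import time
import statistics
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

TARGET_URL = "http://localhost:8080/quotes"   # hypothetical endpoint of the slice under test
CONCURRENT_USERS = 50
REQUESTS_PER_USER = 20
P95_THRESHOLD_MS = 300.0

def timed_request(_):
    start = time.perf_counter()
    with urlopen(TARGET_URL) as response:
        response.read()
    return (time.perf_counter() - start) * 1000.0  # latency in milliseconds

def run_experiment():
    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        latencies = list(pool.map(timed_request,
                                  range(CONCURRENT_USERS * REQUESTS_PER_USER)))
    p95 = statistics.quantiles(latencies, n=100)[94]  # 95th percentile
    print(f"p95 latency: {p95:.1f} ms (threshold {P95_THRESHOLD_MS} ms)")
    # The test is designed so that it *can* fail: a result above the threshold
    # refutes the hypothesis instead of being explained away as "noise".
    assert p95 <= P95_THRESHOLD_MS, "Hypothesis refuted: the system is not fast enough"

if __name__ == "__main__":
    run_experiment()
```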

Greater awareness of the implicit architectural decisions a team is making, and forcing those decisions to be made explicitly, can help development teams make better, more informed decisions using the empirical data they gain from their sprints/iterations. Past experience alone is rarely enough; teams must often find new ways of satisfying quality requirements.
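
One lightweight way to make a decision explicit is to record it together with the assumptions it rests on and the evidence that would confirm or refute them, in the spirit of an architecture decision record. The structure and the example values below are a hypothetical sketch, not a prescribed format:

```python
# A minimal, hypothetical structure for capturing an architectural decision
# together with the assumptions behind it and the evidence that would test them.
from dataclasses import dataclass
from typing import List

@dataclass
class ArchitecturalDecision:
    title: str
    context: str                # the problem the decision addresses
    decision: str               # what was decided
    assumptions: List[str]      # hypotheses the decision depends on
    evidence_needed: List[str]  # what must be observed to confirm or refute them
    status: str = "proposed"    # e.g. proposed -> tested -> accepted or rejected

# Example values are illustrative only.
decision = ArchitecturalDecision(
    title="Use a message queue between ordering and fulfillment",
    context="Fulfillment spikes must not slow down order capture.",
    decision="Decouple the two services with an asynchronous queue.",
    assumptions=["Peak order rate stays below 200 requests per second",
                 "Customers tolerate up to 30 seconds of fulfillment delay"],
    evidence_needed=["Measured peak order rate from the pilot release",
                     "Customer feedback on delayed fulfillment in the pilot"],
)
print(decision.title, "-", decision.status)
```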

Skepticism aids decision-making

Teams sometimes experience analysis paralysis, which makes them afraid to make a decision. Skepticism can be a cause, but it can also point to a way out: if the team accepts that there is no way to know whether a decision is right or wrong without experimentation, it can short-circuit the paralysis by identifying alternatives and then testing them empirically. Analysis paralysis is especially problematic when teams try to assess a decision without any information: with no new information to guide them, they keep passing over the same alternatives.

When it comes to decisions about the solution, the only useful data comes from executing code; everything else is conjecture. Team members familiar with a specific technology may feel that a particular technical decision is right, but the only way to be sure is to write some code and test the assumptions behind it. Writing code is the only way to settle an argument between architects; no amount of belief can confirm or reject an assumption.

Skepticism in practice

Applying skepticism should be as simple as identifying assumptions, asking questions like "what evidence will we need to see to know that is true?" and then building measurements that let the team test those assumptions. In practice, however, it appears to be hard, based on what we have observed. Consider the following example:

Some senior technical team members at an insurance company had been experimenting with rules engine technology. They seemed to find many uses for it, replacing "if-then-else" code in applications. With a rules engine, they found, they could make changes in application behavior without having to modify, compile, and deploy code. Technical staff might not need to be involved at all; subject matter experts could maintain the rules.
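
To illustrate the appeal (the insurance company’s actual rules are not public, so the rules and fields below are hypothetical), a rules engine externalizes "if-then-else" logic as data that subject matter experts could edit without a code change and redeployment:

```python
# Minimal sketch of the idea behind a rules engine: the "if-then-else" logic is
# expressed as data that subject matter experts could maintain, instead of code
# that developers must modify, compile, and deploy. Rules and fields are hypothetical.

# Hard-coded version: every change requires a code change and a deployment.
def premium_surcharge_hardcoded(applicant):
    if applicant["age"] < 25:
        return 0.15
    elif applicant["prior_claims"] > 2:
        return 0.10
    else:
        return 0.0

# Rule-driven version: the same logic as editable data, evaluated at runtime.
UNDERWRITING_RULES = [
    {"when": lambda a: a["age"] < 25,         "surcharge": 0.15},
    {"when": lambda a: a["prior_claims"] > 2, "surcharge": 0.10},
]

def premium_surcharge_rule_driven(applicant, rules=UNDERWRITING_RULES):
    for rule in rules:
        if rule["when"](applicant):
            return rule["surcharge"]
    return 0.0

applicant = {"age": 22, "prior_claims": 0}
assert premium_surcharge_hardcoded(applicant) == premium_surcharge_rule_driven(applicant)
```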

After early successes with underwriting rules that controlled policy issuance and pricing, largely behind-the-scenes work, they stumbled into an area that seemed more challenging. Some rule changes altered not only program logic but also the look and behavior of the user interface (UI). After some consideration, the team observed that UI behavior was just a different kind of application logic, and they could develop rules that would define the UI behavior.

As a result, they embarked on an ambitious redesign of the application’s UI, writing rules that expressed look, feel, and behavior. Technically, it was achievable, but the impacts were considerable: the application was slow, and managing the UI rules demanded developer-level expertise. Instead of achieving flexibility, the development organization had created a more complex problem: coding UI behavior in an environment that lacked good development and testing tools (a rules engine). When there was a problem, the system’s behavior was difficult to trace.

This example follows the old maxim, "if all you have is a hammer, everything looks like a nail." Rules engines can be effective for some parts of the problem, but the metaphor needs to match the problem. This would have become quickly evident with a more skeptical approach that identified Quality Attribute Requirements related to maintainability and then tested the proposed solution against them. Rules engines have their place, but using one for UI design went beyond the intended purpose of such tools.

A non-decision example: The assumption of scalability

Scalability has traditionally not been considered a top QAR for a software system, but this perception has changed during the last few years, perhaps because of the focus of large e-commerce and social media companies on scalability. Scalability can be thought of as the ability of a system to handle an increased (or decreased) workload by increasing (or decreasing) the cost of the system. Calling a system "scalable" is a common oversimplification: scalability is a multidimensional concept that needs to be qualified, as it may refer to application scalability, data scalability, or infrastructure scalability.

Surprisingly, software systems are often assumed to be scalable, especially if hosted on a commercial cloud. As stated in Continuous Architecture in Practice [1]:

"Cloud computing offers the promise of allowing an application to handle unexpected workloads at an affordable cost without any noticeable disruption in service to the application’s customers, and as such, cloud computing is very important to scaling."

However, only well-designed systems scale well in a cloud environment. In other words, porting a poorly designed system to a cloud environment is not likely to make it scale well, especially if it isn’t designed to be horizontally scalable.

For example, a team developing a Minimum Viable Product (MVP) for a new software system often focuses on delivering the system as quickly as possible and doesn’t concern itself with what may happen after the MVP becomes successful. If the system’s user base grows rapidly and unexpectedly after the initial release, the system might need to scale quickly beyond the team’s original workload assumptions. The team might assume that hosting the system on a commercial cloud platform makes scalability the cloud provider’s problem! Commercial cloud platforms can provide effective scaling, but only if the application is designed to take advantage of certain cloud platform features. Even then, cloud platforms do not solve all scaling problems, such as when resource bottlenecks are built into the design.

Scalability is sometimes confused with performance, which, unlike scalability, is about the software system’s ability to meet its timing requirements and is generally easier to test. If the system’s performance is adequate in the initial release, the team may assume that the system will be able to cope with increased workloads. Unfortunately, that’s rarely the case if scalability wasn’t included as one of the top QARs during the architectural design.
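
The difference can be made concrete with a small, hypothetical measurement (the simulated workload and shared-lock bottleneck below are our own illustration): performance asks whether a single request meets its timing requirement, while scalability asks what happens to those timings as the workload grows.

```python
# Hypothetical sketch: adequate single-user performance does not imply scalability.
# process_request simulates work that contends on a shared resource (one lock),
# i.e., a bottleneck "built into the design".
import time
import threading
from concurrent.futures import ThreadPoolExecutor

shared_resource = threading.Lock()

def process_request():
    start = time.perf_counter()
    with shared_resource:      # serialized section: the built-in bottleneck
        time.sleep(0.01)       # simulated work while holding the shared resource
    return time.perf_counter() - start

def average_latency_ms(concurrent_users):
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(lambda _: process_request(),
                                  range(concurrent_users * 10)))
    return 1000.0 * sum(latencies) / len(latencies)

if __name__ == "__main__":
    for users in (1, 10, 50):
        print(f"{users:3d} concurrent users -> avg latency {average_latency_ms(users):7.1f} ms")
    # With 1 user the latency looks fine (performance is adequate); it degrades as the
    # workload grows because requests queue on the lock (a scalability problem).
```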

Of course, scalability requirements are often poorly specified, especially for a system implementing an MVP, when no one can usually predict how many customers will be interested in the new product. But that should not be a reason to make a non-decision and ignore scalability. On the other hand, overbuilding the system for scalability "just in case" and creating an architectural design based on vastly inflated workload estimates isn’t a good approach either.

The systems built by Google, Amazon, Facebook, and Netflix, for example, have outstanding scaling capabilities. However, the design tactics those companies use do not necessarily apply to companies dealing with different workload characteristics. We should be careful about rushing to adopt scalability tactics such as microservice-based architectures, asynchronous communications, database sharding, and complex distributed systems before understanding the implications of these tactics and documenting QARs and assumptions explicitly as a series of architectural decisions.

Being skeptical about these estimates prevents the team from over-engineering the system and building unnecessary capabilities.

Using skepticism to test proposed solutions

Another organization was able to use skepticism to reach a more positive result. Some team members had limited experience with an open-source framework and felt it could be used to build a manufacturer’s new worldwide warranty management system. Each country had slightly different legal constraints and business requirements, but the core process was the same across the globe. The team felt that the open-source framework might help them develop the system’s basic functionality so that customizing for local country needs would be easier, cheaper, and faster than developing the base functionality from scratch. Limited prototypes using the framework showed promise but were not extensive enough to evaluate its overall suitability.

Rather than jumping ahead and developing the entire system, they decided to apply a skeptical approach: they carved off a slice of functionality representative of the system’s overall capabilities. Over a month, they used the framework to build and test this slice, even deploying it to a smaller business unit for internal testing and feedback. The cost of developing this "slice" was nearly USD 500K, but the team felt that, since the slice was their "Minimum Viable Product" and tested their proposed "Minimum Viable Architecture," the expenditure was reasonable.

In doing so, they discovered a couple of things:

  • They were able to build the desired functionality, and it did work in a production-like environment, but ...
  • The open-source framework was not as flexible and productivity-enhancing as they hoped it would be. In fact, the framework made the code harder to maintain, not easier, because the framework did not quite match the model required for the underlying workflows.

As a result of the experiment, the organization decided not to use the open-source framework in the development effort and instead to build its own framework that better fit the problem at hand. Had they simply developed the entire system using the external framework, the projected development cost would have been USD 20M. Discovering the poor fit late in the development cycle would have resulted in substantial rework to remove the external framework and rebuild the system. In projects we have observed, this sometimes means starting over and losing the entire original investment.

(How to) Practice tactful skepticism

Skepticism can be challenging, and not just in a technical sense. People in an organization may not be accustomed to having their statements questioned, no matter the positive intent behind the challenge. To unleash the power of skepticism, a team has to be open to validating assertions, no matter who makes them.

There are ways to do this that soften what might otherwise feel like a personal challenge. The team might first establish a principle that all assumptions need to be tested, no matter their source. When a team agrees that this is part of the "way of working," no one’s specific idea is being challenged. Later, when specific assertions or assumptions need testing, doing so is just part of how the team has decided to work.

Another way is for the team to ask itself, "which of our assumptions, if they turn out not to be true, might prevent us from achieving what we need to achieve?" This asks a general question of all assumptions and does not point to a particular person’s assumption or assertion.

Another alternative is to consider the maxim, "facts are friendly," meaning that more or better information will help us achieve our goals, not hurt us. And establishing a principle in the way of working that asserts "there are no bad ideas, just incomplete information" can also help to depersonalize challenges to assumptions and assertions.

Interestingly, the Catholic Church made skepticism an integral part of its canonization process in the 16th century by establishing the official position of "Promoter of the Faith" (better known as the Devil’s Advocate). The Promoter of the Faith was to argue against the candidate’s canonization by taking a skeptical view of their character and actions. The role has since been greatly diminished, but tactful skepticism remains an integral part of the canonization process. For example, critics of a candidate may be interviewed by the Church as part of the process.

Conclusion

Skepticism is a valuable counter to an unreasonably rosy picture of the world that only looks at the best cases for outcomes. While a positive attitude is essential, it’s often best to "hope for the best and plan for the worst."

In practical terms, applying skepticism often means making space for team members to question assumptions and assertions by looking for ways to prove (or disprove) the team’s assumptions. Skepticism means more than simply cataloging assumptions; it means actively investigating whether those assumptions are valid.

Teams need to recognize that every requirement, including Quality Attribute Requirements (QARs) that drive the architectural design, represents a hypothesis about value. One of our goals when taking a skeptical approach is to make these hypotheses explicit and to consciously design experiments that specifically test the value of the requirements.

Being skeptical is not a sign of disrespect; it’s actually a sign of sincere respect for the complexity of delivering excellent outcomes to customers in a chaotic world. It means taking seriously the team’s commitment to seek desirable outcomes expressed in product goals and QARs. Skepticism helps teams to question assumptions and hidden biases in a positive way.

Thoughtfully employed, skepticism can be an essential part of every software development team’s toolkit. It can help teams make better decisions earlier in the development cycle and at a much lower cost.

Endnotes

[1] Murat Erder, Pierre Pureur, and Eoin Woods, Continuous Architecture in Practice (Addison-Wesley, 2021)
