If Twitter Doesn’t Have a Staging Environment, Should Anyone?

Key Takeaways

  • Context is what informs developer teams, so they need visibility into usage, end user, and regulatory requirements. The approach needs to align with the context or they will over- or under-engineer their solutions.
  • If a team is operating out of sync with the broader context of the project, quality and developer experience will suffer.
  • Developers want to bring value to end users, and if they don’t have context, there will be friction and business outcomes will suffer.
  • Delivering at the right speed and quality for the context doesn’t always mean delivering at maximum speed and quality.

Twitter has been in the news a lot recently. However, if we cast our minds back to September, the company made the news for a different reason. At the time, a Twitter employee testified before Congress that the social media platform does not maintain a software staging environment but pushes updates straight to production. This raised more than a few eyebrows in the software development community. Some questioned how the company could be so lax as to release without rigorous testing. Others took the opposing view: if you roll out new capabilities methodically, use feature flags, have automated rollback, and employ strong monitoring and tracing to identify issues quickly, then pushing straight to production is not just acceptable but better. As is so often the case, neither viewpoint was entirely right.

There are times when rigorous testing is non-negotiable, and other times when pushing lightly tested code is OK. Earlier this year, a DeFi developer working in a crypto business confessed that, as a consequence of not testing rigorously, he accidentally caused the loss of $661,000, essentially at the stroke of a key. Eschewing a rigorous process proved disastrous in this case.

The Software Development Lifecycle (SDLC) wars have raged for years. Is Waterfall really over? Is it Agile or bust? Or perhaps Agile is over if you believe what some in high tech say. The debate will continue, but the truth is, there is no one true development methodology that fits all teams, companies, and circumstances. So, how do you determine which is best for your project and your team? 

To answer that, we must consider the increasingly broad role developers are being asked to take on in some organizations. The formerly well-defined boundaries between product management, development, QA and testing, security, and cloud operations are no more as countless tasks are shifting to the developer. Shift-left and shift-right motions mean that developers have responsibilities that persist throughout the SDLC, all the way from feature definition to production. The optimal SDLC is unquestionably influenced by the developer’s role, and that varies by company, team, and project. All this considered, how should development teams determine the best methodology for their team and their company? 

Context Is the Missing Link

Often, the key to deciding the best development methodology for your team to use is the business context of your project. Context includes considering who the end user is, application usage trends, and legal and regulatory requirements, among other factors. Good developers want to bring value to end users; they want their work to matter. Giving them visibility into the context of the project will help them to better deliver this value – for the end user, the project, and the company. Conversely, without context, there will be uncertainty and friction, and business outcomes are likely to suffer. 

Context is not even necessarily defined at the company level; it must be defined at a project level. Teams today need to calibrate their approaches to the different contexts of disparate projects. If a team is operating out of sync with the broader context of the project, there will be failings. Teams will unknowingly over-engineer or under-engineer the solution. They might hold back code that’s ready enough, or release it when further testing is demanded by the context. Therefore, we need to educate teams about context and trust them to adopt the appropriate processes and toolkits for the task. 

The complexities associated with determining the best development methodology to use can be simplified into three variables – often working against each other – directly related to the business context of your project: 1) level of investment in quality in both pre- and post-release, 2) velocity, and 3) developer experience. 

Teams need to deliver at the optimum quality and speed for the context, which does not always mean maximum quality and speed. If the balance between velocity and quality is appropriate for the project, the developer experience will improve. Where the two are out of sync with the project context, the developer experience will deteriorate, bringing burnout, frustration, and cut corners.

  1. Level of Investment in Quality


Heavy investment in quality can be made prior to release, or you might move faster through QA and invest more heavily in testing in production. Consider the level of investment in quality within the context of the project. Any financial projects – an investment banking app, trading platform, or crypto project – should have the highest levels of rigor before any updates are released. The potential cost of error is simply too high. Those projects will likely have a high investment in production testing as well, but unlike some other projects, we cannot just push them into production and see how it goes.

On the other hand, two people in a college dorm room working on an early-stage Y Combinator app generally do not have a lot to invest in pre-production testing. They might do some exploratory QA, push to production, and wait to hear from users that something isn’t working. And many times, users are extremely forgiving of early-stage start-up processes. 

In the case of Twitter, a bug that escapes to a small portion of users has less serious consequences than, say, a defect in code a healthcare provider releases to support HIPAA compliance, so the company may choose to push directly to production.

  2. Velocity

How quickly you should move to production again depends on business context. If you push too hard on velocity, quality will suffer. For financial projects, a release should perhaps take as long as it takes, because quality cannot be compromised. Where the stakes are low, as for some early start-up projects, major features can be released quickly. Where the importance of quality lies somewhere between those extremes (a CRM feature, for example), quick iterations with multivariate testing and progressive delivery are solid choices. There are times when it’s OK to push good code and keep moving. It’s all about the context.
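One common building block for that middle ground is a percentage-based feature flag. The following is a minimal sketch rather than any particular vendor's implementation: the function names, the flag name `new-crm-dashboard`, and the user IDs are all hypothetical. It hashes each user into a stable bucket so that raising the rollout percentage only ever adds users and never flips existing ones back and forth.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the current rollout percentage."""
    return rollout_bucket(flag_name, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical user population, used only to illustrate the rollout steps.
    users = [f"user-{i}" for i in range(1000)]
    for percent in (0, 5, 50, 100):
        enabled = sum(is_enabled("new-crm-dashboard", u, percent) for u in users)
        print(f"{percent:>3}% rollout: {enabled} of {len(users)} users see the feature")
```

How fast the percentage is raised is exactly the kind of decision the business context should drive: a start-up might jump straight to 100, while a regulated product might hold at a small share for days.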

  3. Developer Experience

Developer experience is an under-attended element of the modern software development process. Developers are happiest when quality and velocity are in sync with context. If a team is operating out of sync with the broader context of the project, developer experience will suffer and, in turn, so will the project. Visibility into the business context is what lets developers set the right levels of quality and velocity, putting them on the right track for the project. Providing that visibility also shows empathy for the development team, which contributes significantly to job satisfaction and team performance. Teams should carefully manage velocity to ensure a good developer experience and avoid burnout.

How Do You Know If Your Developer Approach Is Aligned with Your Business Context?

One way to test whether your team’s approach fits the context of the project is to ask a series of questions about your process, all relating to quality, velocity, and developer experience.

Let’s take a simple hypothetical project where quality is of utmost importance. In the pre-production stage, ask to what degree you are carrying out the steps below, and how appropriate those steps are for your business context:

Are the feature and WIP well understood? 

It’s difficult to imagine a scenario where they shouldn’t be, but depending on where a product is in its lifecycle, the team may be operating in a highly experimental, iterative process where immediate user feedback is more important than deep precision of specifications.

Are we taking a design-first API approach? 

I’m a proponent of design-first full stop, but if the API is going to be used by a broad audience of colleagues and/or external developers, taking a design-first approach is even more important. And, if the API is to be used by external developers, it needs to be well documented on a portal, which in turn should be connected to the core design so they stay in sync.

Does the application behave functionally correctly? How broadly should we test it? 

There are numerous angles that influence how comprehensive functional testing should be. These include the level of unit testing, the urgency to release, and the cost of a bug escaping. If those factors aren’t consciously considered, then velocity, quality or both will miss the target.

Does the app function correctly on all the operating systems, browsers, and devices we care about? 

Browsers behave differently and display UI components differently. Mobile operating systems are updated frequently and contain diverse changes. Different devices deploy disparate low-level libraries that will result in different end-user experiences. If the cost of a bug is high or if the user base is global, meaning your user base will run the gamut of mobile device manufacturers, the importance of cross-browser, cross-OS, and real-device testing increases.

Does the app function correctly at a system-wide level? 

As the use of microservices and serverless functions expands, end-to-end testing increases in complexity. Striking the right balance between end-to-end testing, contract testing, and progressive delivery with rapid rollback capabilities will depend on business context. Embedded medical device software cannot be easily rolled back, of course, while cloud-native deployments with a low cost of escaped issues support progressive delivery more naturally.
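To make "progressive delivery with rapid rollback" concrete, here is a minimal sketch of the control loop, under stated assumptions: the traffic stages, the error-rate threshold, and the `error_rate_at` callback are all hypothetical stand-ins for whatever your deployment and monitoring tools actually provide. The new version receives a growing share of traffic, and the first stage whose observed error rate exceeds the threshold sends traffic back to zero.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RolloutResult:
    completed: bool     # True if every stage stayed healthy
    final_percent: int  # share of traffic left on the new version
    reason: str


def progressive_rollout(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> RolloutResult:
    """Walk through traffic stages; roll back to 0% at the first unhealthy one."""
    current = 0
    for percent in stages:
        # In a real system this would query monitoring after shifting traffic.
        observed = error_rate_at(percent)
        if observed > max_error_rate:
            reason = (
                f"rolled back at {percent}%: error rate {observed:.2%} "
                f"exceeds limit {max_error_rate:.2%}"
            )
            return RolloutResult(False, 0, reason)
        current = percent
    return RolloutResult(True, current, "all stages healthy")


if __name__ == "__main__":
    # Hypothetical monitor: the new version degrades once it takes half the traffic.
    def simulated_monitor(percent: int) -> float:
        return 0.20 if percent >= 50 else 0.001

    print(progressive_rollout([1, 10, 50, 100], simulated_monitor, max_error_rate=0.01))
```

The business context shows up in the parameters: a low-stakes cloud service can use few stages and a tolerant threshold, while software that cannot be rolled back at all has no use for this loop and must invest before release instead.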

Does the app function correctly with many simultaneous users? 

Business context will tell us what “many” means. Is it hundreds of millions of social media users, or tens of thousands of visits to a credit union website? What is the peak, and will your code and infrastructure hold up?
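As a rough illustration of asking that question in code, the sketch below fires a batch of concurrent calls at a handler and reports failures and an approximate 95th-percentile latency. It is an assumption-laden toy, not a substitute for a real load-testing tool: `fake_handler` is a hypothetical stand-in for a request to your application, and the user count is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def timed_call(handler: Callable[[int], None], user_id: int) -> Tuple[bool, float]:
    """Run one request and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        handler(user_id)
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start


def run_load_check(
    handler: Callable[[int], None],
    simultaneous_users: int,
    max_workers: int = 50,
) -> Dict[str, float]:
    """Issue one request per simulated user and summarize the results."""
    if simultaneous_users < 1:
        raise ValueError("simultaneous_users must be at least 1")
    # Concurrency is capped at max_workers, so very large user counts
    # are processed in overlapping waves rather than all at once.
    workers = min(max_workers, simultaneous_users)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(
            pool.map(lambda uid: timed_call(handler, uid), range(simultaneous_users))
        )
    latencies = sorted(elapsed for _, elapsed in results)
    failures = sum(1 for ok, _ in results if not ok)
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "requests": len(results),
        "failures": failures,
        "p95_seconds": latencies[p95_index],
    }


if __name__ == "__main__":
    def fake_handler(user_id: int) -> None:
        time.sleep(0.01)  # hypothetical stand-in for real request latency

    print(run_load_check(fake_handler, simultaneous_users=200))
```

What counts as an acceptable failure count or latency is, again, a business decision: the peak for a credit union website and the peak for a global social platform call for very different numbers.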

If you answer no to any of the above questions, or if you feel you are not carrying out those steps rigorously, it’s worth digging deeper. Does your process have risk baked into it, or is the risk aligned with the business context? For the production phase, consider:

  • Is the application always available?
  • Is it performant for end users?
  • When it crashes, do we know what happened?
  • Can we replicate it?
  • Can we identify where to fix it?
  • Can we clearly define the fix?

Though this set of questions is somewhat simplified, by going through this checklist, you will learn if your tools and processes yield a rational balance of quality, velocity, and developer experience for the business context. If not, you can easily see where to make changes. This simple checklist also highlights the value of making conscious decisions about the approaches, processes, and tools you are using to ensure all are in alignment with the broader business context. 

The Endless Debate: The Best Development Methodology

In the end, the debate will never be resolved because there is no single best approach to software development. Business context is often what’s missing for developers to make the best choice. The business context might require a waterfall approach, however labored that might feel by the standards of modern development patterns (e.g., if a deadline is absolutely critical and the features deliver regulatory support).

Business context may require a huge investment in release readiness because the cost of a bug escaping is very high, or because rollback is not easy (e.g., embedded software, medical software, highly regulated industries). Business context may encourage extreme velocity and multiple pushes to production per day (e.g., a brand-new capability in a high-potential market; imagine the race to get ChatGPT capabilities embedded into consumer-oriented applications). Providing this context to developers allows them to determine the relative importance of quality and velocity to the project, which directly affects their experience as they work. The more visibility we can create into a closed-loop development process (from concept to development, from testing to production, and then back again), the more we can help developers accelerate how they build, launch, monitor, and improve software over time, within the appropriate business context of their project.
