
Scaling Distributed Teams by Drawing Parallels from Distributed Systems


Key Takeaways

  • Effective distributed teams are accountable, over-communicate, have clear goals, understand the decision-making process, and have autonomy with explicit norms.
  • Affinity distance is a barrier to forming deep, meaningful relationships. Any effort to improve this has huge returns on distributed team performance.
  • Better tools, both software and hardware, can improve operational distance.
  • At 10,000 ft everything looks fine. It is important to zoom in, find patterns, and continuously improve.
  • Local first, global next. A model that works in one culture may not work in another culture.

An effective distributed team’s characteristics are accountability, good communication, clear goals and expectations, a defined decision-making process, and autonomy with explicit norms. Ranganathan Balashanmugam, CTO at EverestEngineering, spoke about scaling distributed teams around the world at QCon London 2020. In his talk he showed how we can apply distributed systems patterns for scaling distributed teams.

A distributed team is one in which people from different offices across the world work together. Distributed systems work better by scaling horizontally, as Balashanmugam explained.

A simple analogy: do you want thousands of minions or one big, expensive hulk to finish the task? When you have thousands of minions, you need an orchestrator to manage them. So, a set of minions (the orchestrators) picks an intensive task, coordinates with other minions, and gets the task done. All the minions have the information they need to make decisions while executing the job. They send health reports to the orchestrators at a specific frequency.

One of the core ideas is that all of these minions are redundant. If any minion fails, the orchestrator picks and assigns the task to a free minion, Balashanmugam said. The system is designed for better resilience by handling the failures gracefully.
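The orchestrator behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not something shown in the talk: the `Orchestrator` class, its method names, and the timeout value are all hypothetical. Workers report heartbeats, and any task held by a worker that has gone silent is handed to a free, healthy worker.

```python
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    """Toy orchestrator (hypothetical): tracks heartbeats and reassigns tasks."""
    heartbeat_timeout: float  # seconds of silence before a worker is presumed failed
    last_seen: dict = field(default_factory=dict)    # worker -> time of last heartbeat
    assignments: dict = field(default_factory=dict)  # task -> worker

    def heartbeat(self, worker: str, now: float) -> None:
        # Record the health report sent by a worker.
        self.last_seen[worker] = now

    def assign(self, task: str, worker: str) -> None:
        self.assignments[task] = worker

    def free_workers(self, now: float) -> list:
        # Healthy workers that currently hold no task, in a stable order.
        busy = set(self.assignments.values())
        return sorted(
            w for w, seen in self.last_seen.items()
            if now - seen <= self.heartbeat_timeout and w not in busy
        )

    def reassign_failed(self, now: float) -> dict:
        """Move tasks from silent workers to free ones; return task -> new worker."""
        moved = {}
        for task, worker in list(self.assignments.items()):
            silent_for = now - self.last_seen.get(worker, float("-inf"))
            if silent_for > self.heartbeat_timeout:
                free = self.free_workers(now)
                if free:  # if nobody is free, the task waits for the next check
                    self.assignments[task] = free[0]
                    moved[task] = free[0]
        return moved


orch = Orchestrator(heartbeat_timeout=5.0)
for name in ("a", "b", "c"):
    orch.heartbeat(name, now=0.0)
orch.assign("t1", "a")
orch.assign("t2", "b")
orch.heartbeat("b", now=8.0)
orch.heartbeat("c", now=8.0)   # worker "a" has gone silent
moved = orch.reassign_failed(now=10.0)
```

In this run, task `t1` moves from the silent worker `a` to the free worker `c`, while `t2` stays with `b`. The point of the sketch is that no single worker is special: the work survives the loss of any one of them.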

There should be defined norms and a lot of examples of good and bad behaviors, Balashanmugam said. Diverse leaders can make strategic decisions, and teams and individuals can make tactical decisions. He gave an example of how this can look:

At EverestEngineering, we do not have managers at all. We have a few individual coaches and team coaches (orchestrators). When a team picks a task, they have clarity on the deadlines and deliverables. The coaches give clarity and tips for performing better. Anyone in the team can choose any task, and they have complete freedom in their working style. No one asks permission for their leave or to work from home or take a break and watch a movie.

InfoQ interviewed Ranganathan Balashanmugam about reducing virtual distance, applying distributed systems patterns for scaling distributed teams, paying attention to human parts of distributed teams, and how agile coaches can work with distributed teams.

InfoQ: How do you define virtual distance?

Ranganathan Balashanmugam: The concept is from the book The Power of Virtual Distance by Dr. Karen Sobel Lojeski. "It is a sense of psychological distance from others that affects collaboration performance." It is a combination of physical distance, operational distance, and affinity distance. 

  • Geography, time zones, and organizational affiliation create physical distance. 

  • The noise in the system is the operational distance. Some examples are bad quality of calls and notifications. 

  • Affinity distance is the barrier to forming deep, meaningful relationships. Affinity distance has the most influence on virtual distance. It comes from a lack of common values and styles. When people cannot connect with one another, the work suffers.

InfoQ: How can we reduce the virtual distance between people?

Balashanmugam: Physical distance is expensive to solve. We can promote some travel to support face-to-face interactions and coworking. 

Better tools, both software and hardware, can improve operational distance. Hardware includes better mics, headsets, light, internet, and power backups. It improves the quality of interaction. Software is a better choice of tools for synchronous communication like video conferencing and asynchronous communication like chatting/emails. Defining a schedule for collaboration and concentration time is also essential to focus and get the work done. 

Affinity distance can be improved a lot by the leadership team. There should be a focus on respecting the local culture while applying some of the global norms. Finding a local leader helps to balance local and global policies. Frequent communication gives clarity on the goals. Promoting transparency helps build trust between the teams. It is crucial to avoid a "we versus they" culture. Decision-making should include the distributed teams, and tactical decisions should be autonomous. Promoting online chit chat creates better bonding by giving people a better view of what they have in common and of each other's styles.

Above all of this, it is important to collect feedback and continuously improve.

InfoQ: What distributed systems patterns can we apply for scaling distributed teams? How have you applied them?

Balashanmugam: The biggest bottleneck for any distributed team is decision-making. Similar to distributed systems, if we apply “deliver accountability and receive autonomy,” the bottleneck is removed eventually. For this to happen, there should be a lot of transparency and information sharing. So the teams and individuals are enabled to make decisions independently. 

Clarity is harder with a distributed team. Distributed systems send heartbeats very frequently and detailed reports less frequently. Communication is the key. Distributed standups are a better way of determining progress. Apart from that, move one-to-one conversations and decision-making to a common channel.

We tried a concept called the end-of-day update. Everyone posts their progress at the end of their day (taking different time zones into account). We believe it gives a better view of what each person is working on and of the overall progress, even before they come to standups. At EverestEngineering, the coaches are responsible for improving the health of the channel. A healthy distributed team has a lot of discussions on Slack channels and quick calls. You can see a lot of decisions made in the channel. There are enough reactions and threads for a question.

Identify and remove SPOFs (single points of failure) in teams. Distributed teams should have a redundancy-first approach. A way to do it for teams is to have redundancy of knowledge. All the members of the distributed team should have equal access to information. Otherwise, there will be bottlenecks. Since people work in different time zones, it is everyone's responsibility to share decisions and discussion summaries on a common channel. A small problem with this approach is that it expects everyone to know everything, which is not practical. We solved this with smaller teams and a pragmatic microservice approach. We also follow domain-driven design. So while people have a high-level view of the whole, each individual in a team has complete knowledge of their service.
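One way to make "redundancy of knowledge" concrete is to check which topics only a single person knows. The following is a rough sketch under stated assumptions: the function name `single_points_of_failure`, the people, and the topics are invented for illustration and do not come from the interview.

```python
def single_points_of_failure(knowledge: dict) -> dict:
    """Return topic -> sole owner for every topic that only one person knows.

    `knowledge` maps each person to the set of topics they can cover.
    """
    owners = {}
    for person, topics in knowledge.items():
        for topic in topics:
            owners.setdefault(topic, []).append(person)
    return {
        topic: people[0]
        for topic, people in owners.items()
        if len(people) == 1
    }


# Hypothetical team: names and topics are made up for the example.
team_knowledge = {
    "asha": {"billing", "search"},
    "ben": {"search"},
    "chen": {"search", "deploy"},
}
spofs = single_points_of_failure(team_knowledge)
```

Here `billing` and `deploy` each depend on one person, so they are candidates for pairing or written summaries, while `search` is already covered by several people.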

A node (a minion) in a distributed cluster is either storing a part of data, or processing a part of a big task. The interesting part is that the expectation from the node is clear. It knows the boundaries of the job. Having clear goals and expectations from a distributed team is very important. When any system (a minion) is added to a distributed cluster, it gets onboarded to understand the cluster rules. It has logic on what is expected out of it and how it should coordinate. A new team member should be onboarded with the knowledge of values and the right information.

Equality is important. There are no specially configured nodes (minions) in distributed systems; all minions are equal. All the systems are treated equally. People from different cultures should be treated alike. Everyone must have equal access to interesting work.

InfoQ: How can we better pay attention to the human parts of distributed teams?

Balashanmugam: We need to look both at the macro level and micro level. When we see things at the macro level, everything appears okay, but when we zoom in, it will be surprising. 

Long ago, I worked in a distributed setup where two teams were working on the same feature. Everything looked fine until we found out that the two services did not integrate. It felt like we had built a bridge from two sides that did not meet in the center. One of the main reasons was that the two teams were from different cultures. Initially, we thought it was a trust issue. Later, from retrospectives, we found that it was an issue with differing styles of communication. In hindsight, we would have been more efficient if the teams had been hybrid.

“We versus they” is a common problem that we face in distributed teams. The goal of a goalkeeper in football, like that of anyone else on the team, is to make the team win. Promoting short face-to-face visits is vital to building trust between team members. Tasks should be shared equally between groups. We created hybrid teams with people from different cultures. It helped improve shared responsibility.

When we start meetings, instead of jumping into the agenda, we encourage a few minutes of chit chat. It puts the team at ease. Once in a while, we also schedule 30-minute chit chat sessions. They help us understand the cultural differences.

InfoQ: What’s your advice to agile coaches who are working with distributed teams?

Balashanmugam: Agile coaches can be internal and external. The advantage of an external coach is that they come with a fresh pair of eyes and give a new perspective. They challenge some of the norms, which are otherwise generally accepted behaviors. Internal agile coaches have the benefit of knowing the system's nuances in detail, as they are part of the system. They know which knob to turn.

My first advice for any organization is to have a combination of internal and external agile coaches. External ones challenge the norms once in a while, and the internal ones know how to improve the norms.

Many organizations are struggling to work remotely and in a distributed environment during this pandemic. Agile coaches with experience working with distributed teams improve the performance of the organization. They need to go on the journey of adaptation with distributed teams, acting as mentor, role model, and coach. Some of the best agile coaches I have worked with have a good sense of cross-cultural collaboration, business, and team debugging; they identify performance-hindering norms and are influential in navigating change through the organization.

About the Author

Ranganathan Balashanmugam has worked with globally distributed teams for the last fourteen years, and was recently named one of the top 10 CTOs in India by CEO Insights magazine. He was a developer for nearly eleven years, working with distributed technologies to scale software. Later he picked up operations and engineering management at Aconex, where teams were distributed across four different time zones. He is currently CTO of EverestEngineering, which he scaled to 70+ people across three regions in the last year. He is passionate about scaling and leading distributed teams.
