Developing a High Capacity Network Gateway with LeSS
At the end of 2007 we started discussing how to build a high capacity network gateway from scratch. We faced two fundamental risks. First, the technology was completely new and had never been used before in Nokia Siemens Networks (now Nokia Networks). Second, the use cases for the first commercial deployments were not completely defined at the start; it became clear that we would need to adapt the feature content heavily throughout, based on learning. Using the LeSS framework appeared to be the appropriate response to these major risks.
The initial idea was to build a broadband network gateway, but after a few months of development we realized that there would be more market demand for a gateway for 2G/3G and long-term evolution (LTE) enabled mobile networks. Luckily, we had chosen an agile development framework to develop the product, and this gave us the flexibility to change direction smoothly.
The hardware (HW) and software (SW) platforms we selected were totally new, and at the beginning of development they were not available. The only option was to use HW that had the target CPU but was totally different from the ATCA blade architecture that we would use in the commercial product. We had the same problem with the SW platform, and we had to start with the SDK delivered by the HW manufacturer instead of using the SW provided by our platform organization.
Organization and growth
Selecting LeSS as the development framework was easy; adopting it in practice was hard work. The first challenge was to convince all parties that feature teams were preferable to component teams if we wanted to reduce cycle time and increase flexibility. The feature teams that we decided to use, after long debate, are long-lived, cross-functional and cross-component teams that complete many end-to-end customer features [1]. We started the product development with people coming from two totally different backgrounds. The first team had used Scrum for over a year and had successfully created a product. The other team came from a traditional waterfall organization that had “failed” in its attempt to adopt Scrum and was reluctant to try it again. Why had they failed? In short, it was a classic “fake Scrum” adoption that consisted of relabeling their existing waterfall development practices and teams with “agile” and Scrum terms, while nothing really changed in their organizational design.
The initial team-formation approach that we used was to mix the teams so both new feature teams would have members from the traditional waterfall development background and from the (real) agile development background. We put all development people in a room and asked them to organize themselves into two feature teams.
At first, people did not want to use feature teams because they claimed that feature teams lead to bad software quality. An Agile coach present in the meeting gently asked what the quality of the code had been that the component teams had created in their previous product. The answer: a mess! So after some discussion we agreed to see how the code would end up when using feature teams. Once everyone had agreed to try feature teams, the forming of teams went smoothly and took under one hour.
The newly formed teams selected their ScrumMasters from available “out of line” managers who did not have any power relationship over the teams. We decided to start with out-of-line managers as ScrumMasters even though we knew this could be a problem, since we did not have any other work for them to do. After a few Sprints, we realized that the managers were not capable of serving as good ScrumMasters, and the managers could see this too. So they decided to offer the ScrumMaster role to team members (who did take it on), and the managers then started to work as free agents helping in miscellaneous ways, without being part of any team. We later found a better solution for how managers could work, which is described below.
So, we had our first teams, and development could start!
We decided to start with two teams, and this led to huge arguments between the teams and a very slow start in development because there were so many different opinions about how the architecture and infrastructure should be done. There were several reasons behind this; the main one was that the people without real Scrum experience were the stars and were not used to collaborating with others. The culture of winning arguments was strong and visible inside the teams, but worse between them. We held common design sessions with both teams present, which helped a little to achieve agreement, but there were so many moving parts in the beginning that the teams managed to create conflicting solutions that then needed to be resolved. The positive impact was that we probably covered all possible options when designing the initial architecture, with two feature teams each trying to convince the other that its solution was the best.
In the beginning, there were also difficulties in planning. Jumping from a sequential lifecycle model with a long planning and specification period to an agile style, where only a small amount of work is planned in detail, was very uncomfortable for people who did not have an agile development background. The common phrase heard in the first Sprint Planning meetings was, “We cannot plan this; we have to specify and analyze this so much more before we can start”. The phrase reflected the belief that planning can resolve uncertainty. But no amount of planning would have resolved the uncertainties we had! The only way to tackle the issue was to start doing, let the work emerge during the Sprint, and do more design and planning during the Sprint, when we knew more.
Initial LeSS with two teams
With two teams, we used a common Sprint Planning session (in the same room) for Sprint Planning 1 and 2. This made it easy to collaborate when defining the architecture and resolving dependencies between teams. As usual in LeSS, a separate Sprint Backlog was created per team, using visual management on a wall. This simple, enjoyable tool that we selected in the beginning endured over time, and all later teams also just used simple “cards on a wall”.
[Figures: initial Product Backlog creation, Sprint Planning 1, and a team space.]
In the beginning, Product Backlog Refinement (PBR) was not arranged as a separate meeting; instead it was an ongoing informal activity during the Sprint. The top of the Product Backlog was so unclear on the technology side that PBR took a significant amount of time in the first Sprints. Since it was not a distinct meeting, each team did it alone and, when it arrived at a solution, tried to sell that solution to the other team. The requirements and priorities at the beginning of the development were clear, so the teams could work fairly independently on PBR. However, the side effect was that the teams did not collaborate in refinement, which also contributed to conflicts between them. Eventually, we changed to a system where each team had a PBR meeting with the Product Owner. In hindsight, it looks obvious that we should have had an overall PBR session during Sprints that would have created a common understanding of possible solutions, but we did not realize at the time that this idea was already a part of LeSS.
We had one overall Sprint Review session with both teams, followed by an individual Retrospective for each team. Shortly after, we added a common Overall Retrospective to agree on improvements that affected both teams.
Growing the First Wave
We had our first growth point when we added two more teams (for a total of four). It was a challenge since they were transferred from a traditional organization. One of these teams refused to learn the new testing tools and the new way of working, and did not produce anything that could be considered done for several Sprints. The team argued that the testing tools in their previous environment were much better, resisted learning the new tools, and did not want to write unit tests. In retrospect, one crucial cause of the resistance was that we did not provide sufficient training or explain the reasoning behind how work is done in iterative development. New teams should be able to influence the ways of working so that they feel the rules are their own.
We added the new teams without breaking up the existing two teams since we wanted to stick with long-lived feature teams.
At this point, we also managed to have customer documentation people work directly with the teams during Sprints, so documentation was produced together with the software. For a short time, we tried to use offsite customer documentation people, but the overhead of communicating with the remote site made the process so slow and time-consuming that we discontinued it.
Sprint Planning 1 included all team members (rather than just a few team representatives) because there were many things the teams had to agree on together during planning, and we felt it was very desirable to have everyone present. The individual team-level Sprint Planning 2 meetings were also held together in the same big room to make collaboration between the teams simple, given that a lot of talking and coordination was needed.
The one Sprint Review with six teams was organized in a big open area close to the team spaces, and each team presented what they had done. The Review did not take too long, and all teams were interested in the others' work. We managed to have marketing and support people participate in the Review so they could keep in touch with development progress, which enabled us to incorporate support functionality into the product from the early stages.
The Overall Retrospective was a real challenge to make productive, even with six teams, since getting the improvement actions done was slow. One of the main reasons behind this weakness was that there was no real focus from managers on making improvements, and many of the improvements at that time were outside their immediate control. Sometimes improvements were not done by anyone, and this reduced the teams' willingness to participate in an Overall Retrospective.
The key to solving this problem would have been to create an improvement service where managers collaborate to get improvements done to help the teams, as recommended in LeSS, but at that time the managers were not so engaged in helping.
Growing the Second Wave
Adding teams to the same site was easy compared to the next step, where we decided to add teams at a second site to (hopefully!) speed up development, because market demand for the product that we were creating was suddenly emerging. Here we found out that adopting LeSS and using automated acceptance testing paid off. We trained the subcontractor at the second site in our ways of working by having them spend several weeks with our local teams, doing work as team members, until we were confident that they could work by themselves.
The same coding and testing rules applied to the subcontractor as to our own teams. They had to write unit tests, create automated acceptance tests for all code and use the central continuous integration system. The biggest challenge in working with the second site was communication. It was hard to communicate the requirements, and for the first half year we had one person at the first site working as a Product Owner proxy for the second site to reduce misunderstandings in requirements. After that period, one local person at the second site took the proxy role.
During the growing phase we came to the point where the one Product Owner was getting overloaded. So we started to classify the Product Backlog items into major requirement areas, and some teams would focus on a particular area. However, in retrospect we realized that we had made the areas too small, some with only two teams. That led to more complexity and less transparency than if we had made bigger areas with more teams.
We spent significant time convincing specification people who were working in product management to start working with the teams to help the Product Owner, since he was no longer able to keep a detailed view of what each team was doing. We tried to find a few Area Product Owners as described in LeSS [3], but we did not fully manage to implement the concept, mainly because the Product Owner, who came from product management, was used to calling the shots and did not want to give decision power to anybody else. The teams ended up working with several feature experts, depending on what feature the team worked on.
By this point, Sprint Planning 1 was done using a few team representatives from all teams, and it worked well since the teams had done proper Product Backlog refinement with feature experts.
After we had more than eight Teams on one site, the one Sprint Review started to feel too heavy for the Teams and the one Product Owner, so we changed the Sprint Review from one big event to a series of sequential meetings in which the Product Owner and feature experts visited all teams at the main site, one after another. Each team presented what they had done and got feedback. Since the Product Owner and feature experts wanted to visit all teams, each review was short, 30 minutes at most. This way each team was able to follow the reviews of other teams they were interested in and skip the others. The remote teams were reviewed in a separate event using teleconferencing and relying on the remote site Product Owner proxy doing a Review locally.
Current Team Structure
After adding even more teams we now have over 20 teams, and the majority are developing and documenting features. We have a few teams in supporting roles: performance testing, system testing, coaching and a continuous integration system (CIS) team. The CIS team takes care of the build and automation system. The performance and system testing teams focus on executing tests that cannot be done by feature teams because they need real network elements in a special test lab (located at a different site). We have only a limited number of these special network elements, and coordinating their usage between several teams at different sites is not feasible. The coaching team's main responsibility is to support modern engineering practices, help teams solve difficult technical challenges and in general help the organization to learn faster.
And we finally found a solution for the first-level managers when we had several of them on one site! Instead of working as free agents and loosely collaborating with each other, these managers formed a regular feature team and started to work together, taking work from the Product Backlog. It is unclear who came up with the idea, but it worked! This solution was suitable for most managers since they had a strong software development background. A few managers decided to focus on the ScrumMaster job; they managed the transition from manager to ScrumMaster and worked closely with a local agile coach to remove impediments.
We still had a few traditional project managers in the organization for historical reasons, but there was no project-management work for them anymore! We managed to keep them away from the teams, which took some effort from the ScrumMasters in the beginning, but they learned not to disturb the teams. So what did they do? They helped to remove impediments identified by teams and ScrumMasters, and helped the Product Owner. But mostly they just had free time, since there was no actual work for them and they couldn’t remove very many impediments.
LeSS Huge Adoption Analysis
Starting product development and growing gradually from a LeSS to a LeSS Huge structure was challenging, and we did not succeed in that as well as we would have liked.
Since this is a telecom product based on technical standards for public inbound and outbound interfaces, we defined the Requirement Areas around families of these public interfaces. However, there were separate areas for inbound and for outbound interfaces. For some of the requirements, a feature team within an area could complete an end-to-end requirement, so in those cases our choice of Requirement Areas was OK. But too often we discovered that a complete end-to-end requirement involved a choreography of inbound and outbound messaging. This then caused a dependency or constraint between teams in different areas.
Handling dependencies that we had created ourselves was just a self-inflicted wound! In hindsight, on seeing the problem, we should have reorganized the areas so that most of the interactions were within single, larger Requirement Areas.
Another main challenge was to get the Area Product Owner concept working. We have feature experts that do the requirement refinement with teams, but the Product Owner was reluctant to give decision-making power to the feature experts and to let them act as real Area Product Owners. This is clearly visible in Sprint Planning and the Sprint Review, where the one Product Owner wants to be in charge of “everything”. Currently, teams struggle to get proper feedback about their items during the Sprint or during the Sprint Review since the overall Product Owner doesn’t want to delegate. The Review is done sequentially with each of the many teams separately by the non-delegating Product Owner; each team gets only 30 minutes. This leads to poor or skipped inspection of items, and no in-depth discussion.
At the beginning of 2010, we had an external person conduct an interview study. It focused on our LeSS adoption and on how people felt their work had changed. Sixteen people from all key roles participated in the interview study at the main site.
The biggest impact of the LeSS adoption, revealed by the study, can be summarized as: none of the people interviewed would like to go back to the old way of working.
The main negative finding was that even after two years of doing LeSS, people had no clear picture of what the LeSS framework is, how it works, or why we do things the way we do.
This just highlights the need for an intensive education program in which the whole organization participates in LeSS training that focuses on “why”, so that people can understand the rationale behind the framework. And it highlights that our ScrumMasters were not doing a good job of explaining “why?”.
Due to several conflicting forces in the organization, we did not manage to adopt LeSS beyond the organization building the product. The only visible impact on the broader organizational graph is that the lowest level of the organization is formed of feature teams. The first-level managers still remain managers of teams, though most of them focus mainly on development, as they work together as a Team implementing PBIs. Each manager is still formally the manager of a Team and takes care of the company bureaucracy related to their subordinates.
The selection of the LeSS framework and agile development practices significantly accelerated the time to market and gave us the flexibility that our traditional development methods never offered. The previous gateway product where we used a sequential life cycle model took twice as long to develop, and the sequential life cycle and single-function teams and component teams would not have allowed us to make the major change in direction that we did. LeSS gave us that organizational agility.
Automated acceptance testing helped us significantly to keep the code base at high quality when we added new teams to development [4].
We should have focused more on LeSS adoption when we added more teams, since it is clear that the new teams were confused by the new way of working even though they had some experience with Scrum, though not with LeSS. It seems people learn the new way of working through coaching but cannot see the bigger picture in the development without formal education.
Moving from LeSS to LeSS Huge should have been planned, with a clear agreement from the bottleneck Product Owner to delegate to empowered Area Product Owners, instead of just being done ad hoc.
Overall, LeSS adoption on the feature-team level worked. But to create sustainable organizational change there needs to be more focus on the formal organization; otherwise, over time there is a big risk of sliding back to a more traditional organizational design.
References
- Larman C., Vodde B.: Scaling Lean & Agile Development: Thinking and Organizational Tools for Large-Scale Scrum. Addison-Wesley, Boston (2009)
- Schwaber K.: Agile Project Management with Scrum. Microsoft Press, Redmond (2004)
- Nyman R., Aro I., Wagner R.: Automated Acceptance Testing of High Capacity Network Gateway. In: Agile Processes in Software Engineering and Extreme Programming, 11th International Conference, XP 2010, Trondheim, Norway, June 1-4, 2010, Proceedings
About the Author
Ran Nyman is an experienced software professional who has worked in professional software development since 1995. He wrote his first programs on the CP/M operating system in BASIC in the mid-eighties. Since then he has moved to more modern languages like C, C++, and Java. Ran has extensive experience in design patterns, UML, distributed systems, Test Driven Development and Specification by Example, Executable Requirements (also known as Acceptance Test Driven Development). Currently, Ran works at Gosei Ltd. as a consultant and trainer in the process improvement field, helping large multinational organizations to move from sequential product development to more agile ways of working. The primary focus has been on how to move big products (over 100 people) to Large-Scale Scrum (LeSS) and Lean. This work includes giving a wide range of training courses, workshops, team coaching and management consulting. Ran is a Certified LeSS Trainer and Certified Scrum Trainer.