Engineering Your Organization through Services, Platforms, and Communities

Organizations exist to sustainably deliver value to their customers and business, said Randy Shoup at QCon Plus May 2021. To do so, they need to effectively and efficiently leverage the "resources" they have at their disposal: their people, teams, and technology.

Shoup discussed three mechanisms organizations use to divide and share work:

  • Services reflect the different parts of the business, or what Domain Driven Design would call "domains".
  • Platforms are typically about factoring out common capabilities that almost every team needs to use.
  • Communities are for the things you want to share and collaborate around that don’t strictly follow any organizational hierarchy. The so-called Spotify model calls these "guilds".

These mechanisms are not mutually exclusive; most large organizations use all of them, Shoup said.

To foster psychological safety and inclusiveness, one small but important thing that leaders can do is to make sure that all perspectives are heard in discussions and meetings, as Shoup explained:

If only the white men are doing the talking (not an uncommon situation), make sure to explicitly encourage input from other team members: "What do you think, X?" Modeling this behavior as a leader is a necessary -- but far from sufficient! -- step.

InfoQ interviewed Randy Shoup, VP of engineering and chief architect at eBay, about engineering organizations.

InfoQ: How does eBay organize its services around its domains of work?

Randy Shoup: At the very highest level, we have a selling domain, a buying domain, a search domain, a payments domain, etc. ("domain" is the actual term we use for these divisions). Each of these very large domains is made up of smaller domains, going down usually several levels. Ideally, at the "leaf" level of this organizational tree each individual development team is responsible for one and only one sub-domain, and builds and maintains a small set of (micro)services.

We organize like this for many reasons:

  • We can maintain long-lived teams in each particular area, and those teams can develop deep expertise in their domain area
  • We can build a particular capability in one and only one place (usually!)
  • We can leverage each other’s work through well-defined APIs

Anyone familiar with Conway’s Law will see that the organization and service architecture are duals of one another -- each reflects and reinforces the other. An architecture of microservices is supported and mirrored by an organization of autonomous teams.

As you can imagine, over time such an ecosystem of teams and services evolves. We add services, we deprecate them, etc. It can and does feel like everything is constantly in a state of flux. There is a well-known quip inside Google, for example, that "every service is either deprecated or not ready yet." It certainly can seem that way!

When a particular team needs to refactor their service(s) -- usually to factor out a new one, but sometimes to combine two of them -- they only have to interact with teams that are directly affected, and often those teams are "close by" in the organization. This reduces the overhead for change and makes it easier to evolve.

InfoQ: What possibilities can platforms offer to development teams?

Shoup: I don’t think I’ve ever worked at a company that did not have some sort of platform team. What is typically part of a platform? Think infrastructure (e.g., a public or private cloud), common capabilities (authentication, secrets management, observability, etc.), and developer experience (CD pipelines, source control, etc.).

You don’t have to build or even operate these things for them to be part of your platform. Most companies probably leverage a third-party provider for one or all of these things (and they should!). And many companies larger than a certain size might have several platforms, whether through acquisition or developer choice.

The opportunity here is to provide an easy-to-use "Paved Road" that reduces the cognitive load on development teams and makes using a common platform the path of least resistance. If I use the common platform, lots of things come for free and I don’t have to worry about them. So in order for me to choose to use anything else, it would have to be a *lot* better for my particular use-case.

InfoQ: Why should we charge internal customers for using platforms?

Shoup: Platforms cost money and time to develop and operate, and someone has to bear that cost. When you make something free to use, the user does not have a robust economic incentive to use it wisely and efficiently. In smaller organizations, informal methods can keep usage contained, but past a certain organization size, when those informal methods no longer scale, you have to put into place some form of chargeback, or at least showback, mechanism to incent efficient behavior. Fortunately, in the public cloud world, this is easier than it has ever been before, as all usage is metered.

I learned this lesson at Google App Engine, when one of our internal client teams was consuming something like 25% globally of a very expensive, very scarce resource. Even though we charged external customers, and we were ourselves charged internally, at the time internal use of App Engine was free. Asking the team didn’t work, nor did begging or threatening; it was simply not a top priority for them to make my team’s life easier. But once I sent them a bill for what we would have charged externally -- and that bill had a lot of zeros in it! -- they were able almost immediately to make it a priority to optimize their usage. They reduced their usage by 10x, and as a direct benefit to them, actually got better latency from App Engine. Economic incentives work.
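As a rough illustration of the showback idea described above, the following Python sketch turns metered usage records into a per-team cost report. It is a minimal sketch under stated assumptions, not a description of any system at Google or eBay: the team names, resource names, unit prices, and the `UsageRecord` and `showback` helpers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    team: str        # internal team that consumed the resource (hypothetical)
    resource: str    # metered resource name (hypothetical)
    quantity: float  # units consumed during the billing period


# Hypothetical unit prices per metered resource; illustrative numbers only.
RATES = {"cpu_hours": 0.05, "storage_gb": 0.02}


def showback(records, rates):
    """Return (cost per team, each team's share of the total cost)."""
    costs = defaultdict(float)
    for record in records:
        costs[record.team] += record.quantity * rates[record.resource]
    total = sum(costs.values())
    shares = {
        team: (cost / total if total else 0.0)
        for team, cost in costs.items()
    }
    return dict(costs), shares


# Made-up usage for two hypothetical teams.
records = [
    UsageRecord("team-a", "cpu_hours", 1000),
    UsageRecord("team-a", "storage_gb", 500),
    UsageRecord("team-b", "cpu_hours", 3000),
]

costs, shares = showback(records, RATES)

# Print the report with the most expensive team first.
for team in sorted(costs, key=costs.get, reverse=True):
    print(f"{team}: ${costs[team]:.2f} ({shares[team]:.0%} of total)")
```

A showback report like this only makes the cost visible; a chargeback would additionally debit each team's budget by the computed amount.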

InfoQ: How can we organize communities of practice?

Shoup: Whether it is through Slack channels, internal groups, regular meetings, or a combination of all of the above, there is huge value in encouraging developers who, for example, use the same language framework, have the same role, or practice the same techniques, to collaborate.

My suggestion to leadership is to make this easy and lightweight. There should not need to be any approvals or oversight to create a new Slack channel, for example. This encourages teams to scratch their own itches instead of waiting for some bureaucratic process to execute.

The power of suggestion also works. At eBay, several teams have been using GraphQL for their internal services for several years, but each team worked independently. With a small nudge from leadership, the teams involved formed a working group (with a Slack channel ;-)) to help share ideas and move GraphQL forward more broadly inside eBay.

InfoQ: How can leaders foster psychological safety and inclusiveness?

Shoup: As the Project Aristotle research at Google shows, psychological safety (recently rephrased by its original proponent Amy Edmondson as "felt permission for candor") is the single most important factor in determining team success. This team dynamic is more important than any individual characteristic of team members. The insight here is that if we are not hearing everyone’s perspective, we are making poorer decisions than we otherwise would, and are effectively "leaving money on the table."

As the Xerox PARC leader Bob Taylor used to say, "None of us is as smart as all of us."
