Scaling Organizations I

Part 1 - Universal Scalability Law

Nov 21, 2022

When I joined PayPal in the early 2000s, there were about 30 engineers. In less than three years we grew 10x. More recently at Etsy, I saw the engineering team grow from 250 to almost 1,000 over the course of a couple of years. As a consultant, I have seen similar growth at many companies. This type of growth puts a lot of pressure on people and processes. Understanding how to scale an organization is an important skill for engineering leaders.

The ideal scaling state is linear, where adding one more person adds a full person’s worth of value. However, this is almost never the case; we almost always scale sub-linearly. The phrase “too many cooks in the kitchen” captures the problem with adding more people to a task. In software engineering, we have Brooks’s law, from Fred Brooks’s 1975 book The Mythical Man-Month, which states that adding people to a late software project makes it later. Of course this is an oversimplification, but it does point to the cost of getting new people up to speed and to the quadratic growth in communication channels as people are added: n people have n(n − 1)/2 possible pairs (compare Metcalfe’s law). But as a company we need to grow to serve our growing customer base. Therefore, we should understand the factors that cause sub-linear scaling so that we can minimize their impact.
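To make the growth in communication channels concrete, here is a minimal Python sketch. It assumes every pair of people is a potential channel, which is the usual reading of this argument rather than a measured property of any real team. The function name and the team sizes are made up for illustration.

```python
def communication_channels(people: int) -> int:
    """Count the distinct pairs among `people` team members: n(n - 1) / 2."""
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


# Illustrative team sizes only; 30 and 300 echo the 10x growth mentioned above.
for size in (3, 10, 30, 300):
    print(f"{size:>3} people -> {communication_channels(size):>6} possible channels")
```

Growing headcount 10x, from 30 to 300, grows the number of possible pairs roughly 100x, which is why the growth is quadratic rather than linear.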

The formula below might look intimidating, but once broken down it is easy to understand. Neil J. Gunther, a computer information systems researcher, introduced the Universal Scalability Law (USL) in 1993 as a capacity model for scaling massively parallel transaction systems. His formula extended Amdahl’s law, which, if you’re into this type of thing, is used in parallel computing to predict the speedup from using multiple processors.

X(N) = 𝛾N / (1 + 𝛼(N − 1) + 𝛽N(N − 1))

Here N = the number of processors, 𝛼 = queueing for shared resources, 𝛽 = latency for data to become consistent, and 𝛾 = effective amount of parallelism. These terms are referred to as contention, coherency, and concurrency, respectively. To apply this to an organization, X(N) becomes the amount of work being done and N the number of employees. In that ideal world of linear scale, we’d have X(N) = N.
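As a rough sketch, the law can be written as a small Python function. It assumes the commonly published three-parameter form of the USL, spelled out in the docstring. The function name and the coefficient values are hypothetical, picked only to show the shape of the curve, not measured from any organization.

```python
def usl_throughput(n: float, alpha: float, beta: float, gamma: float = 1.0) -> float:
    """Relative throughput X(N) under the Universal Scalability Law.

    X(N) = gamma * N / (1 + alpha * (N - 1) + beta * N * (N - 1))

    alpha: contention, beta: coherency, gamma: concurrency.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return gamma * n / (1 + alpha * (n - 1) + beta * n * (n - 1))


# Hypothetical coefficients, chosen only for illustration.
ALPHA, BETA = 0.05, 0.001

for headcount in (1, 10, 30, 100):
    ideal = usl_throughput(headcount, alpha=0.0, beta=0.0)  # linear case: X(N) = N
    modeled = usl_throughput(headcount, ALPHA, BETA)
    print(f"{headcount:>4} people: ideal {ideal:6.1f}, modeled {modeled:6.1f}")
```

With both penalties set to zero and 𝛾 = 1, the function returns N, which is the linear ideal. Any positive 𝛼 or 𝛽 pulls the modeled value below it.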

Let’s start with the factors in the denominator of the formula, which reduce our scale. The 𝛼 (contention) parameter is how much we wait for someone else. When there are two people on a team, one person can review all the pull requests; when there are ten people on the team, that one person becomes the bottleneck. Minimizing these single points of contention as we grow is important. As a manager scaling up, you must learn to delegate. As an engineer scaling up, you must learn to teach.
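A small sketch can show what contention alone does. It assumes 𝛽 = 0 and 𝛾 = 1, in which case the USL reduces to the Amdahl’s law form N / (1 + 𝛼(N − 1)) and throughput can never exceed 1/𝛼. The value 𝛼 = 0.1 is hypothetical.

```python
def contention_only_throughput(n: int, alpha: float) -> float:
    """Throughput with contention only (beta = 0, gamma = 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n / (1 + alpha * (n - 1))


ALPHA = 0.1          # hypothetical: 10% of the work waits on one shared person
ceiling = 1 / ALPHA  # limit of the expression above as n grows without bound

for team_size in (2, 10, 100, 1000):
    value = contention_only_throughput(team_size, ALPHA)
    print(f"{team_size:>5} people -> {value:5.2f} (ceiling {ceiling:.0f})")
```

Under these assumptions, no amount of hiring gets past the ceiling. Delegating and teaching raise it because they lower 𝛼.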

The second parameter that inhibits our scale is the 𝛽 (coherency) parameter, which represents how many people are needed to make a decision. You may have noticed that this term is quadratic in N, so as the organization grows it eventually dominates the contention term and can make total output decline rather than merely flatten. The ability to make decisions with fewer stakeholders is crucial to scale. I wrote a few weeks ago about the importance of trust; this is one of the real-world examples of why it matters so much. Besides trusting colleagues, tools like RACI are often employed to help deal with this coherency problem.
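Because the coherency term is quadratic, the standard form of the USL has a peak: setting the derivative of X(N) to zero gives N* = sqrt((1 − 𝛼) / 𝛽), and beyond that headcount more people produce less total output. The sketch below assumes that standard form and uses hypothetical coefficients. The function names are illustrative.

```python
import math


def usl_throughput(n: float, alpha: float, beta: float, gamma: float = 1.0) -> float:
    """X(N) = gamma * N / (1 + alpha * (N - 1) + beta * N * (N - 1))."""
    return gamma * n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def peak_headcount(alpha: float, beta: float) -> float:
    """Headcount at which modeled throughput is highest: sqrt((1 - alpha) / beta)."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    if beta <= 0:
        raise ValueError("beta must be positive for a peak to exist")
    return math.sqrt((1 - alpha) / beta)


ALPHA, BETA = 0.05, 0.001  # hypothetical values, for illustration only
n_star = peak_headcount(ALPHA, BETA)
print(f"modeled peak at about {n_star:.1f} people")

# Compare a headcount near the peak with one well past it.
for headcount in (20, 31, 60):
    print(f"{headcount:>3} people -> {usl_throughput(headcount, ALPHA, BETA):.2f}")
```

In this model, halving 𝛽 moves the peak out by a factor of about 1.4, which is one way to read the advice about limiting the number of decision makers.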

The last parameter, 𝛾, is the amount of parallelism or concurrency. It is in the numerator, so it increases our scale. In engineering we often think about this in terms of skill sets, e.g. frontend vs. backend, or architecture, e.g. service splits. Specialization of skills helps us focus on smaller tasks and increases the parallelization of work. Human resources business partners help scale engineering managers by having the expertise to guide decisions. Without them, as in smaller companies, new managers have to rely on more senior managers to help them navigate issues that arise. Having an expert in payroll keeps the rest of us from having to know something about that task.

In summary, the USL provides a model of organizational scale that helps explain why we need to limit contention and coherency while increasing concurrency. Learn how to train, delegate, supervise, trust and verify, and generally not be the bottleneck. Use tools like RACI, trust colleagues, and limit meeting participants in order to reduce the number of decision makers. Divide tasks, break systems into smaller parts, and rely on others’ expertise to increase parallelism. This is how engineering organizations grow and continue to scale as close to linear as possible.
