Many companies reach a particular stage, say a couple hundred engineers, and want to introduce the title of architect in their engineering organization. The usual motivation is a feeling that broader guidance or standards are missing: as the organization grows, teams begin building different services or parts of the application in different ways. When I hear this, my first question is, what type of architect? I ask because I think there are many different types of architects and each serves a different purpose. There is an interesting discussion about whether the title should be a job family, such as full-stack engineer or security engineer, or more of an honorific, such as tech lead, but we’ll leave that for a different article.
You can probably find dozens of different types of architects in use at different tech companies, including enterprise architect, solutions architect, data architect, network architect, and many more. At a company with tens of thousands of engineers in different divisions and subsidiaries, you might need some of those roles, but for most tech teams with fewer than a thousand engineers, I think there are only two primary types of architects that you should be concerned with: systems architects and software architects. Let's dive into how I differentiate these and why that is important.
Systems architecture deals with the architecture of the system as a whole. This includes topics such as how data flows between services, how services depend on each other, and how services or parts of the application can fail independently. An example would be the design of a messaging service that relies on a streaming service for real-time notification of messages and on persistent storage for permanent recording of the messages and all their metadata. Design considerations might include the volume of messages per second as well as over a longer period, such as annually. Another thing that needs to be determined is how the messaging service should behave if it can't write to the persistent storage. Should it fail immediately, should it keep messages in memory until it runs out of capacity and then fail, or should it continue to work even if messages are lost? These types of questions and considerations help determine how the system as a whole should operate and fail. We often call these non-functional requirements.
Note a couple of things about this discussion. There was no talk of a programming language, a software pattern, or a technology. In fact, we didn't even specify whether we're considering RDBMS or NoSQL solutions for storage. We do this because the system design doesn't care about the implementation and should be able to exist in multiple implementations built on different technologies. This is useful because technologies advance very rapidly. Today we might choose MySQL because it meets our performance requirements and it's a technology that we as an organization know how to operate in a production environment. In the future we might shift to a different technology, and our systems architecture should be able to accommodate that.
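To make the failure-handling question concrete, here is a minimal sketch in Python of the three behaviors described above. Every name in it (`PersistentStorage`, `WriteFailurePolicy`, `MessagingService`, `InMemoryStorage`, `buffer_capacity`) is an assumption made for illustration, not part of any real library. The storage is deliberately an abstract interface, so the policy decision stays independent of whichever technology is chosen later.

```python
from abc import ABC, abstractmethod
from collections import deque
from enum import Enum


class StorageUnavailable(Exception):
    """Raised by a storage implementation when a write cannot be completed."""


class PersistentStorage(ABC):
    """The generic 'persistent storage' box from the system design (hypothetical)."""

    @abstractmethod
    def write(self, message: str) -> None: ...


class WriteFailurePolicy(Enum):
    FAIL_FAST = "fail_fast"        # fail immediately
    BUFFER_THEN_FAIL = "buffer"    # hold messages in memory until capacity runs out
    BEST_EFFORT = "best_effort"    # keep working even if messages are lost


class MessagingService:
    def __init__(self, storage, policy, buffer_capacity=1000):
        self.storage = storage
        self.policy = policy
        self.buffer_capacity = buffer_capacity
        self.buffer = deque()  # messages waiting for storage to recover
        self.lost = 0          # messages dropped under BEST_EFFORT

    def send(self, message):
        try:
            # Drain older messages first so ordering is preserved.
            while self.buffer:
                self.storage.write(self.buffer[0])
                self.buffer.popleft()
            self.storage.write(message)
        except StorageUnavailable:
            if self.policy is WriteFailurePolicy.FAIL_FAST:
                raise
            if self.policy is WriteFailurePolicy.BUFFER_THEN_FAIL:
                if len(self.buffer) >= self.buffer_capacity:
                    raise  # out of capacity, so now we fail
                self.buffer.append(message)
            else:  # BEST_EFFORT: accept the loss and carry on
                self.lost += 1


class InMemoryStorage(PersistentStorage):
    """Stand-in storage for experiments; toggle `available` to simulate an outage."""

    def __init__(self):
        self.available = True
        self.rows = []

    def write(self, message):
        if not self.available:
            raise StorageUnavailable("storage is down")
        self.rows.append(message)
```

Swapping `InMemoryStorage` for a class backed by a real database would not change `MessagingService` at all, which is the property the system design is meant to preserve.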
Once we have the system design at all levels (many systems have sub-components that also need to be considered), we can move on to the software architecture. Software architecture deals with languages, technologies, software patterns, and so on. Continuing our messaging service example, during this design phase we might decide that we want to implement the service with Node.js, Kafka, and ksqlDB. We might decide to use the CQRS (Command Query Responsibility Segregation) software pattern to separate reads and writes, since they will happen at very different rates. We might decide that this will be deployed as containers on Kubernetes. These details are obviously just as important as the systems architecture, but without a proper design at the systems level, we are likely to make worse decisions in the software architecture phase. Let's dig into how this could cause problems.
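As a rough illustration of the CQRS idea mentioned above, the following Python sketch keeps the write path (commands appended to an event log) separate from the read path (a view built from that log). It is an in-memory toy under stated assumptions: the class names `SendMessage`, `MessageSent`, `CommandHandler`, and `ConversationReadModel` are hypothetical, and a plain list stands in for whatever log or stream a real implementation would use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SendMessage:
    """Command: expresses the intent to change state."""
    conversation_id: str
    sender: str
    body: str


@dataclass(frozen=True)
class MessageSent:
    """Event: records a change the write side has accepted."""
    conversation_id: str
    sender: str
    body: str


class CommandHandler:
    """Write side: validates commands and appends events to the log."""

    def __init__(self, event_log):
        self.event_log = event_log

    def handle(self, cmd):
        if not cmd.body:
            raise ValueError("message body must not be empty")
        event = MessageSent(cmd.conversation_id, cmd.sender, cmd.body)
        self.event_log.append(event)
        return event


class ConversationReadModel:
    """Read side: a denormalized view optimized for queries."""

    def __init__(self):
        self._by_conversation = defaultdict(list)
        self._applied = 0  # number of log entries already projected

    def project(self, event_log):
        # Apply only the events this view has not seen yet.
        for event in event_log[self._applied:]:
            self._by_conversation[event.conversation_id].append(
                (event.sender, event.body)
            )
        self._applied = len(event_log)

    def messages(self, conversation_id):
        return list(self._by_conversation.get(conversation_id, []))
```

Because the read model only changes when `project` runs, a message can be accepted by the write side before it becomes visible to readers. That lag is a property of the pattern, so it belongs among the behaviors the systems design should already have anticipated.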
Let's say we skipped the systems architecture design process and jumped straight into selecting the technologies we are going to implement. Our team knows Amazon Elastic Block Store (EBS) really well, so that's what we're going with. However, we didn't consider the rate of writes that our database will need to sustain at peak. As we implement this solution, we find that at peak times the writes are falling behind. This might leave our customers unable to re-read a message for quite some time after it is sent, until the writes catch up. It might even eventually back up writes into the app and cause performance issues or outright failure of that service. We can of course scramble to find a workaround or, in the worst case, reimplement the persistent storage with something more scalable. But all of this could have been avoided had we taken the time to do a proper systems architecture design.
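The arithmetic behind "writes falling behind" is simple enough to sketch. The Python functions below are a back-of-the-envelope model, not a measurement tool. The function names and every number used are hypothetical, and the model assumes constant rates during and after the peak.

```python
def backlog_after_peak(arrival_rate, write_capacity, peak_seconds):
    """Messages still waiting to be persisted when the peak ends."""
    return max(0.0, (arrival_rate - write_capacity) * peak_seconds)


def catch_up_seconds(backlog, write_capacity, off_peak_rate):
    """Time to drain the backlog once traffic drops to off_peak_rate."""
    spare = write_capacity - off_peak_rate
    if spare <= 0:
        return float("inf")  # no spare capacity: the backlog never drains
    return backlog / spare


# Hypothetical numbers: 5,000 messages/s arrive for a 10-minute peak, but
# the storage sustains only 4,000 writes/s; off-peak traffic is 1,000/s.
backlog = backlog_after_peak(5000, 4000, 600)      # 600000 messages waiting
delay = catch_up_seconds(backlog, 4000, 1000)      # 200.0 seconds to catch up
```

Running numbers like these during system design, before any technology is picked, is what turns peak write rate into an explicit requirement that candidate storage options can be checked against.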
Sometimes it is difficult to tell from a design document or architecture diagram whether it describes a systems architecture or a software architecture. One simple rule that I use is that if a technology is mentioned in the doc or in the diagram, it's not a proper systems architecture. I prefer generic terms like persistent storage and cache over technology names like MySQL and Memcached.
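The rule above is mechanical enough that it can be partly automated. The following Python sketch flags technology names in the text of a design document. The `TECHNOLOGY_NAMES` set and the `technologies_mentioned` function are assumptions for illustration. A real list would have to be maintained by the team, and a name match is only a hint for a human reviewer.

```python
import re

# Hypothetical deny-list; extend it with whatever your organization uses.
TECHNOLOGY_NAMES = {"MySQL", "Memcached", "Kafka", "Node.js", "Kubernetes", "ksqlDB"}


def technologies_mentioned(text, names=TECHNOLOGY_NAMES):
    """Return the technology names that appear in text, sorted alphabetically."""
    found = []
    for name in sorted(names):
        # Match whole names only, ignoring case; escape dots such as in "Node.js".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found


systems_doc = "The messaging service writes to persistent storage behind a cache."
software_doc = "The messaging service writes to MySQL and caches in memcached."
```

Here `technologies_mentioned(systems_doc)` returns an empty list, while `technologies_mentioned(software_doc)` returns `["Memcached", "MySQL"]`, which marks the second document as a software architecture under this rule.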
Navigating the maze of a growing company, especially when the technical landscape starts resembling a bustling city rather than a quiet little town, calls for some architectural prowess. That’s where our knight in shining armor, the architect, comes galloping in. But wait, it's not just one knight; we have a duo: the systems architect and the software architect. As we've seen, the systems architect is the visionary, the one sketching out the broad strokes and making sure all parts of our tech empire talk to each other nicely, without getting bogged down in which brand of communication devices they use. They’re all about keeping the big picture in focus, ensuring that whatever changes in the tech world, our empire remains standing tall.
Now, enter stage right, our software architect. They are the artisans, choosing the best tools, materials, and methods to bring our systems architect's grand vision to life. They dive into the nitty-gritty, weighing the elegance of Node.js, the robustness of Kafka, and the agility of Kubernetes, and making sure that the theoretical city the systems architect designed functions smoothly on the ground.
The saga of Amazon Elastic Block Store (EBS) in our narrative is akin to a plot twist. It's like throwing a big, fancy party but realizing you forgot to check if the venue can handle the crowd. The hiccup demonstrated how skipping the homework on systems architecture could lead us into murky waters, especially when the tide of user interactions rises high.
The subtle takeaway, wrapped in a simple rule about avoiding tech specifics in system designs, is about keeping our systems architecture plans adaptable. Today it might be MySQL; tomorrow, who knows? By avoiding name-dropping technologies in system designs, we keep our blueprints ready for the future, whatever tech trends it may hurl our way.
This exploration through the realms of architecture in the tech kingdom not only differentiates the roles but also explains why both matter. It's like saying, before you decide on the decor, make sure the house is well-built. So, as our tech teams burgeon, the narrative shows why taking a pause, sketching the broad strokes (systems architecture), and then diving into the details (software architecture) is not just wise, but essential. And who knows, with such a thoughtful approach, navigating the next phase of growth might just be a tad less daunting!
As you know, I'm going to challenge this a little bit. I'd argue that even for engineering organizations with fewer than 1,000 engineers, if they have a large cloud infrastructure, then a third role is necessary: the business architect, who approaches design from a cloud financial operations perspective.