Overhead in Computing Systems

There is a concept of "overhead cost" in business. Overhead costs are the ongoing, indirect expenses needed to run a business: they are incidental to certain operations rather than part of the operations themselves.

The cost of logistics is a good example. Consider the transportation costs for a batch of goods. If it costs the same to move 5 units as it does 10, then moving 5 units twice doubles our costs and adds overhead in fulfillment time. Now say there is instead a fixed cost per item. Making two trips no longer doubles our cost, but it still costs us fulfillment time. And if we impose the constraint of a single vehicle, it introduces a hidden cost: the return trip to the start to pick up the leftover goods.
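To make the arithmetic concrete, here is a toy model of the fixed-cost-per-trip structure; the figures are invented for illustration:

```python
# A toy model of the logistics example. All numbers are made up;
# the point is how the costs compose.

TRIP_COST = 100      # fixed cost per trip (fuel, driver) -- assumed
PER_ITEM_COST = 10   # variable cost per unit carried -- assumed

def shipment_cost(units_per_trip, trips):
    """Each trip pays the fixed cost; each unit pays the variable cost."""
    return trips * (TRIP_COST + units_per_trip * PER_ITEM_COST)

one_trip = shipment_cost(10, 1)    # move all 10 units at once: 200
two_trips = shipment_cost(5, 2)    # move 5 units, twice: 300

# The variable cost is identical either way; the second trip adds a
# whole extra fixed cost. That difference is the overhead.
overhead = two_trips - one_trip    # equals TRIP_COST
```

The per-unit cost is the operation itself; the per-trip cost is the overhead, and it scales with the number of trips, not the number of units.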

This is a trivial observation: the incidental costs you incur grow in direct proportion to the number of operations you carry out.

Let us return to the logistics example. If we drop the "single vehicle" constraint, we have parallelism, and we can eliminate the return-trip cost. We do have to dispatch all our vehicles at the same time to minimize fulfillment time, but in doing so we have inadvertently introduced another cost: synchronization, assuming we have the bizarre requirement that all 10 units be delivered at once, as one package.
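The dispatch-and-wait pattern maps directly onto fork/join concurrency. A small sketch, with invented sleep durations standing in for travel times:

```python
import concurrent.futures
import time

# Each "vehicle" is a worker thread; waiting for all results is the
# synchronization cost. The travel times are invented for the demo.

def deliver(units, travel_seconds):
    time.sleep(travel_seconds)    # stand-in for the trip
    return units

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    start = time.monotonic()
    trips = [pool.submit(deliver, 5, 0.05),
             pool.submit(deliver, 5, 0.15)]
    # The synchronization point: the "package" is only complete once
    # the slowest vehicle has arrived.
    delivered = sum(f.result() for f in trips)
    elapsed = time.monotonic() - start

# Both trips overlap, but total time is set by the slowest worker.
```

The trips run concurrently, so the elapsed time is roughly the slower trip, not the sum of both; the wait on the slowest result is the synchronization overhead.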

Generally, when you split an operation into parts, you introduce overhead at the margins. This is a very important design consideration. Why does modifying some algorithms to use more memory make them faster? Why should you not make two HTTP calls when one will do? Why are n+1 queries bad?

For HTTP calls, some of the steps are not directly related to the server's job of handling your request. Before the request starts being processed, the client must initialize a socket, perform a DNS lookup, and complete a TCP handshake, along with a TLS handshake if you're working with HTTPS. Without connection reuse, this happens on every request. Depending on the nature of your request, the server may in turn open several similar connections to other HTTP services or to a database.
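A sketch of how connection reuse amortizes this setup cost, using Python's standard library against a throwaway local server; the server and its "ok" response are stand-ins for a real service:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"          # HTTP/1.1 enables keep-alive

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *_):             # silence request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

replies = []

# Without reuse: every request pays for a fresh socket and TCP
# handshake (and would pay for DNS and TLS against a real HTTPS host).
for _ in range(3):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/")
    replies.append(conn.getresponse().read())
    conn.close()

# With keep-alive: one handshake, amortized over all three requests.
conn = http.client.HTTPConnection("127.0.0.1", port)
for _ in range(3):
    conn.request("GET", "/")
    replies.append(conn.getresponse().read())
conn.close()
server.shutdown()
```

The responses are identical either way; the difference is how many times you pay the connection setup cost.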

Making n+1 SQL queries is bad for similar reasons. Performing the incidental operations of opening a connection to the database and receiving and decoding the results n times significantly reduces performance. The solution to n+1 queries is, of course, to use joins: a single, more performant query that runs just once.
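A minimal sketch of the two approaches with an in-memory SQLite database; the authors/books schema is invented for the example:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO books VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

# The n+1 pattern: one query for the authors, then one more per author.
def titles_n_plus_one():
    out = {}
    for author_id, name in db.execute("SELECT id, name FROM authors"):
        rows = db.execute(
            "SELECT title FROM books WHERE author_id = ?", (author_id,))
        out[name] = [title for (title,) in rows]
    return out

# The fix: a single JOIN pays the query overhead exactly once.
def titles_joined():
    out = {}
    rows = db.execute("""
        SELECT a.name, b.title
        FROM authors a JOIN books b ON b.author_id = a.id
        ORDER BY b.id
    """)
    for name, title in rows:
        out.setdefault(name, []).append(title)
    return out

# Same answer, very different number of round trips.
assert titles_n_plus_one() == titles_joined()
```

With 2 authors the difference is invisible; with 10,000 it is 10,001 round trips versus one.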

One optimization for both of these scenarios is connection pooling: pay the connection cost once, up front, and reuse the connection across operations. As a rule of thumb, use connection pooling whenever you can.
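A bare-bones sketch of the idea; production pools (in database drivers and HTTP clients) also handle timeouts, health checks, and resizing:

```python
import queue
import sqlite3

class ConnectionPool:
    """A minimal fixed-size pool: connect up front, reuse thereafter."""

    def __init__(self, size, factory):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())   # pay the connection cost once

    def acquire(self):
        return self._pool.get()         # blocks if every connection is in use

    def release(self, conn):
        self._pool.put(conn)            # return it instead of closing it

# In-memory SQLite as a stand-in for a real database server.
pool = ConnectionPool(
    2, lambda: sqlite3.connect(":memory:", check_same_thread=False))

conn = pool.acquire()
result = conn.execute("SELECT 1 + 1").fetchone()[0]
pool.release(conn)
```

Every acquire/release cycle after construction skips the connection setup entirely; the overhead has been paid once and amortized.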

Overhead breeds this class of problems, and when you learn to look for it, you see it everywhere.

Not all overhead can be eliminated, though. Parallelism is a case in point: it is a very useful optimization, but it introduces the overhead of synchronization. If your primary requirement is an operational guarantee, that is, operations that must happen, then you necessarily have to break each operation into two parts: the initiation part and the retryable part. This overhead cannot be removed.

Similarly, it is useful to break code into well-defined sub-procedures or functions. Each function invocation introduces a small run-time overhead, but the readability it buys is usually worth it.
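The initiation/retryable split can be sketched roughly like this; the journal, flaky_send, and its failure rate are invented stand-ins for a durable queue and an unreliable downstream call:

```python
import random

journal = []          # stand-in for durable storage (a queue, a table)

def initiate(payload):
    """The initiation part: durably record intent before doing anything."""
    journal.append({"payload": payload, "done": False})
    return len(journal) - 1

def flaky_send(payload, rng):
    """Stand-in for an unreliable downstream call."""
    if rng.random() < 0.5:             # simulated transient failure
        raise ConnectionError("transient failure")
    return "sent:" + payload

def run_with_retries(job_id, rng, max_attempts=20):
    """The retryable part: safe to re-run because intent is journaled."""
    job = journal[job_id]
    for _ in range(max_attempts):
        try:
            outcome = flaky_send(job["payload"], rng)
        except ConnectionError:
            continue                   # the journaled intent survives
        job["done"] = True
        return outcome
    raise RuntimeError("gave up")

rng = random.Random(0)                 # seeded for reproducibility
job_id = initiate("hello")
result = run_with_retries(job_id, rng)
```

The extra bookkeeping of the journal is pure overhead relative to a single fire-and-forget call, but it is precisely what makes the guarantee possible.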

Overhead is not inherently good or bad; it is fundamental. It is a universal concept, no different from abstraction or parallelism. Identifying where it can be removed is a valuable skill.