A great article from Amazon’s Builder Library.
These patterns have three key features. One, they don’t scale up or slow down with load or stress. Two, they don’t have modes, which means they do the same operations in all conditions. Three, if they have any variation, it’s to do less work in times of stress so they can perform better when you need them most. There’s that anti-fragility again.
Even when there are only a few health checks active, the health checkers send a set of results to the aggregators that is sized to the maximum. For example, if only 10 health checks are configured on a particular health checker, it’s still constantly sending out a set of (for example) 10,000 results, if that’s how many health checks it could ultimately support. The other 9,990 entries are dummies. However, this ensures that the network load, as well as the work the aggregators are doing, won’t increase as customers configure more health checks. That’s a significant source of variance … gone.
Something important to remember is that O(1) doesn’t mean that a process or algorithm only uses one operation. It means that it uses a constant number of operations regardless of the size of the input. The notation should really be O(C).
A unicycle has fewer moving parts than a bicycle, but it’s much harder to ride. That’s not simpler. A good design has to handle many stresses and faults, and over enough time “survival of the fittest” tends to eliminate designs that have too many or too few moving parts or are not practical.
More on https://aws.amazon.com/builders-library/reliability-and-constant-work/
What do you think?