Scaling Foundations: Latency, Throughput, SLOs

In this lesson we learn to estimate, in numbers, how "big" a system needs to be. We'll practice capacity estimates (how much load the system must handle), QPS (queries per second — how many requests arrive each second), how much storage is needed, how much bandwidth (the amount of data flowing at an

System Design is like planning a whole city: roads that carry the traffic, warehouses that store things, traffic lights that keep the flow orderly, and maintenance crews that fix problems — all so the city keeps running smoothly even at rush hour, when everyone is out at once.

Capacity and SLOs: The core skill of this lesson is estimating the system's size in numbers: capacity (how much load it handles), QPS (requests per second), storage (how much data must be kept), bandwidth (the amount of data flowing per second), and SLOs (service level objectives, our promise about availability or speed). It is a rough up-front calculation that helps you plan correctly.
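To make the estimate concrete, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the `Workload` fields, the sample numbers, the rule that each write stores one average-sized payload, and the 30-day window for the SLO. It is one way to organize the arithmetic, not a standard formula.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    """Hypothetical inputs for a rough estimate; every value is a guess."""
    daily_active_users: int
    requests_per_user_per_day: float
    avg_payload_bytes: int   # assumed average size of one request plus response
    write_fraction: float    # assumed share of requests that store new data
    peak_factor: float       # assumed ratio of rush-hour load to average load


def estimate(w: Workload) -> dict[str, float]:
    """Turn the workload guesses into QPS, bandwidth, and storage numbers."""
    requests_per_day = w.daily_active_users * w.requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * w.peak_factor
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        # Bandwidth at rush hour: requests per second times bytes per request.
        "peak_bandwidth_bytes_per_s": peak_qps * w.avg_payload_bytes,
        # Simplification: each write stores one average-sized payload.
        "storage_bytes_per_day": requests_per_day * w.write_fraction * w.avg_payload_bytes,
    }


def allowed_downtime_minutes(availability: float, days: int = 30) -> float:
    """Downtime an availability SLO permits over a window of `days` days."""
    return (1 - availability) * days * 24 * 60


example = Workload(
    daily_active_users=1_000_000,
    requests_per_user_per_day=20,
    avg_payload_bytes=2_000,
    write_fraction=0.1,
    peak_factor=3.0,
)
for name, value in estimate(example).items():
    print(f"{name}: {value:,.1f}")
print(f"downtime budget at 99.9%: {allowed_downtime_minutes(0.999):.1f} min per 30 days")
```

With these made-up inputs the system sees about 231 QPS on average and about 694 QPS at peak, moves roughly 1.4 MB per second at rush hour, and stores about 4 GB of new data per day. A 99.9% availability SLO leaves about 43 minutes of downtime in a 30-day window.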
Trade-off: A conscious choice between two options, where each has an upside but also a price. There's no perfect answer — you pick what fits and explain to the interviewer what you gain and what you give up. Like choosing between a fast, pricey route and a cheap, slow one.
Operational metric: A number you can measure to tell whether a decision actually works once the system is live in production and serving real users. For example: latency (how long a response takes to come back), error rate (the share of requests that fail), queue lag (how far behind the workers are, for example how many tasks are still waiting in line), cache hit ratio (how often we found the answer ready in a fast temporary store), and more.
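As a rough illustration of how such metrics are computed, the sketch below summarizes a small batch of request records. The `RequestRecord` type, its fields, and the sample data are hypothetical, and the percentile uses the simple nearest-rank method; real monitoring systems usually compute these numbers from streaming data with their own approximations.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """Hypothetical log entry for one request."""
    latency_ms: float
    ok: bool          # False if the request failed
    cache_hit: bool   # True if the answer came from the cache


def percentile(values: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(records: list[RequestRecord]) -> dict[str, float]:
    """Compute latency percentiles, error rate, and cache hit ratio."""
    latencies = [r.latency_ms for r in records]
    total = len(records)
    return {
        "p50_latency_ms": percentile(latencies, 50),
        "p95_latency_ms": percentile(latencies, 95),
        "error_rate": sum(not r.ok for r in records) / total,
        "cache_hit_ratio": sum(r.cache_hit for r in records) / total,
    }


# Made-up sample: mostly fast requests, two slow ones, one failure.
latencies = [12, 15, 11, 14, 250, 13, 16, 12, 900, 14]
records = [
    RequestRecord(latency_ms=ms, ok=(ms != 900), cache_hit=(i < 7))
    for i, ms in enumerate(latencies)
]
summary = summarize(records)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this sample the median latency looks healthy while the 95th percentile is dominated by the slowest request, which is why latency targets are usually stated as percentiles rather than averages.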