There is a comfortable fiction at the center of most cloud architectures, one that gets written into runbooks and repeated in postmortems with the same exhausted confidence: we autoscale. As if the declaration itself is a reliability posture. As if telling your HPA to watch CPU utilization is the same thing as building a system that breathes.
It isn’t. And the gap between those two things has eaten more than a few production environments.