Every overloaded system sheds load somehow. The choice is whether it does so deliberately or accidentally. Accidental shedding looks like timeouts, huge queues, retry storms, and exhausted connection pools. Deliberate shedding makes a policy visible: reject exports before checkout, protect live requests from batch jobs, return a clear 503 with retry guidance, or serve stale data with a warning.
A concrete failure mode is an API that shares one worker pool between user requests and expensive background reports. During a traffic spike, reports fill the pool. User requests wait, clients retry, and the retries add more work. The system is “up” but behaves randomly. Adding load shedding might cap report concurrency, make report creation return later, and keep the interactive endpoint within its latency budget.
Good load shedding is a product and operations decision, not just code. Teams need to decide what matters first, what users should see, what alerts should fire, and how recovery happens after pressure drops.