is a technique that limits the number of requests, or the amount of data, that a server handles in a given time period.
- Helps to prevent overload, improve performance, and enforce quota limits.
- Protects against denial-of-service (DoS) attacks by limiting the number of requests per IP address.
- Helps to balance load across different servers or regions.
- Can be used to prioritize requests, for example by offering paid tiers with a higher quality of service.
- Can be used to govern and commercialize APIs (e.g., Google Maps).
- Cloud services use throttling to enforce limits and quotas on storage, compute, and other resources.
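
To make the per-IP limit above concrete, here is a minimal sketch of one common approach, a token bucket. The names `TokenBucket` and `Throttle` and the `capacity`, `rate`, and `clock` parameters are hypothetical choices for illustration, not taken from any particular library, and real services often use other algorithms (fixed or sliding windows) or a shared store instead of in-process state.

```python
import time


class TokenBucket:
    """Permits bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.clock = clock            # injectable so the logic can be tested without sleeping
        self.tokens = self.capacity   # start with a full bucket
        self.updated = self.clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class Throttle:
    """Keeps one bucket per client key (assumed here to be an IP address)."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.buckets = {}

    def allow(self, key):
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.rate, self.clock)
            self.buckets[key] = bucket
        return bucket.allow()


if __name__ == "__main__":
    throttle = Throttle(capacity=2, rate=1.0)
    for i in range(3):
        # The third immediate request from the same address is expected to be rejected.
        print(i, throttle.allow("203.0.113.7"))
```

When `allow` returns `False`, an HTTP server would typically reject the request with status 429 (Too Many Requests). Prioritized or paid tiers could be approximated by giving each tier its own `capacity` and `rate`, though that mapping is an assumption of this sketch.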