Rate limiting is one of those controls that everyone agrees with in principle and that almost everyone gets wrong in practice. The simple version, capping requests per IP per minute, made sense fifteen years ago when most attackers operated from a small number of stable addresses. Today the attackers control sprawling residential proxy networks that source traffic from millions of household IP addresses, and a per-IP limit barely slows them down.
Per-Account Rate Limits Outperform Per-IP
Limits applied per authenticated identity work much better than limits applied per source address. An attacker can rotate through ten thousand IP addresses with ease. Rotating through ten thousand valid accounts takes considerably more effort and produces signals that abuse detection can pick up. Tie rate limits to the calling identity, not the network it happens to be using, and you eliminate the bulk of the brute force category. A capable penetration testing company will test both vectors during an engagement.
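The idea above can be sketched as a sliding-window limiter keyed by account rather than IP. This is a minimal in-memory illustration for a single process; a production system would back the counters with a shared store such as Redis, and all names here are illustrative.

```python
import time
from collections import defaultdict, deque


class PerIdentityLimiter:
    """Sliding-window rate limiter keyed by authenticated identity."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, account_id: str) -> bool:
        now = time.monotonic()
        hits = self._hits[account_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # budget for this identity is spent
        hits.append(now)
        return True
```

Because the key is the account, rotating source IPs buys the attacker nothing: every request still lands on the same counter.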
Endpoint Sensitivity Should Drive The Threshold
Not every endpoint deserves the same limit. A search endpoint that hits an expensive database query should tolerate fewer calls per minute than a status endpoint that returns a cached health check. Tiering your limits based on the cost and sensitivity of the underlying operation is what separates effective rate limiting from a blanket policy that frustrates real users and barely slows real attackers. The login endpoint, the password reset endpoint and the payment endpoint should all sit at the strict end of the scale.
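A tiering policy like this is often easiest to keep honest as a single table that maps endpoints to thresholds. The paths and numbers below are illustrative, not a recommendation; the point is the shape, with sensitive endpoints at the strict end and a default so new endpoints are never unlimited.

```python
# Hypothetical tier table: limits scale with the cost and sensitivity
# of the operation behind each endpoint.
ENDPOINT_LIMITS = {
    "/api/login":          {"per_minute": 5},    # credential guessing target
    "/api/password-reset": {"per_minute": 3},    # account takeover target
    "/api/payments":       {"per_minute": 10},   # financial impact
    "/api/search":         {"per_minute": 30},   # expensive database query
    "/api/status":         {"per_minute": 600},  # cached health check
}

# Fallback so an endpoint missing from the table still gets a limit.
DEFAULT_LIMIT = {"per_minute": 60}


def limit_for(path: str) -> dict:
    return ENDPOINT_LIMITS.get(path, DEFAULT_LIMIT)
```

Keeping the table in one place also makes the policy reviewable: a security review can read five lines instead of hunting through handlers.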
Expert Commentary
William Fieldhouse, Director of Aardwolf Security Ltd

Most rate limiting failures we report come from inconsistent application. The endpoint at /api/login is protected. The endpoint at /api/v2/login was added six months later and the same protection never made it across. Maintain a single point of enforcement, ideally at the API gateway, and every endpoint inherits the right behaviour by default.
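A single enforcement point can be sketched as a gateway function that every request passes through before routing. The handler names and the limiter interface below are assumptions for illustration; the property that matters is that /api/login and /api/v2/login inherit the same check automatically.

```python
def make_gateway(limiter, routes):
    """Return a request handler that applies the rate limit before routing.

    `limiter` is anything with an allow(account_id) -> bool method;
    `routes` maps paths to zero-argument handler functions. Both are
    illustrative stand-ins for a real gateway's configuration.
    """

    def handle(account_id: str, path: str):
        if not limiter.allow(account_id):  # enforced for every route
            return 429, "rate limit exceeded"
        handler = routes.get(path)
        if handler is None:
            return 404, "not found"
        return 200, handler()

    return handle
```

An endpoint added six months later only needs a routes entry; it cannot opt out of the check because the check runs before dispatch.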
Monitoring Closes The Loop
Detecting abuse requires logging that captures enough context to identify patterns over time. Request rates per endpoint per identity, error patterns by source, timing distributions and unusual parameter combinations all contribute to a useful telemetry stream. Pair the logging with regular review by people who understand normal traffic patterns. Automated detection catches the obvious abuse. Human review catches the abuse that was carefully crafted to look like normal traffic. It is worth treating telemetry as a first-class deliverable from each new API endpoint rather than something added later. Endpoints that ship with proper logging produce useful operational data from day one. Endpoints retrofitted with logging years later rarely produce the same quality of signal.
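One workable shape for that telemetry is a structured record per request, emitted as a JSON line so rates per endpoint per identity and timing distributions can be rebuilt later. The field names here are illustrative, not a standard schema.

```python
import json
import time


def request_record(account_id: str, endpoint: str, status: int,
                   duration_ms: float) -> dict:
    """Build one structured record per request for later analysis."""
    return {
        "ts": time.time(),          # when the request happened
        "account": account_id,      # the authenticated identity, not the IP
        "endpoint": endpoint,       # which operation was called
        "status": status,           # success vs error patterns
        "duration_ms": duration_ms, # feeds timing distributions
    }


def emit(record: dict) -> None:
    # One JSON object per line keeps the log machine-parseable.
    print(json.dumps(record, sort_keys=True))
```

Structured records like this are what make the human review described above possible: an analyst can group, filter and chart them instead of grepping free-form log lines.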
Response Behaviour Matters Too
When the limit triggers, the response should give the legitimate caller enough information to back off, while giving the attacker as little useful information as possible. A 429 status code with a Retry-After header is the standard. Avoid leaking specifics like the remaining budget for the next minute, because that helps the attacker tune their request rate to stay just below your detection threshold. Pair this with a focused vulnerability scan to catch the endpoints that slipped through the standard policy.
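That response policy is small enough to show directly. The tuple shape below is an illustrative stand-in for whatever response object your framework uses; the substance is what the headers include and what they deliberately omit.

```python
def rate_limit_response(retry_after_seconds: int) -> tuple[int, dict, str]:
    """Build a 429 response that tells a legitimate caller when to retry
    without revealing the remaining budget an attacker could tune against.
    """
    headers = {
        "Retry-After": str(retry_after_seconds),
        # Deliberately no X-RateLimit-Remaining style headers: the
        # remaining budget helps an attacker stay just under the limit.
    }
    return 429, headers, "Too Many Requests"
```

A well-behaved client backs off for the advertised interval; an attacker probing for the threshold learns nothing beyond the fact that they hit it.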
Rate limiting is a quiet control. When it works you barely notice. When it does not work the consequences land on the news pages. Rate limiting done well is invisible to legitimate users and brutal on automated abuse. The investment is modest and the protection compounds over time. API security is harder than web application security in some respects and easier in others. The teams that understand the differences and design their controls accordingly tend to produce better outcomes than the ones that simply apply web thinking to API problems.