Rate Limiting
Overview
Q01 Core APIs implement rate limiting to prevent abuse, ensure fair usage, and maintain system stability. Rate limits restrict the number of API requests a user or tenant can make within a specific time window.
Benefits:
- ✅ Prevents API abuse and DoS attacks
- ✅ Ensures fair resource allocation
- ✅ Maintains system performance
- ✅ Protects backend services from overload
- ✅ Encourages efficient client implementations
Rate Limit Dimensions:
- Per User - Individual user request limits
- Per Tenant - Tenant-wide request limits
- Per Endpoint - Operation-specific limits
- Global - System-wide limits
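For illustration, a limiter can track one counter per dimension and allow a request only when every counter is under its limit. A minimal sketch of such a key scheme (the key names are hypothetical, not the actual CoreService convention):

// Hypothetical key scheme: one Redis-style counter key per dimension.
// A request is allowed only if every dimension's counter is under its limit.
function rateLimitKeys({ userId, tenantId, method, path }, window = 'hour') {
  return [
    `ratelimit:user:${userId}:${window}`,             // per user
    `ratelimit:tenant:${tenantId}:${window}`,         // per tenant
    `ratelimit:endpoint:${method}:${path}:${window}`, // per endpoint
    `ratelimit:global:${window}`                      // system-wide
  ];
}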
Rate Limit Tiers
Standard Limits
Per User Limits:
| Window | Limit | Burst |
|---|---|---|
| Second | 10 requests | 15 requests |
| Minute | 100 requests | 150 requests |
| Hour | 1,000 requests | 1,500 requests |
Per Tenant Limits:
| Window | Limit | Burst |
|---|---|---|
| Second | 100 requests | 150 requests |
| Minute | 1,000 requests | 1,500 requests |
| Hour | 10,000 requests | 15,000 requests |
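Encoded as data, the standard tiers above might look like this (an illustrative constant for client-side pre-checks; the structure is an assumption, not part of the API):

// Illustrative encoding of the standard tiers; values mirror the tables above.
const STANDARD_LIMITS = {
  perUser: {
    second: { limit: 10, burst: 15 },
    minute: { limit: 100, burst: 150 },
    hour: { limit: 1000, burst: 1500 }
  },
  perTenant: {
    second: { limit: 100, burst: 150 },
    minute: { limit: 1000, burst: 1500 },
    hour: { limit: 10000, burst: 15000 }
  }
};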
Operation-Specific Limits
Read Operations (GET):
| Endpoint | Limit | Window |
|---|---|---|
| GET /api/v4/core/{dim} | 200/min | 1 minute |
| GET /api/v4/core/{dim}/{id} | 300/min | 1 minute |
Write Operations (POST/PUT/PATCH/DELETE):
| Endpoint | Limit | Window |
|---|---|---|
| POST /api/v4/core/{dim} | 50/min | 1 minute |
| PUT /api/v4/core/{dim}/{id} | 100/min | 1 minute |
| PATCH /api/v4/core/{dim}/{id} | 100/min | 1 minute |
| DELETE /api/v4/core/{dim}/{id} | 20/min | 1 minute |
Rationale:
- Read operations have higher limits (less resource-intensive)
- Write operations have lower limits (database writes, cascades, outbox)
- Delete operations have the lowest limits (destructive operations); a client-side pacing sketch follows below
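For example, a client batching deletes against the 20/min limit can space requests about 3 seconds apart. A minimal pacing sketch (the helper is hypothetical):

// Minimal pacing sketch: space calls evenly to stay under a per-minute
// limit. At 20/min, each call is assigned a slot 3 seconds after the last.
function paceFor(limitPerMinute) {
  const intervalMs = 60000 / limitPerMinute;
  let nextSlot = Date.now();
  return async function pacedCall(fn) {
    const wait = Math.max(0, nextSlot - Date.now());
    nextSlot = Math.max(Date.now(), nextSlot) + intervalMs;
    if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
    return fn();
  };
}

// Usage (hypothetical endpoint):
// const pacedDelete = paceFor(20);
// await pacedDelete(() => fetch(url, { method: 'DELETE' }));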
Rate Limit Headers
Response Headers
Every API response includes rate limit headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1735574400
X-RateLimit-Window: 3600
Header Descriptions:
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Total requests allowed in window | 1000 |
| X-RateLimit-Remaining | Remaining requests in current window | 847 |
| X-RateLimit-Reset | Unix timestamp when limit resets | 1735574400 |
| X-RateLimit-Window | Window duration in seconds | 3600 (1 hour) |
Example Response
Normal Response (within limits):
GET /api/v4/core/PRD
Authorization: Bearer {token}

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1735574400

{
  "status": "success",
  "data": [...]
}
Rate Limit Exceeded:
GET /api/v4/core/PRD
Authorization: Bearer {token}

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735574400
Retry-After: 300

{
  "error": "RateLimitExceeded",
  "message": "Rate limit exceeded. Retry after 300 seconds.",
  "code": "RATE_LIMIT_EXCEEDED",
  "status": 429,
  "retry_after": 300,
  "limit": 1000,
  "window": 3600
}
Rate Limit Implementation
Server-Side Tracking
CoreService tracks requests using Redis:
// routingController.go (excerpt)
import (
    "fmt"
    "time"

    "github.com/gobuffalo/buffalo"
)

// RateLimitMiddleware enforces per-user limits before handing off to the
// next handler. `redis` here is a thin wrapper around the Redis client
// (INCR/EXPIRE/TTL), and `r` is the app's buffalo render engine.
func RateLimitMiddleware(next buffalo.Handler) buffalo.Handler {
    return func(c buffalo.Context) error {
        // Extract user from JWT claims placed on the context upstream
        userId := c.Value("user_id").(string)

        // Check rate limit
        allowed, remaining, resetAt := checkRateLimit(userId)

        // Set headers on every response, allowed or not
        c.Response().Header().Set("X-RateLimit-Limit", "1000")
        c.Response().Header().Set("X-RateLimit-Remaining", fmt.Sprintf("%d", remaining))
        c.Response().Header().Set("X-RateLimit-Reset", fmt.Sprintf("%d", resetAt))

        if !allowed {
            retryAfter := resetAt - time.Now().Unix()
            c.Response().Header().Set("Retry-After", fmt.Sprintf("%d", retryAfter))
            return c.Render(429, r.JSON(map[string]interface{}{
                "error":       "RateLimitExceeded",
                "message":     fmt.Sprintf("Rate limit exceeded. Retry after %d seconds.", retryAfter),
                "code":        "RATE_LIMIT_EXCEEDED",
                "status":      429,
                "retry_after": retryAfter,
            }))
        }
        return next(c)
    }
}

func checkRateLimit(userId string) (allowed bool, remaining int64, resetAt int64) {
    key := fmt.Sprintf("ratelimit:%s:hour", userId)

    // Increment counter; INCR creates the key at 1 if it does not exist
    count, _ := redis.Incr(key)

    // Set expiration on first request. Note: INCR followed by EXPIRE is
    // not atomic; production code should use a pipeline or Lua script.
    if count == 1 {
        redis.Expire(key, 3600) // 1 hour, in seconds
    }

    // Remaining TTL (in seconds) tells us when the window resets
    ttl, _ := redis.TTL(key)
    resetAt = time.Now().Unix() + ttl

    const limit int64 = 1000
    remaining = limit - count
    if remaining < 0 {
        remaining = 0 // never report a negative remaining count
    }
    allowed = count <= limit
    return allowed, remaining, resetAt
}
Token Bucket Algorithm
Conceptually, rate limiting follows a token bucket algorithm (the Redis counter shown above is a simplified fixed-window approximation of it):
Bucket Capacity: 1000 tokens
Refill Rate: 1000 tokens/hour (or ~0.28 tokens/second)
Request arrives
    ↓
Bucket has tokens? → Yes → Consume 1 token, allow request
    ↓
    No → Return 429 Too Many Requests
Benefits:
- ✅ Allows bursts (temporary spikes)
- ✅ Smooth rate limiting
- ✅ Fair over time
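A minimal in-memory token bucket sketch (illustrative only; the capacity and refill rate mirror the figures above, not the actual server implementation):

// Illustrative token bucket: refill continuously, consume one token per request.
class TokenBucket {
  constructor(capacity = 1000, refillPerSecond = 1000 / 3600) {
    this.capacity = capacity;
    this.tokens = capacity;          // start full, which is what allows bursts
    this.refillPerSecond = refillPerSecond;
    this.lastRefill = Date.now();
  }

  tryConsume(n = 1) {
    // Refill based on elapsed time, capped at capacity
    const elapsedSeconds = (Date.now() - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = Date.now();

    if (this.tokens >= n) {
      this.tokens -= n;
      return true;   // allow request
    }
    return false;    // reject: 429 Too Many Requests
  }
}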
Handling Rate Limits
Client-Side Detection
JavaScript Rate Limit Handler:
class RateLimitHandler {
  constructor() {
    this.rateLimitInfo = {
      limit: 1000,
      remaining: 1000,
      resetAt: Date.now() + 3600000
    };
  }

  updateFromHeaders(headers) {
    // parseInt returns NaN for missing headers; a plain `|| fallback`
    // would also wrongly replace a legitimate 0 remaining.
    const parsed = (name, fallback) => {
      const value = parseInt(headers.get(name), 10);
      return Number.isNaN(value) ? fallback : value;
    };
    this.rateLimitInfo = {
      limit: parsed('X-RateLimit-Limit', 1000),
      remaining: parsed('X-RateLimit-Remaining', 1000),
      resetAt: parsed('X-RateLimit-Reset', Math.floor(Date.now() / 1000) + 3600) * 1000
    };
  }

  isNearLimit(threshold = 0.1) {
    const percentRemaining = this.rateLimitInfo.remaining / this.rateLimitInfo.limit;
    return percentRemaining <= threshold;
  }

  getResetTime() {
    return new Date(this.rateLimitInfo.resetAt);
  }

  getTimeUntilReset() {
    return Math.max(0, this.rateLimitInfo.resetAt - Date.now());
  }
}

// Usage
const rateLimitHandler = new RateLimitHandler();

async function apiRequest(url, options = {}) {
  // Check if near limit
  if (rateLimitHandler.isNearLimit(0.1)) {
    console.warn('Approaching rate limit:', rateLimitHandler.rateLimitInfo);
  }
  const response = await fetch(url, options);
  // Update rate limit info from response headers
  rateLimitHandler.updateFromHeaders(response.headers);
  // Handle rate limit exceeded
  if (response.status === 429) {
    const error = await response.json();
    const retryAfter = error.retry_after * 1000; // Convert to milliseconds
    console.warn(`Rate limit exceeded. Retry after ${error.retry_after}s`);
    // Wait and retry. Note: this retries indefinitely; prefer the
    // bounded backoff helper in the next section for production use.
    await sleep(retryAfter);
    return apiRequest(url, options);
  }
  return response;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
Exponential Backoff
Retry with exponential backoff:
async function apiRequestWithBackoff(url, options = {}, maxRetries = 3) {
  let retryCount = 0;
  while (retryCount < maxRetries) {
    try {
      const response = await fetch(url, options);
      if (response.status === 429) {
        retryCount++;
        if (retryCount >= maxRetries) {
          throw new Error('Max retries exceeded');
        }
        // Get Retry-After header or use exponential backoff
        const retryAfter = response.headers.get('Retry-After');
        const backoffMs = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.pow(2, retryCount) * 1000; // 2s, 4s, ... doubling per retry
        console.log(`Rate limited. Retry ${retryCount}/${maxRetries} after ${backoffMs}ms`);
        await sleep(backoffMs);
        continue;
      }
      return response;
    } catch (error) {
      console.error('API request failed:', error);
      throw error;
    }
  }
  throw new Error('Request failed after retries');
}
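In practice, adding random jitter to the computed delay helps prevent many clients from retrying in lockstep when a shared limit resets. A small variation on the backoff calculation above:

// Full jitter: pick a random delay up to the exponential cap so that
// concurrent clients spread their retries out instead of synchronizing.
function backoffWithJitter(retryCount, baseMs = 1000) {
  const capMs = Math.pow(2, retryCount) * baseMs; // 2s, 4s, 8s, ...
  return Math.random() * capMs;
}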
Throttling Requests
Limit concurrent requests:
class RequestThrottler {
  constructor(maxConcurrent = 10) {
    this.maxConcurrent = maxConcurrent;
    this.activeRequests = 0;
    this.queue = [];
  }

  async throttle(requestFn) {
    // Wait if at max concurrency
    if (this.activeRequests >= this.maxConcurrent) {
      await new Promise(resolve => this.queue.push(resolve));
    }
    this.activeRequests++;
    try {
      return await requestFn();
    } finally {
      this.activeRequests--;
      // Process next queued request
      if (this.queue.length > 0) {
        const resolve = this.queue.shift();
        resolve();
      }
    }
  }
}

// Usage
const throttler = new RequestThrottler(10);

async function batchCreateProducts(products) {
  const results = await Promise.all(
    products.map(product =>
      throttler.throttle(() => createProduct(product))
    )
  );
  return results;
}
Rate Limit Monitoring
Client-Side Monitoring
Track rate limit usage:
class RateLimitMonitor {
  constructor() {
    this.history = [];
  }

  record(remaining, limit, timestamp = Date.now()) {
    this.history.push({ remaining, limit, timestamp });
    // Keep last 100 records
    if (this.history.length > 100) {
      this.history.shift();
    }
  }

  getAverageUsage() {
    if (this.history.length === 0) return 0;
    const totalUsage = this.history.reduce((sum, record) => {
      return sum + (record.limit - record.remaining);
    }, 0);
    return totalUsage / this.history.length;
  }

  isHighUsage(threshold = 0.8) {
    if (this.history.length === 0) return false;
    const latest = this.history[this.history.length - 1];
    const percentUsed = (latest.limit - latest.remaining) / latest.limit;
    return percentUsed >= threshold;
  }

  logStatus() {
    if (this.history.length === 0) {
      console.log('No rate limit data');
      return;
    }
    const latest = this.history[this.history.length - 1];
    const percentUsed = ((latest.limit - latest.remaining) / latest.limit * 100).toFixed(1);
    console.log(`Rate Limit: ${latest.remaining}/${latest.limit} (${percentUsed}% used)`);
  }
}
// Usage
const monitor = new RateLimitMonitor();

async function apiRequest(url, options = {}) {
  const response = await fetch(url, options);
  // Record rate limit info (skip if the headers are absent)
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  if (!Number.isNaN(remaining) && !Number.isNaN(limit)) {
    monitor.record(remaining, limit);
  }
  // Check if high usage
  if (monitor.isHighUsage(0.8)) {
    console.warn('High API usage detected');
  }
  return response;
}
Best Practices
✅ DO:
Check rate limit headers:
// ✅ Good - monitor remaining requests
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
if (remaining < 50) {
  console.warn('Low rate limit remaining:', remaining);
}
Implement exponential backoff:
// ✅ Good - retry with backoff
if (response.status === 429) {
  const retryAfter = response.headers.get('Retry-After');
  await sleep(retryAfter * 1000);
  return retry();
}
Throttle batch operations:
// ✅ Good - limit concurrent requests
const throttler = new RequestThrottler(10);
await Promise.all(products.map(p =>
  throttler.throttle(() => createProduct(p))
));
Cache responses:
// ✅ Good - reduce API calls
const cached = cache.get(key);
if (cached) return cached;
const data = await fetchFromAPI();
cache.set(key, data, 60); // Cache 1 minute
return data;
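The `cache` object above is assumed; a minimal Map-backed TTL cache that provides the same `get`/`set` shape could look like this:

// Minimal TTL cache sketch backing the example above; entries expire
// after ttlSeconds and are evicted lazily on the next read.
const cache = {
  store: new Map(),
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  },
  set(key, value, ttlSeconds) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
};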
❌ DON'T:
Don't ignore rate limit headers:
// ❌ Bad - no rate limit awareness
await fetch(url); // No header checking
Don't hammer API on 429:
// ❌ Bad - immediate retry
if (response.status === 429) {
  return fetch(url); // No waiting!
}
Don't send unbounded parallel requests:
// ❌ Bad - 1000 concurrent requests
await Promise.all(
  largeArray.map(item => createItem(item))
);
Don't skip caching:
// ❌ Bad - fetching same data repeatedly
for (let i = 0; i < 10; i++) {
  await getProduct(123); // Same product 10 times!
}
Summary
- ✅ Rate limits prevent abuse and ensure fair usage
- ✅ Limits per user, tenant, endpoint, and global
- ✅ Response headers indicate limit status
- ✅ 429 status code when limit exceeded
- ✅ Retry-After header indicates wait time
- ✅ Token bucket algorithm allows bursts
- ✅ Client should implement exponential backoff
- ✅ Throttle concurrent requests for batch operations
Key Takeaways:
- Monitor rate limit headers on every response
- Implement exponential backoff for 429 errors
- Throttle batch operations to stay within limits
- Cache responses to reduce API calls
- Check Retry-After header for wait time
- Write operations have lower limits than reads
- Be a good API citizen - don't abuse limits
Rate Limit Flow:
API Request
    ↓
Check Rate Limit (Redis)
    ↓
Within Limit? → Yes → Process Request
    ↓                      ↓
    No                 Set Headers
    ↓                      ↓
Return 429           Return Response
    ↓
Client Waits (Retry-After)
    ↓
Retry Request
Response Headers:
X-RateLimit-Limit: 1000 (total allowed)
X-RateLimit-Remaining: 847 (remaining in window)
X-RateLimit-Reset: 1735574400 (reset timestamp)
Retry-After: 300 (wait 300 seconds)
Related Concepts
- Security Overview - Multi-level security
- Authentication - JWT tokens
- Error Handling - 429 errors
- Batch Operations - Throttling batches