Rate Limiting

Overview

Q01 Core APIs implement rate limiting to prevent abuse, ensure fair usage, and maintain system stability. Rate limits restrict the number of API requests a user or tenant can make within a specific time window.

Benefits:

  • ✅ Prevents API abuse and DoS attacks
  • ✅ Ensures fair resource allocation
  • ✅ Maintains system performance
  • ✅ Protects backend services from overload
  • ✅ Encourages efficient client implementations

Rate Limit Dimensions:

  • Per User - Individual user request limits
  • Per Tenant - Tenant-wide request limits
  • Per Endpoint - Operation-specific limits
  • Global - System-wide limits
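
As a sketch of how these dimensions could be tracked, each one can map to its own counter key in the style of the Redis-backed counter shown later in this document. The key names here are illustrative assumptions, not the documented CoreService scheme:

// Illustrative only: one counter key per dimension. Key names are assumptions.
function rateLimitKeys(userId, tenantId, endpoint) {
  return [
    `ratelimit:user:${userId}:hour`,         // Per User
    `ratelimit:tenant:${tenantId}:hour`,     // Per Tenant
    `ratelimit:endpoint:${endpoint}:minute`, // Per Endpoint
    'ratelimit:global:hour'                  // Global
  ];
}

Under this scheme, a request would be allowed only when every dimension's counter is under its limit.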

Rate Limit Tiers

Standard Limits

Per User Limits:

| Window | Limit          | Burst          |
|--------|----------------|----------------|
| Second | 10 requests    | 15 requests    |
| Minute | 100 requests   | 150 requests   |
| Hour   | 1,000 requests | 1,500 requests |

Per Tenant Limits:

| Window | Limit           | Burst           |
|--------|-----------------|-----------------|
| Second | 100 requests    | 150 requests    |
| Minute | 1,000 requests  | 1,500 requests  |
| Hour   | 10,000 requests | 15,000 requests |

Burst values allow short spikes above the sustained limit; averaged over the full window, traffic must still stay within the base limit.

Operation-Specific Limits

Read Operations (GET):

| Endpoint                    | Limit   | Window   |
|-----------------------------|---------|----------|
| GET /api/v4/core/{dim}      | 200/min | 1 minute |
| GET /api/v4/core/{dim}/{id} | 300/min | 1 minute |

Write Operations (POST/PUT/PATCH/DELETE):

| Endpoint                       | Limit   | Window   |
|--------------------------------|---------|----------|
| POST /api/v4/core/{dim}        | 50/min  | 1 minute |
| PUT /api/v4/core/{dim}/{id}    | 100/min | 1 minute |
| PATCH /api/v4/core/{dim}/{id}  | 100/min | 1 minute |
| DELETE /api/v4/core/{dim}/{id} | 20/min  | 1 minute |

Rationale:

  • Read operations have higher limits (less resource-intensive)
  • Write operations have lower limits (database writes, cascades, outbox)
  • Delete operations have the lowest limits (destructive operations)
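
As a concrete sketch, the tables above can be collapsed into a per-method lookup a client might use for local budgeting. The map is illustrative, not an API; it uses the collection limit for GET, while by-id GETs allow 300/min:

// Per-minute limits from the tables above (illustrative lookup, not an API)
const perMinuteLimits = {
  GET: 200,    // reads are less resource-intensive
  POST: 50,    // writes hit the database, cascades, and the outbox
  PUT: 100,
  PATCH: 100,
  DELETE: 20   // destructive operations get the lowest limit
};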

Rate Limit Headers

Response Headers

Every API response includes rate limit headers:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1735574400
X-RateLimit-Window: 3600

Header Descriptions:

| Header                | Description                          | Example       |
|-----------------------|--------------------------------------|---------------|
| X-RateLimit-Limit     | Total requests allowed in window     | 1000          |
| X-RateLimit-Remaining | Remaining requests in current window | 847           |
| X-RateLimit-Reset     | Unix timestamp when limit resets     | 1735574400    |
| X-RateLimit-Window    | Window duration in seconds           | 3600 (1 hour) |
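
Since X-RateLimit-Reset is a Unix timestamp in seconds while JavaScript dates use milliseconds, converting it takes one multiplication; a quick sketch using the example value above:

// X-RateLimit-Reset is Unix seconds; Date expects milliseconds
const resetAt = new Date(1735574400 * 1000);
console.log(`Window resets at ${resetAt.toISOString()}`);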

Example Response

Normal Response (within limits):

GET /api/v4/core/PRD
Authorization: Bearer {token}

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1735574400

{
  "status": "success",
  "data": [...]
}

Rate Limit Exceeded:

GET /api/v4/core/PRD
Authorization: Bearer {token}

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735574400
Retry-After: 300

{
  "error": "RateLimitExceeded",
  "message": "Rate limit exceeded. Retry after 300 seconds.",
  "code": "RATE_LIMIT_EXCEEDED",
  "status": 429,
  "retry_after": 300,
  "limit": 1000,
  "window": 3600
}

Rate Limit Implementation

Server-Side Tracking

CoreService tracks requests using Redis:

// routingController.go
// Note: `redis` is assumed to be a project-local wrapper returning (int64, error);
// `r` is the package-level Buffalo render engine.
import (
    "fmt"
    "time"

    "github.com/gobuffalo/buffalo"
)

func RateLimitMiddleware(next buffalo.Handler) buffalo.Handler {
    return func(c buffalo.Context) error {
        // Extract user from JWT
        userId := c.Value("user_id").(string)

        // Check rate limit
        allowed, remaining, resetAt := checkRateLimit(userId)

        // Set rate limit headers on every response
        c.Response().Header().Set("X-RateLimit-Limit", "1000")
        c.Response().Header().Set("X-RateLimit-Remaining", fmt.Sprintf("%d", remaining))
        c.Response().Header().Set("X-RateLimit-Reset", fmt.Sprintf("%d", resetAt))

        if !allowed {
            retryAfter := resetAt - time.Now().Unix()
            c.Response().Header().Set("Retry-After", fmt.Sprintf("%d", retryAfter))

            return c.Render(429, r.JSON(map[string]interface{}{
                "error":       "RateLimitExceeded",
                "message":     fmt.Sprintf("Rate limit exceeded. Retry after %d seconds.", retryAfter),
                "code":        "RATE_LIMIT_EXCEEDED",
                "status":      429,
                "retry_after": retryAfter,
            }))
        }

        return next(c)
    }
}

func checkRateLimit(userId string) (bool, int, int64) {
    key := fmt.Sprintf("ratelimit:%s:hour", userId)

    // Increment the per-user counter
    count, _ := redis.Incr(key)

    // Set expiration on the first request of the window
    if count == 1 {
        redis.Expire(key, 3600) // 1 hour
    }

    // Derive the reset timestamp from the key's remaining TTL (in seconds)
    ttl, _ := redis.TTL(key)
    resetAt := time.Now().Unix() + int64(ttl)

    limit := int64(1000)
    remaining := int(limit - count)
    if remaining < 0 {
        remaining = 0
    }
    allowed := count <= limit

    return allowed, remaining, resetAt
}

Token Bucket Algorithm

Rate limiting uses a token bucket algorithm:

Bucket capacity: 1000 tokens
Refill rate:     1000 tokens/hour (~0.28 tokens/second)

Request arrives
      │
      ▼
Bucket has tokens?
      ├── Yes → consume 1 token, allow the request
      └── No  → return 429 Too Many Requests
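
A minimal in-memory sketch of such a bucket, using the numbers above. The production limiter is server-side and Redis-backed; this class is illustrative only:

// Token bucket: capacity 1000, refilled continuously at 1000 tokens/hour
class TokenBucket {
  constructor(capacity = 1000, refillPerHour = 1000) {
    this.capacity = capacity;
    this.tokens = capacity;                     // start full
    this.refillPerMs = refillPerHour / 3600000; // ~0.28 tokens/second
    this.lastRefill = Date.now();
  }

  allow() {
    // Refill in proportion to elapsed time, capped at capacity
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (now - this.lastRefill) * this.refillPerMs
    );
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1; // consume one token and allow the request
      return true;
    }
    return false;       // empty bucket → the server returns 429
  }
}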

Benefits:

  • ✅ Allows bursts (temporary spikes)
  • ✅ Smooth rate limiting
  • ✅ Fair over time

Handling Rate Limits

Client-Side Detection

JavaScript Rate Limit Handler:

class RateLimitHandler {
  constructor() {
    this.rateLimitInfo = {
      limit: 1000,
      remaining: 1000,
      resetAt: Date.now() + 3600000
    };
  }

  updateFromHeaders(headers) {
    this.rateLimitInfo = {
      limit: parseInt(headers.get('X-RateLimit-Limit'), 10) || 1000,
      remaining: parseInt(headers.get('X-RateLimit-Remaining'), 10) || 1000,
      resetAt: parseInt(headers.get('X-RateLimit-Reset'), 10) * 1000
    };
  }

  isNearLimit(threshold = 0.1) {
    const percentRemaining = this.rateLimitInfo.remaining / this.rateLimitInfo.limit;
    return percentRemaining <= threshold;
  }

  getResetTime() {
    return new Date(this.rateLimitInfo.resetAt);
  }

  getTimeUntilReset() {
    return Math.max(0, this.rateLimitInfo.resetAt - Date.now());
  }
}

// Usage
const rateLimitHandler = new RateLimitHandler();

async function apiRequest(url, options = {}) {
  // Warn when close to the limit before sending the request
  if (rateLimitHandler.isNearLimit(0.1)) {
    console.warn('Approaching rate limit:', rateLimitHandler.rateLimitInfo);
  }

  const response = await fetch(url, options);

  // Update rate limit info from response headers
  rateLimitHandler.updateFromHeaders(response.headers);

  // Handle rate limit exceeded
  if (response.status === 429) {
    const error = await response.json();
    const retryAfter = error.retry_after * 1000; // Convert to milliseconds

    console.warn(`Rate limit exceeded. Retry after ${error.retry_after}s`);

    // Wait and retry
    await sleep(retryAfter);
    return apiRequest(url, options);
  }

  return response;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

Exponential Backoff

Retry with exponential backoff:

async function apiRequestWithBackoff(url, options = {}, maxRetries = 3) {
  let retryCount = 0;

  while (retryCount < maxRetries) {
    try {
      const response = await fetch(url, options);

      if (response.status === 429) {
        retryCount++;

        if (retryCount >= maxRetries) {
          throw new Error('Max retries exceeded');
        }

        // Use the Retry-After header, or fall back to exponential backoff
        const retryAfter = response.headers.get('Retry-After');
        const backoffMs = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.pow(2, retryCount) * 1000; // 2s, 4s, 8s

        console.log(`Rate limited. Retry ${retryCount}/${maxRetries} after ${backoffMs}ms`);

        await sleep(backoffMs);
        continue;
      }

      return response;
    } catch (error) {
      console.error('API request failed:', error);
      throw error;
    }
  }

  throw new Error('Request failed after retries');
}

Throttling Requests

Limit concurrent requests:

class RequestThrottler {
  constructor(maxConcurrent = 10) {
    this.maxConcurrent = maxConcurrent;
    this.activeRequests = 0;
    this.queue = [];
  }

  async throttle(requestFn) {
    // Wait if at max concurrency
    if (this.activeRequests >= this.maxConcurrent) {
      await new Promise(resolve => this.queue.push(resolve));
    }

    this.activeRequests++;

    try {
      return await requestFn();
    } finally {
      this.activeRequests--;

      // Wake the next queued request, if any
      if (this.queue.length > 0) {
        const resolve = this.queue.shift();
        resolve();
      }
    }
  }
}

// Usage
const throttler = new RequestThrottler(10);

async function batchCreateProducts(products) {
  const results = await Promise.all(
    products.map(product =>
      throttler.throttle(() => createProduct(product))
    )
  );

  return results;
}

Rate Limit Monitoring

Client-Side Monitoring

Track rate limit usage:

class RateLimitMonitor {
  constructor() {
    this.history = [];
  }

  record(remaining, limit, timestamp = Date.now()) {
    this.history.push({ remaining, limit, timestamp });

    // Keep the last 100 records
    if (this.history.length > 100) {
      this.history.shift();
    }
  }

  getAverageUsage() {
    if (this.history.length === 0) return 0;

    const totalUsage = this.history.reduce((sum, record) => {
      return sum + (record.limit - record.remaining);
    }, 0);

    return totalUsage / this.history.length;
  }

  isHighUsage(threshold = 0.8) {
    if (this.history.length === 0) return false;

    const latest = this.history[this.history.length - 1];
    const percentUsed = (latest.limit - latest.remaining) / latest.limit;

    return percentUsed >= threshold;
  }

  logStatus() {
    if (this.history.length === 0) {
      console.log('No rate limit data');
      return;
    }

    const latest = this.history[this.history.length - 1];
    const percentUsed = ((latest.limit - latest.remaining) / latest.limit * 100).toFixed(1);

    console.log(`Rate Limit: ${latest.remaining}/${latest.limit} (${percentUsed}% used)`);
  }
}

// Usage
const monitor = new RateLimitMonitor();

async function apiRequest(url, options = {}) {
  const response = await fetch(url, options);

  // Record rate limit info from the response headers
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  monitor.record(remaining, limit);

  // Warn on sustained high usage
  if (monitor.isHighUsage(0.8)) {
    console.warn('High API usage detected');
  }

  return response;
}

Best Practices

✅ DO:

Check rate limit headers:

// ✅ Good - monitor remaining requests
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
if (remaining < 50) {
  console.warn('Low rate limit remaining:', remaining);
}

Implement exponential backoff:

// ✅ Good - retry with backoff
if (response.status === 429) {
  const retryAfter = parseInt(response.headers.get('Retry-After'), 10);
  await sleep(retryAfter * 1000);
  return retry();
}

Throttle batch operations:

// ✅ Good - limit concurrent requests
const throttler = new RequestThrottler(10);
await Promise.all(products.map(p =>
  throttler.throttle(() => createProduct(p))
));

Cache responses:

// ✅ Good - reduce API calls
const cached = cache.get(key);
if (cached) return cached;

const data = await fetchFromAPI();
cache.set(key, data, 60); // Cache 1 minute
return data;
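
The cache object in the snippet above is left undefined; here is a minimal in-memory TTL cache that would satisfy it (a sketch, not a library API):

// Minimal TTL cache backing the cache.get/cache.set calls above (a sketch)
const cache = {
  store: new Map(),
  get(key) {
    const entry = this.store.get(key);
    if (!entry || Date.now() > entry.expiresAt) return null;
    return entry.value;
  },
  set(key, value, ttlSeconds) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
};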

❌ DON'T:

Don't ignore rate limit headers:

// ❌ Bad - no rate limit awareness
await fetch(url); // No header checking

Don't hammer API on 429:

// ❌ Bad - immediate retry
if (response.status === 429) {
  return fetch(url); // No waiting!
}

Don't send unbounded parallel requests:

// ❌ Bad - 1000 concurrent requests
await Promise.all(
  largeArray.map(item => createItem(item))
);

Don't skip caching:

// ❌ Bad - fetching same data repeatedly
for (let i = 0; i < 10; i++) {
  await getProduct(123); // Same product 10 times!
}

Summary

  • ✅ Rate limits prevent abuse and ensure fair usage
  • ✅ Limits per user, tenant, endpoint, and global
  • ✅ Response headers indicate limit status
  • ✅ 429 status code when limit exceeded
  • ✅ Retry-After header indicates wait time
  • ✅ Token bucket algorithm allows bursts
  • ✅ Client should implement exponential backoff
  • ✅ Throttle concurrent requests for batch operations

Key Takeaways:

  1. Monitor rate limit headers on every response
  2. Implement exponential backoff for 429 errors
  3. Throttle batch operations to stay within limits
  4. Cache responses to reduce API calls
  5. Check Retry-After header for wait time
  6. Write operations have lower limits than reads
  7. Be a good API citizen - don't abuse limits

Rate Limit Flow:

API Request
     │
     ▼
Check Rate Limit (Redis)
     │
     ▼
Within Limit?
     ├── Yes → Process Request → Set Headers → Return Response
     └── No  → Return 429 → Client Waits (Retry-After) → Retry Request

Response Headers:

X-RateLimit-Limit: 1000        (total allowed)
X-RateLimit-Remaining: 847     (remaining in window)
X-RateLimit-Reset: 1735574400  (reset timestamp)
Retry-After: 300               (wait 300 seconds)