Rate Limiting
Overview
Q01 Core APIs implement rate limiting to prevent abuse, ensure fair usage, and maintain system stability. Rate limits restrict the number of API requests a user or tenant can make within a specific time window.
Benefits:
- ✅ Prevents API abuse and DoS attacks
- ✅ Ensures fair resource allocation
- ✅ Maintains system performance
- ✅ Protects backend services from overload
- ✅ Encourages efficient client implementations
Rate Limit Dimensions:
- Per User - Individual user request limits
- Per Tenant - Tenant-wide request limits
- Per Endpoint - Operation-specific limits
- Global - System-wide limits
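For illustration, a limiter can track one counter per dimension and allow a request only when every counter is under its limit. A minimal sketch of such a key scheme (the key names are hypothetical, not the actual CoreService convention):

// Hypothetical key scheme: one Redis-style counter key per dimension.
// A request is allowed only if every dimension's counter is under its limit.
function rateLimitKeys({ userId, tenantId, method, path }, window = 'hour') {
  return [
    `ratelimit:user:${userId}:${window}`,             // per user
    `ratelimit:tenant:${tenantId}:${window}`,         // per tenant
    `ratelimit:endpoint:${method}:${path}:${window}`, // per endpoint
    `ratelimit:global:${window}`                      // system-wide
  ];
}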
Rate Limit Tiers
Standard Limits
Per User Limits:
| Window | Limit | Burst |
|---|---|---|
| Second | 10 requests | 15 requests |
| Minute | 100 requests | 150 requests |
| Hour | 1,000 requests | 1,500 requests |
Per Tenant Limits:
| Window | Limit | Burst |
|---|---|---|
| Second | 100 requests | 150 requests |
| Minute | 1,000 requests | 1,500 requests |
| Hour | 10,000 requests | 15,000 requests |
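Encoded as data, the standard tiers above might look like this (an illustrative constant for client-side pre-checks; the structure is an assumption, not part of the API):

// Illustrative encoding of the standard tiers; values mirror the tables above.
const STANDARD_LIMITS = {
  perUser: {
    second: { limit: 10, burst: 15 },
    minute: { limit: 100, burst: 150 },
    hour: { limit: 1000, burst: 1500 }
  },
  perTenant: {
    second: { limit: 100, burst: 150 },
    minute: { limit: 1000, burst: 1500 },
    hour: { limit: 10000, burst: 15000 }
  }
};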
Operation-Specific Limits
Read Operations (GET):
| Endpoint | Limit | Window |
|---|---|---|
| GET /api/v4/core/{dim} | 200/min | 1 minute |
| GET /api/v4/core/{dim}/{id} | 300/min | 1 minute |
Write Operations (POST/PUT/PATCH/DELETE):
| Endpoint | Limit | Window |
|---|---|---|
| POST /api/v4/core/{dim} | 50/min | 1 minute |
| PUT /api/v4/core/{dim}/{id} | 100/min | 1 minute |
| PATCH /api/v4/core/{dim}/{id} | 100/min | 1 minute |
| DELETE /api/v4/core/{dim}/{id} | 20/min | 1 minute |
Rationale:
- Read operations have higher limits (less resource-intensive)
- Write operations have lower limits (database writes, cascades, outbox)
- Delete operations have the lowest limits (destructive operations); a client-side pacing sketch follows below
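For example, a client batching deletes against the 20/min limit can space requests about 3 seconds apart. A minimal pacing sketch (the helper is hypothetical):

// Minimal pacing sketch: space calls evenly to stay under a per-minute
// limit. At 20/min, each call is assigned a slot 3 seconds after the last.
function paceFor(limitPerMinute) {
  const intervalMs = 60000 / limitPerMinute;
  let nextSlot = Date.now();
  return async function pacedCall(fn) {
    const wait = Math.max(0, nextSlot - Date.now());
    nextSlot = Math.max(Date.now(), nextSlot) + intervalMs;
    if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
    return fn();
  };
}

// Usage (hypothetical endpoint):
// const pacedDelete = paceFor(20);
// await pacedDelete(() => fetch(url, { method: 'DELETE' }));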
Rate Limit Headers
Response Headers
Every API response includes rate limit headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1735574400
X-RateLimit-Window: 3600
Header Descriptions:
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Total requests allowed in window | 1000 |
| X-RateLimit-Remaining | Remaining requests in current window | 847 |
| X-RateLimit-Reset | Unix timestamp when limit resets | 1735574400 |
| X-RateLimit-Window | Window duration in seconds | 3600 (1 hour) |
Example Response
Normal Response (within limits):
GET /api/v4/core/PRD
Authorization: Bearer {token}

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1735574400

{
  "status": "success",
  "data": [...]
}
Rate Limit Exceeded:
GET /api/v4/core/PRD
Authorization: Bearer {token}

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735574400
Retry-After: 300

{
  "error": "RateLimitExceeded",
  "message": "Rate limit exceeded. Retry after 300 seconds.",
  "code": "RATE_LIMIT_EXCEEDED",
  "status": 429,
  "retry_after": 300,
  "limit": 1000,
  "window": 3600
}
Rate Limit Implementation
Server-Side Tracking
CoreService tracks requests using Redis:
// routingController.go (excerpt)
import (
    "fmt"
    "time"

    "github.com/gobuffalo/buffalo"
)

// RateLimitMiddleware enforces per-user limits before handing off to the
// next handler. `redis` here is a thin wrapper around the Redis client
// (INCR/EXPIRE/TTL), and `r` is the app's buffalo render engine.
func RateLimitMiddleware(next buffalo.Handler) buffalo.Handler {
    return func(c buffalo.Context) error {
        // Extract user from JWT claims placed on the context upstream
        userId := c.Value("user_id").(string)

        // Check rate limit
        allowed, remaining, resetAt := checkRateLimit(userId)

        // Set headers on every response, allowed or not
        c.Response().Header().Set("X-RateLimit-Limit", "1000")
        c.Response().Header().Set("X-RateLimit-Remaining", fmt.Sprintf("%d", remaining))
        c.Response().Header().Set("X-RateLimit-Reset", fmt.Sprintf("%d", resetAt))

        if !allowed {
            retryAfter := resetAt - time.Now().Unix()
            c.Response().Header().Set("Retry-After", fmt.Sprintf("%d", retryAfter))
            return c.Render(429, r.JSON(map[string]interface{}{
                "error":       "RateLimitExceeded",
                "message":     fmt.Sprintf("Rate limit exceeded. Retry after %d seconds.", retryAfter),
                "code":        "RATE_LIMIT_EXCEEDED",
                "status":      429,
                "retry_after": retryAfter,
            }))
        }
        return next(c)
    }
}

func checkRateLimit(userId string) (allowed bool, remaining int64, resetAt int64) {
    key := fmt.Sprintf("ratelimit:%s:hour", userId)

    // Increment counter; INCR creates the key at 1 if it does not exist
    count, _ := redis.Incr(key)

    // Set expiration on first request. Note: INCR followed by EXPIRE is
    // not atomic; production code should use a pipeline or Lua script.
    if count == 1 {
        redis.Expire(key, 3600) // 1 hour, in seconds
    }

    // Remaining TTL (in seconds) tells us when the window resets
    ttl, _ := redis.TTL(key)
    resetAt = time.Now().Unix() + ttl

    const limit int64 = 1000
    remaining = limit - count
    if remaining < 0 {
        remaining = 0 // never report a negative remaining count
    }
    allowed = count <= limit
    return allowed, remaining, resetAt
}
Token Bucket Algorithm
Conceptually, rate limiting follows a token bucket algorithm (the Redis counter shown above is a simplified fixed-window approximation of it):
Bucket Capacity: 1000 tokens
Refill Rate: 1000 tokens/hour (or ~0.28 tokens/second)
Request arrives
    ↓
Bucket has tokens? → Yes → Consume 1 token, allow request
    ↓
    No → Return 429 Too Many Requests
Benefits:
- ✅ Allows bursts (temporary spikes)
- ✅ Smooth rate limiting
- ✅ Fair over time
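A minimal in-memory token bucket sketch (illustrative only; the capacity and refill rate mirror the figures above, not the actual server implementation):

// Illustrative token bucket: refill continuously, consume one token per request.
class TokenBucket {
  constructor(capacity = 1000, refillPerSecond = 1000 / 3600) {
    this.capacity = capacity;
    this.tokens = capacity;          // start full, which is what allows bursts
    this.refillPerSecond = refillPerSecond;
    this.lastRefill = Date.now();
  }

  tryConsume(n = 1) {
    // Refill based on elapsed time, capped at capacity
    const elapsedSeconds = (Date.now() - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = Date.now();

    if (this.tokens >= n) {
      this.tokens -= n;
      return true;   // allow request
    }
    return false;    // reject: 429 Too Many Requests
  }
}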
Handling Rate Limits
Client-Side Detection
JavaScript Rate Limit Handler:
class RateLimitHandler {
  constructor() {
    this.rateLimitInfo = {
      limit: 1000,
      remaining: 1000,
      resetAt: Date.now() + 3600000
    };
  }

  updateFromHeaders(headers) {
    // parseInt returns NaN for missing headers; a plain `|| fallback`
    // would also wrongly replace a legitimate 0 remaining.
    const parsed = (name, fallback) => {
      const value = parseInt(headers.get(name), 10);
      return Number.isNaN(value) ? fallback : value;
    };
    this.rateLimitInfo = {
      limit: parsed('X-RateLimit-Limit', 1000),
      remaining: parsed('X-RateLimit-Remaining', 1000),
      resetAt: parsed('X-RateLimit-Reset', Math.floor(Date.now() / 1000) + 3600) * 1000
    };
  }

  isNearLimit(threshold = 0.1) {
    const percentRemaining = this.rateLimitInfo.remaining / this.rateLimitInfo.limit;
    return percentRemaining <= threshold;
  }

  getResetTime() {
    return new Date(this.rateLimitInfo.resetAt);
  }

  getTimeUntilReset() {
    return Math.max(0, this.rateLimitInfo.resetAt - Date.now());
  }
}

// Usage
const rateLimitHandler = new RateLimitHandler();

async function apiRequest(url, options = {}) {
  // Check if near limit
  if (rateLimitHandler.isNearLimit(0.1)) {
    console.warn('Approaching rate limit:', rateLimitHandler.rateLimitInfo);
  }
  const response = await fetch(url, options);
  // Update rate limit info from response headers
  rateLimitHandler.updateFromHeaders(response.headers);
  // Handle rate limit exceeded
  if (response.status === 429) {
    const error = await response.json();
    const retryAfter = error.retry_after * 1000; // Convert to milliseconds
    console.warn(`Rate limit exceeded. Retry after ${error.retry_after}s`);
    // Wait and retry. Note: this retries indefinitely; prefer the
    // bounded backoff helper in the next section for production use.
    await sleep(retryAfter);
    return apiRequest(url, options);
  }
  return response;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
Exponential Backoff
Retry with exponential backoff:
async function apiRequestWithBackoff(url, options = {}, maxRetries = 3) {
  let retryCount = 0;
  while (retryCount < maxRetries) {
    try {
      const response = await fetch(url, options);
      if (response.status === 429) {
        retryCount++;
        if (retryCount >= maxRetries) {
          throw new Error('Max retries exceeded');
        }
        // Get Retry-After header or use exponential backoff
        const retryAfter = response.headers.get('Retry-After');
        const backoffMs = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.pow(2, retryCount) * 1000; // 2s, 4s, ... doubling per retry
        console.log(`Rate limited. Retry ${retryCount}/${maxRetries} after ${backoffMs}ms`);
        await sleep(backoffMs);
        continue;
      }
      return response;
    } catch (error) {
      console.error('API request failed:', error);
      throw error;
    }
  }
  throw new Error('Request failed after retries');
}
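In practice, adding random jitter to the computed delay helps prevent many clients from retrying in lockstep when a shared limit resets. A small variation on the backoff calculation above:

// Full jitter: pick a random delay up to the exponential cap so that
// concurrent clients spread their retries out instead of synchronizing.
function backoffWithJitter(retryCount, baseMs = 1000) {
  const capMs = Math.pow(2, retryCount) * baseMs; // 2s, 4s, 8s, ...
  return Math.random() * capMs;
}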
Throttling Requests
Limit concurrent requests:
class RequestThrottler {
  constructor(maxConcurrent = 10) {
    this.maxConcurrent = maxConcurrent;
    this.activeRequests = 0;
    this.queue = [];
  }

  async throttle(requestFn) {
    // Wait if at max concurrency
    if (this.activeRequests >= this.maxConcurrent) {
      await new Promise(resolve => this.queue.push(resolve));
    }
    this.activeRequests++;
    try {
      return await requestFn();
    } finally {
      this.activeRequests--;
      // Process next queued request
      if (this.queue.length > 0) {
        const resolve = this.queue.shift();
        resolve();
      }
    }
  }
}

// Usage
const throttler = new RequestThrottler(10);

async function batchCreateProducts(products) {
  const results = await Promise.all(
    products.map(product =>
      throttler.throttle(() => createProduct(product))
    )
  );
  return results;
}
Rate Limit Monitoring
Client-Side Monitoring
Track rate limit usage:
class RateLimitMonitor {
  constructor() {
    this.history = [];
  }

  record(remaining, limit, timestamp = Date.now()) {
    this.history.push({ remaining, limit, timestamp });
    // Keep last 100 records
    if (this.history.length > 100) {
      this.history.shift();
    }
  }

  getAverageUsage() {
    if (this.history.length === 0) return 0;
    const totalUsage = this.history.reduce((sum, record) => {
      return sum + (record.limit - record.remaining);
    }, 0);
    return totalUsage / this.history.length;
  }

  isHighUsage(threshold = 0.8) {
    if (this.history.length === 0) return false;
    const latest = this.history[this.history.length - 1];
    const percentUsed = (latest.limit - latest.remaining) / latest.limit;
    return percentUsed >= threshold;
  }

  logStatus() {
    if (this.history.length === 0) {
      console.log('No rate limit data');
      return;
    }
    const latest = this.history[this.history.length - 1];
    const percentUsed = ((latest.limit - latest.remaining) / latest.limit * 100).toFixed(1);
    console.log(`Rate Limit: ${latest.remaining}/${latest.limit} (${percentUsed}% used)`);
  }
}
// Usage
const monitor = new RateLimitMonitor();

async function apiRequest(url, options = {}) {
  const response = await fetch(url, options);
  // Record rate limit info (skip if the headers are absent)
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  if (!Number.isNaN(remaining) && !Number.isNaN(limit)) {
    monitor.record(remaining, limit);
  }
  // Check if high usage
  if (monitor.isHighUsage(0.8)) {
    console.warn('High API usage detected');
  }
  return response;
}
Best Practices
✅ DO:
Check rate limit headers:
// ✅ Good - monitor remaining requests
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
if (remaining < 50) {
  console.warn('Low rate limit remaining:', remaining);
}
Implement exponential backoff:
// ✅ Good - retry with backoff
if (response.status === 429) {
  const retryAfter = response.headers.get('Retry-After');
  await sleep(retryAfter * 1000);
  return retry();
}
Throttle batch operations:
// ✅ Good - limit concurrent requests
const throttler = new RequestThrottler(10);
await Promise.all(products.map(p =>
  throttler.throttle(() => createProduct(p))
));
Cache responses:
// ✅ Good - reduce API calls
const cached = cache.get(key);
if (cached) return cached;
const data = await fetchFromAPI();
cache.set(key, data, 60); // Cache 1 minute
return data;
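The `cache` object above is assumed; a minimal Map-backed TTL cache that provides the same `get`/`set` shape could look like this:

// Minimal TTL cache sketch backing the example above; entries expire
// after ttlSeconds and are evicted lazily on the next read.
const cache = {
  store: new Map(),
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  },
  set(key, value, ttlSeconds) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
};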
❌ DON'T:
Don't ignore rate limit headers:
// ❌ Bad - no rate limit awareness
await fetch(url); // No header checking
Don't hammer API on 429:
// ❌ Bad - immediate retry
if (response.status === 429) {
  return fetch(url); // No waiting!
}
Don't send unbounded parallel requests:
// ❌ Bad - 1000 concurrent requests
await Promise.all(
  largeArray.map(item => createItem(item))
);
Don't skip caching:
// ❌ Bad - fetching same data repeatedly
for (let i = 0; i < 10; i++) {
  await getProduct(123); // Same product 10 times!
}
Summary
- ✅ Rate limits prevent abuse and ensure fair usage
- ✅ Limits per user, tenant, endpoint, and global
- ✅ Response headers indicate limit status
- ✅ 429 status code when limit exceeded
- ✅ Retry-After header indicates wait time
- ✅ Token bucket algorithm allows bursts
- ✅ Client should implement exponential backoff
- ✅ Throttle concurrent requests for batch operations
Key Takeaways:
- Monitor rate limit headers on every response
- Implement exponential backoff for 429 errors
- Throttle batch operations to stay within limits
- Cache responses to reduce API calls
- Check Retry-After header for wait time
- Write operations have lower limits than reads
- Be a good API citizen - don't abuse limits
Rate Limit Flow:
API Request
    ↓
Check Rate Limit (Redis)
    ↓
Within Limit? → Yes → Process Request
    ↓                      ↓
    No                 Set Headers
    ↓                      ↓
Return 429           Return Response
    ↓
Client Waits (Retry-After)
    ↓
Retry Request
Response Headers:
X-RateLimit-Limit: 1000 (total allowed)
X-RateLimit-Remaining: 847 (remaining in window)
X-RateLimit-Reset: 1735574400 (reset timestamp)
Retry-After: 300 (wait 300 seconds)
Related Concepts
- Security Overview - Multi-level security
- Authentication - JWT tokens
- Error Handling - 429 errors
- Batch Operations - Throttling batches