API Security & Rate Limiting Implementation Workflow

A comprehensive 6-stage workflow for implementing production-grade API security with OAuth 2.1, rate limiting algorithms, webhook validation, quota management, and incident response. Covers OWASP API Top 10 protections with real-world code examples.

By InventiveHQ Development Team

📚 Part of the API Security Complete Guide: OWASP Top 10, Authentication, and Best Practices series.

The modern API economy powers 83% of internet traffic, but this growth comes with unprecedented security challenges. Organizations now expose an average of 15,000+ API endpoints, creating a massive attack surface. API attacks have increased 400% year-over-year, with the average breach costing $4.5 million. This comprehensive workflow provides a systematic approach to implementing production-grade API security that protects your services, satisfies compliance requirements, and enables sustainable business growth.

Introduction

The API Security Challenge in 2025

APIs have become the backbone of digital business, enabling everything from mobile app functionality to complex microservice architectures. However, this central role makes them prime targets for attackers. The consequences of inadequate API security extend far beyond technical failures:

Business Impact:

  • Regulatory penalties from GDPR, PSD2, CCPA, HIPAA violations
  • Loss of customer trust and brand reputation damage
  • Service disruptions affecting revenue and user experience
  • Competitive disadvantage from security incidents

Technical Challenges:

  • Broken authentication and authorization (OWASP API1, API2)
  • Inadequate rate limiting enabling abuse and DDoS attacks
  • Insecure data exposure through over-permissive APIs
  • Lack of visibility into API usage and security events

Why This Workflow Matters

This workflow addresses the complete lifecycle of API security implementation, from initial authentication design through production monitoring and incident response. Unlike fragmented security approaches, this systematic methodology ensures:

Secure by Default: APIs implement security controls from day one, reducing security debt by 60% compared to retroactive hardening.

Compliance Ready: Built-in support for PSD2, SOC 2, HIPAA, and other regulatory frameworks through comprehensive logging, access controls, and audit trails.

Business Enablement: Rate limiting and quota systems support API monetization strategies, turning security controls into revenue generators.

Operational Excellence: Monitoring and incident response playbooks enable rapid threat detection and remediation.

Workflow Overview

This guide presents a 6-stage approach covering all critical aspects of API security:

Stage 1: Authentication & Authorization Architecture (2-4 hours). Implement OAuth 2.1/OIDC with PKCE, design secure JWT tokens, and establish authorization policies including object-level access controls.

Stage 2: Rate Limiting Strategy & Implementation (2-3 hours). Select suitable algorithms (token bucket, sliding window), configure tiered limits, and implement quota management and overage handling.

Stage 3: API Gateway Security & CORS Configuration (1.5-3 hours). Deploy centralized security controls, configure security headers, and establish CORS policies for cross-origin access.

Stage 4: Webhook Security Implementation (1.5-3 hours). Validate HMAC signatures, prevent replay attacks, and implement delivery guarantees and retry logic.

Stage 5: API Monetization & Quota Enforcement (2-3 hours). Design tiered pricing structures, track usage in real time, and handle overages and upgrade flows.

Stage 6: Monitoring, Logging & Incident Response (2-3 hours). Establish comprehensive observability, configure threat detection, and create incident response playbooks.

The total implementation time ranges from 11-19 hours, with the specific duration depending on API complexity and organizational requirements.


Stage 1: Authentication & Authorization Architecture (2-4 hours)

Authentication verifies who users are, while authorization determines what they can access. Modern API security requires both layers implemented according to current best practices.

Step 1.1: OAuth/OIDC Flow Selection & Design (30-60 minutes)

OAuth 2.1 and OpenID Connect (OIDC) provide industry-standard frameworks for API authentication. In January 2025, the IETF published RFC 9700 (Best Current Practice for OAuth 2.0 Security), which formalizes many of the hardening measures that the OAuth 2.1 draft consolidates into a single specification.

Critical Changes in OAuth 2.1:

The Authorization Code flow with PKCE (Proof Key for Code Exchange) is now required for ALL client types, including confidential clients that previously could skip PKCE. The Implicit grant and Resource Owner Password Credentials (ROPC) flows are deprecated and removed from OAuth 2.1 due to security vulnerabilities.

Authorization Code + PKCE Flow:

This flow provides the strongest security guarantees and works across all application types:

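// Helper: Base64URL-encode a byte array (no padding)
function base64URLEncode(bytes) {
  return btoa(String.fromCharCode(...bytes))
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// Step 1: Generate PKCE challenge (client-side)
// Declared async because crypto.subtle.digest returns a Promise
async function generatePKCE() {
  // Create cryptographically random code verifier (32 bytes encode to 43 characters)
  const codeVerifier = base64URLEncode(crypto.getRandomValues(new Uint8Array(32)));

  // Generate code challenge (SHA-256 hash of verifier)
  const encoder = new TextEncoder();
  const data = encoder.encode(codeVerifier);
  const hash = await crypto.subtle.digest('SHA-256', data);
  const codeChallenge = base64URLEncode(new Uint8Array(hash));

  return { codeVerifier, codeChallenge };
}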

// Step 2: Build authorization request
function buildAuthorizationRequest(codeChallenge) {
  const params = new URLSearchParams({
    response_type: 'code',
    client_id: 'your_client_id',
    redirect_uri: 'https://app.example.com/callback',
    scope: 'openid profile email read:orders write:orders',
    state: generateRandomString(32), // CSRF protection
    nonce: generateRandomString(32), // ID token replay protection (OIDC)
    code_challenge: codeChallenge,
    code_challenge_method: 'S256'
  });

  // Redirect user to authorization endpoint
  window.location.href = `https://auth.example.com/authorize?${params}`;
}

// Step 3: Exchange authorization code for tokens (server-side)
async function exchangeCodeForTokens(code, codeVerifier) {
  const response = await fetch('https://auth.example.com/token', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Authorization': `Basic ${btoa(`${clientId}:${clientSecret}`)}`
    },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code: code,
      redirect_uri: 'https://app.example.com/callback',
      code_verifier: codeVerifier // PKCE verification
    })
  });

  const tokens = await response.json();
  // Returns: { access_token, refresh_token, id_token, expires_in, token_type }
  return tokens;
}
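
For context, the check that the authorization server performs on code_verifier can be sketched as follows. This is a minimal illustration using Node's built-in crypto module, not the code of any particular authorization server; challengeFor and verifyPkce are hypothetical helper names.

```javascript
const { createHash, timingSafeEqual } = require('node:crypto');

// S256 transform from RFC 7636: BASE64URL(SHA256(ASCII(code_verifier)))
function challengeFor(codeVerifier) {
  return createHash('sha256').update(codeVerifier, 'ascii').digest('base64url');
}

// Compare the challenge stored during the authorization request with the one
// derived from the verifier presented at the token endpoint (constant time).
function verifyPkce(codeVerifier, storedChallenge) {
  const expected = Buffer.from(challengeFor(codeVerifier));
  const actual = Buffer.from(String(storedChallenge));
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

// Test vector from RFC 7636, Appendix B
const verifier = 'dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk';
console.log(challengeFor(verifier));
// E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM
```

Because only the hash travels through the browser redirect, an attacker who intercepts the authorization code still cannot redeem it without the original verifier.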

Client Credentials Flow (Server-to-Server):

For machine-to-machine authentication where no user is involved:

async function getClientCredentialsToken() {
  const response = await fetch('https://auth.example.com/token', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Authorization': `Basic ${btoa(`${clientId}:${clientSecret}`)}`
    },
    body: new URLSearchParams({
      grant_type: 'client_credentials',
      scope: 'read:data write:data'
    })
  });

  const { access_token, expires_in } = await response.json();
  return access_token;
}
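
The function above requests a new token on every call and discards expires_in. One way to reuse a token until shortly before it expires is sketched below; createTokenCache, the 60-second skew, and the injectable clock are illustrative assumptions rather than part of any OAuth library.

```javascript
// Caches a client-credentials token until shortly before it expires.
// fetchToken is any async function resolving to { access_token, expires_in }.
function createTokenCache(fetchToken, { skewSeconds = 60, now = () => Date.now() } = {}) {
  let cached = null; // { token, expiresAt } once a token has been fetched

  function isFresh() {
    return cached !== null && now() < cached.expiresAt;
  }

  async function getToken() {
    if (isFresh()) return cached.token;
    const { access_token, expires_in } = await fetchToken();
    cached = {
      token: access_token,
      // Refresh skewSeconds early so in-flight requests do not carry an expired token
      expiresAt: now() + Math.max(0, expires_in - skewSeconds) * 1000
    };
    return cached.token;
  }

  return { getToken, isFresh };
}
```

This sketch does not deduplicate concurrent refreshes: two callers that both find the cache stale will each call fetchToken.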

Device Authorization Grant (IoT/CLI):

For devices with limited input capabilities:

// Step 1: Request device code
async function initiateDeviceFlow() {
  const response = await fetch('https://auth.example.com/device/code', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      client_id: 'your_client_id',
      scope: 'openid profile device:control'
    })
  });

  const data = await response.json();
  // {
  //   device_code: "NGU5OWFiNjQ...",
  //   user_code: "WDJB-MJHT",
  //   verification_uri: "https://auth.example.com/device",
  //   expires_in: 1800,
  //   interval: 5
  // }

  console.log(`Visit ${data.verification_uri} and enter code: ${data.user_code}`);
  return data;
}

// Helper: resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Step 2: Poll for authorization
async function pollForToken(deviceCode, interval) {
  while (true) {
    await sleep(interval * 1000);

    const response = await fetch('https://auth.example.com/token', {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({
        grant_type: 'urn:ietf:params:oauth:grant-type:device_code',
        device_code: deviceCode,
        client_id: 'your_client_id'
      })
    });

    if (response.ok) {
      return await response.json();
    }

    if (response.status !== 400) {
      // Unexpected failure: stop instead of polling forever
      throw new Error(`Token endpoint returned ${response.status}`);
    }

    const error = await response.json();
    if (error.error === 'authorization_pending') {
      continue; // Keep polling
    } else if (error.error === 'slow_down') {
      interval += 5; // Increase polling interval by 5 seconds
      continue;
    } else {
      // Terminal errors such as expired_token or access_denied
      throw new Error(error.error);
    }
  }
}

Security Best Practices:

  1. Always use HTTPS for all OAuth endpoints and redirects
  2. Implement exact redirect URI matching (no wildcards or partial matches)
  3. Use state parameter with 32+ random characters for CSRF prevention
  4. Include nonce parameter in OIDC requests to prevent ID token replay
  5. Never send tokens in URL query parameters (use Authorization header)
  6. Store client secrets securely (environment variables, secret managers)

You can design and test these flows using the OAuth/OIDC Debugger to generate PKCE challenges, validate authorization requests, and verify flow compliance with RFC 9700.
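
Practices 2 and 3 in the list above can be sketched with Node's built-in crypto module. The function names are hypothetical, and the registered URI list stands in for whatever client registry your authorization server uses.

```javascript
const { randomBytes, timingSafeEqual } = require('node:crypto');

// Exact string comparison against the registered list: no prefix, wildcard,
// or normalization-based matching.
function isRegisteredRedirect(redirectUri, registeredUris) {
  return registeredUris.includes(redirectUri);
}

// 32 random bytes, Base64URL-encoded (43 characters)
function generateState() {
  return randomBytes(32).toString('base64url');
}

// Constant-time comparison of the state returned on the callback with the
// value stored in the user's session before the redirect.
function stateMatches(returned, stored) {
  if (typeof returned !== 'string' || typeof stored !== 'string') return false;
  const a = Buffer.from(returned);
  const b = Buffer.from(stored);
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Rejecting on a failed state comparison before exchanging the code keeps a forged callback from ever reaching the token endpoint.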

Step 1.2: JWT Token Design & Validation (45-90 minutes)

JSON Web Tokens (JWTs) are the standard format for API access tokens. Proper JWT design and validation prevents numerous security vulnerabilities.

JWT Structure:

A JWT consists of three Base64URL-encoded parts separated by periods:

header.payload.signature

Example Access Token:

// Header
{
  "alg": "RS256",         // Algorithm (RS256 recommended)
  "typ": "at+jwt",        // Token type (RFC 9068)
  "kid": "key-2025-01"    // Key ID for rotation
}

// Payload
{
  // Standard claims
  "iss": "https://auth.example.com",              // Issuer
  "sub": "usr_123456",                             // Subject (user ID)
  "aud": "https://api.example.com",               // Audience (API identifier)
  "exp": 1704824400,                               // Expiration (15 min from now)
  "iat": 1704823500,                               // Issued at
  "nbf": 1704823500,                               // Not before
  "jti": "jti_abc123def",                          // JWT ID (unique identifier)

  // Custom claims
  "scope": "read:orders write:orders",
  "roles": ["customer", "api_consumer"],
  "tenant_id": "tenant_xyz",
  "subscription_tier": "pro"
}

// Signature (computed from header + payload)

Signature Algorithms:

RS256 (Recommended): Asymmetric signature using RSA (RSASSA-PKCS1-v1_5 with SHA-256). The private key signs tokens and stays with the issuer; the public key is distributed for verification. Ideal for distributed systems where multiple services need to validate tokens.

HS256 (Acceptable): Symmetric HMAC with SHA-256 using a shared secret. The same key signs and verifies, so every service that can verify a token can also mint one. Suitable for single-service architectures where key distribution is controlled.

ES256 (Emerging): Asymmetric signature using ECDSA with the P-256 curve and SHA-256. Much smaller keys and signatures than RS256 at comparable security. Growing adoption for performance-sensitive applications.
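
To make the header.payload.signature structure concrete, here is a hand-rolled HS256 sketch using only Node's crypto module. It is for illustration only; production code should rely on a maintained JWT library. signHS256, verifyHS256, and the demo secret are assumptions made for this example.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Base64URL-encode the JSON form of a header or payload object
const b64url = (value) => Buffer.from(JSON.stringify(value)).toString('base64url');

function signHS256(header, payload, secret) {
  const signingInput = `${b64url(header)}.${b64url(payload)}`;
  const signature = createHmac('sha256', secret).update(signingInput).digest('base64url');
  return `${signingInput}.${signature}`;
}

function verifyHS256(token, secret) {
  const parts = token.split('.');
  if (parts.length !== 3) return false;
  const [encodedHeader, encodedPayload, signature] = parts;

  // Pin the algorithm: never let the token choose how it is verified
  let header;
  try {
    header = JSON.parse(Buffer.from(encodedHeader, 'base64url').toString('utf8'));
  } catch {
    return false;
  }
  if (!header || header.alg !== 'HS256') return false;

  // Recompute the HMAC over the first two parts and compare in constant time
  const expected = createHmac('sha256', secret)
    .update(`${encodedHeader}.${encodedPayload}`)
    .digest();
  const actual = Buffer.from(signature, 'base64url');
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

const token = signHS256({ alg: 'HS256', typ: 'JWT' }, { sub: 'usr_123456' }, 'demo-secret');
console.log(token.split('.').length); // 3
console.log(verifyHS256(token, 'demo-secret')); // true
```

The signature covers the encoded header and payload exactly as transmitted, which is why changing any character in either part invalidates the token.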

Production JWT Validation:

import jwt from 'jsonwebtoken';
import { JwksClient } from 'jwks-rsa';

// Create the JWKS client once at module scope so its key cache is shared
// across requests (a client created per request never reuses cached keys)
const jwksClient = new JwksClient({
  jwksUri: 'https://auth.example.com/.well-known/jwks.json',
  cache: true,
  cacheMaxAge: 86400000, // 24 hours
  rateLimit: true
});

// Middleware for JWT validation
async function validateJWT(request) {
  // 1. Extract token from Authorization header
  const authHeader = request.headers.get('Authorization');
  if (!authHeader || !authHeader.startsWith('Bearer ')) {
    throw new UnauthorizedError('Missing or invalid Authorization header');
  }

  const token = authHeader.substring(7);

  // 2. Decode header without verification (to get kid)
  const decodedHeader = jwt.decode(token, { complete: true });
  if (!decodedHeader) {
    throw new UnauthorizedError('Invalid token format');
  }

  // 3. Verify algorithm is expected (prevent algorithm confusion attacks)
  if (decodedHeader.header.alg !== 'RS256') {
    throw new UnauthorizedError('Unsupported algorithm');
  }

  // 4. Fetch public key from JWKS endpoint (served from the cache when possible)
  const kid = decodedHeader.header.kid;
  const key = await jwksClient.getSigningKey(kid);
  const publicKey = key.getPublicKey();

  // 5. Verify signature and claims
  try {
    const payload = jwt.verify(token, publicKey, {
      algorithms: ['RS256'],
      issuer: 'https://auth.example.com',
      audience: 'https://api.example.com',
      clockTolerance: 30 // Allow 30 seconds clock skew
    });

    // 6. Additional validation
    if (!payload.sub) {
      throw new UnauthorizedError('Missing subject claim');
    }

    // 7. Verify required scopes
    const requiredScopes = ['read:orders'];
    const grantedScopes = payload.scope ? payload.scope.split(' ') : [];
    const hasRequiredScopes = requiredScopes.every(scope =>
      grantedScopes.includes(scope)
    );

    if (!hasRequiredScopes) {
      throw new ForbiddenError('Insufficient scopes');
    }

    return payload;

  } catch (error) {
    if (error.name === 'TokenExpiredError') {
      throw new UnauthorizedError('Token expired');
    } else if (error.name === 'JsonWebTokenError') {
      throw new UnauthorizedError('Invalid token signature');
    } else if (error.name === 'NotBeforeError') {
      throw new UnauthorizedError('Token not yet valid');
    }
    throw error;
  }
}

Critical Security Validations:

Always reject tokens with alg: "none" - this allows unsigned tokens and is a common attack vector. Verify the aud claim matches your API identifier to prevent token reuse across different services. Enforce short expiration times (15 minutes for access tokens) to limit damage from token leakage. Never trust the exp claim alone - always verify the signature first.

Token Storage Security:

Web Applications: Store tokens in HttpOnly, Secure, SameSite=Strict cookies to prevent XSS attacks. Never use localStorage or sessionStorage for sensitive tokens.

Single Page Apps: Keep access tokens in memory only (JavaScript variables). Use refresh tokens in HttpOnly cookies with backend-for-frontend (BFF) pattern.

Mobile Apps: Use secure platform storage - iOS Keychain or Android KeyStore. Enable biometric authentication for sensitive operations.

Server-to-Server: Store in environment variables or secret management systems (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault).
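
For the web application case, the cookie attributes named above translate into a Set-Cookie header along these lines. buildTokenCookie is a hypothetical helper, and the cookie name and lifetime are placeholders.

```javascript
// Builds a Set-Cookie header value with HttpOnly, Secure, and SameSite=Strict.
function buildTokenCookie(name, value, maxAgeSeconds) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    `Max-Age=${maxAgeSeconds}`,
    'Path=/',
    'HttpOnly',        // Not readable from JavaScript, which limits XSS token theft
    'Secure',          // Sent over HTTPS only
    'SameSite=Strict'  // Not sent on cross-site requests
  ].join('; ');
}

console.log(buildTokenCookie('refresh_token', 'abc123', 3600));
// refresh_token=abc123; Max-Age=3600; Path=/; HttpOnly; Secure; SameSite=Strict
```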

Use the JWT Decoder to analyze token structure, validate claims, detect security issues, and understand expiration times during development and debugging.

Step 1.3: Authorization Model Implementation (45-90 minutes)

Authentication identifies users, but authorization determines what they can do. Proper authorization prevents OWASP API1 (Broken Object Level Authorization) and API5 (Broken Function Level Authorization).

Scope-Based Authorization:

Scopes provide coarse-grained access control at the endpoint or resource type level:

// Scope naming convention: {action}:{resource}:{qualifier}
const SCOPE_DEFINITIONS = {
  // Read permissions
  'read:orders': 'View all orders',
  'read:orders:own': 'View own orders only',
  'read:customers': 'View customer information',

  // Write permissions
  'write:orders': 'Create and modify orders',
  'write:customers': 'Create and modify customers',

  // Admin permissions
  'admin:orders': 'Full order management including deletions',
  'admin:users': 'User management and role assignment',

  // Sensitive operations
  'refund:orders': 'Issue refunds',
  'delete:data': 'Permanent data deletion'
};

// Middleware to enforce scope requirements.
// Assumes validateJWT has attached the verified claims to req.payload.
function requireScopes(...requiredScopes) {
  return (req, res, next) => {
    const grantedScopes = req.payload?.scope ? req.payload.scope.split(' ') : [];

    const hasAllScopes = requiredScopes.every(scope =>
      grantedScopes.includes(scope)
    );

    if (!hasAllScopes) {
      // Hand the error to the framework's error handler
      return next(new ForbiddenError(
        `Missing required scopes: ${requiredScopes.join(', ')}`
      ));
    }

    next();
  };
}

// Usage in API endpoints
app.get('/api/orders',
  validateJWT,
  requireScopes('read:orders'),
  async (req, res) => {
    // User has valid token with read:orders scope
    const orders = await getOrders();
    res.json(orders);
  }
);

Role-Based Access Control (RBAC):

Roles group related permissions and simplify permission management:

const ROLE_DEFINITIONS = {
  customer: {
    scopes: ['read:orders:own', 'write:orders:own', 'read:profile']
  },
  support: {
    scopes: ['read:orders', 'read:customers', 'write:tickets']
  },
  admin: {
    scopes: ['read:*', 'write:*', 'admin:*']
  },
  api_consumer: {
    scopes: ['read:orders', 'read:customers', 'write:orders']
  }
};

function hasRole(payload, requiredRole) {
  const userRoles = payload.roles || [];
  return userRoles.includes(requiredRole);
}

// Check role membership
if (!hasRole(payload, 'admin')) {
  throw new ForbiddenError('Admin role required');
}
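
The admin role above uses wildcard scopes such as read:*, which a plain includes() check will not match against read:orders. A possible way to expand roles into scopes and honor trailing wildcards is sketched below; ROLES is a trimmed copy of the role table, and scopeMatches, effectiveScopes, and roleAllows are hypothetical helpers.

```javascript
// Trimmed, illustrative copy of the role table
const ROLES = {
  customer: ['read:orders:own', 'write:orders:own', 'read:profile'],
  admin: ['read:*', 'write:*', 'admin:*']
};

// 'read:*' covers 'read:orders' and 'read:orders:own'; otherwise exact match.
function scopeMatches(granted, required) {
  if (granted === required) return true;
  if (granted.endsWith(':*')) {
    // Keep the trailing ':' so 'read:*' does not match 'readonly:x'
    return required.startsWith(granted.slice(0, -1));
  }
  return false;
}

// Union of the scopes granted by every role; unknown roles grant nothing.
function effectiveScopes(roles) {
  return [...new Set(roles.flatMap((role) => ROLES[role] ?? []))];
}

function roleAllows(roles, requiredScope) {
  return effectiveScopes(roles).some((granted) => scopeMatches(granted, requiredScope));
}

console.log(roleAllows(['admin'], 'read:orders'));    // true
console.log(roleAllows(['customer'], 'read:orders')); // false
```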

Object-Level Authorization (Critical for OWASP API1):

Scope and role checks alone are insufficient. You must verify the user has permission to access the SPECIFIC resource:

// ❌ VULNERABLE: No object-level authorization
app.get('/api/orders/:orderId',
  validateJWT,
  requireScopes('read:orders'),
  async (req, res) => {
    // User has read:orders scope, but can they access THIS order?
    const order = await db.orders.findById(req.params.orderId);
    res.json(order); // SECURITY ISSUE: No ownership check
  }
);

// ✅ SECURE: Object-level authorization enforced
app.get('/api/orders/:orderId',
  validateJWT,
  requireScopes('read:orders'),
  async (req, res) => {
    const order = await db.orders.findById(req.params.orderId);

    if (!order) {
      throw new NotFoundError('Order not found');
    }

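    // Verify user owns this order OR has admin scope
    const userId = req.payload.sub;
    // Compare whole scope tokens; a substring match on the raw string would
    // also accept values such as 'admin:orders:readonly'
    const isAdmin = req.payload.scope.split(' ').includes('admin:orders');
    const ownsOrder = order.user_id === userId;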

    if (!ownsOrder && !isAdmin) {
      // Return 404 to prevent information leakage
      throw new NotFoundError('Order not found');
    }

    res.json(order);
  }
);

Attribute-Based Access Control (ABAC):

For complex authorization logic based on user attributes, resource properties, and environmental factors:

// Policy evaluation function
function evaluatePolicy(user, resource, action, context) {
  // Example: Allow invoice access if user's department matches invoice department
  if (action === 'read' && resource.type === 'invoice') {
    return user.department === resource.department;
  }

  // Example: Restrict sensitive operations to office hours
  if (action === 'delete' && resource.sensitivity === 'high') {
    const hour = new Date().getHours();
    const isBusinessHours = hour >= 9 && hour < 17;
    return isBusinessHours && user.roles.includes('admin');
  }

  // Example: Require MFA for financial transactions over $10,000
  if (action === 'approve' && resource.type === 'transaction') {
    // Transactions at or below the threshold do not need MFA
    return resource.amount <= 10000 || context.mfa_verified === true;
  }

  return false;
}

Rich Authorization Requests (FAPI 2.0):

OAuth 2.0 Rich Authorization Requests (RAR) enable fine-grained authorization:

// Traditional scope-based request
const scopeRequest = {
  scope: 'read:orders write:orders'
};

// Rich Authorization Request (RFC 9396)
const rarRequest = {
  authorization_details: [
    {
      type: 'order_access',
      actions: ['read', 'write'],
      locations: ['https://api.example.com/orders'],
      datatypes: ['order', 'shipment'],
      identifier: 'order_123456' // Specific resource
    },
    {
      type: 'payment_initiation',
      actions: ['create'],
      instructedAmount: {
        currency: 'USD',
        amount: '150.00'
      },
      creditorAccount: {
        iban: 'DE1234567890'
      }
    }
  ]
};

Authorization Matrix Example:

Create a clear matrix defining what each role can do:

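| Resource     | Customer     | Support | Admin                | API Consumer |
|--------------|--------------|---------|----------------------|--------------|
| Own Orders   | Read, Create | Read    | Read, Update, Delete | Read, Create |
| All Orders   | -            | Read    | Read, Update, Delete | Read         |
| Own Profile  | Read, Update | -       | Read, Update         | -            |
| All Profiles | -            | Read    | Read, Update, Delete | -            |
| Refunds      | -            | Create  | Create, Approve      | -            |
| Settings     | -            | -       | Read, Update         | -            |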
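
A matrix like this can be encoded as data and checked with a single function, which keeps policy and enforcement in sync. The sketch below is one possible encoding; the resource keys, MATRIX, and can are illustrative names.

```javascript
// Authorization matrix as data: resource -> role -> allowed actions
const MATRIX = {
  own_orders: {
    customer: ['read', 'create'],
    support: ['read'],
    admin: ['read', 'update', 'delete'],
    api_consumer: ['read', 'create']
  },
  all_orders: {
    support: ['read'],
    admin: ['read', 'update', 'delete'],
    api_consumer: ['read']
  },
  own_profile: { customer: ['read', 'update'], admin: ['read', 'update'] },
  all_profiles: { support: ['read'], admin: ['read', 'update', 'delete'] },
  refunds: { support: ['create'], admin: ['create', 'approve'] },
  settings: { admin: ['read', 'update'] }
};

// Deny by default: unknown resources, roles, and actions all return false.
function can(roles, resource, action) {
  if (!Object.prototype.hasOwnProperty.call(MATRIX, resource)) return false;
  const row = MATRIX[resource];
  return roles.some((role) => {
    const actions = row[role];
    return Array.isArray(actions) && actions.includes(action);
  });
}

console.log(can(['support'], 'refunds', 'create'));  // true
console.log(can(['support'], 'refunds', 'approve')); // false
```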

This comprehensive authorization approach prevents the most common API vulnerabilities while supporting complex business requirements.


Stage 2: Rate Limiting Strategy & Implementation (2-3 hours)

Rate limiting protects APIs from abuse, ensures fair resource allocation, and enables monetization through tiered pricing. Without rate limiting, a single client can overwhelm your infrastructure or consume disproportionate resources.

Step 2.1: Rate Limit Algorithm Selection (30-60 minutes)

Different algorithms suit different use cases. Understanding their trade-offs enables optimal selection.

Token Bucket Algorithm (Recommended for Most APIs):

The token bucket allows bursts while enforcing long-term rate limits. Tokens refill at a constant rate, and each request consumes one token. When the bucket is empty, requests are rejected.

class TokenBucketRateLimiter {
  constructor(capacity, refillRate) {
    this.capacity = capacity;        // Maximum tokens in bucket
    this.tokens = capacity;           // Current tokens available
    this.refillRate = refillRate;     // Tokens per second
    this.lastRefill = Date.now();
  }

  async checkLimit(clientId) {
    // Refill tokens based on elapsed time
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000; // seconds
    const tokensToAdd = elapsed * this.refillRate;

    this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
    this.lastRefill = now;

    // Check if request allowed
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return {
        allowed: true,
        remaining: Math.floor(this.tokens),
        resetAt: new Date(now + ((this.capacity - this.tokens) / this.refillRate) * 1000)
      };
    }

    // Calculate retry after time
    const retryAfter = Math.ceil((1 - this.tokens) / this.refillRate);

    return {
      allowed: false,
      remaining: 0,
      resetAt: new Date(now + (this.capacity / this.refillRate) * 1000),
      retryAfter: retryAfter
    };
  }
}

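// Usage: allow a burst of 100 requests, refilling at 10 requests/second.
// Each instance holds a single bucket (checkLimit does not use clientId),
// so keep one limiter instance per client.
const limiter = new TokenBucketRateLimiter(100, 10);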

// Redis-based implementation for distributed systems
async function checkTokenBucket(redis, clientId, capacity, refillRate) {
  const key = `ratelimit:${clientId}`;
  const now = Date.now();

  // Get current state
  const state = await redis.get(key);
  let tokens, lastRefill;

  if (state) {
    const parsed = JSON.parse(state);
    tokens = parsed.tokens;
    lastRefill = parsed.lastRefill;
  } else {
    tokens = capacity;
    lastRefill = now;
  }

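  // Refill tokens
  const elapsed = (now - lastRefill) / 1000;
  tokens = Math.min(capacity, tokens + (elapsed * refillRate));

  if (tokens >= 1) {
    tokens -= 1;

    // Save state. Note: this read-modify-write sequence is not atomic, so
    // concurrent requests can race; use a Lua script or transaction if needed.
    await redis.setex(key, 3600, JSON.stringify({
      tokens: tokens,
      lastRefill: now
    }));

    return {
      allowed: true,
      remaining: Math.floor(tokens),
      resetAt: new Date(now + ((capacity - tokens) / refillRate) * 1000)
    };
  }

  // Seconds until one full token is available again
  const retryAfter = Math.ceil((1 - tokens) / refillRate);

  return {
    allowed: false,
    remaining: 0,
    resetAt: new Date(now + retryAfter * 1000),
    retryAfter: retryAfter
  };
}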

Sliding Window Algorithm (Fairest Distribution):

Tracks requests in a continuous rolling window, preventing burst exploits at window boundaries:

async function checkSlidingWindow(redis, clientId, limit, windowMs) {
  const key = `ratelimit:sliding:${clientId}`;
  const now = Date.now();
  const windowStart = now - windowMs;

  // Redis sorted set: timestamp as score, request ID as value
  const pipeline = redis.pipeline();

  // Remove expired requests
  pipeline.zremrangebyscore(key, '-inf', windowStart);

  // Count requests in current window
  pipeline.zcard(key);

  // Add current request
  const requestId = `${now}-${Math.random()}`;
  pipeline.zadd(key, now, requestId);

  // Set expiration
  pipeline.expire(key, Math.ceil(windowMs / 1000) + 1);

  const results = await pipeline.exec();
  const count = results[1][1]; // Result of zcard

  if (count < limit) {
    return {
      allowed: true,
      remaining: limit - count - 1,
      resetAt: new Date(now + windowMs)
    };
  }

  // Remove the request we just added since it's rejected
  await redis.zrem(key, requestId);

  return {
    allowed: false,
    remaining: 0,
    resetAt: new Date(now + windowMs)
  };
}

// Usage: 100 requests per 60-second sliding window
const result = await checkSlidingWindow(redis, 'client_123', 100, 60000);

Leaky Bucket Algorithm (Smooth Traffic):

Requests are queued and processed at a steady rate, smoothing traffic bursts:

class LeakyBucketRateLimiter {
  constructor(capacity, leakRate) {
    this.capacity = capacity;         // Queue size
    this.leakRate = leakRate;        // Requests per second
    this.queue = [];
    this.processing = false;
  }

  async addRequest(request) {
    if (this.queue.length >= this.capacity) {
      throw new RateLimitError('Queue full - request rejected');
    }

    this.queue.push(request);

    if (!this.processing) {
      this.processQueue();
    }
  }

  async processQueue() {
    this.processing = true;

    while (this.queue.length > 0) {
      const request = this.queue.shift();
      await this.handleRequest(request);

      // Wait before processing next request (leak rate)
      await sleep(1000 / this.leakRate);
    }

    this.processing = false;
  }

  async handleRequest(request) {
    // Process the request
  }
}

Fixed Window Algorithm (Simplest):

Counts requests per fixed time interval. Simple but prone to burst attacks at window boundaries:

async function checkFixedWindow(redis, clientId, limit, windowSeconds) {
  const now = Math.floor(Date.now() / 1000);
  const windowStart = Math.floor(now / windowSeconds) * windowSeconds;
  const key = `ratelimit:fixed:${clientId}:${windowStart}`;

  const count = await redis.incr(key);

  if (count === 1) {
    await redis.expire(key, windowSeconds);
  }

  const resetAt = new Date((windowStart + windowSeconds) * 1000);

  if (count <= limit) {
    return {
      allowed: true,
      remaining: limit - count,
      resetAt: resetAt
    };
  }

  return {
    allowed: false,
    remaining: 0,
    resetAt: resetAt,
    retryAfter: windowSeconds - (now % windowSeconds)
  };
}

// Usage: 100 requests per 60-second window
const result = await checkFixedWindow(redis, 'client_123', 100, 60);
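
The boundary weakness of fixed windows can be seen in a small in-memory simulation. The two limiters below are simplified stand-ins for the Redis versions, written for this comparison only; time is passed in explicitly so the result is deterministic.

```javascript
// Fixed window: one counter per window index.
function createFixedWindow(limit, windowMs) {
  const counts = new Map();
  return (nowMs) => {
    const windowIndex = Math.floor(nowMs / windowMs);
    const count = (counts.get(windowIndex) ?? 0) + 1;
    counts.set(windowIndex, count);
    return count <= limit;
  };
}

// Sliding log: timestamps of accepted requests within the last windowMs.
function createSlidingLog(limit, windowMs) {
  let timestamps = [];
  return (nowMs) => {
    timestamps = timestamps.filter((t) => t > nowMs - windowMs);
    if (timestamps.length >= limit) return false;
    timestamps.push(nowMs);
    return true;
  };
}

// Send 100 requests just before a window boundary and 100 just after it.
function countAccepted(limiter) {
  let accepted = 0;
  for (let i = 0; i < 100; i++) if (limiter(59_900)) accepted++;
  for (let i = 0; i < 100; i++) if (limiter(60_100)) accepted++;
  return accepted;
}

console.log(countAccepted(createFixedWindow(100, 60_000))); // 200
console.log(countAccepted(createSlidingLog(100, 60_000)));  // 100
```

With a nominal limit of 100 per minute, the fixed window lets 200 requests through within 200 milliseconds, while the sliding log holds the line at 100.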

Algorithm Comparison:

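| Algorithm      | Pros                      | Cons                   | Best For                         |
|----------------|---------------------------|------------------------|----------------------------------|
| Token Bucket   | Allows bursts, simple     | Can allow large bursts | REST APIs, file uploads          |
| Sliding Window | Fairest, no burst edges   | Higher memory/CPU      | LLM APIs, financial transactions |
| Leaky Bucket   | Smooth traffic            | Adds latency           | Database writes, email sending   |
| Fixed Window   | Simplest, lowest overhead | Burst vulnerability    | Internal APIs, low-security      |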

Use the Rate Limit Calculator to model different algorithms, calculate optimal limits based on your traffic patterns, and simulate various scenarios.

Step 2.2: Per-Client vs Global Rate Limiting (45-90 minutes)

Effective rate limiting requires both per-client fairness and global backend protection.

Per-Client Rate Limiting Implementation:

// Middleware for per-client rate limiting
async function perClientRateLimit(request, response, next) {
  // Identify client (priority order: API key, authenticated user, IP address)
  const apiKey = request.get('X-API-Key'); // Express accessor for request headers
  const userId = request.payload?.sub;
  const ipAddress = request.ip;

  const clientId = apiKey || userId || ipAddress;

  // Get client's tier and limits
  const tier = await getClientTier(clientId);
  const limits = TIER_LIMITS[tier];

  // Check rate limit
  const result = await checkTokenBucket(
    redis,
    clientId,
    limits.burstCapacity,
    limits.refillRate
  );

  // Add rate limit headers (reset time as Unix epoch seconds)
  response.setHeader('X-RateLimit-Limit', limits.burstCapacity);
  response.setHeader('X-RateLimit-Remaining', result.remaining);
  response.setHeader('X-RateLimit-Reset', Math.floor(result.resetAt.getTime() / 1000));

  if (!result.allowed) {
    response.setHeader('Retry-After', result.retryAfter);
    return response.status(429).json({
      error: 'rate_limit_exceeded',
      message: `Rate limit exceeded: burst capacity ${limits.burstCapacity}, sustained rate ${Math.round(limits.refillRate * 60)} requests per minute`,
      retry_after: result.retryAfter,
      limit: limits.burstCapacity,
      remaining: 0,
      reset_at: result.resetAt.toISOString()
    });
  }

  next();
}

Tiered Rate Limits:

const TIER_LIMITS = {
  free: {
    burstCapacity: 20,           // 20 request burst
    refillRate: 10 / 60,         // 10 per minute
    dailyQuota: 1000,
    endpointLimits: {
      'POST /auth/login': { capacity: 5, rate: 5 / 60 }  // Stricter for auth
    }
  },
  builder: {
    burstCapacity: 100,
    refillRate: 60 / 60,         // 60 per minute
    dailyQuota: 100000,
    endpointLimits: {
      'POST /auth/login': { capacity: 10, rate: 10 / 60 }
    }
  },
  pro: {
    burstCapacity: 1000,
    refillRate: 600 / 60,        // 600 per minute
    dailyQuota: 1000000,
    endpointLimits: {}           // No special restrictions
  },
  enterprise: {
    burstCapacity: 10000,
    refillRate: 6000 / 60,       // 6000 per minute
    dailyQuota: null,            // Unlimited
    endpointLimits: {}
  }
};

Endpoint-Specific Limits:

Different endpoints have different resource costs and security requirements:

const ENDPOINT_LIMITS = {
  // Authentication (brute force protection)
  'POST /auth/login': { capacity: 5, rate: 5 / 60 },
  'POST /auth/forgot-password': { capacity: 3, rate: 3 / 300 },

  // Read operations (generous)
  'GET /users/:id': { capacity: 100, rate: 100 / 60 },
  'GET /orders': { capacity: 100, rate: 100 / 60 },

  // Write operations (moderate)
  'POST /orders': { capacity: 20, rate: 20 / 60 },
  'PUT /users/:id': { capacity: 10, rate: 10 / 60 },

  // Expensive operations (restrictive)
  'POST /search': { capacity: 10, rate: 10 / 60 },
  'POST /reports': { capacity: 5, rate: 5 / 300 },

  // Sensitive operations (very restrictive)
  'DELETE /users/:id': { capacity: 1, rate: 1 / 60 },
  'POST /refunds': { capacity: 5, rate: 5 / 300 }
};

// path is the route template (for example '/users/:id'), not the concrete URL
async function getEndpointLimit(method, path, tier) {
  const endpointKey = `${method} ${path}`;
  const tierLimit = TIER_LIMITS[tier];

  // Precedence: tier-specific endpoint override, then the shared endpoint
  // limit, then the tier default
  return tierLimit.endpointLimits[endpointKey] || ENDPOINT_LIMITS[endpointKey] || {
    capacity: tierLimit.burstCapacity,
    rate: tierLimit.refillRate
  };
}
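
The keys in ENDPOINT_LIMITS are route templates such as GET /users/:id, while incoming requests carry concrete paths such as /users/42. If your framework does not expose the matched template, a lookup along these lines can bridge the gap; compileEndpointKey and findEndpointKey are hypothetical helpers and only handle simple :param segments.

```javascript
// Escape regex metacharacters in a literal path segment
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turns 'GET /users/:id' into a matcher for concrete request paths.
function compileEndpointKey(key) {
  const [method, template] = key.split(' ');
  const pattern = template
    .split('/')
    .map((segment) => (segment.startsWith(':') ? '[^/]+' : escapeRegex(segment)))
    .join('/');
  return { key, method, regex: new RegExp(`^${pattern}$`) };
}

// Returns the first configured key matching the request, or null.
function findEndpointKey(method, path, keys) {
  const match = keys
    .map(compileEndpointKey)
    .find((entry) => entry.method === method && entry.regex.test(path));
  return match ? match.key : null;
}

const keys = ['GET /users/:id', 'DELETE /users/:id', 'GET /orders'];
console.log(findEndpointKey('GET', '/users/42', keys));        // GET /users/:id
console.log(findEndpointKey('GET', '/users/42/orders', keys)); // null
```

In a real service the compiled matchers would be built once at startup rather than on every request.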

Global Rate Limiting (Backend Protection):

// Global rate limiter to protect backend
async function globalRateLimit(request, response, next) {
  const globalLimit = 10000; // 10,000 requests per second across all clients
  const result = await checkTokenBucket(redis, 'global', globalLimit, globalLimit);

  if (!result.allowed) {
    // Backend overloaded - activate circuit breaker
    response.setHeader('Retry-After', 60);
    return response.status(503).json({
      error: 'service_unavailable',
      message: 'Service temporarily overloaded, please retry',
      retry_after: 60
    });
  }

  next();
}

Rate Limit Response Headers:

Standard headers provide clients with limit information:

function setRateLimitHeaders(response, limit, remaining, resetAt, retryAfter = null) {
  response.setHeader('X-RateLimit-Limit', limit.toString());
  response.setHeader('X-RateLimit-Remaining', remaining.toString());
  response.setHeader('X-RateLimit-Reset', Math.floor(resetAt.getTime() / 1000).toString());

  if (retryAfter !== null) {
    response.setHeader('Retry-After', retryAfter.toString());
  }

  // Optional: Add rate limit policy header
  response.setHeader('X-RateLimit-Policy', `${limit} requests per minute`);
}
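The values passed to setRateLimitHeaders have to come from the limiter's state. As a rough sketch, assuming a token bucket that tracks a fractional `tokens` count and a `refillRate` in tokens per second, they could be derived as follows. `rateLimitHeaderValues` is a hypothetical helper, and treating "reset" as the moment the bucket is full again is one possible interpretation rather than a standard.

```javascript
// Hypothetical helper: derive header values from token bucket state.
// tokens: tokens currently in the bucket; refillRate: tokens per second.
function rateLimitHeaderValues(capacity, tokens, refillRate, now = new Date()) {
  const remaining = Math.max(0, Math.floor(tokens));

  // Seconds until the bucket is completely full again
  const secondsToFull = Math.ceil((capacity - tokens) / refillRate);
  const resetAt = new Date(now.getTime() + secondsToFull * 1000);

  // Seconds until at least one token is available (only set when empty)
  const retryAfter = tokens >= 1 ? null : Math.ceil((1 - tokens) / refillRate);

  return { limit: capacity, remaining, resetAt, retryAfter };
}

const now = new Date('2025-01-01T00:00:00Z');
const state = rateLimitHeaderValues(100, 0.5, 100 / 60, now);
console.log(state.remaining);  // 0
console.log(state.retryAfter); // 1
```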

Step 2.3: Quota Management & Overage Handling (45-90 minutes)

Quotas enforce monthly or billing-period limits, distinct from per-minute rate limits.

Monthly Quota Tracking:

async function trackMonthlyUsage(redis, apiKey) {
  const month = new Date().toISOString().slice(0, 7); // "2025-01" (UTC)
  const key = `quota:${apiKey}:${month}`;

  // Increment usage
  const currentUsage = await redis.incr(key);

  // Set expiration on first write (90 days retention)
  if (currentUsage === 1) {
    await redis.expire(key, 90 * 86400);
  }

  // Get quota limit for this client
  // (assumed to be null for unlimited tiers, as in TIER_LIMITS)
  const quota = await getQuotaLimit(apiKey);
  if (quota == null) {
    // Unlimited: skip percentage math, which would divide by zero
    return { currentUsage, quota, percentageUsed: 0 };
  }

  const percentageUsed = (currentUsage / quota) * 100;
  const previousPercentage = ((currentUsage - 1) / quota) * 100;

  // Send each alert once, on the request that crosses the threshold
  if (currentUsage === quota + 1) {
    await sendQuotaAlert(apiKey, '100', currentUsage, quota);
  } else if (percentageUsed >= 90 && previousPercentage < 90) {
    await sendQuotaAlert(apiKey, '90', currentUsage, quota);
  } else if (percentageUsed >= 80 && previousPercentage < 80) {
    await sendQuotaAlert(apiKey, '80', currentUsage, quota);
  }

  return { currentUsage, quota, percentageUsed };
}
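The alerting rule is easier to test when it is pulled out into a pure function. The sketch below is one way to express "fire once when usage crosses a threshold". `crossedThreshold` is a hypothetical helper, and unlike the function above it treats the 100% threshold as crossed when usage reaches the quota rather than when it first exceeds it.

```javascript
// Returns the highest alert threshold (as a percentage) crossed by moving
// from previousUsage to currentUsage, or null if none was crossed.
function crossedThreshold(previousUsage, currentUsage, quota, thresholds = [80, 90, 100]) {
  if (!quota || quota <= 0) return null; // Unlimited or unset quota: no alerts

  const crossed = thresholds.filter((percent) => {
    // First usage count that is at or above this percentage of the quota
    const boundary = Math.ceil((quota * percent) / 100);
    return previousUsage < boundary && currentUsage >= boundary;
  });

  return crossed.length > 0 ? Math.max(...crossed) : null;
}

console.log(crossedThreshold(799, 800, 1000));  // 80
console.log(crossedThreshold(800, 801, 1000));  // null
console.log(crossedThreshold(999, 1000, 1000)); // 100
console.log(crossedThreshold(5, 6, null));      // null
```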

Quota Enforcement Strategies:

async function enforceQuota(apiKey, currentUsage, quota) {
  // A null quota means unlimited (e.g., the enterprise tier)
  if (quota == null || currentUsage <= quota) {
    return { allowed: true };
  }

  // Get overage strategy for client's tier
  const tier = await getClientTier(apiKey);
  const strategy = OVERAGE_STRATEGIES[tier];

  switch (strategy.type) {
    case 'hard_stop':
      return {
        allowed: false,
        status: 402,
        error: 'quota_exceeded',
        message: `Monthly quota of ${quota} requests exhausted`,
        reset_at: getNextBillingCycle(),
        // Never put the API key in a URL: URLs end up in logs and browser history
        upgrade_url: 'https://api.example.com/upgrade'
      };

    case 'soft_quota': {
      // Allow continued use but throttle rate limit
      const throttledLimit = strategy.throttledRate;
      return {
        allowed: true,
        throttled: true,
        throttledLimit: throttledLimit,
        warning: `Quota exceeded, rate limited to ${throttledLimit} requests/minute`
      };
    }

    case 'overage_billing': {
      // Charge per additional request. These are cumulative totals for the
      // period, so recordOverageCharge should overwrite rather than add.
      const overageRequests = currentUsage - quota;
      const overageCharge = overageRequests * strategy.pricePerRequest;

      await recordOverageCharge(apiKey, overageRequests, overageCharge);

      return {
        allowed: true,
        overage: true,
        overageRequests: overageRequests,
        overageCharge: overageCharge,
        warning: `${overageRequests} requests beyond quota ($${overageCharge.toFixed(2)})`
      };
    }

    case 'unlimited':
      return { allowed: true };

    default:
      // Unknown strategy: fail closed
      return { allowed: false };
  }
}

const OVERAGE_STRATEGIES = {
  free: {
    type: 'hard_stop'
  },
  builder: {
    type: 'soft_quota',
    throttledRate: 10 // Reduce to 10 requests/minute
  },
  pro: {
    type: 'overage_billing',
    pricePerRequest: 0.001 // $0.001 per request
  },
  enterprise: {
    type: 'unlimited' // No quotas
  }
};
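Multiplying a request count by a fractional price such as 0.001 can accumulate floating-point error in billing totals. One common alternative, sketched here under the assumption that prices are stored as integer micro-dollars, keeps all arithmetic in integers and converts only for display. `overageCharge` and the micro-dollar unit are illustrative choices, not part of the workflow above.

```javascript
// Assumption: prices are stored as integer micro-dollars (1 USD = 1,000,000)
const MICRO_DOLLARS_PER_USD = 1_000_000;

function overageCharge(currentUsage, quota, priceMicroDollarsPerRequest) {
  const overageRequests = Math.max(0, currentUsage - quota);
  const chargeMicroDollars = overageRequests * priceMicroDollarsPerRequest;

  return {
    overageRequests,
    chargeMicroDollars,
    // Format for display only; keep the integer for billing records
    display: `$${(chargeMicroDollars / MICRO_DOLLARS_PER_USD).toFixed(2)}`
  };
}

// $0.001 per request = 1,000 micro-dollars
console.log(overageCharge(101_500, 100_000, 1000));
// { overageRequests: 1500, chargeMicroDollars: 1500000, display: '$1.50' }
```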

Quota Alert Webhooks:

async function sendQuotaAlert(apiKey, threshold, currentUsage, quota) {
  const customer = await getCustomer(apiKey);

  const webhookPayload = {
    event: threshold === '100' ? 'quota.exhausted' : 'quota.warning',
    api_key: apiKey,
    customer_id: customer.id,
    usage_percentage: parseInt(threshold, 10),
    current_usage: currentUsage,
    quota_limit: quota,
    period_start: getMonthStart().toISOString(),
    period_end: getMonthEnd().toISOString(),
    recommended_action: threshold === '100' ? 'upgrade_required' : 'consider_upgrade'
  };

  // Send webhook to customer's configured endpoint
  if (customer.webhookUrl) {
    await sendWebhook(customer.webhookUrl, webhookPayload, customer.webhookSecret);
  }

  // Send email notification
  await sendEmail(customer.email, 'quota-alert', webhookPayload);
}
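The sendWebhook call above receives the customer's webhook secret, which implies the payload is signed before delivery. A minimal sketch of the sender side, assuming an HMAC-SHA256 signature over `<timestamp>.<body>`, might look like this. `signWebhook`, the `X-Webhook-Timestamp` header name, and the `sha256=` prefix are assumptions for illustration, not a standard.

```javascript
const crypto = require('node:crypto');

// Hypothetical signing scheme: HMAC-SHA256 over "<timestamp>.<body>".
// The header names below are assumptions, not a standard.
function signWebhook(payload, secret, timestamp = Math.floor(Date.now() / 1000)) {
  const body = JSON.stringify(payload);
  const signature = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${body}`)
    .digest('hex');

  return {
    body, // Send these exact bytes; re-serializing would change the signature
    headers: {
      'Content-Type': 'application/json',
      'X-Webhook-Timestamp': String(timestamp),
      'X-Signature-256': `sha256=${signature}`
    }
  };
}

const signed = signWebhook(
  { event: 'quota.warning', usage_percentage: 80 },
  'example-secret',
  1735689600
);
console.log(signed.headers['X-Signature-256'].length); // 71 ("sha256=" + 64 hex chars)
```

Including the timestamp in the signed string lets the receiver reject replayed deliveries without trusting an unsigned header.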

Test quota webhook delivery and validation using the Webhook Tester & Inspector.


Stage 3: API Gateway Security & CORS Configuration (1.5-3 hours)

API gateways centralize security controls, reducing duplicated effort across services and ensuring consistent policy enforcement.

Step 3.1: API Gateway Selection & Configuration (45-90 minutes)

Modern API gateways provide authentication, rate limiting, request validation, and observability as managed services.

Kong Gateway Configuration:

# kong.yml - Declarative configuration
_format_version: "3.0"

services:
  - name: orders-api
    url: http://backend:8080
    routes:
      - name: orders-route
        paths:
          - /api/orders
        methods:
          - GET
          - POST
          - PUT
          - DELETE
    plugins:
      - name: jwt
        config:
          key_claim_name: kid
          secret_is_base64: false
          claims_to_verify:
            - exp
            - nbf
          uri_param_names: []  # Do not accept tokens from the query string
      - name: rate-limiting
        config:
          minute: 60
          hour: 1000
          policy: redis
          redis_host: redis
          redis_port: 6379
      - name: cors
        config:
          origins:
            - https://app.example.com
          methods:
            - GET
            - POST
            - PUT
            - DELETE
          headers:
            - Authorization
            - Content-Type
          exposed_headers:
            - X-RateLimit-Limit
            - X-RateLimit-Remaining
          credentials: true
          max_age: 3600
      - name: request-size-limiting
        config:
          allowed_payload_size: 10 # 10 MB
      - name: response-transformer
        config:
          add:
            headers:
              - "X-API-Version: 2025-01-01"
              - "Strict-Transport-Security: max-age=31536000"

AWS API Gateway with Lambda Authorizer:

// Lambda authorizer for JWT validation
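export async function handler(event) {
  try {
    // Extract the token inside the try block so a missing or malformed
    // header results in 401 rather than an unhandled exception
    const token = (event.authorizationToken || '').replace(/^Bearer\s+/i, '');
    const payload = await validateJWT(token);

    // Generate IAM policy
    return {
      principalId: payload.sub,
      policyDocument: {
        Version: '2012-10-17',
        Statement: [{
          Action: 'execute-api:Invoke',
          Effect: 'Allow',
          Resource: event.methodArn
        }]
      },
      // Context values must be strings, numbers, or booleans
      context: {
        userId: payload.sub,
        scopes: Array.isArray(payload.scope) ? payload.scope.join(' ') : payload.scope,
        tier: payload.subscription_tier
      }
    };
  } catch (error) {
    // API Gateway maps this exact message to a 401 response
    throw new Error('Unauthorized');
  }
}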
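Two practical refinements are worth sketching here. When authorizer result caching is enabled, a policy scoped to a single `methodArn` can be reused for a different method and cause unexpected denials, so some teams widen the resource to the whole stage. Returning an explicit Deny policy also yields a 403 rather than the 401 produced by throwing. The helpers below are hypothetical, and the ARN is a made-up example.

```javascript
// Hypothetical helper that builds the authorizer response for Allow or Deny.
function buildAuthorizerResponse(principalId, effect, resource, context = {}) {
  return {
    principalId,
    policyDocument: {
      Version: '2012-10-17',
      Statement: [{ Action: 'execute-api:Invoke', Effect: effect, Resource: resource }]
    },
    context
  };
}

// Turn a method ARN into a wildcard covering every method and path in the same stage
function stageWildcardArn(methodArn) {
  const [arnPrefix, stage] = methodArn.split('/');
  return `${arnPrefix}/${stage}/*`;
}

const methodArn = 'arn:aws:execute-api:us-east-1:123456789012:abc123/prod/GET/orders';
console.log(stageWildcardArn(methodArn));
// arn:aws:execute-api:us-east-1:123456789012:abc123/prod/*
```

Widening the resource trades per-method precision for cache safety, so per-route authorization then has to happen elsewhere, for example through scopes checked by the backend.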

Gateway Security Patterns:

// Centralized authentication at gateway
async function gatewayAuthenticationMiddleware(request) {
  // 1. Extract and validate JWT
  const payload = await validateJWT(request);

  // 2. Enrich request with user context
  request.headers.set('X-User-ID', payload.sub);
  request.headers.set('X-User-Scopes', payload.scope);
  request.headers.set('X-User-Tier', payload.subscription_tier);

  // 3. Remove sensitive headers before forwarding to backend
  request.headers.delete('Authorization'); // Backend trusts gateway

  return request;
}

// Request validation
async function validateRequest(request, schema) {
  const contentType = request.headers.get('Content-Type');

  // Validate Content-Type
  if (!contentType || !contentType.includes('application/json')) {
    throw new BadRequestError('Content-Type must be application/json');
  }

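  // Validate declared payload size. Content-Length can be absent (chunked
  // bodies) or wrong, so also enforce a byte limit while reading the body.
  const contentLength = parseInt(request.headers.get('Content-Length') || '0', 10);
  if (contentLength > 10 * 1024 * 1024) { // 10 MB
    throw new PayloadTooLargeError('Request payload exceeds 10 MB limit');
  }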

  // Validate against JSON schema
  const body = await request.json();
  const valid = await validateSchema(body, schema);

  if (!valid.valid) {
    throw new BadRequestError(`Validation failed: ${valid.errors.join(', ')}`);
  }

  return body;
}
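A Content-Length check only covers clients that declare their body size honestly. A complementary approach is to count bytes while the body is being read and abort once a cap is exceeded. The sketch below assumes the body is available as an async iterable of byte chunks, as a Node.js request stream is. `readBodyWithLimit` and `fakeStream` are hypothetical names.

```javascript
// Reads an async iterable of byte chunks (e.g., a Node.js request stream)
// and rejects as soon as the running total exceeds maxBytes.
async function readBodyWithLimit(chunks, maxBytes) {
  const received = [];
  let total = 0;

  for await (const chunk of chunks) {
    total += chunk.length;
    if (total > maxBytes) {
      const error = new Error(`Request payload exceeds ${maxBytes} bytes`);
      error.status = 413;
      throw error;
    }
    received.push(chunk);
  }

  return Buffer.concat(received).toString('utf8');
}

// Usage with a simulated stream
async function* fakeStream(parts) {
  for (const part of parts) yield Buffer.from(part);
}

readBodyWithLimit(fakeStream(['{"a":', '1}']), 1024)
  .then((text) => console.log(JSON.parse(text))); // { a: 1 }
```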

Use the HTTP Request Builder to test your gateway configuration, verify authentication, check rate limiting headers, and debug routing issues.

Step 3.2: Security Headers Implementation (30-60 minutes)

HTTP security headers protect against common web vulnerabilities.

Complete Security Headers Configuration:

function addSecurityHeaders(response) {
  // HSTS - Force HTTPS for 1 year
  response.setHeader(
    'Strict-Transport-Security',
    'max-age=31536000; includeSubDomains; preload'
  );

  // Prevent MIME type sniffing
  response.setHeader('X-Content-Type-Options', 'nosniff');

  // Prevent clickjacking
  response.setHeader('X-Frame-Options', 'DENY');

  // Referrer policy
  response.setHeader('Referrer-Policy', 'strict-origin-when-cross-origin');

  // Permissions policy
  response.setHeader(
    'Permissions-Policy',
    'geolocation=(), camera=(), microphone=(), payment=()'
  );

  // Content Security Policy (for API documentation sites)
  if (isDocumentationRoute(response.request.url)) {
    response.setHeader(
      'Content-Security-Policy',
      "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'"
    );
  }

  // API-specific headers
  response.setHeader('X-API-Version', '2025-01-01');
  response.setHeader('X-Request-ID', response.requestId);

  return response;
}

Audit your security headers using the Security Headers Analyzer to get a security grade and implementation recommendations.

Step 3.3: CORS Policy Configuration (45-90 minutes)

Cross-Origin Resource Sharing (CORS) controls which web applications can access your API from browsers.

CORS Implementation:

async function handleCORS(request, response) {
  const origin = request.headers.get('Origin');
  const method = request.method;

  // Allowed origins (never use '*' with credentials)
  const allowedOrigins = [
    'https://app.example.com',
    'https://admin.example.com',
    'https://partner.example.com'
  ];
  const originAllowed = Boolean(origin) && allowedOrigins.includes(origin);

  // The response varies by Origin, so caches must key on it
  response.setHeader('Vary', 'Origin');

  // Check if origin is allowed
  if (originAllowed) {
    response.setHeader('Access-Control-Allow-Origin', origin);
    response.setHeader('Access-Control-Allow-Credentials', 'true');
  }

  // Handle preflight request (OPTIONS)
  if (method === 'OPTIONS') {
    response.setHeader('Access-Control-Allow-Methods', 'GET, POST, PUT, DELETE, PATCH');
    response.setHeader('Access-Control-Allow-Headers', 'Content-Type, Authorization, X-API-Key, X-Request-ID');
    response.setHeader('Access-Control-Max-Age', '86400'); // 24 hours; browsers may cap this lower

    return new Response(null, { status: 204, headers: response.headers });
  }

  // Expose headers for actual requests (this header has no effect on preflight responses)
  if (originAllowed) {
    response.setHeader('Access-Control-Expose-Headers', 'X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-Request-ID');
  }

  return response;
}
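Separating the CORS decision from the framework's response object makes the policy straightforward to unit test. The following sketch computes the headers as a plain object. `corsHeadersFor`, `ALLOWED_ORIGINS`, and `EXPOSED_HEADERS` are hypothetical names, and the origin list is an example.

```javascript
const ALLOWED_ORIGINS = new Set(['https://app.example.com', 'https://admin.example.com']);
const EXPOSED_HEADERS = 'X-RateLimit-Limit, X-RateLimit-Remaining, X-Request-ID';

// Returns the CORS headers to attach for a given Origin and HTTP method.
function corsHeadersFor(origin, method) {
  const headers = { Vary: 'Origin' }; // Caches must key responses on Origin

  if (!origin || !ALLOWED_ORIGINS.has(origin)) {
    return headers; // Unknown origin: no CORS grant, so the browser blocks the read
  }

  headers['Access-Control-Allow-Origin'] = origin;
  headers['Access-Control-Allow-Credentials'] = 'true';

  if (method === 'OPTIONS') {
    headers['Access-Control-Allow-Methods'] = 'GET, POST, PUT, DELETE, PATCH';
    headers['Access-Control-Allow-Headers'] = 'Content-Type, Authorization, X-API-Key, X-Request-ID';
    headers['Access-Control-Max-Age'] = '86400';
  } else {
    headers['Access-Control-Expose-Headers'] = EXPOSED_HEADERS;
  }

  return headers;
}

console.log(corsHeadersFor('https://evil.example', 'GET'));
// { Vary: 'Origin' }
```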

CORS Security Validations:

function validateCORSConfiguration(config) {
  const issues = [];

  // Check for wildcard with credentials
  if (config.allowOrigin === '*' && config.allowCredentials === true) {
    issues.push({
      severity: 'CRITICAL',
      message: 'Wildcard origin (*) cannot be used with credentials',
      fix: 'Specify exact allowed origins'
    });
  }

  // Check for null origin
  if (config.allowOrigin === 'null') {
    issues.push({
      severity: 'HIGH',
      message: 'Allowing null origin is dangerous',
      fix: 'Remove null from allowed origins'
    });
  }

  // Check for origin reflection without validation
  if (config.reflectOrigin && !config.originWhitelist) {
    issues.push({
      severity: 'CRITICAL',
      message: 'Reflecting Origin header without validation allows any origin',
      fix: 'Implement origin whitelist validation'
    });
  }

  // Check for overly permissive methods
  if (config.allowMethods.includes('*')) {
    issues.push({
      severity: 'MEDIUM',
      message: 'Wildcard methods (*) are overly permissive',
      fix: 'Specify exact allowed methods'
    });
  }

  return issues;
}

Test your CORS configuration using the CORS Policy Analyzer to detect misconfigurations and validate policies.


This guide was condensed for readability; deep-dive specifics live in the related guides above.

Frequently Asked Questions


Token Bucket allows bursts up to bucket capacity, then enforces steady rate—ideal for APIs with bursty traffic patterns like batch operations. Sliding Window tracks requests in a rolling time window, providing smoother rate enforcement without allowing large bursts. Token Bucket is simpler to implement but can allow brief traffic spikes. Sliding Window is more precise but requires tracking individual request timestamps. Most production APIs use Token Bucket for simplicity with reasonable burst limits.
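To make the contrast concrete, here is a minimal in-memory sliding window log. It is a sketch for illustration only: `SlidingWindowLimiter` is a hypothetical class, it keeps state in process memory rather than a shared store, and it stores one timestamp per request, which is the memory cost mentioned above.

```javascript
// In-memory sliding window log: at most `limit` requests per `windowMs`.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = new Map(); // clientId -> array of request times (ms)
  }

  allow(clientId, now = Date.now()) {
    const windowStart = now - this.windowMs;
    // Drop timestamps that have slid out of the window
    const recent = (this.timestamps.get(clientId) || []).filter((t) => t > windowStart);

    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);

    this.timestamps.set(clientId, recent);
    return allowed;
  }
}

const limiter = new SlidingWindowLimiter(2, 1000);
console.log(limiter.allow('client-a', 0));    // true
console.log(limiter.allow('client-a', 100));  // true
console.log(limiter.allow('client-a', 200));  // false (2 requests in the last second)
console.log(limiter.allow('client-a', 1050)); // true (the request at t=0 has expired)
```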

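Use a centralized counter store like Redis with atomic increment operations (INCR plus EXPIRE, wrapped in a Lua script or a MULTI/EXEC transaction so the pair is applied atomically). For more precise sliding-window enforcement, store request timestamps in Redis sorted sets. Consider these patterns: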

  1. Single Redis instance for simplicity
  2. Redis Cluster for high availability
  3. Cell-based architecture for geographic distribution.

Always handle Redis failures gracefully—either fail open (allow requests) or fail closed (deny requests) based on your security requirements.
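The fail-open versus fail-closed choice can be isolated in a small wrapper around whatever limiter check is in use. This is a sketch under assumptions: `checkWithFallback` is a hypothetical helper, and `check` stands in for any async function that returns `{ allowed }` and throws when the store is unreachable.

```javascript
// Wraps a limiter check so a store outage resolves to a configured default.
// `check` is any async function returning { allowed: boolean }.
async function checkWithFallback(check, { failOpen }) {
  try {
    return await check();
  } catch (error) {
    // Store unreachable: apply the configured policy and flag the result
    return { allowed: failOpen, degraded: true, reason: error.message };
  }
}

const brokenStore = async () => { throw new Error('ECONNREFUSED'); };

checkWithFallback(brokenStore, { failOpen: true })
  .then((result) => console.log(result.allowed));  // true
checkWithFallback(brokenStore, { failOpen: false })
  .then((result) => console.log(result.allowed));  // false
```

A typical split is to fail closed on authentication endpoints, where brute-force protection matters most, and fail open on read-only endpoints.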

Include these headers in every response: X-RateLimit-Limit (maximum requests allowed), X-RateLimit-Remaining (requests left in current window), X-RateLimit-Reset (Unix timestamp when limit resets). When rate limited, return 429 Too Many Requests with Retry-After header (seconds until retry is safe). Some APIs add X-RateLimit-Policy to describe the limit type. IETF draft-ietf-httpapi-ratelimit-headers proposes standardizing as RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset.

OAuth 2.1 (still an IETF draft, building on the RFC 9700 security best current practice) consolidates OAuth 2.0 with security best practices. Key changes:

  1. PKCE is required for all clients using the authorization code flow (including confidential clients)
  2. Implicit grant removed entirely
  3. Resource Owner Password Credentials (ROPC) removed
  4. Bearer tokens must use Authorization header (not query strings)
  5. Refresh tokens for public clients must be sender-constrained or one-time-use
  6. Stricter redirect URI matching (exact string comparison).

Existing OAuth 2.0 + PKCE implementations are largely compliant.

Use HMAC-SHA256 signatures with a shared secret. The sender computes HMAC(secret, payload) and sends it in a header (e.g., X-Signature-256). Your server recomputes the signature from the raw request body and compares using constant-time comparison to prevent timing attacks. Never use string equality (==). Also validate:

  1. Timestamp to prevent replay attacks (reject if >5 min old)
  2. Expected sender IP ranges if available
  3. Correct Content-Type.

Store secrets securely; rotate periodically.
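The receiving side of those checks can be sketched as follows. It assumes the sender signs `<timestamp>.<rawBody>` with HMAC-SHA256 and sends the hex digest with a `sha256=` prefix. That scheme, the function name `verifyWebhook`, and the parameter layout are assumptions, so adapt them to the sender's documented format.

```javascript
const crypto = require('node:crypto');

const MAX_AGE_SECONDS = 300; // Reject deliveries more than 5 minutes old

// Assumes the sender signs "<timestamp>.<rawBody>" with HMAC-SHA256 and sends
// the hex digest prefixed with "sha256=". Adjust to the sender's actual scheme.
function verifyWebhook(rawBody, signatureHeader, timestampHeader, secret,
  nowSeconds = Math.floor(Date.now() / 1000)) {
  const timestamp = Number(timestampHeader);
  if (!Number.isInteger(timestamp) || Math.abs(nowSeconds - timestamp) > MAX_AGE_SECONDS) {
    return false; // Missing, malformed, or stale timestamp
  }

  const expected = 'sha256=' + crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${rawBody}`)
    .digest('hex');

  const expectedBuf = Buffer.from(expected);
  const receivedBuf = Buffer.from(String(signatureHeader || ''));

  // timingSafeEqual throws on length mismatch, so compare lengths first
  return expectedBuf.length === receivedBuf.length &&
    crypto.timingSafeEqual(expectedBuf, receivedBuf);
}
```

The raw body must be the exact bytes received. Parsing and re-serializing the JSON before verification usually changes the bytes and breaks the signature.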

Implement tiered rate limiting:

  1. Anonymous/free tier with strict limits (e.g., 100 req/hour)
  2. Authenticated user tier with moderate limits (e.g., 1000 req/hour)
  3. Premium/enterprise tier with high limits or custom quotas.

Add per-endpoint limits for expensive operations. Use adaptive rate limiting that increases limits for users with good history. Implement quota systems for monthly/daily allowances separate from per-minute rate limits.

Rate limiting enforces hard caps—requests exceeding the limit are rejected with 429. Throttling slows down or queues excess requests instead of rejecting them. Rate limiting protects your API from abuse; throttling provides graceful degradation for legitimate users during traffic spikes. Many systems combine both: throttle requests up to a threshold, then rate-limit beyond that. Choose based on user experience needs—rejecting is simpler but queuing may be better for background jobs.

Never embed API keys in client-side code or mobile apps—use OAuth tokens instead. For server-side keys:

  1. Store in environment variables or secrets managers (not code)
  2. Use separate keys per environment (dev/staging/prod)
  3. Implement key rotation without downtime (support multiple active keys)
  4. Hash keys in your database (store only hash, compare on auth)
  5. Log key usage for audit trails
  6. Set expiration dates
  7. Scope keys to minimum required permissions.
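Point 4 in the list above can be sketched in a few lines. Because generated API keys are long random strings, a fast hash such as SHA-256 is generally considered sufficient, whereas slow password hashes target low-entropy secrets. The function names and the `sk_` prefix are illustrative assumptions.

```javascript
const crypto = require('node:crypto');

// Generate a key to show the caller once, plus the hash to persist.
// The "sk_" prefix is an illustrative convention, not a requirement.
function generateApiKey() {
  const apiKey = `sk_${crypto.randomBytes(32).toString('hex')}`;
  return { apiKey, keyHash: hashApiKey(apiKey) };
}

function hashApiKey(apiKey) {
  return crypto.createHash('sha256').update(apiKey).digest('hex');
}

// Compare the hash of the presented key with the stored hash in constant time
function verifyApiKey(presentedKey, storedHash) {
  const presentedHash = Buffer.from(hashApiKey(presentedKey), 'hex');
  const expectedHash = Buffer.from(storedHash, 'hex');
  return presentedHash.length === expectedHash.length &&
    crypto.timingSafeEqual(presentedHash, expectedHash);
}

const { apiKey, keyHash } = generateApiKey();
console.log(verifyApiKey(apiKey, keyHash));         // true
console.log(verifyApiKey('sk_wrong-key', keyHash)); // false
```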

Rate limiting directly addresses OWASP API4:2023 (Unrestricted Resource Consumption) by preventing denial-of-service and resource exhaustion. It also mitigates API2 (Broken Authentication) by limiting brute-force attacks, API6 (Unrestricted Access to Sensitive Business Flows) by preventing automated abuse of business logic, and indirectly helps with API1 (BOLA) by limiting enumeration attacks. Rate limiting should be one layer in a defense-in-depth strategy, not the only protection.

Design separate rate limit pools for critical vs. normal operations. Health checks, auth token refresh, and payment confirmations may need guaranteed availability even when other limits are exceeded. Implement priority queuing where critical requests bypass normal limits. Use circuit breakers to shed non-critical load during overload. Consider separate infrastructure for truly critical paths. Document which operations have guaranteed availability and test failover scenarios regularly.
