Rate Limiting

Understand OrbVPN API rate limits, tiers, response headers, and best practices for building integrations that respect rate boundaries and handle throttling gracefully.



Rate Limit Tiers

Rate limits are applied at different scopes depending on the endpoint category and your authentication level. Each tier has an independent counter.

| Tier | Limit | Scope | Description |
| --- | --- | --- | --- |
| Global | 100 req/min | Per IP address | Applies to all unauthenticated requests and as a baseline for all traffic from a single IP. |
| Auth endpoints | 10 req/min | Per IP address | Login, register, password reset, and other authentication endpoints. Stricter to prevent brute-force attacks. |
| Protected endpoints | 300 req/min | Per authenticated user | Standard API endpoints that require a valid Bearer token. Tracked by user ID, not IP. |
| Admin endpoints | 1000 req/min | Per admin user | Administrative and management endpoints. Higher limits for platform operators. |
| OrbGuard Labs | 60 req/min | Per API key | Threat intelligence, scam detection, and forensics endpoints. Tracked by API key. |

Multiple Tiers Can Apply Simultaneously

A single request may count against multiple tiers. For example, an authenticated request to a protected endpoint counts against both the Global tier (per IP) and the Protected tier (per user). Whichever counter is exhausted first triggers throttling.


Rate Limit Headers

Every API response includes three headers that tell you exactly where you stand with your current rate limit window.

| Header | Type | Description |
| --- | --- | --- |
| X-RateLimit-Limit | Integer | The maximum number of requests allowed in the current time window. |
| X-RateLimit-Remaining | Integer | The number of requests you have left in the current window. |
| X-RateLimit-Reset | Unix timestamp | The UTC epoch time (in seconds) when the current rate limit window resets. |

Example Response Headers

```http
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 247
X-RateLimit-Reset: 1707350460
```

Monitor X-RateLimit-Remaining

Track the X-RateLimit-Remaining value in your application. When it drops below 10% of the limit, consider slowing down your request rate proactively rather than waiting to be throttled.
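The threshold check can be scripted directly from a curl header dump. A minimal sketch, assuming headers were saved with `curl -D` (the file path, function name, and demo values are illustrative, not part of the API):

```shell
#!/bin/bash
# Warn when the remaining rate limit budget drops below 10%.
# Expects a header dump as written by `curl -D headers.txt`.

check_rate_budget() {
  local headers_file="$1"
  local limit remaining threshold

  limit=$(grep -i '^X-RateLimit-Limit:' "$headers_file" | tr -d '\r' | awk '{print $2}')
  remaining=$(grep -i '^X-RateLimit-Remaining:' "$headers_file" | tr -d '\r' | awk '{print $2}')

  # 10% of the limit, rounded down to whole requests
  threshold=$((limit / 10))

  if [ "$remaining" -lt "$threshold" ]; then
    echo "WARN: only $remaining/$limit requests left; slow down"
  else
    echo "OK: $remaining/$limit requests left"
  fi
}

# Demo with a sample header dump
cat > /tmp/demo_headers.txt <<'EOF'
HTTP/1.1 200 OK
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 12
X-RateLimit-Reset: 1707350460
EOF

check_rate_budget /tmp/demo_headers.txt
```

In a real integration you would call this after every response and feed the result into whatever pacing mechanism your client uses.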


429 Too Many Requests

When you exceed a rate limit, the API returns a 429 status code with a RATE_LIMITED error. The response includes a Retry-After header indicating how many seconds to wait before sending another request.

429 Rate limit exceeded

```json
{
  "success": false,
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Please wait before retrying.",
    "details": {
      "limit": 100,
      "window": "60s",
      "retryAfter": 23,
      "tier": "global"
    }
  }
}
```

Response Headers on 429

```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 23
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1707350483
```

Respect the Retry-After Header

Always honor the Retry-After value. Clients that continue to send requests after receiving a 429 may be temporarily blocked at the IP level. Persistent abuse can result in a permanent ban.
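Clients without jq can still pull the `retryAfter` value out of the error body shown above. A minimal sed sketch (the sample body mirrors the documented example; jq is more robust when available):

```shell
#!/bin/bash
# Extract details.retryAfter from a RATE_LIMITED error body.
# Uses sed for portability; prefer jq if it is installed.

extract_retry_after() {
  # Matches: "retryAfter": 23  (first occurrence only)
  sed -n 's/.*"retryAfter":[[:space:]]*\([0-9]*\).*/\1/p' | head -n1
}

body='{"success":false,"error":{"code":"RATE_LIMITED","details":{"retryAfter":23,"tier":"global"}}}'
echo "$body" | extract_retry_after   # prints 23
```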


Retry Strategy: Exponential Backoff with Jitter

When you receive a 429 or a transient server error (500, 502, 503), use exponential backoff with random jitter to retry. This prevents all clients from retrying at the exact same instant (the "thundering herd" problem).

1. Receive the error. Detect a 429, 500, 502, or 503 response. Read the Retry-After header if present.
2. Calculate the delay. Use the formula `delay = min(base * 2^attempt, maxDelay) + random_jitter`. Start with a base of 1 second.
3. Wait and retry. Sleep for the calculated delay, then resend the request with the same parameters and headers.
4. Respect maximum retries. Set a maximum retry count (recommended: 5). After exhausting all retries, surface the error to the user or log it for investigation.

Backoff Schedule

```text
Attempt 1:  1s  + jitter(0-500ms)  =  ~1.0s - 1.5s
Attempt 2:  2s  + jitter(0-500ms)  =  ~2.0s - 2.5s
Attempt 3:  4s  + jitter(0-500ms)  =  ~4.0s - 4.5s
Attempt 4:  8s  + jitter(0-500ms)  =  ~8.0s - 8.5s
Attempt 5: 16s  + jitter(0-500ms)  = ~16.0s - 16.5s
```

Use Retry-After When Available

If the response includes a Retry-After header, use that value as your initial delay instead of the exponential formula. The server is telling you exactly how long to wait.
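The delay formula, with the Retry-After preference, can be sketched as a small shell helper (the function name and the 32-second cap are illustrative choices, not API requirements):

```shell
#!/bin/bash
# Compute the retry delay for a given attempt (0-based).
# Prefers the server's Retry-After value when one was provided;
# otherwise uses min(base * 2^attempt, max_delay) plus jitter.

BASE=1        # seconds
MAX_DELAY=32  # cap so delays do not grow unbounded

backoff_delay() {
  local attempt="$1"
  local retry_after="$2"   # empty if no Retry-After header was sent

  if [ -n "$retry_after" ]; then
    echo "$retry_after"
    return
  fi

  local delay=$((BASE * (1 << attempt)))   # 1, 2, 4, 8, 16, ...
  [ "$delay" -gt "$MAX_DELAY" ] && delay=$MAX_DELAY

  # Jitter: 0-499 ms, appended as a fractional part
  local jitter_ms=$((RANDOM % 500))
  echo "$delay.$(printf '%03d' "$jitter_ms")"
}

backoff_delay 3 ""    # prints 8.<jitter>, e.g. 8.237
backoff_delay 0 23    # prints 23 (server-specified)
```

Feed the result straight to `sleep`, which accepts fractional seconds on most systems.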


Rate Limit Handling Code Examples

A full working Bash example showing how to detect and gracefully handle rate limits with curl.

```bash
#!/bin/bash
# Rate-limit-aware API caller with exponential backoff

BASE_URL="https://api.orbai.world"
MAX_RETRIES=5
HEADERS_FILE=$(mktemp)   # avoids clobbering a shared /tmp path

call_api() {
  local method="$1"
  local path="$2"
  local token="$3"

  for attempt in $(seq 0 "$MAX_RETRIES"); do
    response=$(curl -s -w "\n%{http_code}" \
      -X "$method" "${BASE_URL}${path}" \
      -H "Authorization: Bearer $token" \
      -H "Content-Type: application/json" \
      -D "$HEADERS_FILE")

    http_code=$(echo "$response" | tail -n1)
    body=$(echo "$response" | sed '$d')

    if [ "$http_code" -eq 200 ] || [ "$http_code" -eq 201 ]; then
      echo "$body"
      return 0
    fi

    if [ "$http_code" -eq 429 ]; then
      # Read Retry-After; fall back to 1s if the header is missing
      retry_after=$(grep -i "^Retry-After:" "$HEADERS_FILE" \
        | tr -d '\r' | awk '{print $2}')
      retry_after=${retry_after:-1}
      remaining=$(grep -i "^X-RateLimit-Remaining:" "$HEADERS_FILE" \
        | tr -d '\r' | awk '{print $2}')

      echo "Rate limited. Remaining: $remaining. Waiting ${retry_after}s..." >&2
      sleep "$retry_after"
      continue
    fi

    if [ "$http_code" -ge 500 ] && [ "$attempt" -lt "$MAX_RETRIES" ]; then
      delay=$(echo "2^$attempt" | bc)
      jitter=$(echo "scale=2; $RANDOM/32768*0.5" | bc)
      wait_time=$(echo "$delay + $jitter" | bc)
      echo "Server error $http_code. Retrying in ${wait_time}s..." >&2
      sleep "$wait_time"
      continue
    fi

    # Non-retryable error (e.g. 400, 401, 404)
    echo "$body" >&2
    return 1
  done

  echo "Max retries exceeded" >&2
  return 1
}

# Usage
call_api "GET" "/api/v1/users/me" "$ACCESS_TOKEN"
```

Best Practices

Follow these guidelines to stay well within rate limits and build efficient integrations.

Cache Responses

Cache GET responses locally. Data like server lists and user profiles change infrequently. Use ETags and If-None-Match headers to validate cached data without consuming your quota.
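The client side of ETag validation can be sketched in a few lines. This is a hypothetical helper (the cache directory, function names, and resource path are illustrative), showing only how a stored ETag becomes an If-None-Match header:

```shell
#!/bin/bash
# Conditional GET support: remember the ETag from a previous response
# and send If-None-Match on the next request, so an unchanged resource
# comes back as 304 Not Modified with no body to re-download.
# (Cache directory and resource path are illustrative.)

ETAG_DIR=/tmp/orbvpn-etags

etag_file_for() {
  echo "$ETAG_DIR/$(echo "$1" | tr '/' '_').etag"
}

conditional_header() {
  local file
  file=$(etag_file_for "$1")
  # Emit an If-None-Match header only when we have a stored ETag
  [ -f "$file" ] && echo "If-None-Match: $(cat "$file")"
}

# Demo: simulate an ETag saved from an earlier 200 response
mkdir -p "$ETAG_DIR"
printf '%s' '"abc123"' > "$(etag_file_for /api/v1/servers)"
conditional_header /api/v1/servers   # prints: If-None-Match: "abc123"
```

Newer curl releases can also automate this flow with the `--etag-save` and `--etag-compare` options.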

Use Batch Endpoints

Where available, use batch endpoints to fetch or modify multiple resources in a single request instead of making individual calls in a loop.

Use Webhooks Instead of Polling

Subscribe to webhook events rather than polling for changes. Webhooks push data to your server in real-time, eliminating repetitive API calls entirely.

Use WebSocket for Real-Time Data

For live updates like connection status, notifications, and threat alerts, use the WebSocket API. A single persistent connection replaces hundreds of polling requests.

Implement Request Queuing

Queue outgoing API calls and process them at a steady rate below your limit. This prevents bursts that trigger throttling and makes your integration more predictable.
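A minimal queue drain loop can enforce that steady rate. In this sketch the queue file, interval, and `dispatch` stub are all illustrative (a real client would call the API instead of echoing):

```shell
#!/bin/bash
# Process queued API calls at a steady rate instead of bursting.
# Each line of the queue file is a request path; INTERVAL spaces
# the dispatches so throughput stays under the tier limit.

QUEUE=/tmp/orbvpn-queue.txt
INTERVAL=0.25   # 4 req/s = 240 req/min, safely under a 300 req/min limit

dispatch() {
  # Real code would call the API; this sketch just logs the path.
  echo "dispatching GET $1"
}

drain_queue() {
  while IFS= read -r path; do
    dispatch "$path"
    sleep "$INTERVAL"
  done < "$QUEUE"
}

# Demo: enqueue two requests, then drain at the steady rate
printf '%s\n' /api/v1/users/me /api/v1/servers > "$QUEUE"
drain_queue
```

A fixed inter-request interval is the simplest pacing scheme; a token bucket gives the same average rate while still allowing short bursts.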

Monitor Your Usage

Log the X-RateLimit-Remaining header from every response. Set up alerts when remaining drops below 10% so you can proactively adjust your request patterns.


Frequently Asked Questions

What happens if I keep sending requests after a 429?

Clients that repeatedly ignore 429 responses and the Retry-After header may be temporarily blocked at the IP level. Persistent abuse (sustained requests after multiple warnings) can result in a permanent ban of the IP address or API key.

Do rate limits apply to WebSocket connections?

WebSocket connections have their own rate limiting model. Message throughput is limited per connection, but the limits are significantly higher than REST endpoints since there is no per-request overhead. See the WebSocket Guide for details.

Can I request a higher rate limit?

Enterprise customers can request custom rate limits. Contact your account manager or reach out through the support portal to discuss your requirements.

Are rate limits shared across API keys?

No. Each API key (for OrbGuard Labs) and each authenticated user (for OrbNET) has an independent rate limit counter. The Global tier is per IP address and is the only tier shared across all requests from the same origin.

Enterprise Rate Limits

If your integration requires higher throughput than the standard tiers allow, contact the OrbVPN team to discuss enterprise rate limit packages tailored to your use case.


Build Resilient Integrations

Combine rate limit handling with proper error management to build integrations that gracefully handle any API condition. Explore the full Error Reference for detailed error codes.

View Error Reference