QNP API Documentation
QNP is an OpenAI-compatible API proxy that gives you access to multiple LLM providers through a single endpoint. Fallback is configurable: you define a provider chain per API key (primary → fallback₁ → fallback₂), and when one provider fails, the next is tried automatically.
- Configurable fallback — set provider order per key (OpenAI → Anthropic → Groq, etc.)
- OpenAI SDK compatible — drop-in replacement, same base URL pattern
- BYOK or platform credits — use your own provider keys or QNP credits
- Supported providers — OpenAI, Anthropic, Google, Groq, DeepSeek, Mistral, Together, Perplexity, Cohere, and more
- Unified usage, billing, and monitoring
Production base URL: https://qnp.ai/api/api/v1
OpenAI-compatible clients (Cline, OpenAI SDK, etc.): https://qnp.ai/api/api/v1. Set the API key to your qnp- key and choose provider type OpenAI Compatible where applicable.
For chat, the model names qnp and qnp/auto resolve to the first model in your API key's fallback chain when you have configured providers; otherwise QNP uses an intent-based default (Sage). Your fallback order still applies to the resolved model.
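The resolution rule above can be sketched in a few lines. This is illustrative only; resolve_model and the example chain contents are not part of QNP's API.

```python
# Illustrative sketch of qnp/auto model resolution (not QNP's actual code).
def resolve_model(requested: str, fallback_chain: list, default: str = "Sage") -> str:
    """Resolve the qnp / qnp/auto aliases to a concrete model name."""
    if requested in ("qnp", "qnp/auto"):
        # First model in the key's configured chain, else the intent-based default.
        return fallback_chain[0] if fallback_chain else default
    return requested

print(resolve_model("qnp/auto", ["openai/gpt-4o", "anthropic/claude-sonnet-4-5"]))  # openai/gpt-4o
print(resolve_model("qnp", []))  # Sage
```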
Quickstart Guide
Create an Account
Sign up at qnp.ai/login using Google, Apple, Passkey, or email magic link. You'll get a free tier with 50 requests per day.
Get Your API Key
Go to Dashboard → API Keys and create a new key. Configure your provider chain (which LLM providers to use and in what order). Copy the key — it starts with qnp- and is only shown once.
Make Your First Request
Use any OpenAI-compatible SDK or cURL. Just change the base URL and API key:
```shell
curl -X POST "https://qnp.ai/api/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qnp/auto","messages":[{"role":"user","content":"Hello!"}]}'
```
Monitor Usage
View request logs, costs, and latency in the Dashboard → Logs page. Export data as CSV for analysis.
Authentication
All API requests require authentication via your QNP API key. You can pass it in two ways:
Authorization: Bearer YOUR_KEY # or X-API-Key: YOUR_KEY
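Either header style is accepted. A minimal sketch of building the two variants; auth_headers is an illustrative helper, not part of any SDK:

```python
# Sketch: build request headers in either of the two supported auth styles.
def auth_headers(api_key: str, use_bearer: bool = True) -> dict:
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    return {"X-API-Key": api_key, "Content-Type": "application/json"}

print(auth_headers("qnp-example")["Authorization"])  # Bearer qnp-example
```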
Key Management
- Keys are created in the Dashboard and start with qnp-
- Each key has its own provider chain (fallback order), model access, and rate limits
- Keys can be scoped with allowed IPs, referrers, and permissions (chat, embeddings, images)
- Rotate keys without downtime — the old key is revoked and a new one is generated
- Revoke keys instantly if compromised — they stop working immediately
- Free tier: up to 5 active keys. Pro tier: unlimited keys
Key Scoping
Each key can be configured with:
- Provider Chain — which providers to route requests to, in priority order
- Allowed IPs — restrict key usage to specific IP addresses
- Allowed Referrers — restrict key usage to specific domains
- Permissions — which endpoints the key can access (chat, embeddings, images)
- Budget — set spending limits per key
- Rate Limits — custom RPM/TPM limits per key
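Taken together, a scoped key might be described by a configuration like the one below. The field names are hypothetical, chosen for illustration; they are not QNP's actual dashboard or API schema.

```python
# Hypothetical scoped-key configuration -- field names are illustrative,
# not QNP's actual schema.
key_config = {
    "provider_chain": ["openai", "anthropic", "groq"],  # priority order
    "allowed_ips": ["203.0.113.7"],
    "allowed_referrers": ["https://example.com"],
    "permissions": ["chat", "embeddings"],  # no "images" access
    "budget_usd": 50.0,
    "rate_limits": {"rpm": 60, "tpm": 100_000},
}

# A request is allowed only if the key grants the endpoint's permission.
def can_access(config: dict, endpoint: str) -> bool:
    return endpoint in config["permissions"]

print(can_access(key_config, "images"))  # False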
Fallback Configuration
Per API key, you define a provider chain: primary provider first, then fallbacks in order. If the primary fails (rate limit, outage, etc.), QNP automatically tries the next in the chain.
Example: OpenAI (primary) → Anthropic (fallback 1) → Groq (fallback 2). Request goes to OpenAI; if it fails, Anthropic is tried; if that fails, Groq.
Configure fallback in the dashboard when creating or editing an API key: add providers, set priority, and choose models per step. No code changes needed.
With no fallback configured, the key uses platform mode: QNP credits and platform providers. Add your own provider keys (BYOK) to control costs.
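The chain behavior can be sketched in a few lines. The provider callables here are stand-ins for real upstream requests, not QNP internals:

```python
# Minimal sketch of chain-style fallback: try each provider in priority
# order and return the first success. Callables stand in for real requests.
def with_fallback(chain, prompt):
    errors = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # rate limit, outage, etc.
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

def flaky(prompt):  # simulated primary outage
    raise TimeoutError("upstream timeout")

def healthy(prompt):
    return f"echo: {prompt}"

provider, reply = with_fallback([("openai", flaky), ("anthropic", healthy)], "Hello!")
print(provider, reply)  # anthropic echo: Hello!
```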
Chat Completions
OpenAI-compatible chat endpoint. Supports messages, model, stream, temperature, max_tokens, tools, tool_choice, and all standard parameters.
Use any model from your fallback chain. The model is resolved from your configured providers. Model format: provider/model-slug or just model-slug for auto-routing.
```shell
curl -X POST "https://qnp.ai/api/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qnp/auto","messages":[{"role":"user","content":"Hello!"}]}'
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/api/v1",
    api_key="YOUR_KEY"
)
response = client.chat.completions.create(
    model="qnp/auto",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/api/v1",
  apiKey: "YOUR_KEY",
});
const response = await client.chat.completions.create({
  model: "qnp/auto", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```
Embeddings
Create vector embeddings for text. OpenAI-compatible format.
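Once you have vectors back, they are typically compared by cosine similarity. A minimal, self-contained sketch (not part of any SDK):

```python
import math

# Sketch: comparing two embedding vectors by cosine similarity,
# the usual way embeddings from this endpoint are consumed.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 3))  # 1.0
```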
```shell
curl -X POST "https://qnp.ai/api/api/v1/embeddings" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"The quick brown fox"}'
```
```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox"
)
print(response.data[0].embedding[:5])
```
Images
Generate images via DALL-E or other supported providers. OpenAI images API format.
Models
List models available for your API key. Returns all models from your fallback chain (BYOK) or platform providers.
```shell
curl "https://qnp.ai/api/api/v1/models" \
  -H "Authorization: Bearer YOUR_KEY"
```
Streaming
QNP supports Server-Sent Events (SSE) streaming for chat completions. Set "stream": true in your request body. The response will be a stream of data: events in the OpenAI format.
Each SSE event contains a JSON chunk with delta content. The stream ends with data: [DONE].
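A client consuming the raw SSE stream can parse those events roughly as below; collect_stream is an illustrative helper, not part of any SDK:

```python
import json

# Sketch: parsing SSE lines from a streamed chat completion. Each event
# line is "data: <json-chunk>"; the stream terminates with "data: [DONE]".
def collect_stream(lines):
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            text.append(delta)
    return "".join(text)

events = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream(events))  # Hello!
```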
```shell
curl -X POST "https://qnp.ai/api/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qnp/auto","stream":true,"messages":[{"role":"user","content":"Hello!"}]}'
```
```python
stream = client.chat.completions.create(
    model="qnp/auto",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Model Catalog
QNP supports 9 providers. Use the model format provider/model-slug for explicit routing or just model-slug to use your fallback chain.
| Provider | Example Models | Format Example |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1, o3-mini | openai/gpt-4o |
| Anthropic | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-3-5 | anthropic/claude-sonnet-4-5 |
| Google | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash | google/gemini-2.0-flash |
| DeepSeek | deepseek-chat, deepseek-r1 | deepseek/deepseek-chat |
| Groq | llama-3.3-70b-versatile, mixtral-8x7b-32768 | groq/llama-3.3-70b-versatile |
| Mistral | mistral-large-latest, mistral-small-latest | mistral/mistral-large-latest |
| Together | meta-llama/Llama-3.3-70B-Instruct-Turbo | together/meta-llama/Llama-3.3-70B-Instruct-Turbo |
| Perplexity | sonar-pro, sonar | perplexity/sonar-pro |
| Cohere | command-r-plus, command-r | cohere/command-r-plus |
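The provider/model-slug convention above can be parsed as in this sketch. parse_model is illustrative, not QNP code; note that some slugs, like Together's, contain a / themselves, so only the first segment is split off:

```python
# Sketch: splitting a model string into (provider, slug). An explicit
# provider prefix routes directly; a bare slug means fallback-chain routing.
KNOWN_PROVIDERS = frozenset(
    {"openai", "anthropic", "google", "deepseek", "groq",
     "mistral", "together", "perplexity", "cohere"})

def parse_model(model: str):
    prefix, sep, rest = model.partition("/")  # split only at the first "/"
    if sep and prefix in KNOWN_PROVIDERS:
        return prefix, rest
    return None, model  # auto-routing via the key's fallback chain

print(parse_model("together/meta-llama/Llama-3.3-70B-Instruct-Turbo"))
print(parse_model("gpt-4o"))  # (None, 'gpt-4o')
```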
Pricing
Pricing follows the upstream provider — whatever the provider charges is what you pay. No markup. QNP charges for the platform subscription only.
Free Tier
- $0/month
- 50 requests per day
- Up to 5 API keys
- All providers available
Pro Tier
- $9.99/month
- Unlimited requests (pay per usage)
- Unlimited API keys
- Advanced analytics & priority support
For live per-model pricing and usage details, visit your Dashboard.
Rate Limits
Rate limits are enforced per API key and depend on your subscription tier:
| Tier | Requests per Day | RPM (default) | API Keys |
|---|---|---|---|
| Free | 50 RPD | 10 RPM | 5 |
| Pro | Unlimited | Unlimited | Unlimited |
Custom per-key rate limits (RPM, TPM) can be configured in the dashboard for Pro tier keys.
When rate limited, the API returns 429 Too Many Requests with a Retry-After header.
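A client can honor that header with a simple retry loop. This sketch uses a stand-in send callable rather than a real HTTP request:

```python
import time

# Sketch: retrying on 429 while honoring the Retry-After header.
# `send` stands in for an HTTP call returning (status, headers, body).
def request_with_retry(send, max_attempts=3, sleep=time.sleep):
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        # Prefer the server's hint; fall back to exponential backoff.
        delay = float(headers.get("Retry-After", 2 ** attempt))
        sleep(delay)
    return status, body

responses = iter([
    (429, {"Retry-After": "0"}, None),
    (200, {}, "ok"),
])
print(request_with_retry(lambda: next(responses)))  # (200, 'ok')
```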
Error Codes
Errors follow OpenAI format: { error: { message, type, code } }
| Code | Meaning | What to Do |
|---|---|---|
| 400 | Bad Request | Check your request body format and required fields |
| 401 | Unauthorized | Check your API key is valid and included in the header |
| 402 | Insufficient Credits | Top up credits in the dashboard or upgrade to Pro |
| 403 | Forbidden | Key lacks permission for this endpoint or IP/referrer blocked |
| 404 | Not Found | Check the endpoint URL and model name |
| 429 | Rate Limited | Wait and retry. Check Retry-After header. Upgrade tier for higher limits |
| 500 | Internal Server Error | Retry the request. If persistent, contact support |
| 502 | Bad Gateway | Upstream provider error. Your fallback chain will auto-retry |
| 503 | Service Unavailable | Provider temporarily unavailable. Fallback routing handles this |
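A rough sketch of reading that error envelope and deciding whether a retry makes sense, with the retryable set taken from the table above (classify_error is illustrative, not part of any SDK):

```python
import json

# Sketch: parse the OpenAI-style error envelope and classify the failure.
# Retryable status codes mirror the error-code table: 429, 500, 502, 503.
RETRYABLE = {429, 500, 502, 503}

def classify_error(status: int, body: str) -> dict:
    message = json.loads(body)["error"]["message"]
    return {"message": message, "retryable": status in RETRYABLE}

body = '{"error":{"message":"Rate limit exceeded","type":"rate_limit_error","code":"rate_limited"}}'
print(classify_error(429, body))  # {'message': 'Rate limit exceeded', 'retryable': True}
```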
SDK Examples
Use any OpenAI-compatible SDK. Change baseURL and apiKey. Your fallback chain is configured in the dashboard — no SDK changes needed.
Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/api/v1",
    api_key="YOUR_KEY"
)
response = client.chat.completions.create(
    model="qnp/auto",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/api/v1",
  apiKey: "YOUR_KEY",
});
const response = await client.chat.completions.create({
  model: "qnp/auto", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```
LangChain
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://qnp.ai/api/api/v1",
    api_key="YOUR_KEY",
    model="qnp/auto"  # routes to your fallback chain
)
response = llm.invoke("Hello!")
print(response.content)
```
Cline Setup
Cline is an AI-powered coding assistant. Use QNP as the provider for configurable fallback and unified API.
Steps
- Open Cline settings (click the gear icon).
- Set API Provider to OpenAI Compatible.
- Base URL: https://qnp.ai/api/api/v1
- API Key: Your QNP key (starts with qnp-)
- Model: Use qnp/auto to auto-resolve to your fallback chain.
Tip: Configure your API key provider chain in Dashboard → API Keys. The qnp/auto model uses the first model in that chain; if it fails, the next is tried automatically.
Troubleshooting
- 400 invalid_request_error: Ensure your key has at least one provider in its fallback chain. Add Anthropic or OpenAI in Dashboard → Keys → Routing.
- Invalid API Key: Confirm the key starts with qnp- and is active.
- Base URL: Must end with /v1 (no trailing slash).
Setup Script
The setup script auto-detects and configures OpenClaw, Claude Code, and Hermes Agent. Use QNP as the model provider for unified fallback and BYOK.
The setup script will interactively prompt you for your API key and base URL.
curl -fsSL https://www.qnp.ai/setup.sh | bash
What the script does
- Detects installed tools (OpenClaw, Claude Code, Hermes Agent) and configures QNP as provider for each
- Sets input: ["text","image","audio","video"] for all models
- Creates a backup before changes