QNP API Documentation
QNP is an OpenAI-compatible API proxy that gives you access to multiple LLM providers through a single endpoint. Fallback is configurable: you define a provider chain per API key (primary → fallback₁ → fallback₂), and when one provider fails, the next is tried automatically.
- Configurable fallback — set provider order per key (OpenAI → Anthropic → Groq, etc.)
- OpenAI SDK compatible — drop-in replacement, same base URL pattern
- BYOK or platform credits — use your own provider keys or QNP credits
- Supported providers — OpenAI, Anthropic, Google, Groq, DeepSeek, Mistral, Together, Perplexity, Cohere, and more
- Unified usage, billing, and monitoring
Production base URL: https://qnp.ai/api/api/v1
OpenAI-compatible clients (Cline, OpenAI SDK, etc.): https://qnp.ai/api/api/v1. Set the API key to your qnp- key and choose provider type OpenAI Compatible where applicable.
For chat, the model names qnp and qnp/auto resolve to the first model in your API key's fallback chain when you have configured providers; otherwise QNP uses an intent-based default (Sage). Your fallback order still applies to the resolved model.
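The resolution rule above can be sketched in a few lines. This is illustrative only; resolve_model and the example chain contents are not part of QNP's API.

```python
# Illustrative sketch of qnp/auto model resolution (not QNP's actual code).
def resolve_model(requested: str, fallback_chain: list, default: str = "Sage") -> str:
    """Resolve the qnp / qnp/auto aliases to a concrete model name."""
    if requested in ("qnp", "qnp/auto"):
        # First model in the key's configured chain, else the intent-based default.
        return fallback_chain[0] if fallback_chain else default
    return requested

print(resolve_model("qnp/auto", ["openai/gpt-4o", "anthropic/claude-sonnet-4-5"]))  # openai/gpt-4o
print(resolve_model("qnp", []))  # Sage
```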
Quickstart Guide
Create an Account
Sign up at qnp.ai/login using Google, Apple, Passkey, or email magic link. You'll get a free tier with 50 requests per day.
Get Your API Key
Go to Dashboard → API Keys and create a new key. Configure your provider chain (which LLM providers to use and in what order). Copy the key — it starts with qnp- and is only shown once.
Make Your First Request
Use any OpenAI-compatible SDK or cURL. Just change the base URL and API key:
```shell
curl -X POST "https://qnp.ai/api/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qnp/auto","messages":[{"role":"user","content":"Hello!"}]}'
```
Monitor Usage
View request logs, costs, and latency in the Dashboard → Logs page. Export data as CSV for analysis.
Authentication
All API requests require authentication via your QNP API key. You can pass it in two ways:
Authorization: Bearer YOUR_KEY # or X-API-Key: YOUR_KEY
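Either header style is accepted. A minimal sketch of building the two variants; auth_headers is an illustrative helper, not part of any SDK:

```python
# Sketch: build request headers in either of the two supported auth styles.
def auth_headers(api_key: str, use_bearer: bool = True) -> dict:
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    return {"X-API-Key": api_key, "Content-Type": "application/json"}

print(auth_headers("qnp-example")["Authorization"])  # Bearer qnp-example
```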
Key Management
- Keys are created in the Dashboard and start with qnp-
- Each key has its own provider chain (fallback order), model access, and rate limits
- Keys can be scoped with allowed IPs, referrers, and permissions (chat, embeddings, images)
- Rotate keys without downtime — the old key is revoked and a new one is generated
- Revoke keys instantly if compromised — they stop working immediately
- Free tier: up to 5 active keys. Pro tier: unlimited keys
Key Scoping
Each key can be configured with:
- Provider Chain — which providers to route requests to, in priority order
- Allowed IPs — restrict key usage to specific IP addresses
- Allowed Referrers — restrict key usage to specific domains
- Permissions — which endpoints the key can access (chat, embeddings, images)
- Budget — set spending limits per key
- Rate Limits — custom RPM/TPM limits per key
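Taken together, a scoped key might be described by a configuration like the one below. The field names are hypothetical, chosen for illustration; they are not QNP's actual dashboard or API schema.

```python
# Hypothetical scoped-key configuration -- field names are illustrative,
# not QNP's actual schema.
key_config = {
    "provider_chain": ["openai", "anthropic", "groq"],  # priority order
    "allowed_ips": ["203.0.113.7"],
    "allowed_referrers": ["https://example.com"],
    "permissions": ["chat", "embeddings"],  # no "images" access
    "budget_usd": 50.0,
    "rate_limits": {"rpm": 60, "tpm": 100_000},
}

# A request is allowed only if the key grants the endpoint's permission.
def can_access(config: dict, endpoint: str) -> bool:
    return endpoint in config["permissions"]

print(can_access(key_config, "images"))  # False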
Fallback Configuration
Per API key, you define a provider chain: primary provider first, then fallbacks in order. If the primary fails (rate limit, outage, etc.), QNP automatically tries the next in the chain.
Example: OpenAI (primary) → Anthropic (fallback 1) → Groq (fallback 2). Request goes to OpenAI; if it fails, Anthropic is tried; if that fails, Groq.
Configure fallback in the dashboard when creating or editing an API key: add providers, set priority, and choose models per step. No code changes needed.
With no fallback configured, the key uses platform mode: QNP credits and platform providers. Add your own provider keys (BYOK) to control costs.
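The chain behavior can be sketched in a few lines. The provider callables here are stand-ins for real upstream requests, not QNP internals:

```python
# Minimal sketch of chain-style fallback: try each provider in priority
# order and return the first success. Callables stand in for real requests.
def with_fallback(chain, prompt):
    errors = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # rate limit, outage, etc.
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

def flaky(prompt):  # simulated primary outage
    raise TimeoutError("upstream timeout")

def healthy(prompt):
    return f"echo: {prompt}"

provider, reply = with_fallback([("openai", flaky), ("anthropic", healthy)], "Hello!")
print(provider, reply)  # anthropic echo: Hello!
```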
Chat Completions
OpenAI-compatible chat endpoint. Supports messages, model, stream, temperature, max_tokens, tools, tool_choice, and all standard parameters.
Use any model from your fallback chain. The model is resolved from your configured providers. Model format: provider/model-slug or just model-slug for auto-routing.
```shell
curl -X POST "https://qnp.ai/api/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qnp/auto","messages":[{"role":"user","content":"Hello!"}]}'
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/api/v1",
    api_key="YOUR_KEY"
)
response = client.chat.completions.create(
    model="qnp/auto",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/api/v1",
  apiKey: "YOUR_KEY",
});
const response = await client.chat.completions.create({
  model: "qnp/auto", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```
Embeddings
Create vector embeddings for text. OpenAI-compatible format.
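Once you have vectors back, they are typically compared by cosine similarity. A minimal, self-contained sketch (not part of any SDK):

```python
import math

# Sketch: comparing two embedding vectors by cosine similarity,
# the usual way embeddings from this endpoint are consumed.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 3))  # 1.0
```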
```shell
curl -X POST "https://qnp.ai/api/api/v1/embeddings" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"The quick brown fox"}'
```
```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox"
)
print(response.data[0].embedding[:5])
```
Images
Generate images via DALL-E or other supported providers. OpenAI images API format.
Models
List models available for your API key. Returns all models from your fallback chain (BYOK) or platform providers.
```shell
curl "https://qnp.ai/api/api/v1/models" \
  -H "Authorization: Bearer YOUR_KEY"
```
Streaming
QNP supports Server-Sent Events (SSE) streaming for chat completions. Set "stream": true in your request body. The response will be a stream of data: events in the OpenAI format.
Each SSE event contains a JSON chunk with delta content. The stream ends with data: [DONE].
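A client consuming the raw SSE stream can parse those events roughly as below; collect_stream is an illustrative helper, not part of any SDK:

```python
import json

# Sketch: parsing SSE lines from a streamed chat completion. Each event
# line is "data: <json-chunk>"; the stream terminates with "data: [DONE]".
def collect_stream(lines):
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            text.append(delta)
    return "".join(text)

events = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream(events))  # Hello!
```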
```shell
curl -X POST "https://qnp.ai/api/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qnp/auto","stream":true,"messages":[{"role":"user","content":"Hello!"}]}'
```
```python
stream = client.chat.completions.create(
    model="qnp/auto",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Model Catalog
QNP supports 9 providers. Use the model format provider/model-slug for explicit routing or just model-slug to use your fallback chain.
| Provider | Example Models | Format Example |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1, o3-mini | openai/gpt-4o |
| Anthropic | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-3-5 | anthropic/claude-sonnet-4-5 |
| Google | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash | google/gemini-2.0-flash |
| DeepSeek | deepseek-chat, deepseek-r1 | deepseek/deepseek-chat |
| Groq | llama-3.3-70b-versatile, mixtral-8x7b-32768 | groq/llama-3.3-70b-versatile |
| Mistral | mistral-large-latest, mistral-small-latest | mistral/mistral-large-latest |
| Together | meta-llama/Llama-3.3-70B-Instruct-Turbo | together/meta-llama/Llama-3.3-70B-Instruct-Turbo |
| Perplexity | sonar-pro, sonar | perplexity/sonar-pro |
| Cohere | command-r-plus, command-r | cohere/command-r-plus |
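The provider/model-slug convention above can be parsed as in this sketch. parse_model is illustrative, not QNP code; note that some slugs, like Together's, contain a / themselves, so only the first segment is split off:

```python
# Sketch: splitting a model string into (provider, slug). An explicit
# provider prefix routes directly; a bare slug means fallback-chain routing.
KNOWN_PROVIDERS = frozenset(
    {"openai", "anthropic", "google", "deepseek", "groq",
     "mistral", "together", "perplexity", "cohere"})

def parse_model(model: str):
    prefix, sep, rest = model.partition("/")  # split only at the first "/"
    if sep and prefix in KNOWN_PROVIDERS:
        return prefix, rest
    return None, model  # auto-routing via the key's fallback chain

print(parse_model("together/meta-llama/Llama-3.3-70B-Instruct-Turbo"))
print(parse_model("gpt-4o"))  # (None, 'gpt-4o')
```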
Pricing
Pricing follows the upstream provider — whatever the provider charges is what you pay. No markup. QNP charges for the platform subscription only.
Free Tier
- $0/month
- 50 requests per day
- Up to 5 API keys
- All providers available
Pro Tier
- $9.99/month
- Unlimited requests (pay per usage)
- Unlimited API keys
- Advanced analytics & priority support
For live per-model pricing and usage details, visit your Dashboard.
Rate Limits
Rate limits are enforced per API key and depend on your subscription tier:
| Tier | Requests per Day | RPM (default) | API Keys |
|---|---|---|---|
| Free | 50 RPD | 10 RPM | 5 |
| Pro | Unlimited | Unlimited | Unlimited |
Custom per-key rate limits (RPM, TPM) can be configured in the dashboard for Pro tier keys.
When rate limited, the API returns 429 Too Many Requests with a Retry-After header.
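A client can honor that header with a simple retry loop. This sketch uses a stand-in send callable rather than a real HTTP request:

```python
import time

# Sketch: retrying on 429 while honoring the Retry-After header.
# `send` stands in for an HTTP call returning (status, headers, body).
def request_with_retry(send, max_attempts=3, sleep=time.sleep):
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        # Prefer the server's hint; fall back to exponential backoff.
        delay = float(headers.get("Retry-After", 2 ** attempt))
        sleep(delay)
    return status, body

responses = iter([
    (429, {"Retry-After": "0"}, None),
    (200, {}, "ok"),
])
print(request_with_retry(lambda: next(responses)))  # (200, 'ok')
```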
Error Codes
Errors follow OpenAI format: { error: { message, type, code } }
| Code | Meaning | What to Do |
|---|---|---|
| 400 | Bad Request | Check your request body format and required fields |
| 401 | Unauthorized | Check your API key is valid and included in the header |
| 402 | Insufficient Credits | Top up credits in the dashboard or upgrade to Pro |
| 403 | Forbidden | Key lacks permission for this endpoint or IP/referrer blocked |
| 404 | Not Found | Check the endpoint URL and model name |
| 429 | Rate Limited | Wait and retry. Check Retry-After header. Upgrade tier for higher limits |
| 500 | Internal Server Error | Retry the request. If persistent, contact support |
| 502 | Bad Gateway | Upstream provider error. Your fallback chain will auto-retry |
| 503 | Service Unavailable | Provider temporarily unavailable. Fallback routing handles this |
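A rough sketch of reading that error envelope and deciding whether a retry makes sense, with the retryable set taken from the table above (classify_error is illustrative, not part of any SDK):

```python
import json

# Sketch: parse the OpenAI-style error envelope and classify the failure.
# Retryable status codes mirror the error-code table: 429, 500, 502, 503.
RETRYABLE = {429, 500, 502, 503}

def classify_error(status: int, body: str) -> dict:
    message = json.loads(body)["error"]["message"]
    return {"message": message, "retryable": status in RETRYABLE}

body = '{"error":{"message":"Rate limit exceeded","type":"rate_limit_error","code":"rate_limited"}}'
print(classify_error(429, body))  # {'message': 'Rate limit exceeded', 'retryable': True}
```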
SDK Examples
Use any OpenAI-compatible SDK. Change baseURL and apiKey. Your fallback chain is configured in the dashboard — no SDK changes needed.
Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/api/v1",
    api_key="YOUR_KEY"
)
response = client.chat.completions.create(
    model="qnp/auto",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/api/v1",
  apiKey: "YOUR_KEY",
});
const response = await client.chat.completions.create({
  model: "qnp/auto", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```
LangChain
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://qnp.ai/api/api/v1",
    api_key="YOUR_KEY",
    model="qnp/auto"  # routes to your fallback chain
)
response = llm.invoke("Hello!")
print(response.content)
```
Cline Setup
Cline is an AI-powered coding assistant. Use QNP as the provider for configurable fallback and unified API.
Steps
- Open Cline settings (click the gear icon).
- Set API Provider to OpenAI Compatible.
- Base URL: https://qnp.ai/api/api/v1
- API Key: Your QNP key (starts with qnp-)
- Model: Use qnp/auto to auto-resolve to your fallback chain.
Tip: Configure your API key provider chain in Dashboard → API Keys. The qnp/auto model uses the first model in that chain; if it fails, the next is tried automatically.
Troubleshooting
- 400 invalid_request_error: Ensure your key has at least one provider in its fallback chain. Add Anthropic or OpenAI in Dashboard → Keys → Routing.
- Invalid API Key: Confirm the key starts with qnp- and is active.
- Base URL: Must end with /v1 (no trailing slash).
Setup Script
The setup script auto-detects and configures OpenClaw, Claude Code, and Hermes Agent. Use QNP as the model provider for unified fallback and BYOK.
The setup script will interactively prompt you for your API key and base URL.
curl -fsSL https://www.qnp.ai/setup.sh | bash
What the script does
- Detects installed tools (OpenClaw, Claude Code, Hermes Agent) and configures QNP as provider for each
- Sets input: ["text","image","audio","video"] for all models
- Creates a backup before changes