Rate limits & headers
Pitchbar applies per-endpoint rate limits to protect against abuse and to keep cost-bearing endpoints (LLM, vector search) predictable. Every response from a rate-limited surface carries the three X-RateLimit-* headers, and 429 responses add Retry-After, so integrators can pace themselves.
Headers on every response
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window. |
| X-RateLimit-Remaining | Requests still available in the current window. |
| X-RateLimit-Reset | Unix timestamp when the window resets and budget is restored. |
| Retry-After | Only sent on 429: seconds to wait before retrying. |
Pitchbar's API-surface middleware ensures X-RateLimit-Reset ships on every response, not just on 429. That mirrors GitHub / Stripe behaviour.
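As an illustration of reading these headers on the client side, the following minimal sketch parses them into a small record. `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names chosen for this example, not part of any Pitchbar SDK, and the sketch assumes all values arrive as integer strings, as in the table above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitInfo:
    """Hypothetical container for the rate-limit headers described above."""
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset, Unix timestamp in seconds
    retry_after: Optional[int]  # Retry-After, present only on 429 responses


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    retry_after = h.get("retry-after")
    # A missing X-RateLimit-* header raises KeyError; this sketch assumes
    # the response came from a rate-limited surface.
    return RateLimitInfo(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_at=int(h["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

Because X-RateLimit-Reset is sent on every response, `reset_at` can be relied on for successful calls as well; only `retry_after` is optional.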
Named limiters in use
| Limiter | Applies to | Limit | Keyed by |
|---|---|---|---|
| widget-init | POST /api/v1/widget/init | 1000/min/IP+agent · 30000/hr/IP | Soft per-IP; absorbs NAT bursts. |
| widget-session | POST /api/v1/widget/messages*, events, conversation operations | 300/min | Per JWT (visitor session). |
| widget-leads | POST /api/v1/widget/leads | 30/min | Per JWT. |
| wp-plugin | All /v1/wp/* bulk-sync endpoints | 60/min | Per workspace API token id. |
| Inline (throttle:600,1) | POST /api/v1/widget/typing | 600/min/IP | Per IP (visitor typing pings). |
| Inline (throttle:60,1) | POST /api/v1/widget/satisfaction | 60/min/IP | Per IP. |
| Inline (throttle:120,1) | POST /api/v1/widget/coupon/apply | 120/min/IP | Per IP. |
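To make the "Limit" and "Keyed by" columns concrete, here is a minimal sketch of a fixed-window counter with the two-tier widget-init budget layered on top. This is not Pitchbar's implementation: the source does not say which algorithm the middleware uses, so the fixed-window approach is an assumption. `FixedWindowLimiter` and `allow_widget_init` are hypothetical names, and `agent` is treated as an opaque string because the table does not define it.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Illustrative fixed-window counter: at most `limit` hits per key per window."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # Counts per (key, window index). Old windows are never evicted here;
        # a real limiter would expire them.
        self._counts: dict[tuple[str, int], int] = defaultdict(int)

    def hit(self, key: str, now: float) -> bool:
        bucket = (key, int(now // self.window_seconds))
        if self._counts[bucket] >= self.limit:
            return False  # budget for this window is spent
        self._counts[bucket] += 1
        return True


# The two widget-init tiers from the table: 1000/min per IP+agent, 30000/hr per IP.
per_minute = FixedWindowLimiter(limit=1000, window_seconds=60)
per_hour = FixedWindowLimiter(limit=30000, window_seconds=3600)


def allow_widget_init(ip: str, agent: str, now: float) -> bool:
    # The request passes only if both tiers still have budget. The hourly
    # counter is not charged when the per-minute tier already refused.
    return per_minute.hit(f"{ip}|{agent}", now) and per_hour.hit(ip, now)
```

The key passed to `hit` is what the "Keyed by" column describes: callers that share a key share a budget, which is why many visitors behind one NAT address draw on the same per-IP allowance.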
What 429 looks like
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1747438932
Retry-After: 28

{"error":{"code":"rate_limited","message":"Too many widget requests for this conversation. Please slow down."}}
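A client may want to turn a response like the one above into a typed error. The sketch below shows one way to do that, assuming the body always follows the `{"error": {"code", "message"}}` shape shown. `RateLimited` and `raise_if_rate_limited` are hypothetical names, and the one-second fallback for a missing Retry-After is an assumption of this example, not documented behaviour.

```python
import json
from typing import Mapping


class RateLimited(Exception):
    """Hypothetical client-side error carrying the details of a 429 response."""

    def __init__(self, code: str, message: str, retry_after: int) -> None:
        super().__init__(message)
        self.code = code
        self.retry_after = retry_after


def raise_if_rate_limited(status: int, headers: Mapping[str, str], body: str) -> None:
    if status != 429:
        return
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    error = json.loads(body).get("error", {})
    raise RateLimited(
        code=error.get("code", "rate_limited"),
        message=error.get("message", "Too many requests"),
        # Assumption: fall back to 1 second if Retry-After is absent.
        retry_after=int(h.get("retry-after", "1")),
    )
```

Callers can then catch `RateLimited` in one place and read `retry_after` without re-parsing the response.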
Best practices for consumers
- Watch X-RateLimit-Remaining on every response. When it hits a small threshold (e.g. < 5), pause and wait for X-RateLimit-Reset.
- On 429, sleep for the value in Retry-After, then retry. Don't back off exponentially: the window is deterministic.
- If you operate a high-fanout integration, segment your callers so they don't all share an IP; the widget-init per-IP cap can squeeze hard from a single egress.
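The first two practices can be combined into a single pacing helper. The following is a minimal sketch under the assumptions stated in this section (Retry-After is an integer number of seconds, X-RateLimit-Reset is a Unix timestamp). `seconds_to_wait` and its `low_water_mark` parameter are hypothetical names for this example.

```python
from typing import Mapping


def seconds_to_wait(
    status: int,
    headers: Mapping[str, str],
    now: float,
    low_water_mark: int = 5,
) -> float:
    """Return how long the caller should pause before its next request."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    if status == 429:
        # The window is deterministic, so honour Retry-After exactly
        # instead of backing off exponentially.
        return float(h["retry-after"])
    if int(h["x-ratelimit-remaining"]) < low_water_mark:
        # Nearly out of budget: wait until the window resets.
        return max(0.0, float(h["x-ratelimit-reset"]) - now)
    return 0.0
```

A caller would typically pass `time.time()` as `now` and hand the result to `time.sleep` before issuing the next request. Taking `now` as a parameter keeps the function easy to test.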