Rate limits & concurrency

Two independent throttling mechanisms apply to every request:

Per-minute rate limit (how fast you submit).
Concurrency cap (how many runs can process simultaneously).

Rate limits (per minute)

Enforced on the submit endpoint:

Default limit: 500 requests/minute per client

The rate limit is configured per key and visible in the dashboard.

Requests over the limit return:

429 RATE_LIMIT_EXCEEDED

Back off and retry. Implement exponential backoff — don't hot-loop.
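
As an illustration of the backoff advice above, here is a minimal Python sketch of exponential backoff with jitter. It is not an official client: `submit_with_backoff`, the `send` callable (assumed to perform the submit request and return the HTTP status plus the parsed body), and the delay values are all hypothetical choices.

```python
import random
import time
from typing import Callable, Tuple


def submit_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns something other than HTTP 429.

    send is a hypothetical callable that performs the submit request and
    returns (http_status, parsed_json_body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # The ceiling doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: a random wait below the ceiling spreads out
            # retries from workers that were throttled at the same time.
            sleep(random.uniform(0, ceiling))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting. In production the default `time.sleep` is what you want.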

Need higher throughput? Contact ZenHire support — enterprise plans can raise the per-minute cap.

Concurrency cap

The default is 8 simultaneous processing runs per client (contact support to raise it). The cap is enforced atomically at submit time. When you hit the cap, the submit endpoint still returns 202 Accepted, but with status: "queued":

{
  "id": "req_...",
  "status": "queued",
  "queuePosition": 3,
  "activeRequests": 8,
  "pollIntervalSeconds": 20
}

Queued requests start automatically in FIFO order when a slot frees up. You don't need to retry the submit; just poll using the returned id.
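
One way to handle the 202 body is sketched below in Python. The field names (`id`, `status`, `queuePosition`, `pollIntervalSeconds`) come from the example response above. `SubmitOutcome`, `interpret_submit`, and the 10-second fallback used when `pollIntervalSeconds` is missing are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class SubmitOutcome(NamedTuple):
    request_id: str
    queued: bool
    queue_position: Optional[int]
    first_poll_delay: float


def interpret_submit(body: dict, fallback_interval: float = 10.0) -> SubmitOutcome:
    """Turn a 202 submit body into what the caller needs to start polling."""
    # "queued" is a normal outcome, not an error: the run starts on its own.
    queued = body.get("status") == "queued"
    return SubmitOutcome(
        request_id=body["id"],
        queued=queued,
        queue_position=body.get("queuePosition") if queued else None,
        # Prefer the server's hint; the fallback value is an assumption.
        first_poll_delay=float(body.get("pollIntervalSeconds", fallback_interval)),
    )
```

The caller then waits `first_poll_delay` seconds and polls with `request_id`, without resubmitting.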

Poll-endpoint rate limit

The poll endpoint has its own minimum interval: 10 seconds per requestId for non-terminal statuses. Polling faster returns 429 POLL_RATE_LIMITED with a Retry-After header.
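
A polling loop that honours both the minimum interval and Retry-After might look like the following Python sketch. Several things here are assumptions: `fetch` is a hypothetical callable returning the HTTP status, the response headers, and the parsed body; the terminal status names are not listed on this page, so the caller must supply them; and Retry-After is assumed to carry a number of seconds.

```python
import time
from typing import Callable, Collection, Dict, Tuple

PollResponse = Tuple[int, Dict[str, str], dict]


def retry_after_seconds(headers: Dict[str, str], fallback: float) -> float:
    """Read Retry-After as seconds; use fallback if absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            try:
                return max(0.0, float(value))
            except ValueError:
                return fallback
    return fallback


def poll_until_terminal(
    fetch: Callable[[str], PollResponse],
    request_id: str,
    terminal_statuses: Collection[str],
    min_interval: float = 10.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll one request until its status is in terminal_statuses."""
    for _ in range(max_polls):
        code, headers, body = fetch(request_id)
        if code == 429:
            # Polled too fast: wait as instructed, then try again.
            sleep(retry_after_seconds(headers, min_interval))
            continue
        if code >= 400:
            raise RuntimeError(f"poll failed with HTTP {code}")
        if body.get("status") in terminal_statuses:
            return body
        # Never poll faster than the documented 10-second minimum.
        hinted = float(body.get("pollIntervalSeconds", min_interval))
        sleep(max(min_interval, hinted))
    raise TimeoutError(f"{request_id} not finished after {max_polls} polls")
```

Taking the larger of `pollIntervalSeconds` and the 10-second minimum keeps the loop within the limit even if a response omits the hint.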

Strategy recommendations

Respect pollIntervalSeconds from every response.
Implement exponential backoff on 429 RATE_LIMIT_EXCEEDED.
Don't treat status: queued as an error — it's a normal submit outcome.
Parallelize freely — queued runs cost nothing until they start processing.

See in the API reference

POST /api/v1/speech/analyze — 429 RATE_LIMIT_EXCEEDED response, queuePosition / activeRequests fields on 202 queued response
GET /api/v1/speech/analyze/{id} — 429 POLL_RATE_LIMITED response and Retry-After header
