Rate limits & concurrency

Two independent throttling mechanisms apply to every request:

  1. Per-minute rate limit (how fast you submit).
  2. Concurrency cap (how many runs can process simultaneously).

Rate limits (per minute)

Enforced on the submit endpoint:

Default limit: 500 requests/minute per client

The rate limit is configured per key and visible in the dashboard.
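One way to stay near a per-minute limit is to space submissions evenly on the client side; at 500 requests/minute that works out to one request every 0.12 seconds. The sketch below is illustrative only: `Pacer` is a hypothetical helper, not part of any ZenHire SDK, and how the server counts its one-minute window is not specified here, so even spacing reduces 429s but may not eliminate them.

```python
import time
from typing import Callable


class Pacer:
    """Hypothetical client-side pacer that spaces calls evenly over a minute."""

    def __init__(
        self,
        per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if per_minute < 1:
            raise ValueError("per_minute must be at least 1")
        self.interval = 60.0 / per_minute  # seconds between submissions
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()  # the first call may go out immediately

    def wait(self) -> None:
        """Block until the next submission slot, then reserve the one after it."""
        now = self._clock()
        if now < self._next_slot:
            self._sleep(self._next_slot - now)
            now = self._next_slot
        self._next_slot = now + self.interval
```

Call `pacer.wait()` immediately before each submit. Read the actual limit for your key from the dashboard rather than hard-coding 500.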

Requests over the limit return:

429 RATE_LIMIT_EXCEEDED

Back off and retry. Implement exponential backoff — don't hot-loop.
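A minimal sketch of exponential backoff with jitter, for illustration. The `submit` callable is a hypothetical stand-in for whatever function sends your request; it is assumed to return an `(http_status, body)` pair. The base delay, cap, and attempt count are arbitrary choices, not values recommended by ZenHire.

```python
import random
import time
from typing import Callable, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Upper bound for the delay on a 0-based attempt: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def submit_with_backoff(
    submit: Callable[[], Tuple[int, dict]],  # hypothetical: returns (status, body)
    max_attempts: int = 6,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Tuple[int, dict]:
    """Call submit() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status_code, body = submit()
        if status_code != 429:
            return status_code, body
        if attempt < max_attempts - 1:
            # Full jitter: sleep a random fraction of the exponential bound so
            # that many clients do not retry in lockstep.
            sleep(rng() * backoff_delay(attempt))
    # Still rate limited after the final attempt; hand the 429 to the caller.
    return status_code, body
```

With the defaults, the delay bound doubles from 1 second up to the 60-second cap, and the last 429 is returned rather than raised so the caller decides what to do next.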

Need higher throughput? Contact ZenHire support — enterprise plans can raise the per-minute cap.

Concurrency cap

The default is 8 simultaneous processing runs per client (contact support to raise it). The cap is enforced atomically at submit time. When you hit the cap, the submit endpoint still returns 202 Accepted, but with status: "queued":

{
  "id": "req_...",
  "status": "queued",
  "queuePosition": 3,
  "activeRequests": 8,
  "pollIntervalSeconds": 20
}

Queued requests start automatically in FIFO order as slots free up. You don't need to resubmit; just poll using the id returned by the submit call (the requestId).
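As a rough sketch, a client might read the submit response like this. The field names come from the sample body above. The `SubmitResult` type, the `parse_submit_response` helper, and the 10-second fallback used when `pollIntervalSeconds` is absent are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmitResult:
    request_id: str
    status: str
    queue_position: Optional[int]  # may be absent when the run is not queued
    poll_interval_seconds: float


def parse_submit_response(status_code: int, body: dict) -> SubmitResult:
    """Turn a 202 submit response into a SubmitResult; queued is not an error."""
    if status_code != 202:
        raise RuntimeError(f"submit was not accepted: HTTP {status_code}")
    return SubmitResult(
        request_id=body["id"],
        status=body["status"],
        queue_position=body.get("queuePosition"),
        # Assumed fallback of 10 seconds if the field is missing.
        poll_interval_seconds=float(body.get("pollIntervalSeconds", 10)),
    )
```

A result with `status == "queued"` takes the same path as any other accepted request: keep the `request_id` and start polling after `poll_interval_seconds`.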

Poll-endpoint rate limit

The poll endpoint has its own minimum interval: 10 seconds per requestId for non-terminal statuses. Polling faster returns 429 POLL_RATE_LIMITED with a Retry-After header.
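The polling rules above could be sketched as follows. The `poll` callable is hypothetical and is assumed to return `(http_status, headers, body)`, with `headers` as a plain dict keyed exactly by `Retry-After`. Terminal status names are not listed in this section, so the caller supplies an `is_terminal` predicate. Only the delta-seconds form of `Retry-After` is handled; any other value falls back to the 10-second minimum.

```python
import time
from typing import Callable, Mapping, Tuple

MIN_POLL_INTERVAL = 10.0  # documented minimum per requestId, in seconds


def next_poll_delay(status_code: int, headers: Mapping[str, str], body: dict) -> float:
    """Seconds to wait before polling the same requestId again."""
    if status_code == 429:
        try:
            # Honor Retry-After when it is given as a number of seconds.
            return max(float(headers.get("Retry-After", "")), 0.0)
        except ValueError:
            return MIN_POLL_INTERVAL  # missing or non-numeric header
    # Respect pollIntervalSeconds, but never go below the documented minimum.
    interval = float(body.get("pollIntervalSeconds", MIN_POLL_INTERVAL))
    return max(interval, MIN_POLL_INTERVAL)


def poll_until_done(
    poll: Callable[[], Tuple[int, Mapping[str, str], dict]],  # hypothetical
    is_terminal: Callable[[str], bool],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 100,
) -> dict:
    """Poll until is_terminal(body['status']) is true, sleeping between polls."""
    for _ in range(max_polls):
        status_code, headers, body = poll()
        if status_code != 429 and is_terminal(body.get("status", "")):
            return body
        sleep(next_poll_delay(status_code, headers, body))
    raise TimeoutError("gave up polling before the request finished")
```

Passing `sleep` and `poll` as parameters keeps the loop easy to test without real network calls or waits.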

Strategy recommendations

  • Respect pollIntervalSeconds from every response.
  • Implement exponential backoff on 429 RATE_LIMIT_EXCEEDED.
  • Don't treat status: queued as an error — it's a normal submit outcome.
  • Parallelize freely — queued runs cost nothing until they start processing.
