CV DeepSearch — Ingestion (fill the corpus)
Ingestion is the first of the two CV DeepSearch flows: you fill a corpus with parsed CVs once, then search it repeatedly. This guide covers ingesting candidates and reading their embedding status.
For the concepts (corpus, candidate, embedding status), see the CV DeepSearch overview. The full endpoint reference is the Ingestion group in the CV DeepSearch API reference sidebar.
Prerequisites
- An API key, issued in the ZenHire dashboard. It looks like zh_api_…. Send it on every request as the X-API-Key header.
- The cvdeepsearch permission on your client. Contact support if your key returns MISSING_PERMISSION.
- Parsed CVs as JSON (bring your own CV parser, or reuse the one from your ATS). CV DeepSearch ingests the parsed JSON; it does not parse PDFs.
Ingest a corpus
POST https://platform.zenhire.ai/api/v1/cvds/candidates
Batch your parsed CVs into a corpus. Each candidate is keyed by your own
external_id and embedded for semantic retrieval.
curl -X POST "https://platform.zenhire.ai/api/v1/cvds/candidates" \
  -H "X-API-Key: zh_api_…" \
  -H "Content-Type: application/json" \
  -d '{
    "corpus_id": "acme-eng-pool",
    "candidates": [
      { "external_id": "cand-001", "parsed_cv": { "name": "Jane Doe", "skills": ["Node.js", "PostgreSQL"], "experience": [] }, "tags": ["batch-2026-q2"] },
      { "external_id": "cand-002", "parsed_cv": { "name": "John Roe", "skills": ["Python", "AWS"], "experience": [] } }
    ]
  }'
Response (HTTP 200):
{
  "results": [
    { "external_id": "cand-001", "status": "accepted", "embedding_status": "embedded" },
    { "external_id": "cand-002", "status": "accepted", "embedding_status": "embedded" }
  ],
  "summary": { "accepted": 2, "updated": 0, "unchanged": 0, "failed": 0 }
}
Node.js
const res = await fetch("https://platform.zenhire.ai/api/v1/cvds/candidates", {
  method: "POST",
  headers: {
    "X-API-Key": process.env.ZENHIRE_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    corpus_id: "acme-eng-pool",
    candidates: [
      { external_id: "cand-001", parsed_cv: { name: "Jane Doe", skills: ["Node.js"] }, tags: ["batch-2026-q2"] },
    ],
  }),
});
const json = await res.json();
console.log(json.summary);
Python
import requests

res = requests.post(
    "https://platform.zenhire.ai/api/v1/cvds/candidates",
    # Same key format as in the curl example (zh_api_…).
    headers={"X-API-Key": "zh_api_…", "Content-Type": "application/json"},
    json={
        "corpus_id": "acme-eng-pool",
        "candidates": [
            {"external_id": "cand-001", "parsed_cv": {"name": "Jane Doe", "skills": ["Node.js"]}},
        ],
    },
    timeout=60,
)
res.raise_for_status()  # raise on any 4xx/5xx response
print(res.json()["summary"])
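Because a batch can come back with some candidates already embedded and others not, it can help to group the returned ids by embedding_status before deciding what to poll. The following is a minimal sketch built on the response shape shown above; split_by_embedding_status is a hypothetical helper name, and the sample values are illustrative only.

```python
from collections import defaultdict


def split_by_embedding_status(response: dict) -> dict[str, list[str]]:
    """Group the external_ids of an ingest response by their embedding_status."""
    groups: dict[str, list[str]] = defaultdict(list)
    for item in response.get("results", []):
        # A result without embedding_status is filed under "unknown" instead of guessed.
        groups[item.get("embedding_status", "unknown")].append(item["external_id"])
    return dict(groups)


# Illustrative response: one candidate embedded, one still pending.
sample = {
    "results": [
        {"external_id": "cand-001", "status": "accepted", "embedding_status": "embedded"},
        {"external_id": "cand-002", "status": "accepted", "embedding_status": "pending"},
    ],
    "summary": {"accepted": 2, "updated": 0, "unchanged": 0, "failed": 0},
}
groups = split_by_embedding_status(sample)
print(groups)
```

Ids in any group other than embedded are the ones worth checking again before running a search.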
Request shape — single or array
candidates is always an array. Send one candidate as a one-element array,
or many at once:
// one candidate
{ "corpus_id": "acme-eng-pool", "candidates": [
  { "external_id": "cand-001", "parsed_cv": { "name": "Jane Doe" } }
] }

// many candidates
{ "corpus_id": "acme-eng-pool", "candidates": [
  { "external_id": "cand-001", "parsed_cv": { "name": "Jane Doe" } },
  { "external_id": "cand-002", "parsed_cv": { "name": "John Roe" } }
] }
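A small client-side helper can enforce the "always an array" rule and catch a missing corpus_id before the request is sent. This is only a sketch: build_ingest_body is a hypothetical name, and it assumes the per-candidate field is also called corpus_id.

```python
def build_ingest_body(candidates, corpus_id=None) -> dict:
    """Wrap one candidate (dict) or many (iterable of dicts) into an ingest body."""
    if isinstance(candidates, dict):
        candidates = [candidates]  # a single candidate becomes a one-element array
    candidates = list(candidates)

    # Assumption: a per-candidate corpus_id lives under the key "corpus_id".
    missing = [
        c.get("external_id")
        for c in candidates
        if not (corpus_id or c.get("corpus_id"))
    ]
    if missing:
        raise ValueError(f"no resolvable corpus_id for: {missing}")

    body = {"candidates": candidates}
    if corpus_id:
        body["corpus_id"] = corpus_id
    return body


one = build_ingest_body(
    {"external_id": "cand-001", "parsed_cv": {"name": "Jane Doe"}},
    corpus_id="acme-eng-pool",
)
print(len(one["candidates"]))
```

Failing locally is cheaper than a round trip, since a single candidate without a resolvable corpus_id rejects the whole batch on the server.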
Things to know
- corpus_id is fail-closed. Set it top-level or per candidate. A candidate with no resolvable corpus_id rejects the whole batch with MISSING_CORPUS_ID.
- Idempotent. Re-posting an unchanged candidate is a no-op (unchanged); a changed CV re-embeds it (updated). Re-sync a corpus by re-posting.
- Limits: up to 500 candidates and ~10 MB per request. More than 500 candidates returns BATCH_TOO_LARGE; a body over ~10 MB returns PAYLOAD_TOO_LARGE. For larger corpora, split into multiple requests (≤ 500 candidates and ≤ ~10 MB each) or use bulk sync.
- parsed_cv is PII. It's embedded but never returned by the read endpoints, so you keep your own copy.
- Embedding can be async. Large batches return pending and embed in the background. Poll the candidate-status endpoint (below) until the status is embedded before relying on the candidate appearing in a search.
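The request limits above suggest splitting a large corpus on the client side. Here is one possible sketch of such a splitter; chunk_candidates is a hypothetical helper, and the 9,000,000-byte budget is an assumed safety margin under the documented ~10 MB, not a value from the API.

```python
import json

MAX_CANDIDATES = 500    # documented per-request candidate limit
MAX_BYTES = 9_000_000   # assumption: headroom below the documented ~10 MB


def chunk_candidates(candidates, max_count=MAX_CANDIDATES, max_bytes=MAX_BYTES):
    """Yield lists of candidates that stay within both the count and size budgets."""
    batch, batch_bytes = [], 0
    for cand in candidates:
        # Approximate serialized size; +2 covers the ", " separator between items.
        size = len(json.dumps(cand).encode("utf-8")) + 2
        if size > max_bytes:
            raise ValueError(f"candidate {cand.get('external_id')!r} exceeds max_bytes on its own")
        if batch and (len(batch) >= max_count or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(cand)
        batch_bytes += size
    if batch:
        yield batch


candidates = [{"external_id": f"cand-{i}", "parsed_cv": {"name": "x"}} for i in range(1201)]
batches = list(chunk_candidates(candidates))
print([len(b) for b in batches])
```

Each yielded list can be posted as the candidates array of its own request. Because ingestion is idempotent, re-sending a batch after a failure should not create duplicates.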
Check embedding status
A candidate is only returned by a search once its embedding_status is
embedded. There are two read endpoints.
List a corpus
GET https://platform.zenhire.ai/api/v1/cvds/candidates?corpus_id=acme-eng-pool
curl "https://platform.zenhire.ai/api/v1/cvds/candidates?corpus_id=acme-eng-pool&limit=50" \
-H "X-API-Key: zh_api_…"
Returns a page of candidate statuses (paginated via cursor / next_cursor).
You can filter to one state with embedding_status=pending|embedded|failed.
corpus_id is mandatory — a request without it is rejected with
MISSING_CORPUS_ID (fail-closed).
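Walking a whole corpus means following next_cursor from page to page. The sketch below separates the loop from the HTTP call so the loop can be exercised offline. It makes several assumptions: the page's items sit under a key named "candidates", the last page omits next_cursor or sets it to null, and the helper names are hypothetical. It uses only the standard library.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

LIST_URL = "https://platform.zenhire.ai/api/v1/cvds/candidates"


def make_fetch_page(api_key: str, corpus_id: str, limit: int = 50) -> Callable[[Optional[str]], dict]:
    """Return a function that fetches one page for a cursor (None means first page)."""
    def fetch_page(cursor: Optional[str]) -> dict:
        params = {"corpus_id": corpus_id, "limit": limit}
        if cursor:
            params["cursor"] = cursor
        req = urllib.request.Request(
            f"{LIST_URL}?{urllib.parse.urlencode(params)}",
            headers={"X-API-Key": api_key},
        )
        with urllib.request.urlopen(req, timeout=60) as resp:
            return json.load(resp)
    return fetch_page


def iter_candidate_statuses(
    fetch_page: Callable[[Optional[str]], dict],
    items_key: str = "candidates",  # assumption: name of the array in each page
) -> Iterator[dict]:
    """Yield every item across pages, following next_cursor until it is absent."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get(items_key, [])
        cursor = page.get("next_cursor")
        if not cursor:
            break
```

A call such as iter_candidate_statuses(make_fetch_page(key, "acme-eng-pool")) would then stream statuses for the whole corpus, one page at a time.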
Get one candidate
GET https://platform.zenhire.ai/api/v1/cvds/candidates/{external_id}?corpus_id=acme-eng-pool
curl "https://platform.zenhire.ai/api/v1/cvds/candidates/cand-001?corpus_id=acme-eng-pool" \
-H "X-API-Key: zh_api_…"
Poll this until embedding_status is embedded before searching.
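A polling loop for this endpoint might look like the sketch below. It assumes the response carries embedding_status at the top level and treats failed as terminal. The timeout and interval defaults are arbitrary choices, and wait_until_embedded is a hypothetical name. The HTTP call is passed in as get_status so the loop stays independent of any client library.

```python
import time
from typing import Callable


def wait_until_embedded(
    get_status: Callable[[], dict],
    timeout_s: float = 300.0,   # arbitrary default, tune for your batch sizes
    interval_s: float = 2.0,    # arbitrary default pause between polls
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_status() until embedding_status is "embedded"; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        state = status.get("embedding_status")
        if state == "embedded":
            return status
        if state == "failed":
            raise RuntimeError(f"embedding failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

Here get_status would wrap the GET request shown above and return its parsed JSON body.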
Error codes
The ingestion endpoints use the shared standard error envelope:
| error.code | HTTP | Meaning | Retry? |
|---|---|---|---|
| INVALID_INPUT | 400 | A field failed validation. | After fix |
| MISSING_CORPUS_ID | 400 | A mandatory corpus_id was missing (fail-closed). | After fix |
| BATCH_TOO_LARGE | 400 | More than 500 candidates in one ingest. | After fix |
| PAYLOAD_TOO_LARGE | 413 | Request body over the ~10 MB per-request limit. | Split / smaller batch |
| MISSING_PERMISSION | 401 / 403 | Missing/invalid key, or missing cvdeepsearch perm. | Contact support |
| NOT_FOUND | 404 | No candidate with that id for your client. | No |
| INTERNAL_ERROR | 500 | Uncategorised server error. | After delay |
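The Retry? column can be turned into a small lookup for client code. This sketch assumes the error envelope nests the code as {"error": {"code": ...}}, as the error.code column header suggests. The action labels and the next_action helper are hypothetical, and unknown codes are conservatively treated as not retryable.

```python
# Action per error code, transcribed from the Retry? column of the table above.
RETRY_POLICY = {
    "INVALID_INPUT": "fix_request",
    "MISSING_CORPUS_ID": "fix_request",
    "BATCH_TOO_LARGE": "fix_request",
    "PAYLOAD_TOO_LARGE": "split_batch",
    "MISSING_PERMISSION": "contact_support",
    "NOT_FOUND": "do_not_retry",
    "INTERNAL_ERROR": "retry_after_delay",
}


def next_action(error_body: dict) -> str:
    """Map a parsed error response to a suggested client action."""
    # Assumption: the envelope looks like {"error": {"code": "..."}}.
    code = (error_body.get("error") or {}).get("code")
    # Codes not listed in the table default to the most conservative action.
    return RETRY_POLICY.get(code, "do_not_retry")


print(next_action({"error": {"code": "INTERNAL_ERROR"}}))
```

Only retry_after_delay warrants an automatic retry; the other actions need a changed request or human intervention first.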
Next step
Once your corpus is embedded, move on to the Search guide to query it for a position.