Ingest candidates into a corpus
POST /api/v1/cvds/candidates
Batch upsert parsed CVs into a corpus. Each candidate is keyed by
(corpus_id, external_id) and idempotently upserted — re-ingesting a
candidate whose content is unchanged is a no-op (unchanged), while a
changed parsed_cv re-embeds it (updated).
Corpus resolution (fail-closed)
Provide corpus_id at the top level (applies to every candidate) or
per-candidate (overrides the top-level value). A candidate with no
resolvable corpus_id rejects the whole batch with
MISSING_CORPUS_ID — it is never merged into a default corpus.
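The resolution rule above can be mirrored client-side as a pre-flight check, so a batch that the server would reject with MISSING_CORPUS_ID never leaves the client. The following is a minimal sketch, not the service's own implementation; the function name resolve_corpus_ids and the MissingCorpusId exception are hypothetical, and it assumes the per-candidate value lives under a corpus_id key on each candidate object.

```python
from typing import Any, Dict, List, Optional


class MissingCorpusId(ValueError):
    """Raised client-side when a candidate has no resolvable corpus_id."""


def resolve_corpus_ids(
    candidates: List[Dict[str, Any]],
    top_level_corpus_id: Optional[str] = None,
) -> List[str]:
    """Return the effective corpus_id for each candidate, or raise for the whole batch."""
    resolved: List[str] = []
    for index, candidate in enumerate(candidates):
        # A per-candidate value overrides the top-level one.
        corpus_id = candidate.get("corpus_id") or top_level_corpus_id
        if not corpus_id:
            # Fail closed: one unresolved candidate rejects the entire batch.
            raise MissingCorpusId(f"candidate at index {index} has no corpus_id")
        resolved.append(corpus_id)
    return resolved
```

Raising instead of silently skipping the offending candidate keeps the client's behavior aligned with the documented all-or-nothing rejection.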
Request shape
candidates is an array of { external_id, parsed_cv, … } items —
send a single candidate as a one-element array. Set corpus_id once at
the top level (applies to every candidate) or per-candidate.
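As an illustration of the request shape, the sketch below builds a JSON body with a candidates array and an optional top-level corpus_id. The helper name build_ingest_body is hypothetical, and parsed_cv is treated as an opaque object because its schema is not described here. Sending the body is left out on purpose, since the base URL and the authentication header are not specified in this section.

```python
import json
from typing import Any, Dict, List, Optional


def build_ingest_body(
    candidates: List[Dict[str, Any]],
    corpus_id: Optional[str] = None,
) -> bytes:
    """Serialize an ingest request body; a single candidate is a one-element list."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    for index, candidate in enumerate(candidates):
        # external_id and parsed_cv are the two fields named in the request shape.
        for field in ("external_id", "parsed_cv"):
            if field not in candidate:
                raise ValueError(f"candidate at index {index} is missing {field}")
    body: Dict[str, Any] = {"candidates": candidates}
    if corpus_id is not None:
        # Top-level corpus_id applies to every candidate that does not set its own.
        body["corpus_id"] = corpus_id
    return json.dumps(body).encode("utf-8")
```

For example, build_ingest_body([{"external_id": "cand-1", "parsed_cv": {...}}], corpus_id="my-corpus") yields a body with one candidate, where "cand-1" and "my-corpus" are placeholder values.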
Limits
- Up to 500 candidates per request (BATCH_TOO_LARGE above that).
- ~10 MB request body per request (PAYLOAD_TOO_LARGE above that).
For larger corpora, split into multiple requests (≤ 500 candidates and ≤ ~10 MB each) or use bulk sync.
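One way to stay under both limits is to split a large candidate list on the client before sending. The sketch below is an assumption-laden example rather than an official client: split_into_batches is a hypothetical helper, the byte budget of 9,000,000 is an arbitrary safety margin under the approximate 10 MB limit, and the size estimate is based on each candidate's JSON encoding and ignores the small request envelope.

```python
import json
from typing import Any, Dict, Iterator, List

MAX_CANDIDATES = 500        # documented per-request candidate limit
MAX_BODY_BYTES = 9_000_000  # assumed margin below the ~10 MB body limit


def split_into_batches(
    candidates: List[Dict[str, Any]],
    max_count: int = MAX_CANDIDATES,
    max_bytes: int = MAX_BODY_BYTES,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive batches that respect both the count and byte budgets."""
    batch: List[Dict[str, Any]] = []
    batch_bytes = 0
    for candidate in candidates:
        # Estimated encoded size, plus 2 bytes for the ", " list separator.
        size = len(json.dumps(candidate).encode("utf-8")) + 2
        if size > max_bytes:
            raise ValueError("a single candidate exceeds the byte budget")
        if batch and (len(batch) >= max_count or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(candidate)
        batch_bytes += size
    if batch:
        yield batch
```

Because ingestion is an idempotent upsert keyed by (corpus_id, external_id), each batch can be sent as its own request without coordinating with the others.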
Small batches embed inline and return embedded statuses directly;
larger batches are accepted immediately with pending embedding
statuses and embed in the background. Poll
GET /api/v1/cvds/candidates/{external_id} for the final status.
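The polling step might look like the sketch below. It assumes the status endpoint exposes the same embedding_status value that the ingest response reports, and it takes the HTTP call as an injected callable (fetch_status, a hypothetical name) because the base URL and authentication scheme are not covered here. The interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable


def wait_for_embedding(
    external_id: str,
    fetch_status: Callable[[str], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the candidate's embedding status is no longer "pending"."""
    deadline = time.monotonic() + timeout_s
    while True:
        # fetch_status is expected to call GET /api/v1/cvds/candidates/{external_id}
        # and return the candidate's embedding status as a string.
        status = fetch_status(external_id)
        if status != "pending":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{external_id} still pending after {timeout_s}s")
        sleep(interval_s)
```

Treating every non-pending value as final avoids hard-coding the full list of terminal statuses, which this section does not enumerate.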
Required permission
The client's permissions[] must contain cvdeepsearch. Without it the
request returns 403 MISSING_PERMISSION.
Request
Responses
- 200: Per-candidate ingest result plus a roll-up summary. A 200 is returned even when some candidates fail to embed; inspect the per-candidate embedding_status and the summary.failed count.
- 400: Validation failure. error.code is one of:
  - INVALID_INPUT: a field failed validation (empty candidates, missing external_id/parsed_cv, …).
  - MISSING_CORPUS_ID: a candidate had no resolvable corpus_id (top-level or per-candidate). Fail-closed: the whole batch is rejected.
  - BATCH_TOO_LARGE: more than 500 candidates in one request.
- 401: Missing or invalid API key.
- 403: Authenticated but missing the cvdeepsearch permission.
- 413: Request body exceeds the ~10 MB per-request limit (PAYLOAD_TOO_LARGE). Split the corpus into smaller requests (≤ 500 candidates and ≤ ~10 MB each) or use bulk sync.
- 500: Internal error.
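Since a 200 can still contain candidates that failed to embed, a client will usually want to inspect the body rather than trust the status code alone. The sketch below groups candidates by embedding_status. The exact response layout is not given in this section, so the results key holding the per-candidate list is an assumption (exposed as a parameter), as is the presence of external_id on each result item; only embedding_status and summary.failed come from the description above.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_embedding_status(
    response_body: Dict[str, Any],
    results_key: str = "results",  # assumed name of the per-candidate list
) -> Dict[str, List[str]]:
    """Map each embedding_status value to the external_ids that reported it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item in response_body.get(results_key, []):
        status = item.get("embedding_status", "unknown")
        groups[status].append(item.get("external_id"))
    return dict(groups)


def has_failures(response_body: Dict[str, Any]) -> bool:
    """True when the roll-up summary reports at least one failed candidate."""
    return response_body.get("summary", {}).get("failed", 0) > 0
```

A caller could re-send only the candidates that need attention, or hand the pending ones to a polling loop.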