Ingest candidates into a corpus

POST /api/v1/cvds/candidates

Batch-upserts parsed CVs into a corpus. Each candidate is keyed by (corpus_id, external_id) and upserted idempotently: re-ingesting a candidate whose content is unchanged is a no-op (reported as unchanged), while a changed parsed_cv causes the candidate to be re-embedded (reported as updated).

Corpus resolution (fail-closed)

Provide corpus_id at the top level (applies to every candidate) or per-candidate (overrides the top-level value). If any candidate has no resolvable corpus_id, the whole batch is rejected with MISSING_CORPUS_ID; a candidate is never merged into a default corpus.
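The resolution rule above can be checked on the client before a batch is sent. The following is a minimal sketch, not the service's implementation: resolve_corpus_ids and MissingCorpusId are hypothetical names, and the body is assumed to be a plain dict with the corpus_id and candidates fields described on this page.

```python
from typing import Any, Optional


class MissingCorpusId(ValueError):
    """Hypothetical client-side mirror of the MISSING_CORPUS_ID rejection."""


def resolve_corpus_ids(body: dict[str, Any]) -> list[str]:
    """Return the effective corpus_id of every candidate, or raise for the whole batch."""
    top_level: Optional[str] = body.get("corpus_id")
    resolved: list[str] = []
    for index, candidate in enumerate(body.get("candidates", [])):
        # A per-candidate corpus_id overrides the top-level value.
        corpus_id = candidate.get("corpus_id") or top_level
        if not corpus_id:
            # Fail closed: one unresolvable candidate rejects the entire batch.
            raise MissingCorpusId(f"candidate {index} has no resolvable corpus_id")
        resolved.append(corpus_id)
    return resolved
```

Running a check like this locally avoids a round trip that would be rejected as a whole anyway.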

Request shape

candidates is an array of { external_id, parsed_cv, … } items; to ingest a single candidate, send a one-element array. corpus_id can be set once at the top level or per-candidate, as described under Corpus resolution.
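As an illustration of the request shape, the sketch below builds (but does not send) a single-candidate request with Python's standard library. BASE_URL, API_TOKEN, the bearer Authorization header, and the contents of parsed_cv are placeholders assumed for the example; this page does not specify the host, the authentication scheme, or the parsed_cv schema.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder host
API_TOKEN = "YOUR_TOKEN"              # assumption: auth scheme is not stated on this page


def build_ingest_request(corpus_id: str, candidates: list[dict]) -> urllib.request.Request:
    """Build a POST request carrying one top-level corpus_id and a candidates array."""
    body = {"corpus_id": corpus_id, "candidates": candidates}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/cvds/candidates",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",  # assumption, see above
        },
    )


# A single candidate is still sent as a one-element array.
request = build_ingest_request(
    "corpus-demo",
    [{"external_id": "cand-001", "parsed_cv": {"name": "Ada Example"}}],  # placeholder parsed_cv
)
```

Passing the request to urllib.request.urlopen would send it; that call is left out so the sketch has no side effects.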

Limits

  • Up to 500 candidates per request (BATCH_TOO_LARGE above that).
  • ~10 MB request body per request (PAYLOAD_TOO_LARGE above that). For larger corpora, split into multiple requests (≤ 500 candidates and ≤ ~10 MB each) or use bulk sync.
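Splitting a large corpus to respect both limits might look like the sketch below. split_into_batches is a hypothetical helper; the 9,000,000-byte budget is an assumed safety margin under the approximate 10 MB limit, chosen because the per-candidate size estimate ignores the surrounding request envelope.

```python
import json
from typing import Any, Iterable, Iterator

MAX_CANDIDATES = 500        # documented per-request candidate limit
MAX_BODY_BYTES = 9_000_000  # assumption: margin below the ~10 MB body limit


def split_into_batches(
    candidates: Iterable[dict[str, Any]],
    max_candidates: int = MAX_CANDIDATES,
    max_bytes: int = MAX_BODY_BYTES,
) -> Iterator[list[dict[str, Any]]]:
    """Yield batches that stay within both the count limit and the byte budget."""
    batch: list[dict[str, Any]] = []
    batch_bytes = 0
    for candidate in candidates:
        # Estimate the serialized size of this candidate (+1 for the separating comma).
        size = len(json.dumps(candidate).encode("utf-8")) + 1
        # Close the current batch if adding this candidate would break either limit.
        if batch and (len(batch) >= max_candidates or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(candidate)
        batch_bytes += size
    if batch:
        yield batch
```

A single candidate that exceeds the byte budget on its own is still yielded as a one-element batch and would presumably be rejected by the server, so very large CVs need separate handling.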

Small batches are embedded inline and return embedded statuses directly. Larger batches are accepted immediately with pending embedding statuses and are embedded in the background; poll GET /api/v1/cvds/candidates/{external_id} for the final status.
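A polling loop for background embedding could be sketched as follows. wait_for_embedding is a hypothetical helper, and fetch_candidate stands in for whatever function performs GET /api/v1/cvds/candidates/{external_id} and returns the decoded JSON. The sketch assumes that the GET response exposes an embedding_status field and that "pending" is the only non-final value; neither is confirmed by this page.

```python
import time
from typing import Any, Callable


def wait_for_embedding(
    fetch_candidate: Callable[[str], dict[str, Any]],
    external_id: str,
    *,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the candidate's embedding_status is no longer 'pending'."""
    deadline = time.monotonic() + timeout_s
    while True:
        candidate = fetch_candidate(external_id)
        # Assumption: any status other than "pending" is final.
        if candidate.get("embedding_status") != "pending":
            return candidate
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{external_id} still pending after {timeout_s} s")
        sleep(interval_s)
```

Injecting fetch_candidate and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.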

Required permission

The client's permissions[] array must contain cvdeepsearch. Without it, the request returns 403 MISSING_PERMISSION.

Request

Responses

The response contains a per-candidate ingest result plus a roll-up summary. A 200 is returned even when some candidates fail to embed, so inspect each candidate's embedding_status and the summary.failed count rather than relying on the HTTP status alone.
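Because a 200 does not imply that every candidate was embedded, a client will usually want a check along these lines. This is a sketch under stated assumptions: only embedding_status and summary.failed are named on this page, while the results key, the presence of external_id on each result, and the literal status value "failed" are guesses made for illustration.

```python
from typing import Any


def find_failed_candidates(response_body: dict[str, Any]) -> list[str]:
    """Return the external_ids of candidates whose embedding failed."""
    # Assumption: per-candidate results live under "results" and carry external_id.
    results = response_body.get("results", [])
    return [
        result["external_id"]
        for result in results
        # Assumption: a failed embedding is reported as the string "failed".
        if result.get("embedding_status") == "failed"
    ]


def has_failures(response_body: dict[str, Any]) -> bool:
    """Cheap check using only the documented summary.failed counter."""
    return response_body.get("summary", {}).get("failed", 0) > 0
```

Checking has_failures first and only then walking the per-candidate results keeps the common all-succeeded path cheap.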