Batch Lifecycle
Typed Batch resource with state machine, per-row error report, replay framework, and four source kinds.
The Batch resource
type Batch = {
  id: string
  surface: "partner" | "ipaas" | "batch"
  operation: BatchOperationKind
  status: BatchStatus
  source: BatchSource
  created_at: ISO8601
  started_at: ISO8601 | null
  completed_at: ISO8601 | null
  progress: {
    rows_total: number | null // null when total isn't yet known
    rows_processed: number
    rows_succeeded: number
    rows_failed: number
  }
  errors_url: string | null // populated when status is partially_failed or failed
  result_url: string | null // populated when status is succeeded or partially_failed
  idempotency_key: string // echoed back
}
type BatchStatus =
  | "accepted" // queued; not yet running
  | "running"
  | "succeeded" // all rows processed; zero failed
  | "partially_failed" // some rows processed, some failed
  | "failed" // unrecoverable error
  | "cancelled" // operator-cancelled via DELETE
Source kinds
Four kinds, picked by load profile and delivery mechanism:
type BatchSource =
  | { kind: "inline"; rows: unknown[] } // ≤ 1000 rows
  | { kind: "signed_url"; url: string; format: "csv" | "jsonl" } // pre-signed URL
  | { kind: "s3"; bucket: string; key: string; format: "csv" | "jsonl"; assume_role_arn?: string }
  | { kind: "sftp"; credentials_id: string; path: string; format: "csv" | "jsonl" }
When to use each
- inline: small one-shot uploads, ≤1000 rows. Lowest latency.
- signed_url: the partner generates a temporary download URL; gondor fetches it at job start. Good for ad-hoc uploads from any storage.
- s3: a long-lived gondor-owned bucket, or a partner-owned bucket via assume_role_arn. 50GB cap.
- sftp: the partner uploads to a gondor-managed SFTP endpoint with PGP. The credentials_id references a per-tenant SftpCredentials resource; see POST /v2/partner/sftp-credentials.
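A client can catch obviously invalid sources before submitting a batch. The following is a minimal sketch, not part of the gondor API: checkSource and INLINE_MAX_ROWS are hypothetical names. The only limit it enforces is the 1000-row inline cap stated above. The non-empty-field checks are assumptions about what a sensible request looks like, and the 50GB s3 cap is left out because the object size isn't known from the source descriptor alone.

```typescript
// Local copy of the BatchSource union from this section.
type Format = "csv" | "jsonl";
type BatchSource =
  | { kind: "inline"; rows: unknown[] }
  | { kind: "signed_url"; url: string; format: Format }
  | { kind: "s3"; bucket: string; key: string; format: Format; assume_role_arn?: string }
  | { kind: "sftp"; credentials_id: string; path: string; format: Format };

// Row limit for inline sources, as stated above.
const INLINE_MAX_ROWS = 1000;

// Hypothetical pre-flight check: returns a list of problems, empty when none were found.
function checkSource(source: BatchSource): string[] {
  const problems: string[] = [];
  switch (source.kind) {
    case "inline":
      if (source.rows.length > INLINE_MAX_ROWS) {
        problems.push(`inline accepts at most ${INLINE_MAX_ROWS} rows, got ${source.rows.length}`);
      }
      break;
    case "signed_url":
      // Assumption: an empty URL is never valid.
      if (source.url === "") problems.push("signed_url requires a url");
      break;
    case "s3":
      // Assumption: bucket and key must both be non-empty.
      if (source.bucket === "" || source.key === "") problems.push("s3 requires bucket and key");
      break;
    case "sftp":
      // Assumption: credentials_id must be non-empty.
      if (source.credentials_id === "") problems.push("sftp requires a credentials_id");
      break;
    default: {
      // Compile-time exhaustiveness check: stops type-checking if a new kind is added but not handled.
      const unreachable: never = source;
      return unreachable;
    }
  }
  return problems;
}
```

Switching on kind lets the compiler narrow each branch to the fields that exist for that source, so a fifth source kind would surface as a type error rather than a silent fall-through.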
Lifecycle endpoints
POST /v2/batch/operations
  Idempotency-Key: <uuid>
  Body: { operation: BatchOperationKind, source: BatchSource, params: ... }
  → 201 Batch (status: "accepted")
GET /v2/batch/operations/{id}
  → Batch
GET /v2/batch/operations/{id}/errors
  → BatchErrorReport (paginated)
GET /v2/batch/operations/{id}/result
  → result data (per operation)
DELETE /v2/batch/operations/{id}
  X-Operator-Override: destructive
  → 204 (cancels if running)
POST /v2/batch/operations/{id}/replay
  Idempotency-Key: <new-uuid>
  Body: { only: "failed_rows" | "all_rows" }
  → 201 Batch (new batch with same op + source filtered to the replay target)
Per-row error report
type BatchErrorReport = {
  batch_id: string
  total_failed: number
  errors: BatchErrorRow[] // paginated; sampling rules apply when total_failed > 10000
  sampled: boolean
  next_cursor: string | null
}
type BatchErrorRow = {
  row_number: number // 1-indexed
  source_identifier: string | null // e.g., the row's unique_id from the CSV
  errors: Error[] // standard Error envelope per row
}
The error report is paginated. For batches with more than 10,000 failed rows, the response is sampled (sampled: true); fetch the full report via errors_url, which returns a JSONL stream.
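Walking the paginated report comes down to following next_cursor until it is null. The sketch below assumes the caller supplies a fetchPage function (hypothetical, for example a wrapper around GET /v2/batch/operations/{id}/errors); how the cursor is passed on the wire is not specified here, so that detail is left to the caller. ErrorEnvelope is a placeholder for the standard Error envelope, whose fields are defined elsewhere.

```typescript
// Placeholder for the standard Error envelope; its real fields are not defined in this section.
type ErrorEnvelope = { code: string; message: string };
type BatchErrorRow = {
  row_number: number;
  source_identifier: string | null;
  errors: ErrorEnvelope[];
};
type BatchErrorReport = {
  batch_id: string;
  total_failed: number;
  errors: BatchErrorRow[];
  sampled: boolean;
  next_cursor: string | null;
};

// Hypothetical page fetcher supplied by the caller; null requests the first page.
type FetchPage = (cursor: string | null) => Promise<BatchErrorReport>;

// Follows next_cursor until it is null and collects every returned row.
// `sampled` is true if any page was sampled, which signals that the caller
// should fall back to errors_url for the complete report.
async function collectErrorRows(
  fetchPage: FetchPage,
): Promise<{ rows: BatchErrorRow[]; sampled: boolean }> {
  const rows: BatchErrorRow[] = [];
  let sampled = false;
  let cursor: string | null = null;
  do {
    const page: BatchErrorReport = await fetchPage(cursor);
    rows.push(...page.errors);
    sampled = sampled || page.sampled;
    cursor = page.next_cursor;
  } while (cursor !== null);
  return { rows, sampled };
}
```

Injecting fetchPage keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with canned pages.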
Idempotency on batches
Same Idempotency-Key + same operation + same source hash → returns the existing batch (not a new one).
Same Idempotency-Key + DIFFERENT source → 409 idempotency_conflict.
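The two rules above can be sketched as a lookup keyed by Idempotency-Key. This is an in-memory illustration only, not gondor's implementation: IdempotencyIndex, its method names, and the generated batch ids are hypothetical, and the source hash is taken as an opaque string because its computation is not specified here. The text does not say what happens for the same key with a different operation; this sketch assumes that case is also a conflict.

```typescript
type StoredBatch = { batchId: string; operation: string; sourceHash: string };

type Decision =
  | { outcome: "created"; batchId: string }
  | { outcome: "existing"; batchId: string }
  | { outcome: "conflict"; status: 409; code: "idempotency_conflict" };

// Hypothetical in-memory index from Idempotency-Key to the batch it first created.
class IdempotencyIndex {
  private readonly byKey = new Map<string, StoredBatch>();
  private nextId = 1;

  submit(key: string, operation: string, sourceHash: string): Decision {
    const prior = this.byKey.get(key);
    if (prior === undefined) {
      // First use of this key: create and remember a new batch.
      const batchId = `batch_${this.nextId++}`;
      this.byKey.set(key, { batchId, operation, sourceHash });
      return { outcome: "created", batchId };
    }
    if (prior.operation === operation && prior.sourceHash === sourceHash) {
      // Same key, same operation, same source hash: return the existing batch.
      return { outcome: "existing", batchId: prior.batchId };
    }
    // Same key with a different source (or, by assumption, a different operation).
    return { outcome: "conflict", status: 409, code: "idempotency_conflict" };
  }
}
```

A retry of an identical request therefore resolves to the original batch id, while reusing a key for different data is rejected instead of silently overwriting.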
Per-row idempotency (within a batch)
Some Op handlers — notably messages.send — accept per-row idempotency keys inside the batch payload. The key is scoped to the operation + the row's natural identifier; replays of the same row within the same batch return the cached per-row result. This is distinct from the batch-level Idempotency-Key header which scopes the batch as a whole.
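To illustrate the scoping, here is a minimal sketch of a per-batch cache for row results. RowResultCache, RowResult, and the handler callback are hypothetical names, and the exact composition of the cache key (operation, natural identifier, per-row key) is an assumption drawn from the description above rather than a documented format.

```typescript
// Hypothetical per-row outcome; real Op handlers define their own result shape.
type RowResult = { ok: boolean; detail: string };

// One instance per batch: results never leak across batches.
class RowResultCache {
  private readonly results = new Map<string, RowResult>();

  // Assumption: the per-row key is scoped by operation and the row's natural identifier.
  private static scopedKey(operation: string, naturalId: string, rowKey: string): string {
    return JSON.stringify([operation, naturalId, rowKey]);
  }

  // Runs the handler once per scoped key; later calls return the cached result.
  run(operation: string, naturalId: string, rowKey: string, handler: () => RowResult): RowResult {
    const key = RowResultCache.scopedKey(operation, naturalId, rowKey);
    const cached = this.results.get(key);
    if (cached !== undefined) return cached;
    const result = handler();
    this.results.set(key, result);
    return result;
  }
}
```

Encoding the scope as a JSON array avoids accidental collisions that plain string concatenation could produce when identifiers contain the separator character.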
Replay
Two replay modes:
- failed_rows — only the rows that failed in the original. The new batch carries the same Op handler and source kind, filtered to the failure set.
- all_rows — rerun the entire source. Useful when the failure was upstream of the row data (a misconfigured Op param, a flaky downstream service).
Replays inherit the original's lineage: the replay's params carry the parent batch_id, which links the replay back to the original batch.
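A client might choose the replay mode from the original batch's final status and then build the request. The sketch below is an assumption-laden illustration: suggestReplayMode and buildReplayRequest are hypothetical helpers, and the mapping from status to mode is a heuristic (partially_failed suggests a row-level problem, failed suggests something upstream of the rows), not a rule stated by the API. The caller supplies the new Idempotency-Key.

```typescript
type BatchStatus =
  | "accepted"
  | "running"
  | "succeeded"
  | "partially_failed"
  | "failed"
  | "cancelled";
type ReplayMode = "failed_rows" | "all_rows";

type ReplayRequest = {
  method: "POST";
  path: string;
  headers: { "Idempotency-Key": string };
  body: { only: ReplayMode };
};

// Heuristic only (an assumption of this sketch): returns null when a replay
// does not obviously apply to the given status.
function suggestReplayMode(status: BatchStatus): ReplayMode | null {
  switch (status) {
    case "partially_failed":
      return "failed_rows"; // some rows failed: retry just those
    case "failed":
      return "all_rows"; // unrecoverable error: rerun the whole source
    default:
      return null;
  }
}

// Describes the replay call from the endpoint listing; sending it is left to the caller.
function buildReplayRequest(batchId: string, only: ReplayMode, newIdempotencyKey: string): ReplayRequest {
  return {
    method: "POST",
    path: `/v2/batch/operations/${encodeURIComponent(batchId)}/replay`,
    headers: { "Idempotency-Key": newIdempotencyKey },
    body: { only },
  };
}
```

An operator can always override the suggestion, for example choosing all_rows on a partially_failed batch after fixing a misconfigured Op param.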