The Ingest endpoints handle document uploads and status polling. Parsing runs asynchronously — upload returns a task_id immediately, which you then poll until ingestion completes.
Use an official SDK if you can — the Python and JavaScript SDKs wrap the direct-upload flow into a single client.ingest.file(...) / client.ingest.batch(...) call that works for any file size. The endpoint-level details below are only needed if you’re calling the REST API directly.

Upload a Document

The default upload flow streams your file bytes straight from your client to Polyvia’s storage backend, then asks our API to register and parse it. This is a three-step flow that works for any file size — there is no practical upper limit.

Step 1 — Get an upload URL

POST /api/v1/ingest/upload-url

No request body. The response contains a single field:
upload_url
string
required
Short-lived signed URL for the storage backend. Expires in ~1 hour and is single-use.
{ "upload_url": "https://<your-deployment>.convex.cloud/api/storage/upload?token=..." }

Step 2 — PUT the file to that URL

Stream the raw file bytes to the returned upload_url. Set Content-Type to the file’s MIME type.
Do not include your Polyvia Authorization: Bearer poly_<key> header on this request. The URL is already signed, and forwarding the key to a different origin would leak it unnecessarily.
Storage responds with the new object’s identifier:
{ "storageId": "kg2abc...", "size": 12345678 }

Step 3 — Finalize the upload

POST /api/v1/ingest/finalize

application/json
storage_id
string
required
The storageId returned by storage in Step 2.
file_type
string
required
MIME type of the uploaded file.
name
string
Display name. Defaults to “Untitled” if omitted.
group_id
string
Assign the document to a group on creation.

Response

document_id
string
required
Unique identifier for the uploaded document
task_id
string
required
Ingestion task identifier — use this to poll for status
status
string
Initial status: always pending

Example

# 1. Get a signed upload URL
UPLOAD_URL=$(curl -s -X POST https://app.polyvia.ai/api/v1/ingest/upload-url \
  -H "Authorization: Bearer poly_<your-key>" | jq -r '.upload_url')

# 2. PUT the file bytes directly to storage (no auth header!)
STORAGE_ID=$(curl -s -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf | jq -r '.storageId')

# 3. Finalize — register the document and queue parsing
curl -X POST https://app.polyvia.ai/api/v1/ingest/finalize \
  -H "Authorization: Bearer poly_<your-key>" \
  -H "Content-Type: application/json" \
  -d "{\"storage_id\": \"$STORAGE_ID\", \"file_type\": \"application/pdf\", \"name\": \"Q4 Report\", \"group_id\": \"g_...\"}"

{
  "document_id": "k57abc123...",
  "task_id":     "3f2e1d0c-...",
  "status":      "pending"
}
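
For callers who prefer Python over cURL but are not using the SDK, the same three steps can be sketched with the standard library alone. This is a minimal sketch, not the SDK's implementation: `http_json`, `direct_upload`, and the injectable `send` parameter are illustrative names, the file is read into memory rather than streamed, and every response is assumed to be JSON in the shapes shown above.

```python
import json
import urllib.request

API_BASE = "https://app.polyvia.ai/api/v1"


def http_json(method, url, headers=None, body=None):
    """Send one HTTP request and decode the JSON response body."""
    request = urllib.request.Request(url, data=body, headers=headers or {}, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def direct_upload(api_key, path, file_type, name=None, group_id=None, send=http_json):
    """Run the three-step flow and return the finalize response as a dict."""
    auth = {"Authorization": f"Bearer {api_key}"}

    # Step 1: request a signed upload URL (no request body).
    upload_url = send("POST", f"{API_BASE}/ingest/upload-url", headers=auth)["upload_url"]

    # Step 2: PUT the raw bytes to storage, deliberately without the
    # Authorization header. Simplification: the whole file is read into memory.
    with open(path, "rb") as handle:
        stored = send("PUT", upload_url, headers={"Content-Type": file_type}, body=handle.read())

    # Step 3: finalize. Optional fields are sent only when provided.
    payload = {"storage_id": stored["storageId"], "file_type": file_type}
    if name is not None:
        payload["name"] = name
    if group_id is not None:
        payload["group_id"] = group_id
    return send(
        "POST",
        f"{API_BASE}/ingest/finalize",
        headers={**auth, "Content-Type": "application/json"},
        body=json.dumps(payload).encode("utf-8"),
    )
```

Calling `direct_upload("poly_<your-key>", "report.pdf", "application/pdf", name="Q4 Report")` would return the `{document_id, task_id, status}` dict. Passing a different `send` callable makes the flow easy to exercise without network access.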

Upload Multiple Documents

For batches, run the direct-upload flow once per file. The official SDKs do this in client.ingest.batch(...) — each file is uploaded and finalized independently, so a failure on one file is isolated to that entry instead of failing the whole batch.
from polyvia import Polyvia

client = Polyvia(api_key="poly_...")

batch = client.ingest.batch(
    ["q3.pdf", "q4.pdf"],
    names=["Q3 Report", "Q4 Report"],
    group_id="g_...",
)
for item in batch.results:
    if item.error:
        print(f"failed: {item.file}: {item.error}")
    else:
        client.ingest.wait(item.task_id)
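
If you are calling the REST API directly, the same per-file isolation can be approximated with a small loop. The sketch below is an assumption-laden illustration, not the SDK's code: `BatchItem` and `upload_each` are hypothetical names, and `upload_one` stands for any callable of your own that runs the direct-upload flow for one path and returns the finalize response.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BatchItem:
    """Outcome for one file: either a task_id or an error message."""
    file: str
    task_id: Optional[str] = None
    error: Optional[str] = None


def upload_each(paths, upload_one: Callable[[str], dict]):
    """Upload files one at a time, recording failures instead of raising."""
    results = []
    for path in paths:
        try:
            response = upload_one(path)
            results.append(BatchItem(file=path, task_id=response["task_id"]))
        except Exception as exc:
            # Isolate the failure to this entry; later files still run.
            results.append(BatchItem(file=path, error=str(exc)))
    return results
```

Catching the broad `Exception` is a deliberate choice here so that network errors and malformed responses are both reported per file rather than aborting the loop.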

Quick Multipart Upload (small files)

These multipart endpoints proxy file bytes through the API server, which enforces a 4.5 MB limit on the total request body. They exist as a one-call convenience for small uploads. The direct-upload flow above works for any file size and remains the preferred path (an SDK runs it for you).
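
A client that supports both paths might pick one based on file size. The following is a rough sketch under stated assumptions: the source does not say whether "4.5 MB" means decimal or binary megabytes, so the smaller decimal reading is used, and the headroom reserved for multipart boundaries and other form fields is a guess. `choose_upload_flow` is a hypothetical helper name.

```python
import os

# Assumption: "4.5 MB" is read conservatively as 4,500,000 bytes.
MULTIPART_BODY_LIMIT = 4_500_000
# Assumption: headroom for multipart boundaries and the name/group_id fields.
MULTIPART_OVERHEAD = 16 * 1024


def choose_upload_flow(size_bytes, limit=MULTIPART_BODY_LIMIT, overhead=MULTIPART_OVERHEAD):
    """Return "multipart" only when the file plus overhead fits under the limit."""
    return "multipart" if size_bytes + overhead <= limit else "direct"


def choose_upload_flow_for_file(path):
    """Same decision, based on the size of a file on disk."""
    return choose_upload_flow(os.path.getsize(path))
```

Since the direct-upload flow works for every size, always returning "direct" is an equally valid and simpler policy.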

POST /api/v1/ingest

multipart/form-data
file
file
required
The document to upload. See Supported File Formats below.
name
string
Display name in your workspace. Defaults to the filename.
group_id
string
Assign the document to a group on upload.
Returns the same {document_id, task_id, status} shape as /finalize.
curl -X POST https://app.polyvia.ai/api/v1/ingest \
  -H "Authorization: Bearer poly_<your-key>" \
  -F "file=@/path/to/report.pdf" \
  -F "name=Q4 2024 Report" \
  -F "group_id=g_..."

POST /api/v1/ingest/batch

multipart/form-data. Same fields as /ingest, except that file becomes files, repeated once per file, and name becomes names, a comma-separated string whose entries follow the order of files. Returns {results: [...], errors: [...] | null}.
curl -X POST https://app.polyvia.ai/api/v1/ingest/batch \
  -H "Authorization: Bearer poly_<your-key>" \
  -F "files=@q3.pdf" \
  -F "files=@q4.pdf" \
  -F "names=Q3 Report,Q4 Report" \
  -F "group_id=g_..."
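
Because names is a single comma-separated string, it helps to validate it before sending. The sketch below assumes the API offers no way to escape a comma inside a display name (the source does not describe one), so such names are rejected. `build_names_field` is a hypothetical helper, not part of any SDK.

```python
def build_names_field(names, file_count):
    """Join display names into the comma-separated `names` form field."""
    if len(names) != file_count:
        raise ValueError(f"expected {file_count} names, got {len(names)}")
    for name in names:
        # Assumption: no escaping exists, so a comma would shift the alignment.
        if "," in name:
            raise ValueError(f"display name contains a comma: {name!r}")
    return ",".join(names)
```

For example, `build_names_field(["Q3 Report", "Q4 Report"], 2)` yields the string used in the cURL call above.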

Check Ingestion Status

Poll a parse task started by either upload flow.

Endpoint

GET /api/v1/ingest/{task_id}

Path Parameters

task_id
string
required
The task identifier returned by /ingest/finalize (or the legacy multipart endpoints)

Response

task_id
string
Task identifier
document_id
string
Document identifier
status
string
Processing status (see table below)
error
string | null
Error message if status is failed, otherwise null
| status value | Meaning |
| --- | --- |
| pending | Queued, not yet started |
| parsing | Being parsed and indexed |
| completed | Ready to query |
| failed | Parsing failed; see error field |

Example

cURL
# Single status check
curl https://app.polyvia.ai/api/v1/ingest/3f2e1d0c-... \
  -H "Authorization: Bearer poly_<your-key>"

# Poll every 5 seconds until the task reaches a terminal status
while true; do
  STATUS=$(curl -s \
    -H "Authorization: Bearer poly_<your-key>" \
    "https://app.polyvia.ai/api/v1/ingest/3f2e1d0c-..." \
    | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then break; fi
  sleep 5
done

{
  "task_id":     "3f2e1d0c-...",
  "document_id": "k57abc123...",
  "status":      "completed",
  "error":       null
}
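
The shell loop above polls forever if a task never reaches a terminal status. A Python equivalent with a timeout might look like the sketch below. `wait_for_task` and `fetch_status` are hypothetical names: `fetch_status` stands for any callable of your own that performs GET /api/v1/ingest/{task_id} and returns the decoded JSON. The default interval and timeout are arbitrary choices, not documented values.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(fetch_status, task_id, interval=5.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task is completed or failed, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        if task["status"] in TERMINAL_STATUSES:
            # The caller inspects task["error"] when the status is "failed".
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {task['status']} after {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.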

Supported File Formats

| Category | Extensions |
| --- | --- |
| Documents | .pdf, .docx, .pptx |
| Text | .txt, .md (Markdown) |
| Images | .png, .jpg / .jpeg, .webp, .gif |
| Audio | .wav, .mp3, .m4a |
See Supported Formats for parser-by-parser details on what gets extracted from each file type.
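
A client-side check against the table above can catch unsupported files before any bytes are uploaded. This is a minimal sketch that matches on file extension only and treats extensions as case-insensitive. Both are assumptions, since the source does not say how the server identifies file types. `format_category` is a hypothetical helper name.

```python
import os

# Extensions copied from the Supported File Formats table.
SUPPORTED_EXTENSIONS = {
    "Documents": {".pdf", ".docx", ".pptx"},
    "Text": {".txt", ".md"},
    "Images": {".png", ".jpg", ".jpeg", ".webp", ".gif"},
    "Audio": {".wav", ".mp3", ".m4a"},
}


def format_category(filename):
    """Return the table category for a filename, or None if unsupported."""
    extension = os.path.splitext(filename)[1].lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if extension in extensions:
            return category
    return None
```

A `None` result here only means the extension is not in the table. The server remains the authority on what it accepts.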
Documents are typically processed within 1–2 minutes. Audio files take longer: transcription runs at roughly real-time speed, so expect processing time close to the length of the recording. Poll the status endpoint every few seconds to check progress.