
# API Contract

Base URL prefix: `/api/v1`

Primary implementation modules:
- `backend/app/api/router.py`
- `backend/app/api/routes_auth.py`
- `backend/app/api/routes_health.py`
- `backend/app/api/routes_documents.py`
- `backend/app/api/routes_search.py`
- `backend/app/api/routes_processing_logs.py`
- `backend/app/api/routes_settings.py`

## Authentication And Authorization

- Authentication is cookie-based session auth with a server-issued hashed session token.
- Clients authenticate with `POST /auth/login` using username and password.
- The backend issues a server-stored session token and sets two cookies: `dcm_session` (`HttpOnly`) and `dcm_csrf` (readable by client-side script).
- Login brute-force protection uses Redis-backed throttling keyed by username and source IP.
- State-changing requests from browser clients must send the `dcm_csrf` cookie value in an `x-csrf-token` request header (double-submit pattern).
- For non-browser API clients, the optional `Authorization: Bearer <token>` path remains supported when the token is sent explicitly.
- `GET /auth/me` returns current identity and role.
- `POST /auth/logout` revokes current session token.
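
The double-submit rule above can be sketched from the client side as follows. This is a minimal illustration, not the project's client code: the helper name `csrf_headers` and the set of verbs treated as state-changing are assumptions; only the cookie and header names come from this contract.

```python
from http.cookies import SimpleCookie
from typing import Dict

# Assumption: these verbs are the ones treated as state-changing.
STATE_CHANGING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def csrf_headers(method: str, cookie_header: str) -> Dict[str, str]:
    """Return the extra headers a browser-style client would attach.

    `cookie_header` is the raw `Cookie` header value the client holds
    after a successful `POST /auth/login`.
    """
    if method.upper() not in STATE_CHANGING_METHODS:
        return {}
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("dcm_csrf")
    if morsel is None:
        raise ValueError("dcm_csrf cookie missing; log in first")
    # Echo the readable cookie value back in the request header.
    return {"x-csrf-token": morsel.value}
```

For example, `csrf_headers("PATCH", "dcm_session=abc; dcm_csrf=xyz")` yields `{"x-csrf-token": "xyz"}`, while a `GET` yields no extra headers.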

Role matrix:
- `documents/*`: `admin` or `user`
- `search/*`: `admin` or `user`
- `settings/*`: `admin` only
- `processing/logs/*`: `admin` only

Ownership rules:
- `user` role is restricted to its own documents.
- `admin` role can access all documents.
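
The role matrix and ownership rules can be expressed as two small predicates. The sketch below is illustrative only: `ROLE_MATRIX`, `role_allows`, and `can_access_document` are hypothetical names, and denying route groups that the matrix does not list is an assumption of this sketch rather than documented backend behavior.

```python
from typing import Dict, Set

# Route-group prefixes (relative to /api/v1) mapped to permitted roles.
ROLE_MATRIX: Dict[str, Set[str]] = {
    "documents": {"admin", "user"},
    "search": {"admin", "user"},
    "settings": {"admin"},
    "processing/logs": {"admin"},
}


def role_allows(role: str, path: str) -> bool:
    """Check a path such as 'processing/logs/trim' against the matrix."""
    trimmed = path.strip("/")
    for prefix, roles in ROLE_MATRIX.items():
        if trimmed == prefix or trimmed.startswith(prefix + "/"):
            return role in roles
    # Assumption: groups outside the matrix are denied in this sketch.
    return False


def can_access_document(role: str, caller_id: str, owner_id: str) -> bool:
    """Admins see every document; users see only their own."""
    if role == "admin":
        return True
    return role == "user" and caller_id == owner_id
```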

## Auth

- `POST /auth/login`
  - Body model: `AuthLoginRequest`
  - Response model: `AuthLoginResponse`
  - Additional responses:
    - `401` for invalid credentials
    - `429` for throttled login attempts, with a stable message and a `Retry-After` header
    - `503` when the login rate-limiter backend is unavailable
- `GET /auth/me`
  - Response model: `AuthSessionResponse`
- `POST /auth/logout`
  - Response model: `AuthLogoutResponse`
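
A client might map the documented login statuses to actions as sketched below. The function name and the returned labels are hypothetical, treating any `2xx` as success is an assumption, and only the delta-seconds form of `Retry-After` is handled here.

```python
from typing import Mapping, Optional, Tuple


def interpret_login_response(
    status: int, headers: Mapping[str, str]
) -> Tuple[str, Optional[int]]:
    """Return (outcome label, seconds to wait before retrying or None)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    if 200 <= status < 300:  # assumption: any 2xx means success
        return ("ok", None)
    if status == 401:
        return ("invalid_credentials", None)
    if status == 429:
        raw = lowered.get("retry-after", "").strip()
        # Only delta-seconds is parsed; an HTTP-date value yields None.
        delay = int(raw) if raw.isdigit() else None
        return ("throttled", delay)
    if status == 503:
        return ("limiter_unavailable", None)
    return ("unexpected", None)
```

Keeping `429` and `503` distinct lets a client back off for the advertised interval in the first case and surface a service problem in the second.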

## Health

- `GET /health`
- Purpose: liveness check
- Response: `{ "status": "ok" }`

## Documents

### Collection and metadata helpers

- `GET /documents`
  - Query: `offset`, `limit`, `include_trashed`, `only_trashed`, `path_prefix`, `path_filter`, `tag_filter`, `type_filter`, `processed_from`, `processed_to`
  - Response model: `DocumentsListResponse`
- `GET /documents/tags`
  - Query: `include_trashed`
  - Response: `{ "tags": string[] }`
  - Behavior:
    - all document-assigned tags visible to caller scope are included
    - predefined tags are role-filtered: `admin` receives full catalog, `user` receives only entries with `global_shared=true`
- `GET /documents/paths`
  - Query: `include_trashed`
  - Response: `{ "paths": string[] }`
  - Behavior:
    - all document-assigned logical paths visible to caller scope are included
    - predefined paths are role-filtered: `admin` receives full catalog, `user` receives only entries with `global_shared=true`
- `GET /documents/types`
  - Query: `include_trashed`
  - Response: `{ "types": string[] }`
- `POST /documents/content-md/export`
  - Body model: `ContentExportRequest`
  - Response: ZIP stream containing one markdown file per matched document
  - Limits:
    - hard cap on matched document count (`CONTENT_EXPORT_MAX_DOCUMENTS`)
    - hard cap on cumulative markdown bytes (`CONTENT_EXPORT_MAX_TOTAL_BYTES`)
    - per-user rate limit (`CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE`)
  - Behavior: archive is streamed from spool file instead of unbounded in-memory buffer
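
The spool-based export can be sketched with the standard library as shown below. This is not the backend's implementation: the function names, the exception type, and the numeric defaults are placeholders, and the per-user rate limit is omitted. The real caps come from `CONTENT_EXPORT_MAX_DOCUMENTS` and `CONTENT_EXPORT_MAX_TOTAL_BYTES`.

```python
import tempfile
import zipfile
from typing import IO, Iterable, Iterator, Tuple

# Placeholder values standing in for the configured limits.
MAX_DOCUMENTS = 1000
MAX_TOTAL_BYTES = 50 * 1024 * 1024
SPOOL_MEMORY_BYTES = 1024 * 1024  # beyond this the spool rolls over to disk


class ExportLimitExceeded(Exception):
    """Raised when an export would exceed a hard cap."""


def build_export_spool(
    documents: Iterable[Tuple[str, str]],
    max_documents: int = MAX_DOCUMENTS,
    max_total_bytes: int = MAX_TOTAL_BYTES,
) -> IO[bytes]:
    """Write (filename, markdown) pairs into a ZIP held in a spooled file."""
    spool = tempfile.SpooledTemporaryFile(max_size=SPOOL_MEMORY_BYTES)
    total = 0
    try:
        with zipfile.ZipFile(spool, "w", zipfile.ZIP_DEFLATED) as archive:
            for count, (name, markdown) in enumerate(documents, start=1):
                if count > max_documents:
                    raise ExportLimitExceeded("too many documents")
                data = markdown.encode("utf-8")
                total += len(data)
                if total > max_total_bytes:
                    raise ExportLimitExceeded("markdown byte cap exceeded")
                archive.writestr(name, data)
    except Exception:
        spool.close()
        raise
    spool.seek(0)
    return spool


def iter_chunks(spool: IO[bytes], chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the archive in fixed-size chunks and close the spool afterwards."""
    with spool:
        while True:
            chunk = spool.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Memory use stays bounded by `SPOOL_MEMORY_BYTES` plus one chunk, regardless of archive size, because larger archives spill to a temporary file on disk.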

### Per-document operations

- `GET /documents/{document_id}`
  - Response model: `DocumentDetailResponse`
- `GET /documents/{document_id}/download`
  - Response: original file bytes
- `GET /documents/{document_id}/preview`
  - Response: inline preview stream only for safe MIME types
  - Behavior: script-capable MIME types are forced to attachment responses with `X-Content-Type-Options: nosniff`
- `GET /documents/{document_id}/thumbnail`
  - Response: generated thumbnail image when available
- `GET /documents/{document_id}/content-md`
  - Response: extracted markdown content for one document
- `PATCH /documents/{document_id}`
  - Body model: `DocumentUpdateRequest`
  - Response model: `DocumentResponse`
- `POST /documents/{document_id}/trash`
  - Response model: `DocumentResponse`
- `POST /documents/{document_id}/restore`
  - Response model: `DocumentResponse`
- `DELETE /documents/{document_id}`
  - Behavior: permanent delete, requires document to be trashed first
  - Response: deletion counters
- `POST /documents/{document_id}/reprocess`
  - Response model: `DocumentResponse`
  - Behavior: requeues asynchronous processing task
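
The preview rule for `GET /documents/{document_id}/preview` can be illustrated with a header-selection helper. The MIME set below is an illustrative assumption, not the backend's actual list, and `preview_headers` is a hypothetical name.

```python
from typing import Dict

# Assumption: an example set of script-capable types; the real list may differ.
SCRIPT_CAPABLE_MIME_TYPES = {
    "text/html",
    "application/xhtml+xml",
    "image/svg+xml",
    "application/javascript",
    "text/javascript",
}


def preview_headers(mime_type: str) -> Dict[str, str]:
    """Choose response headers for a preview of the given MIME type."""
    # Drop parameters such as '; charset=utf-8' before comparing.
    normalized = mime_type.split(";", 1)[0].strip().lower()
    if normalized in SCRIPT_CAPABLE_MIME_TYPES:
        return {
            "Content-Disposition": "attachment",
            "X-Content-Type-Options": "nosniff",
        }
    return {"Content-Disposition": "inline"}
```

Forcing a download for these types keeps uploaded markup or scripts from executing in the application's origin.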

### Upload

- `POST /documents/upload`
- Multipart form fields:
  - `files[]` (required)
  - `relative_paths[]` (optional)
  - `logical_path` (optional, defaults to `Inbox`)
  - `tags` (optional CSV)
  - `conflict_mode` (`ask`, `replace`, `duplicate`)
- Response model: `UploadResponse`
- Behavior:
  - `ask`: returns `conflicts` if duplicate checksum is detected for caller-visible documents
  - `replace`: creates new document linked to replaced document id
  - `duplicate`: creates additional document record
  - upload `POST` request rejected with `411` when `Content-Length` is missing
  - `OPTIONS /documents/upload` CORS preflight bypasses upload `Content-Length` enforcement
  - request rejected with `413` when file count, per-file size, or total request size exceeds configured limits
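
The `411` and `413` rules above can be sketched as a single pre-check. `UploadLimits` and `upload_rejection_status` are hypothetical names, the limit fields stand in for the configured values, and the order of the checks is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class UploadLimits:
    max_files: int
    max_file_bytes: int
    max_request_bytes: int


def upload_rejection_status(
    method: str,
    content_length: Optional[int],
    file_sizes: Sequence[int],
    limits: UploadLimits,
) -> Optional[int]:
    """Return the HTTP status to reject with, or None to accept."""
    if method.upper() == "OPTIONS":
        return None  # CORS preflight bypasses Content-Length enforcement
    if content_length is None:
        return 411  # Content-Length is required for upload POSTs
    if content_length > limits.max_request_bytes:
        return 413
    if len(file_sizes) > limits.max_files:
        return 413
    if any(size > limits.max_file_bytes for size in file_sizes):
        return 413
    return None
```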

## Search

- `GET /search`
- Query: `query` (min length 2), `offset`, `limit`, `include_trashed`, `only_trashed`, `path_filter`, `tag_filter`, `type_filter`, `processed_from`, `processed_to`
- Response model: `SearchResponse`
- Behavior: PostgreSQL full-text and metadata ranking with role-based ownership scope
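
A client-side helper for building the search URL might look like the sketch below. The function name, the default `limit` of 25, and the lowercase encoding of booleans are assumptions; the parameter names and the minimum query length come from the list above.

```python
from urllib.parse import urlencode

ALLOWED_FILTERS = {
    "include_trashed", "only_trashed", "path_filter", "tag_filter",
    "type_filter", "processed_from", "processed_to",
}


def build_search_url(query: str, offset: int = 0, limit: int = 25, **filters) -> str:
    """Build a `GET /search` URL, rejecting queries the API would refuse."""
    if len(query) < 2:
        raise ValueError("query must be at least 2 characters")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError("unsupported filters: %s" % sorted(unknown))
    params = {"query": query, "offset": offset, "limit": limit}
    for key, value in filters.items():
        if value is None:
            continue
        # Assumption: booleans are sent as lowercase 'true'/'false'.
        params[key] = str(value).lower() if isinstance(value, bool) else value
    return "/api/v1/search?" + urlencode(params)
```

For example, `build_search_url("invoice", tag_filter="tax")` produces `/api/v1/search?query=invoice&offset=0&limit=25&tag_filter=tax`.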

## Processing Logs

- Access: admin only

- `GET /processing/logs`
  - Query: `offset`, `limit`, `document_id`
  - Response model: `ProcessingLogListResponse`
  - `limit` is capped by runtime configuration
  - sensitive fields are redacted in API responses
- `POST /processing/logs/trim`
  - Query: optional `keep_document_sessions`, `keep_unbound_entries`
  - Behavior: omitted query values fall back to persisted `/settings.processing_log_retention`
  - query values are capped by runtime retention limits
  - Response: trim counters
- `POST /processing/logs/clear`
  - Response: clear counters
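
The fallback and capping rules for `POST /processing/logs/trim` can be sketched per parameter. `resolve_trim_value` is a hypothetical name; applying the cap only to explicit query values follows the wording above, and whether persisted values are capped as well is not specified here.

```python
from typing import Optional


def resolve_trim_value(
    query_value: Optional[int], persisted_value: int, runtime_cap: int
) -> int:
    """Pick the effective retention count for one trim parameter."""
    if query_value is None:
        # Omitted in the request: use the persisted retention setting.
        return persisted_value
    # Explicit query values are capped by the runtime retention limit.
    return min(query_value, runtime_cap)
```

The same function would be applied once for `keep_document_sessions` and once for `keep_unbound_entries`.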

Persistence mode:
- default is metadata-only logging (`PROCESSING_LOG_STORE_MODEL_IO_TEXT=false`, `PROCESSING_LOG_STORE_PAYLOAD_TEXT=false`)
- full prompt/response or payload content storage requires explicit operator opt-in

## Settings

- Access: admin only

- `GET /settings`
  - Response model: `AppSettingsResponse`
  - persisted providers with invalid base URLs are ignored during read sanitization; response falls back to remaining valid providers or secure defaults
  - provider API keys are exposed only as `api_key_set` and `api_key_masked`
- `PATCH /settings`
  - Body model: `AppSettingsUpdateRequest`
  - Response model: `AppSettingsResponse`
  - rejects invalid provider base URLs with `400` when scheme, allowlist, or network safety checks fail
  - provider API keys are persisted encrypted at rest (`api_key_encrypted`) and plaintext keys are not written to storage
- `POST /settings/reset`
  - Response model: `AppSettingsResponse`
- `PATCH /settings/handwriting`
  - Body model: `HandwritingSettingsUpdateRequest`
  - Response model: `AppSettingsResponse`
- `GET /settings/handwriting`
  - Response model: `HandwritingSettingsResponse`
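
The base URL checks behind the `400` response on `PATCH /settings` can be illustrated as follows. This is a sketch under stated assumptions: the allowed scheme set, the example allowlist host, and the function name are hypothetical, and the network check only inspects literal IP addresses without resolving DNS.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}          # assumption
ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist entry


def base_url_problem(url: str) -> Optional[str]:
    """Return the name of the failed check, or None when the URL passes."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return "malformed"
    if parts.scheme not in ALLOWED_SCHEMES:
        return "scheme"
    if not host:
        return "malformed"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # host is a DNS name, not an IP literal
    if address is not None and (
        address.is_private or address.is_loopback or address.is_link_local
    ):
        return "network"
    if host not in ALLOWED_HOSTS:
        return "allowlist"
    return None
```

A fuller network safety check would also resolve the hostname and test each resulting address, since a DNS name can point at an internal host.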

## Schema Families

Auth schemas in `backend/app/schemas/auth.py`:
- `AuthLoginRequest`
- `AuthUserResponse`
- `AuthSessionResponse`
- `AuthLoginResponse`
- `AuthLogoutResponse`

Document schemas in `backend/app/schemas/documents.py`:
- `DocumentResponse`
- `DocumentDetailResponse`
- `DocumentsListResponse`
- `UploadConflict`
- `UploadResponse`
- `DocumentUpdateRequest`
- `SearchResponse`
- `ContentExportRequest`

Processing log schemas in `backend/app/schemas/processing_logs.py`:
- `ProcessingLogEntryResponse`
- `ProcessingLogListResponse`

Settings schemas in `backend/app/schemas/settings.py`:
- Provider, task, upload-default, display, processing-log retention, predefined path and tag, handwriting-style, and legacy handwriting models, all grouped under `AppSettingsResponse` and `AppSettingsUpdateRequest`.