# API Contract

Base URL prefix: `/api/v1`

Primary implementation modules:
- `backend/app/api/router.py`
- `backend/app/api/routes_auth.py`
- `backend/app/api/routes_health.py`
- `backend/app/api/routes_documents.py`
- `backend/app/api/routes_search.py`
- `backend/app/api/routes_processing_logs.py`
- `backend/app/api/routes_settings.py`

## Authentication And Authorization

- Authentication is cookie-based session auth with a server-issued hashed session token.
- Clients authenticate with `POST /auth/login` using username and password.
- Backend issues a server-stored session token and sets `HttpOnly` `dcm_session` and readable `dcm_csrf` cookies.
- Login brute-force protection enforces Redis-backed throttle checks keyed by username and source IP.
- State-changing requests from browser clients must send `x-csrf-token: ` in request headers (double-submit pattern).
- For non-browser API clients, the optional `Authorization: Bearer ` path remains supported when the token is sent explicitly.
- `GET /auth/me` returns current identity, role, and current CSRF token.
- `POST /auth/logout` revokes current session token.

Role matrix:
- `documents/*`: `admin` or `user`
- `search/*`: `admin` or `user`
- `settings/*`: `admin` only
- `processing/logs/*`: `admin` only

Ownership rules:
- `user` role is restricted to its own documents.
- `admin` role can access all documents.
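The role matrix and ownership rules above can be read as two small checks. The following is a minimal sketch of that reading, not the backend's actual implementation: the `Caller` type, its field names, and both function names are hypothetical, paths are taken relative to `/api/v1`, and route families outside the role matrix (such as `auth/*` and `health`) are simply denied here because the matrix does not cover them.

```python
from dataclasses import dataclass
from typing import Optional

# Route-family prefix -> roles allowed, copied from the role matrix above.
ROLE_MATRIX = {
    "documents": {"admin", "user"},
    "search": {"admin", "user"},
    "settings": {"admin"},
    "processing/logs": {"admin"},
}


@dataclass(frozen=True)
class Caller:
    """Hypothetical caller identity; the field names are assumptions."""
    user_id: str
    role: str


def role_allows(caller: Caller, path: str) -> bool:
    """Return True when the caller's role may use the route family of `path`."""
    trimmed = path.strip("/")
    for prefix, roles in ROLE_MATRIX.items():
        if trimmed == prefix or trimmed.startswith(prefix + "/"):
            return caller.role in roles
    # Route families not listed in the matrix are denied in this sketch.
    return False


def can_access_document(caller: Caller, owner_id: Optional[str]) -> bool:
    """Ownership rule: admins see every document, users only their own."""
    if caller.role == "admin":
        return True
    return caller.role == "user" and owner_id == caller.user_id
```

For example, `role_allows(Caller("u1", "user"), "/settings")` is `False`, while the same caller passes `can_access_document` only for documents whose owner id is `"u1"`.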
## Auth

- `POST /auth/login`
  - Body model: `AuthLoginRequest`
  - Response model: `AuthLoginResponse`
  - Additional responses:
    - `401` for invalid credentials
    - `429` for throttled login attempts, with stable message and `Retry-After` header
    - `503` when the login rate-limiter backend is unavailable
- `GET /auth/me`
  - Response model: `AuthSessionResponse`
- `POST /auth/logout`
  - Response model: `AuthLogoutResponse`

## Health

- `GET /health`
  - Purpose: liveness check
  - Response: `{ "status": "ok" }`

## Documents

### Collection and metadata helpers

- `GET /documents`
  - Query: `offset`, `limit`, `include_trashed`, `only_trashed`, `path_prefix`, `path_filter`, `tag_filter`, `type_filter`, `processed_from`, `processed_to`
  - Response model: `DocumentsListResponse`
- `GET /documents/tags`
  - Query: `include_trashed`
  - Response: `{ "tags": string[] }`
  - Behavior:
    - all document-assigned tags visible to caller scope are included
    - predefined tags are role-filtered: `admin` receives full catalog, `user` receives only entries with `global_shared=true`
- `GET /documents/paths`
  - Query: `include_trashed`
  - Response: `{ "paths": string[] }`
  - Behavior:
    - all document-assigned logical paths visible to caller scope are included
    - predefined paths are role-filtered: `admin` receives full catalog, `user` receives only entries with `global_shared=true`
- `GET /documents/types`
  - Query: `include_trashed`
  - Response: `{ "types": string[] }`
- `POST /documents/content-md/export`
  - Body model: `ContentExportRequest`
  - Response: ZIP stream containing one markdown file per matched document
  - Limits:
    - hard cap on matched document count (`CONTENT_EXPORT_MAX_DOCUMENTS`)
    - hard cap on cumulative markdown bytes (`CONTENT_EXPORT_MAX_TOTAL_BYTES`)
    - per-user rate limit (`CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE`)
  - Behavior: archive is streamed from spool file instead of unbounded in-memory buffer

### Per-document operations

- `GET /documents/{document_id}`
  - Response model: `DocumentDetailResponse`
- `GET /documents/{document_id}/download`
  - Response: original file bytes
- `GET /documents/{document_id}/preview`
  - Response: inline preview stream only for safe MIME types
  - Behavior: script-capable MIME types are forced to attachment responses with `X-Content-Type-Options: nosniff`
- `GET /documents/{document_id}/thumbnail`
  - Response: generated thumbnail image when available
- `GET /documents/{document_id}/content-md`
  - Response: extracted markdown content for one document
- `PATCH /documents/{document_id}`
  - Body model: `DocumentUpdateRequest`
  - Response model: `DocumentResponse`
- `POST /documents/{document_id}/trash`
  - Response model: `DocumentResponse`
- `POST /documents/{document_id}/restore`
  - Response model: `DocumentResponse`
- `DELETE /documents/{document_id}`
  - Behavior: permanent delete, requires document to be trashed first
  - Response: deletion counters
- `POST /documents/{document_id}/reprocess`
  - Response model: `DocumentResponse`
  - Behavior: requeues asynchronous processing task

### Upload

- `POST /documents/upload`
  - Multipart form fields:
    - `files[]` (required)
    - `relative_paths[]` (optional)
    - `logical_path` (optional, defaults to `Inbox`)
    - `tags` (optional CSV)
    - `conflict_mode` (`ask`, `replace`, `duplicate`)
  - Response model: `UploadResponse`
  - Behavior:
    - `ask`: returns `conflicts` if duplicate checksum is detected for caller-visible documents
    - `replace`: creates new document linked to replaced document id
    - `duplicate`: creates additional document record
    - upload `POST` request rejected with `411` when `Content-Length` is missing
    - `OPTIONS /documents/upload` CORS preflight bypasses upload `Content-Length` enforcement
    - request rejected with `413` when file count, per-file size, or total request size exceeds configured limits

## Search

- `GET /search`
  - Query: `query` (min length 2), `offset`, `limit`, `include_trashed`, `only_trashed`, `path_filter`, `tag_filter`, `type_filter`, `processed_from`, `processed_to`
  - Response model: `SearchResponse`
  - Behavior: PostgreSQL full-text and metadata ranking with role-based ownership scope

## Processing Logs

- Access: admin only
- `GET /processing/logs`
  - Query: `offset`, `limit`, `document_id`
  - Response model: `ProcessingLogListResponse`
  - `limit` is capped by runtime configuration
  - sensitive fields are redacted in API responses
- `POST /processing/logs/trim`
  - Query: optional `keep_document_sessions`, `keep_unbound_entries`
  - Behavior: omitted query values fall back to persisted `/settings.processing_log_retention`
  - query values are capped by runtime retention limits
  - Response: trim counters
- `POST /processing/logs/clear`
  - Response: clear counters

Persistence mode:
- default is metadata-only logging (`PROCESSING_LOG_STORE_MODEL_IO_TEXT=false`, `PROCESSING_LOG_STORE_PAYLOAD_TEXT=false`)
- full prompt/response or payload content storage requires explicit operator opt-in

## Settings

- Access: admin only
- `GET /settings`
  - Response model: `AppSettingsResponse`
  - persisted providers with invalid base URLs are ignored during read sanitization; response falls back to remaining valid providers or secure defaults
  - provider API keys are exposed only as `api_key_set` and `api_key_masked`
- `PATCH /settings`
  - Body model: `AppSettingsUpdateRequest`
  - Response model: `AppSettingsResponse`
  - rejects invalid provider base URLs with `400` when scheme, allowlist, or network safety checks fail
  - provider API keys are persisted encrypted at rest (`api_key_encrypted`) and plaintext keys are not written to storage
- `POST /settings/reset`
  - Response model: `AppSettingsResponse`
- `PATCH /settings/handwriting`
  - Body model: `HandwritingSettingsUpdateRequest`
  - Response model: `AppSettingsResponse`
- `GET /settings/handwriting`
  - Response model: `HandwritingSettingsResponse`

## Schema Families

Auth schemas in `backend/app/schemas/auth.py`:
- `AuthLoginRequest`
- `AuthUserResponse`
- `AuthSessionResponse`
- `AuthLoginResponse`
- `AuthLogoutResponse`

Document schemas in `backend/app/schemas/documents.py`:
- `DocumentResponse`
- `DocumentDetailResponse`
- `DocumentsListResponse`
- `UploadConflict`
- `UploadResponse`
- `DocumentUpdateRequest`
- `SearchResponse`
- `ContentExportRequest`

Processing log schemas in `backend/app/schemas/processing_logs.py`:
- `ProcessingLogEntryResponse`
- `ProcessingLogListResponse`

Settings schemas in `backend/app/schemas/settings.py`:
- provider, task, upload-default, display, processing-log retention, predefined paths or tags, handwriting-style, and legacy handwriting models grouped under `AppSettingsResponse` and `AppSettingsUpdateRequest`.
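
As a client-side illustration of the request conventions in this contract, the sketch below builds URLs under the `/api/v1` prefix, attaches the CSRF header for state-changing methods, and enforces the search query's minimum length before a request is sent. It is a hedged sketch rather than an official client: the helper names are hypothetical, the default `limit` of 20 is an arbitrary choice, and it assumes the double-submit pattern means echoing the readable `dcm_csrf` cookie value in `x-csrf-token`. Nothing here performs network I/O.

```python
from typing import Dict, Mapping
from urllib.parse import urlencode

BASE_PREFIX = "/api/v1"
# Methods the contract uses for state-changing routes.
STATE_CHANGING = {"POST", "PATCH", "DELETE"}


def build_url(path: str, **query: object) -> str:
    """Join the base prefix with a route path and any non-None query values."""
    url = BASE_PREFIX + "/" + path.strip("/")
    params = {key: value for key, value in query.items() if value is not None}
    return url + ("?" + urlencode(params) if params else "")


def csrf_headers(method: str, cookies: Mapping[str, str]) -> Dict[str, str]:
    """Return the CSRF header for state-changing methods, else no headers.

    Assumption: the header value mirrors the `dcm_csrf` cookie.
    """
    if method.upper() not in STATE_CHANGING:
        return {}
    token = cookies.get("dcm_csrf")
    if not token:
        raise ValueError("missing dcm_csrf cookie; log in first")
    return {"x-csrf-token": token}


def search_url(query: str, offset: int = 0, limit: int = 20) -> str:
    """Build a `GET /search` URL, rejecting queries shorter than 2 characters."""
    if len(query) < 2:
        raise ValueError("query must be at least 2 characters long")
    return build_url("search", query=query, offset=offset, limit=limit)
```

A caller would pass the result of `csrf_headers("POST", cookies)` alongside its session cookies when invoking routes such as `POST /documents/{document_id}/trash`; read-only `GET` routes need no extra header in this sketch.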