Nautilus/ledgerdock

Beda Schmid 26eae1a09b

Fix auth session persistence with HttpOnly cookies and CSRF

2026-03-01 21:39:22 -03:00

8.0 KiB

API Contract

Base URL prefix: /api/v1

Primary implementation modules:

backend/app/api/router.py
backend/app/api/routes_auth.py
backend/app/api/routes_health.py
backend/app/api/routes_documents.py
backend/app/api/routes_search.py
backend/app/api/routes_processing_logs.py
backend/app/api/routes_settings.py

Authentication And Authorization

Authentication is cookie-based session auth with a server-issued hashed session token.
Clients authenticate with POST /auth/login using username and password.
The backend issues a server-stored session token and sets two cookies: an HttpOnly dcm_session cookie and a script-readable dcm_csrf cookie.
Login brute-force protection enforces Redis-backed throttle checks keyed by username and source IP.
State-changing requests from browser clients must send x-csrf-token: <dcm_csrf> in request headers (double-submit pattern).
For non-browser API clients, the optional Authorization: Bearer <token> path remains supported when the token is sent explicitly.
GET /auth/me returns current identity and role.
POST /auth/logout revokes current session token.
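
The double-submit rule above can be sketched from the client side. This is a minimal illustration, not the project's client code: the helper name csrf_headers and the choice of which HTTP methods count as state-changing are assumptions; only the cookie name dcm_csrf and the header name x-csrf-token come from this contract.

```python
from typing import Mapping

# Assumption: these methods are treated as non-state-changing.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def csrf_headers(method: str, cookies: Mapping[str, str]) -> dict[str, str]:
    """Return the extra headers a browser-style client should attach."""
    if method.upper() in SAFE_METHODS:
        return {}
    token = cookies.get("dcm_csrf")
    if not token:
        raise ValueError("dcm_csrf cookie missing; log in first")
    # Double-submit: the header repeats the value of the readable cookie.
    return {"x-csrf-token": token}
```

The HttpOnly dcm_session cookie is never read by this helper; the browser or HTTP client sends it automatically, and only the readable dcm_csrf value is copied into the header.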

Role matrix:

documents/*: admin or user
search/*: admin or user
settings/*: admin only
processing/logs/*: admin only

Ownership rules:

user role is restricted to its own documents.
admin role can access all documents.
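
The role matrix and ownership rules can be expressed as two small checks. This is a sketch under stated assumptions: the function names, the string identifiers for callers and owners, and the deny-by-default behavior for prefixes outside the matrix are hypothetical, and the helper covers only the four areas listed in the matrix.

```python
# Prefix -> roles allowed, copied from the role matrix above.
ROLE_MATRIX = {
    "documents": {"admin", "user"},
    "search": {"admin", "user"},
    "settings": {"admin"},
    "processing/logs": {"admin"},
}


def role_allowed(role: str, path: str) -> bool:
    """Check a path (relative to /api/v1) against the role matrix."""
    trimmed = path.strip("/")
    for prefix, roles in ROLE_MATRIX.items():
        if trimmed == prefix or trimmed.startswith(prefix + "/"):
            return role in roles
    return False  # assumption: prefixes outside the matrix are denied


def can_access_document(role: str, caller_id: str, owner_id: str) -> bool:
    """Admins may access every document; users only their own."""
    return role == "admin" or (role == "user" and caller_id == owner_id)
```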

Auth

POST /auth/login
- Body model: AuthLoginRequest
- Response model: AuthLoginResponse
- Additional responses:
  - 401 for invalid credentials
  - 429 for throttled login attempts, with stable message and Retry-After header
  - 503 when the login rate-limiter backend is unavailable
GET /auth/me
- Response model: AuthSessionResponse
POST /auth/logout
- Response model: AuthLogoutResponse
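
A client that respects the 429 response might derive its wait time from Retry-After as follows. This is a sketch only: the helper name login_retry_delay is hypothetical, and this contract does not say whether the server sends the delta-seconds or the HTTP-date form, so both forms permitted by the HTTP specification are handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def login_retry_delay(status: int, headers: Mapping[str, str],
                      now: Optional[datetime] = None) -> Optional[float]:
    """Seconds to wait before retrying login, or None if no delay is given."""
    if status != 429:
        return None
    # Header names are case-insensitive, so match on the lowered name.
    raw = next((v for k, v in headers.items() if k.lower() == "retry-after"), "")
    raw = raw.strip()
    if not raw:
        return None
    if raw.isdigit():
        return float(raw)  # delta-seconds form
    try:
        when = parsedate_to_datetime(raw)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A 503 returns None here because the contract does not describe a Retry-After header for that case; callers would need their own backoff policy.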

Health

GET /health
Purpose: liveness check
Response: { "status": "ok" }

Documents

Collection and metadata helpers

GET /documents
- Query: offset, limit, include_trashed, only_trashed, path_prefix, path_filter, tag_filter, type_filter, processed_from, processed_to
- Response model: DocumentsListResponse
GET /documents/tags
- Query: include_trashed
- Response: { "tags": string[] }
- Behavior:
  - all document-assigned tags visible to caller scope are included
  - predefined tags are role-filtered: admin receives full catalog, user receives only entries with global_shared=true
GET /documents/paths
- Query: include_trashed
- Response: { "paths": string[] }
- Behavior:
  - all document-assigned logical paths visible to caller scope are included
  - predefined paths are role-filtered: admin receives full catalog, user receives only entries with global_shared=true
GET /documents/types
- Query: include_trashed
- Response: { "types": string[] }
POST /documents/content-md/export
- Body model: ContentExportRequest
- Response: ZIP stream containing one markdown file per matched document
- Limits:
  - hard cap on matched document count (CONTENT_EXPORT_MAX_DOCUMENTS)
  - hard cap on cumulative markdown bytes (CONTENT_EXPORT_MAX_TOTAL_BYTES)
  - per-user rate limit (CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE)
- Behavior: the archive is streamed from a spool file rather than built in an unbounded in-memory buffer
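
The export limits and spool-file behavior can be illustrated with the standard library. This is a minimal sketch, not the backend's implementation: the cap values, the exception class, and both function names are hypothetical, and the per-user rate limit is not shown.

```python
import tempfile
import zipfile
from typing import Iterable, Tuple

# Hypothetical values; the real caps come from the CONTENT_EXPORT_* settings.
MAX_DOCUMENTS = 3
MAX_TOTAL_BYTES = 1024


class ExportLimitExceeded(Exception):
    """Raised when an export would exceed a configured cap."""


def build_export_spool(docs: Iterable[Tuple[str, str]]):
    """Write (filename, markdown) pairs into a ZIP held in a spooled temp file."""
    # Data stays in memory up to max_size, then rolls over to disk.
    spool = tempfile.SpooledTemporaryFile(max_size=64 * 1024)
    count = total = 0
    try:
        with zipfile.ZipFile(spool, "w", zipfile.ZIP_DEFLATED) as archive:
            for name, markdown in docs:
                payload = markdown.encode("utf-8")
                count += 1
                total += len(payload)
                if count > MAX_DOCUMENTS or total > MAX_TOTAL_BYTES:
                    raise ExportLimitExceeded("export exceeds configured caps")
                archive.writestr(name, payload)
    except Exception:
        spool.close()
        raise
    spool.seek(0)
    return spool


def iter_chunks(spool, chunk_size: int = 8192):
    """Yield the archive in fixed-size chunks, then release the spool."""
    try:
        while chunk := spool.read(chunk_size):
            yield chunk
    finally:
        spool.close()
```

Checking the caps while writing means an oversized export fails before any bytes are sent, and reading the spool in chunks keeps memory use bounded regardless of archive size.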

Per-document operations

GET /documents/{document_id}
- Response model: DocumentDetailResponse
GET /documents/{document_id}/download
- Response: original file bytes
GET /documents/{document_id}/preview
- Response: inline preview stream only for safe MIME types
- Behavior: script-capable MIME types are forced to attachment responses with X-Content-Type-Options: nosniff
GET /documents/{document_id}/thumbnail
- Response: generated thumbnail image when available
GET /documents/{document_id}/content-md
- Response: extracted markdown content for one document
PATCH /documents/{document_id}
- Body model: DocumentUpdateRequest
- Response model: DocumentResponse
POST /documents/{document_id}/trash
- Response model: DocumentResponse
POST /documents/{document_id}/restore
- Response model: DocumentResponse
DELETE /documents/{document_id}
- Behavior: permanent delete, requires document to be trashed first
- Response: deletion counters
POST /documents/{document_id}/reprocess
- Response model: DocumentResponse
- Behavior: requeues asynchronous processing task
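
The preview rule for GET /documents/{document_id}/preview can be sketched as a header decision. The allowlist below is purely illustrative, since this contract does not enumerate the safe MIME types, and treating every type outside the allowlist as script-capable is an assumption of this sketch.

```python
# Assumption: an illustrative allowlist; the real set of safe types is not
# given in this document.
INLINE_SAFE_MIME_TYPES = {"application/pdf", "image/png", "image/jpeg", "text/plain"}


def preview_headers(mime_type: str) -> dict[str, str]:
    """Choose response headers for a document preview."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    base = mime_type.split(";", 1)[0].strip().lower()
    if base in INLINE_SAFE_MIME_TYPES:
        return {"Content-Type": base, "Content-Disposition": "inline"}
    # Types such as text/html or image/svg+xml could run script in the
    # browser, so they are downloaded rather than rendered.
    return {
        "Content-Type": base,
        "Content-Disposition": "attachment",
        "X-Content-Type-Options": "nosniff",
    }
```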

Upload

POST /documents/upload
Multipart form fields:
- files[] (required)
- relative_paths[] (optional)
- logical_path (optional, defaults to Inbox)
- tags (optional CSV)
- conflict_mode (ask, replace, duplicate)
Response model: UploadResponse
Behavior:
- ask: returns conflicts if duplicate checksum is detected for caller-visible documents
- replace: creates new document linked to replaced document id
- duplicate: creates additional document record
- upload POST request rejected with 411 when Content-Length is missing
- OPTIONS /documents/upload CORS preflight bypasses upload Content-Length enforcement
- request rejected with 413 when file count, per-file size, or total request size exceeds configured limits
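
The 411 and 413 rules for uploads can be sketched as a single pre-check. The limit values and the function name are hypothetical; the actual limits are configured on the server and are not stated in this contract.

```python
from typing import Optional, Sequence

# Hypothetical limits; the real values come from server configuration.
MAX_FILES = 50
MAX_FILE_BYTES = 25 * 1024 * 1024
MAX_REQUEST_BYTES = 100 * 1024 * 1024


def upload_rejection_status(method: str, content_length: Optional[int],
                            file_sizes: Sequence[int]) -> Optional[int]:
    """Return 411 or 413 if the upload should be rejected, else None."""
    if method.upper() == "OPTIONS":
        return None  # CORS preflight is exempt from Content-Length checks
    if content_length is None:
        return 411  # Length Required
    if content_length > MAX_REQUEST_BYTES:
        return 413  # total request size too large
    if len(file_sizes) > MAX_FILES:
        return 413  # too many files
    if any(size > MAX_FILE_BYTES for size in file_sizes):
        return 413  # at least one file too large
    return None
```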

Search

GET /search
Query: query (min length 2), offset, limit, include_trashed, only_trashed, path_filter, tag_filter, type_filter, processed_from, processed_to
Response model: SearchResponse
Behavior: PostgreSQL full-text and metadata ranking with role-based ownership scope
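
A client might assemble a search request as follows. This is a sketch under stated assumptions: the helper name search_url, the default limit of 20, and stripping whitespace before applying the two-character minimum are not specified by this contract; the parameter names are taken from the query list above.

```python
from urllib.parse import urlencode

BASE = "/api/v1"

# Filter names listed for GET /search.
ALLOWED_FILTERS = {"include_trashed", "only_trashed", "path_filter", "tag_filter",
                   "type_filter", "processed_from", "processed_to"}


def search_url(query: str, offset: int = 0, limit: int = 20, **filters: str) -> str:
    """Build a relative URL for GET /search, validating inputs client-side."""
    text = query.strip()
    if len(text) < 2:
        raise ValueError("query must be at least 2 characters")
    params = {"query": text, "offset": offset, "limit": limit}
    for name, value in filters.items():
        if name not in ALLOWED_FILTERS:
            raise ValueError(f"unknown filter: {name}")
        params[name] = value
    return f"{BASE}/search?{urlencode(params)}"
```

The server remains the authority on validation; the client-side check only avoids a round trip for queries that would be rejected anyway.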

Processing Logs

Access: admin only
GET /processing/logs
- Query: offset, limit, document_id
- Response model: ProcessingLogListResponse
- limit is capped by runtime configuration
- sensitive fields are redacted in API responses
POST /processing/logs/trim
- Query: optional keep_document_sessions, keep_unbound_entries
- Behavior: omitted query values fall back to the processing_log_retention values persisted via /settings
- supplied query values are capped by runtime retention limits
- Response: trim counters
POST /processing/logs/clear
- Response: clear counters
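
The fallback and capping behavior of POST /processing/logs/trim can be sketched per parameter. The function name is hypothetical, and applying the cap to the persisted fallback as well as to query values, and clamping negative input to zero, are assumptions this contract does not confirm.

```python
from typing import Optional


def resolve_trim_value(query_value: Optional[int], persisted: int, cap: int) -> int:
    """Pick the effective retention count for one trim parameter."""
    # An omitted query value falls back to the persisted setting.
    value = persisted if query_value is None else query_value
    # Cap at the runtime limit; negative input is treated as zero (assumption).
    return max(0, min(value, cap))
```

The same helper would be applied once for keep_document_sessions and once for keep_unbound_entries.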

Persistence mode:

default is metadata-only logging (PROCESSING_LOG_STORE_MODEL_IO_TEXT=false, PROCESSING_LOG_STORE_PAYLOAD_TEXT=false)
full prompt/response or payload content storage requires explicit operator opt-in
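
The opt-in default can be illustrated with a strict flag reader. How the backend actually parses these variables is not documented here, so the accepted truthy strings and the helper name are assumptions; only the two variable names and their false defaults come from this contract.

```python
import os
from typing import Mapping


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Treat a flag as on only when explicitly set to a truthy string."""
    return env.get(name, "false").strip().lower() in {"1", "true", "yes", "on"}


def log_persistence_mode(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Report which kinds of content the operator has opted in to storing."""
    return {
        "model_io_text": flag_enabled("PROCESSING_LOG_STORE_MODEL_IO_TEXT", env),
        "payload_text": flag_enabled("PROCESSING_LOG_STORE_PAYLOAD_TEXT", env),
    }
```

With neither variable set, both entries are False, which corresponds to metadata-only logging.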

Settings

Access: admin only
GET /settings
- Response model: AppSettingsResponse
- persisted providers with invalid base URLs are ignored during read sanitization; response falls back to remaining valid providers or secure defaults
- provider API keys are exposed only as api_key_set and api_key_masked
PATCH /settings
- Body model: AppSettingsUpdateRequest
- Response model: AppSettingsResponse
- rejects invalid provider base URLs with 400 when scheme, allowlist, or network safety checks fail
- provider API keys are persisted encrypted at rest (api_key_encrypted) and plaintext keys are not written to storage
POST /settings/reset
- Response model: AppSettingsResponse
PATCH /settings/handwriting
- Body model: HandwritingSettingsUpdateRequest
- Response model: AppSettingsResponse
GET /settings/handwriting
- Response model: HandwritingSettingsResponse
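
The three base-URL checks named for PATCH /settings (scheme, allowlist, network safety) can be sketched as below. This is not the backend's validator: the function name, the https-only rule, and the allowlist entry are hypothetical, and a production check would also need to resolve DNS names and inspect the resulting addresses, which is omitted here.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist; the real one comes from deployment configuration.
ALLOWED_HOSTS = {"provider.example.com"}


def validate_provider_base_url(url: str) -> str:
    """Return the URL unchanged, or raise ValueError (mapped to HTTP 400)."""
    parts = urlsplit(url)
    if parts.scheme != "https":  # assumption: only https is accepted
        raise ValueError("scheme must be https")
    host = parts.hostname
    if not host:
        raise ValueError("missing host")
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # a DNS name, not an IP literal
    # Network safety: refuse loopback, private, and other non-global literals.
    if address is not None and not address.is_global:
        raise ValueError("private or reserved addresses are not allowed")
    if host not in ALLOWED_HOSTS:
        raise ValueError("host is not on the allowlist")
    return url
```

Running the same validator when settings are read would match the described read sanitization, where persisted providers with invalid base URLs are ignored.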

Schema Families

Auth schemas in backend/app/schemas/auth.py:

AuthLoginRequest
AuthUserResponse
AuthSessionResponse
AuthLoginResponse
AuthLogoutResponse

Document schemas in backend/app/schemas/documents.py:

DocumentResponse
DocumentDetailResponse
DocumentsListResponse
UploadConflict
UploadResponse
DocumentUpdateRequest
SearchResponse
ContentExportRequest

Processing log schemas in backend/app/schemas/processing_logs.py:

ProcessingLogEntryResponse
ProcessingLogListResponse

Settings schemas in backend/app/schemas/settings.py:

Provider, task, upload-default, display, processing-log retention, predefined path and tag, handwriting-style, and legacy handwriting models are grouped under AppSettingsResponse and AppSettingsUpdateRequest.
