Harden auth and security controls with session auth and docs

2026-03-01 15:29:09 -03:00
parent 7a19f22f41
commit 0242e061c2
36 changed files with 1794 additions and 505 deletions

@@ -4,6 +4,7 @@ Base URL prefix: `/api/v1`
Primary implementation modules:
- `backend/app/api/router.py`
- `backend/app/api/routes_auth.py`
- `backend/app/api/routes_health.py`
- `backend/app/api/routes_documents.py`
- `backend/app/api/routes_search.py`
@@ -12,15 +13,32 @@ Primary implementation modules:
## Authentication And Authorization
- Protected endpoints require `Authorization: Bearer <token>` in production.
- Development deployments can allow tokenless user-role access for `documents/*` and `search/*` when `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS=true`.
- `ADMIN_API_TOKEN` is required for all privileged access and acts as fail-closed root credential.
- `USER_API_TOKEN` is optional and, when configured, grants access to document endpoints only.
- Authorization matrix:
- `documents/*`: `admin` or `user`
- `search/*`: `admin` or `user`
- `settings/*`: `admin` only
- `processing/logs/*`: `admin` only
- Authentication is session-based bearer auth.
- Clients authenticate with `POST /auth/login` using username and password.
- Backend issues per-user bearer session tokens and stores hashed session state server-side.
- Clients send issued tokens as `Authorization: Bearer <token>`.
- `GET /auth/me` returns current identity and role.
- `POST /auth/logout` revokes current session token.
Role matrix:
- `documents/*`: `admin` or `user`
- `search/*`: `admin` or `user`
- `settings/*`: `admin` only
- `processing/logs/*`: `admin` only
Ownership rules:
- `user` role is restricted to its own documents.
- `admin` role can access all documents.
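The role matrix and ownership rules above could be expressed as two small checks. The sketch below is an assumption-laden illustration: `ROLE_MATRIX`, `is_route_allowed`, and `can_access_document` are hypothetical names, and treating unlisted prefixes as denied is a choice made for this example, not something the documentation states.

```python
# Route prefix -> roles allowed, mirroring the role matrix above.
ROLE_MATRIX = {
    "documents": {"admin", "user"},
    "search": {"admin", "user"},
    "settings": {"admin"},
    "processing/logs": {"admin"},
}


def is_route_allowed(role: str, path: str) -> bool:
    """Check a role against the matrix for a path relative to /api/v1."""
    normalized = path.strip("/")
    for prefix, roles in ROLE_MATRIX.items():
        if normalized == prefix or normalized.startswith(prefix + "/"):
            return role in roles
    # Assumption for this sketch: prefixes outside the matrix are denied here.
    return False


def can_access_document(role: str, user_id: str, owner_id: str) -> bool:
    """Admins see every document; users see only documents they own."""
    if role == "admin":
        return True
    return role == "user" and user_id == owner_id
```

For instance, `is_route_allowed("user", "/settings")` is false while `is_route_allowed("user", "/documents/123")` is true, and the ownership check then decides whether that particular document is visible to the caller.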
## Auth
- `POST /auth/login`
- Body model: `AuthLoginRequest`
- Response model: `AuthLoginResponse`
- `GET /auth/me`
- Response model: `AuthSessionResponse`
- `POST /auth/logout`
- Response model: `AuthLogoutResponse`
## Health
@@ -30,9 +48,6 @@ Primary implementation modules:
## Documents
- Access: admin or user token required (production)
- Access: admin or user token, or development tokenless user fallback when enabled
### Collection and metadata helpers
- `GET /documents`
@@ -50,6 +65,11 @@ Primary implementation modules:
- `POST /documents/content-md/export`
- Body model: `ContentExportRequest`
- Response: ZIP stream containing one markdown file per matched document
- Limits:
- hard cap on matched document count (`CONTENT_EXPORT_MAX_DOCUMENTS`)
- hard cap on cumulative markdown bytes (`CONTENT_EXPORT_MAX_TOTAL_BYTES`)
- per-user rate limit (`CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE`)
- Behavior: archive is streamed from spool file instead of unbounded in-memory buffer
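One way to combine the document-count cap, the cumulative byte cap, and the spool-file behavior listed above is sketched below. It is a hedged illustration rather than the project's implementation: `build_export_archive`, `ExportLimitExceeded`, and the `spool_threshold_bytes` parameter are hypothetical, the limits are passed in as arguments instead of being read from `CONTENT_EXPORT_MAX_DOCUMENTS` and `CONTENT_EXPORT_MAX_TOTAL_BYTES`, and the per-user rate limit is not covered.

```python
import tempfile
import zipfile
from typing import Iterable, Tuple


class ExportLimitExceeded(Exception):
    """Raised when an export would exceed a configured hard cap."""


def build_export_archive(
    documents: Iterable[Tuple[str, str]],
    max_documents: int,
    max_total_bytes: int,
    spool_threshold_bytes: int = 1024 * 1024,
):
    """Write (filename, markdown) pairs into a ZIP backed by a spool file.

    The spool stays in memory up to `spool_threshold_bytes` and then rolls
    over to disk, so memory use does not grow with the archive size.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=spool_threshold_bytes)
    total_bytes = 0
    try:
        with zipfile.ZipFile(spool, "w", zipfile.ZIP_DEFLATED) as archive:
            for count, (name, markdown) in enumerate(documents, start=1):
                if count > max_documents:
                    raise ExportLimitExceeded("too many matched documents")
                payload = markdown.encode("utf-8")
                total_bytes += len(payload)
                if total_bytes > max_total_bytes:
                    raise ExportLimitExceeded("markdown byte cap exceeded")
                archive.writestr(name, payload)
    except Exception:
        # Release the spool (and any temp file on disk) on failure.
        spool.close()
        raise
    spool.seek(0)  # rewind so the caller can stream from the start
    return spool
```

The caller can then read the returned file object in chunks and close it once the response has been sent. Checking the caps while iterating means an oversized export fails before the whole result set has been materialized.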
### Per-document operations
@@ -89,7 +109,7 @@ Primary implementation modules:
- `conflict_mode` (`ask`, `replace`, `duplicate`)
- Response model: `UploadResponse`
- Behavior:
- `ask`: returns `conflicts` if duplicate checksum is detected
- `ask`: returns `conflicts` if duplicate checksum is detected for caller-visible documents
- `replace`: creates new document linked to replaced document id
- `duplicate`: creates additional document record
- upload `POST` request rejected with `411` when `Content-Length` is missing
@@ -98,16 +118,14 @@ Primary implementation modules:
## Search
- Access: admin or user token required
- `GET /search`
- Query: `query` (min length 2), `offset`, `limit`, `include_trashed`, `only_trashed`, `path_filter`, `tag_filter`, `type_filter`, `processed_from`, `processed_to`
- Response model: `SearchResponse`
- Behavior: PostgreSQL full-text and metadata ranking
- Behavior: PostgreSQL full-text and metadata ranking with role-based ownership scope
## Processing Logs
- Access: admin token required
- Access: admin only
- `GET /processing/logs`
- Query: `offset`, `limit`, `document_id`
@@ -122,9 +140,13 @@ Primary implementation modules:
- `POST /processing/logs/clear`
- Response: clear counters
Persistence mode:
- default is metadata-only logging (`PROCESSING_LOG_STORE_MODEL_IO_TEXT=false`, `PROCESSING_LOG_STORE_PAYLOAD_TEXT=false`)
- full prompt/response or payload content storage requires explicit operator opt-in
## Settings
- Access: admin token required
- Access: admin only
- `GET /settings`
- Response model: `AppSettingsResponse`
@@ -145,6 +167,13 @@ Primary implementation modules:
## Schema Families
Auth schemas in `backend/app/schemas/auth.py`:
- `AuthLoginRequest`
- `AuthUserResponse`
- `AuthSessionResponse`
- `AuthLoginResponse`
- `AuthLogoutResponse`
Document schemas in `backend/app/schemas/documents.py`:
- `DocumentResponse`
- `DocumentDetailResponse`
@@ -160,4 +189,4 @@ Processing log schemas in `backend/app/schemas/processing_logs.py`:
- `ProcessingLogListResponse`
Settings schemas in `backend/app/schemas/settings.py`:
- Provider, task, upload-default, display, processing-log retention, predefined paths or tags, handwriting-style, and legacy handwriting models grouped under `AppSettingsResponse` and `AppSettingsUpdateRequest`.
- provider, task, upload-default, display, processing-log retention, predefined paths or tags, handwriting-style, and legacy handwriting models grouped under `AppSettingsResponse` and `AppSettingsUpdateRequest`.