Harden auth and security controls with session auth and docs
@@ -2,15 +2,13 @@

## Runtime Services

`docker-compose.yml` defines the runtime stack:
- `db` (Postgres 16, internal network only)
- `redis` (Redis 7, internal network only, password-protected)
- `typesense` (Typesense 29, internal network only)
- `api` (FastAPI backend, host-bound port `8000`)
- `worker` (RQ background worker)
- `frontend` (Vite UI, host-bound port `5173`)

## Named Volumes
`docker-compose.yml` defines:
- `db` (Postgres 16)
- `redis` (Redis 7)
- `typesense` (Typesense 29)
- `api` (FastAPI backend)
- `worker` (RQ worker via `python -m app.worker.run_worker`)
- `frontend` (Vite React UI)

Persistent volumes:
- `db-data`
@@ -24,15 +22,15 @@ Reset all persisted runtime data:
docker compose down -v
```

## Operational Commands
## Core Commands

Start or rebuild stack:
Start or rebuild:

```bash
docker compose up --build -d
```

Stop stack:
Stop:

```bash
docker compose down
@@ -44,151 +42,81 @@ Tail logs:
docker compose logs -f
```

Before running compose, provide required credentials in your shell or project `.env` file:
## Authentication Model

```bash
export POSTGRES_USER="dcm"
export POSTGRES_PASSWORD="<random-postgres-password>"
export POSTGRES_DB="dcm"
export DATABASE_URL="postgresql+psycopg://<user>:<password>@db:5432/<db>"
export REDIS_PASSWORD="<random-redis-password>"
export REDIS_URL="redis://:<password>@redis:6379/0"
export ADMIN_API_TOKEN="<random-admin-token>"
export USER_API_TOKEN="<random-user-token>"
export APP_SETTINGS_ENCRYPTION_KEY="<random-settings-encryption-key>"
export TYPESENSE_API_KEY="<random-typesense-key>"
```
- Legacy shared build-time frontend token behavior was removed.
- API now uses server-issued per-user bearer sessions.
- Bootstrap users are provisioned from environment:
  - `AUTH_BOOTSTRAP_ADMIN_USERNAME`
  - `AUTH_BOOTSTRAP_ADMIN_PASSWORD`
  - optional `AUTH_BOOTSTRAP_USER_USERNAME`
  - optional `AUTH_BOOTSTRAP_USER_PASSWORD`
- Frontend signs in through `/api/v1/auth/login` and stores the issued session token in browser session storage.

Compose fails fast when required credential variables are missing.
## DEV And LIVE Configuration Matrix

## Backend Configuration
Use `.env.example` as baseline. The table below documents user-managed settings and recommended values.

Settings source:
- Runtime settings class: `backend/app/core/config.py`
- API settings persistence: `backend/app/services/app_settings.py`
| Variable | Local DEV (HTTP, docker-only) | LIVE (HTTPS behind reverse proxy) |
| --- | --- | --- |
| `APP_ENV` | `development` | `production` |
| `HOST_BIND_IP` | `127.0.0.1` or local LAN bind if needed | `127.0.0.1` (publish behind proxy only) |
| `PUBLIC_BASE_URL` | `http://localhost:8000` | `https://api.example.com` |
| `VITE_API_BASE` | empty for host-derived `http://<frontend-host>:8000/api/v1`, or explicit local URL | `https://api.example.com/api/v1` |
| `CORS_ORIGINS` | `["http://localhost:5173","http://localhost:3000"]` | exact frontend origins only, for example `["https://app.example.com"]` |
| `CORS_ALLOW_CREDENTIALS` | `false` | `false` (Authorization header flow does not need credentialed CORS) |
| `REDIS_URL` | `redis://:<password>@redis:6379/0` in isolated local network | `rediss://:<password>@redis.internal:6379/0` |
| `REDIS_SECURITY_MODE` | `compat` or `auto` | `strict` |
| `REDIS_TLS_MODE` | `allow_insecure` or `auto` | `required` |
| `PROVIDER_BASE_URL_ALLOW_HTTP` | `true` only when intentionally testing local HTTP provider endpoints | `false` |
| `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK` | `true` only for trusted local development targets | `false` |
| `PROVIDER_BASE_URL_ALLOWLIST` | allow needed test hosts | explicit production allowlist, for example `["api.openai.com"]` |
| `PROCESSING_LOG_STORE_MODEL_IO_TEXT` | `false` by default; temporary `true` only for controlled debugging | `false` |
| `PROCESSING_LOG_STORE_PAYLOAD_TEXT` | `false` by default; temporary `true` only for controlled debugging | `false` |
| `CONTENT_EXPORT_MAX_DOCUMENTS` | default `250` or lower based on host memory | tuned to production capacity |
| `CONTENT_EXPORT_MAX_TOTAL_BYTES` | default `52428800` (50 MiB) or lower | tuned to production capacity |
| `CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE` | default `6` | tuned to API throughput and abuse model |
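
The LIVE column of the matrix can be checked mechanically before a deployment. The following is a minimal sketch, not code from this repository: `check_live_config` and `EXPECTED_LIVE_VALUES` are hypothetical names, and the sketch assumes every variable arrives as a plain string with `CORS_ORIGINS` encoded as a JSON list, as the table shows.

```python
import json
from urllib.parse import urlsplit

# Exact values the LIVE column recommends (hypothetical helper, not repository code).
EXPECTED_LIVE_VALUES = {
    "APP_ENV": "production",
    "CORS_ALLOW_CREDENTIALS": "false",
    "REDIS_SECURITY_MODE": "strict",
    "REDIS_TLS_MODE": "required",
    "PROVIDER_BASE_URL_ALLOW_HTTP": "false",
    "PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK": "false",
    "PROCESSING_LOG_STORE_MODEL_IO_TEXT": "false",
    "PROCESSING_LOG_STORE_PAYLOAD_TEXT": "false",
}


def check_live_config(env):
    """Return a list of findings; an empty list means no deviation was detected."""
    findings = []
    for name, expected in EXPECTED_LIVE_VALUES.items():
        if env.get(name, "").strip().lower() != expected:
            findings.append(f"{name} should be {expected!r}")

    # Public URLs must be the final HTTPS URLs served by the reverse proxy.
    for name in ("PUBLIC_BASE_URL", "VITE_API_BASE"):
        if urlsplit(env.get(name, "")).scheme != "https":
            findings.append(f"{name} should be an https:// URL")

    # LIVE Redis uses the TLS scheme and a password.
    redis = urlsplit(env.get("REDIS_URL", ""))
    if redis.scheme != "rediss" or not redis.password:
        findings.append("REDIS_URL should use rediss:// and carry a password")

    # CORS origins must be an explicit, non-empty list of HTTPS origins.
    try:
        origins = json.loads(env.get("CORS_ORIGINS", "[]"))
    except json.JSONDecodeError:
        origins = []
    https_only = isinstance(origins, list) and all(
        isinstance(origin, str) and origin.startswith("https://") for origin in origins
    )
    if not origins or not https_only:
        findings.append("CORS_ORIGINS should be a non-empty JSON list of https:// origins")
    return findings
```

Calling `check_live_config(dict(os.environ))` in a pre-deploy step would surface deviations as readable strings; whether a deviation should block the deployment is left to the operator.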

Key environment variables used by `api` and `worker` in compose:
- `APP_ENV`
- `DATABASE_URL`
- `REDIS_URL`
- `REDIS_SECURITY_MODE`
- `REDIS_TLS_MODE`
- `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS`
- `STORAGE_ROOT`
- `ADMIN_API_TOKEN`
- `USER_API_TOKEN`
- `APP_SETTINGS_ENCRYPTION_KEY`
- `PUBLIC_BASE_URL`
- `CORS_ORIGINS` (API service)
- `PROVIDER_BASE_URL_ALLOWLIST`
- `PROVIDER_BASE_URL_ALLOW_HTTP`
- `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`
- `TYPESENSE_PROTOCOL`
- `TYPESENSE_HOST`
- `TYPESENSE_PORT`
- `TYPESENSE_API_KEY`
- `TYPESENSE_COLLECTION_NAME`
## HTTPS Proxy Deployment Notes

Selected defaults from `Settings` (`backend/app/core/config.py`):
- `upload_chunk_size = 4194304`
- `max_upload_files_per_request = 50`
- `max_upload_file_size_bytes = 26214400`
- `max_upload_request_size_bytes = 104857600`
- `max_zip_members = 250`
- `max_zip_depth = 2`
- `max_zip_descendants_per_root = 1000`
- `max_zip_member_uncompressed_bytes = 26214400`
- `max_zip_total_uncompressed_bytes = 157286400`
- `max_zip_compression_ratio = 120.0`
- `max_text_length = 500000`
- `processing_log_max_document_sessions = 20`
- `processing_log_max_unbound_entries = 400`
- `default_openai_model = "gpt-4.1-mini"`
- `default_openai_timeout_seconds = 45`
- `default_summary_model = "gpt-4.1-mini"`
- `default_routing_model = "gpt-4.1-mini"`
- `typesense_timeout_seconds = 120`
- `typesense_num_retries = 0`
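
The `max_zip_*` defaults listed above translate into a pre-extraction check along the following lines. This is a sketch under stated assumptions, not the backend's implementation: `check_zip_limits` is a hypothetical name, the constants copy the listed defaults, and only the member-count, size, and compression-ratio limits are covered. The depth and per-root descendant caps need recursion state that is omitted here.

```python
import io
import zipfile

# Values copied from the defaults list; the constant names are illustrative.
MAX_ZIP_MEMBERS = 250
MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES = 26_214_400
MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES = 157_286_400
MAX_ZIP_COMPRESSION_RATIO = 120.0


def check_zip_limits(data):
    """Raise ValueError when the archive's declared metadata exceeds a limit."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = [info for info in archive.infolist() if not info.is_dir()]

    if len(members) > MAX_ZIP_MEMBERS:
        raise ValueError("archive has too many members")

    total_uncompressed = 0
    for info in members:
        if info.file_size > MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES:
            raise ValueError(f"{info.filename}: member is too large")
        # Empty members have compress_size == 0, so skip the ratio for them.
        if info.compress_size > 0:
            ratio = info.file_size / info.compress_size
            if ratio > MAX_ZIP_COMPRESSION_RATIO:
                raise ValueError(f"{info.filename}: compression ratio is too high")
        total_uncompressed += info.file_size
        if total_uncompressed > MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES:
            raise ValueError("archive expands to too many bytes")
```

The sizes read here are declared by whoever built the archive, so a check like this only rejects obviously hostile input early. A complete guard would also count bytes while streaming each member out and stop once a limit is crossed.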
This application supports both:
- local HTTP-only operation (no TLS termination in containers)
- HTTPS deployment behind a reverse proxy that handles TLS

## Frontend Configuration

Frontend runtime API target:
- `VITE_API_BASE` in `docker-compose.yml` frontend service (optional override)
- `VITE_API_TOKEN` in `docker-compose.yml` frontend service (optional compatibility fallback only)

When `VITE_API_BASE` is unset, frontend API helpers resolve to:
- `http://<current-frontend-hostname>:8000/api/v1`

Frontend API authentication behavior:
- `frontend/src/lib/api.ts` resolves bearer tokens at request time in this order:
  - custom runtime resolver (`setApiTokenResolver`)
  - runtime global token (`window.__DCM_API_TOKEN__`)
  - session token (`setRuntimeApiToken`)
  - legacy `VITE_API_TOKEN` fallback
- requests are sent without authorization only when no runtime or fallback token source is available

Frontend container runtime behavior:
- the container runs as non-root `node`
- `/app` is owned by `node` in `frontend/Dockerfile` so Vite can create runtime temp config files under `/app`

Frontend local commands:

```bash
cd frontend && npm run dev
cd frontend && npm run build
cd frontend && npm run preview
```

## Settings Persistence

Application-level settings managed from the UI are persisted by the backend settings service:
- file path: `<STORAGE_ROOT>/settings.json`
- endpoints: `/api/v1/settings`, `/api/v1/settings/reset`, `/api/v1/settings/handwriting`

Settings include:
- upload defaults
- display options
- processing-log retention options (`keep_document_sessions`, `keep_unbound_entries`)
- provider configuration
- OCR, summary, and routing task settings
- predefined paths and tags
- handwriting-style clustering settings

Read sanitization is resilient to corrupt persisted provider rows. If a persisted provider entry fails URL validation, the entry is skipped, and defaults are used when no valid provider remains. This prevents unrelated read endpoints from failing because of stale, invalid provider data.
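
The skip-and-fall-back behavior described above can be pictured with a short sketch. Everything named here is an assumption made for illustration: the `id` and `base_url` field names, the `DEFAULT_PROVIDERS` placeholder, and the deliberately simple URL check are not taken from `backend/app/services/app_settings.py`.

```python
from urllib.parse import urlsplit

# Placeholder default; the real default provider configuration is not shown in this document.
DEFAULT_PROVIDERS = [{"id": "default", "base_url": "https://api.openai.com/v1"}]


def _is_valid_base_url(value):
    """Accept only http(s) URLs that carry a hostname."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:  # for example, a malformed IPv6 literal
        return False
    return parts.scheme in {"http", "https"} and bool(parts.hostname)


def sanitize_providers(rows):
    """Drop rows that fail validation; fall back to defaults when none survive."""
    valid = [
        row for row in rows
        if isinstance(row, dict) and _is_valid_base_url(row.get("base_url"))
    ]
    return valid or [dict(provider) for provider in DEFAULT_PROVIDERS]
```

Because invalid rows are dropped instead of raising, a read endpoint that only needs display options is not affected by one stale provider entry.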

Provider API keys are persisted as encrypted payloads (`api_key_encrypted`) and plaintext `api_key` values are no longer written to disk.

Retention settings are used by worker cleanup and by `POST /api/v1/processing/logs/trim` when trim query values are not provided.
Recommended LIVE pattern:
1. Proxy terminates TLS and forwards to `api` and `frontend` internal HTTP endpoints.
2. Keep container published ports bound to localhost or internal network.
3. Set `PUBLIC_BASE_URL` and `VITE_API_BASE` to final HTTPS URLs.
4. Set `CORS_ORIGINS` to exact HTTPS frontend origins.
5. Keep `CORS_ALLOW_CREDENTIALS=false` for bearer header flow.

## Security Controls

- Privileged APIs are token-gated with bearer auth:
  - `documents` endpoints: user token or admin token
  - `settings` and `processing/logs` endpoints: admin token only
- Development environments can allow tokenless user-role access for document/search routes via `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS=true`; production remains token-enforced.
- CORS allows HTTP and HTTPS origins by regex in addition to explicit `CORS_ORIGINS`, so LAN and public-domain frontend origins are accepted.
- Authentication fails closed when `ADMIN_API_TOKEN` is not configured and admin access is requested.
- Document preview endpoint blocks inline rendering for script-capable MIME types and forces attachment responses for active content.
- Provider base URLs are validated on settings updates and before outbound model calls:
  - optional allowlist enforcement (`PROVIDER_BASE_URL_ALLOWLIST`)
  - optional scheme restrictions (`PROVIDER_BASE_URL_ALLOW_HTTP`)
  - optional private-network restrictions (`PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`)
  - per-request DNS revalidation checks for outbound runtime calls, including OCR provider path
- Upload and archive safety guards are enforced:
  - `POST /api/v1/documents/upload` requires `Content-Length` and enforces file-count, per-file size, and total request size limits
  - `OPTIONS /api/v1/documents/upload` CORS preflight is excluded from `Content-Length` enforcement
  - ZIP member count, per-member uncompressed size, total decompressed size, compression-ratio guards, max depth, and per-root descendant fan-out cap
- Redis queue security checks enforce URL scheme/auth/TLS policy at runtime with production fail-closed defaults.
- Processing logs redact sensitive payload and text fields, and trim endpoints enforce retention caps from runtime config.
- Compose hardening defaults:
  - only `api` and `frontend` publish host ports; `db`, `redis`, and `typesense` stay internal-only
  - `api`, `worker`, and `frontend` drop all Linux capabilities and set `no-new-privileges`
  - backend and frontend containers run as non-root users by default
- CORS uses explicit origin allowlist only; broad origin regex matching is removed.
- Worker Redis startup validates URL auth and TLS policy before consuming jobs.
- Provider API keys are encrypted at rest with standard AEAD (`cryptography` Fernet).
  - legacy `enc-v1` payloads are read for backward compatibility
  - new writes use `enc-v2`
- Processing logs default to metadata-only persistence.
- Markdown export enforces:
  - max document count
  - max total markdown bytes
  - per-user Redis-backed rate limit
  - spool-file streaming to avoid building unbounded archives in memory
- User-role document access is owner-scoped for non-admin accounts.
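
The provider base URL checks listed above (allowlist, scheme, private-network restriction, and DNS revalidation) can be sketched as one function. This is an illustration under assumptions, not the backend's validator: `validate_provider_base_url`, its parameters, and the injectable `resolve` callback are hypothetical names, with the three keyword arguments standing in for `PROVIDER_BASE_URL_ALLOWLIST`, `PROVIDER_BASE_URL_ALLOW_HTTP`, and `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _resolve(host):
    """Resolve a hostname to the set of IP addresses it currently maps to."""
    # Strip any IPv6 zone id ("%eth0") so ipaddress can parse the result.
    return {info[4][0].split("%")[0] for info in socket.getaddrinfo(host, None)}


def validate_provider_base_url(url, *, allowlist=(), allow_http=False,
                               allow_private_network=False, resolve=_resolve):
    """Raise ValueError if the URL violates the configured policy."""
    parts = urlsplit(url)
    allowed_schemes = {"https", "http"} if allow_http else {"https"}
    if parts.scheme not in allowed_schemes:
        raise ValueError(f"scheme {parts.scheme!r} is not allowed")

    host = (parts.hostname or "").lower()
    if not host:
        raise ValueError("URL has no hostname")

    # An empty allowlist means allowlist enforcement is switched off.
    if allowlist and host not in {entry.lower() for entry in allowlist}:
        raise ValueError(f"host {host!r} is not in the allowlist")

    if not allow_private_network:
        # Resolve at call time so a changed DNS record is caught on each request.
        for address in resolve(host):
            ip = ipaddress.ip_address(address)
            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
                raise ValueError(f"host {host!r} resolves to non-public address {ip}")
```

Running the same function both when settings are saved and immediately before each outbound call is what makes the DNS check a revalidation: a hostname that pointed at a public address at save time is rejected if it later resolves to an internal one.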

## Frontend Runtime

- Frontend no longer consumes `VITE_API_TOKEN`.
- Session token storage key is `dcm.access_token` in browser session storage.
- Protected media and file download flows still use authenticated fetch plus blob/object URL handling.

## Validation Checklist

After operational or configuration changes, verify:
- `GET /api/v1/health` is healthy
- frontend can list, upload, and search documents
- processing worker logs show successful task execution
- settings save or reset works and persists after restart
After configuration changes:
- `GET /api/v1/health` returns healthy response
- login succeeds for bootstrap admin user
- admin can upload, search, open preview, download, and export markdown
- user account can only access its own documents
- admin-only settings and processing logs are not accessible by user role
- `docker compose logs -f api worker` shows no startup validation failures