Harden security controls from REPORT findings

2026-03-01 13:32:08 -03:00
parent da5cbc2c01
commit bdd97d1c62
20 changed files with 1455 additions and 97 deletions


@@ -3,12 +3,12 @@
## Runtime Services
`docker-compose.yml` defines the runtime stack:
- `db` (Postgres 16, localhost-bound port `5432`)
- `redis` (Redis 7, localhost-bound port `6379`)
- `typesense` (Typesense 29, localhost-bound port `8108`)
- `api` (FastAPI backend, localhost-bound port `8000`)
- `db` (Postgres 16, internal network only)
- `redis` (Redis 7, internal network only, password-protected)
- `typesense` (Typesense 29, internal network only)
- `api` (FastAPI backend, host-bound port `8000`)
- `worker` (RQ background worker)
- `frontend` (Vite UI, localhost-bound port `5173`)
- `frontend` (Vite UI, host-bound port `5173`)
## Named Volumes
@@ -44,14 +44,22 @@ Tail logs:
docker compose logs -f
```
Before running compose, provide explicit API tokens in your shell or project `.env` file:
Before running compose, provide required credentials in your shell or project `.env` file:
```bash
export POSTGRES_USER="dcm"
export POSTGRES_PASSWORD="<random-postgres-password>"
export POSTGRES_DB="dcm"
export DATABASE_URL="postgresql+psycopg://<user>:<password>@db:5432/<db>"
export REDIS_PASSWORD="<random-redis-password>"
export REDIS_URL="redis://:<password>@redis:6379/0"
export ADMIN_API_TOKEN="<random-admin-token>"
export USER_API_TOKEN="<random-user-token>"
export APP_SETTINGS_ENCRYPTION_KEY="<random-settings-encryption-key>"
export TYPESENSE_API_KEY="<random-typesense-key>"
```
Compose now fails fast if either token variable is missing.
Compose fails fast when required credential variables are missing.
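The same fail-fast idea can be sketched as a preflight check run before `docker compose up`. This is a minimal illustration, not the repository's mechanism: the function names are hypothetical, and it assumes every variable in the export block above is required.

```python
import os
from typing import List, Mapping

# Variable names taken from the export block above; treating all of them
# as required is an assumption of this sketch.
REQUIRED_VARIABLES = (
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "DATABASE_URL",
    "REDIS_PASSWORD",
    "REDIS_URL",
    "ADMIN_API_TOKEN",
    "USER_API_TOKEN",
    "APP_SETTINGS_ENCRYPTION_KEY",
    "TYPESENSE_API_KEY",
)


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank (hypothetical helper)."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name, "").strip()]


def require_credentials(env: Mapping[str, str] = os.environ) -> None:
    """Exit with a readable message when any required variable is missing."""
    missing = missing_credentials(env)
    if missing:
        raise SystemExit("Missing required credential variables: " + ", ".join(missing))
```

Calling `require_credentials()` at the top of a wrapper script would stop the run before any container starts.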
## Backend Configuration
@@ -63,9 +71,13 @@ Key environment variables used by `api` and `worker` in compose:
- `APP_ENV`
- `DATABASE_URL`
- `REDIS_URL`
- `REDIS_SECURITY_MODE`
- `REDIS_TLS_MODE`
- `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS`
- `STORAGE_ROOT`
- `ADMIN_API_TOKEN`
- `USER_API_TOKEN`
- `APP_SETTINGS_ENCRYPTION_KEY`
- `PUBLIC_BASE_URL`
- `CORS_ORIGINS` (API service)
- `PROVIDER_BASE_URL_ALLOWLIST`
@@ -84,6 +96,7 @@ Selected defaults from `Settings` (`backend/app/core/config.py`):
- `max_upload_request_size_bytes = 104857600`
- `max_zip_members = 250`
- `max_zip_depth = 2`
- `max_zip_descendants_per_root = 1000`
- `max_zip_member_uncompressed_bytes = 26214400`
- `max_zip_total_uncompressed_bytes = 157286400`
- `max_zip_compression_ratio = 120.0`
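To illustrate how the ZIP limits above might be applied, here is a minimal sketch that inspects archive metadata before extraction. The function and exception names are hypothetical and the backend's actual checks may differ. The sketch covers member count, per-member size, total size, and compression ratio only; the depth and descendant limits concern nested archives and are left out.

```python
import io
import zipfile

# Default values copied from the Settings list above.
MAX_ZIP_MEMBERS = 250
MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES = 26_214_400
MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES = 157_286_400
MAX_ZIP_COMPRESSION_RATIO = 120.0


class ZipLimitError(ValueError):
    """Raised when an archive exceeds a configured limit (hypothetical name)."""


def check_zip_limits(data: bytes) -> int:
    """Check declared member sizes and return the total uncompressed byte count."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = [info for info in archive.infolist() if not info.is_dir()]

    if len(members) > MAX_ZIP_MEMBERS:
        raise ZipLimitError(f"too many members: {len(members)}")

    total = 0
    for info in members:
        if info.file_size > MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES:
            raise ZipLimitError(f"member too large: {info.filename}")
        total += info.file_size
        if total > MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES:
            raise ZipLimitError("total uncompressed size too large")
        # Skip the ratio check for empty members to avoid dividing by zero.
        if info.compress_size > 0:
            ratio = info.file_size / info.compress_size
            if ratio > MAX_ZIP_COMPRESSION_RATIO:
                raise ZipLimitError(f"compression ratio too high: {info.filename}")
    return total
```

The sizes read here are the values declared in the archive headers, which a crafted file can misreport, so a real extractor would also need to cap the bytes it actually reads from each member.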
@@ -101,11 +114,15 @@ Selected defaults from `Settings` (`backend/app/core/config.py`):
Frontend runtime API target:
- `VITE_API_BASE` in `docker-compose.yml` frontend service
- `VITE_API_TOKEN` in `docker-compose.yml` frontend service (defaults to `USER_API_TOKEN` in compose, override to `ADMIN_API_TOKEN` when admin-only routes are needed)
- `VITE_API_TOKEN` in `docker-compose.yml` frontend service (optional compatibility fallback only)
Frontend API authentication behavior:
- `frontend/src/lib/api.ts` adds `Authorization: Bearer <VITE_API_TOKEN>` for all API requests only when `VITE_API_TOKEN` is non-empty
- requests are still sent without authorization when `VITE_API_TOKEN` is unset, which keeps unauthenticated endpoints such as `/api/v1/health` backward-compatible
- `frontend/src/lib/api.ts` resolves bearer tokens at request time in this order:
- custom runtime resolver (`setApiTokenResolver`)
- runtime global token (`window.__DCM_API_TOKEN__`)
- session token (`setRuntimeApiToken`)
- legacy `VITE_API_TOKEN` fallback
- requests are sent without authorization only when no runtime or fallback token source is available
Frontend container runtime behavior:
- the container runs as non-root `node`
@@ -136,6 +153,8 @@ Settings include:
Read sanitization is resilient to corrupt persisted provider rows. If a persisted provider entry fails URL validation, the entry is skipped and defaults are used when no valid provider remains. This prevents unrelated read endpoints from failing due to stale invalid provider data.
Provider API keys are persisted as encrypted payloads (`api_key_encrypted`) and plaintext `api_key` values are no longer written to disk.
Retention settings are used by worker cleanup and by `POST /api/v1/processing/logs/trim` when trim query values are not provided.
## Security Controls
@@ -143,18 +162,21 @@ Retention settings are used by worker cleanup and by `POST /api/v1/processing/lo
- Privileged APIs are token-gated with bearer auth:
- `documents` endpoints: user token or admin token
- `settings` and `processing/logs` endpoints: admin token only
- Authentication fails closed when `ADMIN_API_TOKEN` is not configured.
- Development environments can allow tokenless user-role access for document/search routes via `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS=true`; production remains token-enforced.
- Authentication fails closed when `ADMIN_API_TOKEN` is not configured and admin access is requested.
- Document preview endpoint blocks inline rendering for script-capable MIME types and forces attachment responses for active content.
- Provider base URLs are validated on settings updates and before outbound model calls:
- allowlist enforcement (`PROVIDER_BASE_URL_ALLOWLIST`)
- scheme restrictions (`https` by default)
- local/private-network blocking and per-request DNS revalidation checks for outbound runtime calls
- local/private-network blocking and per-request DNS revalidation checks for outbound runtime calls, including the OCR provider path
- Upload and archive safety guards are enforced:
- `POST /api/v1/documents/upload` requires `Content-Length` and enforces file-count, per-file size, and total request size limits
- `OPTIONS /api/v1/documents/upload` CORS preflight is excluded from `Content-Length` enforcement
- ZIP member count, per-member uncompressed size, total decompressed size, and compression-ratio guards
- ZIP member count, per-member uncompressed size, total decompressed size, compression-ratio guards, max depth, and per-root descendant fan-out cap
- Redis queue security checks enforce URL scheme/auth/TLS policy at runtime with production fail-closed defaults.
- Processing logs redact sensitive payload and text fields, and trim endpoints enforce retention caps from runtime config.
- Compose hardening defaults:
- host ports bind to `127.0.0.1` unless the `HOST_BIND_IP` override is set
- only `api` and `frontend` publish host ports; `db`, `redis`, and `typesense` stay internal-only
- `api`, `worker`, and `frontend` drop all Linux capabilities and set `no-new-privileges`
- backend and frontend containers run as non-root users by default
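The provider base URL checks listed above (scheme restriction, allowlist, private-network blocking, DNS revalidation) can be sketched roughly as follows. This is an illustration under stated assumptions, not the backend's code: the function names are hypothetical, and the allowlist is assumed to hold exact hostnames, which may not match how `PROVIDER_BASE_URL_ALLOWLIST` is actually parsed.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List, Tuple
from urllib.parse import urlsplit


def resolve_host(host: str) -> List[str]:
    """Resolve a hostname to IP address strings with the system resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def validate_provider_base_url(
    url: str,
    allowlist: Iterable[str],
    resolve: Callable[[str], List[str]] = resolve_host,
    allowed_schemes: Tuple[str, ...] = ("https",),
) -> str:
    """Return the validated hostname or raise ValueError (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme not in allowed_schemes:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")

    host = (parts.hostname or "").lower()
    if not host:
        raise ValueError("URL has no hostname")
    # Assumption: allowlist entries are exact hostnames.
    if host not in {entry.lower() for entry in allowlist}:
        raise ValueError(f"host not in allowlist: {host}")

    # Resolve on every call so a DNS change after validation is caught.
    for address in resolve(host):
        ip = ipaddress.ip_address(address)
        if (
            ip.is_private
            or ip.is_loopback
            or ip.is_link_local
            or ip.is_reserved
            or ip.is_multicast
            or ip.is_unspecified
        ):
            raise ValueError(f"host resolves to a blocked address: {address}")
    return host
```

Running the check both when settings are saved and again immediately before each outbound call is what gives the per-request revalidation described above. The `resolve` parameter exists only so the resolver can be replaced in tests.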