# Operations And Configuration
## Runtime Services
`docker-compose.yml` defines the runtime stack:
- `db` (Postgres 16, internal network only)
- `redis` (Redis 7, internal network only, password-protected)
- `typesense` (Typesense 29, internal network only)
- `api` (FastAPI backend, host-bound port `8000`)
- `worker` (RQ background worker)
- `frontend` (Vite UI, host-bound port `5173`)
## Named Volumes
Persistent volumes:
- `db-data`
- `redis-data`
- `dcm-storage`
- `typesense-data`
Reset all persisted runtime data (this deletes every named volume listed above):
```bash
docker compose down -v
```
## Operational Commands
Start or rebuild stack:
```bash
docker compose up --build -d
```
Stop stack:
```bash
docker compose down
```
Tail logs:
```bash
docker compose logs -f
```
Before running Compose, provide the required credentials in your shell or in the project `.env` file:
```bash
export POSTGRES_USER="dcm"
export POSTGRES_PASSWORD="<random-postgres-password>"
export POSTGRES_DB="dcm"
export DATABASE_URL="postgresql+psycopg://<user>:<password>@db:5432/<db>"
export REDIS_PASSWORD="<random-redis-password>"
export REDIS_URL="redis://:<password>@redis:6379/0"
export ADMIN_API_TOKEN="<random-admin-token>"
export USER_API_TOKEN="<random-user-token>"
export APP_SETTINGS_ENCRYPTION_KEY="<random-settings-encryption-key>"
export TYPESENSE_API_KEY="<random-typesense-key>"
```
Compose fails fast when required credential variables are missing.
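
The placeholders above call for random values. One way to produce them is sketched below with Python's standard `secrets` module. The 32-byte length and hex encoding are assumptions, not project requirements, and `APP_SETTINGS_ENCRYPTION_KEY` may require a specific key format that this document does not state, so check `backend/app/core/config.py` before relying on the generated value.

```python
import secrets

# Secret-bearing variables taken from the export list above.
SECRET_VARS = (
    "POSTGRES_PASSWORD",
    "REDIS_PASSWORD",
    "ADMIN_API_TOKEN",
    "USER_API_TOKEN",
    "APP_SETTINGS_ENCRYPTION_KEY",
    "TYPESENSE_API_KEY",
)


def generate_env_lines(nbytes: int = 32) -> list[str]:
    """Return KEY=value lines, each with an independent random hex value.

    Hex output avoids characters that would need escaping inside
    DATABASE_URL or REDIS_URL. The length is an assumption.
    """
    return [f"{name}={secrets.token_hex(nbytes)}" for name in SECRET_VARS]


if __name__ == "__main__":
    print("\n".join(generate_env_lines()))
```

`DATABASE_URL` and `REDIS_URL` embed the Postgres and Redis passwords, so they need to be updated to match whatever values are generated.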
## Backend Configuration
Settings source:
- Runtime settings class: `backend/app/core/config.py`
- API settings persistence: `backend/app/services/app_settings.py`
Key environment variables used by `api` and `worker` in compose:
- `APP_ENV`
- `DATABASE_URL`
- `REDIS_URL`
- `REDIS_SECURITY_MODE`
- `REDIS_TLS_MODE`
- `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS`
- `STORAGE_ROOT`
- `ADMIN_API_TOKEN`
- `USER_API_TOKEN`
- `APP_SETTINGS_ENCRYPTION_KEY`
- `PUBLIC_BASE_URL`
- `CORS_ORIGINS` (API service)
- `PROVIDER_BASE_URL_ALLOWLIST`
- `PROVIDER_BASE_URL_ALLOW_HTTP`
- `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`
- `TYPESENSE_PROTOCOL`
- `TYPESENSE_HOST`
- `TYPESENSE_PORT`
- `TYPESENSE_API_KEY`
- `TYPESENSE_COLLECTION_NAME`
Selected defaults from `Settings` (`backend/app/core/config.py`):
- `upload_chunk_size = 4194304` (4 MiB)
- `max_upload_files_per_request = 50`
- `max_upload_file_size_bytes = 26214400` (25 MiB)
- `max_upload_request_size_bytes = 104857600` (100 MiB)
- `max_zip_members = 250`
- `max_zip_depth = 2`
- `max_zip_descendants_per_root = 1000`
- `max_zip_member_uncompressed_bytes = 26214400` (25 MiB)
- `max_zip_total_uncompressed_bytes = 157286400` (150 MiB)
- `max_zip_compression_ratio = 120.0`
- `max_text_length = 500000`
- `processing_log_max_document_sessions = 20`
- `processing_log_max_unbound_entries = 400`
- `default_openai_model = "gpt-4.1-mini"`
- `default_openai_timeout_seconds = 45`
- `default_summary_model = "gpt-4.1-mini"`
- `default_routing_model = "gpt-4.1-mini"`
- `typesense_timeout_seconds = 120`
- `typesense_num_retries = 0`
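
To make the ZIP defaults concrete, the sketch below checks an in-memory archive against four of them using the standard `zipfile` module. It is an illustration, not the backend's implementation: the function name `check_zip_limits` is hypothetical, nesting depth and per-root descendant limits are not covered, and only the sizes declared in the archive metadata are inspected, which a hostile archive can misreport.

```python
import io
import zipfile

# Defaults quoted from the Settings list above.
MAX_ZIP_MEMBERS = 250
MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES = 26_214_400  # 25 MiB
MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES = 157_286_400  # 150 MiB
MAX_ZIP_COMPRESSION_RATIO = 120.0


def check_zip_limits(data: bytes) -> list[str]:
    """Return human-readable limit violations; an empty list means none found.

    Only sizes declared in the archive's central directory are inspected,
    so this is a pre-check, not a substitute for counting bytes on extraction.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = [info for info in archive.infolist() if not info.is_dir()]

    problems = []
    if len(members) > MAX_ZIP_MEMBERS:
        problems.append(f"too many members: {len(members)}")

    total = 0
    for info in members:
        total += info.file_size
        if info.file_size > MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES:
            problems.append(f"{info.filename}: member too large")
        # Stored empty members report compress_size 0; skip the division then.
        if info.compress_size > 0:
            ratio = info.file_size / info.compress_size
            if ratio > MAX_ZIP_COMPRESSION_RATIO:
                problems.append(f"{info.filename}: compression ratio {ratio:.0f}")
    if total > MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES:
        problems.append(f"total uncompressed size too large: {total}")
    return problems
```

A production guard would also count bytes while streaming each member out, so that an archive with falsified headers is still stopped at the configured limits.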
## Frontend Configuration
Frontend runtime API target:
- `VITE_API_BASE` in `docker-compose.yml` frontend service (optional override)
- `VITE_API_TOKEN` in `docker-compose.yml` frontend service (optional compatibility fallback only)
When `VITE_API_BASE` is unset, frontend API helpers resolve the backend URL dynamically as:
- `http://<current-frontend-hostname>:8000/api/v1`
This keeps development access working when the UI is opened through a LAN IP instead of `localhost`.
Frontend API authentication behavior:
- `frontend/src/lib/api.ts` resolves bearer tokens at request time in this order:
  - custom runtime resolver (`setApiTokenResolver`)
  - runtime global token (`window.__DCM_API_TOKEN__`)
  - session token (`setRuntimeApiToken`)
  - legacy `VITE_API_TOKEN` fallback
- requests are sent without authorization only when no runtime or fallback token source is available
Frontend container runtime behavior:
- the container runs as non-root `node`
- `/app` is owned by `node` in `frontend/Dockerfile` so Vite can create runtime temp config files under `/app`
Frontend local commands:
```bash
cd frontend && npm run dev
cd frontend && npm run build
cd frontend && npm run preview
```
## Settings Persistence
Application-level settings managed from the UI are persisted by the backend settings service:
- file path: `<STORAGE_ROOT>/settings.json`
- endpoints: `/api/v1/settings`, `/api/v1/settings/reset`, `/api/v1/settings/handwriting`
Settings include:
- upload defaults
- display options
- processing-log retention options (`keep_document_sessions`, `keep_unbound_entries`)
- provider configuration
- OCR, summary, and routing task settings
- predefined paths and tags
- handwriting-style clustering settings
Read sanitization is resilient to corrupt persisted provider rows. If a persisted provider entry fails URL validation, the entry is skipped and defaults are used when no valid provider remains. This prevents unrelated read endpoints from failing due to stale invalid provider data.
Provider API keys are persisted as encrypted payloads (`api_key_encrypted`); plaintext `api_key` values are no longer written to disk.
Retention settings are used by worker cleanup and by `POST /api/v1/processing/logs/trim` when trim query values are not provided.
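
The tolerant read behavior for provider rows described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `base_url` and `id` field names, the `DEFAULT_PROVIDERS` value, and both function names are hypothetical, and the validator is a stand-in for the real URL validation in `backend/app/services/app_settings.py`.

```python
from urllib.parse import urlsplit

# Hypothetical default; the real defaults live in the backend settings service.
DEFAULT_PROVIDERS = [{"id": "default", "base_url": "https://provider.example/v1"}]


def is_valid_base_url(value: object) -> bool:
    """Stand-in validator: accept only https URLs that carry a hostname."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:  # raised for malformed input such as "https://[bad"
        return False
    return parts.scheme == "https" and bool(parts.hostname)


def sanitize_providers(rows: object) -> list[dict]:
    """Drop invalid provider rows; fall back to defaults if none survive."""
    candidates = rows if isinstance(rows, list) else []
    valid = [
        row for row in candidates
        if isinstance(row, dict) and is_valid_base_url(row.get("base_url"))
    ]
    return valid if valid else [dict(p) for p in DEFAULT_PROVIDERS]
```

Skipping bad rows instead of raising is what keeps unrelated read endpoints working when a single persisted entry has gone stale.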
## Security Controls
- Privileged APIs are token-gated with bearer auth:
  - `documents` endpoints: user token or admin token
  - `settings` and `processing/logs` endpoints: admin token only
- Development environments can allow tokenless user-role access for document/search routes via `ALLOW_DEVELOPMENT_ANONYMOUS_USER_ACCESS=true`; production remains token-enforced.
- Development CORS allows localhost and RFC1918 private-network origins via regex in addition to explicit `CORS_ORIGINS`, so LAN-hosted frontend access remains functional.
- Authentication fails closed when `ADMIN_API_TOKEN` is not configured and admin access is requested.
- Document preview endpoint blocks inline rendering for script-capable MIME types and forces attachment responses for active content.
- Provider base URLs are validated on settings updates and before outbound model calls:
  - allowlist enforcement (`PROVIDER_BASE_URL_ALLOWLIST`)
  - scheme restrictions (`https` by default)
  - local/private-network blocking and per-request DNS revalidation checks for outbound runtime calls, including the OCR provider path
- Upload and archive safety guards are enforced:
  - `POST /api/v1/documents/upload` requires `Content-Length` and enforces file-count, per-file size, and total request size limits
  - `OPTIONS /api/v1/documents/upload` CORS preflight is excluded from `Content-Length` enforcement
  - ZIP handling enforces limits on member count, per-member uncompressed size, total decompressed size, compression ratio, nesting depth, and per-root descendant fan-out
- Redis queue security checks enforce URL scheme/auth/TLS policy at runtime with production fail-closed defaults.
- Processing logs redact sensitive payload and text fields, and trim endpoints enforce retention caps from runtime config.
- Compose hardening defaults:
  - only `api` and `frontend` publish host ports; `db`, `redis`, and `typesense` stay internal-only
  - `api`, `worker`, and `frontend` drop all Linux capabilities and set `no-new-privileges`
  - backend and frontend containers run as non-root users by default
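
The provider base URL checks listed above (allowlist, scheme, private-network blocking with resolution at call time) might look roughly like the sketch below. It is not the backend's code: the function name, its parameters, and their mapping to `PROVIDER_BASE_URL_ALLOWLIST`, `PROVIDER_BASE_URL_ALLOW_HTTP`, and `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK` are assumptions, as is treating an empty allowlist as "no host restriction". The `resolver` argument exists only so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def check_provider_base_url(url, allowlist=(), allow_http=False,
                            allow_private=False, resolver=socket.getaddrinfo):
    """Return a list of policy violations for url; an empty list means allowed."""
    try:
        parts = urlsplit(url)
        port = parts.port  # accessing .port raises ValueError on a bad port
    except ValueError:
        return ["malformed URL"]
    host = parts.hostname
    if not host:
        return ["missing hostname"]

    problems = []
    allowed_schemes = {"https", "http"} if allow_http else {"https"}
    if parts.scheme not in allowed_schemes:
        problems.append(f"scheme {parts.scheme!r} not allowed")
    # Assumption: an empty allowlist means "no host restriction".
    if allowlist and host not in allowlist:
        problems.append(f"host {host!r} not in allowlist")
    if not allow_private:
        # Resolve at call time so a DNS change since the last check is noticed.
        try:
            infos = resolver(host, port or 443, type=socket.SOCK_STREAM)
        except OSError:
            return problems + [f"cannot resolve {host!r}"]
        for info in infos:
            ip = ipaddress.ip_address(info[4][0])
            if not ip.is_global:
                problems.append(f"{host!r} resolves to non-public address {ip}")
    return problems
```

A check like this still leaves a gap between validation and connection; a stricter design would connect to the exact address that was validated.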
## Validation Checklist
After operational or configuration changes, verify:
- `GET /api/v1/health` is healthy
- frontend can list, upload, and search documents
- processing worker logs show successful task execution
- settings save or reset works and persists after restart
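
The first checklist item can be scripted. The sketch below assumes the API is reachable on the host-bound port `8000` at `localhost` and treats any HTTP 200 response as healthy; the response body is not inspected because its format is not documented here. The function name `is_healthy` is illustrative.

```python
import urllib.request


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True when GET <base_url>/api/v1/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/v1/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "unhealthy")
```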