ledgerdock/doc/operations-and-configuration.md

# Operations And Configuration

## Runtime Services

`docker-compose.yml` defines the runtime stack:
- `db` (Postgres 16, localhost-bound port `5432`)
- `redis` (Redis 7, localhost-bound port `6379`)
- `typesense` (Typesense 29, localhost-bound port `8108`)
- `api` (FastAPI backend, localhost-bound port `8000`)
- `worker` (RQ background worker)
- `frontend` (Vite UI, localhost-bound port `5173`)

## Named Volumes

Persistent volumes:
- `db-data`
- `redis-data`
- `dcm-storage`
- `typesense-data`

Reset all persisted runtime data:

```bash
docker compose down -v
```

## Operational Commands

Start or rebuild stack:

```bash
docker compose up --build -d
```

Stop stack:

```bash
docker compose down
```

Tail logs:

```bash
docker compose logs -f
```

Before running `docker compose`, set explicit API tokens in your shell or in the project `.env` file:

```bash
export ADMIN_API_TOKEN="<random-admin-token>"
export USER_API_TOKEN="<random-user-token>"
```

Compose fails fast if either token variable is missing.
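
One way to produce suitable random values is Python's standard `secrets` module. This is a minimal sketch: the helper names and the 32-byte length are assumptions, not project requirements.

```python
import secrets


def make_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token (32 bytes of entropy is an assumed length)."""
    return secrets.token_urlsafe(num_bytes)


def export_lines() -> list[str]:
    """Build shell export lines for the two variables compose requires."""
    return [
        f'export ADMIN_API_TOKEN="{make_token()}"',
        f'export USER_API_TOKEN="{make_token()}"',
    ]


if __name__ == "__main__":
    print("\n".join(export_lines()))
```

The printed lines can be pasted into a shell, or stripped of `export` and placed in `.env`.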

## Backend Configuration

Settings source:
- Runtime settings class: `backend/app/core/config.py`
- API settings persistence: `backend/app/services/app_settings.py`

Key environment variables used by `api` and `worker` in compose:
- `APP_ENV`
- `DATABASE_URL`
- `REDIS_URL`
- `STORAGE_ROOT`
- `ADMIN_API_TOKEN`
- `USER_API_TOKEN`
- `PUBLIC_BASE_URL`
- `CORS_ORIGINS` (`api` service only)
- `PROVIDER_BASE_URL_ALLOWLIST`
- `PROVIDER_BASE_URL_ALLOW_HTTP`
- `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`
- `TYPESENSE_PROTOCOL`
- `TYPESENSE_HOST`
- `TYPESENSE_PORT`
- `TYPESENSE_API_KEY`
- `TYPESENSE_COLLECTION_NAME`

Selected defaults from `Settings` (`backend/app/core/config.py`):
- `upload_chunk_size = 4194304`
- `max_upload_files_per_request = 50`
- `max_upload_file_size_bytes = 26214400`
- `max_upload_request_size_bytes = 104857600`
- `max_zip_members = 250`
- `max_zip_depth = 2`
- `max_zip_member_uncompressed_bytes = 26214400`
- `max_zip_total_uncompressed_bytes = 157286400`
- `max_zip_compression_ratio = 120.0`
- `max_text_length = 500000`
- `processing_log_max_document_sessions = 20`
- `processing_log_max_unbound_entries = 400`
- `default_openai_model = "gpt-4.1-mini"`
- `default_openai_timeout_seconds = 45`
- `default_summary_model = "gpt-4.1-mini"`
- `default_routing_model = "gpt-4.1-mini"`
- `typesense_timeout_seconds = 120`
- `typesense_num_retries = 0`
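
The ZIP limits in the list above could be enforced along the following lines. This is a hedged sketch, not the backend's implementation: `check_zip` is a hypothetical helper, it inspects only the sizes declared in the archive directory, and it does not cover nested archives (`max_zip_depth`).

```python
import io
import zipfile

# Defaults copied from the list above; the byte values are exact MiB multiples.
MAX_ZIP_MEMBERS = 250
MAX_MEMBER_BYTES = 25 * 1024 * 1024   # 26214400
MAX_TOTAL_BYTES = 150 * 1024 * 1024   # 157286400
MAX_COMPRESSION_RATIO = 120.0


def check_zip(data: bytes) -> None:
    """Raise ValueError if the archive's declared sizes break any limit.

    Hypothetical helper: reads the central directory only and extracts nothing.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = [info for info in archive.infolist() if not info.is_dir()]
        if len(members) > MAX_ZIP_MEMBERS:
            raise ValueError("too many members")
        total = 0
        for info in members:
            if info.file_size > MAX_MEMBER_BYTES:
                raise ValueError(f"member too large: {info.filename}")
            # max(..., 1) avoids division by zero for empty members.
            ratio = info.file_size / max(info.compress_size, 1)
            if ratio > MAX_COMPRESSION_RATIO:
                raise ValueError(f"compression ratio too high: {info.filename}")
            total += info.file_size
            if total > MAX_TOTAL_BYTES:
                raise ValueError("total uncompressed size too large")
```

Declared sizes can be forged, so a production guard would also count bytes while streaming each member and abort once a limit is crossed.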

## Frontend Configuration

Frontend runtime API target and token:
- `VITE_API_BASE` in the `docker-compose.yml` frontend service
- `VITE_API_TOKEN` in the `docker-compose.yml` frontend service (defaults to `USER_API_TOKEN` in compose; set it to `ADMIN_API_TOKEN` when admin-only routes are needed)

Frontend API authentication behavior:
- `frontend/src/lib/api.ts` adds `Authorization: Bearer <VITE_API_TOKEN>` to every API request when `VITE_API_TOKEN` is non-empty
- when `VITE_API_TOKEN` is unset or empty, requests are sent without the header, so unauthenticated endpoints such as `/api/v1/health` keep working
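
A script that calls the API directly could mirror the same rule. The helper below is a Python sketch with a hypothetical name; it is not the code in `frontend/src/lib/api.ts`.

```python
from typing import Optional


def auth_headers(token: Optional[str]) -> dict[str, str]:
    """Return the Authorization header only when a non-empty token is given."""
    if token:
        return {"Authorization": f"Bearer {token}"}
    # No token: send the request without the header.
    return {}
```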

Frontend container runtime behavior:
- the container runs as non-root `node`
- `/app` is owned by `node` in `frontend/Dockerfile` so Vite can create runtime temp config files under `/app`

Frontend local commands:

```bash
cd frontend && npm run dev
cd frontend && npm run build
cd frontend && npm run preview
```

## Settings Persistence

Application-level settings managed from the UI are persisted by the backend settings service:
- file path: `<STORAGE_ROOT>/settings.json`
- endpoints: `/api/v1/settings`, `/api/v1/settings/reset`, `/api/v1/settings/handwriting`

Settings include:
- upload defaults
- display options
- processing-log retention options (`keep_document_sessions`, `keep_unbound_entries`)
- provider configuration
- OCR, summary, and routing task settings
- predefined paths and tags
- handwriting-style clustering settings

Read sanitization is resilient to corrupt persisted provider rows. If a persisted provider entry fails URL validation, the entry is skipped and defaults are used when no valid provider remains. This prevents unrelated read endpoints from failing due to stale invalid provider data.
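
The skip-and-fall-back behavior can be sketched as follows. All names here are hypothetical, the default provider entry is illustrative only, and `is_valid_base_url` is a simplified stand-in for the backend's real URL validation.

```python
from urllib.parse import urlparse

# Hypothetical default, used when no persisted provider survives validation.
DEFAULT_PROVIDERS = [{"id": "default", "base_url": "https://api.openai.com/v1"}]


def is_valid_base_url(url: object) -> bool:
    """Simplified stand-in for real validation: https scheme plus a host."""
    if not isinstance(url, str):
        return False
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse raises on some malformed inputs, e.g. broken IPv6 brackets.
        return False
    return parsed.scheme == "https" and bool(parsed.hostname)


def sanitize_providers(rows: list) -> list:
    """Skip rows that fail validation; fall back to defaults if none remain."""
    valid = [
        row for row in rows
        if isinstance(row, dict) and is_valid_base_url(row.get("base_url"))
    ]
    return valid if valid else [dict(p) for p in DEFAULT_PROVIDERS]
```

Because invalid rows are dropped instead of raising, a stale entry in `settings.json` does not break endpoints that merely read settings.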

Retention settings are used by worker cleanup and by `POST /api/v1/processing/logs/trim` when trim query values are not provided.
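
The fallback from query values to persisted settings might look like the sketch below. The function name is hypothetical, and clamping to the caps (instead of rejecting oversized values) is an assumption; the cap values are the `processing_log_max_*` defaults listed under Backend Configuration.

```python
from typing import Optional

# Caps from the Settings defaults (processing_log_max_* values).
MAX_DOCUMENT_SESSIONS = 20
MAX_UNBOUND_ENTRIES = 400


def resolve_trim_limits(
    persisted: dict,
    keep_document_sessions: Optional[int] = None,
    keep_unbound_entries: Optional[int] = None,
) -> tuple[int, int]:
    """Prefer explicit query values, else persisted settings, clamped to caps."""
    sessions = keep_document_sessions
    if sessions is None:
        sessions = persisted["keep_document_sessions"]
    unbound = keep_unbound_entries
    if unbound is None:
        unbound = persisted["keep_unbound_entries"]
    # Assumed behavior: clamp into [0, cap] rather than raising an error.
    return (
        min(max(sessions, 0), MAX_DOCUMENT_SESSIONS),
        min(max(unbound, 0), MAX_UNBOUND_ENTRIES),
    )
```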

## Security Controls

- Privileged APIs are token-gated with bearer auth:
  - `documents` endpoints: user token or admin token
  - `settings` and `processing/logs` endpoints: admin token only
- Authentication fails closed when `ADMIN_API_TOKEN` is not configured: requests are rejected rather than allowed through.
- Provider base URLs are validated on settings updates and before outbound model calls:
  - allowlist enforcement (`PROVIDER_BASE_URL_ALLOWLIST`)
  - scheme restrictions (`https` by default)
  - local/private-network blocking and per-request DNS revalidation checks for outbound runtime calls
- Upload and archive safety guards are enforced:
  - `POST /api/v1/documents/upload` requires `Content-Length` and enforces file-count, per-file size, and total request size limits
  - `OPTIONS /api/v1/documents/upload` CORS preflight is excluded from `Content-Length` enforcement
  - ZIP member count, per-member uncompressed size, total decompressed size, and compression-ratio guards
- Processing logs redact sensitive payload and text fields, and trim endpoints enforce retention caps from runtime config.
- Compose hardening defaults:
  - host ports bind to `127.0.0.1` unless the `HOST_BIND_IP` override is set
  - `api`, `worker`, and `frontend` drop all Linux capabilities and set `no-new-privileges`
  - backend and frontend containers run as non-root users by default
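
The provider base URL checks listed above (scheme, allowlist, private-network blocking with DNS revalidation) can be illustrated with a short sketch. The function and its parameters are hypothetical; they loosely mirror `PROVIDER_BASE_URL_ALLOWLIST`, `PROVIDER_BASE_URL_ALLOW_HTTP`, and `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`, and the real backend logic may differ.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def validate_provider_url(
    url: str,
    allowlist: frozenset,
    allow_http: bool = False,
    allow_private_network: bool = False,
) -> None:
    """Raise ValueError if the URL violates scheme, allowlist, or network rules."""
    parsed = urlparse(url)
    allowed_schemes = {"https", "http"} if allow_http else {"https"}
    if parsed.scheme not in allowed_schemes:
        raise ValueError("scheme not allowed")
    host = (parsed.hostname or "").lower()
    if not host:
        raise ValueError("missing host")
    # Assumption: an empty allowlist means "no host restriction".
    if allowlist and host not in allowlist:
        raise ValueError("host not in allowlist")
    if allow_private_network:
        return
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        # Resolve on every call so a DNS change after the settings update is caught.
        results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except OSError as exc:
        raise ValueError("host did not resolve") from exc
    for *_, sockaddr in results:
        # Drop any IPv6 scope id ("fe80::1%eth0") before parsing.
        address = ipaddress.ip_address(sockaddr[0].split("%")[0])
        if address.is_private or address.is_loopback or address.is_link_local:
            raise ValueError("resolves to a local or private address")
```

Calling the function both when settings are saved and immediately before each outbound request is what makes the DNS check a per-request revalidation.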

## Validation Checklist

After operational or configuration changes, verify:
- `GET /api/v1/health` is healthy
- frontend can list, upload, and search documents
- processing worker logs show successful task execution
- settings save or reset works and persists after restart
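
The first checklist item can be scripted. The sketch below assumes the API is reachable at `http://127.0.0.1:8000` (the localhost-bound `api` port) and treats any HTTP 200 response as healthy; the response body format is not assumed.

```python
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health endpoint URL from the API base URL."""
    return base_url.rstrip("/") + "/api/v1/health"


def is_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 5.0) -> bool:
    """Return True when GET /api/v1/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses: unreachable or non-2xx.
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "unhealthy")
```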