ledgerdock/REPORT.md

# Security Production Readiness Report

Date: 2026-03-01
Repository: /Users/bedas/Developer/GitHub/dcm
Review Type: Static security review for production readiness

## Scope
- Backend: FastAPI API, worker queue, settings and model runtime services
- Frontend: React and Vite API client and document preview rendering
- Infrastructure: docker-compose service exposure and secret configuration

## Findings

### Critical

1. Redis queue is exposed without authentication and can be abused for worker job injection.
- Impact: If Redis is reachable by an attacker, crafted job payloads can be injected into the queue. RQ deserializes job payloads with pickle by default, so a consumed job can lead to remote code execution in the worker process and data compromise.
- Exploit path: Reach Redis on port 6379, enqueue crafted RQ jobs into the `dcm` queue, wait for worker consumption.
- Evidence:
  - docker-compose publishes Redis host port: `docker-compose.yml:21`
  - worker consumes from Redis queue directly: `docker-compose.yml:77`
  - queue connection uses bare Redis URL with no auth/TLS: `backend/app/worker/queue.py:15`, `backend/app/worker/queue.py:21`
  - current environment binds services to all interfaces: `.env:1`
- Remediation:
  - Do not publish Redis externally in production.
  - Enforce Redis authentication and TLS.
  - Place Redis on a private network segment with strict ACLs.
  - Treat queue producers as privileged components only.
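
The authentication and TLS remediation could be backed by a startup check on the queue connection URL. The following is a minimal sketch, not the project's code: `check_redis_url` is a hypothetical helper, and it assumes the URL follows the `redis://` / `rediss://` convention used by redis-py, where `rediss` selects TLS.

```python
from typing import List
from urllib.parse import urlparse


def check_redis_url(url: str) -> List[str]:
    """Return a list of problems found in a Redis connection URL (empty if none)."""
    problems = []
    parsed = urlparse(url)
    # "rediss" is the TLS scheme; plain "redis" means cleartext traffic.
    if parsed.scheme != "rediss":
        problems.append("connection is not TLS (expected rediss:// scheme)")
    # No password in the URL means the client connects unauthenticated.
    if not parsed.password:
        problems.append("no password in URL")
    # A wildcard host suggests the URL was copied from a bind address.
    if parsed.hostname in ("0.0.0.0", "::"):
        problems.append("host is a wildcard address")
    return problems


if __name__ == "__main__":
    for candidate in ("redis://redis:6379/0", "rediss://:s3cret@redis.internal:6380/0"):
        print(candidate, "->", check_redis_url(candidate) or "ok")
```

A check like this only validates configuration. It does not replace removing the published host port or isolating Redis on a private network.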

2. Untrusted uploaded content is previewed in an unsandboxed iframe.
- Impact: Stored XSS and active content execution in preview context can enable account action abuse and data exfiltration in the browser.
- Exploit path: Upload active content (for example HTML), open preview, script executes in iframe without sandbox constraints.
- Evidence:
  - upload endpoint accepts generic uploaded files: `backend/app/api/routes_documents.py:493`
  - MIME type is derived from bytes and persisted: `backend/app/api/routes_documents.py:530`
  - preview endpoint returns original bytes inline with stored media type: `backend/app/api/routes_documents.py:449`, `backend/app/api/routes_documents.py:457`
  - frontend renders preview in iframe without sandbox attribute: `frontend/src/components/DocumentViewer.tsx:486`
  - preview source is a blob URL created from fetched content: `frontend/src/components/DocumentViewer.tsx:108`, `frontend/src/components/DocumentViewer.tsx:113`
- Remediation:
  - Block inline preview for script-capable MIME types.
  - Add strict iframe sandboxing if iframe preview remains required.
  - Prefer force-download for active formats.
  - Serve untrusted preview content from an isolated origin with restrictive CSP.
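
The "block inline preview" and "force-download" remediations can be sketched as a single decision on the stored media type. This sketch deliberately uses an allowlist of passive types rather than a blocklist of script-capable ones, which is a stricter variant of the remediation. The set contents and the name `preview_disposition` are assumptions for illustration, not taken from the codebase.

```python
# Assumed allowlist of media types considered safe to render inline.
# Anything not listed (HTML, SVG, XML, unknown types) is forced to download.
INLINE_SAFE_MIME_TYPES = frozenset({
    "application/pdf",
    "image/png",
    "image/jpeg",
    "image/gif",
    "image/webp",
    "text/plain",
})


def preview_disposition(stored_mime: str) -> str:
    """Return "inline" for allowlisted passive types, otherwise "attachment"."""
    # Drop parameters such as "; charset=utf-8" and normalize case before comparing.
    mime = stored_mime.split(";", 1)[0].strip().lower()
    return "inline" if mime in INLINE_SAFE_MIME_TYPES else "attachment"


if __name__ == "__main__":
    for mime in ("application/pdf", "text/html; charset=utf-8", "image/svg+xml"):
        print(mime, "->", preview_disposition(mime))
```

Since the frontend renders a blob URL built from fetched bytes, a server-side decision like this would likely need to be mirrored in the client or combined with the iframe `sandbox` attribute to be effective.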

### High

1. Frontend distributes a bearer token to all clients.
- Impact: Any user with browser access can extract the token and replay authenticated calls, preventing per-user accountability and increasing blast radius.
- Exploit path: Read token from frontend runtime environment or request headers, replay API requests with Authorization header.
- Evidence:
  - frontend consumes token from public Vite env: `frontend/src/lib/api.ts:24`
  - token is attached to every request when present: `frontend/src/lib/api.ts:38`
  - compose passes `VITE_API_TOKEN` from user token: `docker-compose.yml:115`
  - privileged routes rely on static token role checks: `backend/app/api/router.py:19`, `backend/app/api/auth.py:47`, `backend/app/api/auth.py:51`
- Remediation:
  - Replace shared static token model with per-user authentication.
  - Keep secrets server-side only.
  - Use short-lived credentials with rotation and revocation.
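
To illustrate what "short-lived credentials" means in practice, the sketch below issues and verifies an HMAC-signed token that carries a user identifier and an expiry time. It is an illustration built only on the standard library, under the assumption that a server-side secret exists. The names `issue_token` and `verify_token` are hypothetical, and a production system would more likely rely on an established authentication library and add revocation.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def issue_token(secret: bytes, user_id: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Create a token of the form base64(user_id:expiry).base64(hmac)."""
    issued = time.time() if now is None else now
    payload = f"{user_id}:{int(issued) + ttl_seconds}".encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload) + b"." +
            base64.urlsafe_b64encode(sig)).decode()


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired, else None."""
    try:
        payload_b64, sig_b64 = token.encode().split(b".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        # Covers a wrong number of parts and malformed base64.
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison; the payload is only parsed after it is authenticated.
    if not hmac.compare_digest(sig, expected):
        return None
    user_id, expires = payload.decode().rsplit(":", 1)
    current = time.time() if now is None else now
    if current >= int(expires):
        return None
    return user_id
```

The signing secret stays on the server. Only the signed, expiring token reaches the browser, so each token is attributable to one user and stops working after its time-to-live.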

2. Default and static service secrets are present in deploy config.
- Impact: If service ports are exposed, predictable credentials and keys allow unauthorized access to data services.
- Exploit path: Connect to published Postgres or Typesense ports and authenticate with known static values.
- Evidence:
  - static Postgres credentials: `docker-compose.yml:5`, `docker-compose.yml:6`
  - static Typesense key in compose and runtime env: `docker-compose.yml:29`, `docker-compose.yml:55`, `docker-compose.yml:93`
  - database and Typesense ports are published to host: `docker-compose.yml:9`, `docker-compose.yml:32`
  - current environment uses placeholder tokens: `.env:2`, `.env:3`, `.env:4`
- Remediation:
  - Use high-entropy secrets managed outside repository configuration.
  - Remove unnecessary host port publications in production.
  - Restrict service network access to trusted internal components.
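
The high-entropy secret remediation can be sketched with Python's `secrets` module, paired with a startup guard that rejects short or placeholder values. The placeholder list and the 32-character minimum are assumptions for illustration. The real placeholder values would come from the deploy configuration.

```python
import secrets

# Hypothetical placeholder values; replace with the ones actually found in config.
KNOWN_PLACEHOLDERS = frozenset({"changeme", "password", "postgres", "secret", "xyz"})


def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string derived from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


def is_weak_secret(value: str, min_length: int = 32) -> bool:
    """Flag values that are too short or match a known placeholder."""
    return len(value) < min_length or value.strip().lower() in KNOWN_PLACEHOLDERS


if __name__ == "__main__":
    print("example generated secret length:", len(generate_secret()))
    print("'changeme' weak?", is_weak_secret("changeme"))
```

Generated values would then be supplied through a secret manager or deployment-time environment injection rather than committed to the repository.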

3. ZIP recursion depth control is not enforced across queued descendants.
- Impact: Nested archives can create uncontrolled fan-out, causing CPU, queue, and storage exhaustion.
- Exploit path: Upload ZIP containing ZIPs; children are queued as independent documents without inherited depth, repeating recursively.
- Evidence:
  - configured depth limit exists: `backend/app/core/config.py:28`
  - extractor takes a depth argument but is called without propagation: `backend/app/services/extractor.py:302`, `backend/app/services/extractor.py:306`
  - worker invokes extractor without depth context: `backend/app/worker/tasks.py:122`
  - worker enqueues child archive jobs recursively: `backend/app/worker/tasks.py:225`, `backend/app/worker/tasks.py:226`
- Remediation:
  - Persist and propagate archive depth per document lineage.
  - Enforce absolute descendant and fan-out limits per root upload.
  - Reject nested archives beyond configured depth.
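
The depth and fan-out remediations above can be sketched as a budget object shared by every document descended from one root upload. This is a minimal in-memory illustration. `ArchiveBudget` and `ArchiveLimitExceeded` are hypothetical names, and in the queued worker the depth and descendant count would need to be persisted with each document and passed into every child job.

```python
from dataclasses import dataclass


class ArchiveLimitExceeded(Exception):
    """Raised when a nested archive would exceed the configured limits."""


@dataclass
class ArchiveBudget:
    """Limits shared by all descendants of a single root upload."""
    max_depth: int
    max_descendants: int
    descendants: int = 0

    def admit_child(self, parent_depth: int) -> int:
        """Account for one child archive and return the depth it inherits."""
        child_depth = parent_depth + 1
        if child_depth > self.max_depth:
            raise ArchiveLimitExceeded(
                f"depth {child_depth} exceeds limit {self.max_depth}")
        if self.descendants + 1 > self.max_descendants:
            raise ArchiveLimitExceeded(
                f"descendant limit {self.max_descendants} reached")
        self.descendants += 1
        return child_depth


if __name__ == "__main__":
    budget = ArchiveBudget(max_depth=2, max_descendants=100)
    depth = budget.admit_child(parent_depth=0)   # root archive contains a ZIP
    depth = budget.admit_child(parent_depth=depth)
    try:
        budget.admit_child(parent_depth=depth)   # third level is rejected
    except ArchiveLimitExceeded as exc:
        print("rejected:", exc)
```

The key point is that the child job receives the returned depth instead of starting again from zero, which is the propagation the evidence shows to be missing.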

### Medium

1. The OCR provider path does not apply DNS revalidation equivalent to that of the model runtime path.
- Impact: Under permissive network flags, SSRF defenses can be weakened by DNS rebinding on OCR traffic.
- Exploit path: Persist provider URL that passes initial checks, then rebind DNS to private target before OCR requests.
- Evidence:
  - task model runtime enforces `resolve_dns=True`: `backend/app/services/model_runtime.py:41`
  - provider normalization in app settings does not pass DNS revalidation flag: `backend/app/services/app_settings.py:253`
  - OCR runtime uses persisted URL for client base URL: `backend/app/services/app_settings.py:891`, `backend/app/services/handwriting.py:159`
- Remediation:
  - Apply DNS revalidation before outbound OCR requests or on every runtime load.
  - Disallow private network egress by default and require explicit controlled exceptions.
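
A DNS revalidation step of the kind recommended above might look like the following sketch, which resolves the provider host immediately before use and rejects any non-public address. The function names are hypothetical and do not reflect the existing `resolve_dns` implementation. The resolver is injectable so the check can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, List


def resolve_host(host: str) -> List[str]:
    """Resolve a hostname to its IP address strings via the system resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def is_public_address(addr: str) -> bool:
    """Return True only for globally routable addresses."""
    # Strip an IPv6 zone identifier such as "%eth0" before parsing.
    ip = ipaddress.ip_address(addr.split("%", 1)[0])
    # Judge IPv4-mapped IPv6 addresses by the embedded IPv4 address.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return ip.is_global


def assert_public_host(host: str,
                       resolver: Callable[[str], List[str]] = resolve_host) -> List[str]:
    """Resolve host now and raise ValueError if any address is non-public."""
    addresses = resolver(host)
    if not addresses:
        raise ValueError(f"{host} did not resolve to any address")
    for addr in addresses:
        if not is_public_address(addr):
            raise ValueError(f"{host} resolves to non-public address {addr}")
    return addresses
```

A check like this narrows the rebinding window but does not close it unless the outbound connection is made to one of the validated addresses rather than re-resolving the hostname.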

2. Provider API keys are persisted in plaintext settings on storage volume.
- Impact: File system or backup compromise reveals upstream provider secrets.
- Exploit path: Read persisted settings file from storage volume or backup artifact.
- Evidence:
  - settings file location under storage root: `backend/app/services/app_settings.py:133`
  - provider payload includes plaintext `api_key`: `backend/app/services/app_settings.py:268`
  - settings payload is written to disk as JSON: `backend/app/services/app_settings.py:680`, `backend/app/services/app_settings.py:685`
  - OCR settings read returns stored API key value for runtime: `backend/app/services/app_settings.py:894`
- Remediation:
  - Move provider secrets to dedicated secret management.
  - If local persistence is unavoidable, encrypt sensitive fields at rest and restrict file permissions.
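
For the "restrict file permissions" part of the remediation, the sketch below writes a settings file that only the owning user can read. It assumes a POSIX filesystem, and `write_settings_private` is a hypothetical helper. It does not encrypt the payload, so it addresses only the permissions half of the recommendation.

```python
import json
import os


def write_settings_private(path: str, payload: dict) -> None:
    """Write JSON settings atomically with owner-only permissions (mode 0600)."""
    tmp_path = f"{path}.tmp"
    # O_EXCL fails if the temp file already exists, so a pre-created file
    # with looser permissions is never reused.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
    except BaseException:
        os.unlink(tmp_path)
        raise
    # Atomic rename: readers see either the old file or the complete new one.
    os.replace(tmp_path, path)
```

Backups of the storage volume would still contain the plaintext key, which is why the report lists dedicated secret management as the preferred option.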

### Low

1. Frontend dependency is floating on latest.
- Impact: Non-deterministic installs and elevated supply chain drift risk.
- Exploit path: Fresh install resolves a newer unreviewed dependency release.
- Evidence:
  - dependency is declared with the floating `latest` tag: `frontend/package.json:13`
- Remediation:
  - Pin exact versions and update through controlled dependency review.
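
A pinning policy like this could be enforced in CI with a small check over `package.json`. The sketch below is an assumption about how such a gate might look, not an existing script in the repository. It treats only exact `MAJOR.MINOR.PATCH` versions (with optional pre-release or build suffix) as pinned and reports everything else, including `latest`, ranges, and URLs.

```python
import json
import re
from typing import Dict

# Exact semantic version: 1.2.3, optionally with -prerelease and/or +build.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$")


def find_unpinned(package_json_text: str) -> Dict[str, str]:
    """Return {name: spec} for dependencies not declared as an exact version."""
    manifest = json.loads(package_json_text)
    unpinned = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                unpinned[name] = spec
    return unpinned


if __name__ == "__main__":
    sample = json.dumps({
        "dependencies": {"react": "18.2.0", "example-lib": "latest"},
        "devDependencies": {"vite": "^5.0.0"},
    })
    print(find_unpinned(sample))
```

Exact versions in the manifest complement, rather than replace, a committed lockfile, which also fixes transitive dependency versions.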

## Validation Commands and Outcomes
- `/Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_security_controls.py`
  - Outcome: passed, 13 tests.
- `/Users/bedas/Developer/Python/global_venv/bin/python -m unittest discover -s backend/tests -p 'test_*.py'`
  - Outcome: passed, 24 tests.

## Coverage and Residual Risk
- Coverage:
  - Authentication and authorization controls.
  - Document upload and preview data flow.
  - Worker queue and archive processing path.
  - Provider configuration and outbound request handling.
  - Docker service exposure and secret defaults.
- Residual risk and limits:
  - Static analysis only, no live penetration testing executed.
  - Perimeter controls (reverse proxy, firewall, WAF, TLS topology) were not verifiable from repository state.
  - Dependency CVE scanning was not executed in this review pass.

## Delegation Report
- Primary owner by package:
  - Security findings package: `security_reviewer` subagent, consolidated and validated by main thread.
  - Repository reconnaissance package: main thread fallback after `explorer` interruption.
  - Report authoring package: main thread.
- Agents invoked:
  - `security_reviewer` (completed)
  - `explorer` (interrupted)
  - `awaiter` (completed validation command execution)
- Skills activated:
  - `secure-delivery-gates`
  - `documentation-standards`
- Required delegations not used and reason:
  - `explorer` was required as the final reconnaissance owner but was unavailable due to a runtime interruption, so the main thread performed direct source reconnaissance as a fallback.