# Security Production Readiness Report
Date: 2026-03-01
Repository: /Users/bedas/Developer/GitHub/dcm
Review type: Static code and configuration review (no runtime penetration testing)
## Scope
- Backend API and worker: `backend/app`
- Frontend API client/auth transport: `frontend/src`
- Compose and environment defaults: `docker-compose.yml`, `.env`
## Method and Limits
- Reviewed source and configuration files in the current checkout.
- Verified findings with direct file evidence.
- Did not run dynamic security testing, dependency CVE scanning, or infrastructure perimeter testing.
## Confirmed Product Security Findings
### Critical
1. Browser-exposed shared bearer token path (`VITE_API_TOKEN` fallback)
- Severity: Critical
- Why this is a product issue: The frontend supports a build-time token fallback (`VITE_API_TOKEN`) and attaches it to every API request. Vite inlines `VITE_`-prefixed variables into the client bundle, so every browser that loads the app receives the same shared credential.
- Impact: Any user with browser access can recover and reuse the token, collapsing auth boundaries and auditability.
- Exploit path: Open app -> inspect runtime/bundle or intercepted request -> replay bearer token against protected API endpoints.
- Evidence:
  - `frontend/src/lib/api.ts:39`
  - `frontend/src/lib/api.ts:98`
  - `frontend/src/lib/api.ts:111`
  - `frontend/src/lib/api.ts:155`
  - `docker-compose.yml:123`
  - `backend/app/api/router.py:25`
  - `backend/app/api/router.py:37`
- Production recommendation:
  - Remove browser-side static token fallback.
  - Use per-user server-issued auth (session or short-lived JWT) with role-bound authorization.
### High
1. CORS policy is effectively any HTTP/HTTPS origin, with credentials enabled
- Severity: High
- Why this is a product issue: CORS middleware enables `allow_origin_regex` that matches broad web origins and sets `allow_credentials=True`.
- Impact: If the browser holds credentials for the API, any website the user visits can send credentialed cross-origin requests and read the responses. A leaked bearer token can likewise be used from any origin.
- Exploit path: Victim visits a malicious origin -> page sends cross-origin requests with credentials included -> permissive CORS response headers let the page read the API responses.
- Evidence:
  - `backend/app/main.py:21`
  - `backend/app/main.py:41`
  - `backend/app/main.py:42`
  - `backend/app/main.py:44`
- Production recommendation:
  - Replace regex-based broad origin acceptance with explicit trusted origin allowlist.
  - Keep `allow_credentials=False` unless strictly required for cookie-based flows.
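
The difference between a broad origin regex and an explicit allowlist can be illustrated with a small standalone check. The pattern and the origins below are assumptions chosen for illustration; the actual regex in `backend/app/main.py` is not reproduced here.

```python
import re

# Illustrative pattern of the kind the finding describes: any http(s) origin matches.
BROAD_ORIGIN_REGEX = re.compile(r"https?://.*")

# Hypothetical trusted origins; replace with the deployment's real frontends.
TRUSTED_ORIGINS = frozenset({
    "https://app.example.com",
    "https://admin.example.com",
})


def allowed_by_regex(origin: str) -> bool:
    """Broad policy: accepts every origin that looks like a web URL."""
    return BROAD_ORIGIN_REGEX.fullmatch(origin) is not None


def allowed_by_allowlist(origin: str) -> bool:
    """Strict policy: exact match against a fixed set of trusted origins."""
    return origin in TRUSTED_ORIGINS


# With Starlette/FastAPI's CORSMiddleware, the strict policy corresponds to
# passing allow_origins=sorted(TRUSTED_ORIGINS) and allow_credentials=False,
# and omitting allow_origin_regex.
```

Exact matching also rejects lookalike hosts such as `https://app.example.com.evil.net`, which a loosely anchored pattern can let through.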
### Medium
1. Sensitive processing content is persisted in logs by default
- Severity: Medium
- Why this is a product issue: Pipeline logging records OCR text, extraction text, prompts, and LLM outputs into persistent processing logs.
- Impact: Increased confidentiality risk and larger data-retention blast radius if logs are queried or exfiltrated.
- Exploit path: Access to admin log endpoints or database allows retrieval of sensitive operational content.
- Evidence:
  - `backend/app/worker/tasks.py:619`
  - `backend/app/worker/tasks.py:638`
  - `backend/app/services/routing_pipeline.py:789`
  - `backend/app/services/routing_pipeline.py:802`
  - `backend/app/services/routing_pipeline.py:814`
  - `backend/app/core/config.py:45`
- Production recommendation:
  - Default to metadata-only logs.
  - Disable persistent storage of prompt/response/raw extracted text unless temporary debug mode is explicitly enabled with strict TTL.
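
One way to read "metadata-only logs" is to replace content-bearing fields with their size and a digest before a log record is persisted. The sketch below assumes hypothetical field names (`ocr_text`, `extracted_text`, `prompt`, `llm_output`) and a hypothetical `debug_capture` switch; the real log schema may differ.

```python
import hashlib
from typing import Any, Dict

# Hypothetical names of fields that carry document content, prompts, or model output.
SENSITIVE_FIELDS = frozenset({"ocr_text", "extracted_text", "prompt", "llm_output"})


def to_metadata_only(event: Dict[str, Any], debug_capture: bool = False) -> Dict[str, Any]:
    """Return a copy of a log event that is safe to persist.

    Sensitive string fields are replaced by their character count and a
    SHA-256 digest, which still allows correlation and size monitoring.
    When debug_capture is True the event is copied unchanged; the caller is
    expected to gate that flag and expire such records quickly.
    """
    if debug_capture:
        return dict(event)
    safe: Dict[str, Any] = {}
    for key, value in event.items():
        if key in SENSITIVE_FIELDS and isinstance(value, str):
            safe[f"{key}_chars"] = len(value)
            safe[f"{key}_sha256"] = hashlib.sha256(value.encode("utf-8")).hexdigest()
        else:
            safe[key] = value
    return safe
```

A digest of a very short or guessable string can be brute-forced, so where even that is a concern the digest can be dropped and only the length kept.
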
2. Markdown export endpoint is unbounded and memory-amplifiable
- Severity: Medium
- Why this is a product issue: The export loads all matching documents and builds the ZIP archive in memory with `BytesIO`, with no hard limit on selection size.
- Impact: Authenticated users can trigger high memory use and service degradation.
- Exploit path: Repeated wide `path_prefix` exports cause large in-memory archive construction.
- Evidence:
  - `backend/app/api/routes_documents.py:402`
  - `backend/app/api/routes_documents.py:412`
  - `backend/app/api/routes_documents.py:416`
  - `backend/app/api/routes_documents.py:418`
  - `backend/app/api/routes_documents.py:421`
  - `backend/app/api/routes_documents.py:425`
- Production recommendation:
  - Enforce max export document count and total bytes.
  - Stream archive generation to temp files.
  - Add endpoint rate limiting.
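
The export recommendations (hard limits plus archive generation backed by a temp file) could be combined along these lines. This is a sketch under stated assumptions: `build_export_archive`, `ExportLimitExceeded`, and the limit values are hypothetical, and documents are modelled as `(archive_name, markdown_text)` pairs rather than the project's ORM objects.

```python
import tempfile
import zipfile
from typing import IO, Iterable, Tuple

MAX_EXPORT_DOCUMENTS = 500            # assumed limit
MAX_EXPORT_BYTES = 50 * 1024 * 1024   # assumed limit on uncompressed bytes


class ExportLimitExceeded(Exception):
    """Raised when a requested export is larger than the configured limits."""


def build_export_archive(
    documents: Iterable[Tuple[str, str]],
    max_documents: int = MAX_EXPORT_DOCUMENTS,
    max_bytes: int = MAX_EXPORT_BYTES,
) -> IO[bytes]:
    """Write documents into a ZIP stored in a temp file and return the file.

    Only one document is held in memory at a time, and the export is aborted
    as soon as either the document count or the byte budget is exceeded.
    The caller is responsible for closing the returned file.
    """
    archive_file = tempfile.TemporaryFile()
    total_bytes = 0
    try:
        with zipfile.ZipFile(archive_file, "w", zipfile.ZIP_DEFLATED) as archive:
            for count, (name, text) in enumerate(documents, start=1):
                data = text.encode("utf-8")
                total_bytes += len(data)
                if count > max_documents:
                    raise ExportLimitExceeded(f"more than {max_documents} documents")
                if total_bytes > max_bytes:
                    raise ExportLimitExceeded(f"more than {max_bytes} bytes")
                archive.writestr(name, data)
    except Exception:
        archive_file.close()  # discard the partial archive
        raise
    archive_file.seek(0)
    return archive_file
```

Passing a lazy iterable (for example a generator over a database cursor) keeps the document list itself from being materialized in memory. Rate limiting is a separate control and is not shown here.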
## Risks Requiring Product Decision or Further Verification
1. Authorization model appears role-based without per-document ownership boundaries
- Evidence:
  - `backend/app/models/document.py:29`
  - `backend/app/api/router.py:19`
  - `backend/app/api/router.py:31`
- Question: Is this intentionally single-operator, or should production support multi-user/tenant data isolation?
2. Worker startup command uses the raw Redis URL string and bypasses the in-code URL security validator
- Evidence:
  - `docker-compose.yml:81`
  - `backend/app/worker/queue.py:15`
- Question: Should worker startup also enforce `validate_redis_url_security` before consuming jobs?
3. Provider key encryption uses custom cryptographic construction
- Evidence:
  - `backend/app/services/app_settings.py:131`
  - `backend/app/services/app_settings.py:154`
  - `backend/app/services/app_settings.py:176`
- Question: Do compliance or internal policy requirements demand standardized AEAD primitives from a vetted cryptography library?
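
For the worker startup question in item 2, a fail-fast check before the worker consumes jobs might look like the sketch below. The rules of the project's `validate_redis_url_security` are not shown in this report, so `check_redis_url` is a hypothetical stand-in with assumed rules (TLS and a password required when `APP_ENV` is `production`).

```python
from typing import List
from urllib.parse import urlsplit


def check_redis_url(url: str, app_env: str) -> List[str]:
    """Return policy violations for a Redis URL; an empty list means acceptable.

    The rules here are assumptions for illustration, not the project's policy.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        return [f"unsupported scheme: {parts.scheme or '(none)'}"]
    problems: List[str] = []
    if app_env == "production":
        if parts.scheme != "rediss":
            problems.append("production requires TLS (rediss://)")
        if not parts.password:
            problems.append("production requires a password in the URL")
    return problems


def require_safe_redis_url(url: str, app_env: str) -> str:
    """Raise ValueError at startup instead of connecting with a rejected URL."""
    problems = check_redis_url(url, app_env)
    if problems:
        raise ValueError("refusing to start worker: " + "; ".join(problems))
    return url
```

Calling such a check from the worker entry point, rather than only from the API process, would close the gap in which the compose command passes the URL straight to the worker.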
## User-Managed Configuration Observations (Not Product Defects)
These are deployment/operator choices and should be tracked separately from code defects.
1. Development-mode posture in local `.env`
- Evidence:
  - `.env:1`
  - `.env:3`
- Notes: `APP_ENV=development` and anonymous development access are enabled.
2. Local `.env` includes placeholder shared API token values
- Evidence:
  - `.env:15`
  - `.env:16`
  - `.env:31`
- Notes: If the placeholders are replaced with real values and reused, operational risk increases. This is an operator responsibility.
3. Compose defaults allow permissive provider egress controls
- Evidence:
  - `docker-compose.yml:51`
  - `docker-compose.yml:52`
  - `.env:21`
  - `.env:22`
  - `.env:23`
- Notes: Allowing HTTP/private-network provider targets is a deployment policy choice.
4. Internal service transport defaults are plaintext in local stack
- Evidence:
  - `docker-compose.yml:56`
  - `.env:11`
- Notes: Plaintext `http://` and `redis://` transports may be acceptable for isolated local development, but not for exposed production networks.
## Production Readiness Priority Order
1. Remove browser static token model and adopt per-user auth.
2. Tighten CORS to explicit trusted origins only.
3. Reduce persistent sensitive logging to metadata by default.
4. Add hard limits and streaming behavior for markdown export.
5. Resolve product decisions on tenant isolation, worker Redis security enforcement, and cryptography standardization.