# Security Audit Report
Date: 2026-02-21
Repository: /Users/bedas/Developer/GitHub/dcm
Audit type: Static, read-only code and configuration review
## Scope
- Backend API, worker, extraction and routing pipeline, settings handling, and storage interactions.
- Frontend dependency posture.
- Docker runtime and service exposure.
## Method
- File-level inspection with targeted code tracing for authn/authz, input validation, upload and archive processing, outbound network behavior, secret handling, logging, and deployment hardening.
- No runtime penetration testing was performed.
## Findings
### 1) Critical - Missing authentication and authorization on privileged API routes
- Impact: Any reachable client can access document, settings, and log-management functionality.
- Evidence:
- `backend/app/main.py:29`
- `backend/app/api/router.py:14`
- `backend/app/api/routes_documents.py:464`
- `backend/app/api/routes_documents.py:666`
- `backend/app/api/routes_settings.py:148`
- `backend/app/api/routes_processing_logs.py:22`
- Recommendation:
- Enforce authentication globally for non-health routes.
- Add per-endpoint authorization checks for read/update/delete/admin actions.
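
The global check recommended above can be sketched in a framework-independent way. This is a minimal illustration, not the project's implementation: the `PUBLIC_PATHS` set, the bearer-token scheme, and the function names are assumptions, and header names are assumed to be lowercased by the caller.

```python
import hmac
from typing import Mapping, Optional

# Hypothetical set of routes that stay public; the report only says "non-health routes".
PUBLIC_PATHS = frozenset({"/health"})


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, or None."""
    value = headers.get("authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def is_request_allowed(path: str, headers: Mapping[str, str], expected_token: str) -> bool:
    """Allow public paths; every other path needs a matching bearer token."""
    if path in PUBLIC_PATHS:
        return True
    token = extract_bearer_token(headers)
    # An empty expected token must never authenticate anyone.
    if token is None or not expected_token:
        return False
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

A shared token only answers "is this caller authenticated"; the per-endpoint authorization in the second recommendation still needs a role or permission check on top of it.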
### 2) Critical - SSRF and data exfiltration risk via configurable model provider base URL
- Impact: An attacker can redirect model calls to attacker-controlled or internal hosts and exfiltrate document-derived content.
- Evidence:
- `backend/app/api/routes_settings.py:148`
- `backend/app/schemas/settings.py:24`
- `backend/app/services/app_settings.py:249`
- `backend/app/services/model_runtime.py:144`
- `backend/app/services/model_runtime.py:170`
- `backend/app/worker/tasks.py:505`
- `backend/app/services/routing_pipeline.py:803`
- Recommendation:
- Restrict provider endpoints to an allowlist.
- Validate URL scheme and block private/link-local destinations.
- Protect settings updates behind strict admin authorization.
- Enforce outbound egress controls at runtime.
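
As an illustration of the allowlist and destination checks above, the following sketch validates a provider base URL before it is stored or used. The host `api.provider.example`, the function names, and the HTTPS-only rule are assumptions made for the example; the optional DNS check does not replace egress controls, because a hostname can resolve differently at request time.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hypothetical allowlist; real provider hosts would come from deployment config.
ALLOWED_PROVIDER_HOSTS = frozenset({"api.provider.example"})


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses (not private, loopback, link-local)."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    # Unwrap IPv4-mapped IPv6 such as ::ffff:10.0.0.1 before classifying.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return ip.is_global


def is_provider_url_allowed(url: str, resolve_dns: bool = False) -> bool:
    """Accept only https URLs without credentials whose host is allowlisted."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
        port = parts.port or 443
    except ValueError:
        return False
    if parts.scheme != "https" or not host:
        return False
    if parts.username or parts.password:
        return False
    if host not in ALLOWED_PROVIDER_HOSTS:
        return False
    if resolve_dns:
        try:
            infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        except OSError:
            return False
        # Every resolved address must be public, not just the first one.
        return all(is_public_ip(info[4][0]) for info in infos)
    return True
```

Running this check both when settings are saved and immediately before each outbound call narrows the window in which a changed DNS record could redirect traffic.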
### 3) High - Unbounded upload and archive extraction can cause memory/disk denial of service
- Impact: Oversized files or compressed archive bombs can exhaust API/worker resources.
- Evidence:
- `backend/app/api/routes_documents.py:486`
- `backend/app/services/extractor.py:309`
- `backend/app/services/extractor.py:312`
- `backend/app/worker/tasks.py:122`
- `backend/app/core/config.py:20`
- Recommendation:
- Enforce request and file size limits.
- Stream uploads and extraction where possible.
- Cap total uncompressed archive size and per-entry size.
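
The per-entry and total caps can be enforced while reading, as in the sketch below. It assumes ZIP archives handled with the standard-library `zipfile` module; the limit values and the names `read_zip_with_limits` and `ArchiveLimitError` are hypothetical. It counts bytes actually decompressed rather than trusting the sizes declared in the archive headers.

```python
import io
import zipfile
from typing import Dict

# Hypothetical limits; tune them to the deployment.
MAX_ENTRIES = 1000
MAX_ENTRY_BYTES = 50 * 1024 * 1024
MAX_TOTAL_BYTES = 200 * 1024 * 1024
CHUNK_BYTES = 64 * 1024


class ArchiveLimitError(ValueError):
    """Raised when an archive exceeds a configured resource limit."""


def read_zip_with_limits(
    data: bytes,
    max_entries: int = MAX_ENTRIES,
    max_entry_bytes: int = MAX_ENTRY_BYTES,
    max_total_bytes: int = MAX_TOTAL_BYTES,
) -> Dict[str, bytes]:
    """Extract a ZIP into memory, aborting as soon as any limit is exceeded."""
    extracted: Dict[str, bytes] = {}
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        infos = archive.infolist()
        if len(infos) > max_entries:
            raise ArchiveLimitError("archive has too many entries")
        for info in infos:
            if info.is_dir():
                continue
            size = 0
            chunks = []
            with archive.open(info) as member:
                while True:
                    # Read in bounded chunks so a bomb is caught early.
                    chunk = member.read(CHUNK_BYTES)
                    if not chunk:
                        break
                    size += len(chunk)
                    total += len(chunk)
                    if size > max_entry_bytes:
                        raise ArchiveLimitError(f"entry too large: {info.filename}")
                    if total > max_total_bytes:
                        raise ArchiveLimitError("archive too large when uncompressed")
                    chunks.append(chunk)
            extracted[info.filename] = b"".join(chunks)
    return extracted
```

Memory use here is bounded by `max_total_bytes`; writing each entry to a temporary file instead of collecting chunks would follow the streaming recommendation more closely.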
### 4) High - Sensitive data logged and exposed through unauthenticated log endpoints
- Impact: Extracted text, prompts, and model outputs may be retrievable by unauthorized callers.
- Evidence:
- `backend/app/models/processing_log.py:31`
- `backend/app/models/processing_log.py:32`
- `backend/app/services/routing_pipeline.py:803`
- `backend/app/services/routing_pipeline.py:814`
- `backend/app/worker/tasks.py:479`
- `backend/app/schemas/processing_logs.py:21`
- `backend/app/api/routes_processing_logs.py:22`
- Recommendation:
- Require admin authorization for log endpoints.
- Remove or redact sensitive payloads from logs.
- Reduce retention for operational logs that may include sensitive context.
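
One way to apply the redaction recommendation is to scrub payloads before they are persisted. The sketch below is illustrative only: the key names in `SENSITIVE_KEYS` are assumptions and would need to match the fields the processing log actually stores.

```python
from typing import Any

# Hypothetical field names; align these with the real log payload schema.
SENSITIVE_KEYS = frozenset({"prompt", "extracted_text", "model_output"})


def _placeholder(value: Any) -> str:
    """Replace a sensitive value with a marker that keeps only its length."""
    text = value if isinstance(value, str) else repr(value)
    return f"[redacted: {len(text)} chars]"


def redact_payload(payload: Any) -> Any:
    """Return a copy of payload with sensitive keys redacted at any nesting depth."""
    if isinstance(payload, dict):
        return {
            key: _placeholder(value) if key in SENSITIVE_KEYS else redact_payload(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact_payload(item) for item in payload]
    return payload
```

Because the function returns a new structure, the caller can still pass the original payload to the model while only the redacted copy reaches storage.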
### 5) High - Internal services exposed with weak default posture in Docker Compose
- Impact: Exposed Redis/Postgres/Typesense can enable data compromise and queue abuse.
- Evidence:
- `docker-compose.yml:5`
- `docker-compose.yml:6`
- `docker-compose.yml:9`
- `docker-compose.yml:21`
- `docker-compose.yml:29`
- `docker-compose.yml:32`
- `docker-compose.yml:68`
- `backend/app/worker/queue.py:15`
- `backend/app/core/config.py:34`
- Recommendation:
- Remove unnecessary host port exposure for internal services.
- Use strong credentials and segment internal services behind network ACLs.
- Enable authentication and transport protections for stateful services.
### 6) Medium - Plaintext secrets and weak defaults in configuration paths
- Impact: Credentials and API keys can be exposed from source or storage.
- Evidence:
- `backend/app/services/app_settings.py:129`
- `backend/app/services/app_settings.py:257`
- `backend/app/services/app_settings.py:667`
- `backend/app/core/config.py:17`
- `backend/app/core/config.py:34`
- `backend/.env.example:15`
- Recommendation:
- Use managed secrets storage and encryption at rest.
- Remove default credentials.
- Rotate exposed and default keys/credentials.
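
Removing default credentials usually means failing fast at startup when a secret is missing or still set to a placeholder. The following sketch shows that pattern under assumptions: the names `require_secret` and `MissingSecretError`, the placeholder list, and the minimum length are all hypothetical.

```python
import os
from typing import Mapping

# Hypothetical placeholder values that should never reach a real deployment.
KNOWN_WEAK_VALUES = frozenset({"", "changeme", "password", "postgres", "secret"})


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent or obviously weak."""


def require_secret(name: str, env: Mapping[str, str] = os.environ, min_length: int = 16) -> str:
    """Read a secret from the environment with no fallback default."""
    value = env.get(name, "").strip()
    if value.lower() in KNOWN_WEAK_VALUES:
        raise MissingSecretError(f"{name} is unset or uses a placeholder value")
    if len(value) < min_length:
        raise MissingSecretError(f"{name} is shorter than {min_length} characters")
    return value
```

The environment is only one possible source; the same check applies if values are injected from a managed secrets store at container start.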
### 7) Low - Minimal HTTP hardening headers and overly broad CORS configuration
- Impact: Increased browser-side attack surface, especially once authentication is introduced.
- Evidence:
- `backend/app/main.py:23`
- `backend/app/main.py:25`
- `backend/app/main.py:26`
- `backend/app/main.py:27`
- Recommendation:
- Add standard security headers middleware.
- Constrain allowed methods and headers to actual application needs.
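
A security headers middleware can be written against the plain ASGI interface, so it does not depend on a particular framework. This is a sketch under assumptions: the class name and the specific header values are illustrative and should be reviewed against what the frontend needs.

```python
from typing import List, Tuple

# Illustrative defaults; adjust the values to the application's requirements.
SECURITY_HEADERS: List[Tuple[bytes, bytes]] = [
    (b"x-content-type-options", b"nosniff"),
    (b"x-frame-options", b"DENY"),
    (b"referrer-policy", b"no-referrer"),
    (b"strict-transport-security", b"max-age=31536000; includeSubDomains"),
]


class SecurityHeadersMiddleware:
    """ASGI middleware that appends security headers to every HTTP response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Leave websocket and lifespan traffic untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                existing = {name.lower() for name, _ in headers}
                # Do not override headers the application set deliberately.
                headers.extend(
                    (name, value) for name, value in SECURITY_HEADERS if name not in existing
                )
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```

If the backend is a Starlette or FastAPI application (an assumption, since the report does not name the framework), a class of this shape can be registered with `app.add_middleware(SecurityHeadersMiddleware)`.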
### 8) Low - Containers appear to run as root by default
- Impact: In-container compromise has higher blast radius.
- Evidence:
- `backend/Dockerfile:1`
- `backend/Dockerfile:17`
- `frontend/Dockerfile:1`
- `frontend/Dockerfile:16`
- Recommendation:
- Run containers as non-root users.
- Drop unnecessary Linux capabilities.
## Residual Risk and Assumptions
- This audit assumes services may be reachable beyond a strictly isolated localhost-only environment.
- If an external auth proxy is enforced upstream, the severity of the unauthenticated routes is reduced, but the risk is not eliminated unless the backend also enforces its own trust boundary and rejects requests that bypass the proxy.
- Dependency CVE posture was not exhaustively enumerated in this static pass.
## Priority Remediation Order
1. Enforce authentication and authorization across API routes.
2. Lock down settings mutation paths, especially model provider endpoint configuration.
3. Add strict upload/extraction resource limits.
4. Remove sensitive logging and protect log APIs.
5. Harden Docker/network exposure and secrets management.