docs: refresh security production readiness report

This commit is contained in:
2026-03-01 18:03:45 -03:00
parent 74d91eb4b1
commit aba320b617

193
REPORT.md
View File

@@ -2,148 +2,97 @@
Date: 2026-03-01
Repository: /Users/bedas/Developer/GitHub/dcm
Review type: Static code and configuration review (no runtime penetration testing)
Assessment type: Static code/configuration review with local security-test execution
## Verdict
Not production ready.
Reason: one blocking, code-level security issue was found.
## Scope
- Backend API and worker: `backend/app`
- Frontend API client/auth transport: `frontend/src`
- Compose and environment defaults: `docker-compose.yml`, `.env`
- Backend API and worker code in `backend/app`
- Frontend auth/API client code in `frontend/src`
- Runtime/deployment configuration in `.env` and `docker-compose.yml`
## Method and Limits
- Reviewed source and configuration files in the current checkout.
- Verified findings with direct file evidence.
- Did not run dynamic security testing, dependency CVE scanning, or infrastructure perimeter testing.
## Validation Commands And Outcomes
- `command -v git` -> pass
- `git rev-parse --is-inside-work-tree` -> pass (`true`)
- `git status --short` -> clean working tree before analysis
- `/Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_security_controls.py` -> pass (24 tests)
- `/Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_app_settings_provider_resilience.py` -> pass (6 tests)
- `/Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_processing_log_retention_settings.py` -> pass (5 tests)
- `/Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_upload_request_size_middleware.py` -> failed to run (import error in test setup)
- `/Users/bedas/Developer/Python/global_venv/bin/python -m pytest ...` -> not runnable in current venv (`No module named pytest`)
## Confirmed Product Security Findings
## Blocking Security Findings
### Critical
1. Browser-exposed shared bearer token path (`VITE_API_TOKEN` fallback)
- Severity: Critical
- Why this is a product issue: The frontend code supports a build-time token fallback and injects it into all API requests. This creates a shared credential model in browser code.
- Impact: Any user with browser access can recover and reuse the token, collapsing auth boundaries and auditability.
- Exploit path: Open app -> inspect runtime/bundle or intercepted request -> replay bearer token against protected API endpoints.
### High: No brute-force protection on authentication login
- Severity: High (blocking)
- Why this is blocking: `/api/v1/auth/login` accepts unlimited credential attempts with no per-IP or per-username throttling, no lockout, and no backoff. This leaves credential stuffing and password guessing defenses incomplete for production.
- Impact: online account takeover risk increases substantially when passwords are weak, reused, leaked, or defaulted.
- Exploit path: repeated automated POST requests to `/api/v1/auth/login` until valid credentials are found.
- Evidence:
- `frontend/src/lib/api.ts:39`
- `frontend/src/lib/api.ts:98`
- `frontend/src/lib/api.ts:111`
- `frontend/src/lib/api.ts:155`
- `docker-compose.yml:123`
- `backend/app/api/router.py:25`
- `backend/app/api/router.py:37`
- Production recommendation:
- Remove browser-side static token fallback.
- Use per-user server-issued auth (session or short-lived JWT) with role-bound authorization.
- `backend/app/api/routes_auth.py:34` defines login endpoint
- `backend/app/api/routes_auth.py:42` to `backend/app/api/routes_auth.py:51` performs auth check and returns 401 only
- `backend/app/api/routes_documents.py:75` to `backend/app/api/routes_documents.py:95` shows rate limiting exists but is applied only to content export, not login
- `backend/app/services/rate_limiter.py:16` to `backend/app/services/rate_limiter.py:42` contains reusable limiter logic currently not used by auth routes
- Required remediation:
- add login throttling keyed by username and source IP
- add escalating delay or temporary lockout on repeated failures
- return a stable error message and status on throttled attempts
- log and monitor failed auth attempt rates
### High
## Additional Code-Level Findings (Non-blocking)
1. CORS policy is effectively any HTTP/HTTPS origin, with credentials enabled
- Severity: High
- Why this is a product issue: CORS middleware enables `allow_origin_regex` that matches broad web origins and sets `allow_credentials=True`.
- Impact: If credentials are present, cross-origin access risk increases and token abuse becomes easier from arbitrary origins.
- Exploit path: Malicious origin performs cross-origin requests with available credentials and can read API responses under permissive CORS policy.
- Evidence:
- `backend/app/main.py:21`
- `backend/app/main.py:41`
- `backend/app/main.py:42`
- `backend/app/main.py:44`
- Production recommendation:
- Replace regex-based broad origin acceptance with explicit trusted origin allowlist.
- Keep `allow_credentials=False` unless strictly required for cookie-based flows.
### Medium
1. Sensitive processing content is persisted in logs by default
### Medium: Non-deterministic frontend dependency versioning
- Severity: Medium
- Why this is a product issue: Pipeline logging records OCR text, extraction text, prompts, and LLM outputs into persistent processing logs.
- Impact: Increased confidentiality risk and larger data-retention blast radius if logs are queried or exfiltrated.
- Exploit path: Access to admin log endpoints or database allows retrieval of sensitive operational content.
- Risk: `lucide-react` is set to `latest`, making builds time-dependent and increasing supply-chain unpredictability.
- Evidence:
- `backend/app/worker/tasks.py:619`
- `backend/app/worker/tasks.py:638`
- `backend/app/services/routing_pipeline.py:789`
- `backend/app/services/routing_pipeline.py:802`
- `backend/app/services/routing_pipeline.py:814`
- `backend/app/core/config.py:45`
- Production recommendation:
- Default to metadata-only logs.
- Disable persistent storage of prompt/response/raw extracted text unless temporary debug mode is explicitly enabled with strict TTL.
- `frontend/package.json:13`
- Recommendation:
- pin an explicit semver version and update intentionally via dependency review workflow
2. Markdown export endpoint is unbounded and memory-amplifiable
- Severity: Medium
- Why this is a product issue: Export loads all matching documents and builds ZIP in-memory with `BytesIO`, without hard limits on selection size.
- Impact: Authenticated users can trigger high memory use and service degradation.
- Exploit path: Repeated wide `path_prefix` exports cause large in-memory archive construction.
### Low: One security middleware regression test is currently not executable
- Severity: Low
- Risk: reduced confidence in continued enforcement of upload middleware behavior.
- Evidence:
- `backend/app/api/routes_documents.py:402`
- `backend/app/api/routes_documents.py:412`
- `backend/app/api/routes_documents.py:416`
- `backend/app/api/routes_documents.py:418`
- `backend/app/api/routes_documents.py:421`
- `backend/app/api/routes_documents.py:425`
- Production recommendation:
- Enforce max export document count and total bytes.
- Stream archive generation to temp files.
- Add endpoint rate limiting.
- `backend/tests/test_upload_request_size_middleware.py` currently errors at import time when run directly in this environment
- Recommendation:
- fix test stubs/import assumptions so this test is runnable in CI and local developer environments
## Risks Requiring Product Decision or Further Verification
## MUST KNOW User-Dependent Risks (Not Blocking Per Request)
1. Authorization model appears role-based without per-document ownership boundaries
- Evidence:
- `backend/app/models/document.py:29`
- `backend/app/api/router.py:19`
- `backend/app/api/router.py:31`
- Question: Is this intentionally single-operator, or should production support multi-user/tenant data isolation?
These are deployment/operator-controlled and therefore reported as must-know, not blocking findings.
2. Worker startup command uses raw Redis URL string and bypasses in-code URL security validator at startup
- Evidence:
- `docker-compose.yml:81`
- `backend/app/worker/queue.py:15`
- Question: Should worker startup also enforce `validate_redis_url_security` before consuming jobs?
1. Environment is set to development posture with broad host binding.
- Evidence: `.env:1` (`APP_ENV=development`), `.env:2` (`HOST_BIND_IP=0.0.0.0`)
- Risk: exposed surface and relaxed defaults if used outside isolated development network.
3. Provider key encryption uses custom cryptographic construction
- Evidence:
- `backend/app/services/app_settings.py:131`
- `backend/app/services/app_settings.py:154`
- `backend/app/services/app_settings.py:176`
- Question: Are compliance or internal policy requirements demanding standardized AEAD primitives from vetted cryptography libraries?
2. Bootstrap credentials appear placeholder/weak and must be replaced before live use.
- Evidence: `.env:15`, `.env:17`
- Risk: straightforward credential compromise if unchanged.
## User-Managed Configuration Observations (Not Product Defects)
3. Sensitive log text persistence is enabled in current `.env`.
- Evidence: `.env:22`, `.env:23`; code path at `backend/app/services/processing_logs.py:47` and `backend/app/services/processing_logs.py:128`
- Risk: OCR/model prompts, responses, and payload text can be stored and later exposed to admins or backups.
These are deployment/operator choices and should be tracked separately from code defects.
4. Provider outbound network controls are permissive in current `.env`.
- Evidence: `.env:29` to `.env:31`
- Risk: admin-configured provider endpoints can target non-HTTPS/private hosts if account compromise occurs.
1. Development-mode posture in local `.env`
- Evidence:
- `.env:1`
- `.env:3`
- Notes: `APP_ENV=development` and anonymous development access are enabled.
5. Public base URL is HTTP and CORS includes LAN origin.
- Evidence: `.env:33`, `.env:34`
- Risk: transport security and origin trust are weaker than production HTTPS allowlist posture.
2. Local `.env` includes placeholder shared API token values
- Evidence:
- `.env:15`
- `.env:16`
- `.env:31`
- Notes: If replaced with real values and reused, this increases operational risk. This is operator responsibility.
6. Redis transport is non-TLS in current env posture.
- Evidence: `.env:10` to `.env:12`
- Risk: acceptable for isolated local stack, unsafe for untrusted networks.
3. Compose defaults allow permissive provider egress controls
- Evidence:
- `docker-compose.yml:51`
- `docker-compose.yml:52`
- `.env:21`
- `.env:22`
- `.env:23`
- Notes: Allowing HTTP/private-network provider targets is a deployment policy choice.
7. Compose defaults are development-first for several sensitive controls.
- Evidence: `docker-compose.yml:40`, `docker-compose.yml:52`, `docker-compose.yml:53`, `docker-compose.yml:60`, `docker-compose.yml:68`
- Risk: if overrides are missed, deployment may run with weaker network and provider policies.
4. Internal service transport defaults are plaintext in local stack
- Evidence:
- `docker-compose.yml:56`
- `.env:11`
- Notes: `http`/`redis://` may be acceptable for isolated local dev, but not for exposed production networks.
## Production Decision
Current state is not production ready due to the blocking auth brute-force gap.
## Production Readiness Priority Order
1. Remove browser static token model and adopt per-user auth.
2. Tighten CORS to explicit trusted origins only.
3. Reduce persistent sensitive logging to metadata by default.
4. Add hard limits and streaming behavior for markdown export.
5. Resolve product decisions on tenant isolation, worker Redis security enforcement, and cryptography standardization.
After fixing the blocking issue, production readiness still depends on secure operator configuration of `.env` and runtime network perimeter.