From eae7afd36e2bd676ce2b091bd04b06399e1a0401 Mon Sep 17 00:00:00 2001
From: Beda Schmid
Date: Sun, 1 Mar 2026 21:22:25 -0300
Subject: [PATCH] docs: refresh production security assessment report

---
 REPORT.md | 266 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 189 insertions(+), 77 deletions(-)

diff --git a/REPORT.md b/REPORT.md
index b72065d..5debf01 100644
--- a/REPORT.md
+++ b/REPORT.md
@@ -1,97 +1,209 @@
-# Security Production Readiness Report
+# Security Analysis Report
 
 Date: 2026-03-02
 Repository: /Users/bedas/Developer/GitHub/dcm
-Assessment type: Static code and configuration review with targeted local security test execution
+Scope: backend FastAPI API and worker, frontend React app, Docker runtime configuration, and local `.env` posture.
 
-## Verdict
-Not production ready.
+## Executive Verdict
 
-Reason: one blocking, code-level access-control and data-disclosure issue was found.
+Current state is **not production ready**.
 
-## Preflight Results
-- `command -v git` -> pass (`/usr/bin/git`)
-- `git rev-parse --is-inside-work-tree` -> pass (`true`)
-- `git status --short` -> clean before analysis
+- Blocking code-level issues exist and should be fixed before production exposure.
+- Additional user-dependent deployment risks are present in `.env` and runtime defaults. Per request, these are listed as MUST KNOW and not marked as blocking.
 
-## Validation Commands And Outcomes
-- `/Users/bedas/Developer/Python/global_venv/bin/python -m unittest backend/tests/test_security_controls.py` -> pass (32 tests)
-- `/Users/bedas/Developer/Python/global_venv/bin/python -m unittest backend/tests/test_upload_request_size_middleware.py` -> pass (3 tests)
-- `/Users/bedas/Developer/Python/global_venv/bin/python -m unittest backend/tests/test_app_settings_provider_resilience.py` -> pass (6 tests)
+## Method and Coverage
 
-## Blocking Security Findings
+Performed a read-only static review of:
 
-### High: Non-global catalog presets are exposed to all authenticated users
-- Severity: High (blocking)
-- Why this is blocking:
- The settings model supports `global_shared` scope for predefined paths and tags, but the user-accessible discovery endpoints return all predefined entries without filtering by this scope.
- This breaks intended discoverability boundaries and leaks admin-curated non-global taxonomy metadata to standard users.
-- Impact:
- Information disclosure across role boundaries for internal path and tag catalogs.
- Reduced separation between admin-only and user-visible metadata.
-- Exploit path:
- Any authenticated non-admin user calls `GET /api/v1/documents/paths` and `GET /api/v1/documents/tags`.
- Endpoint responses include every predefined path or tag value regardless of `global_shared` state.
-- Evidence:
- `backend/app/api/routes_documents.py:399-403` and `backend/app/api/routes_documents.py:423-427` include all predefined tags and paths.
- `backend/app/schemas/settings.py:145` and `backend/app/schemas/settings.py:159` define global discoverability scope.
- `backend/app/services/app_settings.py:709-710`, `backend/app/services/app_settings.py:730`, `backend/app/services/app_settings.py:758-759`, and `backend/app/services/app_settings.py:779` preserve `global_shared` state in normalized settings.
-- Required remediation:
- Filter predefined entries returned by user-facing discovery endpoints by role and `global_shared`.
- Keep full catalog visibility for admins only.
- Add regression tests for non-admin path/tag discovery with mixed `global_shared` values.
+- API auth, authorization, upload and file handling, routing, settings, and worker pipelines.
+- Frontend auth token handling and preview rendering behavior.
+- Docker and environment defaults affecting network and secret posture.
+- Existing security-focused tests and basic frontend API tests.
 
-## MUST KNOW User-Dependent Risks (Not Blocking Per Request)
+## Blocking Security Issues (Code-Level)
 
-These items are deployment, environment, or proxy dependent and are therefore not marked blocking per request requirements.
+### 1) High - No abuse controls on expensive authenticated endpoints
 
-### High: Development-first runtime defaults can be promoted to production if not overridden
-- Evidence:
- `.env.example:5` (`APP_ENV=development`)
- `.env.example:36-38` (`PROVIDER_BASE_URL_ALLOW_HTTP=true`, `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=true`, empty allowlist)
- `.env.example:14-16` (`redis://`, `REDIS_SECURITY_MODE=compat`, `REDIS_TLS_MODE=allow_insecure`)
- `.env.example:40-41` (`PUBLIC_BASE_URL` on HTTP, local CORS defaults)
- `docker-compose.yml:56-57`, `docker-compose.yml:64-66`, `docker-compose.yml:101-102`, `docker-compose.yml:106-107`
-- Risk:
- Weak outbound provider constraints, plaintext internal transport defaults, and non-production environment posture can persist in live deployments.
+Impact:
 
-### Medium: Login throttle IP identity depends on proxy trust model
-- Evidence:
- `backend/app/api/routes_auth.py:32-35` uses `request.client.host` only.
-- Risk:
- Behind reverse proxies, all clients may collapse to proxy IP, increasing lockout abuse and reducing attribution quality.
+- Any authenticated user can repeatedly trigger high-cost operations (upload processing, OCR, summarization, routing, indexing), causing queue saturation, infrastructure exhaustion, and external provider cost abuse.
 
-### Medium: API documentation endpoints are exposed by default
-- Evidence:
- `backend/app/main.py:37` creates `FastAPI(...)` with default docs behavior.
- README explicitly references `/docs` as available.
-- Risk:
- Public endpoint inventory and schema visibility in exposed deployments.
+Exploit path:
 
-### Medium: Bearer token is stored in browser `sessionStorage`
-- Evidence:
- `frontend/src/lib/api.ts:41`, `frontend/src/lib/api.ts:61-67`, `frontend/src/lib/api.ts:84-94`
-- Risk:
- Any successful frontend XSS can exfiltrate active session tokens.
+- Repeated `POST /api/v1/documents/upload`
+- Repeated `POST /api/v1/documents/{document_id}/reprocess`
 
-### Low: Typesense transport defaults to HTTP on internal network
-- Evidence:
- `docker-compose.yml:66-67`, `docker-compose.yml:107-108`
-- Risk:
- Acceptable for isolated local networks, but not suitable for untrusted or cross-host network links.
+Evidence:
 
-## Security Controls Confirmed Present
-- Login brute-force throttling with lockout and Retry-After handling.
-- Owner-scoped access checks for non-admin document operations.
-- Provider base URL validation with allowlist and DNS revalidation hooks.
-- Upload request size and archive extraction guardrails.
-- Processing log redaction and metadata-only persistence defaults.
+- Upload endpoint has size limits but no rate/volume quota checks: `backend/app/api/routes_documents.py:665-985`.
+- Reprocess endpoint has no rate/cooldown checks: `backend/app/api/routes_documents.py:958-985`.
+- A Redis rate limiter exists but is currently used for markdown export only: `backend/app/services/rate_limiter.py:16-42` and `backend/app/api/routes_documents.py:100-120`.
 
-## Coverage Limits
-- No dependency CVE audit was executed in this run.
-- This report is based on repository code, configuration templates, and available local tests.
+Remediation:
 
-## Production Decision
-Current state is not production ready because of the blocking catalog discoverability exposure.
+- Add per-user and per-IP rate limiting to upload and reprocess endpoints.
+- Add per-user daily/rolling quotas (documents, bytes, reprocess calls).
+- Add queue depth backpressure and reject or defer requests when saturated.
+- Add alerting for anomalous request and job-enqueue rates.
+
+### 2) Medium - API docs and schema are exposed by default
+
+Impact:
+
+- Unauthenticated endpoint discovery and contract reconnaissance are easier (`/docs`, `/redoc`, `/openapi.json`).
+
+Exploit path:
+
+- Remote probing of public API metadata when service is internet-reachable.
+
+Evidence:
+
+- FastAPI app is created with default docs behavior and no production gating: `backend/app/main.py:37`.
+
+Remediation:
+
+- Disable docs in production (`docs_url=None`, `redoc_url=None`, `openapi_url=None`), or
+- Restrict these routes at reverse proxy / edge to trusted admin networks.
+
+### 3) Medium - Bearer token stored in browser sessionStorage
+
+Impact:
+
+- Any successful XSS on the frontend origin can steal bearer tokens and replay them.
+
+Exploit path:
+
+- Malicious script execution on app origin reads `sessionStorage` and exfiltrates `Authorization` token.
+
+Evidence:
+
+- Token persisted in sessionStorage and injected into `Authorization` header: `frontend/src/lib/api.ts:39-42`, `frontend/src/lib/api.ts:61-67`, `frontend/src/lib/api.ts:84-95`, `frontend/src/lib/api.ts:103-112`.
+
+Remediation:
+
+- Prefer HttpOnly Secure SameSite cookies for session auth, plus CSRF protection.
+- If bearer-in-JS remains, enforce strict CSP, remove inline script execution, and maintain strong dependency hygiene.
+
+## MUST KNOW (User-Dependent, Non-Blocking Per Request)
+
+### A) Current `.env` is development-oriented and exposed beyond localhost
+
+Why this matters:
+
+- Service currently binds to all interfaces and uses development settings, which is unsafe for direct internet exposure.
+
+Evidence:
+
+- `APP_ENV=development`: `.env:1`
+- `HOST_BIND_IP=0.0.0.0`: `.env:2`
+- `PUBLIC_BASE_URL=http://...`: `.env:33`
+- Broad CORS for localhost and LAN host: `.env:34`
+
+Action:
+
+- Set production values (`APP_ENV=production`, HTTPS base URL, strict CORS, host binding behind hardened reverse proxy).
+
+### B) Provider SSRF protections are disabled in active env
+
+Why this matters:
+
+- Allowing HTTP, private network targets, and empty allowlist can permit unsafe outbound model-provider endpoints.
+
+Evidence:
+
+- Active `.env`: `PROVIDER_BASE_URL_ALLOW_HTTP=true`, `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=true`, `PROVIDER_BASE_URL_ALLOWLIST=[]` at `.env:29-31`.
+- Compose defaults are permissive if env is not hardened: `docker-compose.yml:55-57`, `docker-compose.yml:100-102`.
+
+Action:
+
+- For production set: `PROVIDER_BASE_URL_ALLOW_HTTP=false`, `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=false`, and an explicit host allowlist in `PROVIDER_BASE_URL_ALLOWLIST`.
+
+### C) Sensitive model and payload text logging is enabled in active env
+
+Why this matters:
+
+- Prompt/response and payload text may include confidential document content and credentials, increasing breach impact.
+
+Evidence:
+
+- Active `.env` enables both flags: `.env:22-23`.
+- Code defaults are safer (`false`): `backend/app/core/config.py:60-61`.
+
+Action:
+
+- Disable in production unless explicitly required for short-term diagnostics.
+- Apply strict retention and access controls if temporarily enabled.
+
+### D) Redis transport/auth hardening depends on env mode and URL
+
+Why this matters:
+
+- Current env uses `redis://` with `auto` security and TLS modes; this is not suitable for untrusted network paths.
+
+Evidence:
+
+- Active `.env`: `REDIS_URL=redis://...`, `REDIS_SECURITY_MODE=auto`, `REDIS_TLS_MODE=auto` at `.env:10-12`.
+- Strict checks are triggered by production mode when auto is used: `backend/app/core/config.py:157-171`.
+
+Action:
+
+- In production use `rediss://`, `REDIS_SECURITY_MODE=strict`, `REDIS_TLS_MODE=required`.
+
+### E) Frontend container runs a development server
+
+Why this matters:
+
+- Vite dev server is not intended as a hardened production serving layer.
+
+Evidence:
+
+- Frontend container command runs `npm run dev`: `frontend/Dockerfile:20`.
+- `dev` script maps to Vite dev mode: `frontend/package.json:7`.
+
+Action:
+
+- Build static assets and serve behind a production-grade web server/reverse proxy.
+
+### F) Login throttle IP identity depends on proxy topology
+
+Why this matters:
+
+- Throttle identity uses `request.client.host`; if a proxy masks client IPs, lockout behavior may be inaccurate.
+
+Evidence:
+
+- IP extraction uses transport client host directly: `backend/app/api/routes_auth.py:32-35`.
+
+Action:
+
+- Ensure trusted proxy configuration preserves real client IP semantics before internet deployment.
+
+## Validation Commands and Outcomes
+
+Preflight:
+
+- `command -v git` -> passed (`/usr/bin/git`)
+- `git rev-parse --is-inside-work-tree` -> passed (`true`)
+- `git status --short` -> clean before work
+
+Security-related backend tests:
+
+- `PYTHONDONTWRITEBYTECODE=1 /Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_security_controls.py` -> passed (34 tests)
+- `PYTHONDONTWRITEBYTECODE=1 /Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_upload_request_size_middleware.py` -> passed (3 tests)
+- `PYTHONDONTWRITEBYTECODE=1 /Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_app_settings_provider_resilience.py` -> passed (6 tests)
+- `PYTHONDONTWRITEBYTECODE=1 /Users/bedas/Developer/Python/global_venv/bin/python backend/tests/test_processing_log_retention_settings.py` -> passed (5 tests)
+
+Frontend auth-client test:
+
+- `npm run test` (in `frontend/`) -> passed
+
+Note:
+
+- `pytest` is not available to `/Users/bedas/Developer/Python/global_venv/bin/python`, so the backend test files were executed directly through their `unittest` entrypoints.
+
+## Residual Risk and Coverage Limits
+
+- No dynamic penetration test was run against a live deployed stack.
+- No dependency CVE audit (`pip-audit`, `npm audit`) was run in this run.
+- Reverse proxy, firewall, TLS termination, and cloud/network policy were not reviewed.
 
-After remediating that issue, production readiness still depends on strict deployment choices for environment variables, proxy trust configuration, TLS, and frontend hardening.
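
Review note on finding 1 (no abuse controls on upload and reprocess): the per-user rate limiting the report recommends could look roughly like the sketch below. This is a minimal in-memory sliding-window limiter using only the standard library, not the project's existing `rate_limiter.py`. The class name, the key format, and the limits are hypothetical. A multi-process deployment would need a shared store, presumably the Redis limiter the report already mentions.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Hypothetical in-memory sliding-window limiter keyed by a string."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._calls: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a call for `key` and return True if it is within the limit."""
        now = self._clock()
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False  # caller would typically answer with HTTP 429
        calls.append(now)
        return True


# Example: at most 3 reprocess calls per user per 60 seconds (assumed limits).
limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60.0)
user_key = "reprocess:user-123"  # hypothetical key format
results = [limiter.allow(user_key) for _ in range(4)]
print(results)
```

Keying on the authenticated user rather than the client IP keeps this independent of the proxy-topology concern in section F. A per-IP key could be added once the real client IP can be trusted.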