Compare commits: da5cbc2c01...74d91eb4b1

10 commits:

- 74d91eb4b1
- 1c57084ebf
- bfc89fe5ce
- 1b2e0cb8af
- 0242e061c2
- 7a19f22f41
- c5423fc9c3
- 3d280396ae
- 48cfc79b5f
- bdd97d1c62
.env.example (new file, 51 lines)
@@ -0,0 +1,51 @@
+# LedgerDock environment template
+# Copy to .env and adjust all secret values before first run.
+
+# Development defaults (HTTP local stack)
+APP_ENV=development
+HOST_BIND_IP=127.0.0.1
+
+POSTGRES_USER=dcm
+POSTGRES_PASSWORD=ChangeMe-Postgres-Secret
+POSTGRES_DB=dcm
+DATABASE_URL=postgresql+psycopg://dcm:ChangeMe-Postgres-Secret@db:5432/dcm
+
+REDIS_PASSWORD=ChangeMe-Redis-Secret
+REDIS_URL=redis://:ChangeMe-Redis-Secret@redis:6379/0
+REDIS_SECURITY_MODE=compat
+REDIS_TLS_MODE=allow_insecure
+
+AUTH_BOOTSTRAP_ADMIN_USERNAME=admin
+AUTH_BOOTSTRAP_ADMIN_PASSWORD=ChangeMe-Admin-Password
+AUTH_BOOTSTRAP_USER_USERNAME=user
+AUTH_BOOTSTRAP_USER_PASSWORD=ChangeMe-User-Password
+
+APP_SETTINGS_ENCRYPTION_KEY=ChangeMe-Settings-Encryption-Key
+TYPESENSE_API_KEY=ChangeMe-Typesense-Key
+
+PROCESSING_LOG_STORE_MODEL_IO_TEXT=false
+PROCESSING_LOG_STORE_PAYLOAD_TEXT=false
+CONTENT_EXPORT_MAX_DOCUMENTS=250
+CONTENT_EXPORT_MAX_TOTAL_BYTES=52428800
+CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE=6
+
+PROVIDER_BASE_URL_ALLOW_HTTP=true
+PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=true
+PROVIDER_BASE_URL_ALLOWLIST=[]
+
+PUBLIC_BASE_URL=http://localhost:8000
+CORS_ORIGINS=["http://localhost:5173","http://localhost:3000"]
+VITE_API_BASE=
+
+# Production baseline overrides (set explicitly for live deployments):
+# APP_ENV=production
+# HOST_BIND_IP=127.0.0.1
+# REDIS_URL=rediss://:<strong-password>@redis.example.internal:6379/0
+# REDIS_SECURITY_MODE=strict
+# REDIS_TLS_MODE=required
+# PROVIDER_BASE_URL_ALLOW_HTTP=false
+# PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=false
+# PROVIDER_BASE_URL_ALLOWLIST=["api.openai.com"]
+# PUBLIC_BASE_URL=https://api.example.com
+# CORS_ORIGINS=["https://app.example.com"]
+# VITE_API_BASE=https://api.example.com/api/v1
README.md (47 lines)
@@ -24,9 +24,9 @@ The default `docker compose` stack includes:
 - `frontend` - React UI (`http://localhost:5173`)
 - `api` - FastAPI backend (`http://localhost:8000`, docs at `/docs`)
 - `worker` - background processing jobs
-- `db` - PostgreSQL (`localhost:5432`)
-- `redis` - queue backend (`localhost:6379`)
-- `typesense` - search index (`localhost:8108`)
+- `db` - PostgreSQL (internal service network)
+- `redis` - queue backend (internal service network)
+- `typesense` - search index (internal service network)
 
 ## Requirements
 
@@ -42,12 +42,31 @@ From repository root:
 docker compose up --build -d
 ```
 
+Before first run, set required secrets and connection values in `.env` (or your shell):
+
+- `POSTGRES_USER`
+- `POSTGRES_PASSWORD`
+- `POSTGRES_DB`
+- `DATABASE_URL`
+- `REDIS_PASSWORD`
+- `REDIS_URL`
+- `AUTH_BOOTSTRAP_ADMIN_USERNAME`
+- `AUTH_BOOTSTRAP_ADMIN_PASSWORD`
+- optional `AUTH_BOOTSTRAP_USER_USERNAME`
+- optional `AUTH_BOOTSTRAP_USER_PASSWORD`
+- `APP_SETTINGS_ENCRYPTION_KEY`
+- `TYPESENSE_API_KEY`
+
+Start from `.env.example` to avoid missing required variables.
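One way to produce the required high-entropy values is Python's `secrets` module. A minimal sketch; the key list mirrors `.env.example`, while the generation scheme itself is just one reasonable choice, not a project requirement:

```python
import secrets

# Secret placeholders from .env.example that should get random values.
SECRET_KEYS = [
    "POSTGRES_PASSWORD",
    "REDIS_PASSWORD",
    "AUTH_BOOTSTRAP_ADMIN_PASSWORD",
    "AUTH_BOOTSTRAP_USER_PASSWORD",
    "APP_SETTINGS_ENCRYPTION_KEY",
    "TYPESENSE_API_KEY",
]


def generate_env_secrets() -> dict[str, str]:
    """Return a fresh URL-safe random value (~32 bytes of entropy) per key."""
    return {key: secrets.token_urlsafe(32) for key in SECRET_KEYS}


if __name__ == "__main__":
    # Print KEY=value pairs ready to paste into .env.
    for key, value in generate_env_secrets().items():
        print(f"{key}={value}")
```

Remember that `DATABASE_URL` and `REDIS_URL` embed the same passwords and must be updated to match.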
+
 Open:
 
 - Frontend: `http://localhost:5173`
 - API docs: `http://localhost:8000/docs`
 - Health: `http://localhost:8000/api/v1/health`
 
+Use bootstrap credentials (`AUTH_BOOTSTRAP_ADMIN_USERNAME` and `AUTH_BOOTSTRAP_ADMIN_PASSWORD`) to sign in from the frontend login screen.
+
 Stop the stack:
 
 ```bash
@@ -102,21 +121,31 @@ cd frontend && npm run preview
 
 Main runtime variables are defined in `docker-compose.yml`:
 
-- API and worker: `DATABASE_URL`, `REDIS_URL`, `STORAGE_ROOT`, `PUBLIC_BASE_URL`, `CORS_ORIGINS`, `TYPESENSE_*`
-- Frontend: `VITE_API_BASE`
+- API and worker: `DATABASE_URL`, `REDIS_URL`, `REDIS_SECURITY_MODE`, `REDIS_TLS_MODE`, `STORAGE_ROOT`, `PUBLIC_BASE_URL`, `CORS_ORIGINS`, `AUTH_BOOTSTRAP_*`, `PROCESSING_LOG_STORE_*`, `CONTENT_EXPORT_*`, `TYPESENSE_*`, `APP_SETTINGS_ENCRYPTION_KEY`
+- Frontend: optional `VITE_API_BASE`
+
+When `VITE_API_BASE` is unset, the frontend uses `http://<current-hostname>:8000/api/v1`.
 
 Application settings saved from the UI persist at:
 
 - `<STORAGE_ROOT>/settings.json` (inside the storage volume)
 
+Provider API keys are persisted encrypted at rest (`api_key_encrypted`) and are no longer written as plaintext values.
+
 Settings endpoints:
 
-- `GET/PUT /api/v1/settings`
+- `GET/PATCH /api/v1/settings`
 - `POST /api/v1/settings/reset`
-- `POST /api/v1/settings/handwriting`
-- `POST /api/v1/processing/logs/trim`
+- `PATCH /api/v1/settings/handwriting`
+- `POST /api/v1/processing/logs/trim` (admin only)
 
-Note: the compose file currently includes host-specific URL values (for example `PUBLIC_BASE_URL` and `VITE_API_BASE`). Adjust these for your environment when needed.
+Auth endpoints:
+
+- `POST /api/v1/auth/login`
+- `GET /api/v1/auth/me`
+- `POST /api/v1/auth/logout`
+
+Detailed DEV and LIVE environment guidance, including HTTPS reverse-proxy deployment values, is documented in `doc/operations-and-configuration.md` and `.env.example`.
 
 ## Data Persistence
 
REPORT.md (216 lines)
@@ -2,108 +2,148 @@
 
 Date: 2026-03-01
 Repository: /Users/bedas/Developer/GitHub/dcm
-Review Type: Static security review for production readiness
+Review type: Static code and configuration review (no runtime penetration testing)
 
 ## Scope
-- Backend: FastAPI API, worker queue, settings and model runtime services
-- Frontend: React and Vite API client and document preview rendering
-- Infrastructure: docker-compose service exposure and secret configuration
+- Backend API and worker: `backend/app`
+- Frontend API client/auth transport: `frontend/src`
+- Compose and environment defaults: `docker-compose.yml`, `.env`
 
-## Findings
+## Method and Limits
+- Reviewed source and configuration files in the current checkout.
+- Verified findings with direct file evidence.
+- Did not run dynamic security testing, dependency CVE scanning, or infrastructure perimeter testing.
+
+## Confirmed Product Security Findings
 
 ### Critical
 
-1. Redis queue is exposed without authentication and can be abused for worker job injection.
-   - Impact: If Redis is reachable by an attacker, queued job payloads can be injected and executed by the worker process, leading to remote code execution and data compromise.
-   - Exploit path: Reach Redis on port 6379, enqueue crafted RQ jobs into queue `dcm`, wait for worker consumption.
+1. Browser-exposed shared bearer token path (`VITE_API_TOKEN` fallback)
+   - Severity: Critical
+   - Why this is a product issue: The frontend code supports a build-time token fallback and injects it into all API requests. This creates a shared credential model in browser code.
+   - Impact: Any user with browser access can recover and reuse the token, collapsing auth boundaries and auditability.
+   - Exploit path: Open app -> inspect runtime/bundle or intercepted request -> replay bearer token against protected API endpoints.
    - Evidence:
-     - docker-compose publishes Redis host port: `docker-compose.yml:21`
-     - worker consumes from Redis queue directly: `docker-compose.yml:77`
-     - queue connection uses bare Redis URL with no auth/TLS: `backend/app/worker/queue.py:15`, `backend/app/worker/queue.py:21`
-     - current environment binds services to all interfaces: `.env:1`
-   - Remediation:
-     - Do not publish Redis externally in production.
-     - Enforce Redis authentication and TLS.
-     - Place Redis on a private network segment with strict ACLs.
-     - Treat queue producers as privileged components only.
-
-2. Untrusted uploaded content is previewed in an unsandboxed iframe.
-   - Impact: Stored XSS and active content execution in preview context can enable account action abuse and data exfiltration in the browser.
-   - Exploit path: Upload active content (for example HTML), open preview, script executes in iframe without sandbox constraints.
-   - Evidence:
-     - upload endpoint accepts generic uploaded files: `backend/app/api/routes_documents.py:493`
-     - MIME type is derived from bytes and persisted: `backend/app/api/routes_documents.py:530`
-     - preview endpoint returns original bytes inline with stored media type: `backend/app/api/routes_documents.py:449`, `backend/app/api/routes_documents.py:457`
-     - frontend renders preview in iframe without sandbox attribute: `frontend/src/components/DocumentViewer.tsx:486`
-     - preview source is a blob URL created from fetched content: `frontend/src/components/DocumentViewer.tsx:108`, `frontend/src/components/DocumentViewer.tsx:113`
-   - Remediation:
-     - Block inline preview for script-capable MIME types.
-     - Add strict iframe sandboxing if iframe preview remains required.
-     - Prefer force-download for active formats.
-     - Serve untrusted preview content from an isolated origin with restrictive CSP.
+     - `frontend/src/lib/api.ts:39`
+     - `frontend/src/lib/api.ts:98`
+     - `frontend/src/lib/api.ts:111`
+     - `frontend/src/lib/api.ts:155`
+     - `docker-compose.yml:123`
+     - `backend/app/api/router.py:25`
+     - `backend/app/api/router.py:37`
+   - Production recommendation:
+     - Remove browser-side static token fallback.
+     - Use per-user server-issued auth (session or short-lived JWT) with role-bound authorization.
 
 ### High
 
-1. Frontend distributes a bearer token to all clients.
-   - Impact: Any user with browser access can extract the token and replay authenticated calls, preventing per-user accountability and increasing blast radius.
-   - Exploit path: Read token from frontend runtime environment or request headers, replay API requests with Authorization header.
+1. CORS policy is effectively any HTTP/HTTPS origin, with credentials enabled
+   - Severity: High
+   - Why this is a product issue: CORS middleware enables `allow_origin_regex` that matches broad web origins and sets `allow_credentials=True`.
+   - Impact: If credentials are present, cross-origin access risk increases and token abuse becomes easier from arbitrary origins.
+   - Exploit path: Malicious origin performs cross-origin requests with available credentials and can read API responses under permissive CORS policy.
    - Evidence:
-     - frontend consumes token from public Vite env: `frontend/src/lib/api.ts:24`
-     - token is attached to every request when present: `frontend/src/lib/api.ts:38`
-     - compose passes `VITE_API_TOKEN` from user token: `docker-compose.yml:115`
-     - privileged routes rely on static token role checks: `backend/app/api/router.py:19`, `backend/app/api/auth.py:47`, `backend/app/api/auth.py:51`
-   - Remediation:
-     - Replace shared static token model with per-user authentication.
-     - Keep secrets server-side only.
-     - Use short-lived credentials with rotation and revocation.
-
-2. Default and static service secrets are present in deploy config.
-   - Impact: If service ports are exposed, predictable credentials and keys allow unauthorized access to data services.
-   - Exploit path: Connect to published Postgres or Typesense ports and authenticate with known static values.
-   - Evidence:
-     - static Postgres credentials: `docker-compose.yml:5`, `docker-compose.yml:6`
-     - static Typesense key in compose and runtime env: `docker-compose.yml:29`, `docker-compose.yml:55`, `docker-compose.yml:93`
-     - database and Typesense ports are published to host: `docker-compose.yml:9`, `docker-compose.yml:32`
-     - current environment uses placeholder tokens: `.env:2`, `.env:3`, `.env:4`
-   - Remediation:
-     - Use high-entropy secrets managed outside repository configuration.
-     - Remove unnecessary host port publications in production.
-     - Restrict service network access to trusted internal components.
-
-3. ZIP recursion depth control is not enforced across queued descendants.
-   - Impact: Nested archives can create uncontrolled fan-out, causing CPU, queue, and storage exhaustion.
-   - Exploit path: Upload ZIP containing ZIPs; children are queued as independent documents without inherited depth, repeating recursively.
-   - Evidence:
-     - configured depth limit exists: `backend/app/core/config.py:28`
-     - extractor takes a depth argument but is called without propagation: `backend/app/services/extractor.py:302`, `backend/app/services/extractor.py:306`
-     - worker invokes extractor without depth context: `backend/app/worker/tasks.py:122`
-     - worker enqueues child archive jobs recursively: `backend/app/worker/tasks.py:225`, `backend/app/worker/tasks.py:226`
-   - Remediation:
-     - Persist and propagate archive depth per document lineage.
-     - Enforce absolute descendant and fan-out limits per root upload.
-     - Reject nested archives beyond configured depth.
+     - `backend/app/main.py:21`
+     - `backend/app/main.py:41`
+     - `backend/app/main.py:42`
+     - `backend/app/main.py:44`
+   - Production recommendation:
+     - Replace regex-based broad origin acceptance with explicit trusted origin allowlist.
+     - Keep `allow_credentials=False` unless strictly required for cookie-based flows.
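The allowlist tightening recommended above can be sketched without the framework. `ALLOWED_ORIGINS` and `is_trusted_origin` are illustrative names, not project code; the FastAPI wiring is shown only as a comment:

```python
# Explicit trusted-origin allowlist check, replacing broad regex matching.
# Exact string match: no regex, no scheme/port wildcards.
ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
})


def is_trusted_origin(origin: str) -> bool:
    """Return True only for an exact, pre-registered origin."""
    return origin in ALLOWED_ORIGINS


# FastAPI wiring would then pass the explicit list instead of a regex:
#   app.add_middleware(
#       CORSMiddleware,
#       allow_origins=sorted(ALLOWED_ORIGINS),
#       allow_credentials=False,  # keep off unless cookie flows require it
#   )
```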
 
 ### Medium
 
-1. OCR provider path does not apply DNS revalidation equivalent to model runtime path.
-   - Impact: Under permissive network flags, SSRF defenses can be weakened by DNS rebinding on OCR traffic.
-   - Exploit path: Persist provider URL that passes initial checks, then rebind DNS to private target before OCR requests.
+1. Sensitive processing content is persisted in logs by default
+   - Severity: Medium
+   - Why this is a product issue: Pipeline logging records OCR text, extraction text, prompts, and LLM outputs into persistent processing logs.
+   - Impact: Increased confidentiality risk and larger data-retention blast radius if logs are queried or exfiltrated.
+   - Exploit path: Access to admin log endpoints or database allows retrieval of sensitive operational content.
   - Evidence:
-     - task model runtime enforces `resolve_dns=True`: `backend/app/services/model_runtime.py:41`
-     - provider normalization in app settings does not pass DNS revalidation flag: `backend/app/services/app_settings.py:253`
-     - OCR runtime uses persisted URL for client base URL: `backend/app/services/app_settings.py:891`, `backend/app/services/handwriting.py:159`
-   - Remediation:
-     - Apply DNS revalidation before outbound OCR requests or on every runtime load.
-     - Disallow private network egress by default and require explicit controlled exceptions.
+     - `backend/app/worker/tasks.py:619`
+     - `backend/app/worker/tasks.py:638`
+     - `backend/app/services/routing_pipeline.py:789`
+     - `backend/app/services/routing_pipeline.py:802`
+     - `backend/app/services/routing_pipeline.py:814`
+     - `backend/app/core/config.py:45`
+   - Production recommendation:
+     - Default to metadata-only logs.
+     - Disable persistent storage of prompt/response/raw extracted text unless temporary debug mode is explicitly enabled with strict TTL.
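The metadata-only default could be enforced with a gate like the following sketch. `redact_log_fields` and the field names are hypothetical helpers for illustration; the flag names come from `.env.example`:

```python
# Gate persisted log text behind the PROCESSING_LOG_STORE_* flags.
# Both default to False, so only metadata survives unless explicitly enabled.
STORE_MODEL_IO_TEXT = False   # PROCESSING_LOG_STORE_MODEL_IO_TEXT
STORE_PAYLOAD_TEXT = False    # PROCESSING_LOG_STORE_PAYLOAD_TEXT


def redact_log_fields(event: dict) -> dict:
    """Return a copy of the event with sensitive text fields dropped per flags."""
    redacted = dict(event)
    if not STORE_MODEL_IO_TEXT:
        for field in ("prompt_text", "response_text"):
            redacted.pop(field, None)
    if not STORE_PAYLOAD_TEXT:
        redacted.pop("payload_text", None)
    return redacted
```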
 
-2. Provider API keys are persisted in plaintext settings on storage volume.
-   - Impact: File system or backup compromise reveals upstream provider secrets.
-   - Exploit path: Read persisted settings file from storage volume or backup artifact.
+2. Markdown export endpoint is unbounded and memory-amplifiable
+   - Severity: Medium
+   - Why this is a product issue: Export loads all matching documents and builds ZIP in-memory with `BytesIO`, without hard limits on selection size.
+   - Impact: Authenticated users can trigger high memory use and service degradation.
+   - Exploit path: Repeated wide `path_prefix` exports cause large in-memory archive construction.
   - Evidence:
-     - settings file location under storage root: `backend/app/services/app_settings.py:133`
-     - provider payload includes plaintext `api_key`: `backend/app/services/app_settings.py:268`
-     - settings payload is written to disk as JSON: `backend/app/services/app_settings.py:680`, `backend/app/services/app_settings.py:685`
-     - OCR settings read returns stored API key value for runtime: `backend/app/services/app_settings.py:894`
-   - Remediation:
-     - Move provider secrets to dedicated secret management.
-     - If local persistence is unavoidable, encrypt sensitive fields at rest and restrict file permissions.
+     - `backend/app/api/routes_documents.py:402`
+     - `backend/app/api/routes_documents.py:412`
+     - `backend/app/api/routes_documents.py:416`
+     - `backend/app/api/routes_documents.py:418`
+     - `backend/app/api/routes_documents.py:421`
+     - `backend/app/api/routes_documents.py:425`
+   - Production recommendation:
+     - Enforce max export document count and total bytes.
+     - Stream archive generation to temp files.
+     - Add endpoint rate limiting.
+
+## Risks Requiring Product Decision or Further Verification
+
+1. Authorization model appears role-based without per-document ownership boundaries
+   - Evidence:
+     - `backend/app/models/document.py:29`
+     - `backend/app/api/router.py:19`
+     - `backend/app/api/router.py:31`
+   - Question: Is this intentionally single-operator, or should production support multi-user/tenant data isolation?
+
+2. Worker startup command uses raw Redis URL string and bypasses in-code URL security validator at startup
+   - Evidence:
+     - `docker-compose.yml:81`
+     - `backend/app/worker/queue.py:15`
+   - Question: Should worker startup also enforce `validate_redis_url_security` before consuming jobs?
+
+3. Provider key encryption uses custom cryptographic construction
+   - Evidence:
+     - `backend/app/services/app_settings.py:131`
+     - `backend/app/services/app_settings.py:154`
+     - `backend/app/services/app_settings.py:176`
+   - Question: Are compliance or internal policy requirements demanding standardized AEAD primitives from vetted cryptography libraries?
+
+## User-Managed Configuration Observations (Not Product Defects)
+
+These are deployment/operator choices and should be tracked separately from code defects.
+
+1. Development-mode posture in local `.env`
+   - Evidence:
+     - `.env:1`
+     - `.env:3`
+   - Notes: `APP_ENV=development` and anonymous development access are enabled.
+
+2. Local `.env` includes placeholder shared API token values
+   - Evidence:
+     - `.env:15`
+     - `.env:16`
+     - `.env:31`
+   - Notes: If replaced with real values and reused, this increases operational risk. This is operator responsibility.
+
+3. Compose defaults allow permissive provider egress controls
+   - Evidence:
+     - `docker-compose.yml:51`
+     - `docker-compose.yml:52`
+     - `.env:21`
+     - `.env:22`
+     - `.env:23`
+   - Notes: Allowing HTTP/private-network provider targets is a deployment policy choice.
+
+4. Internal service transport defaults are plaintext in local stack
+   - Evidence:
+     - `docker-compose.yml:56`
+     - `.env:11`
+   - Notes: `http`/`redis://` may be acceptable for isolated local dev, but not for exposed production networks.
+
+## Production Readiness Priority Order
+
+1. Remove browser static token model and adopt per-user auth.
+2. Tighten CORS to explicit trusted origins only.
+3. Reduce persistent sensitive logging to metadata by default.
+4. Add hard limits and streaming behavior for markdown export.
+5. Resolve product decisions on tenant isolation, worker Redis security enforcement, and cryptography standardization.
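One shape the worker-side Redis enforcement raised in open question 2 could take is a URL check run before the worker consumes jobs. `check_redis_url` is a hypothetical stand-in for the project's `validate_redis_url_security`, shown only to make the requirement concrete:

```python
from urllib.parse import urlsplit


def check_redis_url(url: str, require_tls: bool) -> None:
    """Reject Redis URLs without a password, and without TLS in strict mode."""
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError("not a Redis URL")
    if not parts.password:
        raise ValueError("Redis URL must include a password")
    if require_tls and parts.scheme != "rediss":
        raise ValueError("TLS (rediss://) required in strict mode")
```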
.env
@@ -1,15 +1,26 @@
 APP_ENV=development
 DATABASE_URL=postgresql+psycopg://dcm:dcm@db:5432/dcm
-REDIS_URL=redis://redis:6379/0
+REDIS_URL=redis://:replace-with-redis-password@redis:6379/0
+REDIS_SECURITY_MODE=auto
+REDIS_TLS_MODE=auto
 STORAGE_ROOT=/data/storage
-ADMIN_API_TOKEN=replace-with-random-admin-token
-USER_API_TOKEN=replace-with-random-user-token
+AUTH_BOOTSTRAP_ADMIN_USERNAME=admin
+AUTH_BOOTSTRAP_ADMIN_PASSWORD=replace-with-random-admin-password
+AUTH_BOOTSTRAP_USER_USERNAME=user
+AUTH_BOOTSTRAP_USER_PASSWORD=replace-with-random-user-password
+APP_SETTINGS_ENCRYPTION_KEY=replace-with-random-settings-encryption-key
+PROCESSING_LOG_STORE_MODEL_IO_TEXT=false
+PROCESSING_LOG_STORE_PAYLOAD_TEXT=false
+CONTENT_EXPORT_MAX_DOCUMENTS=250
+CONTENT_EXPORT_MAX_TOTAL_BYTES=52428800
+CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE=6
 MAX_UPLOAD_FILES_PER_REQUEST=50
 MAX_UPLOAD_FILE_SIZE_BYTES=26214400
 MAX_UPLOAD_REQUEST_SIZE_BYTES=104857600
 MAX_ZIP_MEMBER_UNCOMPRESSED_BYTES=26214400
 MAX_ZIP_TOTAL_UNCOMPRESSED_BYTES=157286400
 MAX_ZIP_COMPRESSION_RATIO=120
 MAX_ZIP_DESCENDANTS_PER_ROOT=1000
 PROVIDER_BASE_URL_ALLOWLIST=["api.openai.com"]
 PROVIDER_BASE_URL_ALLOW_HTTP=false
 PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=false
@@ -23,6 +34,6 @@ DEFAULT_ROUTING_MODEL=gpt-4.1-mini
 TYPESENSE_PROTOCOL=http
 TYPESENSE_HOST=typesense
 TYPESENSE_PORT=8108
-TYPESENSE_API_KEY=dcm-typesense-key
+TYPESENSE_API_KEY=replace-with-random-typesense-api-key
 TYPESENSE_COLLECTION_NAME=documents
 PUBLIC_BASE_URL=http://localhost:8000
backend/app/api/auth.py
@@ -1,65 +1,48 @@
-"""Token-based authentication and authorization dependencies for privileged API routes."""
+"""Authentication and authorization dependencies for protected API routes."""
 
-import hmac
+from dataclasses import dataclass
+from datetime import datetime
 from typing import Annotated
+from uuid import UUID
 
 from fastapi import Depends, HTTPException, status
 from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
+from sqlalchemy.orm import Session
 
 from app.core.config import Settings, get_settings
+from app.db.base import get_session
+from app.models.auth import UserRole
+from app.services.authentication import resolve_auth_session
 
 
 bearer_auth = HTTPBearer(auto_error=False)
 
 
-class AuthRole:
-    """Declares supported authorization roles for privileged API operations."""
+@dataclass(frozen=True)
+class AuthContext:
+    """Carries authenticated identity and role details for one request."""
 
-    ADMIN = "admin"
-    USER = "user"
+    user_id: UUID
+    username: str
+    role: UserRole
+    session_id: UUID
+    expires_at: datetime
 
 
 def _raise_unauthorized() -> None:
-    """Raises an HTTP 401 response with bearer authentication challenge headers."""
+    """Raises a 401 challenge response for missing or invalid bearer sessions."""
 
     raise HTTPException(
         status_code=status.HTTP_401_UNAUTHORIZED,
-        detail="Invalid or missing API token",
+        detail="Invalid or expired authentication session",
         headers={"WWW-Authenticate": "Bearer"},
     )
 
 
-def _configured_admin_token(settings: Settings) -> str:
-    """Returns required admin token or raises configuration error when unset."""
-
-    token = settings.admin_api_token.strip()
-    if token:
-        return token
-    raise HTTPException(
-        status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
-        detail="Admin API token is not configured",
-    )
-
-
-def _resolve_token_role(token: str, settings: Settings) -> str:
-    """Resolves role from a bearer token using constant-time comparisons."""
-
-    admin_token = _configured_admin_token(settings)
-    if hmac.compare_digest(token, admin_token):
-        return AuthRole.ADMIN
-
-    user_token = settings.user_api_token.strip()
-    if user_token and hmac.compare_digest(token, user_token):
-        return AuthRole.USER
-
-    _raise_unauthorized()
-
-
-def get_request_role(
+def get_request_auth_context(
     credentials: Annotated[HTTPAuthorizationCredentials | None, Depends(bearer_auth)],
     settings: Annotated[Settings, Depends(get_settings)],
-) -> str:
-    """Authenticates request token and returns its authorization role."""
+    session: Annotated[Session, Depends(get_session)],
+) -> AuthContext:
+    """Authenticates bearer session token and returns role-bound request identity context."""
 
     if credentials is None:
         _raise_unauthorized()
@@ -67,21 +50,32 @@ def get_request_role(
     token = credentials.credentials.strip()
     if not token:
         _raise_unauthorized()
-    return _resolve_token_role(token=token, settings=settings)
+
+    resolved_session = resolve_auth_session(session, token=token)
+    if resolved_session is None or resolved_session.user is None:
+        _raise_unauthorized()
+
+    return AuthContext(
+        user_id=resolved_session.user.id,
+        username=resolved_session.user.username,
+        role=resolved_session.user.role,
+        session_id=resolved_session.id,
+        expires_at=resolved_session.expires_at,
+    )
 
 
-def require_user_or_admin(role: Annotated[str, Depends(get_request_role)]) -> str:
-    """Requires a valid user or admin token and returns resolved role."""
+def require_user_or_admin(context: Annotated[AuthContext, Depends(get_request_auth_context)]) -> AuthContext:
+    """Requires any authenticated user session and returns its request identity context."""
 
-    return role
+    return context
 
 
-def require_admin(role: Annotated[str, Depends(get_request_role)]) -> str:
-    """Requires admin role and rejects requests authenticated as regular users."""
+def require_admin(context: Annotated[AuthContext, Depends(get_request_auth_context)]) -> AuthContext:
+    """Requires authenticated admin role and rejects standard user sessions."""
 
-    if role != AuthRole.ADMIN:
+    if context.role != UserRole.ADMIN:
         raise HTTPException(
             status_code=status.HTTP_403_FORBIDDEN,
-            detail="Admin token required",
+            detail="Administrator role required",
         )
-    return role
+    return context
backend/app/api/router.py
@@ -2,7 +2,8 @@
 
 from fastapi import APIRouter, Depends
 
-from app.api.auth import require_admin, require_user_or_admin
+from app.api.auth import require_admin
+from app.api.routes_auth import router as auth_router
 from app.api.routes_documents import router as documents_router
 from app.api.routes_health import router as health_router
 from app.api.routes_processing_logs import router as processing_logs_router
@@ -12,11 +13,11 @@ from app.api.routes_settings import router as settings_router
 
 api_router = APIRouter()
 api_router.include_router(health_router)
+api_router.include_router(auth_router)
 api_router.include_router(
     documents_router,
     prefix="/documents",
     tags=["documents"],
-    dependencies=[Depends(require_user_or_admin)],
 )
 api_router.include_router(
     processing_logs_router,
@@ -28,7 +29,6 @@ api_router.include_router(
     search_router,
     prefix="/search",
     tags=["search"],
-    dependencies=[Depends(require_user_or_admin)],
 )
 api_router.include_router(
     settings_router,
backend/app/api/routes_auth.py (new file, 94 lines)
@@ -0,0 +1,94 @@
|
||||
"""Authentication endpoints for credential login, session introspection, and logout."""
|
||||
|
||||
from fastapi import APIRouter, Depends, HTTPException, Request, status
|
||||
from sqlalchemy.orm import Session
|
||||
|
||||
from app.api.auth import AuthContext, require_user_or_admin
|
||||
from app.db.base import get_session
|
||||
from app.schemas.auth import (
|
||||
AuthLoginRequest,
|
||||
AuthLoginResponse,
|
||||
AuthLogoutResponse,
|
||||
AuthSessionResponse,
|
||||
AuthUserResponse,
|
||||
)
|
||||
from app.services.authentication import authenticate_user, issue_user_session, revoke_auth_session
|
||||
|
||||
|
||||
router = APIRouter(prefix="/auth", tags=["auth"])
|
||||
|
||||
|
||||
def _request_ip_address(request: Request) -> str | None:
|
||||
"""Returns best-effort client IP extracted from the request transport context."""
|
||||
|
||||
    return request.client.host if request.client is not None else None


def _request_user_agent(request: Request) -> str | None:
    """Returns best-effort user-agent metadata for created auth sessions."""

    user_agent = request.headers.get("user-agent", "").strip()
    return user_agent[:512] if user_agent else None


@router.post("/login", response_model=AuthLoginResponse)
def login(
    payload: AuthLoginRequest,
    request: Request,
    session: Session = Depends(get_session),
) -> AuthLoginResponse:
    """Authenticates username and password and returns an issued bearer session token."""

    user = authenticate_user(
        session,
        username=payload.username,
        password=payload.password,
    )
    if user is None:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid username or password",
        )

    issued_session = issue_user_session(
        session,
        user=user,
        user_agent=_request_user_agent(request),
        ip_address=_request_ip_address(request),
    )
    session.commit()
    return AuthLoginResponse(
        access_token=issued_session.token,
        expires_at=issued_session.expires_at,
        user=AuthUserResponse.model_validate(user),
    )


@router.get("/me", response_model=AuthSessionResponse)
def me(context: AuthContext = Depends(require_user_or_admin)) -> AuthSessionResponse:
    """Returns current authenticated session identity and expiration metadata."""

    return AuthSessionResponse(
        expires_at=context.expires_at,
        user=AuthUserResponse(
            id=context.user_id,
            username=context.username,
            role=context.role,
        ),
    )


@router.post("/logout", response_model=AuthLogoutResponse)
def logout(
    context: AuthContext = Depends(require_user_or_admin),
    session: Session = Depends(get_session),
) -> AuthLogoutResponse:
    """Revokes current bearer session token and confirms logout state."""

    revoked = revoke_auth_session(
        session,
        session_id=context.session_id,
    )
    if revoked:
        session.commit()
    return AuthLogoutResponse(revoked=revoked)

@@ -1,12 +1,12 @@
 """Authenticated document CRUD, lifecycle, metadata, file access, and content export endpoints."""

 import io
 import re
 import tempfile
 import unicodedata
 import zipfile
 from datetime import datetime, time
 from pathlib import Path
-from typing import Annotated, Literal
+from typing import Annotated, BinaryIO, Iterator, Literal
 from uuid import UUID

 from fastapi import APIRouter, Depends, File, Form, HTTPException, Query, UploadFile
@@ -14,8 +14,10 @@ from fastapi.responses import FileResponse, Response, StreamingResponse
 from sqlalchemy import or_, func, select
 from sqlalchemy.orm import Session

-from app.core.config import get_settings
+from app.api.auth import AuthContext, require_user_or_admin
+from app.core.config import get_settings, is_inline_preview_mime_type_safe
 from app.db.base import get_session
+from app.models.auth import UserRole
 from app.models.document import Document, DocumentStatus
 from app.schemas.documents import (
     ContentExportRequest,
@@ -30,6 +32,7 @@ from app.services.app_settings import read_predefined_paths_settings, read_prede
 from app.services.extractor import sniff_mime
 from app.services.handwriting_style import delete_many_handwriting_style_documents
 from app.services.processing_logs import log_processing_event, set_processing_log_autocommit
+from app.services.rate_limiter import increment_rate_limit
 from app.services.storage import absolute_path, compute_sha256, store_bytes
 from app.services.typesense_index import delete_many_documents_index, upsert_document_index
 from app.worker.queue import get_processing_queue
@@ -39,6 +42,59 @@ router = APIRouter()
 settings = get_settings()


+def _scope_document_statement_for_auth_context(statement, auth_context: AuthContext):
+    """Restricts document statements to caller-owned rows for non-admin users."""
+
+    if auth_context.role == UserRole.ADMIN:
+        return statement
+    return statement.where(Document.owner_user_id == auth_context.user_id)
+
+
+def _ensure_document_access(document: Document, auth_context: AuthContext) -> None:
+    """Enforces owner-level access for non-admin users and raises not-found on violations."""
+
+    if auth_context.role == UserRole.ADMIN:
+        return
+    if document.owner_user_id != auth_context.user_id:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+
+def _stream_binary_file_chunks(handle: BinaryIO, *, chunk_bytes: int) -> Iterator[bytes]:
+    """Streams binary file-like content in bounded chunks and closes handle after completion."""
+
+    try:
+        while True:
+            chunk = handle.read(chunk_bytes)
+            if not chunk:
+                break
+            yield chunk
+    finally:
+        handle.close()
+
+
+def _enforce_content_export_rate_limit(auth_context: AuthContext) -> None:
+    """Applies per-user fixed-window rate limiting for markdown export requests."""
+
+    try:
+        current_count, limit = increment_rate_limit(
+            scope="content-md-export",
+            subject=str(auth_context.user_id),
+            limit=settings.content_export_rate_limit_per_minute,
+            window_seconds=60,
+        )
+    except RuntimeError as error:
+        raise HTTPException(
+            status_code=503,
+            detail="Rate limiter backend unavailable",
+        ) from error
+
+    if limit > 0 and current_count > limit:
+        raise HTTPException(
+            status_code=429,
+            detail=f"Export rate limit exceeded ({limit} requests per minute)",
+        )
+
+
 def _parse_csv(value: str | None) -> list[str]:
     """Parses comma-separated query values into a normalized non-empty list."""

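The helper above calls `increment_rate_limit`, whose backend is not part of this diff. A minimal in-memory sketch of the same contract (fixed-window counting keyed by scope and subject, returning the current count together with the configured limit) may clarify the semantics; all names here are illustrative, and the real service is presumably Redis-backed:

```python
import time
from collections import defaultdict

# Per-window counters; a real backend would expire these keys automatically.
_windows: dict[tuple[str, str, int], int] = defaultdict(int)


def increment_rate_limit_sketch(*, scope: str, subject: str, limit: int,
                                window_seconds: int) -> tuple[int, int]:
    """Fixed-window counter: bucket by (scope, subject, current window index)."""
    window_index = int(time.time()) // window_seconds
    key = (scope, subject, window_index)
    _windows[key] += 1
    return _windows[key], limit


count, limit = increment_rate_limit_sketch(scope="content-md-export", subject="u1",
                                           limit=6, window_seconds=60)
# The caller (as in the export route below) rejects once count exceeds limit.
exceeded = limit > 0 and count > limit
```

Because the window index is derived from wall-clock time, all requests in the same minute share one counter, and the counter resets implicitly when the next window begins.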
@@ -296,6 +352,7 @@ def list_documents(
     type_filter: str | None = Query(default=None),
     processed_from: str | None = Query(default=None),
     processed_to: str | None = Query(default=None),
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> DocumentsListResponse:
     """Returns paginated documents ordered by newest upload timestamp."""
@@ -305,6 +362,7 @@ def list_documents(
         include_trashed=include_trashed,
         path_prefix=path_prefix,
     )
+    base_statement = _scope_document_statement_for_auth_context(base_statement, auth_context)
     base_statement = _apply_discovery_filters(
         base_statement,
         path_filter=path_filter,
@@ -326,11 +384,13 @@ def list_documents(
 @router.get("/tags")
 def list_tags(
     include_trashed: bool = Query(default=False),
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> dict[str, list[str]]:
     """Returns distinct tags currently assigned across all matching documents."""

     statement = select(Document.tags)
+    statement = _scope_document_statement_for_auth_context(statement, auth_context)
     if not include_trashed:
         statement = statement.where(Document.status != DocumentStatus.TRASHED)

@@ -348,11 +408,13 @@ def list_tags(
 @router.get("/paths")
 def list_paths(
     include_trashed: bool = Query(default=False),
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> dict[str, list[str]]:
     """Returns distinct logical paths currently assigned across all matching documents."""

     statement = select(Document.logical_path)
+    statement = _scope_document_statement_for_auth_context(statement, auth_context)
     if not include_trashed:
         statement = statement.where(Document.status != DocumentStatus.TRASHED)

@@ -370,11 +432,13 @@ def list_paths(
 @router.get("/types")
 def list_types(
     include_trashed: bool = Query(default=False),
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> dict[str, list[str]]:
     """Returns distinct document type values from extension, MIME, and image text type."""

     statement = select(Document.extension, Document.mime_type, Document.image_text_type)
+    statement = _scope_document_statement_for_auth_context(statement, auth_context)
     if not include_trashed:
         statement = statement.where(Document.status != DocumentStatus.TRASHED)
     rows = session.execute(statement).all()
@@ -390,16 +454,20 @@ def list_types(
 @router.post("/content-md/export")
 def export_contents_markdown(
     payload: ContentExportRequest,
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> StreamingResponse:
     """Exports extracted contents for selected documents as individual markdown files in a ZIP archive."""

+    _enforce_content_export_rate_limit(auth_context)
+
     has_document_ids = len(payload.document_ids) > 0
     has_path_prefix = bool(payload.path_prefix and payload.path_prefix.strip())
     if not has_document_ids and not has_path_prefix:
         raise HTTPException(status_code=400, detail="Provide document_ids or path_prefix for export")

     statement = select(Document)
+    statement = _scope_document_statement_for_auth_context(statement, auth_context)
     if has_document_ids:
         statement = statement.where(Document.id.in_(payload.document_ids))
     if has_path_prefix:
@@ -409,37 +477,82 @@ def export_contents_markdown(
     elif not payload.include_trashed:
         statement = statement.where(Document.status != DocumentStatus.TRASHED)

-    documents = session.execute(statement.order_by(Document.logical_path.asc(), Document.created_at.asc())).scalars().all()
+    max_documents = max(1, int(settings.content_export_max_documents))
+    ordered_statement = statement.order_by(Document.logical_path.asc(), Document.created_at.asc()).limit(max_documents + 1)
+    documents = session.execute(ordered_statement).scalars().all()
+    if len(documents) > max_documents:
+        raise HTTPException(
+            status_code=413,
+            detail=f"Export exceeds maximum document count ({len(documents)} > {max_documents})",
+        )
     if not documents:
         raise HTTPException(status_code=404, detail="No matching documents found for export")

-    archive_buffer = io.BytesIO()
+    max_total_bytes = max(1, int(settings.content_export_max_total_bytes))
+    max_spool_memory = max(64 * 1024, int(settings.content_export_spool_max_memory_bytes))
+    archive_file = tempfile.SpooledTemporaryFile(max_size=max_spool_memory, mode="w+b")
+    total_export_bytes = 0
     used_entries: set[str] = set()
-    with zipfile.ZipFile(archive_buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
+    try:
+        with zipfile.ZipFile(archive_file, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
             for document in documents:
+                markdown_bytes = _markdown_for_document(document).encode("utf-8")
+                total_export_bytes += len(markdown_bytes)
+                if total_export_bytes > max_total_bytes:
+                    raise HTTPException(
+                        status_code=413,
+                        detail=(
+                            "Export exceeds total markdown size limit "
+                            f"({total_export_bytes} > {max_total_bytes} bytes)"
+                        ),
+                    )
                 entry_name = _zip_entry_name(document, used_entries)
-            archive.writestr(entry_name, _markdown_for_document(document))
+                archive.writestr(entry_name, markdown_bytes)
+        archive_file.seek(0)
+    except Exception:
+        archive_file.close()
+        raise

-    archive_buffer.seek(0)
+    chunk_bytes = max(4 * 1024, int(settings.content_export_stream_chunk_bytes))
     headers = {"Content-Disposition": 'attachment; filename="document-contents-md.zip"'}
-    return StreamingResponse(archive_buffer, media_type="application/zip", headers=headers)
+    return StreamingResponse(
+        _stream_binary_file_chunks(archive_file, chunk_bytes=chunk_bytes),
+        media_type="application/zip",
+        headers=headers,
+    )


 @router.get("/{document_id}", response_model=DocumentDetailResponse)
-def get_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentDetailResponse:
+def get_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> DocumentDetailResponse:
     """Returns one document by unique identifier."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")
     return DocumentDetailResponse.model_validate(document)


 @router.get("/{document_id}/download")
-def download_document(document_id: UUID, session: Session = Depends(get_session)) -> FileResponse:
+def download_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> FileResponse:
     """Downloads original document bytes for the requested document identifier."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")
     file_path = absolute_path(document.stored_relative_path)
@@ -447,22 +560,46 @@ def download_document(document_id: UUID, session: Session = Depends(get_session)


 @router.get("/{document_id}/preview")
-def preview_document(document_id: UUID, session: Session = Depends(get_session)) -> FileResponse:
-    """Streams the original document inline when browser rendering is supported."""
+def preview_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> FileResponse:
+    """Streams trusted-safe MIME types inline and forces attachment for active script-capable types."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")

     original_path = absolute_path(document.stored_relative_path)
-    return FileResponse(path=original_path, media_type=document.mime_type)
+    common_headers = {"X-Content-Type-Options": "nosniff"}
+    if not is_inline_preview_mime_type_safe(document.mime_type):
+        return FileResponse(
+            path=original_path,
+            filename=document.original_filename,
+            media_type="application/octet-stream",
+            headers=common_headers,
+        )
+    return FileResponse(path=original_path, media_type=document.mime_type, headers=common_headers)


 @router.get("/{document_id}/thumbnail")
-def thumbnail_document(document_id: UUID, session: Session = Depends(get_session)) -> FileResponse:
+def thumbnail_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> FileResponse:
     """Returns a generated thumbnail image for dashboard card previews."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")

@@ -477,10 +614,18 @@ thumbnail_document(document_id: UUID, session: Session = Depends(get_session


 @router.get("/{document_id}/content-md")
-def download_document_content_markdown(document_id: UUID, session: Session = Depends(get_session)) -> Response:
+def download_document_content_markdown(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> Response:
     """Downloads extracted content for one document as a markdown file."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")

@@ -497,6 +642,7 @@ async def upload_documents(
     logical_path: Annotated[str, Form()] = "Inbox",
     tags: Annotated[str | None, Form()] = None,
     conflict_mode: Annotated[Literal["ask", "replace", "duplicate"], Form()] = "ask",
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> UploadResponse:
     """Uploads files, records metadata, and enqueues asynchronous extraction tasks."""
@@ -554,7 +700,11 @@ async def upload_documents(
             }
         )

-        existing = session.execute(select(Document).where(Document.sha256 == sha256)).scalar_one_or_none()
+        existing_statement = _scope_document_statement_for_auth_context(
+            select(Document).where(Document.sha256 == sha256),
+            auth_context,
+        )
+        existing = session.execute(existing_statement).scalar_one_or_none()
         if existing and conflict_mode == "ask":
             log_processing_event(
                 session=session,
@@ -581,9 +731,11 @@ async def upload_documents(
         return UploadResponse(uploaded=[], conflicts=conflicts)

     for prepared in prepared_uploads:
-        existing = session.execute(
-            select(Document).where(Document.sha256 == str(prepared["sha256"]))
-        ).scalar_one_or_none()
+        existing_statement = _scope_document_statement_for_auth_context(
+            select(Document).where(Document.sha256 == str(prepared["sha256"])),
+            auth_context,
+        )
+        existing = session.execute(existing_statement).scalar_one_or_none()
         replaces_document_id = existing.id if existing and conflict_mode == "replace" else None

         stored_relative_path = store_bytes(str(prepared["filename"]), bytes(prepared["data"]))
@@ -598,6 +750,7 @@ async def upload_documents(
             size_bytes=len(bytes(prepared["data"])),
             logical_path=logical_path,
             tags=list(normalized_tags),
+            owner_user_id=auth_context.user_id,
             replaces_document_id=replaces_document_id,
             metadata_json={"upload": "web"},
         )
@@ -629,11 +782,16 @@ async def upload_documents(
 def update_document(
     document_id: UUID,
     payload: DocumentUpdateRequest,
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> DocumentResponse:
     """Updates document metadata and refreshes semantic index representation."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")

@@ -655,10 +813,18 @@ def update_document(


 @router.post("/{document_id}/trash", response_model=DocumentResponse)
-def trash_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentResponse:
+def trash_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> DocumentResponse:
     """Marks a document as trashed without deleting files from storage."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")

@@ -679,10 +845,18 @@ def trash_document(document_id: UUID, session: Session = Depends(get_session)) -


 @router.post("/{document_id}/restore", response_model=DocumentResponse)
-def restore_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentResponse:
+def restore_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> DocumentResponse:
     """Restores a trashed document to its previous lifecycle status."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")

@@ -704,16 +878,27 @@ def restore_document(document_id: UUID, session: Session = Depends(get_session))


 @router.delete("/{document_id}")
-def delete_document(document_id: UUID, session: Session = Depends(get_session)) -> dict[str, int]:
+def delete_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> dict[str, int]:
     """Permanently deletes a document and all descendant archive members including stored files."""

-    root = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    root_statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    root = session.execute(root_statement).scalar_one_or_none()
     if root is None:
         raise HTTPException(status_code=404, detail="Document not found")
     if root.status != DocumentStatus.TRASHED:
         raise HTTPException(status_code=400, detail="Move document to trash before permanent deletion")

     document_tree = _collect_document_tree(session=session, root_document_id=document_id)
+    if auth_context.role != UserRole.ADMIN:
+        for _, document in document_tree:
+            _ensure_document_access(document, auth_context)
     document_ids = [document.id for _, document in document_tree]
     try:
         delete_many_documents_index([str(current_id) for current_id in document_ids])
@@ -744,10 +929,18 @@ def delete_document(document_id: UUID, session: Session = Depends(get_session))


 @router.post("/{document_id}/reprocess", response_model=DocumentResponse)
-def reprocess_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentResponse:
+def reprocess_document(
+    document_id: UUID,
+    auth_context: AuthContext = Depends(require_user_or_admin),
+    session: Session = Depends(get_session),
+) -> DocumentResponse:
     """Re-enqueues a document for extraction and suggestion processing."""

-    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    statement = _scope_document_statement_for_auth_context(
+        select(Document).where(Document.id == document_id),
+        auth_context,
+    )
+    document = session.execute(statement).scalar_one_or_none()
     if document is None:
         raise HTTPException(status_code=404, detail="Document not found")
     if document.status == DocumentStatus.TRASHED:
@@ -4,7 +4,8 @@ from fastapi import APIRouter, Depends, Query
 from sqlalchemy import Text, cast, func, select
 from sqlalchemy.orm import Session

-from app.api.routes_documents import _apply_discovery_filters
+from app.api.auth import AuthContext, require_user_or_admin
+from app.api.routes_documents import _apply_discovery_filters, _scope_document_statement_for_auth_context
 from app.db.base import get_session
 from app.models.document import Document, DocumentStatus
 from app.schemas.documents import DocumentResponse, SearchResponse
@@ -25,6 +26,7 @@ def search_documents(
     type_filter: str | None = Query(default=None),
     processed_from: str | None = Query(default=None),
     processed_to: str | None = Query(default=None),
+    auth_context: AuthContext = Depends(require_user_or_admin),
     session: Session = Depends(get_session),
 ) -> SearchResponse:
     """Searches documents using PostgreSQL full-text ranking plus metadata matching."""
@@ -50,6 +52,7 @@ def search_documents(
     )

     statement = select(Document).where(search_filter)
+    statement = _scope_document_statement_for_auth_context(statement, auth_context)
     if only_trashed:
         statement = statement.where(Document.status == DocumentStatus.TRASHED)
     elif not include_trashed:
@@ -67,6 +70,7 @@ def search_documents(
     items = session.execute(statement).scalars().all()

     count_statement = select(func.count(Document.id)).where(search_filter)
+    count_statement = _scope_document_statement_for_auth_context(count_statement, auth_context)
     if only_trashed:
         count_statement = count_statement.where(Document.status == DocumentStatus.TRASHED)
     elif not include_trashed:
@@ -19,19 +19,33 @@ class Settings(BaseSettings):
     app_env: str = "development"
     database_url: str = "postgresql+psycopg://dcm:dcm@db:5432/dcm"
     redis_url: str = "redis://redis:6379/0"
+    redis_security_mode: str = "auto"
+    redis_tls_mode: str = "auto"
+    auth_bootstrap_admin_username: str = "admin"
+    auth_bootstrap_admin_password: str = ""
+    auth_bootstrap_user_username: str = ""
+    auth_bootstrap_user_password: str = ""
+    auth_session_ttl_minutes: int = 720
+    auth_password_pbkdf2_iterations: int = 390000
+    auth_session_token_bytes: int = 32
+    auth_session_pepper: str = ""
     storage_root: Path = Path("/data/storage")
     upload_chunk_size: int = 4 * 1024 * 1024
     max_upload_files_per_request: int = 50
     max_upload_file_size_bytes: int = 25 * 1024 * 1024
     max_upload_request_size_bytes: int = 100 * 1024 * 1024
+    content_export_max_documents: int = 250
+    content_export_max_total_bytes: int = 50 * 1024 * 1024
+    content_export_rate_limit_per_minute: int = 6
+    content_export_stream_chunk_bytes: int = 256 * 1024
+    content_export_spool_max_memory_bytes: int = 2 * 1024 * 1024
     max_zip_members: int = 250
     max_zip_depth: int = 2
     max_zip_descendants_per_root: int = 1000
     max_zip_member_uncompressed_bytes: int = 25 * 1024 * 1024
     max_zip_total_uncompressed_bytes: int = 150 * 1024 * 1024
     max_zip_compression_ratio: float = 120.0
     max_text_length: int = 500_000
-    admin_api_token: str = ""
-    user_api_token: str = ""
     provider_base_url_allowlist: list[str] = Field(default_factory=lambda: ["api.openai.com"])
     provider_base_url_allow_http: bool = False
     provider_base_url_allow_private_network: bool = False
@@ -39,17 +53,20 @@ class Settings(BaseSettings):
     processing_log_max_unbound_entries: int = 400
     processing_log_max_payload_chars: int = 4096
     processing_log_max_text_chars: int = 12000
+    processing_log_store_model_io_text: bool = False
+    processing_log_store_payload_text: bool = False
     default_openai_base_url: str = "https://api.openai.com/v1"
     default_openai_model: str = "gpt-4.1-mini"
     default_openai_timeout_seconds: int = 45
     default_openai_handwriting_enabled: bool = True
     default_openai_api_key: str = ""
+    app_settings_encryption_key: str = ""
     default_summary_model: str = "gpt-4.1-mini"
     default_routing_model: str = "gpt-4.1-mini"
     typesense_protocol: str = "http"
     typesense_host: str = "typesense"
     typesense_port: int = 8108
-    typesense_api_key: str = "dcm-typesense-key"
+    typesense_api_key: str = ""
     typesense_collection_name: str = "documents"
     typesense_timeout_seconds: int = 120
     typesense_num_retries: int = 0
@@ -58,6 +75,111 @@ class Settings(BaseSettings):


+LOCAL_HOSTNAME_SUFFIXES = (".local", ".internal", ".home.arpa")
+SCRIPT_CAPABLE_INLINE_MIME_TYPES = frozenset(
+    {
+        "application/ecmascript",
+        "application/javascript",
+        "application/x-javascript",
+        "application/xhtml+xml",
+        "image/svg+xml",
+        "text/ecmascript",
+        "text/html",
+        "text/javascript",
+    }
+)
+SCRIPT_CAPABLE_XML_MIME_TYPES = frozenset({"application/xml", "text/xml"})
+REDIS_SECURITY_MODES = frozenset({"auto", "strict", "compat"})
+REDIS_TLS_MODES = frozenset({"auto", "required", "allow_insecure"})
+
+
+def _is_production_environment(app_env: str) -> bool:
+    """Returns whether the runtime environment should enforce production-only security gates."""
+
+    normalized = app_env.strip().lower()
+    return normalized in {"production", "prod"}
+
+
+def _normalize_redis_security_mode(raw_mode: str) -> str:
+    """Normalizes Redis security mode values into one supported mode."""
+
+    normalized = raw_mode.strip().lower()
+    if normalized not in REDIS_SECURITY_MODES:
+        return "auto"
+    return normalized
+
+
+def _normalize_redis_tls_mode(raw_mode: str) -> str:
+    """Normalizes Redis TLS mode values into one supported mode."""
+
+    normalized = raw_mode.strip().lower()
+    if normalized not in REDIS_TLS_MODES:
+        return "auto"
+    return normalized
+
+
+def validate_redis_url_security(
+    redis_url: str,
+    *,
+    app_env: str | None = None,
+    security_mode: str | None = None,
+    tls_mode: str | None = None,
+) -> str:
+    """Validates Redis URL security posture with production fail-closed defaults."""
+
+    settings = get_settings()
+    resolved_app_env = app_env if app_env is not None else settings.app_env
+    resolved_security_mode = (
+        _normalize_redis_security_mode(security_mode)
+        if security_mode is not None
+        else _normalize_redis_security_mode(settings.redis_security_mode)
+    )
+    resolved_tls_mode = (
+        _normalize_redis_tls_mode(tls_mode)
+        if tls_mode is not None
+        else _normalize_redis_tls_mode(settings.redis_tls_mode)
+    )
+
+    candidate = redis_url.strip()
+    if not candidate:
+        raise ValueError("Redis URL must not be empty")
+
+    parsed = urlparse(candidate)
+    scheme = parsed.scheme.lower()
+    if scheme not in {"redis", "rediss"}:
+        raise ValueError("Redis URL must use redis:// or rediss://")
+    if not parsed.hostname:
+        raise ValueError("Redis URL must include a hostname")
+
+    strict_security = (
+        resolved_security_mode == "strict"
+        or (resolved_security_mode == "auto" and _is_production_environment(resolved_app_env))
+    )
+    require_tls = (
+        resolved_tls_mode == "required"
+        or (resolved_tls_mode == "auto" and strict_security)
+    )
+    has_password = bool(parsed.password and parsed.password.strip())
+    uses_tls = scheme == "rediss"
+
+    if strict_security and not has_password:
+        raise ValueError("Redis URL must include authentication when security mode is strict")
+    if require_tls and not uses_tls:
+        raise ValueError("Redis URL must use rediss:// when TLS is required")
+
+    return candidate
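The fail-closed policy encoded above can be condensed into a self-contained sketch; the `get_settings()` lookup is replaced by explicit keyword arguments for illustration, and the function returns a boolean instead of raising:

```python
from urllib.parse import urlparse


def redis_url_is_acceptable(redis_url: str, *, app_env: str = "development",
                            security_mode: str = "auto", tls_mode: str = "auto") -> bool:
    """Condensed take on validate_redis_url_security: True iff the URL passes the gates."""
    parsed = urlparse(redis_url.strip())
    scheme = parsed.scheme.lower()
    if scheme not in {"redis", "rediss"} or not parsed.hostname:
        return False
    # Strict security is explicit ("strict") or implied by auto mode in production.
    strict = security_mode == "strict" or (
        security_mode == "auto" and app_env.strip().lower() in {"production", "prod"}
    )
    # TLS is explicit ("required") or implied by auto mode under strict security.
    require_tls = tls_mode == "required" or (tls_mode == "auto" and strict)
    has_password = bool(parsed.password and parsed.password.strip())
    if strict and not has_password:
        return False
    if require_tls and scheme != "rediss":
        return False
    return True


# A plain local URL is fine in development but rejected under production auto mode,
# which demands both a password and the rediss:// scheme.
assert redis_url_is_acceptable("redis://redis:6379/0", app_env="development")
assert not redis_url_is_acceptable("redis://redis:6379/0", app_env="production")
assert redis_url_is_acceptable("rediss://:s3cret@redis.example.internal:6379/0", app_env="production")
```

This matches the `.env.example` guidance: the development default keeps `REDIS_TLS_MODE=allow_insecure`, while the production baseline switches to an authenticated `rediss://` URL.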
+
+
+def is_inline_preview_mime_type_safe(mime_type: str) -> bool:
+    """Returns whether a MIME type is safe to serve inline from untrusted document uploads."""
+
+    normalized = mime_type.split(";", 1)[0].strip().lower() if mime_type else ""
+    if not normalized:
+        return False
+    if normalized in SCRIPT_CAPABLE_INLINE_MIME_TYPES:
+        return False
+    if normalized in SCRIPT_CAPABLE_XML_MIME_TYPES or normalized.endswith("+xml"):
+        return False
+    return True
|
||||
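The same deny-by-default gate can be exercised standalone. In this sketch the two frozen sets are hypothetical stand-ins for `SCRIPT_CAPABLE_INLINE_MIME_TYPES` and `SCRIPT_CAPABLE_XML_MIME_TYPES`:

```python
# Hypothetical block lists; the real module defines its own constants.
BLOCKED_INLINE = frozenset({"text/html", "application/xhtml+xml"})
BLOCKED_XML = frozenset({"image/svg+xml"})

def inline_safe(mime_type: str) -> bool:
    # Strip parameters (e.g. "; charset=utf-8"), normalize case, then deny
    # anything script-capable, including any "+xml" structured suffix.
    normalized = mime_type.split(";", 1)[0].strip().lower() if mime_type else ""
    if not normalized:
        return False
    if normalized in BLOCKED_INLINE:
        return False
    if normalized in BLOCKED_XML or normalized.endswith("+xml"):
        return False
    return True
```

The `endswith("+xml")` check is the important part: it rejects the whole family of XML-based types that can carry scripts, not just the ones enumerated by name.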

def _normalize_allowlist(allowlist: object) -> tuple[str, ...]:
@@ -10,6 +10,7 @@ from app.api.router import api_router
 from app.core.config import get_settings
 from app.db.base import init_db
 from app.services.app_settings import ensure_app_settings
+from app.services.authentication import ensure_bootstrap_users
 from app.services.handwriting_style import ensure_handwriting_style_collection
 from app.services.storage import ensure_storage
 from app.services.typesense_index import ensure_typesense_collection
@@ -34,10 +35,11 @@ def create_app() -> FastAPI:
     """Builds and configures the FastAPI application instance."""

     app = FastAPI(title="DCM DMS API", version="0.1.0")
+    allowed_origins = [origin.strip() for origin in settings.cors_origins if isinstance(origin, str) and origin.strip()]
     app.add_middleware(
         CORSMiddleware,
-        allow_origins=settings.cors_origins,
-        allow_credentials=True,
+        allow_origins=allowed_origins,
+        allow_credentials=False,
         allow_methods=["*"],
         allow_headers=["*"],
     )
@@ -80,8 +82,9 @@ def create_app() -> FastAPI:
     """Initializes storage directories and database schema on service startup."""

     ensure_storage()
-    ensure_app_settings()
     init_db()
+    ensure_bootstrap_users()
+    ensure_app_settings()
     try:
         ensure_typesense_collection()
     except Exception:

@@ -1,6 +1,7 @@
 """Model exports for ORM metadata discovery."""

+from app.models.auth import AppUser, AuthSession, UserRole
 from app.models.document import Document, DocumentStatus
 from app.models.processing_log import ProcessingLogEntry

-__all__ = ["Document", "DocumentStatus", "ProcessingLogEntry"]
+__all__ = ["AppUser", "AuthSession", "Document", "DocumentStatus", "ProcessingLogEntry", "UserRole"]

66  backend/app/models/auth.py  Normal file
@@ -0,0 +1,66 @@
"""Data models for authenticated users and issued API sessions."""

import uuid
from datetime import UTC, datetime
from enum import Enum

from sqlalchemy import Boolean, DateTime, Enum as SqlEnum, ForeignKey, String
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import Mapped, mapped_column, relationship

from app.db.base import Base


class UserRole(str, Enum):
    """Declares authorization roles used for API route access control."""

    ADMIN = "admin"
    USER = "user"


class AppUser(Base):
    """Stores one authenticatable user account with role-bound authorization."""

    __tablename__ = "app_users"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    username: Mapped[str] = mapped_column(String(128), nullable=False, unique=True, index=True)
    password_hash: Mapped[str] = mapped_column(String(512), nullable=False)
    role: Mapped[UserRole] = mapped_column(SqlEnum(UserRole), nullable=False, default=UserRole.USER)
    is_active: Mapped[bool] = mapped_column(Boolean, nullable=False, default=True)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, default=lambda: datetime.now(UTC))
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        nullable=False,
        default=lambda: datetime.now(UTC),
        onupdate=lambda: datetime.now(UTC),
    )

    sessions: Mapped[list["AuthSession"]] = relationship(
        "AuthSession",
        back_populates="user",
        cascade="all, delete-orphan",
    )


class AuthSession(Base):
    """Stores one issued bearer session token for a specific authenticated user."""

    __tablename__ = "auth_sessions"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    user_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), ForeignKey("app_users.id", ondelete="CASCADE"), nullable=False, index=True)
    token_hash: Mapped[str] = mapped_column(String(128), nullable=False, unique=True, index=True)
    expires_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, index=True)
    revoked_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), nullable=True)
    user_agent: Mapped[str | None] = mapped_column(String(512), nullable=True)
    ip_address: Mapped[str | None] = mapped_column(String(64), nullable=True)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, default=lambda: datetime.now(UTC))
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        nullable=False,
        default=lambda: datetime.now(UTC),
        onupdate=lambda: datetime.now(UTC),
    )

    user: Mapped[AppUser] = relationship("AppUser", back_populates="sessions")
@@ -38,6 +38,12 @@ class Document(Base):
     suggested_path: Mapped[str | None] = mapped_column(String(1024), nullable=True)
     tags: Mapped[list[str]] = mapped_column(ARRAY(String), nullable=False, default=list)
     suggested_tags: Mapped[list[str]] = mapped_column(ARRAY(String), nullable=False, default=list)
+    owner_user_id: Mapped[uuid.UUID | None] = mapped_column(
+        UUID(as_uuid=True),
+        ForeignKey("app_users.id", ondelete="SET NULL"),
+        nullable=True,
+        index=True,
+    )
     metadata_json: Mapped[dict] = mapped_column(JSONB, nullable=False, default=dict)
     extracted_text: Mapped[str] = mapped_column(Text, nullable=False, default="")
     image_text_type: Mapped[str | None] = mapped_column(String(64), nullable=True)
@@ -63,3 +69,4 @@ class Document(Base):
         foreign_keys=[parent_document_id],
         post_update=True,
     )
+    owner_user: Mapped["AppUser | None"] = relationship("AppUser", foreign_keys=[owner_user_id], post_update=True)

48  backend/app/schemas/auth.py  Normal file
@@ -0,0 +1,48 @@
"""Pydantic schemas for authentication and session API payloads."""

from datetime import datetime
from uuid import UUID

from pydantic import BaseModel, Field

from app.models.auth import UserRole


class AuthLoginRequest(BaseModel):
    """Represents credential input used to create one authenticated API session."""

    username: str = Field(min_length=1, max_length=128)
    password: str = Field(min_length=1, max_length=256)


class AuthUserResponse(BaseModel):
    """Represents one authenticated user identity and authorization role."""

    id: UUID
    username: str
    role: UserRole

    class Config:
        """Enables ORM object parsing for SQLAlchemy model instances."""

        from_attributes = True


class AuthSessionResponse(BaseModel):
    """Represents active session metadata for one authenticated user."""

    user: AuthUserResponse
    expires_at: datetime


class AuthLoginResponse(AuthSessionResponse):
    """Represents one newly issued bearer token and associated user context."""

    access_token: str
    token_type: str = "bearer"


class AuthLogoutResponse(BaseModel):
    """Represents logout outcome after current session revocation attempt."""

    revoked: bool
@@ -1,10 +1,24 @@
"""Persistent single-user application settings service backed by host-mounted storage."""

import base64
import binascii
import hashlib
import hmac
import json
import os
import re
import secrets
from pathlib import Path
from typing import Any

try:
    from cryptography.fernet import Fernet, InvalidToken
except Exception:  # pragma: no cover - dependency failures are surfaced at runtime usage.
    Fernet = None  # type: ignore[assignment]

    class InvalidToken(Exception):
        """Fallback InvalidToken type used when cryptography dependency import fails."""

from app.core.config import get_settings, normalize_and_validate_provider_base_url

@@ -57,6 +71,221 @@ DEFAULT_ROUTING_PROMPT = (
    "Confidence must be between 0 and 1."
)

PROVIDER_API_KEY_CIPHERTEXT_PREFIX = "enc-v2"
PROVIDER_API_KEY_LEGACY_CIPHERTEXT_PREFIX = "enc-v1"
PROVIDER_API_KEY_KEYFILE_NAME = ".settings-api-key"
PROVIDER_API_KEY_LEGACY_STREAM_CONTEXT = b"dcm-provider-api-key-stream"
PROVIDER_API_KEY_LEGACY_AUTH_CONTEXT = b"dcm-provider-api-key-auth"
PROVIDER_API_KEY_LEGACY_NONCE_BYTES = 16
PROVIDER_API_KEY_LEGACY_TAG_BYTES = 32


def _settings_api_key_path() -> Path:
    """Returns the storage path used for local symmetric encryption key persistence."""

    return settings.storage_root / PROVIDER_API_KEY_KEYFILE_NAME


def _write_private_text_file(path: Path, content: str) -> None:
    """Writes text files with restrictive owner-only permissions for local secret material."""

    path.parent.mkdir(parents=True, exist_ok=True)
    file_descriptor = os.open(str(path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(file_descriptor, "w", encoding="utf-8") as handle:
        handle.write(content)
    os.chmod(path, 0o600)


def _urlsafe_b64encode_no_padding(data: bytes) -> str:
    """Encodes bytes to URL-safe base64 without padding for compact JSON persistence."""

    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def _urlsafe_b64decode_no_padding(data: str) -> bytes:
    """Decodes URL-safe base64 values that may omit trailing padding characters."""

    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded.encode("ascii"))

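The padding-free helpers above round-trip because the decoder re-adds exactly the padding the encoder strips (`-len(data) % 4` trailing `=` characters). A standalone sketch:

```python
import base64

def b64e(data: bytes) -> str:
    # Encode, then drop the "=" padding for compact storage.
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")

def b64d(data: str) -> bytes:
    # Restore padding to the next multiple of four characters, then decode.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded.encode("ascii"))
```

The URL-safe alphabet also guarantees the encoded value never contains `$` or `:`, which is what lets the surrounding code use those characters as field separators.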
def _derive_provider_api_key_key() -> bytes:
    """Resolves the master key used to encrypt provider API keys for settings storage."""

    configured_key = settings.app_settings_encryption_key.strip()
    if configured_key:
        try:
            decoded = _urlsafe_b64decode_no_padding(configured_key)
            if len(decoded) >= 32:
                return decoded[:32]
        except (binascii.Error, ValueError):
            pass
        return hashlib.sha256(configured_key.encode("utf-8")).digest()

    key_path = _settings_api_key_path()
    if key_path.exists():
        try:
            persisted = key_path.read_text(encoding="utf-8").strip()
            decoded = _urlsafe_b64decode_no_padding(persisted)
            if len(decoded) >= 32:
                return decoded[:32]
        except (OSError, UnicodeDecodeError, binascii.Error, ValueError):
            pass

    generated = secrets.token_bytes(32)
    _write_private_text_file(key_path, _urlsafe_b64encode_no_padding(generated))
    return generated

def _legacy_xor_bytes(left: bytes, right: bytes) -> bytes:
    """Applies byte-wise XOR for equal-length byte sequences used by legacy ciphertext migration."""

    return bytes(first ^ second for first, second in zip(left, right))


def _legacy_derive_stream_cipher_bytes(master_key: bytes, nonce: bytes, length: int) -> bytes:
    """Derives legacy deterministic stream bytes from HMAC-SHA256 blocks for migration reads."""

    stream = bytearray()
    counter = 0
    while len(stream) < length:
        counter_bytes = counter.to_bytes(4, "big")
        block = hmac.new(
            master_key,
            PROVIDER_API_KEY_LEGACY_STREAM_CONTEXT + nonce + counter_bytes,
            hashlib.sha256,
        ).digest()
        stream.extend(block)
        counter += 1
    return bytes(stream[:length])

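For illustration, the legacy construction above is an HMAC-SHA256 counter-mode keystream XORed over the plaintext, so encryption and decryption are the same XOR operation. A standalone sketch of the round trip (the `b"demo-stream"` context label is a hypothetical stand-in for the module's constant):

```python
import hashlib
import hmac
import secrets

def keystream(master_key: bytes, nonce: bytes, length: int, context: bytes = b"demo-stream") -> bytes:
    # Concatenate HMAC-SHA256 blocks keyed by (context, nonce, counter) until
    # enough bytes exist, then truncate to the requested length.
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        block = hmac.new(master_key, context + nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        stream.extend(block)
        counter += 1
    return bytes(stream[:length])

key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
plaintext = b"provider-api-key"
ciphertext = bytes(a ^ b for a, b in zip(plaintext, keystream(key, nonce, len(plaintext))))
recovered = bytes(a ^ b for a, b in zip(ciphertext, keystream(key, nonce, len(ciphertext))))
```

The real code only uses this path for reading `enc-v1` payloads and as a fallback when `cryptography` is unavailable; new ciphertexts are Fernet tokens.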
def _provider_key_fernet(master_key: bytes) -> Fernet:
    """Builds a Fernet instance from 32-byte symmetric key material."""

    if Fernet is None:
        raise AppSettingsValidationError("cryptography dependency is not available")
    fernet_key = base64.urlsafe_b64encode(master_key[:32])
    return Fernet(fernet_key)


def _encrypt_provider_api_key_fallback(value: str) -> str:
    """Encrypts provider keys with the legacy HMAC stream construction when cryptography is unavailable."""

    plaintext = value.encode("utf-8")
    master_key = _derive_provider_api_key_key()
    nonce = secrets.token_bytes(PROVIDER_API_KEY_LEGACY_NONCE_BYTES)
    keystream = _legacy_derive_stream_cipher_bytes(master_key, nonce, len(plaintext))
    ciphertext = _legacy_xor_bytes(plaintext, keystream)
    tag = hmac.new(
        master_key,
        PROVIDER_API_KEY_LEGACY_AUTH_CONTEXT + nonce + ciphertext,
        hashlib.sha256,
    ).digest()
    payload = nonce + ciphertext + tag
    encoded = _urlsafe_b64encode_no_padding(payload)
    return f"{PROVIDER_API_KEY_CIPHERTEXT_PREFIX}:{encoded}"


def _encrypt_provider_api_key(value: str) -> str:
    """Encrypts one provider API key for at-rest JSON persistence."""

    normalized = value.strip()
    if not normalized:
        return ""

    if Fernet is None:
        return _encrypt_provider_api_key_fallback(normalized)
    master_key = _derive_provider_api_key_key()
    token = _provider_key_fernet(master_key).encrypt(normalized.encode("utf-8")).decode("ascii")
    return f"{PROVIDER_API_KEY_CIPHERTEXT_PREFIX}:{token}"


def _decrypt_provider_api_key_legacy_payload(encoded_payload: str) -> str:
    """Decrypts legacy stream-cipher payload bytes used for migration and fallback reads."""

    if not encoded_payload:
        raise AppSettingsValidationError("Provider API key ciphertext is missing payload bytes")
    try:
        payload = _urlsafe_b64decode_no_padding(encoded_payload)
    except (binascii.Error, ValueError) as error:
        raise AppSettingsValidationError("Provider API key ciphertext is not valid base64") from error

    minimum_length = PROVIDER_API_KEY_LEGACY_NONCE_BYTES + PROVIDER_API_KEY_LEGACY_TAG_BYTES
    if len(payload) < minimum_length:
        raise AppSettingsValidationError("Provider API key ciphertext payload is truncated")

    nonce = payload[:PROVIDER_API_KEY_LEGACY_NONCE_BYTES]
    ciphertext = payload[PROVIDER_API_KEY_LEGACY_NONCE_BYTES:-PROVIDER_API_KEY_LEGACY_TAG_BYTES]
    received_tag = payload[-PROVIDER_API_KEY_LEGACY_TAG_BYTES:]
    master_key = _derive_provider_api_key_key()
    expected_tag = hmac.new(
        master_key,
        PROVIDER_API_KEY_LEGACY_AUTH_CONTEXT + nonce + ciphertext,
        hashlib.sha256,
    ).digest()
    if not hmac.compare_digest(received_tag, expected_tag):
        raise AppSettingsValidationError("Provider API key ciphertext integrity check failed")

    keystream = _legacy_derive_stream_cipher_bytes(master_key, nonce, len(ciphertext))
    plaintext = _legacy_xor_bytes(ciphertext, keystream)
    try:
        return plaintext.decode("utf-8").strip()
    except UnicodeDecodeError as error:
        raise AppSettingsValidationError("Provider API key ciphertext is not valid UTF-8") from error


def _decrypt_provider_api_key_legacy(value: str) -> str:
    """Decrypts legacy `enc-v1` payloads to support non-breaking key migration."""

    encoded_payload = value.split(":", 1)[1]
    return _decrypt_provider_api_key_legacy_payload(encoded_payload)


def _decrypt_provider_api_key(value: str) -> str:
    """Decrypts provider API key ciphertext while rejecting tampered payloads."""

    normalized = value.strip()
    if not normalized:
        return ""
    if not normalized.startswith(f"{PROVIDER_API_KEY_CIPHERTEXT_PREFIX}:") and not normalized.startswith(
        f"{PROVIDER_API_KEY_LEGACY_CIPHERTEXT_PREFIX}:"
    ):
        return normalized

    if normalized.startswith(f"{PROVIDER_API_KEY_LEGACY_CIPHERTEXT_PREFIX}:"):
        return _decrypt_provider_api_key_legacy(normalized)

    token = normalized.split(":", 1)[1].strip()
    if not token:
        raise AppSettingsValidationError("Provider API key ciphertext is missing payload bytes")
    if Fernet is None:
        return _decrypt_provider_api_key_legacy_payload(token)
    try:
        plaintext = _provider_key_fernet(_derive_provider_api_key_key()).decrypt(token.encode("ascii"))
    except (InvalidToken, ValueError, UnicodeEncodeError) as error:
        raise AppSettingsValidationError("Provider API key ciphertext integrity check failed") from error
    try:
        return plaintext.decode("utf-8").strip()
    except UnicodeDecodeError as error:
        raise AppSettingsValidationError("Provider API key ciphertext is not valid UTF-8") from error


def _read_provider_api_key(provider_payload: dict[str, Any]) -> str:
    """Reads provider API key values from encrypted or legacy plaintext settings payloads."""

    encrypted_value = provider_payload.get("api_key_encrypted")
    if isinstance(encrypted_value, str) and encrypted_value.strip():
        try:
            return _decrypt_provider_api_key(encrypted_value)
        except AppSettingsValidationError:
            return ""

    plaintext_value = provider_payload.get("api_key")
    if plaintext_value is None:
        return ""
    return str(plaintext_value).strip()


def _default_settings() -> dict[str, Any]:
    """Builds default settings including providers and model task bindings."""
@@ -243,8 +472,17 @@ def _normalize_provider(
     if provider_type != "openai_compatible":
         provider_type = "openai_compatible"

-    api_key_value = payload.get("api_key", fallback_values.get("api_key", defaults["api_key"]))
-    api_key = str(api_key_value).strip() if api_key_value is not None else ""
+    payload_api_key = _read_provider_api_key(payload)
+    fallback_api_key = _read_provider_api_key(fallback_values)
+    default_api_key = _read_provider_api_key(defaults)
+    if "api_key" in payload and payload.get("api_key") is not None:
+        api_key = str(payload.get("api_key")).strip()
+    elif payload_api_key:
+        api_key = payload_api_key
+    elif fallback_api_key:
+        api_key = fallback_api_key
+    else:
+        api_key = default_api_key

     raw_base_url = str(payload.get("base_url", fallback_values.get("base_url", defaults["base_url"]))).strip()
     if not raw_base_url:
@@ -266,6 +504,7 @@ def _normalize_provider(
             )
         ),
         "api_key": api_key,
+        "api_key_encrypted": _encrypt_provider_api_key(api_key),
     }

@@ -653,6 +892,26 @@ def _sanitize_settings(payload: dict[str, Any]) -> dict[str, Any]:
    }


def _serialize_settings_for_storage(payload: dict[str, Any]) -> dict[str, Any]:
    """Converts sanitized runtime payload into storage-safe form without plaintext provider keys."""

    storage_payload = dict(payload)
    providers_storage: list[dict[str, Any]] = []
    for provider in payload.get("providers", []):
        if not isinstance(provider, dict):
            continue
        provider_storage = dict(provider)
        plaintext_api_key = str(provider_storage.pop("api_key", "")).strip()
        encrypted_api_key = str(provider_storage.get("api_key_encrypted", "")).strip()
        if plaintext_api_key:
            encrypted_api_key = _encrypt_provider_api_key(plaintext_api_key)
        provider_storage["api_key_encrypted"] = encrypted_api_key
        providers_storage.append(provider_storage)

    storage_payload["providers"] = providers_storage
    return storage_payload


def ensure_app_settings() -> None:
    """Creates a settings file with defaults when no persisted settings are present."""
@@ -662,7 +921,7 @@ def ensure_app_settings() -> None:
         return

     defaults = _sanitize_settings(_default_settings())
-    path.write_text(json.dumps(defaults, indent=2), encoding="utf-8")
+    _write_private_text_file(path, json.dumps(_serialize_settings_for_storage(defaults), indent=2))


 def _read_raw_settings() -> dict[str, Any]:
@@ -682,7 +941,8 @@ def _write_settings(payload: dict[str, Any]) -> None:

     path = _settings_path()
     path.parent.mkdir(parents=True, exist_ok=True)
-    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
+    storage_payload = _serialize_settings_for_storage(payload)
+    _write_private_text_file(path, json.dumps(storage_payload, indent=2))


 def read_app_settings() -> dict[str, Any]:
@@ -879,16 +1139,21 @@ def update_app_settings(


 def read_handwriting_provider_settings() -> dict[str, Any]:
-    """Returns OCR settings in legacy shape for the handwriting transcription service."""
+    """Returns OCR settings in legacy shape with DNS-revalidated provider base URL safety checks."""

     runtime = read_task_runtime_settings(TASK_OCR_HANDWRITING)
     provider = runtime["provider"]
     task = runtime["task"]
+    raw_base_url = str(provider.get("base_url", settings.default_openai_base_url))
+    try:
+        normalized_base_url = normalize_and_validate_provider_base_url(raw_base_url, resolve_dns=True)
+    except ValueError as error:
+        raise AppSettingsValidationError(str(error)) from error

     return {
         "provider": provider["provider_type"],
         "enabled": bool(task.get("enabled", True)),
-        "openai_base_url": str(provider.get("base_url", settings.default_openai_base_url)),
+        "openai_base_url": normalized_base_url,
         "openai_model": str(task.get("model", settings.default_openai_model)),
         "openai_timeout_seconds": int(provider.get("timeout_seconds", settings.default_openai_timeout_seconds)),
         "openai_api_key": str(provider.get("api_key", "")),

289  backend/app/services/authentication.py  Normal file
@@ -0,0 +1,289 @@
"""Authentication services for user credential validation and session issuance."""
|
||||
|
||||
import base64
|
||||
import binascii
|
||||
from dataclasses import dataclass
|
||||
from datetime import UTC, datetime, timedelta
|
||||
import hashlib
|
||||
import hmac
|
||||
import secrets
|
||||
import uuid
|
||||
|
||||
from sqlalchemy import delete, select
|
||||
from sqlalchemy.orm import Session
|
||||
|
||||
from app.core.config import Settings, get_settings
|
||||
from app.db.base import SessionLocal
|
||||
from app.models.auth import AppUser, AuthSession, UserRole
|
||||
|
||||
|
||||
PASSWORD_HASH_SCHEME = "pbkdf2_sha256"
|
||||
DEFAULT_AUTH_FALLBACK_SECRET = "dcm-session-secret"
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class IssuedSession:
|
||||
"""Represents one newly issued bearer session token and expiration timestamp."""
|
||||
|
||||
token: str
|
||||
expires_at: datetime
|
||||
|
||||
|
||||
def normalize_username(username: str) -> str:
|
||||
"""Normalizes usernames to a stable lowercase identity key."""
|
||||
|
||||
return username.strip().lower()
|
||||
|
||||
|
||||
def _urlsafe_b64encode_no_padding(data: bytes) -> str:
|
||||
"""Encodes bytes to compact URL-safe base64 without padding."""
|
||||
|
||||
return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
|
||||
|
||||
|
||||
def _urlsafe_b64decode_no_padding(data: str) -> bytes:
|
||||
"""Decodes URL-safe base64 values that may omit trailing padding characters."""
|
||||
|
||||
padded = data + "=" * (-len(data) % 4)
|
||||
return base64.urlsafe_b64decode(padded.encode("ascii"))
|
||||
|
||||
|
||||
def _password_iterations(settings: Settings) -> int:
|
||||
"""Returns PBKDF2 iteration count clamped to a secure operational range."""
|
||||
|
||||
return max(200_000, min(1_200_000, int(settings.auth_password_pbkdf2_iterations)))
|
||||
|
||||
|
||||
def hash_password(password: str, settings: Settings | None = None) -> str:
    """Derives and formats a PBKDF2-SHA256 password hash for persisted user credentials."""

    resolved_settings = settings or get_settings()
    normalized_password = password.strip()
    if not normalized_password:
        raise ValueError("Password must not be empty")

    iterations = _password_iterations(resolved_settings)
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256",
        normalized_password.encode("utf-8"),
        salt,
        iterations,
        dklen=32,
    )
    return (
        f"{PASSWORD_HASH_SCHEME}$"
        f"{iterations}$"
        f"{_urlsafe_b64encode_no_padding(salt)}$"
        f"{_urlsafe_b64encode_no_padding(derived)}"
    )


def verify_password(password: str, stored_hash: str, settings: Settings | None = None) -> bool:
    """Verifies one plaintext password against persisted PBKDF2-SHA256 hash material."""

    resolved_settings = settings or get_settings()
    normalized_password = password.strip()
    if not normalized_password:
        return False

    parts = stored_hash.strip().split("$")
    if len(parts) != 4:
        return False
    scheme, iterations_text, salt_text, digest_text = parts
    if scheme != PASSWORD_HASH_SCHEME:
        return False
    try:
        iterations = int(iterations_text)
    except ValueError:
        return False
    if iterations < 200_000 or iterations > 2_000_000:
        return False
    try:
        salt = _urlsafe_b64decode_no_padding(salt_text)
        expected_digest = _urlsafe_b64decode_no_padding(digest_text)
    except (binascii.Error, ValueError):
        return False
    derived_digest = hashlib.pbkdf2_hmac(
        "sha256",
        normalized_password.encode("utf-8"),
        salt,
        iterations,
        dklen=len(expected_digest),
    )
    if not hmac.compare_digest(expected_digest, derived_digest):
        return False
    return iterations >= _password_iterations(resolved_settings)

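The two functions above form a round trip over a `scheme$iterations$salt$digest` string. This standalone sketch (hypothetical `hash_pw`/`verify_pw` names, without the settings-driven iteration clamp) shows the format:

```python
import base64
import hashlib
import hmac
import secrets

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")

def _unb64(data: str) -> bytes:
    return base64.urlsafe_b64decode((data + "=" * (-len(data) % 4)).encode("ascii"))

def hash_pw(password: str, iterations: int = 200_000) -> str:
    # Fresh random salt per hash; the derived key is stored, never the password.
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations, dklen=32)
    return f"pbkdf2_sha256${iterations}${_b64(salt)}${_b64(derived)}"

def verify_pw(password: str, stored: str) -> bool:
    scheme, it_text, salt_text, digest_text = stored.split("$")
    if scheme != "pbkdf2_sha256":
        return False
    salt, expected = _unb64(salt_text), _unb64(digest_text)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(it_text), dklen=len(expected))
    # Constant-time comparison avoids leaking match length via timing.
    return hmac.compare_digest(expected, derived)
```

Storing the iteration count in the hash string is what lets `verify_password` also enforce a floor, so hashes created under a weaker historical configuration fail verification and force a rehash.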
def _auth_session_secret(settings: Settings) -> bytes:
    """Resolves a stable secret used to hash issued bearer session tokens."""

    candidate = settings.auth_session_pepper.strip() or settings.app_settings_encryption_key.strip()
    if not candidate:
        candidate = DEFAULT_AUTH_FALLBACK_SECRET
    return hashlib.sha256(candidate.encode("utf-8")).digest()


def _hash_session_token(token: str, settings: Settings | None = None) -> str:
    """Derives a deterministic HMAC-SHA256 token hash guarded by secret pepper material."""

    resolved_settings = settings or get_settings()
    secret = _auth_session_secret(resolved_settings)
    return hmac.new(secret, token.encode("utf-8"), hashlib.sha256).hexdigest()


def _new_session_token(settings: Settings) -> str:
    """Creates a random URL-safe bearer token for one API session."""

    token_bytes = max(24, min(128, int(settings.auth_session_token_bytes)))
    return secrets.token_urlsafe(token_bytes)

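Because tokens are stored only as peppered HMAC digests, a leaked `auth_sessions` table does not yield usable bearer tokens without the server-side pepper. A standalone sketch with a hypothetical `"example-pepper"` value:

```python
import hashlib
import hmac
import secrets

def hash_session_token(token: str, pepper: str) -> str:
    # Stretch the pepper into a fixed-size HMAC key, then digest the token.
    secret = hashlib.sha256(pepper.encode("utf-8")).digest()
    return hmac.new(secret, token.encode("utf-8"), hashlib.sha256).hexdigest()

token = secrets.token_urlsafe(32)
digest = hash_session_token(token, "example-pepper")
```

Lookup on each request recomputes the digest from the presented token and queries by `token_hash`, so the plaintext token exists only in the client and in transit.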
def _resolve_optional_user_credentials(username: str, password: str) -> tuple[str, str] | None:
|
||||
"""Returns optional user credentials only when both username and password are configured."""
|
||||
|
||||
normalized_username = normalize_username(username)
|
||||
normalized_password = password.strip()
|
||||
if not normalized_username and not normalized_password:
|
||||
return None
|
||||
if not normalized_username or not normalized_password:
|
||||
raise ValueError("Optional bootstrap user requires both username and password")
|
||||
return normalized_username, normalized_password
|
||||
|
||||
|
||||
def _upsert_bootstrap_user(session: Session, *, username: str, password: str, role: UserRole) -> AppUser:
|
||||
"""Creates or updates one bootstrap account with deterministic role assignment."""
|
||||
|
||||
existing = session.execute(select(AppUser).where(AppUser.username == username)).scalar_one_or_none()
|
||||
password_hash = hash_password(password)
|
||||
if existing is None:
|
||||
user = AppUser(
|
||||
username=username,
|
||||
password_hash=password_hash,
|
||||
role=role,
|
||||
is_active=True,
|
||||
)
|
||||
session.add(user)
|
||||
return user
|
||||
|
||||
existing.password_hash = password_hash
|
||||
existing.role = role
|
||||
existing.is_active = True
|
||||
return existing
|
||||
|
||||
|
||||
def ensure_bootstrap_users() -> None:
|
||||
"""Creates or refreshes bootstrap user accounts from runtime environment credentials."""
|
||||
|
||||
settings = get_settings()
|
||||
admin_username = normalize_username(settings.auth_bootstrap_admin_username)
|
||||
admin_password = settings.auth_bootstrap_admin_password.strip()
|
||||
if not admin_username:
|
||||
raise RuntimeError("AUTH_BOOTSTRAP_ADMIN_USERNAME must not be empty")
|
||||
if not admin_password:
|
||||
raise RuntimeError("AUTH_BOOTSTRAP_ADMIN_PASSWORD must not be empty")
|
||||
|
||||
optional_user_credentials = _resolve_optional_user_credentials(
|
||||
username=settings.auth_bootstrap_user_username,
|
||||
password=settings.auth_bootstrap_user_password,
|
||||
)
|
||||
|
||||
with SessionLocal() as session:
|
||||
_upsert_bootstrap_user(
|
||||
session,
|
||||
username=admin_username,
|
||||
password=admin_password,
|
||||
role=UserRole.ADMIN,
|
||||
)
|
||||
if optional_user_credentials is not None:
|
||||
user_username, user_password = optional_user_credentials
|
||||
if user_username == admin_username:
|
||||
raise RuntimeError("AUTH_BOOTSTRAP_USER_USERNAME must differ from admin username")
|
||||
_upsert_bootstrap_user(
|
||||
session,
|
||||
username=user_username,
|
||||
password=user_password,
|
||||
role=UserRole.USER,
|
||||
)
|
||||
session.commit()
|
||||
|
||||
|
||||
def authenticate_user(session: Session, *, username: str, password: str) -> AppUser | None:
|
||||
"""Authenticates one username/password pair and returns active account on success."""
|
||||
|
||||
normalized_username = normalize_username(username)
|
||||
if not normalized_username:
|
||||
return None
|
||||
user = session.execute(select(AppUser).where(AppUser.username == normalized_username)).scalar_one_or_none()
|
||||
if user is None or not user.is_active:
|
||||
return None
|
||||
if not verify_password(password, user.password_hash):
|
||||
return None
|
||||
return user
|
||||
|
||||
|
||||
def issue_user_session(
    session: Session,
    *,
    user: AppUser,
    user_agent: str | None = None,
    ip_address: str | None = None,
) -> IssuedSession:
    """Issues one new bearer token session for a validated user account."""

    settings = get_settings()
    now = datetime.now(UTC)
    ttl_minutes = max(5, min(7 * 24 * 60, int(settings.auth_session_ttl_minutes)))
    expires_at = now + timedelta(minutes=ttl_minutes)
    token = _new_session_token(settings)
    token_hash = _hash_session_token(token, settings)

    session.execute(
        delete(AuthSession).where(
            AuthSession.user_id == user.id,
            AuthSession.expires_at <= now,
        )
    )
    session_entry = AuthSession(
        user_id=user.id,
        token_hash=token_hash,
        expires_at=expires_at,
        user_agent=(user_agent or "").strip()[:512] or None,
        ip_address=(ip_address or "").strip()[:64] or None,
    )
    session.add(session_entry)
    return IssuedSession(token=token, expires_at=expires_at)

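The `ttl_minutes` expression above bounds any configured session lifetime to between 5 minutes and 7 days. The same clamp as a standalone sketch (the function name is illustrative, not part of the codebase):

```python
def clamp_ttl_minutes(configured: int, floor: int = 5, ceiling: int = 7 * 24 * 60) -> int:
    """Clamp a configured session TTL (in minutes) into [floor, ceiling]."""
    return max(floor, min(ceiling, int(configured)))
```

A zero or negative setting still yields a 5-minute session rather than an immediately-expired token, and an absurdly large setting is capped at one week.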
def resolve_auth_session(session: Session, *, token: str) -> AuthSession | None:
    """Resolves one non-revoked and non-expired session from a bearer token value."""

    normalized = token.strip()
    if not normalized:
        return None
    token_hash = _hash_session_token(normalized)
    now = datetime.now(UTC)
    session_entry = session.execute(
        select(AuthSession).where(
            AuthSession.token_hash == token_hash,
            AuthSession.revoked_at.is_(None),
            AuthSession.expires_at > now,
        )
    ).scalar_one_or_none()
    if session_entry is None or session_entry.user is None:
        return None
    if not session_entry.user.is_active:
        return None
    return session_entry

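`_hash_session_token` itself is outside this diff, but the lookup above only works because the stored hash is deterministic for a given token. A plausible shape, assuming a keyed HMAC-SHA256 so the database never holds raw bearer tokens (helper name and secret parameter are assumptions, not the real implementation):

```python
import hashlib
import hmac


def hash_session_token(token: str, secret: str) -> str:
    """Deterministic keyed digest: recompute on every lookup, store only the hex digest."""
    return hmac.new(secret.encode("utf-8"), token.encode("utf-8"), hashlib.sha256).hexdigest()
```

Keying the hash means a leaked `auth_sessions` table alone is not enough to forge a valid bearer token.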
def revoke_auth_session(session: Session, *, session_id: uuid.UUID) -> bool:
    """Revokes one active session by identifier and returns whether a change was applied."""

    existing = session.execute(select(AuthSession).where(AuthSession.id == session_id)).scalar_one_or_none()
    if existing is None or existing.revoked_at is not None:
        return False
    existing.revoked_at = datetime.now(UTC)
    return True
@@ -299,17 +299,24 @@ def extract_text_content(filename: str, data: bytes, mime_type: str) -> Extracti
     )


-def extract_archive_members(data: bytes, depth: int = 0) -> list[ArchiveMember]:
-    """Extracts processable ZIP members within configured decompression safety budgets."""
+def extract_archive_members(data: bytes, depth: int = 0, max_members: int | None = None) -> list[ArchiveMember]:
+    """Extracts processable ZIP members with depth-aware and decompression safety guardrails."""

     members: list[ArchiveMember] = []
-    if depth > settings.max_zip_depth:
+    normalized_depth = max(0, depth)
+    if normalized_depth >= settings.max_zip_depth:
         return members
+
+    member_limit = settings.max_zip_members
+    if max_members is not None:
+        member_limit = max(0, min(settings.max_zip_members, int(max_members)))
+    if member_limit <= 0:
+        return members

     total_uncompressed_bytes = 0
     try:
         with zipfile.ZipFile(io.BytesIO(data)) as archive:
-            infos = [info for info in archive.infolist() if not info.is_dir()][: settings.max_zip_members]
+            infos = [info for info in archive.infolist() if not info.is_dir()][:member_limit]
             for info in infos:
                 if info.file_size <= 0:
                     continue
@@ -10,6 +10,7 @@ from typing import Any
 from openai import APIConnectionError, APIError, APITimeoutError, OpenAI
 from PIL import Image, ImageOps

+from app.core.config import normalize_and_validate_provider_base_url
 from app.services.app_settings import DEFAULT_OCR_PROMPT, read_handwriting_provider_settings

 MAX_IMAGE_SIDE = 2000
@@ -151,12 +152,17 @@ def _normalize_image_bytes(image_data: bytes) -> tuple[bytes, str]:


 def _create_client(provider_settings: dict[str, Any]) -> OpenAI:
-    """Creates an OpenAI client configured for compatible endpoints and timeouts."""
+    """Creates an OpenAI client configured with DNS-revalidated endpoint and request timeout controls."""

     api_key = str(provider_settings.get("openai_api_key", "")).strip() or "no-key-required"
+    raw_base_url = str(provider_settings.get("openai_base_url", "")).strip()
+    try:
+        normalized_base_url = normalize_and_validate_provider_base_url(raw_base_url, resolve_dns=True)
+    except ValueError as error:
+        raise HandwritingTranscriptionError(f"invalid_provider_base_url:{error}") from error
     return OpenAI(
         api_key=api_key,
-        base_url=str(provider_settings["openai_base_url"]),
+        base_url=normalized_base_url,
         timeout=int(provider_settings["openai_timeout_seconds"]),
     )
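`normalize_and_validate_provider_base_url` lives in `app.core.config` and is not shown in this diff; with `resolve_dns=True` it evidently re-resolves the provider host at client-creation time. A simplified sketch of that kind of SSRF guard, assuming stdlib `socket`/`ipaddress` and a private/loopback rejection policy (the real policy may differ):

```python
import ipaddress
import socket
from urllib.parse import urlparse


def validate_provider_base_url(url: str) -> str:
    """Reject non-HTTP(S) URLs and hosts that resolve to private or loopback addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in {"http", "https"} or not parsed.hostname:
        raise ValueError("unsupported provider base URL")
    for info in socket.getaddrinfo(parsed.hostname, None):
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            raise ValueError(f"host resolves to disallowed address {ip}")
    return url.rstrip("/")
```

Re-resolving at client-creation time, rather than only when settings are saved, narrows the window for DNS-rebinding attacks against the outbound OCR calls.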
@@ -6,10 +6,13 @@ from uuid import UUID
 from sqlalchemy import delete, func, select
 from sqlalchemy.orm import Session

+from app.core.config import get_settings
 from app.models.document import Document
 from app.models.processing_log import ProcessingLogEntry


+settings = get_settings()
+
 MAX_STAGE_LENGTH = 64
 MAX_EVENT_LENGTH = 256
 MAX_LEVEL_LENGTH = 16
@@ -37,9 +40,49 @@ def _trim(value: str | None, max_length: int) -> str | None:


 def _safe_payload(payload_json: dict[str, Any] | None) -> dict[str, Any]:
-    """Ensures payload values are persisted as dictionaries."""
+    """Normalizes payload persistence mode using metadata-only defaults for sensitive content."""

-    return payload_json if isinstance(payload_json, dict) else {}
+    if not isinstance(payload_json, dict):
+        return {}
+    if settings.processing_log_store_payload_text:
+        return payload_json
+    return _metadata_only_payload(payload_json)
+
+
+def _metadata_only_payload(payload_json: dict[str, Any]) -> dict[str, Any]:
+    """Converts payload content into metadata descriptors without persisting raw text values."""
+
+    metadata: dict[str, Any] = {}
+    for index, (raw_key, raw_value) in enumerate(payload_json.items()):
+        if index >= 80:
+            break
+        key = str(raw_key)
+        metadata[key] = _metadata_only_payload_value(raw_value)
+    return metadata
+
+
+def _metadata_only_payload_value(value: Any) -> Any:
+    """Converts one payload value into non-sensitive metadata representation."""
+
+    if isinstance(value, dict):
+        return _metadata_only_payload(value)
+    if isinstance(value, (list, tuple)):
+        items = list(value)
+        return {
+            "item_count": len(items),
+            "items_preview": [_metadata_only_payload_value(item) for item in items[:20]],
+        }
+    if isinstance(value, str):
+        normalized = value.strip()
+        return {
+            "text_chars": len(normalized),
+            "text_omitted": bool(normalized),
+        }
+    if isinstance(value, bytes):
+        return {"binary_bytes": len(value)}
+    if isinstance(value, (int, float, bool)) or value is None:
+        return value
+    return {"value_type": type(value).__name__}


 def set_processing_log_autocommit(session: Session, enabled: bool) -> None:
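Applied to a typical payload, the metadata-only transformation keeps structure, counts, and sizes while dropping every string's content. A standalone replica of `_metadata_only_payload_value` for illustration:

```python
from typing import Any


def redact(value: Any) -> Any:
    """Metadata-only rendering of one payload value (mirrors _metadata_only_payload_value)."""
    if isinstance(value, dict):
        return {str(k): redact(v) for k, v in list(value.items())[:80]}
    if isinstance(value, (list, tuple)):
        items = list(value)
        return {"item_count": len(items), "items_preview": [redact(i) for i in items[:20]]}
    if isinstance(value, str):
        normalized = value.strip()
        return {"text_chars": len(normalized), "text_omitted": bool(normalized)}
    if isinstance(value, bytes):
        return {"binary_bytes": len(value)}
    if isinstance(value, (int, float, bool)) or value is None:
        return value
    return {"value_type": type(value).__name__}


print(redact({"prompt": "secret text", "retries": 3}))
# → {'prompt': {'text_chars': 11, 'text_omitted': True}, 'retries': 3}
```

Numbers, booleans, and `None` pass through untouched, so counters in log payloads remain queryable even in the default metadata-only mode.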
@@ -82,8 +125,8 @@ def log_processing_event(
         document_filename=_trim(resolved_document_filename, MAX_DOCUMENT_FILENAME_LENGTH),
         provider_id=_trim(provider_id, MAX_PROVIDER_LENGTH),
         model_name=_trim(model_name, MAX_MODEL_LENGTH),
-        prompt_text=_trim(prompt_text, MAX_PROMPT_LENGTH),
-        response_text=_trim(response_text, MAX_RESPONSE_LENGTH),
+        prompt_text=_trim(prompt_text, MAX_PROMPT_LENGTH) if settings.processing_log_store_model_io_text else None,
+        response_text=_trim(response_text, MAX_RESPONSE_LENGTH) if settings.processing_log_store_model_io_text else None,
         payload_json=_safe_payload(payload_json),
     )
     session.add(entry)
42	backend/app/services/rate_limiter.py	Normal file
@@ -0,0 +1,42 @@
+"""Redis-backed fixed-window rate limiter helpers for sensitive API operations."""
+
+import time
+
+from redis.exceptions import RedisError
+
+from app.worker.queue import get_redis
+
+
+def _rate_limit_key(*, scope: str, subject: str, window_id: int) -> str:
+    """Builds a stable Redis key for one scope, subject, and fixed time window."""
+
+    return f"dcm:rate-limit:{scope}:{subject}:{window_id}"
+
+
+def increment_rate_limit(
+    *,
+    scope: str,
+    subject: str,
+    limit: int,
+    window_seconds: int = 60,
+) -> tuple[int, int]:
+    """Increments one rate bucket and returns current count with configured limit."""
+
+    bounded_limit = max(0, int(limit))
+    if bounded_limit == 0:
+        return (0, 0)
+
+    bounded_window = max(1, int(window_seconds))
+    current_window = int(time.time() // bounded_window)
+    key = _rate_limit_key(scope=scope, subject=subject, window_id=current_window)
+
+    redis_client = get_redis()
+    try:
+        pipeline = redis_client.pipeline(transaction=True)
+        pipeline.incr(key, 1)
+        pipeline.expire(key, bounded_window + 5)
+        count_value, _ = pipeline.execute()
+    except RedisError as error:
+        raise RuntimeError("Rate limiter backend unavailable") from error
+
+    return (int(count_value), bounded_limit)
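This is a fixed-window counter: every request whose timestamp falls in the same `window_seconds` slice shares one Redis key, and the key expires shortly after the window ends. The window arithmetic can be checked in isolation:

```python
def window_key(scope: str, subject: str, now: float, window_seconds: int = 60) -> str:
    """Bucket a timestamp into its fixed window and build the limiter key for it."""
    window_id = int(now // max(1, window_seconds))
    return f"dcm:rate-limit:{scope}:{subject}:{window_id}"


# Two timestamps inside the same minute map to the same bucket...
assert window_key("export", "alice", 120.0) == window_key("export", "alice", 179.9)
# ...and the next minute starts a fresh bucket, so the counter resets.
assert window_key("export", "alice", 180.0) != window_key("export", "alice", 179.9)
```

Fixed windows allow brief bursts of up to 2x the limit across a window boundary; that is usually an acceptable trade for the simplicity of a single `INCR` + `EXPIRE` pipeline.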
@@ -3,16 +3,17 @@
 from redis import Redis
 from rq import Queue

-from app.core.config import get_settings
+from app.core.config import get_settings, validate_redis_url_security


 settings = get_settings()


 def get_redis() -> Redis:
-    """Creates a Redis connection from configured URL."""
+    """Creates a Redis connection after enforcing URL security policy checks."""

-    return Redis.from_url(settings.redis_url)
+    secure_redis_url = validate_redis_url_security(settings.redis_url)
+    return Redis.from_url(secure_redis_url)


 def get_processing_queue() -> Queue:
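`validate_redis_url_security` is not shown in this diff, but given the `REDIS_SECURITY_MODE` and `REDIS_TLS_MODE` knobs in `.env.example`, it presumably enforces scheme and credential requirements on the configured URL. A minimal sketch under those assumptions (the behavior here is guessed, not the real implementation):

```python
from urllib.parse import urlparse


def check_redis_url(url: str, *, require_tls: bool, require_password: bool) -> str:
    """Reject Redis URLs that lack TLS (rediss://) or an embedded password."""
    parsed = urlparse(url)
    if require_tls and parsed.scheme != "rediss":
        raise ValueError("Redis URL must use rediss:// in strict TLS mode")
    if require_password and not parsed.password:
        raise ValueError("Redis URL must embed a password")
    return url
```

Centralizing the check in one function lets both the API process and the worker refuse to start on an insecure URL instead of failing later at runtime.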
26	backend/app/worker/run_worker.py	Normal file
@@ -0,0 +1,26 @@
+"""Worker entrypoint that enforces Redis URL security checks before queue consumption."""
+
+from redis import Redis
+from rq import Worker
+
+from app.core.config import get_settings, validate_redis_url_security
+
+
+def _build_worker_connection() -> Redis:
+    """Builds validated Redis connection used by RQ worker runtime."""
+
+    settings = get_settings()
+    secure_redis_url = validate_redis_url_security(settings.redis_url)
+    return Redis.from_url(secure_redis_url)
+
+
+def run_worker() -> None:
+    """Runs the RQ worker loop for the configured DCM processing queue."""
+
+    connection = _build_worker_connection()
+    worker = Worker(["dcm"], connection=connection)
+    worker.work()
+
+
+if __name__ == "__main__":
+    run_worker()
@@ -7,6 +7,7 @@ from pathlib import Path
 from sqlalchemy import select
 from sqlalchemy.orm import Session

+from app.core.config import get_settings
 from app.db.base import SessionLocal
 from app.models.document import Document, DocumentStatus
 from app.services.app_settings import (
@@ -37,6 +38,13 @@ from app.services.storage import absolute_path, compute_sha256, store_bytes, wri
 from app.worker.queue import get_processing_queue


+settings = get_settings()
+
+ARCHIVE_ROOT_ID_METADATA_KEY = "archive_root_document_id"
+ARCHIVE_DEPTH_METADATA_KEY = "archive_depth"
+ARCHIVE_DESCENDANT_COUNT_METADATA_KEY = "archive_descendant_count"
+
+
 def _cleanup_processing_logs_with_settings(session: Session) -> None:
     """Applies configured processing log retention while trimming old log entries."""
@@ -48,13 +56,80 @@ def _cleanup_processing_logs_with_settings(session: Session) -> None:
     )


+def _metadata_non_negative_int(value: object, fallback: int = 0) -> int:
+    """Parses metadata values as non-negative integers with safe fallback behavior."""
+
+    try:
+        parsed = int(value)
+    except (TypeError, ValueError):
+        return fallback
+    return max(0, parsed)
+
+
+def _metadata_uuid(value: object) -> uuid.UUID | None:
+    """Parses metadata values as UUIDs while tolerating malformed legacy values."""
+
+    if not isinstance(value, str) or not value.strip():
+        return None
+    try:
+        return uuid.UUID(value.strip())
+    except ValueError:
+        return None
+
+
+def _resolve_archive_lineage(session: Session, document: Document) -> tuple[uuid.UUID, int]:
+    """Resolves archive root document id and depth for metadata propagation compatibility."""
+
+    metadata_json = dict(document.metadata_json)
+    metadata_root = _metadata_uuid(metadata_json.get(ARCHIVE_ROOT_ID_METADATA_KEY))
+    metadata_depth = _metadata_non_negative_int(metadata_json.get(ARCHIVE_DEPTH_METADATA_KEY), fallback=0)
+    if metadata_root is not None:
+        return metadata_root, metadata_depth
+
+    if not document.is_archive_member:
+        return document.id, 0
+
+    depth = 0
+    root_document_id = document.id
+    parent_document_id = document.parent_document_id
+    visited: set[uuid.UUID] = {document.id}
+    while parent_document_id is not None and parent_document_id not in visited:
+        visited.add(parent_document_id)
+        parent_document = session.execute(select(Document).where(Document.id == parent_document_id)).scalar_one_or_none()
+        if parent_document is None:
+            break
+        depth += 1
+        root_document_id = parent_document.id
+        parent_document_id = parent_document.parent_document_id
+
+    return root_document_id, depth
+
+
+def _merge_archive_metadata(document: Document, **updates: object) -> None:
+    """Applies archive metadata updates while preserving unrelated document metadata keys."""
+
+    metadata_json = dict(document.metadata_json)
+    metadata_json.update(updates)
+    document.metadata_json = metadata_json
+
+
+def _load_archive_root_for_update(session: Session, root_document_id: uuid.UUID) -> Document | None:
+    """Loads archive root row with write lock to serialize descendant-count budget updates."""
+
+    return session.execute(
+        select(Document).where(Document.id == root_document_id).with_for_update()
+    ).scalar_one_or_none()
+
+
 def _create_archive_member_document(
     parent: Document,
     member_name: str,
     member_data: bytes,
     mime_type: str,
+    archive_root_document_id: uuid.UUID,
+    archive_depth: int,
 ) -> Document:
-    """Creates a child document entity for a file extracted from an uploaded archive."""
+    """Creates child document entities with lineage metadata for recursive archive processing."""

     extension = Path(member_name).suffix.lower()
     stored_relative_path = store_bytes(member_name, member_data)
@@ -68,7 +143,13 @@ def _create_archive_member_document(
         size_bytes=len(member_data),
         logical_path=parent.logical_path,
         tags=list(parent.tags),
-        metadata_json={"origin": "archive", "parent": str(parent.id)},
+        owner_user_id=parent.owner_user_id,
+        metadata_json={
+            "origin": "archive",
+            "parent": str(parent.id),
+            ARCHIVE_ROOT_ID_METADATA_KEY: str(archive_root_document_id),
+            ARCHIVE_DEPTH_METADATA_KEY: archive_depth,
+        },
         is_archive_member=True,
         archived_member_path=member_name,
         parent_document_id=parent.id,
@@ -110,16 +191,46 @@ def process_document_task(document_id: str) -> None:

        if document.extension == ".zip":
            child_ids: list[str] = []
+            archive_root_document_id, archive_depth = _resolve_archive_lineage(session=session, document=document)
+            _merge_archive_metadata(
+                document,
+                **{
+                    ARCHIVE_ROOT_ID_METADATA_KEY: str(archive_root_document_id),
+                    ARCHIVE_DEPTH_METADATA_KEY: archive_depth,
+                },
+            )
+            root_document = _load_archive_root_for_update(session=session, root_document_id=archive_root_document_id)
+            if root_document is None:
+                root_document = document
+
+            root_metadata_json = dict(root_document.metadata_json)
+            existing_descendant_count = _metadata_non_negative_int(
+                root_metadata_json.get(ARCHIVE_DESCENDANT_COUNT_METADATA_KEY),
+                fallback=0,
+            )
+            max_descendants_per_root = max(0, int(settings.max_zip_descendants_per_root))
+            remaining_descendant_budget = max(0, max_descendants_per_root - existing_descendant_count)
+            extraction_member_cap = remaining_descendant_budget
+
            log_processing_event(
                session=session,
                stage="archive",
                event="Archive extraction started",
                level="info",
                document=document,
-                payload_json={"size_bytes": len(data)},
+                payload_json={
+                    "size_bytes": len(data),
+                    "archive_root_document_id": str(archive_root_document_id),
+                    "archive_depth": archive_depth,
+                    "remaining_descendant_budget": remaining_descendant_budget,
+                },
            )
            try:
-                members = extract_archive_members(data)
+                members = extract_archive_members(
+                    data,
+                    depth=archive_depth,
+                    max_members=extraction_member_cap,
+                )
                for member in members:
                    mime_type = sniff_mime(member.data)
                    child = _create_archive_member_document(
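The descendant budget is simple saturating arithmetic: whatever the root archive has already produced is subtracted from the per-root cap, and extraction is capped at the remainder:

```python
def remaining_descendant_budget(max_per_root: int, existing_count: int) -> int:
    """Saturating remainder of the per-root archive descendant cap."""
    return max(0, max(0, int(max_per_root)) - max(0, int(existing_count)))
```

Because the budget is tracked on the locked root row rather than per archive, a ZIP bomb made of nested archives cannot multiply its way past the cap: every level of nesting draws from the same shared remainder.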
@@ -127,6 +238,8 @@ def process_document_task(document_id: str) -> None:
                        member_name=member.name,
                        member_data=member.data,
                        mime_type=mime_type,
+                        archive_root_document_id=archive_root_document_id,
+                        archive_depth=archive_depth + 1,
                    )
                    session.add(child)
                    session.flush()
@@ -142,8 +255,27 @@ def process_document_task(document_id: str) -> None:
                            "member_name": member.name,
                            "member_size_bytes": len(member.data),
                            "mime_type": mime_type,
+                            "archive_root_document_id": str(archive_root_document_id),
+                            "archive_depth": archive_depth + 1,
                        },
                    )
+
+                updated_root_metadata = dict(root_document.metadata_json)
+                updated_root_metadata[ARCHIVE_ROOT_ID_METADATA_KEY] = str(archive_root_document_id)
+                updated_root_metadata[ARCHIVE_DEPTH_METADATA_KEY] = 0
+                updated_root_metadata[ARCHIVE_DESCENDANT_COUNT_METADATA_KEY] = existing_descendant_count + len(child_ids)
+                root_document.metadata_json = updated_root_metadata
+
+                limit_flags: dict[str, object] = {}
+                if archive_depth >= settings.max_zip_depth:
+                    limit_flags["max_depth_reached"] = True
+                if remaining_descendant_budget <= 0:
+                    limit_flags["max_descendants_reached"] = True
+                elif len(child_ids) >= remaining_descendant_budget:
+                    limit_flags["max_descendants_reached"] = True
+                if limit_flags:
+                    _merge_archive_metadata(document, **limit_flags)
+
                document.status = DocumentStatus.PROCESSED
                document.extracted_text = f"archive with {len(members)} files"
                log_processing_event(
@@ -152,7 +284,13 @@
                    event="Archive extraction completed",
                    level="info",
                    document=document,
-                    payload_json={"member_count": len(members)},
+                    payload_json={
+                        "member_count": len(members),
+                        "archive_root_document_id": str(archive_root_document_id),
+                        "archive_depth": archive_depth,
+                        "descendant_count": existing_descendant_count + len(child_ids),
+                        "remaining_descendant_budget": max(0, remaining_descendant_budget - len(child_ids)),
+                    },
                )
            except Exception as exc:
                document.status = DocumentStatus.ERROR
@@ -231,7 +369,10 @@
                event="Archive child job enqueued",
                level="info",
                document_id=uuid.UUID(child_id),
-                payload_json={"parent_document_id": str(document.id)},
+                payload_json={
+                    "parent_document_id": str(document.id),
+                    "archive_root_document_id": str(archive_root_document_id),
+                },
            )
        session.commit()
        return
@@ -16,3 +16,4 @@ orjson==3.11.3
 openai==1.107.2
 typesense==1.1.1
 tiktoken==0.11.0
+cryptography==46.0.1
@@ -144,6 +144,87 @@ class AppSettingsProviderResilienceTests(unittest.TestCase):
         app_settings.update_app_settings(providers=[provider_update])
         write_settings_mock.assert_not_called()

+    def test_sanitize_settings_migrates_legacy_plaintext_api_key_to_encrypted_field(self) -> None:
+        """Legacy plaintext API keys are still readable and emitted with encrypted storage representation."""
+
+        payload = {
+            "providers": [
+                {
+                    "id": "secure-provider",
+                    "label": "Secure Provider",
+                    "provider_type": "openai_compatible",
+                    "base_url": "https://api.openai.com/v1",
+                    "timeout_seconds": 45,
+                    "api_key": "legacy-plaintext-secret",
+                }
+            ],
+            "tasks": {
+                app_settings.TASK_OCR_HANDWRITING: {"provider_id": "secure-provider"},
+                app_settings.TASK_SUMMARY_GENERATION: {"provider_id": "secure-provider"},
+                app_settings.TASK_ROUTING_CLASSIFICATION: {"provider_id": "secure-provider"},
+            },
+        }
+
+        with patch.object(app_settings, "_derive_provider_api_key_key", return_value=b"k" * 32):
+            sanitized = app_settings._sanitize_settings(payload)
+
+        provider = sanitized["providers"][0]
+        self.assertEqual(provider["api_key"], "legacy-plaintext-secret")
+        self.assertTrue(
+            str(provider.get("api_key_encrypted", "")).startswith(
+                f"{app_settings.PROVIDER_API_KEY_CIPHERTEXT_PREFIX}:"
+            )
+        )
+
+    def test_serialize_settings_for_storage_excludes_plaintext_api_key(self) -> None:
+        """Storage payload serialization persists encrypted provider API keys only."""
+
+        payload = _sample_current_payload()
+        payload["providers"][0]["api_key"] = "storage-secret"
+        payload["providers"][0]["api_key_encrypted"] = ""
+
+        with patch.object(app_settings, "_derive_provider_api_key_key", return_value=b"s" * 32):
+            storage_payload = app_settings._serialize_settings_for_storage(payload)
+
+        provider_storage = storage_payload["providers"][0]
+        self.assertNotIn("api_key", provider_storage)
+        self.assertTrue(
+            str(provider_storage.get("api_key_encrypted", "")).startswith(
+                f"{app_settings.PROVIDER_API_KEY_CIPHERTEXT_PREFIX}:"
+            )
+        )
+
+    def test_read_handwriting_provider_settings_revalidates_dns(self) -> None:
+        """OCR runtime provider settings enforce DNS revalidation before creating outbound clients."""
+
+        runtime_payload = {
+            "provider": {
+                "id": "openai-default",
+                "provider_type": "openai_compatible",
+                "base_url": "https://api.openai.com/v1",
+                "timeout_seconds": 45,
+                "api_key": "runtime-secret",
+            },
+            "task": {
+                "enabled": True,
+                "model": "gpt-4.1-mini",
+                "prompt": "prompt",
+            },
+        }
+        with (
+            patch.object(app_settings, "read_task_runtime_settings", return_value=runtime_payload),
+            patch.object(
+                app_settings,
+                "normalize_and_validate_provider_base_url",
+                return_value="https://api.openai.com/v1",
+            ) as normalize_mock,
+        ):
+            runtime_settings = app_settings.read_handwriting_provider_settings()
+
+        normalize_mock.assert_called_once_with("https://api.openai.com/v1", resolve_dns=True)
+        self.assertEqual(runtime_settings["openai_base_url"], "https://api.openai.com/v1")
+        self.assertEqual(runtime_settings["openai_api_key"], "runtime-secret")
+

 if __name__ == "__main__":
     unittest.main()
@@ -3,12 +3,15 @@
 from __future__ import annotations

 from datetime import UTC, datetime
+import io
+import socket
 import sys
 import uuid
 from pathlib import Path
 from types import ModuleType, SimpleNamespace
 import unittest
 from unittest.mock import patch
+import zipfile


 BACKEND_ROOT = Path(__file__).resolve().parents[1]
@@ -83,13 +86,200 @@ if "fastapi.security" not in sys.modules:
|
||||
fastapi_security_stub.HTTPBearer = _HTTPBearer
|
||||
sys.modules["fastapi.security"] = fastapi_security_stub
|
||||
|
||||
from fastapi import HTTPException
|
||||
from fastapi.security import HTTPAuthorizationCredentials
|
||||
if "magic" not in sys.modules:
|
||||
magic_stub = ModuleType("magic")
|
||||
|
||||
from app.api.auth import AuthRole, get_request_role, require_admin
|
||||
def _from_buffer(_data: bytes, mime: bool = True) -> str:
|
||||
"""Returns deterministic fallback MIME values for extractor import stubs."""
|
||||
|
||||
return "application/octet-stream" if mime else ""
|
||||
|
||||
magic_stub.from_buffer = _from_buffer
|
||||
sys.modules["magic"] = magic_stub
|
||||
|
||||
if "docx" not in sys.modules:
|
||||
docx_stub = ModuleType("docx")
|
||||
|
||||
class _DocxDocument:
|
||||
"""Minimal docx document stub for extractor import compatibility."""
|
||||
|
||||
def __init__(self, *_args: object, **_kwargs: object) -> None:
|
||||
self.paragraphs: list[SimpleNamespace] = []
|
||||
|
||||
docx_stub.Document = _DocxDocument
|
||||
sys.modules["docx"] = docx_stub
|
||||
|
||||
if "openpyxl" not in sys.modules:
|
||||
openpyxl_stub = ModuleType("openpyxl")
|
||||
|
||||
class _Workbook:
|
||||
"""Minimal workbook stub for extractor import compatibility."""
|
||||
|
||||
worksheets: list[SimpleNamespace] = []
|
||||
|
||||
def _load_workbook(*_args: object, **_kwargs: object) -> _Workbook:
|
||||
"""Returns deterministic workbook placeholder for extractor import stubs."""
|
||||
|
||||
return _Workbook()
|
||||
|
||||
openpyxl_stub.load_workbook = _load_workbook
|
||||
sys.modules["openpyxl"] = openpyxl_stub
|
||||
|
||||
if "PIL" not in sys.modules:
|
||||
pil_stub = ModuleType("PIL")
|
||||
|
||||
class _Image:
|
||||
"""Minimal PIL.Image replacement for extractor and handwriting import stubs."""
|
||||
|
||||
class Resampling:
|
||||
"""Minimal enum-like namespace used by handwriting image resize path."""
|
||||
|
||||
LANCZOS = 1
|
||||
|
||||
@staticmethod
|
||||
def open(*_args: object, **_kwargs: object) -> "_Image":
|
||||
"""Raises for unsupported image operations in dependency-light tests."""
|
||||
|
||||
raise RuntimeError("Image.open is not available in stub")
|
||||
|
||||
class _ImageOps:
|
||||
"""Minimal PIL.ImageOps replacement for import compatibility."""
|
||||
|
||||
@staticmethod
|
||||
def exif_transpose(image: object) -> object:
|
||||
"""Returns original image object unchanged in dependency-light tests."""
|
||||
|
||||
return image
|
||||
|
||||
pil_stub.Image = _Image
|
||||
pil_stub.ImageOps = _ImageOps
|
||||
sys.modules["PIL"] = pil_stub
|
||||
|
||||
if "pypdf" not in sys.modules:
|
||||
pypdf_stub = ModuleType("pypdf")
|
||||
|
||||
class _PdfReader:
|
||||
"""Minimal PdfReader replacement for extractor import compatibility."""
|
||||
|
||||
def __init__(self, *_args: object, **_kwargs: object) -> None:
|
||||
self.pages: list[SimpleNamespace] = []
|
||||
|
||||
pypdf_stub.PdfReader = _PdfReader
|
||||
sys.modules["pypdf"] = pypdf_stub
|
||||
|
||||
if "pymupdf" not in sys.modules:
|
||||
pymupdf_stub = ModuleType("pymupdf")
|
||||
|
||||
class _Matrix:
|
||||
"""Minimal matrix placeholder for extractor import compatibility."""
|
||||
|
||||
def __init__(self, *_args: object, **_kwargs: object) -> None:
|
||||
pass
|
||||
|
||||
def _open(*_args: object, **_kwargs: object) -> object:
|
||||
"""Raises when preview rendering is invoked in dependency-light tests."""
|
||||
|
||||
raise RuntimeError("pymupdf is not available in stub")
|
||||
|
||||
pymupdf_stub.Matrix = _Matrix
|
||||
pymupdf_stub.open = _open
|
||||
sys.modules["pymupdf"] = pymupdf_stub
|
||||
|
||||
if "app.services.handwriting" not in sys.modules:
|
||||
handwriting_stub = ModuleType("app.services.handwriting")
|
||||
|
||||
class _HandwritingError(Exception):
|
||||
"""Minimal base error class for extractor import compatibility."""
|
||||
|
||||
class _HandwritingNotConfiguredError(_HandwritingError):
|
||||
"""Minimal not-configured error class for extractor import compatibility."""
|
||||
|
||||
class _HandwritingTimeoutError(_HandwritingError):
|
||||
"""Minimal timeout error class for extractor import compatibility."""
|
||||
|
||||
def _classify_image_text_bytes(*_args: object, **_kwargs: object) -> SimpleNamespace:
|
||||
"""Returns deterministic image text classification fallback."""
|
||||
|
||||
return SimpleNamespace(label="unknown", confidence=0.0, provider="stub", model="stub")
|
||||
|
||||
def _transcribe_handwriting_bytes(*_args: object, **_kwargs: object) -> SimpleNamespace:
|
||||
"""Returns deterministic handwriting transcription fallback."""
|
||||
|
||||
return SimpleNamespace(text="", uncertainties=[], provider="stub", model="stub")
|
||||
|
||||
handwriting_stub.IMAGE_TEXT_TYPE_NO_TEXT = "no_text"
|
||||
handwriting_stub.IMAGE_TEXT_TYPE_UNKNOWN = "unknown"
|
||||
handwriting_stub.IMAGE_TEXT_TYPE_HANDWRITING = "handwriting"
|
||||
handwriting_stub.HandwritingTranscriptionError = _HandwritingError
|
||||
handwriting_stub.HandwritingTranscriptionNotConfiguredError = _HandwritingNotConfiguredError
|
||||
handwriting_stub.HandwritingTranscriptionTimeoutError = _HandwritingTimeoutError
|
||||
    handwriting_stub.classify_image_text_bytes = _classify_image_text_bytes
    handwriting_stub.transcribe_handwriting_bytes = _transcribe_handwriting_bytes
    sys.modules["app.services.handwriting"] = handwriting_stub

if "app.services.handwriting_style" not in sys.modules:
    handwriting_style_stub = ModuleType("app.services.handwriting_style")

    def _assign_handwriting_style(*_args: object, **_kwargs: object) -> SimpleNamespace:
        """Returns deterministic style assignment payload for worker import compatibility."""

        return SimpleNamespace(
            style_cluster_id="cluster-1",
            matched_existing=False,
            similarity=0.0,
            vector_distance=0.0,
            compared_neighbors=0,
            match_min_similarity=0.0,
            bootstrap_match_min_similarity=0.0,
        )

    def _delete_handwriting_style_document(*_args: object, **_kwargs: object) -> None:
        """No-op style document delete stub for worker import compatibility."""

        return None

    handwriting_style_stub.assign_handwriting_style = _assign_handwriting_style
    handwriting_style_stub.delete_handwriting_style_document = _delete_handwriting_style_document
    sys.modules["app.services.handwriting_style"] = handwriting_style_stub

if "app.services.routing_pipeline" not in sys.modules:
    routing_pipeline_stub = ModuleType("app.services.routing_pipeline")

    def _apply_routing_decision(*_args: object, **_kwargs: object) -> None:
        """No-op routing application stub for worker import compatibility."""

        return None

    def _classify_document_routing(*_args: object, **_kwargs: object) -> dict[str, object]:
        """Returns deterministic routing decision payload for worker import compatibility."""

        return {"chosen_path": None, "chosen_tags": []}

    def _summarize_document(*_args: object, **_kwargs: object) -> str:
        """Returns deterministic summary text for worker import compatibility."""

        return "summary"

    def _upsert_semantic_index(*_args: object, **_kwargs: object) -> None:
        """No-op semantic index update stub for worker import compatibility."""

        return None

    routing_pipeline_stub.apply_routing_decision = _apply_routing_decision
    routing_pipeline_stub.classify_document_routing = _classify_document_routing
    routing_pipeline_stub.summarize_document = _summarize_document
    routing_pipeline_stub.upsert_semantic_index = _upsert_semantic_index
    sys.modules["app.services.routing_pipeline"] = routing_pipeline_stub

from fastapi import HTTPException

from app.api.auth import AuthContext, require_admin
from app.core import config as config_module
from app.models.auth import UserRole
from app.models.processing_log import sanitize_processing_log_payload_value, sanitize_processing_log_text
from app.schemas.processing_logs import ProcessingLogEntryResponse
from app.services import extractor as extractor_module
from app.worker import tasks as worker_tasks_module


def _security_settings(
@@ -108,30 +298,34 @@ def _security_settings(


class AuthDependencyTests(unittest.TestCase):
    """Verifies role-based admin authorization behavior."""

    def test_require_admin_rejects_user_role(self) -> None:
        """User role cannot access admin-only endpoints."""

        auth_context = AuthContext(
            user_id=uuid.uuid4(),
            username="user",
            role=UserRole.USER,
            session_id=uuid.uuid4(),
            expires_at=datetime.now(UTC),
        )
        with self.assertRaises(HTTPException) as raised:
            require_admin(context=auth_context)
        self.assertEqual(raised.exception.status_code, 403)

    def test_require_admin_accepts_admin_role(self) -> None:
        """Admin role is accepted for admin-only endpoints."""

        auth_context = AuthContext(
            user_id=uuid.uuid4(),
            username="admin",
            role=UserRole.ADMIN,
            session_id=uuid.uuid4(),
            expires_at=datetime.now(UTC),
        )
        resolved = require_admin(context=auth_context)
        self.assertEqual(resolved.role, UserRole.ADMIN)

class ProviderBaseUrlValidationTests(unittest.TestCase):
@@ -202,6 +396,241 @@ class ProviderBaseUrlValidationTests(unittest.TestCase):
        self.assertEqual(getaddrinfo_mock.call_count, 2)


class RedisQueueSecurityTests(unittest.TestCase):
    """Verifies Redis URL security policy behavior for compatibility and strict environments."""

    def test_auto_mode_allows_insecure_redis_url_in_development(self) -> None:
        """Development mode stays backward-compatible with local unauthenticated redis URLs."""

        normalized = config_module.validate_redis_url_security(
            "redis://redis:6379/0",
            app_env="development",
            security_mode="auto",
            tls_mode="auto",
        )
        self.assertEqual(normalized, "redis://redis:6379/0")

    def test_auto_mode_rejects_missing_auth_in_production(self) -> None:
        """Production auto mode fails closed when Redis URL omits authentication."""

        with self.assertRaises(ValueError):
            config_module.validate_redis_url_security(
                "rediss://redis:6379/0",
                app_env="production",
                security_mode="auto",
                tls_mode="auto",
            )

    def test_auto_mode_rejects_plaintext_redis_in_production(self) -> None:
        """Production auto mode requires TLS transport for Redis URLs."""

        with self.assertRaises(ValueError):
            config_module.validate_redis_url_security(
                "redis://:password@redis:6379/0",
                app_env="production",
                security_mode="auto",
                tls_mode="auto",
            )

    def test_strict_mode_enforces_auth_and_tls_outside_production(self) -> None:
        """Strict mode enforces production-grade Redis controls in all environments."""

        with self.assertRaises(ValueError):
            config_module.validate_redis_url_security(
                "redis://redis:6379/0",
                app_env="development",
                security_mode="strict",
                tls_mode="auto",
            )

        normalized = config_module.validate_redis_url_security(
            "rediss://:password@redis:6379/0",
            app_env="development",
            security_mode="strict",
            tls_mode="auto",
        )
        self.assertEqual(normalized, "rediss://:password@redis:6379/0")

    def test_compat_mode_allows_insecure_redis_in_production_for_safe_migration(self) -> None:
        """Compatibility mode keeps legacy production Redis URLs usable during migration windows."""

        normalized = config_module.validate_redis_url_security(
            "redis://redis:6379/0",
            app_env="production",
            security_mode="compat",
            tls_mode="allow_insecure",
        )
        self.assertEqual(normalized, "redis://redis:6379/0")
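The policy exercised by these tests can be sketched as follows. This is an illustrative reimplementation, not the shipped `backend/app/core/config.py` code: the parameter names follow the test call sites, but the branch structure is an assumption.

```python
from urllib.parse import urlparse


def validate_redis_url_security(
    url: str,
    *,
    app_env: str,
    security_mode: str,
    tls_mode: str,
) -> str:
    """Environment-aware Redis URL policy sketch (assumed structure)."""
    parsed = urlparse(url)
    # Compatibility mode with the explicit insecure-TLS opt-in keeps legacy
    # URLs usable during a migration window, even in production.
    if security_mode == "compat" and tls_mode == "allow_insecure":
        return url
    # Strict mode always enforces; auto mode enforces only in production.
    strict = security_mode == "strict" or (
        security_mode == "auto" and app_env == "production"
    )
    if strict:
        if parsed.password is None:
            raise ValueError("Redis URL must include authentication")
        if parsed.scheme != "rediss":
            raise ValueError("Redis URL must use TLS (rediss://)")
    return url
```

The sketch fails closed on both missing credentials and plaintext transport, matching the auto-mode production tests above.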


class PreviewMimeSafetyTests(unittest.TestCase):
    """Verifies inline preview MIME safety checks for uploaded document responses."""

    def test_preview_blocks_script_capable_html_and_svg_types(self) -> None:
        """HTML and SVG MIME types are rejected for inline preview responses."""

        self.assertFalse(config_module.is_inline_preview_mime_type_safe("text/html"))
        self.assertFalse(config_module.is_inline_preview_mime_type_safe("image/svg+xml"))

    def test_preview_allows_pdf_and_safe_image_types(self) -> None:
        """PDF and raster image MIME types stay eligible for inline preview responses."""

        self.assertTrue(config_module.is_inline_preview_mime_type_safe("application/pdf"))
        self.assertTrue(config_module.is_inline_preview_mime_type_safe("image/png"))
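A check with these properties is naturally an allowlist rather than a denylist, so unknown types fail closed. A minimal sketch (the allowlist contents beyond the tested types are assumptions; the real helper lives in `backend/app/core/config.py`):

```python
# Allowlist of MIME types considered safe to render inline; anything else
# (notably script-capable text/html and image/svg+xml) is excluded and
# therefore falls through to attachment-mode delivery.
SAFE_INLINE_PREVIEW_MIME_TYPES = frozenset({
    "application/pdf",
    "image/png",
    "image/jpeg",
    "image/gif",
    "image/webp",
})


def is_inline_preview_mime_type_safe(mime_type: str) -> bool:
    """Returns True only for MIME types on the inline-preview allowlist."""
    return mime_type.strip().lower() in SAFE_INLINE_PREVIEW_MIME_TYPES
```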


def _build_zip_bytes(entries: dict[str, bytes]) -> bytes:
    """Builds in-memory ZIP bytes for archive extraction guardrail tests."""

    output = io.BytesIO()
    with zipfile.ZipFile(output, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for filename, payload in entries.items():
            archive.writestr(filename, payload)
    return output.getvalue()


class ArchiveExtractionGuardrailTests(unittest.TestCase):
    """Verifies depth-aware archive extraction and per-call member cap enforcement."""

    def test_extract_archive_members_rejects_depth_at_configured_limit(self) -> None:
        """Archive member extraction is disabled at or beyond configured depth ceiling."""

        archive_bytes = _build_zip_bytes({"sample.txt": b"sample"})
        patched_settings = SimpleNamespace(
            max_zip_depth=2,
            max_zip_members=250,
            max_zip_member_uncompressed_bytes=25 * 1024 * 1024,
            max_zip_total_uncompressed_bytes=150 * 1024 * 1024,
            max_zip_compression_ratio=120.0,
        )
        with patch.object(extractor_module, "settings", patched_settings):
            members = extractor_module.extract_archive_members(archive_bytes, depth=2)
        self.assertEqual(members, [])

    def test_extract_archive_members_respects_member_cap_argument(self) -> None:
        """Archive extraction truncates results when caller-provided member cap is lower than archive size."""

        archive_bytes = _build_zip_bytes(
            {
                "one.txt": b"1",
                "two.txt": b"2",
                "three.txt": b"3",
            }
        )
        patched_settings = SimpleNamespace(
            max_zip_depth=3,
            max_zip_members=250,
            max_zip_member_uncompressed_bytes=25 * 1024 * 1024,
            max_zip_total_uncompressed_bytes=150 * 1024 * 1024,
            max_zip_compression_ratio=120.0,
        )
        with patch.object(extractor_module, "settings", patched_settings):
            members = extractor_module.extract_archive_members(archive_bytes, depth=0, max_members=1)
        self.assertEqual(len(members), 1)
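The guardrails exercised above combine a depth ceiling, a member cap, a per-member size cap, and a compression-ratio check. A minimal sketch, assuming keyword defaults in place of the settings object the real `extract_archive_members` reads its limits from:

```python
import io
import zipfile


def extract_archive_members(
    data: bytes,
    *,
    depth: int,
    max_depth: int = 2,
    max_members: int = 250,
    max_member_bytes: int = 25 * 1024 * 1024,
    max_ratio: float = 120.0,
) -> list[tuple[str, bytes]]:
    """Illustrative guardrail sketch; parameter names beyond depth/max_members are assumptions."""
    if depth >= max_depth:
        return []  # depth ceiling: refuse to recurse into deeper archives
    members: list[tuple[str, bytes]] = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if len(members) >= max_members:
                break  # caller-provided member cap
            if info.file_size > max_member_bytes:
                continue  # oversized member: skip rather than inflate
            compressed = max(info.compress_size, 1)
            if info.file_size / compressed > max_ratio:
                continue  # zip-bomb style compression ratio: skip
            members.append((info.filename, archive.read(info.filename)))
    return members
```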


class ArchiveLineagePropagationTests(unittest.TestCase):
    """Verifies archive lineage metadata propagation helpers used by worker descendant queueing."""

    def test_create_archive_member_document_persists_lineage_metadata(self) -> None:
        """Child archive documents include root id and incremented depth metadata."""

        parent_id = uuid.uuid4()
        parent = SimpleNamespace(
            id=parent_id,
            source_relative_path="uploads/root.zip",
            logical_path="Inbox",
            tags=["finance"],
            owner_user_id=uuid.uuid4(),
        )

        with (
            patch.object(worker_tasks_module, "store_bytes", return_value="stored/path/child.zip"),
            patch.object(worker_tasks_module, "compute_sha256", return_value="deadbeef"),
        ):
            child = worker_tasks_module._create_archive_member_document(
                parent=parent,
                member_name="nested/child.zip",
                member_data=b"zip-bytes",
                mime_type="application/zip",
                archive_root_document_id=parent_id,
                archive_depth=1,
            )

        self.assertEqual(child.parent_document_id, parent_id)
        self.assertEqual(child.metadata_json.get(worker_tasks_module.ARCHIVE_ROOT_ID_METADATA_KEY), str(parent_id))
        self.assertEqual(child.metadata_json.get(worker_tasks_module.ARCHIVE_DEPTH_METADATA_KEY), 1)
        self.assertTrue(child.is_archive_member)
        self.assertEqual(child.owner_user_id, parent.owner_user_id)

    def test_resolve_archive_lineage_prefers_existing_metadata(self) -> None:
        """Existing archive lineage metadata is reused without traversing parent relationships."""

        root_id = uuid.uuid4()
        document = SimpleNamespace(
            id=uuid.uuid4(),
            metadata_json={
                worker_tasks_module.ARCHIVE_ROOT_ID_METADATA_KEY: str(root_id),
                worker_tasks_module.ARCHIVE_DEPTH_METADATA_KEY: 3,
            },
            is_archive_member=True,
            parent_document_id=uuid.uuid4(),
        )

        class _SessionShouldNotBeUsed:
            """Fails test if lineage helper performs unnecessary parent traversals."""

            def execute(self, _statement: object) -> object:
                raise AssertionError("session.execute should not be called when metadata is present")

        resolved_root, resolved_depth = worker_tasks_module._resolve_archive_lineage(
            session=_SessionShouldNotBeUsed(),
            document=document,
        )
        self.assertEqual(resolved_root, root_id)
        self.assertEqual(resolved_depth, 3)

    def test_resolve_archive_lineage_walks_parent_chain_when_metadata_missing(self) -> None:
        """Lineage fallback traverses parent references to recover root id and depth."""

        root_id = uuid.uuid4()
        parent_id = uuid.uuid4()
        root_document = SimpleNamespace(id=root_id, parent_document_id=None)
        parent_document = SimpleNamespace(id=parent_id, parent_document_id=root_id)
        document = SimpleNamespace(
            id=uuid.uuid4(),
            metadata_json={},
            is_archive_member=True,
            parent_document_id=parent_id,
        )

        class _ScalarResult:
            """Wraps scalar ORM results for deterministic worker helper tests."""

            def __init__(self, value: object) -> None:
                self._value = value

            def scalar_one_or_none(self) -> object:
                return self._value

        class _SequenceSession:
            """Returns predetermined parent rows in traversal order."""

            def __init__(self, values: list[object]) -> None:
                self._values = values

            def execute(self, _statement: object) -> _ScalarResult:
                next_value = self._values.pop(0) if self._values else None
                return _ScalarResult(next_value)

        resolved_root, resolved_depth = worker_tasks_module._resolve_archive_lineage(
            session=_SequenceSession([parent_document, root_document]),
            document=document,
        )
        self.assertEqual(resolved_root, root_id)
        self.assertEqual(resolved_depth, 2)
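The metadata-first resolution with a parent-chain fallback that these tests describe can be sketched as follows. The metadata key names and the injected `fetch_parent` callable (standing in for the session-based lookup) are assumptions for illustration:

```python
from types import SimpleNamespace

ARCHIVE_ROOT_ID_METADATA_KEY = "archive_root_document_id"  # assumed key name
ARCHIVE_DEPTH_METADATA_KEY = "archive_depth"               # assumed key name


def resolve_archive_lineage(document, fetch_parent):
    """Sketch: reuse persisted lineage metadata, else walk the parent chain."""
    metadata = document.metadata_json or {}
    root_id = metadata.get(ARCHIVE_ROOT_ID_METADATA_KEY)
    depth = metadata.get(ARCHIVE_DEPTH_METADATA_KEY)
    if root_id is not None and depth is not None:
        return root_id, depth  # fast path: no parent traversal needed
    # Fallback: follow parent references until a document with no parent
    # (the archive root) is reached, counting hops as the depth.
    depth = 0
    root = document
    current_parent_id = document.parent_document_id
    while current_parent_id is not None:
        parent = fetch_parent(current_parent_id)
        if parent is None:
            break
        depth += 1
        root = parent
        current_parent_id = parent.parent_document_id
    return root.id, depth
```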


class ProcessingLogRedactionTests(unittest.TestCase):
    """Verifies sensitive processing-log values are redacted for persistence and responses."""


@@ -115,6 +115,7 @@ def _install_main_import_stubs() -> dict[str, ModuleType | None]:
    """Returns minimal settings consumed by app.main during test import."""

    return SimpleNamespace(
        app_env="development",
        cors_origins=["http://localhost:5173"],
        max_upload_request_size_bytes=1024,
    )

@@ -6,7 +6,8 @@ This directory contains technical documentation for DMS.

- `../README.md` - project overview, setup, and quick operations
- `architecture-overview.md` - backend, frontend, and infrastructure architecture
- `api-contract.md` - API endpoint contract grouped by route module, including session auth, role and ownership scope, upload limits, and settings or processing-log security constraints
- `data-model-reference.md` - database entity definitions and lifecycle states
- `operations-and-configuration.md` - runtime operations, hardened compose defaults, DEV and LIVE security values, and persisted settings configuration behavior
- `frontend-design-foundation.md` - frontend visual system, tokens, UI implementation rules, authenticated media delivery under session auth, processing-log timeline behavior, and settings helper-copy guidance
- `../.env.example` - repository-level environment template with local defaults and production override guidance

@@ -4,6 +4,7 @@ Base URL prefix: `/api/v1`

Primary implementation modules:
- `backend/app/api/router.py`
- `backend/app/api/routes_auth.py`
- `backend/app/api/routes_health.py`
- `backend/app/api/routes_documents.py`
- `backend/app/api/routes_search.py`
@@ -12,14 +13,32 @@ Primary implementation modules:

## Authentication And Authorization

- Authentication is session-based bearer auth.
- Clients authenticate with `POST /auth/login` using username and password.
- The backend issues per-user bearer session tokens and stores hashed session state server-side.
- Clients send issued tokens as `Authorization: Bearer <token>`.
- `GET /auth/me` returns the current identity and role.
- `POST /auth/logout` revokes the current session token.

Role matrix:
- `documents/*`: `admin` or `user`
- `search/*`: `admin` or `user`
- `settings/*`: `admin` only
- `processing/logs/*`: `admin` only

Ownership rules:
- `user` role is restricted to its own documents.
- `admin` role can access all documents.

## Auth

- `POST /auth/login`
  - Body model: `AuthLoginRequest`
  - Response model: `AuthLoginResponse`
- `GET /auth/me`
  - Response model: `AuthSessionResponse`
- `POST /auth/logout`
  - Response model: `AuthLogoutResponse`

## Health

@@ -29,8 +48,6 @@ Primary implementation modules:

## Documents

### Collection and metadata helpers

- `GET /documents`
@@ -48,6 +65,11 @@ Primary implementation modules:
- `POST /documents/content-md/export`
  - Body model: `ContentExportRequest`
  - Response: ZIP stream containing one markdown file per matched document
  - Limits:
    - hard cap on matched document count (`CONTENT_EXPORT_MAX_DOCUMENTS`)
    - hard cap on cumulative markdown bytes (`CONTENT_EXPORT_MAX_TOTAL_BYTES`)
    - per-user rate limit (`CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE`)
  - Behavior: the archive is streamed from a spool file instead of an unbounded in-memory buffer
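The capped, spool-backed export behavior can be sketched as follows; `build_export_spool` and its parameters are illustrative names, not the actual backend helpers:

```python
import tempfile
import zipfile


def build_export_spool(documents, max_docs=250, max_total_bytes=52_428_800):
    """Builds a capped ZIP of markdown exports in a spooled temp file.

    Small exports stay in memory; larger ones overflow to disk instead of
    growing an unbounded in-memory buffer. Names here are illustrative.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=8 * 1024 * 1024)
    total = 0
    with zipfile.ZipFile(spool, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for index, (name, markdown) in enumerate(documents):
            if index >= max_docs:
                raise ValueError("document count cap exceeded")
            payload = markdown.encode("utf-8")
            total += len(payload)
            if total > max_total_bytes:
                raise ValueError("total byte cap exceeded")
            archive.writestr(f"{name}.md", payload)
    spool.seek(0)
    return spool  # stream this file-like object as the response body
```

A streaming response can then read the spool in chunks, keeping peak memory bounded regardless of export size.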

### Per-document operations

@@ -56,7 +78,8 @@ Primary implementation modules:
- `GET /documents/{document_id}/download`
  - Response: original file bytes
- `GET /documents/{document_id}/preview`
  - Response: inline preview stream only for safe MIME types
  - Behavior: script-capable MIME types are forced to attachment responses with `X-Content-Type-Options: nosniff`
- `GET /documents/{document_id}/thumbnail`
  - Response: generated thumbnail image when available
- `GET /documents/{document_id}/content-md`
@@ -86,7 +109,7 @@ Primary implementation modules:
  - `conflict_mode` (`ask`, `replace`, `duplicate`)
  - Response model: `UploadResponse`
  - Behavior:
    - `ask`: returns `conflicts` if a duplicate checksum is detected for caller-visible documents
    - `replace`: creates a new document linked to the replaced document id
    - `duplicate`: creates an additional document record
    - upload `POST` requests are rejected with `411` when `Content-Length` is missing

@@ -95,16 +118,14 @@ Primary implementation modules:

## Search

- `GET /search`
  - Query: `query` (min length 2), `offset`, `limit`, `include_trashed`, `only_trashed`, `path_filter`, `tag_filter`, `type_filter`, `processed_from`, `processed_to`
  - Response model: `SearchResponse`
  - Behavior: PostgreSQL full-text and metadata ranking with role-based ownership scope

## Processing Logs

- Access: admin only

- `GET /processing/logs`
  - Query: `offset`, `limit`, `document_id`
@@ -119,17 +140,23 @@ Primary implementation modules:
- `POST /processing/logs/clear`
  - Response: clear counters

Persistence mode:
- default is metadata-only logging (`PROCESSING_LOG_STORE_MODEL_IO_TEXT=false`, `PROCESSING_LOG_STORE_PAYLOAD_TEXT=false`)
- full prompt/response or payload content storage requires explicit operator opt-in

## Settings

- Access: admin only

- `GET /settings`
  - Response model: `AppSettingsResponse`
  - persisted providers with invalid base URLs are ignored during read sanitization; the response falls back to the remaining valid providers or secure defaults
  - provider API keys are exposed only as `api_key_set` and `api_key_masked`
- `PATCH /settings`
  - Body model: `AppSettingsUpdateRequest`
  - Response model: `AppSettingsResponse`
  - rejects invalid provider base URLs with `400` when scheme, allowlist, or network safety checks fail
  - provider API keys are persisted encrypted at rest (`api_key_encrypted`); plaintext keys are not written to storage
- `POST /settings/reset`
  - Response model: `AppSettingsResponse`
- `PATCH /settings/handwriting`
@@ -140,6 +167,13 @@ Primary implementation modules:

## Schema Families

Auth schemas in `backend/app/schemas/auth.py`:
- `AuthLoginRequest`
- `AuthUserResponse`
- `AuthSessionResponse`
- `AuthLoginResponse`
- `AuthLogoutResponse`

Document schemas in `backend/app/schemas/documents.py`:
- `DocumentResponse`
- `DocumentDetailResponse`
@@ -155,4 +189,4 @@ Processing log schemas in `backend/app/schemas/processing_logs.py`:
- `ProcessingLogListResponse`

Settings schemas in `backend/app/schemas/settings.py`:
- provider, task, upload-default, display, processing-log retention, predefined paths or tags, handwriting-style, and legacy handwriting models grouped under `AppSettingsResponse` and `AppSettingsUpdateRequest`.

@@ -6,9 +6,9 @@ DMS runs as a multi-service application defined in `docker-compose.yml`:
- `frontend` serves the React UI on port `5173`
- `api` serves FastAPI on port `8000`
- `worker` executes asynchronous extraction and indexing jobs
- `db` provides PostgreSQL persistence on the internal compose network
- `redis` backs queueing on the internal compose network
- `typesense` stores the search index and vector-adjacent metadata on the internal compose network

## Backend Architecture

@@ -16,16 +16,16 @@ Backend source root: `backend/app/`

Main boundaries:
- `api/` route handlers and HTTP contract
- `services/` domain logic (authentication, storage, extraction, routing, settings, processing logs, Typesense)
- `db/` SQLAlchemy base, engine, and session lifecycle
- `models/` persistence entities (`AppUser`, `AuthSession`, `Document`, `ProcessingLogEntry`)
- `schemas/` Pydantic response and request schemas
- `worker/` RQ queue integration and background processing tasks

Application bootstrap in `backend/app/main.py`:
- mounts routers under `/api/v1`
- configures CORS from settings
- initializes storage, database schema, bootstrap users, settings, and the Typesense collection on startup

## Processing Lifecycle

@@ -48,11 +48,12 @@ Core structure:
- `design-foundation.css` and `styles.css` define design tokens and global/component styling

Main user flows:
- Login and role-gated navigation (`admin` and `user`)
- Upload and conflict resolution
- Search and filtered document browsing
- Metadata editing and lifecycle actions (trash, restore, delete, reprocess)
- Settings management for providers, tasks, and UI defaults (admin only)
- Processing log review (admin only)

## Persistence and State

@@ -64,3 +65,13 @@ Persistent data:

Transient runtime state:
- Redis queues processing tasks and worker execution state
- frontend local component state drives active filters, selection, and modal flows

Security-sensitive runtime behavior:
- API access is session-based, with per-user server-issued bearer tokens and role checks.
- Document and search reads for the `user` role are owner-scoped via `owner_user_id`; `admin` can access global scope.
- Redis connection URLs are validated by backend queue helpers with environment-aware auth and TLS policy enforcement.
- Worker startup runs through `python -m app.worker.run_worker`, which validates Redis URL policy before queue consumption.
- Inline preview is limited to safe MIME types, and script-capable content is served as attachment-only.
- Archive fan-out processing propagates root and depth lineage metadata and enforces depth and per-root descendant caps.
- Markdown export applies per-user rate limits, hard document-count and total-byte caps, and spool-file streaming.
- Processing logs default to metadata-only persistence, with explicit operator toggles required to store model IO text.

@@ -2,6 +2,38 @@

Primary SQLAlchemy models are defined in `backend/app/models/`.

## app_users

Model: `AppUser` in `backend/app/models/auth.py`

Purpose:
- Stores authenticatable user identities for session-based API access.

Core fields:
- Identity and credentials: `id`, `username`, `password_hash`
- Authorization and lifecycle: `role`, `is_active`
- Audit timestamps: `created_at`, `updated_at`

Enum `UserRole`:
- `admin`
- `user`

## auth_sessions

Model: `AuthSession` in `backend/app/models/auth.py`

Purpose:
- Stores issued bearer sessions linked to user identities.

Core fields:
- Identity and linkage: `id`, `user_id`, `token_hash`
- Session lifecycle: `expires_at`, `revoked_at`
- Request context: `user_agent`, `ip_address`
- Audit timestamps: `created_at`, `updated_at`

Foreign keys:
- `user_id` references `app_users.id` with `ON DELETE CASCADE`.
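Since only `token_hash` is persisted, token issuance and verification can be sketched as follows. This is a minimal illustration; the actual helper names and digest choice in the auth service are assumptions:

```python
import hashlib
import secrets


def issue_session_token() -> tuple[str, str]:
    """Returns (plaintext_token, token_hash) for a new session.

    The plaintext token is returned to the client exactly once at login;
    only its digest is stored on the auth_sessions row.
    """
    token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, token_hash


def verify_session_token(candidate: str, stored_hash: str) -> bool:
    """Constant-time comparison of a presented token against the stored hash."""
    digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
    return secrets.compare_digest(digest, stored_hash)
```

Storing only the hash means a database leak does not expose usable bearer tokens, while lookups remain a simple indexed equality on `token_hash`.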

## documents

Model: `Document` in `backend/app/models/document.py`

@@ -12,7 +44,7 @@ Purpose:

Core fields:
- Identity and source: `id`, `original_filename`, `source_relative_path`, `stored_relative_path`
- File attributes: `mime_type`, `extension`, `sha256`, `size_bytes`
- Ownership and organization: `owner_user_id`, `logical_path`, `suggested_path`, `tags`, `suggested_tags`
- Processing outputs: `extracted_text`, `image_text_type`, `handwriting_style_id`, `preview_available`
- Lifecycle and relations: `status`, `is_archive_member`, `archived_member_path`, `parent_document_id`, `replaces_document_id`
- Metadata and timestamps: `metadata_json`, `created_at`, `processed_at`, `updated_at`

@@ -24,8 +56,12 @@ Enum `DocumentStatus`:
- `error`
- `trashed`

Foreign keys:
- `owner_user_id` references `app_users.id` with `ON DELETE SET NULL`.

Relationships:
- Self-referential `parent_document` relationship for archive extraction trees.
- `owner_user` relationship to `AppUser`.

## processing_logs

@@ -47,7 +83,10 @@ Foreign keys:

## Model Lifecycle Notes

- API startup initializes the schema and creates or refreshes bootstrap users from auth environment variables.
- `POST /auth/login` validates `AppUser` credentials, creates an `AuthSession` with a hashed token, and returns the bearer token once.
- Upload inserts a `Document` row in `queued` state, assigns `owner_user_id`, and enqueues background processing.
- Worker updates extraction results and final status (`processed`, `unsupported`, or `error`), preserving ownership on archive descendants.
- User-role queries are owner-scoped; admin-role queries can access all documents.
- Trash and restore operations toggle `status` while preserving source files until permanent delete.
- Permanent delete removes the document tree (including archive descendants) and associated stored files.

@@ -52,9 +52,12 @@ Do not hardcode new palette or spacing values in component styles when a token a

## Authenticated Media Delivery

- Document previews and thumbnails must load through authenticated fetch flows in `frontend/src/lib/api.ts`, then render via temporary object URLs.
- Runtime auth uses server-issued per-user session tokens persisted with `setRuntimeApiToken` and read by `getRuntimeApiToken`.
- Static build-time token distribution is not supported.
- Direct `window.open` calls for protected media endpoints are not allowed, because browser navigation requests do not include the API token header.
- Download actions for original files and markdown exports must use authenticated blob fetches plus controlled browser download triggers.
- Revoke all temporary object URLs after replacement, unmount, or completion to prevent browser memory leaks.
- `DocumentViewer` iframe previews must be restricted to safe MIME types and rendered with `sandbox`, restrictive `allow`, and `referrerPolicy="no-referrer"` attributes. Active or script-capable formats must not be embedded inline.

## Extension Checklist
|
||||
|
||||
|
||||
@@ -2,15 +2,13 @@
|
||||
|
||||
## Runtime Services
|
||||
|
||||
`docker-compose.yml` defines the runtime stack:
|
||||
- `db` (Postgres 16, localhost-bound port `5432`)
|
||||
- `redis` (Redis 7, localhost-bound port `6379`)
|
||||
- `typesense` (Typesense 29, localhost-bound port `8108`)
|
||||
- `api` (FastAPI backend, localhost-bound port `8000`)
|
||||
- `worker` (RQ background worker)
|
||||
- `frontend` (Vite UI, localhost-bound port `5173`)
|
||||
|
||||
## Named Volumes
|
||||
`docker-compose.yml` defines:
|
||||
- `db` (Postgres 16)
|
||||
- `redis` (Redis 7)
|
||||
- `typesense` (Typesense 29)
|
||||
- `api` (FastAPI backend)
|
||||
- `worker` (RQ worker via `python -m app.worker.run_worker`)
|
||||
- `frontend` (Vite React UI)
|
||||
|
||||
Persistent volumes:
|
||||
- `db-data`
|
||||
Reset all persisted runtime data:

```bash
docker compose down -v
```

## Core Commands

Start or rebuild:

```bash
docker compose up --build -d
```

Stop:

```bash
docker compose down
```

Tail logs:

```bash
docker compose logs -f
```

## Authentication Model

- Legacy shared build-time frontend token behavior was removed.
- The API now uses server-issued per-user bearer sessions.
- Bootstrap users are provisioned from environment:
  - `AUTH_BOOTSTRAP_ADMIN_USERNAME`
  - `AUTH_BOOTSTRAP_ADMIN_PASSWORD`
  - optional `AUTH_BOOTSTRAP_USER_USERNAME`
  - optional `AUTH_BOOTSTRAP_USER_PASSWORD`
- The frontend signs in through `/api/v1/auth/login` and stores the issued session token in browser session storage.
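The bootstrap provisioning rule above (admin required, second user optional) can be sketched as a pure function over the environment. The function name and the returned payload shape are illustrative assumptions, not the actual backend API; only the `AUTH_BOOTSTRAP_*` variable names come from the documentation.

```python
# Hypothetical sketch of bootstrap-user provisioning from environment
# variables; the real backend wiring may differ.
from typing import Mapping


def resolve_bootstrap_users(env: Mapping[str, str]) -> list[dict[str, str]]:
    """Return users to provision: admin is mandatory, the user account is optional."""
    admin_username = env.get("AUTH_BOOTSTRAP_ADMIN_USERNAME", "").strip()
    admin_password = env.get("AUTH_BOOTSTRAP_ADMIN_PASSWORD", "").strip()
    if not admin_username or not admin_password:
        # Mirrors the compose fail-fast behavior for required admin credentials.
        raise ValueError("bootstrap admin username and password must be set")

    users = [{"username": admin_username, "password": admin_password, "role": "admin"}]

    user_username = env.get("AUTH_BOOTSTRAP_USER_USERNAME", "").strip()
    user_password = env.get("AUTH_BOOTSTRAP_USER_PASSWORD", "").strip()
    if user_username and user_password:
        # The non-admin account is created only when both optional values are present.
        users.append({"username": user_username, "password": user_password, "role": "user"})
    return users
```

The optional user account is skipped silently when either of its two variables is empty, matching the `${VAR:-}` defaults in compose.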
## DEV And LIVE Configuration Matrix

Use `.env.example` as the baseline. The table below documents user-managed settings and recommended values.

Settings sources:

- Runtime settings class: `backend/app/core/config.py`
- API settings persistence: `backend/app/services/app_settings.py`
| Variable | Local DEV (HTTP, docker-only) | LIVE (HTTPS behind reverse proxy) |
| --- | --- | --- |
| `APP_ENV` | `development` | `production` |
| `HOST_BIND_IP` | `127.0.0.1` or local LAN bind if needed | `127.0.0.1` (publish behind proxy only) |
| `PUBLIC_BASE_URL` | `http://localhost:8000` | `https://api.example.com` |
| `VITE_API_BASE` | empty for host-derived `http://<frontend-host>:8000/api/v1`, or explicit local URL | `https://api.example.com/api/v1` |
| `CORS_ORIGINS` | `["http://localhost:5173","http://localhost:3000"]` | exact frontend origins only, for example `["https://app.example.com"]` |
| `REDIS_URL` | `redis://:<password>@redis:6379/0` in isolated local network | `rediss://:<password>@redis.internal:6379/0` |
| `REDIS_SECURITY_MODE` | `compat` or `auto` | `strict` |
| `REDIS_TLS_MODE` | `allow_insecure` or `auto` | `required` |
| `PROVIDER_BASE_URL_ALLOW_HTTP` | `true` only when intentionally testing local HTTP provider endpoints | `false` |
| `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK` | `true` only for trusted local development targets | `false` |
| `PROVIDER_BASE_URL_ALLOWLIST` | allow needed test hosts | explicit production allowlist, for example `["api.openai.com"]` |
| `PROCESSING_LOG_STORE_MODEL_IO_TEXT` | `false` by default; temporary `true` only for controlled debugging | `false` |
| `PROCESSING_LOG_STORE_PAYLOAD_TEXT` | `false` by default; temporary `true` only for controlled debugging | `false` |
| `CONTENT_EXPORT_MAX_DOCUMENTS` | default `250` or lower based on host memory | tuned to production capacity |
| `CONTENT_EXPORT_MAX_TOTAL_BYTES` | default `52428800` (50 MiB) or lower | tuned to production capacity |
| `CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE` | default `6` | tuned to API throughput and abuse model |
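The LIVE column above amounts to a handful of checkable invariants. A minimal pre-deploy check could look like the sketch below; the function is a hypothetical helper derived from the matrix, not part of the application codebase, and it checks only a few representative rows.

```python
# Illustrative LIVE-profile sanity check derived from the configuration matrix.
def live_config_violations(env: dict[str, str]) -> list[str]:
    """Return human-readable violations of the LIVE baseline for a proposed environment."""
    violations: list[str] = []
    if env.get("APP_ENV") != "production":
        violations.append("APP_ENV must be 'production'")
    if not env.get("PUBLIC_BASE_URL", "").startswith("https://"):
        violations.append("PUBLIC_BASE_URL must be an HTTPS URL")
    if not env.get("REDIS_URL", "").startswith("rediss://"):
        violations.append("REDIS_URL must use the rediss:// TLS scheme")
    if env.get("REDIS_SECURITY_MODE") != "strict":
        violations.append("REDIS_SECURITY_MODE must be 'strict'")
    if env.get("PROVIDER_BASE_URL_ALLOW_HTTP", "false") != "false":
        violations.append("PROVIDER_BASE_URL_ALLOW_HTTP must be false")
    return violations
```

Running such a check before `docker compose up` in a live deployment catches the most common DEV-defaults-in-production mistakes early.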
Key environment variables used by `api` and `worker` in compose:

- `APP_ENV`
- `DATABASE_URL`
- `REDIS_URL`
- `STORAGE_ROOT`
- `PUBLIC_BASE_URL`
- `CORS_ORIGINS` (API service)
- `PROVIDER_BASE_URL_ALLOWLIST`
- `PROVIDER_BASE_URL_ALLOW_HTTP`
- `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK`
- `TYPESENSE_PROTOCOL`
- `TYPESENSE_HOST`
- `TYPESENSE_PORT`
- `TYPESENSE_API_KEY`
- `TYPESENSE_COLLECTION_NAME`

`PUBLIC_BASE_URL` must point to the backend API public URL, not the frontend URL.
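Note that `CORS_ORIGINS` and `PROVIDER_BASE_URL_ALLOWLIST` carry JSON arrays inside a single environment string (hence the quoting in compose). A defensive parse for such values can be sketched as follows; the helper name is illustrative and this is not the application's actual settings loader.

```python
# Sketch of parsing a JSON-array environment value such as CORS_ORIGINS.
import json


def parse_json_list_env(raw: str) -> list[str]:
    """Parse a JSON list of strings from an env value, rejecting any other shape."""
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(item, str) for item in value):
        raise ValueError("expected a JSON array of strings")
    return value
```

Rejecting non-list shapes up front turns a misquoted env value into an immediate startup error instead of a confusing runtime CORS failure.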
Selected defaults from `Settings` (`backend/app/core/config.py`):

- `upload_chunk_size = 4194304`
- `max_upload_files_per_request = 50`
- `max_upload_file_size_bytes = 26214400`
- `max_upload_request_size_bytes = 104857600`
- `max_zip_members = 250`
- `max_zip_depth = 2`
- `max_zip_member_uncompressed_bytes = 26214400`
- `max_zip_total_uncompressed_bytes = 157286400`
- `max_zip_compression_ratio = 120.0`
- `max_text_length = 500000`
- `processing_log_max_document_sessions = 20`
- `processing_log_max_unbound_entries = 400`
- `default_openai_model = "gpt-4.1-mini"`
- `default_openai_timeout_seconds = 45`
- `default_summary_model = "gpt-4.1-mini"`
- `default_routing_model = "gpt-4.1-mini"`
- `typesense_timeout_seconds = 120`
- `typesense_num_retries = 0`
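The `max_zip_*` defaults above combine into a single admission check against zip bombs. A minimal sketch, assuming the archive is summarized as a per-member list of `(uncompressed, compressed)` byte counts; the actual backend guard wiring may differ.

```python
# Illustrative zip-bomb guard using the documented default limits.
def zip_within_limits(
    members: list[tuple[int, int]],  # (uncompressed_bytes, compressed_bytes) per member
    max_members: int = 250,
    max_member_bytes: int = 26_214_400,
    max_total_bytes: int = 157_286_400,
    max_ratio: float = 120.0,
) -> bool:
    """Check an archive listing against member-count, size, and ratio guards."""
    if len(members) > max_members:
        return False
    total = 0
    for uncompressed, compressed in members:
        if uncompressed > max_member_bytes:
            return False
        # Extreme expansion ratios are the classic zip-bomb signature.
        if compressed > 0 and uncompressed / compressed > max_ratio:
            return False
        total += uncompressed
    return total <= max_total_bytes
```

All four limits must pass; failing any one rejects the whole archive before extraction.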
## HTTPS Proxy Deployment Notes

This application supports both:

- local HTTP-only operation (no TLS termination in containers)
- HTTPS deployment behind a reverse proxy that handles TLS

## Frontend Configuration

Frontend runtime API target:

- `VITE_API_BASE` in the `docker-compose.yml` frontend service (leave empty for the host-derived default)

Frontend container runtime behavior:

- the container runs as the non-root `node` user
- `/app` is owned by `node` in `frontend/Dockerfile` so Vite can create runtime temp config files under `/app`

Frontend local commands:

```bash
cd frontend && npm run dev
cd frontend && npm run build
cd frontend && npm run preview
```
## Settings Persistence

Application-level settings managed from the UI are persisted by the backend settings service:

- file path: `<STORAGE_ROOT>/settings.json`
- endpoints: `/api/v1/settings`, `/api/v1/settings/reset`, `/api/v1/settings/handwriting`

Settings include:

- upload defaults
- display options
- processing-log retention options (`keep_document_sessions`, `keep_unbound_entries`)
- provider configuration
- OCR, summary, and routing task settings
- predefined paths and tags
- handwriting-style clustering settings

Read sanitization is resilient to corrupt persisted provider rows. If a persisted provider entry fails URL validation, the entry is skipped and defaults are used when no valid provider remains. This prevents unrelated read endpoints from failing due to stale invalid provider data.

Retention settings are used by worker cleanup and by `POST /api/v1/processing/logs/trim` when trim query values are not provided.
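The read-sanitization behavior can be sketched as a filter with a defaults fallback. Both `is_valid_provider_url` and the row shape below are stand-ins for the real validation and models, shown only to illustrate the skip-then-fall-back rule.

```python
# Illustrative provider-row sanitization with a defaults fallback.
from urllib.parse import urlparse


def is_valid_provider_url(base_url: str) -> bool:
    """Stand-in validation: require an absolute http(s) URL with a host."""
    parsed = urlparse(base_url)
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)


def sanitize_providers(
    persisted: list[dict[str, str]],
    defaults: list[dict[str, str]],
) -> list[dict[str, str]]:
    """Drop invalid persisted provider rows; use defaults when none survive."""
    valid = [row for row in persisted if is_valid_provider_url(row.get("base_url", ""))]
    return valid if valid else defaults
```

Because invalid rows are dropped rather than raised on, read endpoints that merely touch settings keep working even when stale provider data is corrupt.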
Recommended LIVE pattern:

1. The proxy terminates TLS and forwards to the `api` and `frontend` internal HTTP endpoints.
2. Keep container published ports bound to localhost or an internal network.
3. Set `PUBLIC_BASE_URL` and `VITE_API_BASE` to the final HTTPS URLs.
4. Set `CORS_ORIGINS` to the exact HTTPS frontend origins.
5. Rely on bearer-header auth; credentialed CORS is intentionally disabled in application code.
## Security Controls

- Privileged APIs are token-gated with bearer auth:
  - `documents` endpoints: user token or admin token
  - `settings` and `processing/logs` endpoints: admin token only
- Provider base URLs are validated on settings updates and before outbound model calls:
  - allowlist enforcement (`PROVIDER_BASE_URL_ALLOWLIST`)
  - scheme restrictions (`https` by default)
  - local/private-network blocking and per-request DNS revalidation checks for outbound runtime calls
- Upload and archive safety guards are enforced:
  - `POST /api/v1/documents/upload` requires `Content-Length` and enforces file-count, per-file size, and total request size limits
  - `OPTIONS /api/v1/documents/upload` CORS preflight is excluded from `Content-Length` enforcement
  - ZIP member count, per-member uncompressed size, total decompressed size, and compression-ratio guards
- Processing logs redact sensitive payload and text fields, and trim endpoints enforce retention caps from runtime config.
- Compose hardening defaults:
  - host ports bind to `127.0.0.1` unless the `HOST_BIND_IP` override is set
  - `api`, `worker`, and `frontend` drop all Linux capabilities and set `no-new-privileges`
  - backend and frontend containers run as non-root users by default
- CORS uses an explicit origin allowlist only; broad origin regex matching is removed.
- Worker Redis startup validates URL auth and TLS policy before consuming jobs.
- Provider API keys are encrypted at rest with standard AEAD (`cryptography` Fernet):
  - legacy `enc-v1` payloads are read for backward compatibility
  - new writes use `enc-v2`
- Processing logs default to metadata-only persistence.
- Markdown export enforces:
  - a max document count
  - max total markdown bytes
  - a per-user Redis-backed rate limit
  - spool-file streaming to avoid unbounded in-memory archives
- User-role document access is owner-scoped for non-admin accounts.

## Frontend Runtime

- Frontend no longer consumes `VITE_API_TOKEN`.
- Session token storage key is `dcm.access_token` in browser session storage.
- Protected media and file download flows still use authenticated fetch plus blob/object URL handling.
## Validation Checklist

After configuration changes, verify:

- `GET /api/v1/health` returns a healthy response
- login succeeds for the bootstrap admin user
- the frontend can list, upload, and search documents
- admin can upload, search, open previews, download, and export markdown
- a user account can only access its own documents
- admin-only settings and processing logs are not accessible to the user role
- processing worker logs show successful task execution
- settings save or reset works and persists after restart
- `docker compose logs -f api worker` shows no startup validation failures
`docker-compose.yml` (excerpt):

```yaml
services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${POSTGRES_USER:?POSTGRES_USER must be set}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?POSTGRES_PASSWORD must be set}
      POSTGRES_DB: ${POSTGRES_DB:?POSTGRES_DB must be set}
    ports:
      - "${HOST_BIND_IP:-127.0.0.1}:5432:5432"
    volumes:
      - db-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:?POSTGRES_USER must be set} -d ${POSTGRES_DB:?POSTGRES_DB must be set}"]
      interval: 10s
      timeout: 5s
      retries: 10

  redis:
    image: redis:7-alpine
    ports:
      - "${HOST_BIND_IP:-127.0.0.1}:6379:6379"
    command:
      - "redis-server"
      - "--appendonly"
      - "yes"
      - "--requirepass"
      - "${REDIS_PASSWORD:?REDIS_PASSWORD must be set}"
    volumes:
      - redis-data:/data

  typesense:
    image: typesense/typesense:29.0
    command:
      - "--data-dir=/data"
      - "--api-key=${TYPESENSE_API_KEY:?TYPESENSE_API_KEY must be set}"
      - "--enable-cors"
    ports:
      - "${HOST_BIND_IP:-127.0.0.1}:8108:8108"
    volumes:
      - typesense-data:/data

  api:
    build:
      context: ./backend
    environment:
      APP_ENV: ${APP_ENV:-development}
      DATABASE_URL: ${DATABASE_URL:?DATABASE_URL must be set}
      REDIS_URL: ${REDIS_URL:?REDIS_URL must be set}
      REDIS_SECURITY_MODE: ${REDIS_SECURITY_MODE:-auto}
      REDIS_TLS_MODE: ${REDIS_TLS_MODE:-auto}
      STORAGE_ROOT: /data/storage
      AUTH_BOOTSTRAP_ADMIN_USERNAME: ${AUTH_BOOTSTRAP_ADMIN_USERNAME:?AUTH_BOOTSTRAP_ADMIN_USERNAME must be set}
      AUTH_BOOTSTRAP_ADMIN_PASSWORD: ${AUTH_BOOTSTRAP_ADMIN_PASSWORD:?AUTH_BOOTSTRAP_ADMIN_PASSWORD must be set}
      AUTH_BOOTSTRAP_USER_USERNAME: ${AUTH_BOOTSTRAP_USER_USERNAME:-}
      AUTH_BOOTSTRAP_USER_PASSWORD: ${AUTH_BOOTSTRAP_USER_PASSWORD:-}
      APP_SETTINGS_ENCRYPTION_KEY: ${APP_SETTINGS_ENCRYPTION_KEY:?APP_SETTINGS_ENCRYPTION_KEY must be set}
      PROVIDER_BASE_URL_ALLOWLIST: '${PROVIDER_BASE_URL_ALLOWLIST:-[]}'
      PROVIDER_BASE_URL_ALLOW_HTTP: ${PROVIDER_BASE_URL_ALLOW_HTTP:-true}
      PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK: ${PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK:-true}
      PROCESSING_LOG_STORE_MODEL_IO_TEXT: ${PROCESSING_LOG_STORE_MODEL_IO_TEXT:-false}
      PROCESSING_LOG_STORE_PAYLOAD_TEXT: ${PROCESSING_LOG_STORE_PAYLOAD_TEXT:-false}
      CONTENT_EXPORT_MAX_DOCUMENTS: ${CONTENT_EXPORT_MAX_DOCUMENTS:-250}
      CONTENT_EXPORT_MAX_TOTAL_BYTES: ${CONTENT_EXPORT_MAX_TOTAL_BYTES:-52428800}
      CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE: ${CONTENT_EXPORT_RATE_LIMIT_PER_MINUTE:-6}
      OCR_LANGUAGES: eng,deu
      PUBLIC_BASE_URL: ${PUBLIC_BASE_URL:-http://localhost:8000}
      CORS_ORIGINS: '${CORS_ORIGINS:-["http://localhost:5173","http://localhost:3000"]}'
      TYPESENSE_PROTOCOL: http
      TYPESENSE_HOST: typesense
      TYPESENSE_PORT: 8108
      TYPESENSE_API_KEY: ${TYPESENSE_API_KEY:?TYPESENSE_API_KEY must be set}
      TYPESENSE_COLLECTION_NAME: documents
    ports:
      - "${HOST_BIND_IP:-127.0.0.1}:8000:8000"

  worker:
    build:
      context: ./backend
    command: ["python", "-m", "app.worker.run_worker"]
    environment:
      APP_ENV: ${APP_ENV:-development}
      DATABASE_URL: ${DATABASE_URL:?DATABASE_URL must be set}
      REDIS_URL: ${REDIS_URL:?REDIS_URL must be set}
      REDIS_SECURITY_MODE: ${REDIS_SECURITY_MODE:-auto}
      REDIS_TLS_MODE: ${REDIS_TLS_MODE:-auto}
      STORAGE_ROOT: /data/storage
      APP_SETTINGS_ENCRYPTION_KEY: ${APP_SETTINGS_ENCRYPTION_KEY:?APP_SETTINGS_ENCRYPTION_KEY must be set}
      PROVIDER_BASE_URL_ALLOWLIST: '${PROVIDER_BASE_URL_ALLOWLIST:-[]}'
      PROVIDER_BASE_URL_ALLOW_HTTP: ${PROVIDER_BASE_URL_ALLOW_HTTP:-true}
      PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK: ${PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK:-true}
      PROCESSING_LOG_STORE_MODEL_IO_TEXT: ${PROCESSING_LOG_STORE_MODEL_IO_TEXT:-false}
      PROCESSING_LOG_STORE_PAYLOAD_TEXT: ${PROCESSING_LOG_STORE_PAYLOAD_TEXT:-false}
      OCR_LANGUAGES: eng,deu
      PUBLIC_BASE_URL: ${PUBLIC_BASE_URL:-http://localhost:8000}
      TYPESENSE_PROTOCOL: http
      TYPESENSE_HOST: typesense
      TYPESENSE_PORT: 8108
      TYPESENSE_API_KEY: ${TYPESENSE_API_KEY:?TYPESENSE_API_KEY must be set}
      TYPESENSE_COLLECTION_NAME: documents
    volumes:
      - ./backend/app:/app/app

  frontend:
    build:
      context: ./frontend
    environment:
      VITE_API_BASE: ${VITE_API_BASE:-}
    ports:
      - "${HOST_BIND_IP:-127.0.0.1}:5173:5173"
    volumes:
```

Frontend `App` component excerpts:

@@ -3,9 +3,11 @@
 */
import { useCallback, useEffect, useMemo, useRef, useState } from 'react';
import type { JSX } from 'react';
import { LogOut, User } from 'lucide-react';

import ActionModal from './components/ActionModal';
import DocumentGrid from './components/DocumentGrid';
import LoginScreen from './components/LoginScreen';
import DocumentViewer from './components/DocumentViewer';
import PathInput from './components/PathInput';
import ProcessingLogPanel from './components/ProcessingLogPanel';
@@ -17,22 +19,28 @@ import {
  downloadBlobFile,
  deleteDocument,
  exportContentsMarkdown,
  getCurrentAuthSession,
  getRuntimeApiToken,
  getAppSettings,
  listDocuments,
  listPaths,
  listProcessingLogs,
  listTags,
  listTypes,
  loginWithPassword,
  logoutCurrentSession,
  resetAppSettings,
  setRuntimeApiToken,
  searchDocuments,
  trashDocument,
  updateAppSettings,
  uploadDocuments,
} from './lib/api';
import type { AppSettings, AppSettingsUpdate, AuthUser, DmsDocument, ProcessingLogEntry } from './types';

type AppScreen = 'documents' | 'settings';
type DocumentView = 'active' | 'trash';
type AuthPhase = 'checking' | 'unauthenticated' | 'authenticated';

interface DialogOption {
  key: string;
@@ -51,6 +59,10 @@ interface DialogState {
 */
export default function App(): JSX.Element {
  const DEFAULT_PAGE_SIZE = 12;
  const [authPhase, setAuthPhase] = useState<AuthPhase>('checking');
  const [authUser, setAuthUser] = useState<AuthUser | null>(null);
  const [authError, setAuthError] = useState<string | null>(null);
  const [isAuthenticating, setIsAuthenticating] = useState<boolean>(false);
  const [screen, setScreen] = useState<AppScreen>('documents');
  const [documentView, setDocumentView] = useState<DocumentView>('active');
  const [documents, setDocuments] = useState<DmsDocument[]>([]);
@@ -82,6 +94,7 @@ export default function App(): JSX.Element {
  const [error, setError] = useState<string | null>(null);
  const [dialogState, setDialogState] = useState<DialogState | null>(null);
  const dialogResolverRef = useRef<((value: string) => void) | null>(null);
  const isAdmin = authUser?.role === 'admin';

  const pageSize = useMemo(() => {
    const configured = appSettings?.display?.cards_per_page;
@@ -118,6 +131,74 @@ export default function App(): JSX.Element {
    }
  }, []);

  /**
   * Clears workspace state when authentication context changes or the session is revoked.
   */
  const resetApplicationState = useCallback((): void => {
    setScreen('documents');
    setDocumentView('active');
    setDocuments([]);
    setTotalDocuments(0);
    setCurrentPage(1);
    setSearchText('');
    setActiveSearchQuery('');
    setSelectedDocumentId(null);
    setSelectedDocumentIds([]);
    setExportPathInput('');
    setTagFilter('');
    setTypeFilter('');
    setPathFilter('');
    setProcessedFrom('');
    setProcessedTo('');
    setKnownTags([]);
    setKnownPaths([]);
    setKnownTypes([]);
    setAppSettings(null);
    setSettingsSaveAction(null);
    setProcessingLogs([]);
    setProcessingLogError(null);
    setError(null);
  }, []);

  /**
   * Exchanges submitted credentials for a server-issued bearer session and activates the app shell.
   */
  const handleLogin = useCallback(async (username: string, password: string): Promise<void> => {
    setIsAuthenticating(true);
    setAuthError(null);
    try {
      const payload = await loginWithPassword(username, password);
      setRuntimeApiToken(payload.access_token);
      setAuthUser(payload.user);
      setAuthPhase('authenticated');
      setError(null);
    } catch (caughtError) {
      const message = caughtError instanceof Error ? caughtError.message : 'Login failed';
      setAuthError(message);
      setRuntimeApiToken(null);
      setAuthUser(null);
      setAuthPhase('unauthenticated');
      resetApplicationState();
    } finally {
      setIsAuthenticating(false);
    }
  }, [resetApplicationState]);

  /**
   * Revokes the current session server-side when possible and always clears local auth state.
   */
  const handleLogout = useCallback(async (): Promise<void> => {
    setError(null);
    try {
      await logoutCurrentSession();
    } catch {
      // Ignore revocation errors; local auth state is cleared regardless.
    }
    setRuntimeApiToken(null);
    setAuthUser(null);
    setAuthError(null);
    setAuthPhase('unauthenticated');
    resetApplicationState();
  }, [resetApplicationState]);

  const loadCatalogs = useCallback(async (): Promise<void> => {
    const [tags, paths, types] = await Promise.all([listTags(true), listPaths(true), listTypes(true)]);
    setKnownTags(tags);
@@ -185,6 +266,10 @@ export default function App(): JSX.Element {
  ]);

  const loadSettings = useCallback(async (): Promise<void> => {
    if (!isAdmin) {
      setAppSettings(null);
      return;
    }
    setError(null);
    try {
      const payload = await getAppSettings();
@@ -192,9 +277,14 @@ export default function App(): JSX.Element {
    } catch (caughtError) {
      setError(caughtError instanceof Error ? caughtError.message : 'Failed to load settings');
    }
  }, [isAdmin]);

  const loadProcessingTimeline = useCallback(async (options?: { silent?: boolean }): Promise<void> => {
    if (!isAdmin) {
      setProcessingLogs([]);
      setProcessingLogError(null);
      return;
    }
    const silent = options?.silent ?? false;
    if (!silent) {
      setIsLoadingLogs(true);
@@ -210,18 +300,52 @@ export default function App(): JSX.Element {
        setIsLoadingLogs(false);
      }
    }
  }, [isAdmin]);

  useEffect(() => {
    const existingToken = getRuntimeApiToken();
    if (!existingToken) {
      setAuthPhase('unauthenticated');
      setAuthUser(null);
      return;
    }

    const resolveSession = async (): Promise<void> => {
      try {
        const sessionPayload = await getCurrentAuthSession();
        setAuthUser(sessionPayload.user);
        setAuthError(null);
        setAuthPhase('authenticated');
      } catch {
        setRuntimeApiToken(null);
        setAuthUser(null);
        setAuthPhase('unauthenticated');
        resetApplicationState();
      }
    };
    void resolveSession();
  }, [resetApplicationState]);

  useEffect(() => {
    if (authPhase !== 'authenticated') {
      return;
    }
    const bootstrap = async (): Promise<void> => {
      try {
        if (isAdmin) {
          await Promise.all([loadDocuments(), loadCatalogs(), loadSettings(), loadProcessingTimeline()]);
          return;
        }
        await Promise.all([loadDocuments(), loadCatalogs()]);
        setAppSettings(null);
        setProcessingLogs([]);
        setProcessingLogError(null);
      } catch (caughtError) {
        setError(caughtError instanceof Error ? caughtError.message : 'Failed to initialize application');
      }
    };
    void bootstrap();
  }, [authPhase, isAdmin, loadCatalogs, loadDocuments, loadProcessingTimeline, loadSettings]);

  useEffect(() => {
    setSelectedDocumentIds([]);
@@ -229,13 +353,25 @@ export default function App(): JSX.Element {
  }, [documentView, pageSize]);

  useEffect(() => {
    if (!isAdmin && screen === 'settings') {
      setScreen('documents');
    }
  }, [isAdmin, screen]);

  useEffect(() => {
    if (authPhase !== 'authenticated') {
      return;
    }
    if (screen !== 'documents') {
      return;
    }
    void loadDocuments();
  }, [authPhase, loadDocuments, screen]);

  useEffect(() => {
    if (authPhase !== 'authenticated') {
      return;
    }
    if (screen !== 'documents') {
      return;
    }
@@ -243,9 +379,12 @@ export default function App(): JSX.Element {
      void loadDocuments({ silent: true });
    }, 3000);
    return () => window.clearInterval(pollInterval);
  }, [authPhase, loadDocuments, screen]);

  useEffect(() => {
    if (authPhase !== 'authenticated' || !isAdmin) {
      return;
    }
    if (screen !== 'documents') {
      return;
    }
@@ -254,7 +393,7 @@ export default function App(): JSX.Element {
      void loadProcessingTimeline({ silent: true });
    }, 1500);
    return () => window.clearInterval(pollInterval);
  }, [authPhase, isAdmin, loadProcessingTimeline, screen]);

  const selectedDocument = useMemo(
    () => documents.find((document) => document.id === selectedDocumentId) ?? null,
@@ -299,13 +438,17 @@ export default function App(): JSX.Element {
        });
      }

      if (isAdmin) {
        await Promise.all([loadDocuments(), loadCatalogs(), loadProcessingTimeline()]);
      } else {
        await Promise.all([loadDocuments(), loadCatalogs()]);
      }
    } catch (caughtError) {
      setError(caughtError instanceof Error ? caughtError.message : 'Upload failed');
    } finally {
      setIsUploading(false);
    }
  }, [appSettings, isAdmin, loadCatalogs, loadDocuments, loadProcessingTimeline, presentDialog]);

  const handleSearch = useCallback(async (): Promise<void> => {
    setSelectedDocumentIds([]);
@@ -579,14 +722,35 @@ export default function App(): JSX.Element {
    setCurrentPage(1);
  }, []);

  if (authPhase === 'checking') {
    return (
      <main className="auth-shell">
        <section className="auth-card">
          <h1>LedgerDock</h1>
          <p>Checking current session...</p>
        </section>
      </main>
    );
  }

  if (authPhase !== 'authenticated') {
    return <LoginScreen error={authError} isSubmitting={isAuthenticating} onSubmit={handleLogin} />;
  }

  return (
    <main className="app-shell">
      <header className="topbar">
        <div className="topbar-inner">
          <div className="topbar-brand">
            <h1>LedgerDock</h1>
            <p>Document command deck for OCR, routing intelligence, and controlled metadata ops.</p>
            <p className="topbar-auth-status">
              <User className="topbar-user-icon" aria-hidden="true" />
              You are currently signed in as <span className="topbar-current-username">{authUser?.username}</span>
            </p>
          </div>
          <div className="topbar-controls">
            <div className="topbar-primary-row">
              <div className="topbar-nav-group">
                <button
                  type="button"
@@ -608,6 +772,7 @@ export default function App(): JSX.Element {
                >
                  Trash
                </button>
                {isAdmin && (
                  <button
                    type="button"
                    className={screen === 'settings' ? 'active-view-button' : 'secondary-action'}
@@ -615,6 +780,16 @@ export default function App(): JSX.Element {
                  >
                    Settings
                  </button>
                )}
              </div>
              <button
                type="button"
                className="secondary-action topbar-icon-action"
                onClick={() => void handleLogout()}
                aria-label="Sign out"
              >
                <LogOut className="topbar-signout-icon" aria-hidden="true" />
              </button>
            </div>

            {screen === 'documents' && (
@@ -623,7 +798,7 @@ export default function App(): JSX.Element {
              </div>
            )}

            {screen === 'settings' && isAdmin && (
              <div className="topbar-settings-group">
                <button type="button" className="secondary-action" onClick={() => void handleResetSettings()} disabled={isSavingSettings}>
                  Reset To Defaults
@@ -634,11 +809,12 @@ export default function App(): JSX.Element {
              </div>
            )}
          </div>
        </div>
      </header>

      {error && <p className="error-banner">{error}</p>}

      {screen === 'settings' && isAdmin && (
        <SettingsScreen
          settings={appSettings}
          isSaving={isSavingSettings}
@@ -762,7 +938,8 @@ export default function App(): JSX.Element {
              requestConfirmation={requestConfirmation}
            />
          </section>
          {isAdmin && processingLogError && <p className="error-banner">{processingLogError}</p>}
          {isAdmin && (
            <ProcessingLogPanel
              entries={processingLogs}
              isLoading={isLoadingLogs}
@@ -772,6 +949,7 @@ export default function App(): JSX.Element {
              typingAnimationEnabled={typingAnimationEnabled}
              onClear={() => void handleClearProcessingLogs()}
            />
          )}
        </>
      )}

@@ -19,6 +19,47 @@ import type { DmsDocument, DmsDocumentDetail } from '../types';
import PathInput from './PathInput';
import TagInput from './TagInput';

const SAFE_IMAGE_PREVIEW_MIME_TYPES = new Set<string>([
'image/bmp',
'image/gif',
'image/jpeg',
'image/jpg',
'image/png',
'image/webp',
]);

const SAFE_IFRAME_PREVIEW_MIME_TYPES = new Set<string>([
'application/json',
'application/pdf',
'text/csv',
'text/markdown',
'text/plain',
]);

/**
 * Normalizes MIME values by stripping parameters and lowercasing for stable comparison.
 */
function normalizeMimeType(mimeType: string | null | undefined): string {
if (!mimeType) {
return '';
}
return mimeType.split(';')[0]?.trim().toLowerCase() ?? '';
}

/**
 * Resolves whether a MIME type is safe to render as an image preview.
 */
function isSafeImagePreviewMimeType(mimeType: string): boolean {
return SAFE_IMAGE_PREVIEW_MIME_TYPES.has(mimeType);
}

/**
 * Resolves whether a MIME type is safe to render inside a sandboxed iframe preview.
 */
function isSafeIframePreviewMimeType(mimeType: string): boolean {
return SAFE_IFRAME_PREVIEW_MIME_TYPES.has(mimeType);
}
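Taken together, these helpers gate previews on an explicit allowlist rather than on MIME-prefix checks. A standalone sketch of the same decision (helper bodies mirror the diff above; the sample MIME strings are illustrative):

```typescript
// Mirrors the preview-safety helpers above: parameters are stripped, the value
// is lowercased, and the result is compared against an explicit allowlist, so
// "Application/PDF; version=1.7" resolves to the allowlisted "application/pdf".
const SAFE_IFRAME_PREVIEW_MIME_TYPES = new Set<string>([
  'application/json',
  'application/pdf',
  'text/csv',
  'text/markdown',
  'text/plain',
]);

function normalizeMimeType(mimeType: string | null | undefined): string {
  if (!mimeType) {
    return '';
  }
  return mimeType.split(';')[0]?.trim().toLowerCase() ?? '';
}

function isSafeIframePreviewMimeType(mimeType: string): boolean {
  return SAFE_IFRAME_PREVIEW_MIME_TYPES.has(mimeType);
}

console.log(isSafeIframePreviewMimeType(normalizeMimeType('Application/PDF; version=1.7'))); // true
console.log(isSafeIframePreviewMimeType(normalizeMimeType('text/html'))); // false — HTML never renders inline
```

The allowlist approach means an unknown or spoofed MIME value fails closed: anything not explicitly listed falls through to the "Preview blocked" branch.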

/**
 * Defines props for the selected document viewer panel.
 */
@@ -60,6 +101,30 @@ export default function DocumentViewer({
const [error, setError] = useState<string | null>(null);
const previewObjectUrlRef = useRef<string | null>(null);

/**
 * Resolves normalized MIME type used by preview safety checks.
 */
const previewMimeType = useMemo(() => normalizeMimeType(document?.mime_type), [document?.mime_type]);

/**
 * Resolves whether selected document should render as a safe image element in preview.
 */
const isImageDocument = useMemo(() => {
return isSafeImagePreviewMimeType(previewMimeType);
}, [previewMimeType]);

/**
 * Resolves whether selected document should render in sandboxed iframe preview.
 */
const canRenderIframePreview = useMemo(() => {
return isSafeIframePreviewMimeType(previewMimeType);
}, [previewMimeType]);

/**
 * Resolves whether selected document supports any inline preview mode.
 */
const canRenderInlinePreview = isImageDocument || canRenderIframePreview;

/**
 * Syncs editable metadata fields whenever selection changes.
 */
@@ -100,6 +165,12 @@ export default function DocumentViewer({
setIsLoadingPreview(false);
return;
}
if (!canRenderInlinePreview) {
revokePreviewObjectUrl();
setPreviewObjectUrl(null);
setIsLoadingPreview(false);
return;
}

let cancelled = false;
setIsLoadingPreview(true);
@@ -131,7 +202,7 @@ export default function DocumentViewer({
cancelled = true;
revokePreviewObjectUrl();
};
}, [document?.id]);
}, [document?.id, canRenderInlinePreview]);

/**
 * Refreshes editable metadata from list updates only while form is clean.
@@ -183,16 +254,6 @@ export default function DocumentViewer({
};
}, [document?.id]);

/**
 * Resolves whether selected document should render as an image element in preview.
 */
const isImageDocument = useMemo(() => {
if (!document) {
return false;
}
return document.mime_type.startsWith('image/');
}, [document]);

/**
 * Extracts provider/transcription errors from document metadata for user visibility.
 */
@@ -482,11 +543,22 @@ export default function DocumentViewer({
{previewObjectUrl ? (
isImageDocument ? (
<img src={previewObjectUrl} alt={document.original_filename} />
) : canRenderIframePreview ? (
<iframe
src={previewObjectUrl}
title={document.original_filename}
sandbox=""
referrerPolicy="no-referrer"
allow="clipboard-read 'none'; clipboard-write 'none'; geolocation 'none'; microphone 'none'; camera 'none'; payment 'none'; usb 'none'; fullscreen 'none'"
loading="lazy"
/>
) : (
<iframe src={previewObjectUrl} title={document.original_filename} />
<p className="small">Preview blocked for this file type. Download to inspect safely.</p>
)
) : isLoadingPreview ? (
<p className="small">Loading preview...</p>
) : !canRenderInlinePreview ? (
<p className="small">Preview blocked for this file type. Download to inspect safely.</p>
) : (
<p className="small">Preview unavailable for this document.</p>
)}
71
frontend/src/components/LoginScreen.tsx
Normal file
@@ -0,0 +1,71 @@
/**
 * Login screen for session-based authentication before loading protected application views.
 */
import { FormEvent, useState } from 'react';
import type { JSX } from 'react';

interface LoginScreenProps {
error: string | null;
isSubmitting: boolean;
onSubmit: (username: string, password: string) => Promise<void>;
}

/**
 * Renders credential form used to issue per-user API bearer sessions.
 */
export default function LoginScreen({
error,
isSubmitting,
onSubmit,
}: LoginScreenProps): JSX.Element {
const [username, setUsername] = useState<string>('');
const [password, setPassword] = useState<string>('');

/**
 * Submits credentials and leaves result handling to parent application orchestration.
 */
const handleSubmit = (event: FormEvent<HTMLFormElement>): void => {
event.preventDefault();
if (isSubmitting) {
return;
}
void onSubmit(username, password);
};

return (
<main className="auth-shell">
<section className="auth-card">
<h1>LedgerDock</h1>
<p>Sign in with your account to access documents and role-scoped controls.</p>
<form onSubmit={handleSubmit} className="auth-form">
<label>
Username
<input
type="text"
value={username}
onChange={(event) => setUsername(event.target.value)}
autoComplete="username"
required
disabled={isSubmitting}
/>
</label>
<label>
Password
<input
type="password"
value={password}
onChange={(event) => setPassword(event.target.value)}
autoComplete="current-password"
required
disabled={isSubmitting}
/>
</label>
<button type="submit" disabled={isSubmitting}>
{isSubmitting ? 'Signing In...' : 'Sign In'}
</button>
</form>
{error && <p className="error-banner">{error}</p>}
</section>
</main>
);
}
@@ -1,5 +1,16 @@
// @ts-expect-error Node strip-types runtime requires explicit .ts extension in ESM imports.
import { downloadDocumentContentMarkdown, downloadDocumentFile, getDocumentPreviewBlob, getDocumentThumbnailBlob } from './api.ts';
// @ts-ignore Node strip-types runtime requires explicit .ts extension in ESM imports.
import {
downloadDocumentContentMarkdown,
downloadDocumentFile,
getCurrentAuthSession,
getDocumentPreviewBlob,
getDocumentThumbnailBlob,
getRuntimeApiToken,
loginWithPassword,
logoutCurrentSession,
setRuntimeApiToken,
updateDocumentMetadata,
} from './api.ts';

/**
 * Throws when a test condition is false.
@@ -25,15 +36,65 @@ async function assertRejects(action: () => Promise<unknown>, expectedMessage: st
}

/**
 * Runs API helper tests for authenticated media and download flows.
 * Converts fetch inputs into a URL string for assertions.
 */
function toRequestUrl(input: RequestInfo | URL): string {
if (typeof input === 'string') {
return input;
}
if (input instanceof URL) {
return input.toString();
}
return input.url;
}

/**
 * Creates a minimal session storage implementation for Node-based tests.
 */
function createMemorySessionStorage(): Storage {
const values = new Map<string, string>();
return {
get length(): number {
return values.size;
},
clear(): void {
values.clear();
},
getItem(key: string): string | null {
return values.has(key) ? values.get(key) ?? null : null;
},
key(index: number): string | null {
return Array.from(values.keys())[index] ?? null;
},
removeItem(key: string): void {
values.delete(key);
},
setItem(key: string, value: string): void {
values.set(key, String(value));
},
};
}

/**
 * Runs API helper tests for authenticated media and auth session workflows.
 */
async function runApiTests(): Promise<void> {
const originalFetch = globalThis.fetch;
const sessionStorageDescriptor = Object.getOwnPropertyDescriptor(globalThis, 'sessionStorage');

try {
Object.defineProperty(globalThis, 'sessionStorage', {
configurable: true,
writable: true,
value: createMemorySessionStorage(),
});
setRuntimeApiToken(null);

const requestUrls: string[] = [];
globalThis.fetch = (async (input: RequestInfo | URL): Promise<Response> => {
requestUrls.push(typeof input === 'string' ? input : input.toString());
const requestAuthHeaders: Array<string | null> = [];
globalThis.fetch = (async (input: RequestInfo | URL, init?: RequestInit): Promise<Response> => {
requestUrls.push(toRequestUrl(input));
requestAuthHeaders.push(new Headers(init?.headers).get('Authorization'));
return new Response('preview-bytes', { status: 200 });
}) as typeof fetch;

@@ -50,6 +111,69 @@ async function runApiTests(): Promise<void> {
requestUrls[1] === 'http://localhost:8000/api/v1/documents/doc-1/preview',
`Unexpected preview URL ${requestUrls[1]}`,
);
assert(requestAuthHeaders[0] === null, `Expected no auth header for thumbnail request, got "${requestAuthHeaders[0]}"`);
assert(requestAuthHeaders[1] === null, `Expected no auth header for preview request, got "${requestAuthHeaders[1]}"`);

setRuntimeApiToken('session-user-token');
assert(getRuntimeApiToken() === 'session-user-token', 'Expected session token readback to match persisted token');
globalThis.fetch = (async (_input: RequestInfo | URL, init?: RequestInit): Promise<Response> => {
const authHeader = new Headers(init?.headers).get('Authorization');
assert(authHeader === 'Bearer session-user-token', `Expected session token auth header, got "${authHeader}"`);
return new Response('preview-bytes', { status: 200 });
}) as typeof fetch;
await getDocumentPreviewBlob('doc-session-auth');

let mergedContentType: string | null = null;
let mergedAuthorization: string | null = null;
globalThis.fetch = (async (_input: RequestInfo | URL, init?: RequestInit): Promise<Response> => {
const headers = new Headers(init?.headers);
mergedContentType = headers.get('Content-Type');
mergedAuthorization = headers.get('Authorization');
return new Response('{}', { status: 200 });
}) as typeof fetch;
await updateDocumentMetadata('doc-headers', { original_filename: 'renamed.pdf' });
assert(mergedContentType === 'application/json', `Expected JSON content type to be preserved, got "${mergedContentType}"`);
assert(mergedAuthorization === 'Bearer session-user-token', `Expected auth header, got "${mergedAuthorization}"`);

globalThis.fetch = (async (): Promise<Response> => {
return new Response(
JSON.stringify({
access_token: 'issued-session-token',
token_type: 'bearer',
expires_at: '2026-03-01T10:30:00Z',
user: {
id: '3a42f5e0-b1ad-4f68-b2f4-3fa8c2fb31c9',
username: 'admin',
role: 'admin',
},
}),
{ status: 200, headers: { 'Content-Type': 'application/json' } },
);
}) as typeof fetch;
const loginPayload = await loginWithPassword('admin', 'password');
assert(loginPayload.access_token === 'issued-session-token', 'Unexpected issued session token in login payload');
assert(loginPayload.user.username === 'admin', 'Unexpected login user payload');

globalThis.fetch = (async (): Promise<Response> => {
return new Response(
JSON.stringify({
expires_at: '2026-03-01T10:30:00Z',
user: {
id: '3a42f5e0-b1ad-4f68-b2f4-3fa8c2fb31c9',
username: 'admin',
role: 'admin',
},
}),
{ status: 200, headers: { 'Content-Type': 'application/json' } },
);
}) as typeof fetch;
const sessionPayload = await getCurrentAuthSession();
assert(sessionPayload.user.role === 'admin', 'Expected admin role from auth session payload');

globalThis.fetch = (async (): Promise<Response> => {
return new Response('{}', { status: 200, headers: { 'Content-Type': 'application/json' } });
}) as typeof fetch;
await logoutCurrentSession();

globalThis.fetch = (async (): Promise<Response> => {
return new Response('file-bytes', {
@@ -78,6 +202,12 @@ async function runApiTests(): Promise<void> {

await assertRejects(async () => downloadDocumentContentMarkdown('doc-4'), 'Failed to download document markdown');
} finally {
setRuntimeApiToken(null);
if (sessionStorageDescriptor) {
Object.defineProperty(globalThis, 'sessionStorage', sessionStorageDescriptor);
} else {
delete (globalThis as { sessionStorage?: Storage }).sessionStorage;
}
globalThis.fetch = originalFetch;
}
}
@@ -4,6 +4,8 @@
import type {
AppSettings,
AppSettingsUpdate,
AuthLoginResponse,
AuthSessionInfo,
DocumentListResponse,
DmsDocument,
DmsDocumentDetail,
@@ -14,28 +16,99 @@ import type {
} from '../types';

/**
 * Resolves backend base URL from environment with localhost fallback.
 * Resolves backend base URL from environment with host-derived HTTP fallback.
 */
const API_BASE = import.meta.env?.VITE_API_BASE ?? 'http://localhost:8000/api/v1';
function resolveApiBase(): string {
const envValue = import.meta.env?.VITE_API_BASE;
if (typeof envValue === 'string') {
const trimmed = envValue.trim().replace(/\/+$/, '');
if (trimmed) {
return trimmed;
}
}

if (typeof window !== 'undefined' && window.location?.hostname) {
return `${window.location.protocol}//${window.location.hostname}:8000/api/v1`;
}
return 'http://localhost:8000/api/v1';
}

const API_BASE = resolveApiBase();

/**
 * Optional bearer token used for authenticated backend routes.
 * Session storage key used for per-user runtime token persistence.
 */
const API_TOKEN = import.meta.env?.VITE_API_TOKEN?.trim();
export const API_TOKEN_RUNTIME_STORAGE_KEY = 'dcm.access_token';

type ApiRequestInit = Omit<RequestInit, 'headers'> & { headers?: HeadersInit };

type ApiErrorPayload = { detail?: string } | null;

/**
 * Merges request headers and appends bearer authorization when configured.
 * Normalizes candidate token values by trimming whitespace and filtering non-string values.
 */
function normalizeBearerToken(candidate: unknown): string | undefined {
if (typeof candidate !== 'string') {
return undefined;
}
const normalized = candidate.trim();
return normalized ? normalized : undefined;
}

/**
 * Resolves bearer token persisted for current browser session.
 */
export function getRuntimeApiToken(): string | undefined {
if (typeof globalThis.sessionStorage === 'undefined') {
return undefined;
}
try {
return normalizeBearerToken(globalThis.sessionStorage.getItem(API_TOKEN_RUNTIME_STORAGE_KEY));
} catch {
return undefined;
}
}

/**
 * Resolves bearer token from authenticated browser-session storage.
 */
function resolveApiToken(): string | undefined {
return getRuntimeApiToken();
}

/**
 * Stores or clears the per-user runtime API token in session storage.
 *
 * @param token Token value to persist for this browser session; clears persisted token when empty.
 */
export function setRuntimeApiToken(token: string | null | undefined): void {
if (typeof globalThis.sessionStorage === 'undefined') {
return;
}
try {
const normalized = normalizeBearerToken(token);
if (normalized) {
globalThis.sessionStorage.setItem(API_TOKEN_RUNTIME_STORAGE_KEY, normalized);
return;
}
globalThis.sessionStorage.removeItem(API_TOKEN_RUNTIME_STORAGE_KEY);
} catch {
return;
}
}
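The get/set pair above reduces to a trim-and-store round-trip against `sessionStorage`. A minimal sketch with the storage dependency injected so it runs outside a browser (the Map-backed store is a stand-in for `sessionStorage`; the key name matches the diff, the `TokenStore` interface and function names here are illustrative):

```typescript
// Simplified round-trip of the runtime-token helpers above, with storage
// injected instead of reading globalThis.sessionStorage directly.
const API_TOKEN_RUNTIME_STORAGE_KEY = 'dcm.access_token';

interface TokenStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

function normalizeBearerToken(candidate: unknown): string | undefined {
  if (typeof candidate !== 'string') {
    return undefined;
  }
  const normalized = candidate.trim();
  return normalized ? normalized : undefined;
}

function setToken(store: TokenStore, token: string | null | undefined): void {
  const normalized = normalizeBearerToken(token);
  if (normalized) {
    store.setItem(API_TOKEN_RUNTIME_STORAGE_KEY, normalized);
    return;
  }
  store.removeItem(API_TOKEN_RUNTIME_STORAGE_KEY); // empty/null clears the session token
}

function getToken(store: TokenStore): string | undefined {
  return normalizeBearerToken(store.getItem(API_TOKEN_RUNTIME_STORAGE_KEY));
}

// In-memory stand-in for window.sessionStorage.
const values = new Map<string, string>();
const store: TokenStore = {
  getItem: (key) => values.get(key) ?? null,
  setItem: (key, value) => void values.set(key, value),
  removeItem: (key) => void values.delete(key),
};

setToken(store, '  issued-token  ');
console.log(getToken(store)); // 'issued-token' (whitespace trimmed)
setToken(store, null);
console.log(getToken(store)); // undefined (cleared)
```

Because both reads and writes funnel through `normalizeBearerToken`, a token that is blank or non-string can never be persisted or returned, which is the same invariant the production helpers enforce.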

/**
 * Merges request headers and appends bearer authorization when a token can be resolved.
 */
function buildRequestHeaders(headers?: HeadersInit): Headers | undefined {
if (!API_TOKEN && !headers) {
const apiToken = resolveApiToken();
if (!apiToken && !headers) {
return undefined;
}

const requestHeaders = new Headers(headers);
if (API_TOKEN) {
requestHeaders.set('Authorization', `Bearer ${API_TOKEN}`);
if (apiToken) {
requestHeaders.set('Authorization', `Bearer ${apiToken}`);
}
return requestHeaders;
}
@@ -51,6 +124,21 @@ function apiRequest(input: string, init: ApiRequestInit = {}): Promise<Response>
});
}

/**
 * Extracts backend error detail text from JSON payloads when available.
 */
async function responseErrorDetail(response: Response): Promise<string> {
try {
const payload = (await response.json()) as ApiErrorPayload;
if (payload && typeof payload.detail === 'string' && payload.detail.trim()) {
return payload.detail.trim();
}
} catch {
return '';
}
return '';
}
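`responseErrorDetail` defends against non-JSON bodies and non-string `detail` fields. The core decision, separated from the async `Response` handling, can be sketched as a synchronous guard (the function name `errorDetailFromPayload` is illustrative, not from the codebase):

```typescript
// Illustrative synchronous version of the detail-extraction rule used by
// responseErrorDetail above: only a non-empty string `detail` survives;
// everything else (missing, null, wrong type, whitespace-only) collapses to ''.
type ApiErrorPayload = { detail?: unknown } | null;

function errorDetailFromPayload(payload: ApiErrorPayload): string {
  if (payload && typeof payload.detail === 'string' && payload.detail.trim()) {
    return payload.detail.trim();
  }
  return '';
}

console.log(errorDetailFromPayload({ detail: '  token expired  ' })); // 'token expired'
console.log(errorDetailFromPayload({ detail: 42 })); // ''
console.log(errorDetailFromPayload(null)); // ''
```

Returning `''` for every malformed case lets callers use a simple truthiness check (`if (detail) throw new Error(detail)`) and fall back to a generic message, which is exactly the pattern the auth helpers below follow.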

/**
 * Encodes query parameters while skipping undefined and null values.
 */
@@ -94,6 +182,59 @@ export function downloadBlobFile(blob: Blob, filename: string): void {
}, 0);
}

/**
 * Authenticates one user and returns issued bearer token plus role-bound session metadata.
 */
export async function loginWithPassword(username: string, password: string): Promise<AuthLoginResponse> {
const response = await fetch(`${API_BASE}/auth/login`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
username: username.trim(),
password,
}),
});
if (!response.ok) {
const detail = await responseErrorDetail(response);
if (detail) {
throw new Error(detail);
}
throw new Error('Login failed');
}
return response.json() as Promise<AuthLoginResponse>;
}

/**
 * Loads currently authenticated user session metadata.
 */
export async function getCurrentAuthSession(): Promise<AuthSessionInfo> {
const response = await apiRequest(`${API_BASE}/auth/me`);
if (!response.ok) {
const detail = await responseErrorDetail(response);
if (detail) {
throw new Error(detail);
}
throw new Error('Failed to load authentication session');
}
return response.json() as Promise<AuthSessionInfo>;
}

/**
 * Revokes the current authenticated bearer session.
 */
export async function logoutCurrentSession(): Promise<void> {
const response = await apiRequest(`${API_BASE}/auth/logout`, {
method: 'POST',
});
if (!response.ok && response.status !== 401) {
const detail = await responseErrorDetail(response);
if (detail) {
throw new Error(detail);
}
throw new Error('Failed to logout');
}
}

/**
 * Loads documents from the backend list endpoint.
 */
@@ -495,7 +636,8 @@ export async function updateAppSettings(payload: AppSettingsUpdate): Promise<App
body: JSON.stringify(payload),
});
if (!response.ok) {
throw new Error('Failed to update settings');
const detail = await responseErrorDetail(response);
throw new Error(detail ? `Failed to update settings: ${detail}` : 'Failed to update settings');
}
return response.json() as Promise<AppSettings>;
}
@@ -4,11 +4,58 @@
.app-shell {
width: min(1820px, 100% - 2rem);
margin: 0 auto;
padding: var(--space-3) 0 var(--space-4);
padding: 0 0 var(--space-4);
display: grid;
gap: var(--space-3);
}

.auth-shell {
min-height: 100vh;
display: grid;
place-items: center;
padding: var(--space-4) var(--space-2);
}

.auth-card {
width: min(430px, 100%);
display: grid;
gap: var(--space-2);
padding: var(--space-3);
border: 1px solid var(--color-border-strong);
border-radius: var(--radius-lg);
background: linear-gradient(180deg, rgba(28, 42, 63, 0.95) 0%, rgba(20, 30, 47, 0.95) 100%);
box-shadow: var(--shadow-soft);
}

.auth-card h1 {
margin: 0;
font-family: var(--font-display);
font-size: clamp(1.4rem, 2.1vw, 2rem);
}

.auth-card p {
margin: 0;
color: var(--color-text-muted);
font-size: 0.88rem;
}

.auth-form {
display: grid;
gap: var(--space-2);
}

.auth-form label {
display: grid;
gap: 0.35rem;
font-size: 0.8rem;
color: var(--color-text-muted);
}

.auth-form button {
margin-top: 0.25rem;
min-height: 2.1rem;
}

.app-shell > * {
animation: rise-in 220ms ease both;
}
@@ -23,18 +70,33 @@

.topbar {
position: sticky;
top: var(--space-2);
top: 0;
z-index: 50;
left: 0;
width: 100vw;
margin-left: calc(50% - 50vw);
margin-right: calc(50% - 50vw);
padding: 0;
border: 1px solid var(--color-border-strong);
border-radius: 0;
background: linear-gradient(180deg, rgba(28, 42, 63, 0.96) 0%, rgba(20, 30, 47, 0.96) 100%);
box-shadow: var(--shadow-soft);
backdrop-filter: blur(10px);
}

.topbar-inner {
width: min(1820px, 100% - 2rem);
margin: 0 auto;
display: grid;
grid-template-columns: minmax(260px, 1fr) auto;
gap: var(--space-3);
align-items: start;
padding: var(--space-3);
border: 1px solid var(--color-border-strong);
border-radius: var(--radius-lg);
background: linear-gradient(180deg, rgba(28, 42, 63, 0.96) 0%, rgba(20, 30, 47, 0.96) 100%);
box-shadow: var(--shadow-soft);
backdrop-filter: blur(10px);
}

.topbar-brand {
display: grid;
gap: 0;
}

.topbar h1 {
@@ -50,12 +112,39 @@
font-size: 0.85rem;
}

.topbar-auth-status {
display: inline-flex;
align-items: center;
gap: 0.35rem;
margin-top: 0.45rem;
color: var(--color-text-muted);
font-size: 0.76rem;
}

.topbar-user-icon {
width: 0.85rem;
height: 0.85rem;
}

.topbar-current-username {
color: var(--color-text);
font-family: var(--font-mono);
font-size: 0.76rem;
}

.topbar-controls {
display: grid;
gap: var(--space-2);
justify-items: end;
}

.topbar-primary-row {
display: flex;
align-items: center;
justify-content: flex-end;
gap: var(--space-2);
}

.topbar-nav-group,
.topbar-document-group,
.topbar-settings-group {
@@ -65,6 +154,21 @@
gap: var(--space-2);
}

.topbar-icon-action {
width: 2.05rem;
min-height: 2.05rem;
padding: 0;
display: inline-flex;
align-items: center;
justify-content: center;
border-radius: var(--radius-xs);
}

.topbar-signout-icon {
width: 0.92rem;
height: 0.92rem;
}

.topbar-document-group .upload-actions-inline {
display: flex;
gap: var(--space-2);
@@ -1244,6 +1348,12 @@ button:disabled {
}

.topbar {
width: 100%;
margin-left: 0;
margin-right: 0;
}

.topbar-inner {
grid-template-columns: 1fr;
}

@@ -1252,10 +1362,16 @@ button:disabled {
}

.topbar-nav-group,
.topbar-primary-row,
.topbar-document-group,
.topbar-settings-group {
justify-content: flex-start;
}

.topbar-primary-row {
justify-content: space-between;
width: 100%;
}
}

@media (max-width: 1040px) {
@@ -1340,12 +1456,14 @@ button:disabled {

@media (max-width: 560px) {
.topbar-nav-group,
.topbar-primary-row,
.topbar-document-group,
.topbar-settings-group {
width: 100%;
}

.topbar-nav-group button,
.topbar-primary-row button,
.topbar-document-group button,
.topbar-settings-group button {
flex: 1;
@@ -58,6 +58,31 @@ export interface SearchResponse {
items: DmsDocument[];
}

/**
 * Represents one authenticated user identity returned by backend auth endpoints.
 */
export interface AuthUser {
id: string;
username: string;
role: 'admin' | 'user';
}

/**
 * Represents active authentication session metadata.
 */
export interface AuthSessionInfo {
user: AuthUser;
expires_at: string;
}

/**
 * Represents login response payload with issued bearer token and session metadata.
 */
export interface AuthLoginResponse extends AuthSessionInfo {
access_token: string;
token_type: 'bearer';
}

/**
 * Represents distinct document type values available for filter controls.
 */
@@ -15,5 +15,6 @@
"noFallthroughCasesInSwitch": true,
"types": ["vite/client", "react", "react-dom"]
},
"include": ["src"]
"include": ["src"],
"exclude": ["src/**/*.test.ts", "src/**/*.test.tsx"]
}