Architecture Overview
System Topology
DMS runs as a multi-service application defined in docker-compose.yml:
- frontend serves the React UI on port 5173
- api serves FastAPI on port 8000
- worker executes asynchronous extraction and indexing jobs
- db provides PostgreSQL persistence on the internal compose network
- redis backs queueing on the internal compose network
- typesense stores search index and vector-adjacent metadata on the internal compose network
Backend Architecture
Backend source root: backend/app/
Main boundaries:
- api/: route handlers and HTTP contract
- services/: domain logic (authentication, storage, extraction, routing, settings, processing logs, Typesense)
- db/: SQLAlchemy base, engine, and session lifecycle
- models/: persistence entities (AppUser, AuthSession, Document, ProcessingLogEntry)
- schemas/: Pydantic response and request schemas
- worker/: RQ queue integration and background processing tasks
Application bootstrap in backend/app/main.py:
- mounts routers under /api/v1
- configures CORS from settings
- initializes storage, database schema, bootstrap users, settings, and Typesense collection on startup
Processing Lifecycle
- Upload starts at POST /api/v1/documents/upload.
- API stores file bytes and inserts document rows with status queued.
- API enqueues app.worker.tasks.process_document_task into Redis.
- Worker extracts content and metadata, handles ZIP expansion, runs OCR and routing suggestions, and writes processing logs.
- Worker updates database fields, document status, and search index entries.
- UI polls for documents and processing logs to reflect progress.
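The lifecycle above can be illustrated with a minimal in-memory sketch. It is not the DMS implementation: the real system uses RQ on Redis and PostgreSQL, whereas this stand-in uses a deque and a dict. The class and method names, and every status value other than queued, are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: int
    filename: str
    status: str = "queued"  # the only status value named in this document
    logs: list[str] = field(default_factory=list)


class InMemoryPipeline:
    """Hypothetical stand-in for the API, the Redis queue, and the worker."""

    def __init__(self) -> None:
        self.documents: dict[int, Document] = {}
        self.queue: deque[int] = deque()  # plays the role of the Redis-backed queue

    def upload(self, filename: str) -> Document:
        # API side: insert a row with status "queued", then enqueue a job.
        doc = Document(doc_id=len(self.documents) + 1, filename=filename)
        self.documents[doc.doc_id] = doc
        self.queue.append(doc.doc_id)
        return doc

    def work_once(self) -> bool:
        # Worker side: take one job, record log lines, and update the status.
        if not self.queue:
            return False
        doc = self.documents[self.queue.popleft()]
        doc.status = "processing"  # assumed intermediate status
        doc.logs.append(f"extracting {doc.filename}")
        doc.status = "processed"   # assumed terminal status
        doc.logs.append("indexed")
        return True


pipeline = InMemoryPipeline()
doc = pipeline.upload("invoice.pdf")
status_before = doc.status
pipeline.work_once()
status_after = pipeline.documents[doc.doc_id].status
```

A polling client would repeatedly read the document status and logs until the status stops changing, which is the role the UI plays in the last step.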
Frontend Architecture
Frontend source root: frontend/src/
Core structure:
- App.tsx orchestrates screen switching, state, polling, and action flows
- components/ contains upload, filter, grid, viewer, modal, settings, and log panel modules
- lib/api.ts centralizes API client calls
- types.ts defines typed API contracts used by components
- design-foundation.css and styles.css define design tokens and global/component styling
Main user flows:
- Login and role-gated navigation (admin and user)
- Upload and conflict resolution
- Search and filtered document browsing
- Metadata editing and lifecycle actions (trash, restore, delete, reprocess)
- Settings management for providers, tasks, and UI defaults (admin only)
- Processing log review (admin only)
Persistence and State
Persistent data:
- PostgreSQL stores document metadata and processing logs
- Docker volume-backed storage keeps original files, previews, and settings JSON
- Typesense stores indexed search representations
Transient runtime state:
- Redis holds queued processing tasks and worker execution state
- frontend local component state drives active filters, selection, and modal flows
Security-sensitive runtime behavior:
- API access is session-based with per-user server-issued bearer tokens and role checks.
- Document and search reads for user role are owner-scoped via owner_user_id; admin can access global scope.
- Redis connection URLs are validated by backend queue helpers with environment-aware auth and TLS policy enforcement.
- Worker startup runs through python -m app.worker.run_worker, which validates Redis URL policy before queue consumption.
- Inline preview is limited to safe MIME types and script-capable content is served as attachment-only.
- Archive fan-out processing propagates root and depth lineage metadata and enforces depth and per-root descendant caps.
- Markdown export applies per-user rate limits, hard document-count and total-byte caps, and spool-file streaming.
- Processing logs default to metadata-only persistence, with explicit operator toggles required to store model IO text.
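As an illustration of the environment-aware Redis URL policy mentioned above, the following is a minimal sketch using only the standard library. The function name, the environment label "production", and the specific rules (TLS and a password required in production, relaxed elsewhere) are assumptions; the actual backend queue helpers may enforce a different policy.

```python
from urllib.parse import urlparse


def validate_redis_url(url: str, environment: str) -> str:
    """Return the URL if it satisfies the assumed policy, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):
        raise ValueError("Redis URL must use the redis:// or rediss:// scheme")
    if not parsed.hostname:
        raise ValueError("Redis URL must include a host")

    # Assumed policy: production requires TLS (rediss://) and a password;
    # other environments accept plain, unauthenticated connections.
    if environment == "production":
        if parsed.scheme != "rediss":
            raise ValueError("production requires TLS (rediss://)")
        if not parsed.password:
            raise ValueError("production requires a password in the URL")
    return url
```

Calling such a check once at worker startup, before any queue consumption, means a misconfigured URL fails fast instead of surfacing later as a connection error.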