ledgerdock/README.md
2026-02-21 09:44:18 -03:00


# DMS
DMS is a single-user document management system with:
- drag-and-drop upload anywhere in the UI
- file and folder upload
- document processing and indexing (PDF, text, OpenAI handwriting/image transcription, DOCX, XLSX, ZIP extraction)
- fallback handling for unsupported formats
- original file preservation and download
- metadata-based and full-text search
- learning-based routing suggestions
## Requirements
- Docker Engine with Docker Compose plugin
- Internet access for the first image build
## Install And Run With Docker Compose
1. Open a terminal in this repository root.
2. Start the full stack:
```bash
docker compose up --build -d
```
3. Open the applications:
   - Frontend: `http://localhost:5173`
   - Backend API docs: `http://localhost:8000/docs`
   - Health check: `http://localhost:8000/api/v1/health`
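
The containers may need a moment before the backend answers. As a minimal sketch, the health check URL above can be polled until it responds. The function name, retry count, and delay are illustrative assumptions, not part of the project:

```bash
# Poll the health endpoint until it returns a successful HTTP status.
# wait_for_health and its defaults (30 tries, 2 s apart) are hypothetical.
wait_for_health() {
  local url="${1:-http://localhost:8000/api/v1/health}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx).
    if curl --silent --fail --output /dev/null "$url"; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after ${attempts} attempts" >&2
  return 1
}
```

Running `wait_for_health` right after `docker compose up --build -d` blocks until the backend is reachable or the retries run out.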
## Setup
1. Open the frontend and upload files or folders.
2. Set a default destination path and tags before uploading, if needed.
3. Configure handwriting transcription settings in the UI:
   - OpenAI-compatible base URL
   - model (default: `gpt-4.1-mini`)
   - API key and timeout
4. Open a document in the details panel, adjust destination and tags, then save.
5. Keep `Learn from this routing decision` enabled to train future routing suggestions.
## Data Persistence On Host
All runtime data is stored on the host using bind mounts.
Default host location:
- `./data/postgres`
- `./data/redis`
- `./data/storage`

To persist under another host directory, set `DCM_DATA_DIR`:
```bash
DCM_DATA_DIR=/data docker compose up --build -d
```
This places runtime data under `/data` on the host.
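
To see which host directories are in play, a small helper can print them. This is a sketch that assumes a custom `DCM_DATA_DIR` keeps the same three subdirectories as the default layout; `data_dirs` is a hypothetical name:

```bash
# Print the host directories used for runtime data.
# Assumption: postgres/redis/storage sit directly under DCM_DATA_DIR,
# mirroring the default ./data layout.
data_dirs() {
  local base="${DCM_DATA_DIR:-./data}"
  local name
  for name in postgres redis storage; do
    # ${base%/} drops one trailing slash so paths do not contain "//".
    printf '%s/%s\n' "${base%/}" "$name"
  done
}
```

For example, `data_dirs | xargs du -sh` reports how much disk space each directory uses.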
## Common Commands
Start:
```bash
docker compose up --build -d
```
Stop:
```bash
docker compose down
```
View logs:
```bash
docker compose logs -f
```
Rebuild a clean stack while keeping persisted data:
```bash
docker compose down
docker compose up --build -d
```
Reset all persisted runtime data (default `./data` location):
```bash
docker compose down
rm -rf ./data
```
## Handwriting Transcription Notes
- Handwriting and image transcription uses an OpenAI-compatible vision endpoint.
- Before transcription, images are normalized:
  - EXIF rotation is corrected
  - long side is resized to a maximum of 2000px
  - image is sent as a base64 data URL payload
- Handwriting provider settings are persisted in host storage and survive container restarts.
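
The resize and data URL steps above can be illustrated with a little shell arithmetic. This is only a sketch of the idea, not the service's code: both function names are hypothetical, rounding down is an assumption, and so is leaving smaller images untouched.

```bash
# Scale WIDTH x HEIGHT so the longer side is at most MAX (default 2000),
# keeping the aspect ratio. Assumptions: integer division rounds down and
# images already within the limit are not upscaled.
fit_long_side() {
  local w="$1" h="$2" max="${3:-2000}"
  if [ "$w" -le "$max" ] && [ "$h" -le "$max" ]; then
    echo "$w $h"
  elif [ "$w" -ge "$h" ]; then
    echo "$max $(( h * max / w ))"
  else
    echo "$(( w * max / h )) $max"
  fi
}

# Wrap a file's bytes in a base64 data URL.
# The default MIME type image/jpeg is an assumption for illustration.
to_data_url() {
  local file="$1" mime="${2:-image/jpeg}"
  # tr removes the line wrapping that some base64 tools add.
  printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$file" | tr -d '\n')"
}
```

With these definitions, `fit_long_side 4000 3000` prints `2000 1500`, and `to_data_url scan.jpg` prints a string starting with `data:image/jpeg;base64,`.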
## API Overview
GET endpoints:
- `GET /api/v1/health`
- `GET /api/v1/documents`
- `GET /api/v1/documents/{document_id}`
- `GET /api/v1/documents/{document_id}/preview`
- `GET /api/v1/documents/{document_id}/download`
- `GET /api/v1/documents/tags`
- `GET /api/v1/search?query=...`
- `GET /api/v1/settings`

Additional endpoints used by the UI:
- `POST /api/v1/documents/upload`
- `PATCH /api/v1/documents/{document_id}`
- `POST /api/v1/documents/{document_id}/reprocess`
- `PATCH /api/v1/settings/handwriting`
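
The GET endpoints listed above can be tried from a terminal with `curl`. The wrappers below are a hedged sketch: the function names and the `API_BASE` variable are hypothetical, and the response format is not described here, so the output is passed through unparsed. Request bodies for the `POST` and `PATCH` endpoints are not documented in this README and are therefore left out.

```bash
# Base URL of the backend API; override API_BASE for a non-default host.
API_BASE="${API_BASE:-http://localhost:8000/api/v1}"

# GET a path under the API base. Extra arguments are passed through to curl.
api_get() {
  local path="$1"
  shift
  curl --silent --show-error --fail "$@" "${API_BASE}${path}"
}

# Full-text search. -G with --data-urlencode appends a URL-encoded
# query string, so spaces and special characters are safe.
search_documents() {
  api_get "/search" -G --data-urlencode "query=$1"
}

# Fetch a single document record by ID.
get_document() {
  api_get "/documents/$1"
}

# Download the preserved original file to a local path.
download_document() {
  api_get "/documents/$1/download" --output "$2"
}
```

For example, `search_documents "tax 2024"` queries `/api/v1/search`, and `download_document <document_id> ./original.bin` saves the original file locally.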