# DMS

DMS is a single-user document management system with:
- drag-and-drop upload anywhere in the UI
- file and folder upload
- document processing and indexing (PDF, text, OpenAI handwriting/image transcription, DOCX, XLSX, ZIP extraction)
- fallback handling for unsupported formats
- original file preservation and download
- metadata-based and full-text search
- learning-based routing suggestions

## Requirements

- Docker Engine with Docker Compose plugin
- Internet access for the first image build

## Install And Run With Docker Compose

1. Open a terminal in this repository root.
2. Start the full stack:

```bash
docker compose up --build -d
```

3. Open the applications:
   - Frontend: `http://localhost:5173`
   - Backend API docs: `http://localhost:8000/docs`
   - Health check: `http://localhost:8000/api/v1/health`
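
After starting the stack, the containers may need a moment before the backend answers. A minimal sketch of waiting for the health check URL listed above, assuming `curl` is installed and that the endpoint returns a successful HTTP status once the backend is ready; the function name, retry count, and delay are illustrative, not part of DMS:

```bash
# Hypothetical helper: poll the health endpoint until it responds
# successfully or the attempts run out.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_health() {
  local url="${1:-http://localhost:8000/api/v1/health}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

Calling `wait_for_health` with no arguments polls the default URL for up to about a minute; a non-zero exit status suggests checking `docker compose logs -f`.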

## Setup

1. Open the frontend and upload files or folders.
2. Set a default destination path and tags before uploading, if needed.
3. Configure handwriting transcription settings in the UI:
   - OpenAI-compatible base URL
   - model (default: `gpt-4.1-mini`)
   - API key and timeout
4. Open a document in the details panel, adjust its destination and tags, then save.
5. Keep `Learn from this routing decision` enabled to train future routing suggestions.

## Data Persistence On Host

All runtime data is stored on the host using bind mounts.

Default host locations:
- `./data/postgres`
- `./data/redis`
- `./data/storage`

To persist under another host directory, set `DCM_DATA_DIR`:

```bash
DCM_DATA_DIR=/data docker compose up --build -d
```

This will place runtime data under `/data` on the host.
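
To check where data is expected to land before starting the stack, the lookup can be sketched as a small shell function. This assumes the `postgres`, `redis`, and `storage` subdirectory names stay the same under a custom `DCM_DATA_DIR`, which is only confirmed above for the default `./data`; the function name is hypothetical:

```bash
# Sketch: print the host directories the stack is expected to use.
# Assumption: the subdirectory names are identical under a custom
# DCM_DATA_DIR; only the ./data defaults are documented.
dms_data_dirs() {
  local base="${DCM_DATA_DIR:-./data}"
  local name
  for name in postgres redis storage; do
    # ${base%/} drops one trailing slash so paths do not contain "//".
    printf '%s/%s\n' "${base%/}" "$name"
  done
}
```

Running `dms_data_dirs` with no variable set prints the three default paths; prefixing it with `DCM_DATA_DIR=/data` prints the same names under `/data`.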

## Common Commands

Start:

```bash
docker compose up --build -d
```

Stop:

```bash
docker compose down
```

View logs:

```bash
docker compose logs -f
```

Rebuild a clean stack while keeping persisted data:

```bash
docker compose down
docker compose up --build -d
```

Reset all persisted runtime data (default `./data` location):

```bash
docker compose down
rm -rf ./data
```

## Handwriting Transcription Notes

- Handwriting and image transcription uses an OpenAI-compatible vision endpoint.
- Before transcription, images are normalized:
  - EXIF rotation is corrected
  - long side is resized to a maximum of 2000px
  - image is sent as a base64 data URL payload
- Handwriting provider settings are persisted in host storage and survive container restarts.
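
Two of the normalization steps above, the resize arithmetic and the data URL encoding, can be illustrated in shell. This is a sketch under stated assumptions, not the backend's implementation: the function names are hypothetical, images smaller than the limit are assumed to be left unscaled, the rounding rule is a guess, and the MIME type is a placeholder because the format actually sent is not documented here. EXIF rotation is not covered.

```bash
# Print "WIDTH HEIGHT" after capping the long side at a maximum
# (default 2000), keeping the aspect ratio with integer rounding.
# Assumption: images already within the limit are left unchanged.
scaled_size() {
  local w="$1" h="$2" max="${3:-2000}"
  local long="$w"
  if [ "$h" -gt "$long" ]; then
    long="$h"
  fi
  if [ "$long" -le "$max" ]; then
    echo "$w $h"
    return 0
  fi
  # Round to the nearest integer: (x * max + long / 2) / long
  echo "$(( (w * max + long / 2) / long )) $(( (h * max + long / 2) / long ))"
}

# Print a base64 data URL for a file.
# Assumption: the MIME type default is illustrative only.
image_data_url() {
  local file="$1" mime="${2:-image/jpeg}"
  printf 'data:%s;base64,' "$mime"
  # Strip newlines because some base64 implementations wrap their output.
  base64 < "$file" | tr -d '\n'
}
```

For example, `scaled_size 4000 3000` yields a 2000 by 1500 target, while `scaled_size 800 600` leaves the dimensions as they are.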

## API Overview

GET endpoints:
- `GET /api/v1/health`
- `GET /api/v1/documents`
- `GET /api/v1/documents/{document_id}`
- `GET /api/v1/documents/{document_id}/preview`
- `GET /api/v1/documents/{document_id}/download`
- `GET /api/v1/documents/tags`
- `GET /api/v1/search?query=...`
- `GET /api/v1/settings`

Additional endpoints used by the UI:
- `POST /api/v1/documents/upload`
- `PATCH /api/v1/documents/{document_id}`
- `POST /api/v1/documents/{document_id}/reprocess`
- `PATCH /api/v1/settings/handwriting`
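
The GET endpoints can be exercised from a terminal with `curl`. The wrappers below are a hedged sketch: the function names and the `DMS_API_BASE` variable are illustrative and not something DMS reads, and response formats are not described here, so the output is passed through unchanged. Request bodies for the `POST` and `PATCH` endpoints are not documented above and are left out.

```bash
# Build a full URL for a path below /api/v1/.
# DMS_API_BASE is a hypothetical override for the documented default port.
dms_api_url() {
  printf '%s/api/v1/%s\n' "${DMS_API_BASE:-http://localhost:8000}" "$1"
}

# Fetch one of the documented GET endpoints, e.g. "health" or "documents/tags".
dms_get() {
  # -f fails on HTTP errors, -sS stays quiet but still reports failures.
  curl -fsS "$(dms_api_url "$1")"
}

# Run a search using the documented "query" parameter.
dms_search() {
  # -G with --data-urlencode appends a URL-encoded ?query=... to the URL.
  curl -fsS -G --data-urlencode "query=$1" "$(dms_api_url search)"
}
```

With the stack running, `dms_get health` and `dms_search "invoice 2024"` are typical calls.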