Compare commits
2 Commits
4b34d6153c...41bbe87b4c
| Author | SHA1 | Date | |
|---|---|---|---|
| | 41bbe87b4c | | |
| | 6fba581865 | | |
15
CHANGELOG.md
@@ -3,18 +3,5 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

### Added
- Initialized `CHANGELOG.md` with Keep a Changelog structure for ongoing release-note tracking.

### Changed
- Refreshed `README.md` with current stack details, runtime services, setup commands, configuration notes, and manual validation guidance.

### Deprecated

### Removed

### Fixed

### Security
- Initial release
258
README.md
@@ -1,146 +1,90 @@
# LedgerDock

LedgerDock is a self-hosted document management system (DMS) for ingesting, processing, organizing, and searching files.
LedgerDock is a private document workspace you can run on your own computer or server.
It helps teams collect files, process text from documents, and find information quickly with search.

## Core Capabilities
## What LedgerDock Is For

- Drag and drop upload from anywhere in the UI
- File and folder upload with path preservation
- Asynchronous extraction and OCR for PDF, images, DOCX, XLSX, TXT, and ZIP
- Metadata and full-text search
- Routing suggestions based on previous decisions
- Original file download and extracted markdown export
- Upload files and folders from one place
- Keep documents organized and searchable
- Extract text from scans and images (OCR)
- Download originals or extracted text

## Technology Stack
## Before You Start

- Backend: FastAPI, SQLAlchemy, RQ worker (`backend/`)
- Frontend: React, Vite, TypeScript (`frontend/`)
- Infrastructure: PostgreSQL, Redis, Typesense (`docker-compose.yml`)
You need:

## Runtime Services
- Docker Desktop (Windows or macOS) or Docker Engine + Docker Compose (Linux)
- A terminal app
- The project folder on your machine
- Internet access the first time you build containers

The default `docker compose` stack includes:
## Install With Docker Compose

- `frontend` - React UI (`http://localhost:5173`)
- `api` - FastAPI backend (`http://localhost:8000`, docs at `/docs`)
- `worker` - background processing jobs
- `db` - PostgreSQL (internal service network)
- `redis` - queue backend (internal service network)
- `typesense` - search index (internal service network)
Follow these steps from the project folder (where `docker-compose.yml` is located).

## Requirements
1. Create your local settings file from the template.

- Docker Engine
- Docker Compose plugin
- Internet access for first-time image build
```bash
cp .env.example .env
```

## Quick Start

From repository root:
2. Open `.env` in a text editor and set your own passwords and keys.
3. Start LedgerDock.

```bash
docker compose up --build -d
```

Before first run, set required secrets and connection values in `.env` (or your shell):
4. Wait until startup is complete, then open the app:
- LedgerDock web app: `http://localhost:5173`
- Health check: `http://localhost:8000/api/v1/health`
5. Sign in with the admin username and password you set in `.env`.
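
Step 4 does not say how to tell when startup is complete. One option is to poll the health URL from step 4 until it answers. The sketch below is not part of LedgerDock: `wait_for` is a hypothetical helper, and the usage line in the comment assumes `curl` is installed.

```bash
# wait_for ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS run out.
# Hypothetical helper, not shipped with LedgerDock.
wait_for() {
  local attempts delay i
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0   # command succeeded, stop polling
    sleep "$delay"     # pause before the next attempt
    i=$((i + 1))
  done
  return 1             # every attempt failed
}

# Example (assumes curl): poll the health URL up to 30 times, 2 seconds apart.
#   wait_for 30 2 curl -fsS -o /dev/null http://localhost:8000/api/v1/health
```

A non-zero exit status means the service never answered within the allowed attempts; `docker compose logs -f api worker` is the next place to look.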

- `POSTGRES_USER`
- `POSTGRES_PASSWORD`
- `POSTGRES_DB`
- `DATABASE_URL`
- `REDIS_PASSWORD`
- `REDIS_URL`
- `AUTH_BOOTSTRAP_ADMIN_USERNAME`
- `AUTH_BOOTSTRAP_ADMIN_PASSWORD`
- optional `AUTH_BOOTSTRAP_USER_USERNAME`
- optional `AUTH_BOOTSTRAP_USER_PASSWORD`
- `APP_SETTINGS_ENCRYPTION_KEY`
- `TYPESENSE_API_KEY`
## `.env` Settings Explained In Plain Language

Start from `.env.example` to avoid missing required variables.
LedgerDock reads settings from `.env`. Some values are required and some are optional.

Open:
### Required: Change These Before First Use

- Frontend: `http://localhost:5173`
- API docs: `http://localhost:8000/docs`
- Health: `http://localhost:8000/api/v1/health`
- `POSTGRES_PASSWORD`: Password for the internal database.
- `REDIS_PASSWORD`: Password for the internal queue service.
- `AUTH_BOOTSTRAP_ADMIN_PASSWORD`: First admin login password.
- `APP_SETTINGS_ENCRYPTION_KEY`: Secret used to protect saved app settings.
- `TYPESENSE_API_KEY`: Secret key for the search engine.

Use bootstrap credentials (`AUTH_BOOTSTRAP_ADMIN_USERNAME` and `AUTH_BOOTSTRAP_ADMIN_PASSWORD`) to sign in from the frontend login screen.
Use long, unique values for each one. Do not reuse personal passwords.
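
One way to produce long, unique values is to read random bytes from the operating system. This is a minimal sketch, assuming a Unix-like shell with `head`, `od`, and `tr`. `gen_secret` is an illustrative name, not a LedgerDock command. Whether a setting such as `APP_SETTINGS_ENCRYPTION_KEY` requires a specific format is not stated here, so check `.env.example` first.

```bash
# gen_secret [BYTES]: print BYTES random bytes (default 32) as lowercase hex.
# Illustrative helper, not part of LedgerDock.
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -tx1 | tr -dc '0-9a-f'
  echo
}

# Example: print one fresh value per secret to paste into .env.
#   for name in POSTGRES_PASSWORD REDIS_PASSWORD TYPESENSE_API_KEY; do
#     echo "$name=$(gen_secret)"
#   done
```

Hex output contains only characters that are safe inside a URL, which helps when the same password is also embedded in a connection string.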

Stop the stack:
### Required: Usually Keep Defaults Unless You Know You Need Changes

```bash
docker compose down
```
- `POSTGRES_USER`: Database username.
- `POSTGRES_DB`: Database name.
- `DATABASE_URL`: Connection string to the database service.
- `REDIS_URL`: Connection string to the Redis service.
- `AUTH_BOOTSTRAP_ADMIN_USERNAME`: First admin username (default `admin`).

## Security Must-Know Before Real User Deployment
If you change passwords, make sure matching URLs use the same new password.
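
A quick consistency check can catch a password that was changed in one place but not in its URL. The sketch below assumes plain `KEY=value` lines in `.env` without quotes, and assumes the password appears literally in the URL. A password containing reserved URL characters would be percent-encoded in the URL and would fail this naive comparison. Both helper names are hypothetical.

```bash
# env_value FILE KEY: print the value of the last KEY=value line in FILE.
env_value() {
  grep "^$2=" "$1" | tail -n 1 | cut -d= -f2-
}

# url_has_password FILE PASS_KEY URL_KEY: succeed if the URL value contains
# the password value as a literal substring.
url_has_password() {
  local pass url
  pass=$(env_value "$1" "$2")
  url=$(env_value "$1" "$3")
  [ -n "$pass" ] || return 1   # a missing or empty password never matches
  case "$url" in
    *"$pass"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   url_has_password .env POSTGRES_PASSWORD DATABASE_URL || echo "DATABASE_URL is out of sync" >&2
#   url_has_password .env REDIS_PASSWORD REDIS_URL || echo "REDIS_URL is out of sync" >&2
```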

The items below port the `MUST KNOW User-Dependent Risks` from `REPORT.md` into explicit operator actions.
### Optional User Account (Can Be Left Empty)

### High: Development-first defaults can be promoted to production
- `AUTH_BOOTSTRAP_USER_USERNAME`
- `AUTH_BOOTSTRAP_USER_PASSWORD`

Avoid:
- Set `APP_ENV=production`.
- Set `PROVIDER_BASE_URL_ALLOW_HTTP=false`.
- Set `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=false`.
- Set a strict non-empty `PROVIDER_BASE_URL_ALLOWLIST` for approved provider hosts only.
- Set `PUBLIC_BASE_URL` to HTTPS.
- Restrict `CORS_ORIGINS` to exact production frontend origins.
- Use `REDIS_URL` with `rediss://`.
- Set `REDIS_SECURITY_MODE=strict`.
- Set `REDIS_TLS_MODE=required`.
- Keep `HOST_BIND_IP=127.0.0.1` and expose services only through an HTTPS reverse proxy.
These create an extra non-admin account on first startup.

Remedy:
- Immediately correct the values above and redeploy `api` and `worker` (`docker compose up -d api worker`).
- Rotate `AUTH_BOOTSTRAP_*` credentials, provider API keys, and Redis credentials if insecure values were used in a reachable environment.
- Re-check `.env.example` and `docker-compose.yml` before each production promotion.
### Network and Access Settings

### Medium: Login throttle IP identity depends on proxy trust model
- `HOST_BIND_IP`: Where services listen. Keep `127.0.0.1` for local-only access.
- `PUBLIC_BASE_URL`: Backend base URL. Local default is `http://localhost:8000`.
- `CORS_ORIGINS`: Allowed frontend origins. Keep local defaults for single-machine use.
- `VITE_API_BASE`: Frontend API URL override. Leave empty unless you know you need it.

Current behavior:
- Login throttle identity currently uses `request.client.host` directly.
### Environment Mode

Avoid:
- Deploy so the backend receives true client IP addresses and does not collapse all traffic to one proxy source IP.
- Validate lockout behavior with multiple client IPs before going live behind a proxy.
- `APP_ENV=development`: Local mode (default).
- `APP_ENV=production`: Use when running as a real shared deployment with HTTPS and tighter security settings.

Remedy:
- If lockouts affect many users at once, temporarily increase `AUTH_LOGIN_FAILURE_LIMIT` and tune lockout timings to reduce impact while mitigation is in progress.
- Update network and proxy topology so client IP identity is preserved for the backend, then re-run lockout validation tests.

### Medium: API documentation endpoints are exposed by default

Avoid:
- Block public access to `/docs`, `/redoc`, and `/openapi.json` at the reverse proxy or edge firewall.
- Keep docs endpoints reachable only from trusted internal/admin networks.

Remedy:
- Add deny rules for those paths immediately and reload the proxy.
- Verify those routes return `403` or `404` from untrusted networks.

### Medium: Auth session tokens are cookie-based

Avoid:
- Keep dependencies patched to reduce known XSS vectors.
- Keep frontend dependencies locked and scanned for known payload paths.
- Treat any suspected script injection as a session risk and rotate bootstrap credentials immediately.

Remedy:
- If script injection is suspected, revoke active sessions, rotate bootstrap credentials, and redeploy frontend fixes before restoring access.
- Treat exposed sessions as compromised until revocation and credential rotation are complete.
- Cookies are HttpOnly and cannot be read by JavaScript, but session scope still ends on server-side revocation and expiry controls.

### Low: Typesense transport defaults to HTTP on internal network

Avoid:
- Keep Typesense on isolated internal networks only.
- Do not expose Typesense service ports directly to untrusted networks.

Remedy:
- For cross-host or untrusted network paths, terminate TLS in front of Typesense (or use equivalent secure service networking) and require encrypted transport for all clients.

## Common Operations
## Daily Use Commands

Start or rebuild:

@@ -154,97 +98,41 @@ Stop:
docker compose down
```

Tail logs:
View logs:

```bash
docker compose logs -f
```

Tail API and worker logs only:
View backend logs only:

```bash
docker compose logs -f api worker
```

Reset all runtime data (destructive):
## Where Your Data Is Stored

LedgerDock stores data in Docker volumes so it survives container restarts:

- `db-data` for PostgreSQL data
- `redis-data` for Redis data
- `dcm-storage` for uploaded files and app storage
- `typesense-data` for the search index

To remove everything, including data:

```bash
docker compose down -v
```

## Frontend-Only Local Workflow
Warning: this permanently deletes your LedgerDock data on this machine.

If backend services are already running, you can run frontend tooling locally:
## First Checks After Install

```bash
cd frontend && npm run dev
cd frontend && npm run build
cd frontend && npm run preview
```
- Open `http://localhost:5173` and confirm the login page appears.
- Open `http://localhost:8000/api/v1/health` and confirm you get `{"status":"ok"}`.
- Upload one sample file and confirm it appears in search.
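
The health check in the list above can also be scripted. This is a minimal sketch: `health_ok` is a hypothetical helper that only inspects a response body, and the usage line in the comment assumes `curl` is installed and the stack is running on the default port.

```bash
# health_ok BODY: succeed if BODY contains "status":"ok".
# This is a whitespace-insensitive string match, not a full JSON parse.
health_ok() {
  printf '%s' "$1" | tr -d ' \n\t' | grep -q '"status":"ok"'
}

# Example (assumes curl and a running stack):
#   health_ok "$(curl -fsS http://localhost:8000/api/v1/health)" && echo "API healthy"
```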

`npm run preview` serves the built app on port `4173`.
## Need Technical Documentation?

## Configuration

Main runtime variables are defined in `docker-compose.yml`:

- API and worker: `DATABASE_URL`, `REDIS_URL`, `REDIS_SECURITY_MODE`, `REDIS_TLS_MODE`, `STORAGE_ROOT`, `PUBLIC_BASE_URL`, `CORS_ORIGINS`, `AUTH_BOOTSTRAP_*`, `PROCESSING_LOG_STORE_*`, `CONTENT_EXPORT_*`, `TYPESENSE_*`, `APP_SETTINGS_ENCRYPTION_KEY`
- Frontend: optional `VITE_API_BASE`

When `VITE_API_BASE` is unset, the frontend uses `http://<current-hostname>:8000/api/v1`.

Application settings saved from the UI persist at:

- `<STORAGE_ROOT>/settings.json` (inside the storage volume)

Provider API keys are persisted encrypted at rest (`api_key_encrypted`) and are no longer written as plaintext values.

Settings endpoints:

- `GET/PATCH /api/v1/settings`
- `POST /api/v1/settings/reset`
- `PATCH /api/v1/settings/handwriting`
- `POST /api/v1/processing/logs/trim` (admin only)

Auth endpoints:

- `POST /api/v1/auth/login`
- `GET /api/v1/auth/me`
- `POST /api/v1/auth/logout`

Detailed DEV and LIVE environment guidance, including HTTPS reverse-proxy deployment values, is documented in `doc/operations-and-configuration.md` and `.env.example`.

## Data Persistence

Docker named volumes used by the stack:

- `db-data`
- `redis-data`
- `dcm-storage`
- `typesense-data`

## Validation Checklist

After setup or config changes, verify:

- `GET /api/v1/health` returns `{"status":"ok"}`
- Upload and processing complete successfully
- Search returns expected results
- Preview and download work for uploaded documents
- `docker compose logs -f api worker` has no failures

## Repository Layout

- `backend/` - FastAPI API, services, models, worker
- `frontend/` - React application
- `doc/` - technical documentation for architecture, API, data model, and operations
- `docker-compose.yml` - local runtime topology

## Documentation Index

- `doc/README.md` - technical documentation entrypoint
- `doc/architecture-overview.md` - service and runtime architecture
- `doc/api-contract.md` - endpoint and payload contract
- `doc/data-model-reference.md` - persistence model reference
- `doc/operations-and-configuration.md` - runtime operations and configuration
- `doc/frontend-design-foundation.md` - frontend design rules
Developer and operator docs are in `doc/`, starting at `doc/README.md`.