Initial commit

2026-02-21 09:44:18 -03:00
commit 5dfc2cbd85
65 changed files with 11989 additions and 0 deletions
@@ -0,0 +1,30 @@
+# Python
+__pycache__/
+*.py[cod]
+*.pyo
+.mypy_cache/
+.pytest_cache/
+
+# Node / JS
+node_modules/
+frontend/node_modules/
+frontend/dist/
+
+# Build output (optional)
+dist/
+build/
+
+# Environment
+.env
+*.env
+!.env.example
+
+# Data and generated artifacts (runtime only)
+data/postgres/
+data/redis/
+data/storage/
+
+# OS / IDE
+.DS_Store
+.vscode/
+.idea/
@@ -0,0 +1,44 @@
+# Repository Guidelines
+
+## Project Structure & Module Organization
+`backend/` contains FastAPI and worker code. Main logic is in `backend/app/`: routes in `api/`, business logic in `services/`, persistence in `db/` and `models/`, and jobs in `worker/`.  
+`frontend/` is a Vite + React + TypeScript app with UI code in `frontend/src/` (`components/`, `lib/api.ts`, `types.ts`, styles).  
+`doc/` stores project documentation, with `doc/README.md` as the entrypoint.  
+`docker-compose.yml` defines API, worker, frontend, Postgres, Redis, and Typesense. Runtime data is persisted in `./data/`.
+
+## Build, Test, and Development Commands
+Docker workflow (required):
+- `multipass shell dcm-dev` - enter the Linux VM used for Docker.
+- `cd ~/dcm` - move to the project directory inside the VM.
+- `docker compose up --build -d` - build and start all services.
+- `docker compose down` - stop and remove containers.
+- `docker compose logs -f` - stream logs across services.
+- `DCM_DATA_DIR=/data docker compose up --build -d` - override host data directory.
+- `exit` - leave the VM after Docker operations.
+
+Frontend-only workflow:
+- `cd frontend && npm run dev` - start Vite dev server.
+- `cd frontend && npm run build` - run TypeScript checks and build production assets.
+- `cd frontend && npm run preview` - serve built frontend locally.
+
+## Coding Style & Naming Conventions
+Follow existing module boundaries; keep files focused.  
+Python style in `backend/`: 4-space indentation, type hints, and docstrings for modules and functions; use `snake_case` for functions/modules and `PascalCase` for classes.  
+TypeScript style in `frontend/`: strict compiler settings (`strict`, `noFallthroughCasesInSwitch`, ES2022 target). Use `PascalCase` for React components (`DocumentCard.tsx`) and `camelCase` for variables/functions.
+
+## Testing Guidelines
+This repository currently has no committed automated test suite. For each change, run the stack and validate impacted API/UI flows manually. Verify:
+- API health at `GET /api/v1/health`
+- document upload and search behavior in the frontend
+- service logs for processing failures (`docker compose logs -f` inside the VM)
+
+When introducing automated tests, place them near the relevant module and document execution commands in `README.md`.
+Use `test_*.py` naming for backend tests and `*.test.tsx` for frontend tests.
+
+## Commit & Pull Request Guidelines
+Git history is not available in this workspace snapshot, so no local message pattern can be inferred. Use concise, imperative commit subjects scoped to one change.  
+PRs should include:
+- what changed and why
+- linked issue/task ID
+- manual verification steps and outcomes
+- screenshots or short recordings for UI changes
@@ -0,0 +1,118 @@
+# DMS
+
+DMS is a single-user document management system with:
+- drag-and-drop upload anywhere in the UI
+- file and folder upload
+- document processing and indexing (PDF, text, OpenAI handwriting/image transcription, DOCX, XLSX, ZIP extraction)
+- fallback handling for unsupported formats
+- original file preservation and download
+- metadata-based and full-text search
+- learning-based routing suggestions
+
+## Requirements
+
+- Docker Engine with Docker Compose plugin
+- Internet access for the first image build
+
+## Install And Run With Docker Compose
+
+1. Open a terminal in this repository root.
+2. Start the full stack:
+
+```bash
+docker compose up --build -d
+```
+
+3. Open the applications:
+- Frontend: `http://localhost:5173`
+- Backend API docs: `http://localhost:8000/docs`
+- Health check: `http://localhost:8000/api/v1/health`
+
+## Setup
+
+1. Open the frontend and upload files or folders.
+2. Set default destination path and tags before upload if needed.
+3. Configure handwriting transcription settings in the UI:
+- OpenAI-compatible base URL
+- model (default: `gpt-4.1-mini`)
+- API key and timeout
+4. Open a document in the details panel, adjust destination and tags, then save.
+5. Keep `Learn from this routing decision` enabled to train future routing suggestions.
+
+## Data Persistence On Host
+
+All runtime data is stored on the host using bind mounts.
+
+Default host location:
+- `./data/postgres`
+- `./data/redis`
+- `./data/storage`
+
+To persist under another host directory, set `DCM_DATA_DIR`:
+
+```bash
+DCM_DATA_DIR=/data docker compose up --build -d
+```
+
+This will place runtime data under `/data` on the host.
+
+## Common Commands
+
+Start:
+
+```bash
+docker compose up --build -d
+```
+
+Stop:
+
+```bash
+docker compose down
+```
+
+View logs:
+
+```bash
+docker compose logs -f
+```
+
+Rebuild a clean stack while keeping persisted data:
+
+```bash
+docker compose down
+docker compose up --build -d
+```
+
+Reset all persisted runtime data:
+
+```bash
+docker compose down
+rm -rf ./data
+```
+
+## Handwriting Transcription Notes
+
+- Handwriting and image transcription uses an OpenAI-compatible vision endpoint.
+- Before transcription, images are normalized:
+  - EXIF rotation is corrected
+  - long side is resized to a maximum of 2000px
+  - image is sent as a base64 data URL payload
+- Handwriting provider settings are persisted in host storage and survive container restarts.
+
+## API Overview
+
+GET endpoints:
+- `GET /api/v1/health`
+- `GET /api/v1/documents`
+- `GET /api/v1/documents/{document_id}`
+- `GET /api/v1/documents/{document_id}/preview`
+- `GET /api/v1/documents/{document_id}/download`
+- `GET /api/v1/documents/tags`
+- `GET /api/v1/search?query=...`
+- `GET /api/v1/settings`
+
+Additional endpoints used by the UI:
+- `POST /api/v1/documents/upload`
+- `PATCH /api/v1/documents/{document_id}`
+- `POST /api/v1/documents/{document_id}/reprocess`
+- `PATCH /api/v1/settings/handwriting`
@@ -0,0 +1,17 @@
+APP_ENV=development
+DATABASE_URL=postgresql+psycopg://dcm:dcm@db:5432/dcm
+REDIS_URL=redis://redis:6379/0
+STORAGE_ROOT=/data/storage
+DEFAULT_OPENAI_BASE_URL=https://api.openai.com/v1
+DEFAULT_OPENAI_MODEL=gpt-4.1-mini
+DEFAULT_OPENAI_TIMEOUT_SECONDS=45
+DEFAULT_OPENAI_HANDWRITING_ENABLED=true
+DEFAULT_OPENAI_API_KEY=
+DEFAULT_SUMMARY_MODEL=gpt-4.1-mini
+DEFAULT_ROUTING_MODEL=gpt-4.1-mini
+TYPESENSE_PROTOCOL=http
+TYPESENSE_HOST=typesense
+TYPESENSE_PORT=8108
+TYPESENSE_API_KEY=dcm-typesense-key
+TYPESENSE_COLLECTION_NAME=documents
+PUBLIC_BASE_URL=http://localhost:8000
@@ -0,0 +1,17 @@
+FROM python:3.12-slim
+
+ENV PYTHONDONTWRITEBYTECODE=1
+ENV PYTHONUNBUFFERED=1
+
+WORKDIR /app
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    libmagic1 \
+    && rm -rf /var/lib/apt/lists/*
+
+COPY requirements.txt /app/requirements.txt
+RUN pip install --no-cache-dir -r /app/requirements.txt
+
+COPY app /app/app
+
+CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
@@ -0,0 +1 @@
+"""Backend application package for the DMS service."""
@@ -0,0 +1 @@
+"""API package containing route modules and router registration."""
@@ -0,0 +1,17 @@
+"""API router registration for all HTTP route modules."""
+
+from fastapi import APIRouter
+
+from app.api.routes_documents import router as documents_router
+from app.api.routes_health import router as health_router
+from app.api.routes_processing_logs import router as processing_logs_router
+from app.api.routes_search import router as search_router
+from app.api.routes_settings import router as settings_router
+
+
+api_router = APIRouter()
+api_router.include_router(health_router)
+api_router.include_router(documents_router, prefix="/documents", tags=["documents"])
+api_router.include_router(processing_logs_router, prefix="/processing/logs", tags=["processing-logs"])
+api_router.include_router(search_router, prefix="/search", tags=["search"])
+api_router.include_router(settings_router, prefix="/settings", tags=["settings"])
@@ -0,0 +1,725 @@
+"""Document CRUD, lifecycle, metadata, file access, and content export endpoints."""
+
+import io
+import re
+import unicodedata
+import zipfile
+from datetime import datetime, time
+from pathlib import Path
+from typing import Annotated, Literal
+from uuid import UUID
+
+from fastapi import APIRouter, Depends, File, Form, HTTPException, Query, UploadFile
+from fastapi.responses import FileResponse, Response, StreamingResponse
+from sqlalchemy import func, or_, select
+from sqlalchemy.orm import Session
+
+from app.services.app_settings import read_predefined_paths_settings, read_predefined_tags_settings
+from app.db.base import get_session
+from app.models.document import Document, DocumentStatus
+from app.schemas.documents import (
+    ContentExportRequest,
+    DocumentDetailResponse,
+    DocumentResponse,
+    DocumentsListResponse,
+    DocumentUpdateRequest,
+    UploadConflict,
+    UploadResponse,
+)
+from app.services.extractor import sniff_mime
+from app.services.handwriting_style import delete_many_handwriting_style_documents
+from app.services.processing_logs import log_processing_event, set_processing_log_autocommit
+from app.services.storage import absolute_path, compute_sha256, store_bytes
+from app.services.typesense_index import delete_many_documents_index, upsert_document_index
+from app.worker.queue import get_processing_queue
+
+
+router = APIRouter()
+
+
+def _parse_csv(value: str | None) -> list[str]:
+    """Parses comma-separated query values into a normalized non-empty list."""
+
+    if not value:
+        return []
+    return [part.strip() for part in value.split(",") if part.strip()]
+
+
+def _parse_date(value: str | None) -> datetime | None:
+    """Parses ISO date or datetime strings; date-only values resolve to midnight."""
+
+    if not value:
+        return None
+    try:
+        parsed = datetime.fromisoformat(value)
+        return parsed
+    except ValueError:
+        pass
+    try:
+        date_value = datetime.strptime(value, "%Y-%m-%d").date()
+        return datetime.combine(date_value, time.min)
+    except ValueError:
+        return None
+
+
+def _apply_discovery_filters(
+    statement,
+    *,
+    path_filter: str | None,
+    tag_filter: str | None,
+    type_filter: str | None,
+    processed_from: str | None,
+    processed_to: str | None,
+):
+    """Applies optional path/tag/type/date filters to list and search statements."""
+
+    if path_filter and path_filter.strip():
+        statement = statement.where(Document.logical_path.ilike(f"{path_filter.strip()}%"))
+
+    tags = _parse_csv(tag_filter)
+    if tags:
+        statement = statement.where(Document.tags.overlap(tags))
+
+    types = _parse_csv(type_filter)
+    if types:
+        type_clauses = []
+        for value in types:
+            lowered = value.lower()
+            type_clauses.append(Document.extension.ilike(lowered))
+            type_clauses.append(Document.mime_type.ilike(lowered))
+            type_clauses.append(Document.image_text_type.ilike(lowered))
+        statement = statement.where(or_(*type_clauses))
+
+    processed_from_dt = _parse_date(processed_from)
+    if processed_from_dt is not None:
+        statement = statement.where(Document.processed_at.is_not(None), Document.processed_at >= processed_from_dt)
+
+    processed_to_dt = _parse_date(processed_to)
+    if processed_to_dt is not None:
+        statement = statement.where(Document.processed_at.is_not(None), Document.processed_at <= processed_to_dt)
+
+    return statement
+
+
+def _summary_for_index(document: Document) -> str:
+    """Resolves best-available summary text for semantic index updates outside worker pipeline."""
+
+    candidate = document.metadata_json.get("summary_text")
+    if isinstance(candidate, str) and candidate.strip():
+        return candidate.strip()
+    extracted = document.extracted_text.strip()
+    if extracted:
+        return extracted[:12000]
+    return f"{document.original_filename}\n{document.mime_type}\n{document.logical_path}"
+
+
+def _normalize_tags(raw_tags: str | None) -> list[str]:
+    """Parses comma-separated tags into a cleaned unique list."""
+
+    if not raw_tags:
+        return []
+    tags = [tag.strip() for tag in raw_tags.split(",") if tag.strip()]
+    return list(dict.fromkeys(tags))[:50]
+
+
+def _sanitize_filename(filename: str) -> str:
+    """Normalizes user-supplied filenames while preserving readability and extensions."""
+
+    base = filename.strip().replace("\\", " ").replace("/", " ")
+    base = re.sub(r"\s+", " ", base)
+    return base[:512] or "document"
+
+
+def _slugify_segment(value: str) -> str:
+    """Creates a filesystem-safe slug for path segments and markdown file names."""
+
+    normalized = unicodedata.normalize("NFKD", value)
+    ascii_text = normalized.encode("ascii", "ignore").decode("ascii")
+    cleaned = re.sub(r"[^a-zA-Z0-9._ -]+", "", ascii_text).strip()
+    compact = re.sub(r"\s+", "-", cleaned)
+    compact = compact.strip(".-_")
+    return compact[:120] or "document"
+
+
+def _markdown_for_document(document: Document) -> str:
+    """Builds a markdown representation of extracted document content and metadata."""
+
+    lines = [
+        f"# {document.original_filename}",
+        "",
+        f"- Document ID: `{document.id}`",
+        f"- Logical Path: `{document.logical_path}`",
+        f"- Source Path: `{document.source_relative_path}`",
+        f"- Tags: {', '.join(document.tags) if document.tags else '(none)'}",
+        "",
+        "## Extracted Content",
+        "",
+    ]
+
+    if document.extracted_text.strip():
+        lines.append(document.extracted_text)
+    else:
+        lines.append("_No extracted text available for this document._")
+
+    return "\n".join(lines).strip() + "\n"
+
+
+def _markdown_filename(document: Document) -> str:
+    """Builds a deterministic markdown filename for a single document export."""
+
+    stem = Path(document.original_filename).stem or document.original_filename
+    slug = _slugify_segment(stem)
+    return f"{slug}-{str(document.id)[:8]}.md"
+
+
+def _zip_entry_name(document: Document, used_names: set[str]) -> str:
+    """Builds a unique zip entry path for a document markdown export."""
+
+    path_segments = [segment for segment in document.logical_path.split("/") if segment]
+    sanitized_segments = [_slugify_segment(segment) for segment in path_segments]
+    filename = _markdown_filename(document)
+
+    base_entry = "/".join([*sanitized_segments, filename]) if sanitized_segments else filename
+    entry = base_entry
+    suffix = 1
+    while entry in used_names:
+        stem = Path(filename).stem
+        ext = Path(filename).suffix
+        candidate = f"{stem}-{suffix}{ext}"
+        entry = "/".join([*sanitized_segments, candidate]) if sanitized_segments else candidate
+        suffix += 1
+    used_names.add(entry)
+    return entry
+
+
+def _resolve_previous_status(metadata_json: dict, fallback_status: DocumentStatus) -> DocumentStatus:
+    """Resolves the status to restore from trash using recorded metadata."""
+
+    raw_status = metadata_json.get("status_before_trash")
+    if isinstance(raw_status, str):
+        try:
+            parsed = DocumentStatus(raw_status)
+            if parsed != DocumentStatus.TRASHED:
+                return parsed
+        except ValueError:
+            pass
+    return fallback_status
+
+
+def _build_document_list_statement(
+    only_trashed: bool,
+    include_trashed: bool,
+    path_prefix: str | None,
+):
+    """Builds a base SQLAlchemy select statement with lifecycle and path filters."""
+
+    statement = select(Document)
+    if only_trashed:
+        statement = statement.where(Document.status == DocumentStatus.TRASHED)
+    elif not include_trashed:
+        statement = statement.where(Document.status != DocumentStatus.TRASHED)
+
+    if path_prefix:
+        trimmed_prefix = path_prefix.strip()
+        if trimmed_prefix:
+            statement = statement.where(Document.logical_path.ilike(f"{trimmed_prefix}%"))
+
+    return statement
+
+
+def _collect_document_tree(session: Session, root_document_id: UUID) -> list[tuple[int, Document]]:
+    """Collects a document and all descendants for recursive permanent deletion."""
+
+    queue: list[tuple[UUID, int]] = [(root_document_id, 0)]
+    visited: set[UUID] = set()
+    collected: list[tuple[int, Document]] = []
+
+    while queue:
+        current_id, depth = queue.pop(0)
+        if current_id in visited:
+            continue
+        visited.add(current_id)
+
+        document = session.execute(select(Document).where(Document.id == current_id)).scalar_one_or_none()
+        if document is None:
+            continue
+
+        collected.append((depth, document))
+        child_ids = session.execute(
+            select(Document.id).where(Document.parent_document_id == current_id)
+        ).scalars().all()
+        for child_id in child_ids:
+            queue.append((child_id, depth + 1))
+
+    collected.sort(key=lambda item: item[0], reverse=True)
+    return collected
+
+
+@router.get("", response_model=DocumentsListResponse)
+def list_documents(
+    offset: int = Query(default=0, ge=0),
+    limit: int = Query(default=50, ge=1, le=200),
+    include_trashed: bool = Query(default=False),
+    only_trashed: bool = Query(default=False),
+    path_prefix: str | None = Query(default=None),
+    path_filter: str | None = Query(default=None),
+    tag_filter: str | None = Query(default=None),
+    type_filter: str | None = Query(default=None),
+    processed_from: str | None = Query(default=None),
+    processed_to: str | None = Query(default=None),
+    session: Session = Depends(get_session),
+) -> DocumentsListResponse:
+    """Returns paginated documents ordered by newest upload timestamp."""
+
+    base_statement = _build_document_list_statement(
+        only_trashed=only_trashed,
+        include_trashed=include_trashed,
+        path_prefix=path_prefix,
+    )
+    base_statement = _apply_discovery_filters(
+        base_statement,
+        path_filter=path_filter,
+        tag_filter=tag_filter,
+        type_filter=type_filter,
+        processed_from=processed_from,
+        processed_to=processed_to,
+    )
+
+    statement = base_statement.order_by(Document.created_at.desc()).offset(offset).limit(limit)
+    items = session.execute(statement).scalars().all()
+
+    count_statement = select(func.count()).select_from(base_statement.subquery())
+    total = session.execute(count_statement).scalar_one()
+
+    return DocumentsListResponse(total=total, items=[DocumentResponse.model_validate(item) for item in items])
+
+
+@router.get("/tags")
+def list_tags(
+    include_trashed: bool = Query(default=False),
+    session: Session = Depends(get_session),
+) -> dict[str, list[str]]:
+    """Returns distinct tags from matching documents merged with predefined tags from settings."""
+
+    statement = select(Document.tags)
+    if not include_trashed:
+        statement = statement.where(Document.status != DocumentStatus.TRASHED)
+
+    rows = session.execute(statement).scalars().all()
+    tags = {tag for row in rows for tag in row if tag}
+    tags.update(
+        str(item.get("value", "")).strip()
+        for item in read_predefined_tags_settings()
+        if str(item.get("value", "")).strip()
+    )
+    tags = sorted(tags)
+    return {"tags": tags}
+
+
+@router.get("/paths")
+def list_paths(
+    include_trashed: bool = Query(default=False),
+    session: Session = Depends(get_session),
+) -> dict[str, list[str]]:
+    """Returns distinct logical paths from matching documents merged with predefined paths from settings."""
+
+    statement = select(Document.logical_path)
+    if not include_trashed:
+        statement = statement.where(Document.status != DocumentStatus.TRASHED)
+
+    rows = session.execute(statement).scalars().all()
+    paths = {row for row in rows if row}
+    paths.update(
+        str(item.get("value", "")).strip()
+        for item in read_predefined_paths_settings()
+        if str(item.get("value", "")).strip()
+    )
+    paths = sorted(paths)
+    return {"paths": paths}
+
+
+@router.get("/types")
+def list_types(
+    include_trashed: bool = Query(default=False),
+    session: Session = Depends(get_session),
+) -> dict[str, list[str]]:
+    """Returns distinct document type values from extension, MIME, and image text type."""
+
+    statement = select(Document.extension, Document.mime_type, Document.image_text_type)
+    if not include_trashed:
+        statement = statement.where(Document.status != DocumentStatus.TRASHED)
+    rows = session.execute(statement).all()
+    values: set[str] = set()
+    for extension, mime_type, image_text_type in rows:
+        for candidate in (extension, mime_type, image_text_type):
+            normalized = str(candidate).strip().lower() if isinstance(candidate, str) else ""
+            if normalized:
+                values.add(normalized)
+    return {"types": sorted(values)}
+
+
+@router.post("/content-md/export")
+def export_contents_markdown(
+    payload: ContentExportRequest,
+    session: Session = Depends(get_session),
+) -> StreamingResponse:
+    """Exports extracted contents for selected documents as individual markdown files in a ZIP archive."""
+
+    has_document_ids = len(payload.document_ids) > 0
+    has_path_prefix = bool(payload.path_prefix and payload.path_prefix.strip())
+    if not has_document_ids and not has_path_prefix:
+        raise HTTPException(status_code=400, detail="Provide document_ids or path_prefix for export")
+
+    statement = select(Document)
+    if has_document_ids:
+        statement = statement.where(Document.id.in_(payload.document_ids))
+    if has_path_prefix:
+        statement = statement.where(Document.logical_path.ilike(f"{payload.path_prefix.strip()}%"))
+    if payload.only_trashed:
+        statement = statement.where(Document.status == DocumentStatus.TRASHED)
+    elif not payload.include_trashed:
+        statement = statement.where(Document.status != DocumentStatus.TRASHED)
+
+    documents = session.execute(statement.order_by(Document.logical_path.asc(), Document.created_at.asc())).scalars().all()
+    if not documents:
+        raise HTTPException(status_code=404, detail="No matching documents found for export")
+
+    archive_buffer = io.BytesIO()
+    used_entries: set[str] = set()
+    with zipfile.ZipFile(archive_buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
+        for document in documents:
+            entry_name = _zip_entry_name(document, used_entries)
+            archive.writestr(entry_name, _markdown_for_document(document))
+
+    archive_buffer.seek(0)
+    headers = {"Content-Disposition": 'attachment; filename="document-contents-md.zip"'}
+    return StreamingResponse(archive_buffer, media_type="application/zip", headers=headers)
+
+
+@router.get("/{document_id}", response_model=DocumentDetailResponse)
+def get_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentDetailResponse:
+    """Returns one document by unique identifier."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+    return DocumentDetailResponse.model_validate(document)
+
+
+@router.get("/{document_id}/download")
+def download_document(document_id: UUID, session: Session = Depends(get_session)) -> FileResponse:
+    """Downloads original document bytes for the requested document identifier."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+    file_path = absolute_path(document.stored_relative_path)
+    return FileResponse(path=file_path, filename=document.original_filename, media_type=document.mime_type)
+
+
+@router.get("/{document_id}/preview")
+def preview_document(document_id: UUID, session: Session = Depends(get_session)) -> FileResponse:
+    """Streams the original document inline when browser rendering is supported."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+    original_path = absolute_path(document.stored_relative_path)
+    return FileResponse(path=original_path, media_type=document.mime_type)
+
+
+@router.get("/{document_id}/thumbnail")
+def thumbnail_document(document_id: UUID, session: Session = Depends(get_session)) -> FileResponse:
+    """Returns a generated thumbnail image for dashboard card previews."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+    preview_relative_path = document.metadata_json.get("preview_relative_path")
+    if not preview_relative_path:
+        raise HTTPException(status_code=404, detail="Thumbnail not available")
+
+    preview_path = absolute_path(preview_relative_path)
+    if not preview_path.exists():
+        raise HTTPException(status_code=404, detail="Thumbnail file not found")
+    return FileResponse(path=preview_path)
+
+
+@router.get("/{document_id}/content-md")
+def download_document_content_markdown(document_id: UUID, session: Session = Depends(get_session)) -> Response:
+    """Downloads extracted content for one document as a markdown file."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+    markdown_content = _markdown_for_document(document)
+    filename = _markdown_filename(document)
+    headers = {"Content-Disposition": f'attachment; filename="{filename}"'}
+    return Response(content=markdown_content, media_type="text/markdown; charset=utf-8", headers=headers)
+
+
+@router.post("/upload", response_model=UploadResponse)
+async def upload_documents(
+    files: Annotated[list[UploadFile], File(description="Files to upload")],
+    relative_paths: Annotated[list[str] | None, Form()] = None,
+    logical_path: Annotated[str, Form()] = "Inbox",
+    tags: Annotated[str | None, Form()] = None,
+    conflict_mode: Annotated[Literal["ask", "replace", "duplicate"], Form()] = "ask",
+    session: Session = Depends(get_session),
+) -> UploadResponse:
+    """Uploads files, records metadata, and enqueues asynchronous extraction tasks."""
+
+    set_processing_log_autocommit(session, True)
+    normalized_tags = _normalize_tags(tags)
+    queue = get_processing_queue()
+    uploaded: list[DocumentResponse] = []
+    conflicts: list[UploadConflict] = []
+
+    indexed_relative_paths = relative_paths or []
+    prepared_uploads: list[dict[str, object]] = []
+
+    for idx, file in enumerate(files):
+        filename = file.filename or f"uploaded_{idx}"
+        data = await file.read()
+        sha256 = compute_sha256(data)
+        source_relative_path = indexed_relative_paths[idx] if idx < len(indexed_relative_paths) else filename
+        extension = Path(filename).suffix.lower()
+        detected_mime = sniff_mime(data)
+        log_processing_event(
+            session=session,
+            stage="upload",
+            event="Upload request received",
+            level="info",
+            document_filename=filename,
+            payload_json={
+                "source_relative_path": source_relative_path,
+                "logical_path": logical_path,
+                "tags": normalized_tags,
+                "mime_type": detected_mime,
+                "size_bytes": len(data),
+                "conflict_mode": conflict_mode,
+            },
+        )
+        prepared_uploads.append(
+            {
+                "filename": filename,
+                "data": data,
+                "sha256": sha256,
+                "source_relative_path": source_relative_path,
+                "extension": extension,
+                "mime_type": detected_mime,
+            }
+        )
+
+        existing = session.execute(select(Document).where(Document.sha256 == sha256)).scalars().first()
+        if existing and conflict_mode == "ask":
+            log_processing_event(
+                session=session,
+                stage="upload",
+                event="Upload conflict detected",
+                level="warning",
+                document_id=existing.id,
+                document_filename=filename,
+                payload_json={
+                    "sha256": sha256,
+                    "existing_document_id": str(existing.id),
+                },
+            )
+            conflicts.append(
+                UploadConflict(
+                    original_filename=filename,
+                    sha256=sha256,
+                    existing_document_id=existing.id,
+                )
+            )
+
+    if conflicts and conflict_mode == "ask":
+        session.commit()
+        return UploadResponse(uploaded=[], conflicts=conflicts)
+
+    for prepared in prepared_uploads:
+        existing = session.execute(
+            select(Document).where(Document.sha256 == str(prepared["sha256"]))
+        ).scalars().first()
+        replaces_document_id = existing.id if existing and conflict_mode == "replace" else None
+
+        stored_relative_path = store_bytes(str(prepared["filename"]), bytes(prepared["data"]))
+
+        document = Document(
+            original_filename=str(prepared["filename"]),
+            source_relative_path=str(prepared["source_relative_path"]),
+            stored_relative_path=stored_relative_path,
+            mime_type=str(prepared["mime_type"]),
+            extension=str(prepared["extension"]),
+            sha256=str(prepared["sha256"]),
+            size_bytes=len(bytes(prepared["data"])),
+            logical_path=logical_path,
+            tags=list(normalized_tags),
+            replaces_document_id=replaces_document_id,
+            metadata_json={"upload": "web"},
+        )
+        session.add(document)
+        session.flush()
+        queue.enqueue("app.worker.tasks.process_document_task", str(document.id))
+
+        log_processing_event(
+            session=session,
+            stage="upload",
+            event="Document record created and queued",
+            level="info",
+            document=document,
+            payload_json={
+                "source_relative_path": document.source_relative_path,
+                "stored_relative_path": document.stored_relative_path,
+                "logical_path": document.logical_path,
+                "tags": list(document.tags),
+                "replaces_document_id": str(replaces_document_id) if replaces_document_id is not None else None,
+            },
+        )
+        uploaded.append(DocumentResponse.model_validate(document))
+
+    session.commit()
+    return UploadResponse(uploaded=uploaded, conflicts=conflicts)
+
+
+@router.patch("/{document_id}", response_model=DocumentResponse)
+def update_document(
+    document_id: UUID,
+    payload: DocumentUpdateRequest,
+    session: Session = Depends(get_session),
+) -> DocumentResponse:
+    """Updates document metadata and refreshes semantic index representation."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+    if payload.original_filename is not None:
+        document.original_filename = _sanitize_filename(payload.original_filename)
+    if payload.logical_path is not None:
+        document.logical_path = payload.logical_path.strip() or "Inbox"
+    if payload.tags is not None:
+        document.tags = list(dict.fromkeys([tag.strip() for tag in payload.tags if tag.strip()]))[:50]
+
+    try:
+        upsert_document_index(document=document, summary_text=_summary_for_index(document))
+    except Exception:
+        pass  # Best-effort: an index refresh failure does not block the metadata update.
+
+    session.commit()
+    session.refresh(document)
+    return DocumentResponse.model_validate(document)
+
+
+@router.post("/{document_id}/trash", response_model=DocumentResponse)
+def trash_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentResponse:
+    """Marks a document as trashed without deleting files from storage."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+    if document.status != DocumentStatus.TRASHED:
+        document.metadata_json = {
+            **document.metadata_json,
+            "status_before_trash": document.status.value,
+        }
+        document.status = DocumentStatus.TRASHED
+        try:
+            upsert_document_index(document=document, summary_text=_summary_for_index(document))
+        except Exception:
+            pass  # Best-effort: an index refresh failure does not block trashing.
+        session.commit()
+        session.refresh(document)
+
+    return DocumentResponse.model_validate(document)
+
+
+@router.post("/{document_id}/restore", response_model=DocumentResponse)
+def restore_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentResponse:
+    """Restores a trashed document to its previous lifecycle status."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+
+    if document.status == DocumentStatus.TRASHED:
+        fallback = DocumentStatus.PROCESSED if document.processed_at else DocumentStatus.QUEUED
+        restored_status = _resolve_previous_status(document.metadata_json, fallback)
+        document.status = restored_status
+        metadata_json = dict(document.metadata_json)
+        metadata_json.pop("status_before_trash", None)
+        document.metadata_json = metadata_json
+        try:
+            upsert_document_index(document=document, summary_text=_summary_for_index(document))
+        except Exception:
+            pass  # Best-effort: an index refresh failure does not block the restore.
+        session.commit()
+        session.refresh(document)
+
+    return DocumentResponse.model_validate(document)
+
+
+@router.delete("/{document_id}")
+def delete_document(document_id: UUID, session: Session = Depends(get_session)) -> dict[str, int]:
+    """Permanently deletes a document and all descendant archive members including stored files."""
+
+    root = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if root is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+    if root.status != DocumentStatus.TRASHED:
+        raise HTTPException(status_code=400, detail="Move document to trash before permanent deletion")
+
+    document_tree = _collect_document_tree(session=session, root_document_id=document_id)
+    document_ids = [document.id for _, document in document_tree]
+    try:
+        delete_many_documents_index([str(current_id) for current_id in document_ids])
+    except Exception:
+        pass  # Best-effort: a search-index cleanup failure does not block deletion.
+    try:
+        delete_many_handwriting_style_documents([str(current_id) for current_id in document_ids])
+    except Exception:
+        pass  # Best-effort: a handwriting-style index cleanup failure does not block deletion.
+
+    deleted_files = 0
+    for _, document in document_tree:
+        source_path = absolute_path(document.stored_relative_path)
+        if source_path.exists() and source_path.is_file():
+            source_path.unlink(missing_ok=True)
+            deleted_files += 1
+
+        preview_relative_path = document.metadata_json.get("preview_relative_path")
+        if isinstance(preview_relative_path, str):
+            preview_path = absolute_path(preview_relative_path)
+            if preview_path.exists() and preview_path.is_file():
+                preview_path.unlink(missing_ok=True)
+
+        session.delete(document)
+
+    session.commit()
+    return {"deleted_documents": len(document_tree), "deleted_files": deleted_files}
+
+
+@router.post("/{document_id}/reprocess", response_model=DocumentResponse)
+def reprocess_document(document_id: UUID, session: Session = Depends(get_session)) -> DocumentResponse:
+    """Re-enqueues a document for extraction and suggestion processing."""
+
+    document = session.execute(select(Document).where(Document.id == document_id)).scalar_one_or_none()
+    if document is None:
+        raise HTTPException(status_code=404, detail="Document not found")
+    if document.status == DocumentStatus.TRASHED:
+        raise HTTPException(status_code=400, detail="Restore document before reprocessing")
+
+    queue = get_processing_queue()
+    document.status = DocumentStatus.QUEUED
+    try:
+        upsert_document_index(document=document, summary_text=_summary_for_index(document))
+    except Exception:
+        pass  # Best-effort: an index refresh failure does not block re-queuing.
+    session.commit()
+    queue.enqueue("app.worker.tasks.process_document_task", str(document.id))
+    session.refresh(document)
+    return DocumentResponse.model_validate(document)
@@ -0,0 +1,13 @@
+"""Health and readiness endpoints for orchestration and uptime checks."""
+
+from fastapi import APIRouter
+
+
+router = APIRouter(prefix="/health", tags=["health"])
+
+
+@router.get("")
+def health() -> dict[str, str]:
+    """Returns service liveness status."""
+
+    return {"status": "ok"}
@@ -0,0 +1,66 @@
+"""API endpoints for reading, trimming, and clearing processing pipeline event logs."""
+
+from uuid import UUID
+
+from fastapi import APIRouter, Depends, Query
+from sqlalchemy.orm import Session
+
+from app.db.base import get_session
+from app.schemas.processing_logs import ProcessingLogEntryResponse, ProcessingLogListResponse
+from app.services.processing_logs import (
+    cleanup_processing_logs,
+    clear_processing_logs,
+    count_processing_logs,
+    list_processing_logs,
+)
+
+
+router = APIRouter()
+
+
+@router.get("", response_model=ProcessingLogListResponse)
+def get_processing_logs(
+    offset: int = Query(default=0, ge=0),
+    limit: int = Query(default=120, ge=1, le=400),
+    document_id: UUID | None = Query(default=None),
+    session: Session = Depends(get_session),
+) -> ProcessingLogListResponse:
+    """Returns paginated processing logs ordered from newest to oldest."""
+
+    items = list_processing_logs(
+        session=session,
+        limit=limit,
+        offset=offset,
+        document_id=document_id,
+    )
+    total = count_processing_logs(session=session, document_id=document_id)
+    return ProcessingLogListResponse(
+        total=total,
+        items=[ProcessingLogEntryResponse.model_validate(item) for item in items],
+    )
+
+
+@router.post("/trim")
+def trim_processing_logs(
+    keep_document_sessions: int = Query(default=2, ge=0, le=20),
+    keep_unbound_entries: int = Query(default=80, ge=0, le=400),
+    session: Session = Depends(get_session),
+) -> dict[str, int]:
+    """Deletes old processing logs while keeping recent document sessions and unbound events."""
+
+    result = cleanup_processing_logs(
+        session=session,
+        keep_document_sessions=keep_document_sessions,
+        keep_unbound_entries=keep_unbound_entries,
+    )
+    session.commit()
+    return result
+
+
+@router.post("/clear")
+def clear_all_processing_logs(session: Session = Depends(get_session)) -> dict[str, int]:
+    """Deletes all processing logs to reset the diagnostics timeline."""
+
+    result = clear_processing_logs(session=session)
+    session.commit()
+    return result
@@ -0,0 +1,84 @@
+"""Search endpoints for full-text and metadata document discovery."""
+
+from fastapi import APIRouter, Depends, Query
+from sqlalchemy import Text, cast, func, select
+from sqlalchemy.orm import Session
+
+from app.api.routes_documents import _apply_discovery_filters
+from app.db.base import get_session
+from app.models.document import Document, DocumentStatus
+from app.schemas.documents import DocumentResponse, SearchResponse
+
+
+router = APIRouter()
+
+
+@router.get("", response_model=SearchResponse)
+def search_documents(
+    query: str = Query(min_length=2),
+    offset: int = Query(default=0, ge=0),
+    limit: int = Query(default=50, ge=1, le=200),
+    include_trashed: bool = Query(default=False),
+    only_trashed: bool = Query(default=False),
+    path_filter: str | None = Query(default=None),
+    tag_filter: str | None = Query(default=None),
+    type_filter: str | None = Query(default=None),
+    processed_from: str | None = Query(default=None),
+    processed_to: str | None = Query(default=None),
+    session: Session = Depends(get_session),
+) -> SearchResponse:
+    """Searches documents using PostgreSQL full-text ranking plus metadata matching."""
+
+    vector = func.to_tsvector(
+        "simple",
+        func.coalesce(Document.original_filename, "")
+        + " "
+        + func.coalesce(Document.logical_path, "")
+        + " "
+        + func.coalesce(Document.extracted_text, "")
+        + " "
+        + func.coalesce(cast(Document.tags, Text), ""),
+    )
+    ts_query = func.plainto_tsquery("simple", query)
+    rank = func.ts_rank_cd(vector, ts_query)
+
+    search_filter = (
+        vector.op("@@")(ts_query)
+        | Document.original_filename.icontains(query, autoescape=True)
+        | Document.logical_path.icontains(query, autoescape=True)
+        | cast(Document.tags, Text).icontains(query, autoescape=True)
+    )
+
+    statement = select(Document).where(search_filter)
+    if only_trashed:
+        statement = statement.where(Document.status == DocumentStatus.TRASHED)
+    elif not include_trashed:
+        statement = statement.where(Document.status != DocumentStatus.TRASHED)
+    statement = _apply_discovery_filters(
+        statement,
+        path_filter=path_filter,
+        tag_filter=tag_filter,
+        type_filter=type_filter,
+        processed_from=processed_from,
+        processed_to=processed_to,
+    )
+    statement = statement.order_by(rank.desc(), Document.created_at.desc()).offset(offset).limit(limit)
+
+    items = session.execute(statement).scalars().all()
+
+    count_statement = select(func.count(Document.id)).where(search_filter)
+    if only_trashed:
+        count_statement = count_statement.where(Document.status == DocumentStatus.TRASHED)
+    elif not include_trashed:
+        count_statement = count_statement.where(Document.status != DocumentStatus.TRASHED)
+    count_statement = _apply_discovery_filters(
+        count_statement,
+        path_filter=path_filter,
+        tag_filter=tag_filter,
+        type_filter=type_filter,
+        processed_from=processed_from,
+        processed_to=processed_to,
+    )
+    total = session.execute(count_statement).scalar_one()
+
+    return SearchResponse(total=total, items=[DocumentResponse.model_validate(item) for item in items])
@@ -0,0 +1,232 @@
+"""API routes for managing persistent single-user application settings."""
+
+from fastapi import APIRouter
+
+from app.schemas.settings import (
+    AppSettingsUpdateRequest,
+    AppSettingsResponse,
+    DisplaySettingsResponse,
+    HandwritingSettingsResponse,
+    HandwritingStyleSettingsResponse,
+    HandwritingSettingsUpdateRequest,
+    OcrTaskSettingsResponse,
+    ProviderSettingsResponse,
+    RoutingTaskSettingsResponse,
+    SummaryTaskSettingsResponse,
+    TaskSettingsResponse,
+    UploadDefaultsResponse,
+)
+from app.services.app_settings import (
+    TASK_OCR_HANDWRITING,
+    TASK_ROUTING_CLASSIFICATION,
+    TASK_SUMMARY_GENERATION,
+    read_app_settings,
+    reset_app_settings,
+    update_app_settings,
+    update_handwriting_settings,
+)
+
+
+router = APIRouter()
+
+
+def _build_response(payload: dict) -> AppSettingsResponse:
+    """Converts internal settings dictionaries into API response models."""
+
+    upload_defaults_payload = payload.get("upload_defaults", {})
+    display_payload = payload.get("display", {})
+    providers_payload = payload.get("providers", [])
+    tasks_payload = payload.get("tasks", {})
+    handwriting_style_payload = payload.get("handwriting_style_clustering", {})
+    ocr_payload = tasks_payload.get(TASK_OCR_HANDWRITING, {})
+    summary_payload = tasks_payload.get(TASK_SUMMARY_GENERATION, {})
+    routing_payload = tasks_payload.get(TASK_ROUTING_CLASSIFICATION, {})
+
+    return AppSettingsResponse(
+        upload_defaults=UploadDefaultsResponse(
+            logical_path=str(upload_defaults_payload.get("logical_path", "Inbox")),
+            tags=[
+                str(tag).strip()
+                for tag in upload_defaults_payload.get("tags", [])
+                if isinstance(tag, str) and tag.strip()
+            ],
+        ),
+        display=DisplaySettingsResponse(
+            cards_per_page=int(display_payload.get("cards_per_page", 12)),
+            log_typing_animation_enabled=bool(display_payload.get("log_typing_animation_enabled", True)),
+        ),
+        handwriting_style_clustering=HandwritingStyleSettingsResponse(
+            enabled=bool(handwriting_style_payload.get("enabled", True)),
+            embed_model=str(handwriting_style_payload.get("embed_model", "ts/clip-vit-b-p32")),
+            neighbor_limit=int(handwriting_style_payload.get("neighbor_limit", 8)),
+            match_min_similarity=float(handwriting_style_payload.get("match_min_similarity", 0.86)),
+            bootstrap_match_min_similarity=float(
+                handwriting_style_payload.get("bootstrap_match_min_similarity", 0.89)
+            ),
+            bootstrap_sample_size=int(handwriting_style_payload.get("bootstrap_sample_size", 3)),
+            image_max_side=int(handwriting_style_payload.get("image_max_side", 1024)),
+        ),
+        predefined_paths=[
+            {
+                "value": str(item.get("value", "")).strip(),
+                "global_shared": bool(item.get("global_shared", False)),
+            }
+            for item in payload.get("predefined_paths", [])
+            if isinstance(item, dict) and str(item.get("value", "")).strip()
+        ],
+        predefined_tags=[
+            {
+                "value": str(item.get("value", "")).strip(),
+                "global_shared": bool(item.get("global_shared", False)),
+            }
+            for item in payload.get("predefined_tags", [])
+            if isinstance(item, dict) and str(item.get("value", "")).strip()
+        ],
+        providers=[
+            ProviderSettingsResponse(
+                id=str(provider.get("id", "")),
+                label=str(provider.get("label", "")),
+                provider_type=str(provider.get("provider_type", "openai_compatible")),
+                base_url=str(provider.get("base_url", "https://api.openai.com/v1")),
+                timeout_seconds=int(provider.get("timeout_seconds", 45)),
+                api_key_set=bool(provider.get("api_key_set", False)),
+                api_key_masked=str(provider.get("api_key_masked", "")),
+            )
+            for provider in providers_payload
+        ],
+        tasks=TaskSettingsResponse(
+            ocr_handwriting=OcrTaskSettingsResponse(
+                enabled=bool(ocr_payload.get("enabled", True)),
+                provider_id=str(ocr_payload.get("provider_id", "openai-default")),
+                model=str(ocr_payload.get("model", "gpt-4.1-mini")),
+                prompt=str(ocr_payload.get("prompt", "")),
+            ),
+            summary_generation=SummaryTaskSettingsResponse(
+                enabled=bool(summary_payload.get("enabled", True)),
+                provider_id=str(summary_payload.get("provider_id", "openai-default")),
+                model=str(summary_payload.get("model", "gpt-4.1-mini")),
+                prompt=str(summary_payload.get("prompt", "")),
+                max_input_tokens=int(summary_payload.get("max_input_tokens", 8000)),
+            ),
+            routing_classification=RoutingTaskSettingsResponse(
+                enabled=bool(routing_payload.get("enabled", True)),
+                provider_id=str(routing_payload.get("provider_id", "openai-default")),
+                model=str(routing_payload.get("model", "gpt-4.1-mini")),
+                prompt=str(routing_payload.get("prompt", "")),
+                neighbor_count=int(routing_payload.get("neighbor_count", 8)),
+                neighbor_min_similarity=float(routing_payload.get("neighbor_min_similarity", 0.84)),
+                auto_apply_confidence_threshold=float(routing_payload.get("auto_apply_confidence_threshold", 0.78)),
+                auto_apply_neighbor_similarity_threshold=float(
+                    routing_payload.get("auto_apply_neighbor_similarity_threshold", 0.55)
+                ),
+                neighbor_path_override_enabled=bool(routing_payload.get("neighbor_path_override_enabled", True)),
+                neighbor_path_override_min_similarity=float(
+                    routing_payload.get("neighbor_path_override_min_similarity", 0.86)
+                ),
+                neighbor_path_override_min_gap=float(routing_payload.get("neighbor_path_override_min_gap", 0.04)),
+                neighbor_path_override_max_confidence=float(
+                    routing_payload.get("neighbor_path_override_max_confidence", 0.9)
+                ),
+            ),
+        ),
+    )
+
+
+@router.get("", response_model=AppSettingsResponse)
+def get_app_settings() -> AppSettingsResponse:
+    """Returns persisted provider and per-task settings configuration."""
+
+    return _build_response(read_app_settings())
+
+
+@router.patch("", response_model=AppSettingsResponse)
+def set_app_settings(payload: AppSettingsUpdateRequest) -> AppSettingsResponse:
+    """Updates every settings section present in the payload and returns the resulting persisted configuration."""
+
+    providers_payload = None
+    if payload.providers is not None:
+        providers_payload = [provider.model_dump() for provider in payload.providers]
+
+    tasks_payload = None
+    if payload.tasks is not None:
+        tasks_payload = payload.tasks.model_dump(exclude_none=True)
+
+    upload_defaults_payload = None
+    if payload.upload_defaults is not None:
+        upload_defaults_payload = payload.upload_defaults.model_dump(exclude_none=True)
+
+    display_payload = None
+    if payload.display is not None:
+        display_payload = payload.display.model_dump(exclude_none=True)
+
+    handwriting_style_payload = None
+    if payload.handwriting_style_clustering is not None:
+        handwriting_style_payload = payload.handwriting_style_clustering.model_dump(exclude_none=True)
+    predefined_paths_payload = None
+    if payload.predefined_paths is not None:
+        predefined_paths_payload = [item.model_dump(exclude_none=True) for item in payload.predefined_paths]
+    predefined_tags_payload = None
+    if payload.predefined_tags is not None:
+        predefined_tags_payload = [item.model_dump(exclude_none=True) for item in payload.predefined_tags]
+
+    updated = update_app_settings(
+        providers=providers_payload,
+        tasks=tasks_payload,
+        upload_defaults=upload_defaults_payload,
+        display=display_payload,
+        handwriting_style=handwriting_style_payload,
+        predefined_paths=predefined_paths_payload,
+        predefined_tags=predefined_tags_payload,
+    )
+    return _build_response(updated)
+
+
+@router.post("/reset", response_model=AppSettingsResponse)
+def reset_settings_to_defaults() -> AppSettingsResponse:
+    """Resets all persisted settings to default providers and task bindings."""
+
+    return _build_response(reset_app_settings())
+
+
+@router.patch("/handwriting", response_model=AppSettingsResponse)
+def set_handwriting_settings(payload: HandwritingSettingsUpdateRequest) -> AppSettingsResponse:
+    """Updates handwriting transcription settings and returns the resulting configuration."""
+
+    updated = update_handwriting_settings(
+        enabled=payload.enabled,
+        openai_base_url=payload.openai_base_url,
+        openai_model=payload.openai_model,
+        openai_timeout_seconds=payload.openai_timeout_seconds,
+        openai_api_key=payload.openai_api_key,
+        clear_openai_api_key=payload.clear_openai_api_key,
+    )
+    return _build_response(updated)
+
+
+@router.get("/handwriting", response_model=HandwritingSettingsResponse)
+def get_handwriting_settings() -> HandwritingSettingsResponse:
+    """Returns legacy handwriting response shape for compatibility with older clients."""
+
+    payload = _build_response(read_app_settings())
+    fallback_provider = ProviderSettingsResponse(
+        id="openai-default",
+        label="OpenAI Default",
+        provider_type="openai_compatible",
+        base_url="https://api.openai.com/v1",
+        timeout_seconds=45,
+        api_key_set=False,
+        api_key_masked="",
+    )
+    ocr = payload.tasks.ocr_handwriting
+    provider = next((item for item in payload.providers if item.id == ocr.provider_id), None)
+    if provider is None:
+        provider = payload.providers[0] if payload.providers else fallback_provider
+    return HandwritingSettingsResponse(
+        provider=provider.provider_type,
+        enabled=ocr.enabled,
+        openai_base_url=provider.base_url,
+        openai_model=ocr.model,
+        openai_timeout_seconds=provider.timeout_seconds,
+        openai_api_key_set=provider.api_key_set,
+        openai_api_key_masked=provider.api_key_masked,
+    )
@@ -0,0 +1 @@
+"""Core settings and shared configuration package."""
@@ -0,0 +1,46 @@
+"""Application settings and environment configuration."""
+
+from functools import lru_cache
+from pathlib import Path
+
+from pydantic import Field
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
+
+class Settings(BaseSettings):
+    """Defines runtime configuration values loaded from environment variables."""
+
+    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")
+
+    app_name: str = "dcm-dms"
+    app_env: str = "development"
+    database_url: str = "postgresql+psycopg://dcm:dcm@db:5432/dcm"
+    redis_url: str = "redis://redis:6379/0"
+    storage_root: Path = Path("/data/storage")
+    upload_chunk_size: int = 4 * 1024 * 1024
+    max_zip_members: int = 250
+    max_zip_depth: int = 2
+    max_text_length: int = 500_000
+    default_openai_base_url: str = "https://api.openai.com/v1"
+    default_openai_model: str = "gpt-4.1-mini"
+    default_openai_timeout_seconds: int = 45
+    default_openai_handwriting_enabled: bool = True
+    default_openai_api_key: str = ""
+    default_summary_model: str = "gpt-4.1-mini"
+    default_routing_model: str = "gpt-4.1-mini"
+    typesense_protocol: str = "http"
+    typesense_host: str = "typesense"
+    typesense_port: int = 8108
+    typesense_api_key: str = "dcm-typesense-key"
+    typesense_collection_name: str = "documents"
+    typesense_timeout_seconds: int = 120
+    typesense_num_retries: int = 0
+    public_base_url: str = "http://localhost:8000"
+    cors_origins: list[str] = Field(default_factory=lambda: ["http://localhost:5173", "http://localhost:3000"])
+
+
+@lru_cache(maxsize=1)
+def get_settings() -> Settings:
+    """Returns a cached settings object for dependency injection and service access."""
+
+    return Settings()
@@ -0,0 +1 @@
+"""Database package exposing engine and session utilities."""
@@ -0,0 +1,53 @@
+"""Database engine and session utilities for persistence operations."""
+
+from collections.abc import Generator
+
+from sqlalchemy import create_engine, text
+from sqlalchemy.orm import Session, declarative_base, sessionmaker
+
+from app.core.config import get_settings
+
+
+Base = declarative_base()
+
+
+settings = get_settings()
+engine = create_engine(settings.database_url, pool_pre_ping=True)
+SessionLocal = sessionmaker(bind=engine, autoflush=False, autocommit=False, expire_on_commit=False)
+
+
+def get_session() -> Generator[Session, None, None]:
+    """Provides a transactional database session for FastAPI request handling."""
+
+    session = SessionLocal()
+    try:
+        yield session
+    finally:
+        session.close()
+
+
+def init_db() -> None:
+    """Initializes all ORM tables and search-related database extensions/indexes."""
+
+    from app import models  # noqa: F401
+
+    Base.metadata.create_all(bind=engine)
+    with engine.begin() as connection:
+        connection.execute(text("CREATE EXTENSION IF NOT EXISTS pg_trgm"))
+        connection.execute(
+            text(
+                """
+                CREATE INDEX IF NOT EXISTS idx_documents_text_search
+                ON documents
+                USING GIN (
+                    to_tsvector(
+                        'simple',
+                        coalesce(original_filename, '') || ' ' ||
+                        coalesce(logical_path, '') || ' ' ||
+                        coalesce(extracted_text, '')
+                    )
+                )
+                """
+            )
+        )
+        connection.execute(text("CREATE INDEX IF NOT EXISTS idx_documents_sha256 ON documents (sha256)"))
@@ -0,0 +1,50 @@
+"""FastAPI entrypoint for the DMS backend service."""
+
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+
+from app.api.router import api_router
+from app.core.config import get_settings
+from app.db.base import init_db
+from app.services.app_settings import ensure_app_settings
+from app.services.handwriting_style import ensure_handwriting_style_collection
+from app.services.storage import ensure_storage
+from app.services.typesense_index import ensure_typesense_collection
+
+
+settings = get_settings()
+
+
+def create_app() -> FastAPI:
+    """Builds and configures the FastAPI application instance."""
+
+    app = FastAPI(title="DCM DMS API", version="0.1.0")
+    app.add_middleware(
+        CORSMiddleware,
+        allow_origins=settings.cors_origins,
+        allow_credentials=True,
+        allow_methods=["*"],
+        allow_headers=["*"],
+    )
+    app.include_router(api_router, prefix="/api/v1")
+
+    @app.on_event("startup")
+    def startup_event() -> None:
+        """Initializes storage, persisted app settings, database schema, and search collections on service startup."""
+
+        ensure_storage()
+        ensure_app_settings()
+        init_db()
+        try:
+            ensure_typesense_collection()
+        except Exception:
+            pass  # Best-effort: a search collection setup failure does not prevent startup.
+        try:
+            ensure_handwriting_style_collection()
+        except Exception:
+            pass  # Best-effort: a handwriting-style collection setup failure does not prevent startup.
+
+    return app
+
+
+app = create_app()
@@ -0,0 +1,6 @@
+"""Model exports for ORM metadata discovery."""
+
+from app.models.document import Document, DocumentStatus
+from app.models.processing_log import ProcessingLogEntry
+
+__all__ = ["Document", "DocumentStatus", "ProcessingLogEntry"]
@@ -0,0 +1,65 @@
+"""Data model representing a stored and processed document."""
+
+import uuid
+from datetime import UTC, datetime
+from enum import Enum
+
+from sqlalchemy import Boolean, DateTime, Enum as SqlEnum, ForeignKey, Integer, String, Text
+from sqlalchemy.dialects.postgresql import ARRAY, JSONB, UUID
+from sqlalchemy.orm import Mapped, mapped_column, relationship
+
+from app.db.base import Base
+
+
+class DocumentStatus(str, Enum):
+    """Enumerates processing states for uploaded documents."""
+
+    QUEUED = "queued"
+    PROCESSED = "processed"
+    UNSUPPORTED = "unsupported"
+    ERROR = "error"
+    TRASHED = "trashed"
+
+
+class Document(Base):
+    """Stores file identity, storage paths, extracted content, and classification metadata."""
+
+    __tablename__ = "documents"
+
+    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
+    original_filename: Mapped[str] = mapped_column(String(512), nullable=False)
+    source_relative_path: Mapped[str] = mapped_column(String(1024), nullable=False, default="")
+    stored_relative_path: Mapped[str] = mapped_column(String(1024), nullable=False)
+    mime_type: Mapped[str] = mapped_column(String(255), nullable=False, default="application/octet-stream")
+    extension: Mapped[str] = mapped_column(String(32), nullable=False, default="")
+    sha256: Mapped[str] = mapped_column(String(128), nullable=False)
+    size_bytes: Mapped[int] = mapped_column(Integer, nullable=False)
+    logical_path: Mapped[str] = mapped_column(String(1024), nullable=False, default="Inbox")
+    suggested_path: Mapped[str | None] = mapped_column(String(1024), nullable=True)
+    tags: Mapped[list[str]] = mapped_column(ARRAY(String), nullable=False, default=list)
+    suggested_tags: Mapped[list[str]] = mapped_column(ARRAY(String), nullable=False, default=list)
+    metadata_json: Mapped[dict] = mapped_column(JSONB, nullable=False, default=dict)
+    extracted_text: Mapped[str] = mapped_column(Text, nullable=False, default="")
+    image_text_type: Mapped[str | None] = mapped_column(String(64), nullable=True)
+    handwriting_style_id: Mapped[str | None] = mapped_column(String(64), nullable=True, index=True)
+    status: Mapped[DocumentStatus] = mapped_column(SqlEnum(DocumentStatus), nullable=False, default=DocumentStatus.QUEUED)
+    preview_available: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False)
+    archived_member_path: Mapped[str | None] = mapped_column(String(1024), nullable=True)
+    is_archive_member: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False)
+    parent_document_id: Mapped[uuid.UUID | None] = mapped_column(UUID(as_uuid=True), ForeignKey("documents.id"), nullable=True)
+    replaces_document_id: Mapped[uuid.UUID | None] = mapped_column(UUID(as_uuid=True), ForeignKey("documents.id"), nullable=True)
+    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, default=lambda: datetime.now(UTC))
+    processed_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), nullable=True)
+    updated_at: Mapped[datetime] = mapped_column(
+        DateTime(timezone=True),
+        nullable=False,
+        default=lambda: datetime.now(UTC),
+        onupdate=lambda: datetime.now(UTC),
+    )
+
+    parent_document: Mapped["Document | None"] = relationship(
+        "Document",
+        remote_side="Document.id",
+        foreign_keys=[parent_document_id],
+        post_update=True,
+    )
@@ -0,0 +1,33 @@
+"""Data model representing one persisted processing pipeline log entry."""
+
+import uuid
+from datetime import UTC, datetime
+
+from sqlalchemy import BigInteger, DateTime, ForeignKey, String, Text
+from sqlalchemy.dialects.postgresql import JSONB, UUID
+from sqlalchemy.orm import Mapped, mapped_column
+
+from app.db.base import Base
+
+
+class ProcessingLogEntry(Base):
+    """Stores a timestamped processing event with optional model prompt and response text."""
+
+    __tablename__ = "processing_logs"
+
+    id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
+    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, default=lambda: datetime.now(UTC))
+    level: Mapped[str] = mapped_column(String(16), nullable=False, default="info")
+    stage: Mapped[str] = mapped_column(String(64), nullable=False)
+    event: Mapped[str] = mapped_column(String(256), nullable=False)
+    document_id: Mapped[uuid.UUID | None] = mapped_column(
+        UUID(as_uuid=True),
+        ForeignKey("documents.id", ondelete="SET NULL"),
+        nullable=True,
+    )
+    document_filename: Mapped[str | None] = mapped_column(String(512), nullable=True)
+    provider_id: Mapped[str | None] = mapped_column(String(128), nullable=True)
+    model_name: Mapped[str | None] = mapped_column(String(256), nullable=True)
+    prompt_text: Mapped[str | None] = mapped_column(Text, nullable=True)
+    response_text: Mapped[str | None] = mapped_column(Text, nullable=True)
+    payload_json: Mapped[dict] = mapped_column(JSONB, nullable=False, default=dict)
@@ -0,0 +1 @@
+"""Pydantic schema package for API request and response models."""
@@ -0,0 +1,92 @@
+"""Pydantic schema definitions for document API payloads."""
+
+from datetime import datetime
+from uuid import UUID
+
+from pydantic import BaseModel, Field
+
+from app.models.document import DocumentStatus
+
+
+class DocumentResponse(BaseModel):
+    """Represents a document record returned by API endpoints."""
+
+    id: UUID
+    original_filename: str
+    source_relative_path: str
+    mime_type: str
+    extension: str
+    size_bytes: int
+    sha256: str
+    logical_path: str
+    suggested_path: str | None
+    image_text_type: str | None
+    handwriting_style_id: str | None
+    tags: list[str] = Field(default_factory=list)
+    suggested_tags: list[str] = Field(default_factory=list)
+    status: DocumentStatus
+    preview_available: bool
+    is_archive_member: bool
+    archived_member_path: str | None
+    parent_document_id: UUID | None
+    replaces_document_id: UUID | None
+    created_at: datetime
+    processed_at: datetime | None
+
+    class Config:
+        """Enables ORM object parsing for SQLAlchemy model instances."""
+
+        from_attributes = True
+
+
+class DocumentDetailResponse(DocumentResponse):
+    """Represents a full document payload including extracted text content."""
+
+    extracted_text: str
+    metadata_json: dict
+
+
+class DocumentsListResponse(BaseModel):
+    """Represents a paginated document list response payload."""
+
+    total: int
+    items: list[DocumentResponse]
+
+
+class UploadConflict(BaseModel):
+    """Describes an upload conflict where a matching checksum already exists."""
+
+    original_filename: str
+    sha256: str
+    existing_document_id: UUID
+
+
+class UploadResponse(BaseModel):
+    """Represents the result of a batch upload request."""
+
+    uploaded: list[DocumentResponse] = Field(default_factory=list)
+    conflicts: list[UploadConflict] = Field(default_factory=list)
+
+
+class DocumentUpdateRequest(BaseModel):
+    """Captures document metadata changes."""
+
+    original_filename: str | None = None
+    logical_path: str | None = None
+    tags: list[str] | None = None
+
+
+class SearchResponse(BaseModel):
+    """Represents the result of a search query."""
+
+    total: int
+    items: list[DocumentResponse]
+
+
+class ContentExportRequest(BaseModel):
+    """Describes filters used to export extracted document contents as Markdown files."""
+
+    document_ids: list[UUID] = Field(default_factory=list)
+    path_prefix: str | None = None
+    include_trashed: bool = False
+    only_trashed: bool = False
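Annotation: `UploadConflict` reports an upload whose SHA-256 checksum matches an existing document, and `UploadResponse` separates those from accepted files. The following is a minimal sketch of how such a split could be computed with the standard library only. The helper name `split_uploads` and the in-memory `known` mapping are hypothetical stand-ins for the project's upload service and database lookup, which are not shown in this part of the commit.

```python
import hashlib
from uuid import UUID, uuid4


def split_uploads(
    files: dict[str, bytes],
    known: dict[str, UUID],
) -> tuple[list[dict], list[dict]]:
    """Splits uploads into accepted files and checksum conflicts.

    `known` maps an existing sha256 hex digest to a document id. It stands in
    for a database query and is an assumption of this sketch.
    """
    uploaded: list[dict] = []
    conflicts: list[dict] = []
    for filename, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in known:
            # Same field names as the UploadConflict schema.
            conflicts.append(
                {
                    "original_filename": filename,
                    "sha256": digest,
                    "existing_document_id": known[digest],
                }
            )
        else:
            uploaded.append({"original_filename": filename, "sha256": digest})
    return uploaded, conflicts


existing_id = uuid4()
known = {hashlib.sha256(b"hello").hexdigest(): existing_id}
uploaded, conflicts = split_uploads({"a.txt": b"hello", "b.txt": b"world"}, known)
```

Here `a.txt` ends up in `conflicts` because its content hashes to a known digest, while `b.txt` is accepted.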
@@ -0,0 +1,35 @@
+"""Pydantic schemas for processing pipeline log API payloads."""
+
+from datetime import datetime
+from uuid import UUID
+
+from pydantic import BaseModel, Field
+
+
+class ProcessingLogEntryResponse(BaseModel):
+    """Represents one persisted processing log event returned by API endpoints."""
+
+    id: int
+    created_at: datetime
+    level: str
+    stage: str
+    event: str
+    document_id: UUID | None
+    document_filename: str | None
+    provider_id: str | None
+    model_name: str | None
+    prompt_text: str | None
+    response_text: str | None
+    payload_json: dict
+
+    class Config:
+        """Enables ORM object parsing for SQLAlchemy model instances."""
+
+        from_attributes = True
+
+
+class ProcessingLogListResponse(BaseModel):
+    """Represents a paginated collection of processing log records."""
+
+    total: int
+    items: list[ProcessingLogEntryResponse] = Field(default_factory=list)
@@ -0,0 +1,242 @@
+"""Pydantic schemas for application-level runtime settings."""
+
+from pydantic import BaseModel, Field
+
+
+class ProviderSettingsResponse(BaseModel):
+    """Represents a persisted model provider with non-secret connection metadata."""
+
+    id: str
+    label: str
+    provider_type: str = "openai_compatible"
+    base_url: str
+    timeout_seconds: int
+    api_key_set: bool
+    api_key_masked: str = ""
+
+
+class ProviderSettingsUpdateRequest(BaseModel):
+    """Represents a model provider create-or-update request."""
+
+    id: str
+    label: str
+    provider_type: str = "openai_compatible"
+    base_url: str
+    timeout_seconds: int = Field(default=45, ge=5, le=180)
+    api_key: str | None = None
+    clear_api_key: bool = False
+
+
+class OcrTaskSettingsResponse(BaseModel):
+    """Represents OCR task runtime settings and prompt configuration."""
+
+    enabled: bool
+    provider_id: str
+    model: str
+    prompt: str
+
+
+class OcrTaskSettingsUpdateRequest(BaseModel):
+    """Represents OCR task settings updates."""
+
+    enabled: bool | None = None
+    provider_id: str | None = None
+    model: str | None = None
+    prompt: str | None = None
+
+
+class SummaryTaskSettingsResponse(BaseModel):
+    """Represents summarization task runtime settings."""
+
+    enabled: bool
+    provider_id: str
+    model: str
+    prompt: str
+    max_input_tokens: int
+
+
+class SummaryTaskSettingsUpdateRequest(BaseModel):
+    """Represents summarization task settings updates."""
+
+    enabled: bool | None = None
+    provider_id: str | None = None
+    model: str | None = None
+    prompt: str | None = None
+    max_input_tokens: int | None = Field(default=None, ge=512, le=64000)
+
+
+class RoutingTaskSettingsResponse(BaseModel):
+    """Represents routing task runtime settings for path and tag classification."""
+
+    enabled: bool
+    provider_id: str
+    model: str
+    prompt: str
+    neighbor_count: int
+    neighbor_min_similarity: float
+    auto_apply_confidence_threshold: float
+    auto_apply_neighbor_similarity_threshold: float
+    neighbor_path_override_enabled: bool
+    neighbor_path_override_min_similarity: float
+    neighbor_path_override_min_gap: float
+    neighbor_path_override_max_confidence: float
+
+
+class RoutingTaskSettingsUpdateRequest(BaseModel):
+    """Represents routing task settings updates."""
+
+    enabled: bool | None = None
+    provider_id: str | None = None
+    model: str | None = None
+    prompt: str | None = None
+    neighbor_count: int | None = Field(default=None, ge=1, le=40)
+    neighbor_min_similarity: float | None = Field(default=None, ge=0.0, le=1.0)
+    auto_apply_confidence_threshold: float | None = Field(default=None, ge=0.0, le=1.0)
+    auto_apply_neighbor_similarity_threshold: float | None = Field(default=None, ge=0.0, le=1.0)
+    neighbor_path_override_enabled: bool | None = None
+    neighbor_path_override_min_similarity: float | None = Field(default=None, ge=0.0, le=1.0)
+    neighbor_path_override_min_gap: float | None = Field(default=None, ge=0.0, le=1.0)
+    neighbor_path_override_max_confidence: float | None = Field(default=None, ge=0.0, le=1.0)
+
+
+class UploadDefaultsResponse(BaseModel):
+    """Represents default upload destination and default tags."""
+
+    logical_path: str
+    tags: list[str] = Field(default_factory=list)
+
+
+class UploadDefaultsUpdateRequest(BaseModel):
+    """Represents updates for default upload destination and default tags."""
+
+    logical_path: str | None = None
+    tags: list[str] | None = None
+
+
+class DisplaySettingsResponse(BaseModel):
+    """Represents document-list display preferences."""
+
+    cards_per_page: int = Field(default=12, ge=1, le=200)
+    log_typing_animation_enabled: bool = True
+
+
+class DisplaySettingsUpdateRequest(BaseModel):
+    """Represents updates for document-list display preferences."""
+
+    cards_per_page: int | None = Field(default=None, ge=1, le=200)
+    log_typing_animation_enabled: bool | None = None
+
+
+class PredefinedPathEntryResponse(BaseModel):
+    """Represents one predefined logical path with global discoverability scope."""
+
+    value: str
+    global_shared: bool
+
+
+class PredefinedPathEntryUpdateRequest(BaseModel):
+    """Represents one predefined logical path create-or-update request."""
+
+    value: str
+    global_shared: bool = False
+
+
+class PredefinedTagEntryResponse(BaseModel):
+    """Represents one predefined tag with global discoverability scope."""
+
+    value: str
+    global_shared: bool
+
+
+class PredefinedTagEntryUpdateRequest(BaseModel):
+    """Represents one predefined tag create-or-update request."""
+
+    value: str
+    global_shared: bool = False
+
+
+class HandwritingStyleSettingsResponse(BaseModel):
+    """Represents handwriting-style clustering settings used by Typesense image embeddings."""
+
+    enabled: bool
+    embed_model: str
+    neighbor_limit: int
+    match_min_similarity: float
+    bootstrap_match_min_similarity: float
+    bootstrap_sample_size: int
+    image_max_side: int
+
+
+class HandwritingStyleSettingsUpdateRequest(BaseModel):
+    """Represents updates for handwriting-style clustering and match thresholds."""
+
+    enabled: bool | None = None
+    embed_model: str | None = None
+    neighbor_limit: int | None = Field(default=None, ge=1, le=32)
+    match_min_similarity: float | None = Field(default=None, ge=0.0, le=1.0)
+    bootstrap_match_min_similarity: float | None = Field(default=None, ge=0.0, le=1.0)
+    bootstrap_sample_size: int | None = Field(default=None, ge=1, le=30)
+    image_max_side: int | None = Field(default=None, ge=256, le=4096)
+
+
+class TaskSettingsResponse(BaseModel):
+    """Represents all task-level model bindings and prompt settings."""
+
+    ocr_handwriting: OcrTaskSettingsResponse
+    summary_generation: SummaryTaskSettingsResponse
+    routing_classification: RoutingTaskSettingsResponse
+
+
+class TaskSettingsUpdateRequest(BaseModel):
+    """Represents partial updates for task-level settings."""
+
+    ocr_handwriting: OcrTaskSettingsUpdateRequest | None = None
+    summary_generation: SummaryTaskSettingsUpdateRequest | None = None
+    routing_classification: RoutingTaskSettingsUpdateRequest | None = None
+
+
+class AppSettingsResponse(BaseModel):
+    """Represents all application settings exposed by the API."""
+
+    upload_defaults: UploadDefaultsResponse
+    display: DisplaySettingsResponse
+    handwriting_style_clustering: HandwritingStyleSettingsResponse
+    predefined_paths: list[PredefinedPathEntryResponse] = Field(default_factory=list)
+    predefined_tags: list[PredefinedTagEntryResponse] = Field(default_factory=list)
+    providers: list[ProviderSettingsResponse]
+    tasks: TaskSettingsResponse
+
+
+class AppSettingsUpdateRequest(BaseModel):
+    """Represents a settings update request; every top-level section is optional."""
+
+    upload_defaults: UploadDefaultsUpdateRequest | None = None
+    display: DisplaySettingsUpdateRequest | None = None
+    handwriting_style_clustering: HandwritingStyleSettingsUpdateRequest | None = None
+    predefined_paths: list[PredefinedPathEntryUpdateRequest] | None = None
+    predefined_tags: list[PredefinedTagEntryUpdateRequest] | None = None
+    providers: list[ProviderSettingsUpdateRequest] | None = None
+    tasks: TaskSettingsUpdateRequest | None = None
+
+
+class HandwritingSettingsResponse(BaseModel):
+    """Represents legacy handwriting response shape kept for backward compatibility."""
+
+    provider: str = "openai_compatible"
+    enabled: bool
+    openai_base_url: str
+    openai_model: str
+    openai_timeout_seconds: int
+    openai_api_key_set: bool
+    openai_api_key_masked: str = ""
+
+
+class HandwritingSettingsUpdateRequest(BaseModel):
+    """Represents legacy handwriting update shape kept for backward compatibility."""
+
+    enabled: bool | None = None
+    openai_base_url: str | None = None
+    openai_model: str | None = None
+    openai_timeout_seconds: int | None = Field(default=None, ge=5, le=180)
+    openai_api_key: str | None = None
+    clear_openai_api_key: bool = False
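Annotation: most fields on the `*UpdateRequest` models in this file default to `None`. One common way to apply such a request is to treat `None` as "not provided" and merge only the remaining keys into the stored settings. Whether the settings service merges exactly this way is an assumption, and `merge_partial` is a hypothetical helper working on plain dictionaries rather than the Pydantic models.

```python
from typing import Any


def merge_partial(current: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Returns a copy of `current` with the non-None values from `update` applied.

    Nested dictionaries are merged recursively; a None value means
    "leave the stored value unchanged".
    """
    merged = dict(current)
    for key, value in update.items():
        if value is None:
            continue
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = value
    return merged


current = {"display": {"cards_per_page": 12, "log_typing_animation_enabled": True}}
update = {"display": {"cards_per_page": 24, "log_typing_animation_enabled": None}}
result = merge_partial(current, update)
```

Only `cards_per_page` changes in `result`; the animation flag keeps its stored value and `current` itself is not mutated.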
@@ -0,0 +1 @@
+"""Domain services package for storage, extraction, and classification logic."""
@@ -0,0 +1,885 @@
+"""Persistent single-user application settings service backed by host-mounted storage."""
+
+import json
+import re
+from pathlib import Path
+from typing import Any
+
+from app.core.config import get_settings
+
+
+settings = get_settings()
+
+
+TASK_OCR_HANDWRITING = "ocr_handwriting"
+TASK_SUMMARY_GENERATION = "summary_generation"
+TASK_ROUTING_CLASSIFICATION = "routing_classification"
+HANDWRITING_STYLE_SETTINGS_KEY = "handwriting_style_clustering"
+PREDEFINED_PATHS_SETTINGS_KEY = "predefined_paths"
+PREDEFINED_TAGS_SETTINGS_KEY = "predefined_tags"
+DEFAULT_HANDWRITING_STYLE_EMBED_MODEL = "ts/clip-vit-b-p32"
+
+
+DEFAULT_OCR_PROMPT = (
+    "You are an expert at reading messy handwritten notes, including hard-to-read writing.\n"
+    "Task: transcribe the handwriting as exactly as possible.\n\n"
+    "Rules:\n"
+    "- Output ONLY the transcription in German, no commentary.\n"
+    "- Preserve original line breaks where they clearly exist.\n"
+    "- Do NOT translate or correct grammar or spelling.\n"
+    "- If a word or character is unclear, wrap your best guess in [[? ... ?]].\n"
+    "- If something is unreadable, write [[?unleserlich?]] in its place."
+)
+
+DEFAULT_SUMMARY_PROMPT = (
+    "You summarize documents for indexing and routing.\n"
+    "Return concise markdown with key entities, purpose, and document category hints.\n"
+    "Do not invent facts and do not include any explanation outside the summary."
+)
+
+DEFAULT_ROUTING_PROMPT = (
+    "You classify one document into an existing logical path and tags.\n"
+    "Prefer existing paths and tags when possible.\n"
+    "If the evidence is weak, keep chosen_path as null and use suggestions instead.\n"
+    "Return JSON only with this exact shape:\n"
+    "{\n"
+    "  \"chosen_path\": string | null,\n"
+    "  \"chosen_tags\": string[],\n"
+    "  \"suggested_new_paths\": string[],\n"
+    "  \"suggested_new_tags\": string[],\n"
+    "  \"confidence\": number\n"
+    "}\n"
+    "Confidence must be between 0 and 1."
+)
+
+
+def _default_settings() -> dict[str, Any]:
+    """Builds default settings including providers and model task bindings."""
+
+    return {
+        "upload_defaults": {
+            "logical_path": "Inbox",
+            "tags": [],
+        },
+        "display": {
+            "cards_per_page": 12,
+            "log_typing_animation_enabled": True,
+        },
+        PREDEFINED_PATHS_SETTINGS_KEY: [],
+        PREDEFINED_TAGS_SETTINGS_KEY: [],
+        HANDWRITING_STYLE_SETTINGS_KEY: {
+            "enabled": True,
+            "embed_model": DEFAULT_HANDWRITING_STYLE_EMBED_MODEL,
+            "neighbor_limit": 8,
+            "match_min_similarity": 0.86,
+            "bootstrap_match_min_similarity": 0.89,
+            "bootstrap_sample_size": 3,
+            "image_max_side": 1024,
+        },
+        "providers": [
+            {
+                "id": "openai-default",
+                "label": "OpenAI Default",
+                "provider_type": "openai_compatible",
+                "base_url": settings.default_openai_base_url,
+                "timeout_seconds": settings.default_openai_timeout_seconds,
+                "api_key": settings.default_openai_api_key,
+            }
+        ],
+        "tasks": {
+            TASK_OCR_HANDWRITING: {
+                "enabled": settings.default_openai_handwriting_enabled,
+                "provider_id": "openai-default",
+                "model": settings.default_openai_model,
+                "prompt": DEFAULT_OCR_PROMPT,
+            },
+            TASK_SUMMARY_GENERATION: {
+                "enabled": True,
+                "provider_id": "openai-default",
+                "model": settings.default_summary_model,
+                "prompt": DEFAULT_SUMMARY_PROMPT,
+                "max_input_tokens": 8000,
+            },
+            TASK_ROUTING_CLASSIFICATION: {
+                "enabled": True,
+                "provider_id": "openai-default",
+                "model": settings.default_routing_model,
+                "prompt": DEFAULT_ROUTING_PROMPT,
+                "neighbor_count": 8,
+                "neighbor_min_similarity": 0.84,
+                "auto_apply_confidence_threshold": 0.78,
+                "auto_apply_neighbor_similarity_threshold": 0.55,
+                "neighbor_path_override_enabled": True,
+                "neighbor_path_override_min_similarity": 0.86,
+                "neighbor_path_override_min_gap": 0.04,
+                "neighbor_path_override_max_confidence": 0.9,
+            },
+        },
+    }
+
+
+def _settings_path() -> Path:
+    """Returns the absolute path of the persisted settings file."""
+
+    return settings.storage_root / "settings.json"
+
+
+def _clamp_timeout(value: int) -> int:
+    """Clamps timeout values to a safe and practical range."""
+
+    return max(5, min(180, value))
+
+
+def _clamp_input_tokens(value: int) -> int:
+    """Clamps per-request summary input token budget values to practical bounds."""
+
+    return max(512, min(64000, value))
+
+
+def _clamp_neighbor_count(value: int) -> int:
+    """Clamps nearest-neighbor lookup count for routing classification."""
+
+    return max(1, min(40, value))
+
+
+def _clamp_cards_per_page(value: int) -> int:
+    """Clamps dashboard cards-per-page display setting to practical bounds."""
+
+    return max(1, min(200, value))
+
+
+def _clamp_predefined_entries_limit(value: int) -> int:
+    """Clamps maximum count for predefined tag/path catalog entries."""
+
+    return max(1, min(2000, value))
+
+
+def _clamp_handwriting_style_neighbor_limit(value: int) -> int:
+    """Clamps handwriting-style nearest-neighbor count used for style matching."""
+
+    return max(1, min(32, value))
+
+
+def _clamp_handwriting_style_sample_size(value: int) -> int:
+    """Clamps handwriting-style bootstrap sample size used for stricter matching."""
+
+    return max(1, min(30, value))
+
+
+def _clamp_handwriting_style_image_max_side(value: int) -> int:
+    """Clamps handwriting-style image normalization max-side pixel size."""
+
+    return max(256, min(4096, value))
+
+
+def _clamp_probability(value: float, fallback: float) -> float:
+    """Clamps probability-like numbers to the range [0, 1]."""
+
+    try:
+        parsed = float(value)
+    except (TypeError, ValueError):
+        return fallback
+    return max(0.0, min(1.0, parsed))
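Annotation: although `_clamp_probability` is annotated with `value: float`, its `try`/`except` means it also tolerates strings and `None`. The standalone copy below (renamed `clamp_probability` so it can run outside the module) illustrates the resulting behaviour for a few inputs.

```python
from typing import Any


def clamp_probability(value: Any, fallback: float) -> float:
    """Standalone copy of the `_clamp_probability` logic for experimentation."""
    try:
        parsed = float(value)
    except (TypeError, ValueError):
        # Non-numeric input: use the caller-supplied fallback unchanged.
        return fallback
    return max(0.0, min(1.0, parsed))


examples = {
    "in range": clamp_probability(0.42, 0.5),
    "numeric string": clamp_probability("0.9", 0.5),
    "too large": clamp_probability(1.7, 0.5),
    "negative": clamp_probability(-3, 0.5),
    "not numeric": clamp_probability("high", 0.5),
    "missing": clamp_probability(None, 0.5),
}
```

Out-of-range numbers are pulled to the nearest bound, while unparseable values return the fallback as given (the fallback itself is not clamped).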
+
+
+def _safe_int(value: Any, fallback: int) -> int:
+    """Safely converts arbitrary values to integers with fallback handling."""
+
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        return fallback
+
+
+def _normalize_provider_id(value: str | None, fallback: str) -> str:
+    """Normalizes provider identifiers into stable lowercase slug values."""
+
+    candidate = (value or "").strip().lower()
+    candidate = re.sub(r"[^a-z0-9_-]+", "-", candidate).strip("-")
+    return candidate or fallback
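Annotation: the slug rule in `_normalize_provider_id` lowercases the input, collapses every run of characters outside `a-z0-9_-` into a single hyphen, and trims leading and trailing hyphens. A standalone copy (renamed `normalize_provider_id` for this sketch) shows what that does to a few sample identifiers; the sample values are made up.

```python
import re
from typing import Optional


def normalize_provider_id(value: Optional[str], fallback: str) -> str:
    """Standalone copy of the `_normalize_provider_id` logic."""
    candidate = (value or "").strip().lower()
    # Each run of disallowed characters becomes one hyphen.
    candidate = re.sub(r"[^a-z0-9_-]+", "-", candidate).strip("-")
    return candidate or fallback


slugs = [
    normalize_provider_id("  My Provider #1 ", "provider-1"),
    normalize_provider_id("OpenAI / Azure", "provider-1"),
    normalize_provider_id("Local_LLM", "provider-1"),
    normalize_provider_id("###", "provider-1"),
    normalize_provider_id(None, "provider-1"),
]
```

Inputs that contain no allowed characters at all, such as `"###"`, collapse to an empty string and therefore fall back to the supplied default.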
+
+
+def _mask_api_key(value: str) -> str:
+    """Masks a secret API key while retaining enough characters for identification."""
+
+    if not value:
+        return ""
+    if len(value) <= 6:
+        return "*" * len(value)
+    return f"{value[:4]}...{value[-2:]}"
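Annotation: a standalone copy of the masking rule (renamed `mask_api_key` for this sketch, with made-up key values) makes the three branches of `_mask_api_key` easy to check.

```python
def mask_api_key(value: str) -> str:
    """Standalone copy of the `_mask_api_key` logic."""
    if not value:
        return ""
    if len(value) <= 6:
        # Short keys are hidden completely.
        return "*" * len(value)
    # Longer keys keep the first four and last two characters.
    return f"{value[:4]}...{value[-2:]}"


masked_long = mask_api_key("sk-test-1234567890")
masked_short = mask_api_key("abc123")
masked_empty = mask_api_key("")
masked_seven = mask_api_key("abcdefg")
```

One consequence of the fixed 4 + 2 window is that a seven-character key is masked as `abcd...fg`, which hides only a single character. Real provider keys are normally far longer, so this mostly matters for test values.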
+
+
+def _normalize_provider(
+    payload: dict[str, Any],
+    fallback_id: str,
+    fallback_values: dict[str, Any],
+) -> dict[str, Any]:
+    """Normalizes one provider payload to a stable shape with bounds and defaults."""
+
+    defaults = _default_settings()["providers"][0]
+    provider_id = _normalize_provider_id(str(payload.get("id") or fallback_id), fallback_id)
+    provider_type = str(payload.get("provider_type", fallback_values.get("provider_type", defaults["provider_type"]))).strip()
+    if provider_type != "openai_compatible":
+        provider_type = "openai_compatible"
+
+    api_key_value = payload.get("api_key", fallback_values.get("api_key", defaults["api_key"]))
+    api_key = str(api_key_value).strip() if api_key_value is not None else ""
+
+    return {
+        "id": provider_id,
+        "label": str(payload.get("label", fallback_values.get("label", provider_id))).strip() or provider_id,
+        "provider_type": provider_type,
+        "base_url": str(payload.get("base_url", fallback_values.get("base_url", defaults["base_url"]))).strip()
+        or defaults["base_url"],
+        "timeout_seconds": _clamp_timeout(
+            _safe_int(
+                payload.get("timeout_seconds", fallback_values.get("timeout_seconds", defaults["timeout_seconds"])),
+                defaults["timeout_seconds"],
+            )
+        ),
+        "api_key": api_key,
+    }
+
+
+def _normalize_ocr_task(payload: dict[str, Any], provider_ids: list[str]) -> dict[str, Any]:
+    """Normalizes OCR task settings while enforcing valid provider references."""
+
+    defaults = _default_settings()["tasks"][TASK_OCR_HANDWRITING]
+    provider_id = str(payload.get("provider_id", defaults["provider_id"])).strip()
+    if provider_id not in provider_ids:
+        provider_id = provider_ids[0]
+
+    return {
+        "enabled": bool(payload.get("enabled", defaults["enabled"])),
+        "provider_id": provider_id,
+        "model": str(payload.get("model", defaults["model"])).strip() or defaults["model"],
+        "prompt": str(payload.get("prompt", defaults["prompt"])).strip() or defaults["prompt"],
+    }
+
+
+def _normalize_summary_task(payload: dict[str, Any], provider_ids: list[str]) -> dict[str, Any]:
+    """Normalizes summary task settings while enforcing valid provider references."""
+
+    defaults = _default_settings()["tasks"][TASK_SUMMARY_GENERATION]
+    provider_id = str(payload.get("provider_id", defaults["provider_id"])).strip()
+    if provider_id not in provider_ids:
+        provider_id = provider_ids[0]
+
+    raw_max_tokens = payload.get("max_input_tokens")
+    if raw_max_tokens is None:
+        legacy_chars = _safe_int(payload.get("max_source_chars", 0), 0)
+        if legacy_chars > 0:
+            raw_max_tokens = max(512, legacy_chars // 4)
+        else:
+            raw_max_tokens = defaults["max_input_tokens"]
+
+    return {
+        "enabled": bool(payload.get("enabled", defaults["enabled"])),
+        "provider_id": provider_id,
+        "model": str(payload.get("model", defaults["model"])).strip() or defaults["model"],
+        "prompt": str(payload.get("prompt", defaults["prompt"])).strip() or defaults["prompt"],
+        "max_input_tokens": _clamp_input_tokens(
+            _safe_int(raw_max_tokens, defaults["max_input_tokens"])
+        ),
+    }
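Annotation: `_normalize_summary_task` migrates the legacy `max_source_chars` setting by dividing it by four (roughly four characters per token), with a floor of 512, and then clamps the result to 512..64000. The sketch below isolates that resolution step. `resolve_max_input_tokens` is a hypothetical name, and the default of 8000 mirrors `_default_settings()`.

```python
def resolve_max_input_tokens(payload: dict, default_tokens: int = 8000) -> int:
    """Resolves the summary token budget, honouring the legacy character limit."""
    raw = payload.get("max_input_tokens")
    if raw is None:
        try:
            legacy_chars = int(payload.get("max_source_chars", 0))
        except (TypeError, ValueError):
            legacy_chars = 0
        # Legacy value present: convert characters to an approximate token count.
        raw = max(512, legacy_chars // 4) if legacy_chars > 0 else default_tokens
    try:
        tokens = int(raw)
    except (TypeError, ValueError):
        tokens = default_tokens
    return max(512, min(64000, tokens))


budgets = [
    resolve_max_input_tokens({}),
    resolve_max_input_tokens({"max_source_chars": 20000}),
    resolve_max_input_tokens({"max_source_chars": 1000}),
    resolve_max_input_tokens({"max_source_chars": 400000}),
    resolve_max_input_tokens({"max_input_tokens": 2048, "max_source_chars": 400000}),
]
```

An explicit `max_input_tokens` always wins over the legacy key, as the last case shows.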
+
+
+def _normalize_routing_task(payload: dict[str, Any], provider_ids: list[str]) -> dict[str, Any]:
+    """Normalizes routing task settings while enforcing valid provider references."""
+
+    defaults = _default_settings()["tasks"][TASK_ROUTING_CLASSIFICATION]
+    provider_id = str(payload.get("provider_id", defaults["provider_id"])).strip()
+    if provider_id not in provider_ids:
+        provider_id = provider_ids[0]
+
+    return {
+        "enabled": bool(payload.get("enabled", defaults["enabled"])),
+        "provider_id": provider_id,
+        "model": str(payload.get("model", defaults["model"])).strip() or defaults["model"],
+        "prompt": str(payload.get("prompt", defaults["prompt"])).strip() or defaults["prompt"],
+        "neighbor_count": _clamp_neighbor_count(
+            _safe_int(payload.get("neighbor_count", defaults["neighbor_count"]), defaults["neighbor_count"])
+        ),
+        "neighbor_min_similarity": _clamp_probability(
+            payload.get("neighbor_min_similarity", defaults["neighbor_min_similarity"]),
+            defaults["neighbor_min_similarity"],
+        ),
+        "auto_apply_confidence_threshold": _clamp_probability(
+            payload.get("auto_apply_confidence_threshold", defaults["auto_apply_confidence_threshold"]),
+            defaults["auto_apply_confidence_threshold"],
+        ),
+        "auto_apply_neighbor_similarity_threshold": _clamp_probability(
+            payload.get(
+                "auto_apply_neighbor_similarity_threshold",
+                defaults["auto_apply_neighbor_similarity_threshold"],
+            ),
+            defaults["auto_apply_neighbor_similarity_threshold"],
+        ),
+        "neighbor_path_override_enabled": bool(
+            payload.get("neighbor_path_override_enabled", defaults["neighbor_path_override_enabled"])
+        ),
+        "neighbor_path_override_min_similarity": _clamp_probability(
+            payload.get(
+                "neighbor_path_override_min_similarity",
+                defaults["neighbor_path_override_min_similarity"],
+            ),
+            defaults["neighbor_path_override_min_similarity"],
+        ),
+        "neighbor_path_override_min_gap": _clamp_probability(
+            payload.get("neighbor_path_override_min_gap", defaults["neighbor_path_override_min_gap"]),
+            defaults["neighbor_path_override_min_gap"],
+        ),
+        "neighbor_path_override_max_confidence": _clamp_probability(
+            payload.get(
+                "neighbor_path_override_max_confidence",
+                defaults["neighbor_path_override_max_confidence"],
+            ),
+            defaults["neighbor_path_override_max_confidence"],
+        ),
+    }
+
+
+def _normalize_tasks(payload: dict[str, Any], provider_ids: list[str]) -> dict[str, Any]:
+    """Normalizes task settings map for OCR, summarization, and routing tasks."""
+
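+    if not isinstance(payload, dict):
+        payload = {}
+
+    def _task_payload(key: str) -> dict[str, Any]:
+        """Returns the stored payload for one task, or an empty dict when it is missing or malformed."""
+        value = payload.get(key)
+        return value if isinstance(value, dict) else {}
+
+    return {
+        TASK_OCR_HANDWRITING: _normalize_ocr_task(_task_payload(TASK_OCR_HANDWRITING), provider_ids),
+        TASK_SUMMARY_GENERATION: _normalize_summary_task(_task_payload(TASK_SUMMARY_GENERATION), provider_ids),
+        TASK_ROUTING_CLASSIFICATION: _normalize_routing_task(_task_payload(TASK_ROUTING_CLASSIFICATION), provider_ids),
+    }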
+
+
+def _normalize_upload_defaults(payload: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
+    """Normalizes upload default destination path and tags."""
+
+    if not isinstance(payload, dict):
+        payload = {}
+
+    default_path = str(defaults.get("logical_path", "Inbox")).strip() or "Inbox"
+    raw_path = str(payload.get("logical_path", default_path)).strip()
+    logical_path = raw_path or default_path
+
+    raw_tags = payload.get("tags", defaults.get("tags", []))
+    tags: list[str] = []
+    seen_lowered: set[str] = set()
+    if isinstance(raw_tags, list):
+        for raw_tag in raw_tags:
+            normalized = str(raw_tag).strip()
+            if not normalized:
+                continue
+            lowered = normalized.lower()
+            if lowered in seen_lowered:
+                continue
+            seen_lowered.add(lowered)
+            tags.append(normalized)
+            if len(tags) >= 50:
+                break
+
+    return {
+        "logical_path": logical_path,
+        "tags": tags,
+    }
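Annotation: the tag loop in `_normalize_upload_defaults` trims each tag, drops empty values, removes case-insensitive duplicates while keeping the first spelling seen, and stops at 50 tags. A standalone sketch of just that loop follows; `normalize_tags` and the sample tags are illustrative names, not part of the commit.

```python
def normalize_tags(raw_tags: object, limit: int = 50) -> list[str]:
    """Deduplicates tags case-insensitively, preserving first-seen spelling and order."""
    tags: list[str] = []
    seen: set[str] = set()
    if not isinstance(raw_tags, list):
        # Anything other than a list yields no tags at all.
        return tags
    for raw in raw_tags:
        tag = str(raw).strip()
        if not tag or tag.lower() in seen:
            continue
        seen.add(tag.lower())
        tags.append(tag)
        if len(tags) >= limit:
            break
    return tags


cleaned = normalize_tags([" Invoice", "invoice", "", "Tax ", 2024])
not_a_list = normalize_tags("invoice")
capped = normalize_tags([f"t{i}" for i in range(80)])
```

Non-string items such as `2024` are converted with `str()` rather than rejected, which matches the original loop.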
+
+
+def _normalize_display_settings(payload: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
+    """Normalizes display settings used by the document dashboard UI."""
+
+    if not isinstance(payload, dict):
+        payload = {}
+
+    default_cards_per_page = _safe_int(defaults.get("cards_per_page", 12), 12)
+    cards_per_page = _clamp_cards_per_page(
+        _safe_int(payload.get("cards_per_page", default_cards_per_page), default_cards_per_page)
+    )
+    return {
+        "cards_per_page": cards_per_page,
+        "log_typing_animation_enabled": bool(
+            payload.get("log_typing_animation_enabled", defaults.get("log_typing_animation_enabled", True))
+        ),
+    }
+
+
+def _normalize_predefined_paths(
+    payload: Any,
+    existing_items: list[dict[str, Any]] | None = None,
+) -> list[dict[str, Any]]:
+    """Normalizes predefined path entries; an entry that is already globally shared stays shared."""
+
+    existing_map: dict[str, dict[str, Any]] = {}
+    if isinstance(existing_items, list):
+        for item in existing_items:
+            if not isinstance(item, dict):
+                continue
+            value = str(item.get("value", "")).strip().strip("/")
+            if not value:
+                continue
+            existing_map[value.lower()] = {
+                "value": value,
+                "global_shared": bool(item.get("global_shared", False)),
+            }
+
+    if not isinstance(payload, list):
+        return list(existing_map.values())
+
+    normalized: list[dict[str, Any]] = []
+    seen: set[str] = set()
+    limit = _clamp_predefined_entries_limit(len(payload))
+    for item in payload:
+        if not isinstance(item, dict):
+            continue
+        value = str(item.get("value", "")).strip().strip("/")
+        if not value:
+            continue
+        lowered = value.lower()
+        if lowered in seen:
+            continue
+        seen.add(lowered)
+        existing = existing_map.get(lowered)
+        requested_global = bool(item.get("global_shared", False))
+        global_shared = bool(existing.get("global_shared", False) if existing else False) or requested_global
+        normalized.append(
+            {
+                "value": value,
+                "global_shared": global_shared,
+            }
+        )
+        if len(normalized) >= limit:
+            break
+    return normalized
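Annotation: the key behaviour of `_normalize_predefined_paths` is that `global_shared` is sticky. Once a stored entry has the flag, an update cannot clear it, and entries are matched case-insensitively after surrounding slashes are stripped. The condensed sketch below reproduces only that merge. `merge_global_flags` is a hypothetical name, the sample paths are invented, and the type checks and entry limit of the original are left out for brevity.

```python
def merge_global_flags(payload: list[dict], existing: list[dict]) -> list[dict]:
    """Builds the new entry list from `payload`, keeping flags already set in `existing`."""
    existing_flags = {
        str(item.get("value", "")).strip().strip("/").lower(): bool(item.get("global_shared", False))
        for item in existing
    }
    result: list[dict] = []
    seen: set[str] = set()
    for item in payload:
        value = str(item.get("value", "")).strip().strip("/")
        key = value.lower()
        if not value or key in seen:
            continue
        seen.add(key)
        requested = bool(item.get("global_shared", False))
        # A stored True wins over a requested False.
        result.append({"value": value, "global_shared": existing_flags.get(key, False) or requested})
    return result


existing = [{"value": "Finance/Invoices", "global_shared": True}]
payload = [
    {"value": "/finance/invoices/", "global_shared": False},
    {"value": "Private", "global_shared": False},
    {"value": "FINANCE/INVOICES"},
]
merged = merge_global_flags(payload, existing)
```

The result takes its spelling from the payload (`finance/invoices`), and a stored entry that is absent from the payload is dropped rather than carried over, because the list is rebuilt from the payload alone.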
+
+
+def _normalize_predefined_tags(
+    payload: Any,
+    existing_items: list[dict[str, Any]] | None = None,
+) -> list[dict[str, Any]]:
+    """Normalizes predefined tag entries; an entry that is already globally shared stays shared."""
+
+    existing_map: dict[str, dict[str, Any]] = {}
+    if isinstance(existing_items, list):
+        for item in existing_items:
+            if not isinstance(item, dict):
+                continue
+            value = str(item.get("value", "")).strip()
+            if not value:
+                continue
+            existing_map[value.lower()] = {
+                "value": value,
+                "global_shared": bool(item.get("global_shared", False)),
+            }
+
+    if not isinstance(payload, list):
+        return list(existing_map.values())
+
+    normalized: list[dict[str, Any]] = []
+    seen: set[str] = set()
+    limit = _clamp_predefined_entries_limit(len(payload))
+    for item in payload:
+        if not isinstance(item, dict):
+            continue
+        value = str(item.get("value", "")).strip()
+        if not value:
+            continue
+        lowered = value.lower()
+        if lowered in seen:
+            continue
+        seen.add(lowered)
+        existing = existing_map.get(lowered)
+        requested_global = bool(item.get("global_shared", False))
+        global_shared = bool(existing.get("global_shared", False) if existing else False) or requested_global
+        normalized.append(
+            {
+                "value": value,
+                "global_shared": global_shared,
+            }
+        )
+        if len(normalized) >= limit:
+            break
+    return normalized
+
+
+def _normalize_handwriting_style_settings(payload: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
+    """Normalizes handwriting-style clustering settings exposed in the settings UI."""
+
+    if not isinstance(payload, dict):
+        payload = {}
+
+    default_enabled = bool(defaults.get("enabled", True))
+    default_embed_model = str(defaults.get("embed_model", DEFAULT_HANDWRITING_STYLE_EMBED_MODEL)).strip()
+    default_neighbor_limit = _safe_int(defaults.get("neighbor_limit", 8), 8)
+    default_match_min = _clamp_probability(defaults.get("match_min_similarity", 0.86), 0.86)
+    default_bootstrap_match_min = _clamp_probability(defaults.get("bootstrap_match_min_similarity", 0.89), 0.89)
+    default_bootstrap_sample_size = _safe_int(defaults.get("bootstrap_sample_size", 3), 3)
+    default_image_max_side = _safe_int(defaults.get("image_max_side", 1024), 1024)
+
+    return {
+        "enabled": bool(payload.get("enabled", default_enabled)),
+        "embed_model": str(payload.get("embed_model", default_embed_model)).strip() or default_embed_model,
+        "neighbor_limit": _clamp_handwriting_style_neighbor_limit(
+            _safe_int(payload.get("neighbor_limit", default_neighbor_limit), default_neighbor_limit)
+        ),
+        "match_min_similarity": _clamp_probability(
+            payload.get("match_min_similarity", default_match_min),
+            default_match_min,
+        ),
+        "bootstrap_match_min_similarity": _clamp_probability(
+            payload.get("bootstrap_match_min_similarity", default_bootstrap_match_min),
+            default_bootstrap_match_min,
+        ),
+        "bootstrap_sample_size": _clamp_handwriting_style_sample_size(
+            _safe_int(payload.get("bootstrap_sample_size", default_bootstrap_sample_size), default_bootstrap_sample_size)
+        ),
+        "image_max_side": _clamp_handwriting_style_image_max_side(
+            _safe_int(payload.get("image_max_side", default_image_max_side), default_image_max_side)
+        ),
+    }
+
+
+def _sanitize_settings(payload: dict[str, Any]) -> dict[str, Any]:
+    """Sanitizes all persisted settings into a stable normalized structure."""
+
+    if not isinstance(payload, dict):
+        payload = {}
+
+    defaults = _default_settings()
+
+    providers_payload = payload.get("providers")
+    normalized_providers: list[dict[str, Any]] = []
+    seen_provider_ids: set[str] = set()
+
+    if isinstance(providers_payload, list):
+        for index, provider_payload in enumerate(providers_payload):
+            if not isinstance(provider_payload, dict):
+                continue
+            fallback = defaults["providers"][0]
+            candidate = _normalize_provider(provider_payload, fallback_id=f"provider-{index + 1}", fallback_values=fallback)
+            if candidate["id"] in seen_provider_ids:
+                continue
+            seen_provider_ids.add(candidate["id"])
+            normalized_providers.append(candidate)
+
+    if not normalized_providers:
+        normalized_providers = [dict(defaults["providers"][0])]
+
+    provider_ids = [provider["id"] for provider in normalized_providers]
+    tasks_payload = payload.get("tasks", {})
+    normalized_tasks = _normalize_tasks(tasks_payload, provider_ids)
+    upload_defaults = _normalize_upload_defaults(payload.get("upload_defaults", {}), defaults["upload_defaults"])
+    display_settings = _normalize_display_settings(payload.get("display", {}), defaults["display"])
+    predefined_paths = _normalize_predefined_paths(
+        payload.get(PREDEFINED_PATHS_SETTINGS_KEY, []),
+        existing_items=payload.get(PREDEFINED_PATHS_SETTINGS_KEY, []),
+    )
+    predefined_tags = _normalize_predefined_tags(
+        payload.get(PREDEFINED_TAGS_SETTINGS_KEY, []),
+        existing_items=payload.get(PREDEFINED_TAGS_SETTINGS_KEY, []),
+    )
+    handwriting_style_settings = _normalize_handwriting_style_settings(
+        payload.get(HANDWRITING_STYLE_SETTINGS_KEY, {}),
+        defaults[HANDWRITING_STYLE_SETTINGS_KEY],
+    )
+
+    return {
+        "upload_defaults": upload_defaults,
+        "display": display_settings,
+        PREDEFINED_PATHS_SETTINGS_KEY: predefined_paths,
+        PREDEFINED_TAGS_SETTINGS_KEY: predefined_tags,
+        HANDWRITING_STYLE_SETTINGS_KEY: handwriting_style_settings,
+        "providers": normalized_providers,
+        "tasks": normalized_tasks,
+    }
+
+
+def ensure_app_settings() -> None:
+    """Creates a settings file with defaults when no persisted settings are present."""
+
+    path = _settings_path()
+    path.parent.mkdir(parents=True, exist_ok=True)
+    if path.exists():
+        return
+
+    defaults = _sanitize_settings(_default_settings())
+    path.write_text(json.dumps(defaults, indent=2), encoding="utf-8")
+
+
+def _read_raw_settings() -> dict[str, Any]:
+    """Reads persisted settings from disk and returns normalized values."""
+
+    ensure_app_settings()
+    path = _settings_path()
+    try:
+        payload = json.loads(path.read_text(encoding="utf-8"))
+    except (OSError, ValueError):  # ValueError covers JSONDecodeError and UnicodeDecodeError
+        payload = {}
+    return _sanitize_settings(payload)
+
+
+def _write_settings(payload: dict[str, Any]) -> None:
+    """Persists sanitized settings payload to host-mounted storage."""
+
+    path = _settings_path()
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
+
+
+def read_app_settings() -> dict[str, Any]:
+    """Reads settings and returns a sanitized view safe for API responses."""
+
+    payload = _read_raw_settings()
+    providers_response: list[dict[str, Any]] = []
+    for provider in payload["providers"]:
+        api_key = str(provider.get("api_key", ""))
+        providers_response.append(
+            {
+                "id": provider["id"],
+                "label": provider["label"],
+                "provider_type": provider["provider_type"],
+                "base_url": provider["base_url"],
+                "timeout_seconds": int(provider["timeout_seconds"]),
+                "api_key_set": bool(api_key),
+                "api_key_masked": _mask_api_key(api_key),
+            }
+        )
+
+    return {
+        "upload_defaults": payload.get("upload_defaults", {"logical_path": "Inbox", "tags": []}),
+        "display": payload.get("display", {"cards_per_page": 12, "log_typing_animation_enabled": True}),
+        PREDEFINED_PATHS_SETTINGS_KEY: payload.get(PREDEFINED_PATHS_SETTINGS_KEY, []),
+        PREDEFINED_TAGS_SETTINGS_KEY: payload.get(PREDEFINED_TAGS_SETTINGS_KEY, []),
+        HANDWRITING_STYLE_SETTINGS_KEY: payload.get(HANDWRITING_STYLE_SETTINGS_KEY, {}),
+        "providers": providers_response,
+        "tasks": payload["tasks"],
+    }
+
+
+def reset_app_settings() -> dict[str, Any]:
+    """Resets persisted application settings to sanitized repository defaults."""
+
+    defaults = _sanitize_settings(_default_settings())
+    _write_settings(defaults)
+    return read_app_settings()
+
+
+def read_task_runtime_settings(task_name: str) -> dict[str, Any]:
+    """Returns runtime task settings and resolved provider including secret values."""
+
+    payload = _read_raw_settings()
+    tasks = payload["tasks"]
+    if task_name not in tasks:
+        raise KeyError(f"Unknown task settings key: {task_name}")
+
+    task = dict(tasks[task_name])
+    provider_map = {provider["id"]: provider for provider in payload["providers"]}
+    provider = provider_map.get(task.get("provider_id"))
+    if provider is None:
+        provider = payload["providers"][0]
+        task["provider_id"] = provider["id"]
+
+    return {
+        "task": task,
+        "provider": dict(provider),
+    }
+
+
+def update_app_settings(
+    providers: list[dict[str, Any]] | None = None,
+    tasks: dict[str, dict[str, Any]] | None = None,
+    upload_defaults: dict[str, Any] | None = None,
+    display: dict[str, Any] | None = None,
+    handwriting_style: dict[str, Any] | None = None,
+    predefined_paths: list[dict[str, Any]] | None = None,
+    predefined_tags: list[dict[str, Any]] | None = None,
+) -> dict[str, Any]:
+    """Updates app settings, persists them, and returns API-safe values."""
+
+    current_payload = _read_raw_settings()
+    next_payload: dict[str, Any] = {
+        "upload_defaults": dict(current_payload.get("upload_defaults", {"logical_path": "Inbox", "tags": []})),
+        "display": dict(current_payload.get("display", {"cards_per_page": 12, "log_typing_animation_enabled": True})),
+        PREDEFINED_PATHS_SETTINGS_KEY: list(current_payload.get(PREDEFINED_PATHS_SETTINGS_KEY, [])),
+        PREDEFINED_TAGS_SETTINGS_KEY: list(current_payload.get(PREDEFINED_TAGS_SETTINGS_KEY, [])),
+        HANDWRITING_STYLE_SETTINGS_KEY: dict(
+            current_payload.get(HANDWRITING_STYLE_SETTINGS_KEY, _default_settings()[HANDWRITING_STYLE_SETTINGS_KEY])
+        ),
+        "providers": list(current_payload["providers"]),
+        "tasks": dict(current_payload["tasks"]),
+    }
+
+    if providers is not None:
+        existing_provider_map = {provider["id"]: provider for provider in current_payload["providers"]}
+        next_providers: list[dict[str, Any]] = []
+        for index, provider_payload in enumerate(providers):
+            if not isinstance(provider_payload, dict):
+                continue
+
+            provider_id = _normalize_provider_id(
+                str(provider_payload.get("id", "")),
+                fallback=f"provider-{index + 1}",
+            )
+            existing_provider = existing_provider_map.get(provider_id, {})
+            merged_payload = dict(provider_payload)
+            merged_payload["id"] = provider_id
+
+            if bool(provider_payload.get("clear_api_key", False)):
+                merged_payload["api_key"] = ""
+            elif "api_key" in provider_payload and provider_payload.get("api_key") is not None:
+                merged_payload["api_key"] = str(provider_payload.get("api_key")).strip()
+            else:
+                merged_payload["api_key"] = str(existing_provider.get("api_key", ""))
+
+            normalized_provider = _normalize_provider(
+                merged_payload,
+                fallback_id=provider_id,
+                fallback_values=existing_provider,
+            )
+            next_providers.append(normalized_provider)
+
+        if next_providers:
+            next_payload["providers"] = next_providers
+
+    if tasks is not None:
+        merged_tasks = dict(current_payload["tasks"])
+        for task_name, task_update in tasks.items():
+            if task_name not in merged_tasks or not isinstance(task_update, dict):
+                continue
+            existing_task = dict(merged_tasks[task_name])
+            for key, value in task_update.items():
+                if value is None:
+                    continue
+                existing_task[key] = value
+            merged_tasks[task_name] = existing_task
+        next_payload["tasks"] = merged_tasks
+
+    if isinstance(upload_defaults, dict):
+        next_upload_defaults = dict(next_payload.get("upload_defaults", {}))
+        for key in ("logical_path", "tags"):
+            if key in upload_defaults:
+                next_upload_defaults[key] = upload_defaults[key]
+        next_payload["upload_defaults"] = next_upload_defaults
+
+    if isinstance(display, dict):
+        next_display = dict(next_payload.get("display", {}))
+        if "cards_per_page" in display:
+            next_display["cards_per_page"] = display["cards_per_page"]
+        if "log_typing_animation_enabled" in display:
+            next_display["log_typing_animation_enabled"] = bool(display["log_typing_animation_enabled"])
+        next_payload["display"] = next_display
+
+    if isinstance(handwriting_style, dict):
+        next_handwriting_style = dict(next_payload.get(HANDWRITING_STYLE_SETTINGS_KEY, {}))
+        for key in (
+            "enabled",
+            "embed_model",
+            "neighbor_limit",
+            "match_min_similarity",
+            "bootstrap_match_min_similarity",
+            "bootstrap_sample_size",
+            "image_max_side",
+        ):
+            if key in handwriting_style:
+                next_handwriting_style[key] = handwriting_style[key]
+        next_payload[HANDWRITING_STYLE_SETTINGS_KEY] = next_handwriting_style
+
+    if predefined_paths is not None:
+        next_payload[PREDEFINED_PATHS_SETTINGS_KEY] = _normalize_predefined_paths(
+            predefined_paths,
+            existing_items=next_payload.get(PREDEFINED_PATHS_SETTINGS_KEY, []),
+        )
+
+    if predefined_tags is not None:
+        next_payload[PREDEFINED_TAGS_SETTINGS_KEY] = _normalize_predefined_tags(
+            predefined_tags,
+            existing_items=next_payload.get(PREDEFINED_TAGS_SETTINGS_KEY, []),
+        )
+
+    sanitized = _sanitize_settings(next_payload)
+    _write_settings(sanitized)
+    return read_app_settings()
+
+
+def read_handwriting_provider_settings() -> dict[str, Any]:
+    """Returns OCR settings in legacy shape for the handwriting transcription service."""
+
+    runtime = read_task_runtime_settings(TASK_OCR_HANDWRITING)
+    provider = runtime["provider"]
+    task = runtime["task"]
+
+    return {
+        "provider": provider["provider_type"],
+        "enabled": bool(task.get("enabled", True)),
+        "openai_base_url": str(provider.get("base_url", settings.default_openai_base_url)),
+        "openai_model": str(task.get("model", settings.default_openai_model)),
+        "openai_timeout_seconds": int(provider.get("timeout_seconds", settings.default_openai_timeout_seconds)),
+        "openai_api_key": str(provider.get("api_key", "")),
+        "prompt": str(task.get("prompt", DEFAULT_OCR_PROMPT)),
+        "provider_id": str(provider.get("id", "openai-default")),
+    }
+
+
+def read_handwriting_style_settings() -> dict[str, Any]:
+    """Returns handwriting-style clustering settings for Typesense style assignment logic."""
+
+    payload = _read_raw_settings()
+    defaults = _default_settings()[HANDWRITING_STYLE_SETTINGS_KEY]
+    return _normalize_handwriting_style_settings(
+        payload.get(HANDWRITING_STYLE_SETTINGS_KEY, {}),
+        defaults,
+    )
+
+
+def read_predefined_paths_settings() -> list[dict[str, Any]]:
+    """Returns normalized predefined logical path catalog entries."""
+
+    payload = _read_raw_settings()
+    return _normalize_predefined_paths(
+        payload.get(PREDEFINED_PATHS_SETTINGS_KEY, []),
+        existing_items=payload.get(PREDEFINED_PATHS_SETTINGS_KEY, []),
+    )
+
+
+def read_predefined_tags_settings() -> list[dict[str, Any]]:
+    """Returns normalized predefined tag catalog entries."""
+
+    payload = _read_raw_settings()
+    return _normalize_predefined_tags(
+        payload.get(PREDEFINED_TAGS_SETTINGS_KEY, []),
+        existing_items=payload.get(PREDEFINED_TAGS_SETTINGS_KEY, []),
+    )
+
+
+def update_handwriting_settings(
+    enabled: bool | None = None,
+    openai_base_url: str | None = None,
+    openai_model: str | None = None,
+    openai_timeout_seconds: int | None = None,
+    openai_api_key: str | None = None,
+    clear_openai_api_key: bool = False,
+) -> dict[str, Any]:
+    """Updates OCR task and bound provider values using the legacy handwriting API contract."""
+
+    runtime = read_task_runtime_settings(TASK_OCR_HANDWRITING)
+    provider = runtime["provider"]
+
+    provider_update: dict[str, Any] = {
+        "id": provider["id"],
+        "label": provider["label"],
+        "provider_type": provider["provider_type"],
+        "base_url": openai_base_url if openai_base_url is not None else provider["base_url"],
+        "timeout_seconds": openai_timeout_seconds if openai_timeout_seconds is not None else provider["timeout_seconds"],
+    }
+    if clear_openai_api_key:
+        provider_update["clear_api_key"] = True
+    elif openai_api_key is not None:
+        provider_update["api_key"] = openai_api_key
+
+    tasks_update: dict[str, dict[str, Any]] = {TASK_OCR_HANDWRITING: {}}
+    if enabled is not None:
+        tasks_update[TASK_OCR_HANDWRITING]["enabled"] = enabled
+    if openai_model is not None:
+        tasks_update[TASK_OCR_HANDWRITING]["model"] = openai_model
+
+    return update_app_settings(
+        providers=[provider_update],
+        tasks=tasks_update,
+    )
@@ -0,0 +1,315 @@
+"""Document extraction service for text indexing, previews, and archive fan-out."""
+
+import io
+import re
+import zipfile
+from dataclasses import dataclass, field
+from pathlib import Path
+
+import magic
+from docx import Document as DocxDocument
+from openpyxl import load_workbook
+from PIL import Image, ImageOps
+from pypdf import PdfReader
+import pymupdf
+
+from app.core.config import get_settings
+from app.services.handwriting import (
+    IMAGE_TEXT_TYPE_NO_TEXT,
+    IMAGE_TEXT_TYPE_UNKNOWN,
+    HandwritingTranscriptionError,
+    HandwritingTranscriptionNotConfiguredError,
+    HandwritingTranscriptionTimeoutError,
+    classify_image_text_bytes,
+    transcribe_handwriting_bytes,
+)
+
+
+settings = get_settings()
+
+
+IMAGE_EXTENSIONS = {
+    ".jpg",
+    ".jpeg",
+    ".png",
+    ".tif",
+    ".tiff",
+    ".bmp",
+    ".gif",
+    ".webp",
+    ".heic",
+}
+
+SUPPORTED_TEXT_EXTENSIONS = {
+    ".txt",
+    ".md",
+    ".csv",
+    ".json",
+    ".xml",
+    ".svg",
+    ".pdf",
+    ".docx",
+    ".xlsx",
+    *IMAGE_EXTENSIONS,
+}
+
+
+@dataclass
+class ExtractionResult:
+    """Represents output generated during extraction for a single file."""
+
+    text: str
+    preview_bytes: bytes | None
+    preview_suffix: str | None
+    status: str
+    metadata_json: dict[str, object] = field(default_factory=dict)
+
+
+@dataclass
+class ArchiveMember:
+    """Represents an extracted file entry from an archive."""
+
+    name: str
+    data: bytes
+
+
+def sniff_mime(data: bytes) -> str:
+    """Detects MIME type using libmagic for robust format handling."""
+
+    return magic.from_buffer(data, mime=True) or "application/octet-stream"
+
+
+def is_supported_for_extraction(extension: str, mime_type: str) -> bool:
+    """Determines if a file should be text-processed for indexing and classification."""
+
+    return extension in SUPPORTED_TEXT_EXTENSIONS or mime_type.startswith("text/")
+
+
+def _normalize_text(text: str) -> str:
+    """Normalizes extracted text by dropping separator-only lines, NUL bytes, and runs of blank lines."""
+
+    cleaned = text.replace("\r", "\n").replace("\x00", "")
+    lines: list[str] = []
+    for line in cleaned.split("\n"):
+        stripped = line.strip()
+        if stripped and re.fullmatch(r"[.\-_*=~\s]{4,}", stripped):
+            continue
+        lines.append(line)
+
+    normalized = "\n".join(lines)
+    normalized = re.sub(r"\n{3,}", "\n\n", normalized)
+    return normalized.strip()
+
+
+def _extract_pdf_text(data: bytes) -> str:
+    """Extracts text from PDF bytes using pypdf page parsing."""
+
+    reader = PdfReader(io.BytesIO(data))
+    pages: list[str] = []
+    for page in reader.pages:
+        pages.append(page.extract_text() or "")
+    return _normalize_text("\n".join(pages))
+
+
+def _extract_pdf_preview(data: bytes) -> tuple[bytes | None, str | None]:
+    """Creates a JPEG thumbnail preview from the first PDF page."""
+
+    try:
+        document = pymupdf.open(stream=data, filetype="pdf")
+    except Exception:
+        return None, None
+
+    try:
+        if document.page_count < 1:
+            return None, None
+        page = document.load_page(0)
+        pixmap = page.get_pixmap(matrix=pymupdf.Matrix(1.5, 1.5), alpha=False)
+        return pixmap.tobytes("jpeg"), ".jpg"
+    except Exception:
+        return None, None
+    finally:
+        document.close()
+
+
+def _extract_docx_text(data: bytes) -> str:
+    """Extracts paragraph text from DOCX content."""
+
+    document = DocxDocument(io.BytesIO(data))
+    return _normalize_text("\n".join(paragraph.text for paragraph in document.paragraphs if paragraph.text))
+
+
+def _extract_xlsx_text(data: bytes) -> str:
+    """Extracts cell text from the first 200 rows of each XLSX sheet for indexing."""
+
+    workbook = load_workbook(io.BytesIO(data), data_only=True, read_only=True)
+    chunks: list[str] = []
+    for sheet in workbook.worksheets:
+        chunks.append(sheet.title)
+        row_count = 0
+        for row in sheet.iter_rows(min_row=1, max_row=200):
+            row_values = [str(cell.value) for cell in row if cell.value is not None]
+            if row_values:
+                chunks.append(" ".join(row_values))
+            row_count += 1
+            if row_count >= 200:
+                break
+    return _normalize_text("\n".join(chunks))
+
+
+def _build_image_preview(data: bytes) -> tuple[bytes | None, str | None]:
+    """Builds a JPEG preview thumbnail for image files."""
+
+    try:
+        with Image.open(io.BytesIO(data)) as image:
+            preview = ImageOps.exif_transpose(image).convert("RGB")
+            preview.thumbnail((600, 600))
+            output = io.BytesIO()
+            preview.save(output, format="JPEG", optimize=True, quality=82)
+            return output.getvalue(), ".jpg"
+    except Exception:
+        return None, None
+
+
+def _extract_handwriting_text(data: bytes, mime_type: str) -> ExtractionResult:
+    """Extracts text from image bytes and records handwriting-vs-printed classification metadata."""
+
+    preview_bytes, preview_suffix = _build_image_preview(data)
+    metadata_json: dict[str, object] = {}
+
+    try:
+        text_type = classify_image_text_bytes(data, mime_type=mime_type)
+        metadata_json = {
+            "image_text_type": text_type.label,
+            "image_text_type_confidence": text_type.confidence,
+            "image_text_type_provider": text_type.provider,
+            "image_text_type_model": text_type.model,
+        }
+    except HandwritingTranscriptionNotConfiguredError as error:
+        return ExtractionResult(
+            text="",
+            preview_bytes=preview_bytes,
+            preview_suffix=preview_suffix,
+            status="unsupported",
+            metadata_json={"transcription_error": str(error), "image_text_type": IMAGE_TEXT_TYPE_UNKNOWN},
+        )
+    except HandwritingTranscriptionTimeoutError as error:
+        metadata_json = {
+            "image_text_type": IMAGE_TEXT_TYPE_UNKNOWN,
+            "image_text_type_error": str(error),
+        }
+    except HandwritingTranscriptionError as error:
+        metadata_json = {
+            "image_text_type": IMAGE_TEXT_TYPE_UNKNOWN,
+            "image_text_type_error": str(error),
+        }
+
+    if metadata_json.get("image_text_type") == IMAGE_TEXT_TYPE_NO_TEXT:
+        metadata_json["transcription_skipped"] = "no_text_detected"
+        return ExtractionResult(
+            text="",
+            preview_bytes=preview_bytes,
+            preview_suffix=preview_suffix,
+            status="processed",
+            metadata_json=metadata_json,
+        )
+
+    try:
+        transcription = transcribe_handwriting_bytes(data, mime_type=mime_type)
+        transcription_metadata: dict[str, object] = {
+            "transcription_provider": transcription.provider,
+            "transcription_model": transcription.model,
+            "transcription_uncertainties": transcription.uncertainties,
+        }
+        return ExtractionResult(
+            text=_normalize_text(transcription.text)[: settings.max_text_length],
+            preview_bytes=preview_bytes,
+            preview_suffix=preview_suffix,
+            status="processed",
+            metadata_json={**metadata_json, **transcription_metadata},
+        )
+    except HandwritingTranscriptionNotConfiguredError as error:
+        return ExtractionResult(
+            text="",
+            preview_bytes=preview_bytes,
+            preview_suffix=preview_suffix,
+            status="unsupported",
+            metadata_json={**metadata_json, "transcription_error": str(error)},
+        )
+    except HandwritingTranscriptionTimeoutError as error:
+        return ExtractionResult(
+            text="",
+            preview_bytes=preview_bytes,
+            preview_suffix=preview_suffix,
+            status="error",
+            metadata_json={**metadata_json, "transcription_error": str(error)},
+        )
+    except HandwritingTranscriptionError as error:
+        return ExtractionResult(
+            text="",
+            preview_bytes=preview_bytes,
+            preview_suffix=preview_suffix,
+            status="error",
+            metadata_json={**metadata_json, "transcription_error": str(error)},
+        )
+
+
+def extract_text_content(filename: str, data: bytes, mime_type: str) -> ExtractionResult:
+    """Extracts text and optional preview bytes for supported file types."""
+
+    extension = Path(filename).suffix.lower()
+    text = ""
+    preview_bytes: bytes | None = None
+    preview_suffix: str | None = None
+
+    try:
+        if extension == ".pdf":
+            text = _extract_pdf_text(data)
+            preview_bytes, preview_suffix = _extract_pdf_preview(data)
+        elif extension in {".txt", ".md", ".csv", ".json", ".xml", ".svg"} or mime_type.startswith("text/"):
+            text = _normalize_text(data.decode("utf-8", errors="ignore"))
+        elif extension == ".docx":
+            text = _extract_docx_text(data)
+        elif extension == ".xlsx":
+            text = _extract_xlsx_text(data)
+        elif extension in IMAGE_EXTENSIONS:
+            return _extract_handwriting_text(data=data, mime_type=mime_type)
+        else:
+            return ExtractionResult(
+                text="",
+                preview_bytes=None,
+                preview_suffix=None,
+                status="unsupported",
+                metadata_json={"reason": "unsupported_format"},
+            )
+    except Exception as error:
+        return ExtractionResult(
+            text="",
+            preview_bytes=None,
+            preview_suffix=None,
+            status="error",
+            metadata_json={"reason": "extraction_exception", "error": str(error)},
+        )
+
+    return ExtractionResult(
+        text=text[: settings.max_text_length],
+        preview_bytes=preview_bytes,
+        preview_suffix=preview_suffix,
+        status="processed",
+        metadata_json={},
+    )
+
+
+def extract_archive_members(data: bytes, depth: int = 0) -> list[ArchiveMember]:
+    """Extracts file members from a zip archive, bounded by the configured depth and member-count limits."""
+
+    members: list[ArchiveMember] = []
+    if depth > settings.max_zip_depth:
+        return members
+
+    with zipfile.ZipFile(io.BytesIO(data)) as archive:
+        infos = [info for info in archive.infolist() if not info.is_dir()][: settings.max_zip_members]
+        for info in infos:
+            member_data = archive.read(info)  # read by ZipInfo so duplicate names resolve to the right entry
+            members.append(ArchiveMember(name=info.filename, data=member_data))
+
+    return members
@@ -0,0 +1,477 @@
+"""Handwriting transcription service using OpenAI-compatible vision models."""
+
+import base64
+import io
+import json
+import re
+from dataclasses import dataclass
+from typing import Any
+
+from openai import APIConnectionError, APIError, APITimeoutError, OpenAI
+from PIL import Image, ImageOps
+
+from app.services.app_settings import DEFAULT_OCR_PROMPT, read_handwriting_provider_settings
+
+MAX_IMAGE_SIDE = 2000
+IMAGE_TEXT_TYPE_HANDWRITING = "handwriting"
+IMAGE_TEXT_TYPE_PRINTED = "printed_text"
+IMAGE_TEXT_TYPE_NO_TEXT = "no_text"
+IMAGE_TEXT_TYPE_UNKNOWN = "unknown"
+
+IMAGE_TEXT_CLASSIFICATION_PROMPT = (
+    "Classify the text content in this image.\n"
+    "Choose exactly one label from: handwriting, printed_text, no_text.\n"
+    "Definitions:\n"
+    "- handwriting: text exists and most readable text is handwritten.\n"
+    "- printed_text: text exists and most readable text is machine printed or typed.\n"
+    "- no_text: no readable text is present.\n"
+    "Return strict JSON only with shape:\n"
+    "{\n"
+    '  "label": "handwriting|printed_text|no_text",\n'
+    '  "confidence": number\n'
+    "}\n"
+    "Confidence must be between 0 and 1."
+)
+
+
+class HandwritingTranscriptionError(Exception):
+    """Base error for handwriting transcription failures; timeout and not-configured errors subclass it."""
+
+
+class HandwritingTranscriptionTimeoutError(HandwritingTranscriptionError):
+    """Raised when handwriting transcription exceeds the configured timeout."""
+
+
+class HandwritingTranscriptionNotConfiguredError(HandwritingTranscriptionError):
+    """Raised when handwriting transcription is disabled or missing credentials."""
+
+
+@dataclass
+class HandwritingTranscription:
+    """Represents transcription output and uncertainty markers."""
+
+    text: str
+    uncertainties: list[str]
+    provider: str
+    model: str
+
+
+@dataclass
+class ImageTextClassification:
+    """Represents model classification of image text modality for one image."""
+
+    label: str
+    confidence: float
+    provider: str
+    model: str
+
+
+def _extract_uncertainties(text: str) -> list[str]:
+    """Extracts uncertainty markers from transcription output."""
+
+    matches = re.findall(r"\[\[\?(.*?)\?\]\]", text)
+    return [match.strip() for match in matches if match.strip()]
+
+
+def _coerce_json_object(payload: str) -> dict[str, Any]:
+    """Parses and extracts a JSON object from raw model output text."""
+
+    text = payload.strip()
+    if not text:
+        return {}
+
+    try:
+        parsed = json.loads(text)
+        if isinstance(parsed, dict):
+            return parsed
+    except json.JSONDecodeError:
+        pass
+
+    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, flags=re.DOTALL | re.IGNORECASE)
+    if fenced:
+        try:
+            parsed = json.loads(fenced.group(1))
+            if isinstance(parsed, dict):
+                return parsed
+        except json.JSONDecodeError:
+            pass
+
+    first_brace = text.find("{")
+    last_brace = text.rfind("}")
+    if first_brace >= 0 and last_brace > first_brace:
+        candidate = text[first_brace : last_brace + 1]
+        try:
+            parsed = json.loads(candidate)
+            if isinstance(parsed, dict):
+                return parsed
+        except json.JSONDecodeError:
+            return {}
+    return {}
+
+
+def _clamp_probability(value: Any, fallback: float = 0.0) -> float:
+    """Clamps confidence-like values to the inclusive [0, 1] range."""
+
+    try:
+        parsed = float(value)
+    except (TypeError, ValueError):
+        return fallback
+    return max(0.0, min(1.0, parsed))
+
+
+def _normalize_image_text_type(label: str) -> str:
+    """Normalizes classifier labels into one supported canonical image text type."""
+
+    normalized = label.strip().lower().replace("-", "_").replace(" ", "_")
+    if normalized in {IMAGE_TEXT_TYPE_HANDWRITING, "handwritten", "handwritten_text"}:
+        return IMAGE_TEXT_TYPE_HANDWRITING
+    if normalized in {IMAGE_TEXT_TYPE_PRINTED, "printed", "typed", "machine_text"}:
+        return IMAGE_TEXT_TYPE_PRINTED
+    if normalized in {IMAGE_TEXT_TYPE_NO_TEXT, "none", "no_readable_text"}:
+        return IMAGE_TEXT_TYPE_NO_TEXT
+    return IMAGE_TEXT_TYPE_UNKNOWN
+
+
+def _normalize_image_bytes(image_data: bytes) -> tuple[bytes, str]:
+    """Applies EXIF rotation, downscales large images, and re-encodes them as JPEG for transcription."""
+
+    with Image.open(io.BytesIO(image_data)) as image:
+        rotated = ImageOps.exif_transpose(image)
+        prepared = rotated.convert("RGB")
+        long_side = max(prepared.width, prepared.height)
+        if long_side > MAX_IMAGE_SIDE:
+            scale = MAX_IMAGE_SIDE / long_side
+            resized_width = max(1, int(prepared.width * scale))
+            resized_height = max(1, int(prepared.height * scale))
+            prepared = prepared.resize((resized_width, resized_height), Image.Resampling.LANCZOS)
+
+        output = io.BytesIO()
+        prepared.save(output, format="JPEG", quality=90, optimize=True)
+        return output.getvalue(), "image/jpeg"
+
+
+def _create_client(provider_settings: dict[str, Any]) -> OpenAI:
+    """Creates an OpenAI client configured for compatible endpoints and timeouts."""
+
+    api_key = str(provider_settings.get("openai_api_key", "")).strip() or "no-key-required"
+    return OpenAI(
+        api_key=api_key,
+        base_url=str(provider_settings["openai_base_url"]),
+        timeout=int(provider_settings["openai_timeout_seconds"]),
+    )
+
+
+def _extract_text_from_response(response: Any) -> str:
+    """Extracts plain text from responses API output objects."""
+
+    output_text = getattr(response, "output_text", None)
+    if isinstance(output_text, str) and output_text.strip():
+        return output_text.strip()
+
+    output_items = getattr(response, "output", None)
+    if not isinstance(output_items, list):
+        return ""
+
+    texts: list[str] = []
+    for item in output_items:
+        item_data = item.model_dump() if hasattr(item, "model_dump") else item
+        if not isinstance(item_data, dict):
+            continue
+        item_type = item_data.get("type")
+        if item_type == "output_text":
+            text = str(item_data.get("text", "")).strip()
+            if text:
+                texts.append(text)
+        if item_type == "message":
+            for content in item_data.get("content", []) or []:
+                if not isinstance(content, dict):
+                    continue
+                if content.get("type") in {"output_text", "text"}:
+                    text = str(content.get("text", "")).strip()
+                    if text:
+                        texts.append(text)
+
+    return "\n".join(texts).strip()
+
+
+def _transcribe_with_responses(client: OpenAI, model: str, prompt: str, image_data_url: str) -> str:
+    """Transcribes handwriting using the responses API."""
+
+    response = client.responses.create(
+        model=model,
+        input=[
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": prompt,
+                    },
+                    {
+                        "type": "input_image",
+                        "image_url": image_data_url,
+                        "detail": "high",
+                    },
+                ],
+            }
+        ],
+    )
+    return _extract_text_from_response(response)
+
+
+def _transcribe_with_chat(client: OpenAI, model: str, prompt: str, image_data_url: str) -> str:
+    """Transcribes handwriting using chat completions for endpoint compatibility."""
+
+    response = client.chat.completions.create(
+        model=model,
+        messages=[
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "text",
+                        "text": prompt,
+                    },
+                    {
+                        "type": "image_url",
+                        "image_url": {
+                            "url": image_data_url,
+                            "detail": "high",
+                        },
+                    },
+                ],
+            }
+        ],
+    )
+
+    message_content = response.choices[0].message.content
+    if isinstance(message_content, str):
+        return message_content.strip()
+    if isinstance(message_content, list):
+        text_parts: list[str] = []
+        for part in message_content:
+            if isinstance(part, dict):
+                text = str(part.get("text", "")).strip()
+                if text:
+                    text_parts.append(text)
+        return "\n".join(text_parts).strip()
+    return ""
+
+
+def _classify_with_responses(client: OpenAI, model: str, prompt: str, image_data_url: str) -> str:
+    """Classifies image text modality using the responses API."""
+
+    response = client.responses.create(
+        model=model,
+        input=[
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": prompt,
+                    },
+                    {
+                        "type": "input_image",
+                        "image_url": image_data_url,
+                        "detail": "high",
+                    },
+                ],
+            }
+        ],
+    )
+    return _extract_text_from_response(response)
+
+
+def _classify_with_chat(client: OpenAI, model: str, prompt: str, image_data_url: str) -> str:
+    """Classifies image text modality using chat completions for compatibility."""
+
+    response = client.chat.completions.create(
+        model=model,
+        messages=[
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "text",
+                        "text": prompt,
+                    },
+                    {
+                        "type": "image_url",
+                        "image_url": {
+                            "url": image_data_url,
+                            "detail": "high",
+                        },
+                    },
+                ],
+            }
+        ],
+    )
+
+    message_content = response.choices[0].message.content
+    if isinstance(message_content, str):
+        return message_content.strip()
+    if isinstance(message_content, list):
+        text_parts: list[str] = []
+        for part in message_content:
+            if isinstance(part, dict):
+                text = str(part.get("text", "")).strip()
+                if text:
+                    text_parts.append(text)
+        return "\n".join(text_parts).strip()
+    return ""
+
+
+def _classify_image_text_data_url(image_data_url: str) -> ImageTextClassification:
+    """Classifies an image as handwriting, printed text, or no text."""
+
+    provider_settings = read_handwriting_provider_settings()
+    provider_type = str(provider_settings.get("provider", "openai_compatible")).strip()
+    if provider_type != "openai_compatible":
+        raise HandwritingTranscriptionError(f"unsupported_provider_type:{provider_type}")
+
+    if not bool(provider_settings.get("enabled", True)):
+        raise HandwritingTranscriptionNotConfiguredError("handwriting_transcription_disabled")
+
+    model = str(provider_settings.get("openai_model", "gpt-4.1-mini")).strip() or "gpt-4.1-mini"
+    client = _create_client(provider_settings)
+
+    try:
+        output_text = _classify_with_responses(
+            client=client,
+            model=model,
+            prompt=IMAGE_TEXT_CLASSIFICATION_PROMPT,
+            image_data_url=image_data_url,
+        )
+        if not output_text:
+            output_text = _classify_with_chat(
+                client=client,
+                model=model,
+                prompt=IMAGE_TEXT_CLASSIFICATION_PROMPT,
+                image_data_url=image_data_url,
+            )
+    except APITimeoutError as error:
+        raise HandwritingTranscriptionTimeoutError("openai_request_timeout") from error
+    except (APIConnectionError, APIError):
+        try:
+            output_text = _classify_with_chat(
+                client=client,
+                model=model,
+                prompt=IMAGE_TEXT_CLASSIFICATION_PROMPT,
+                image_data_url=image_data_url,
+            )
+        except APITimeoutError as timeout_error:
+            raise HandwritingTranscriptionTimeoutError("openai_request_timeout") from timeout_error
+        except Exception as fallback_error:
+            raise HandwritingTranscriptionError(str(fallback_error)) from fallback_error
+    except Exception as error:
+        raise HandwritingTranscriptionError(str(error)) from error
+
+    parsed = _coerce_json_object(output_text)
+    if not parsed:
+        raise HandwritingTranscriptionError("image_text_classification_parse_failed")
+
+    label = _normalize_image_text_type(str(parsed.get("label", "")))
+    confidence = _clamp_probability(parsed.get("confidence", 0.0), fallback=0.0)
+    return ImageTextClassification(
+        label=label,
+        confidence=confidence,
+        provider="openai",
+        model=model,
+    )
+
+
+def _transcribe_image_data_url(image_data_url: str) -> HandwritingTranscription:
+    """Transcribes a handwriting image given as a data URL or remote URL using the configured OpenAI-compatible provider."""
+
+    provider_settings = read_handwriting_provider_settings()
+    provider_type = str(provider_settings.get("provider", "openai_compatible")).strip()
+    if provider_type != "openai_compatible":
+        raise HandwritingTranscriptionError(f"unsupported_provider_type:{provider_type}")
+
+    if not bool(provider_settings.get("enabled", True)):
+        raise HandwritingTranscriptionNotConfiguredError("handwriting_transcription_disabled")
+
+    model = str(provider_settings.get("openai_model", "gpt-4.1-mini")).strip() or "gpt-4.1-mini"
+    prompt = str(provider_settings.get("prompt", DEFAULT_OCR_PROMPT)).strip() or DEFAULT_OCR_PROMPT
+    client = _create_client(provider_settings)
+
+    try:
+        text = _transcribe_with_responses(client=client, model=model, prompt=prompt, image_data_url=image_data_url)
+        if not text:
+            text = _transcribe_with_chat(client=client, model=model, prompt=prompt, image_data_url=image_data_url)
+    except APITimeoutError as error:
+        raise HandwritingTranscriptionTimeoutError("openai_request_timeout") from error
+    except (APIConnectionError, APIError):
+        try:
+            text = _transcribe_with_chat(client=client, model=model, prompt=prompt, image_data_url=image_data_url)
+        except APITimeoutError as timeout_error:
+            raise HandwritingTranscriptionTimeoutError("openai_request_timeout") from timeout_error
+        except Exception as fallback_error:
+            raise HandwritingTranscriptionError(str(fallback_error)) from fallback_error
+    except Exception as error:
+        raise HandwritingTranscriptionError(str(error)) from error
+
+    final_text = text.strip()
+    return HandwritingTranscription(
+        text=final_text,
+        uncertainties=_extract_uncertainties(final_text),
+        provider="openai",
+        model=model,
+    )
+
+
+def transcribe_handwriting_base64(image_base64: str, mime_type: str = "image/jpeg") -> HandwritingTranscription:
+    """Transcribes handwriting from a base64 payload without data URL prefix."""
+
+    normalized_mime = mime_type.strip().lower() if mime_type.strip() else "image/jpeg"
+    image_data_url = f"data:{normalized_mime};base64,{image_base64}"
+    return _transcribe_image_data_url(image_data_url)
+
+
+def transcribe_handwriting_url(image_url: str) -> HandwritingTranscription:
+    """Transcribes handwriting from a direct image URL."""
+
+    return _transcribe_image_data_url(image_url)
+
+
+def transcribe_handwriting_bytes(image_data: bytes, mime_type: str = "image/jpeg") -> HandwritingTranscription:
+    """Transcribes handwriting from raw image bytes; bytes are re-encoded as JPEG, so mime_type is ignored."""
+
+    normalized_bytes, normalized_mime = _normalize_image_bytes(image_data)
+    encoded = base64.b64encode(normalized_bytes).decode("ascii")
+    return transcribe_handwriting_base64(encoded, mime_type=normalized_mime)
+
+
+def classify_image_text_base64(image_base64: str, mime_type: str = "image/jpeg") -> ImageTextClassification:
+    """Classifies image text type from a base64 payload without data URL prefix."""
+
+    normalized_mime = mime_type.strip().lower() if mime_type.strip() else "image/jpeg"
+    image_data_url = f"data:{normalized_mime};base64,{image_base64}"
+    return _classify_image_text_data_url(image_data_url)
+
+
+def classify_image_text_url(image_url: str) -> ImageTextClassification:
+    """Classifies image text type from a direct image URL."""
+
+    return _classify_image_text_data_url(image_url)
+
+
+def classify_image_text_bytes(image_data: bytes, mime_type: str = "image/jpeg") -> ImageTextClassification:
+    """Classifies image text type from raw image bytes; bytes are re-encoded as JPEG, so mime_type is ignored."""
+
+    normalized_bytes, normalized_mime = _normalize_image_bytes(image_data)
+    encoded = base64.b64encode(normalized_bytes).decode("ascii")
+    return classify_image_text_base64(encoded, mime_type=normalized_mime)
+
+
+def transcribe_handwriting(image: bytes | str, mime_type: str = "image/jpeg") -> HandwritingTranscription:
+    """Transcribes handwriting from bytes, base64 text, or URL input."""
+
+    if isinstance(image, bytes):
+        return transcribe_handwriting_bytes(image, mime_type=mime_type)
+
+    stripped = image.strip()
+    if stripped.startswith("http://") or stripped.startswith("https://"):
+        return transcribe_handwriting_url(stripped)
+    return transcribe_handwriting_base64(stripped, mime_type=mime_type)
@@ -0,0 +1,435 @@
+"""Handwriting-style clustering and style-scoped path composition for image documents."""
+
+import base64
+import io
+import re
+from dataclasses import dataclass
+from typing import Any
+
+from PIL import Image, ImageOps
+from sqlalchemy import func, select
+from sqlalchemy.orm import Session
+
+from app.core.config import get_settings
+from app.models.document import Document, DocumentStatus
+from app.services.app_settings import (
+    DEFAULT_HANDWRITING_STYLE_EMBED_MODEL,
+    read_handwriting_style_settings,
+)
+from app.services.typesense_index import get_typesense_client
+
+
+settings = get_settings()
+
+IMAGE_TEXT_TYPE_HANDWRITING = "handwriting"
+HANDWRITING_STYLE_COLLECTION_SUFFIX = "_handwriting_styles"
+HANDWRITING_STYLE_EMBED_MODEL = DEFAULT_HANDWRITING_STYLE_EMBED_MODEL
+HANDWRITING_STYLE_MATCH_MIN_SIMILARITY = 0.86
+HANDWRITING_STYLE_BOOTSTRAP_MIN_SIMILARITY = 0.89
+HANDWRITING_STYLE_BOOTSTRAP_SAMPLE_SIZE = 3
+HANDWRITING_STYLE_NEIGHBOR_LIMIT = 8
+HANDWRITING_STYLE_IMAGE_MAX_SIDE = 1024
+HANDWRITING_STYLE_ID_PREFIX = "hw_style_"
+HANDWRITING_STYLE_ID_PATTERN = re.compile(r"^hw_style_(\d+)$")
+
+
+@dataclass
+class HandwritingStyleNeighbor:
+    """Represents one nearest handwriting-style neighbor returned from Typesense."""
+
+    document_id: str
+    style_cluster_id: str
+    vector_distance: float
+    similarity: float
+
+
+@dataclass
+class HandwritingStyleAssignment:
+    """Represents the chosen handwriting-style cluster assignment for one document."""
+
+    style_cluster_id: str
+    matched_existing: bool
+    similarity: float
+    vector_distance: float
+    compared_neighbors: int
+    match_min_similarity: float
+    bootstrap_match_min_similarity: float
+
+
+def _style_collection_name() -> str:
+    """Builds the dedicated Typesense collection name used for handwriting-style vectors."""
+
+    return f"{settings.typesense_collection_name}{HANDWRITING_STYLE_COLLECTION_SUFFIX}"
+
+
+def _style_collection() -> Any:
+    """Returns the Typesense collection handle for handwriting-style indexing."""
+
+    client = get_typesense_client()
+    return client.collections[_style_collection_name()]
+
+
+def _distance_to_similarity(vector_distance: float) -> float:
+    """Maps a Typesense vector distance (assumed to lie in [0, 2]) to a similarity clamped to [0, 1]."""
+
+    return max(0.0, min(1.0, 1.0 - (vector_distance / 2.0)))
+
+
+def _encode_style_image_base64(image_data: bytes, image_max_side: int) -> str:
+    """Normalizes and downsizes image bytes and returns a base64-encoded JPEG payload."""
+
+    with Image.open(io.BytesIO(image_data)) as image:
+        prepared = ImageOps.exif_transpose(image).convert("RGB")
+        longest_side = max(prepared.width, prepared.height)
+        if longest_side > image_max_side:
+            scale = image_max_side / longest_side
+            resized_width = max(1, int(prepared.width * scale))
+            resized_height = max(1, int(prepared.height * scale))
+            prepared = prepared.resize((resized_width, resized_height), Image.Resampling.LANCZOS)
+
+        output = io.BytesIO()
+        prepared.save(output, format="JPEG", quality=86, optimize=True)
+        return base64.b64encode(output.getvalue()).decode("ascii")
+
+
+def ensure_handwriting_style_collection() -> None:
+    """Creates the handwriting-style Typesense collection, recreating it when the configured embed model differs."""
+
+    runtime_settings = read_handwriting_style_settings()
+    embed_model = str(runtime_settings.get("embed_model", HANDWRITING_STYLE_EMBED_MODEL)).strip() or HANDWRITING_STYLE_EMBED_MODEL
+    collection = _style_collection()
+    should_recreate_collection = False
+    try:
+        existing_schema = collection.retrieve()
+        if isinstance(existing_schema, dict):
+            existing_fields = existing_schema.get("fields", [])
+            if isinstance(existing_fields, list):
+                for field in existing_fields:
+                    if not isinstance(field, dict):
+                        continue
+                    if str(field.get("name", "")).strip() != "embedding":
+                        continue
+                    embed_config = field.get("embed", {})
+                    model_config = embed_config.get("model_config", {}) if isinstance(embed_config, dict) else {}
+                    existing_model = str(model_config.get("model_name", "")).strip()
+                    if existing_model and existing_model != embed_model:
+                        should_recreate_collection = True
+                        break
+        if not should_recreate_collection:
+            return
+    except Exception as error:
+        message = str(error).lower()
+        if "404" not in message and "not found" not in message:
+            raise
+
+    client = get_typesense_client()
+    if should_recreate_collection:
+        client.collections[_style_collection_name()].delete()
+
+    schema = {
+        "name": _style_collection_name(),
+        "fields": [
+            {
+                "name": "style_cluster_id",
+                "type": "string",
+                "facet": True,
+            },
+            {
+                "name": "image_text_type",
+                "type": "string",
+                "facet": True,
+            },
+            {
+                "name": "created_at",
+                "type": "int64",
+            },
+            {
+                "name": "image",
+                "type": "image",
+                "store": False,
+            },
+            {
+                "name": "embedding",
+                "type": "float[]",
+                "embed": {
+                    "from": ["image"],
+                    "model_config": {
+                        "model_name": embed_model,
+                    },
+                },
+            },
+        ],
+        "default_sorting_field": "created_at",
+    }
+    client.collections.create(schema)
+
+
+def _search_style_neighbors(
+    image_base64: str,
+    limit: int,
+    exclude_document_id: str | None = None,
+) -> list[HandwritingStyleNeighbor]:
+    """Returns nearest handwriting-style neighbors for one encoded image payload."""
+
+    ensure_handwriting_style_collection()
+    client = get_typesense_client()
+
+    filter_clauses = [f"image_text_type:={IMAGE_TEXT_TYPE_HANDWRITING}"]
+    if exclude_document_id:
+        filter_clauses.append(f"id:!={exclude_document_id}")
+
+    search_payload = {
+        "q": "*",
+        "query_by": "embedding",
+        "vector_query": f"embedding:([], image:{image_base64}, k:{max(1, limit)})",
+        "exclude_fields": "embedding,image",
+        "per_page": max(1, limit),
+        "filter_by": " && ".join(filter_clauses),
+    }
+    response = client.multi_search.perform(
+        {
+            "searches": [
+                {
+                    "collection": _style_collection_name(),
+                    **search_payload,
+                }
+            ]
+        },
+        {},
+    )
+
+    results = response.get("results", []) if isinstance(response, dict) else []
+    first_result = results[0] if isinstance(results, list) and len(results) > 0 else {}
+    hits = first_result.get("hits", []) if isinstance(first_result, dict) else []
+
+    neighbors: list[HandwritingStyleNeighbor] = []
+    for hit in hits:
+        if not isinstance(hit, dict):
+            continue
+        document = hit.get("document")
+        if not isinstance(document, dict):
+            continue
+
+        document_id = str(document.get("id", "")).strip()
+        style_cluster_id = str(document.get("style_cluster_id", "")).strip()
+        if not document_id or not style_cluster_id:
+            continue
+
+        try:
+            vector_distance = float(hit.get("vector_distance", 2.0))
+        except (TypeError, ValueError):
+            vector_distance = 2.0
+
+        neighbors.append(
+            HandwritingStyleNeighbor(
+                document_id=document_id,
+                style_cluster_id=style_cluster_id,
+                vector_distance=vector_distance,
+                similarity=_distance_to_similarity(vector_distance),
+            )
+        )
+
+        if len(neighbors) >= limit:
+            break
+
+    return neighbors
+
+
+def _next_style_cluster_id(session: Session) -> str:
+    """Allocates the next stable handwriting-style folder identifier."""
+
+    existing_ids = session.execute(
+        select(Document.handwriting_style_id).where(Document.handwriting_style_id.is_not(None))
+    ).scalars().all()
+    max_value = 0
+    for existing_id in existing_ids:
+        candidate = str(existing_id).strip()
+        match = HANDWRITING_STYLE_ID_PATTERN.fullmatch(candidate)
+        if not match:
+            continue
+        numeric_part = int(match.group(1))
+        max_value = max(max_value, numeric_part)
+    return f"{HANDWRITING_STYLE_ID_PREFIX}{max_value + 1}"
+
+
+def _style_cluster_sample_size(session: Session, style_cluster_id: str) -> int:
+    """Returns the number of handwriting documents in the database currently assigned to one style cluster."""
+
+    return int(
+        session.execute(
+            select(func.count())
+            .select_from(Document)
+            .where(Document.handwriting_style_id == style_cluster_id)
+            .where(Document.image_text_type == IMAGE_TEXT_TYPE_HANDWRITING)
+        ).scalar_one()
+    )
+
+
+def assign_handwriting_style(
+    session: Session,
+    document: Document,
+    image_data: bytes,
+) -> HandwritingStyleAssignment:
+    """Assigns a document to an existing handwriting-style cluster or creates a new one."""
+
+    runtime_settings = read_handwriting_style_settings()
+    image_max_side = int(runtime_settings.get("image_max_side", HANDWRITING_STYLE_IMAGE_MAX_SIDE))
+    neighbor_limit = int(runtime_settings.get("neighbor_limit", HANDWRITING_STYLE_NEIGHBOR_LIMIT))
+    match_min_similarity = float(runtime_settings.get("match_min_similarity", HANDWRITING_STYLE_MATCH_MIN_SIMILARITY))
+    bootstrap_match_min_similarity = float(
+        runtime_settings.get("bootstrap_match_min_similarity", HANDWRITING_STYLE_BOOTSTRAP_MIN_SIMILARITY)
+    )
+    bootstrap_sample_size = int(runtime_settings.get("bootstrap_sample_size", HANDWRITING_STYLE_BOOTSTRAP_SAMPLE_SIZE))
+
+    image_base64 = _encode_style_image_base64(image_data, image_max_side=image_max_side)
+    neighbors = _search_style_neighbors(
+        image_base64=image_base64,
+        limit=neighbor_limit,
+        exclude_document_id=str(document.id),
+    )
+
+    best_neighbor = neighbors[0] if neighbors else None
+    similarity = best_neighbor.similarity if best_neighbor else 0.0
+    vector_distance = best_neighbor.vector_distance if best_neighbor else 2.0
+    cluster_sample_size = 0
+    if best_neighbor:
+        cluster_sample_size = _style_cluster_sample_size(
+            session=session,
+            style_cluster_id=best_neighbor.style_cluster_id,
+        )
+    required_similarity = (
+        bootstrap_match_min_similarity
+        if cluster_sample_size < bootstrap_sample_size
+        else match_min_similarity
+    )
+    should_match_existing = (
+        best_neighbor is not None and similarity >= required_similarity
+    )
+
+    if should_match_existing and best_neighbor:
+        style_cluster_id = best_neighbor.style_cluster_id
+        matched_existing = True
+    else:
+        existing_style_cluster_id = (document.handwriting_style_id or "").strip()
+        if HANDWRITING_STYLE_ID_PATTERN.fullmatch(existing_style_cluster_id):
+            style_cluster_id = existing_style_cluster_id
+        else:
+            style_cluster_id = _next_style_cluster_id(session=session)
+        matched_existing = False
+
+    ensure_handwriting_style_collection()
+    collection = _style_collection()
+    payload = {
+        "id": str(document.id),
+        "style_cluster_id": style_cluster_id,
+        "image_text_type": IMAGE_TEXT_TYPE_HANDWRITING,
+        "created_at": int(document.created_at.timestamp()),
+        "image": image_base64,
+    }
+    collection.documents.upsert(payload)
+
+    return HandwritingStyleAssignment(
+        style_cluster_id=style_cluster_id,
+        matched_existing=matched_existing,
+        similarity=similarity,
+        vector_distance=vector_distance,
+        compared_neighbors=len(neighbors),
+        match_min_similarity=match_min_similarity,
+        bootstrap_match_min_similarity=bootstrap_match_min_similarity,
+    )
+
+
+def delete_handwriting_style_document(document_id: str) -> None:
+    """Deletes one document id from the handwriting-style Typesense collection."""
+
+    collection = _style_collection()
+    try:
+        collection.documents[document_id].delete()
+    except Exception as error:
+        message = str(error).lower()
+        if "404" in message or "not found" in message:
+            return
+        raise
+
+
+def delete_many_handwriting_style_documents(document_ids: list[str]) -> None:
+    """Deletes many document ids from the handwriting-style Typesense collection."""
+
+    for document_id in document_ids:
+        delete_handwriting_style_document(document_id)
+
+
+def apply_handwriting_style_path(style_cluster_id: str | None, path_value: str | None) -> str | None:
+    """Composes style-prefixed logical paths while preventing duplicate prefix nesting."""
+
+    if path_value is None:
+        return None
+
+    normalized_path = path_value.strip().strip("/")
+    if not normalized_path:
+        return None
+
+    normalized_style = (style_cluster_id or "").strip().strip("/")
+    if not normalized_style:
+        return normalized_path
+
+    segments = [segment for segment in normalized_path.split("/") if segment]
+    while segments and HANDWRITING_STYLE_ID_PATTERN.fullmatch(segments[0]):
+        segments.pop(0)
+    if segments and segments[0].strip().lower() == normalized_style.lower():
+        segments.pop(0)
+
+    if len(segments) == 0:
+        return normalized_style
+
+    sanitized_path = "/".join(segments)
+    return f"{normalized_style}/{sanitized_path}"
+
+
+def resolve_handwriting_style_path_prefix(
+    session: Session,
+    style_cluster_id: str | None,
+    *,
+    exclude_document_id: str | None = None,
+) -> str | None:
+    """Resolves a stable path prefix for one style cluster, preferring known non-style root segments."""
+
+    normalized_style = (style_cluster_id or "").strip()
+    if not normalized_style:
+        return None
+
+    statement = select(Document.logical_path).where(
+        Document.handwriting_style_id == normalized_style,
+        Document.image_text_type == IMAGE_TEXT_TYPE_HANDWRITING,
+        Document.status != DocumentStatus.TRASHED,
+    )
+    if exclude_document_id:
+        statement = statement.where(Document.id != exclude_document_id)
+    rows = session.execute(statement).scalars().all()
+
+    segment_counts: dict[str, int] = {}
+    segment_labels: dict[str, str] = {}
+    for raw_path in rows:
+        if not isinstance(raw_path, str):
+            continue
+        segments = [segment.strip() for segment in raw_path.split("/") if segment.strip()]
+        if not segments:
+            continue
+        first_segment = segments[0]
+        lowered = first_segment.lower()
+        if lowered == "inbox":
+            continue
+        if HANDWRITING_STYLE_ID_PATTERN.fullmatch(first_segment):
+            continue
+        segment_counts[lowered] = segment_counts.get(lowered, 0) + 1
+        if lowered not in segment_labels:
+            segment_labels[lowered] = first_segment
+
+    if not segment_counts:
+        return normalized_style
+
+    winner = sorted(
+        segment_counts.items(),
+        key=lambda item: (-item[1], item[0]),
+    )[0][0]
+    return segment_labels.get(winner, normalized_style)
@@ -0,0 +1,227 @@
+"""Model runtime utilities for provider-bound LLM task execution."""
+
+from dataclasses import dataclass
+from typing import Any
+from urllib.parse import urlparse, urlunparse
+
+from openai import APIConnectionError, APIError, APITimeoutError, OpenAI
+
+from app.services.app_settings import read_task_runtime_settings
+
+
+class ModelTaskError(Exception):
+    """Raised when a model task request fails."""
+
+
+class ModelTaskTimeoutError(ModelTaskError):
+    """Raised when a model task request times out."""
+
+
+class ModelTaskDisabledError(ModelTaskError):
+    """Raised when a model task is disabled in settings."""
+
+
+@dataclass
+class ModelTaskRuntime:
+    """Resolved runtime configuration for one task and provider."""
+
+    task_name: str
+    provider_id: str
+    provider_type: str
+    base_url: str
+    timeout_seconds: int
+    api_key: str
+    model: str
+    prompt: str
+
+
+def _normalize_base_url(raw_value: str) -> str:
+    """Normalizes provider base URL and appends /v1 for OpenAI-compatible servers."""
+
+    trimmed = raw_value.strip().rstrip("/")
+    if not trimmed:
+        return "https://api.openai.com/v1"
+
+    parsed = urlparse(trimmed)
+    path = parsed.path or ""
+    if not path.endswith("/v1"):
+        path = f"{path}/v1" if path else "/v1"
+
+    return urlunparse(parsed._replace(path=path))
+
+
+def _should_fallback_to_chat(error: Exception) -> bool:
+    """Determines whether a responses API failure should fall back to chat completions."""
+
+    status_code = getattr(error, "status_code", None)
+    if isinstance(status_code, int) and status_code in {400, 404, 405, 415, 422, 501}:
+        return True
+
+    message = str(error).lower()
+    fallback_markers = (
+        "404",
+        "not found",
+        "unknown endpoint",
+        "unsupported",
+        "invalid url",
+        "responses",
+    )
+    return any(marker in message for marker in fallback_markers)
+
+
+def _extract_text_from_response(response: Any) -> str:
+    """Extracts plain text from Responses API outputs."""
+
+    output_text = getattr(response, "output_text", None)
+    if isinstance(output_text, str) and output_text.strip():
+        return output_text.strip()
+
+    output_items = getattr(response, "output", None)
+    if not isinstance(output_items, list):
+        return ""
+
+    chunks: list[str] = []
+    for item in output_items:
+        item_data = item.model_dump() if hasattr(item, "model_dump") else item
+        if not isinstance(item_data, dict):
+            continue
+
+        item_type = item_data.get("type")
+        if item_type == "output_text":
+            text = str(item_data.get("text", "")).strip()
+            if text:
+                chunks.append(text)
+
+        if item_type == "message":
+            for content in item_data.get("content", []) or []:
+                if not isinstance(content, dict):
+                    continue
+                if content.get("type") in {"output_text", "text"}:
+                    text = str(content.get("text", "")).strip()
+                    if text:
+                        chunks.append(text)
+
+    return "\n".join(chunks).strip()
+
+
+def _extract_text_from_chat_response(response: Any) -> str:
+    """Extracts text from Chat Completions API outputs."""
+
+    message_content = response.choices[0].message.content
+    if isinstance(message_content, str):
+        return message_content.strip()
+    if not isinstance(message_content, list):
+        return ""
+
+    chunks: list[str] = []
+    for content in message_content:
+        if not isinstance(content, dict):
+            continue
+        text = str(content.get("text", "")).strip()
+        if text:
+            chunks.append(text)
+    return "\n".join(chunks).strip()
+
+
+def resolve_task_runtime(task_name: str) -> ModelTaskRuntime:
+    """Resolves one task runtime including provider endpoint, model, and prompt."""
+
+    runtime_payload = read_task_runtime_settings(task_name)
+    task_payload = runtime_payload["task"]
+    provider_payload = runtime_payload["provider"]
+
+    if not bool(task_payload.get("enabled", True)):
+        raise ModelTaskDisabledError(f"task_disabled:{task_name}")
+
+    provider_type = str(provider_payload.get("provider_type", "openai_compatible")).strip()
+    if provider_type != "openai_compatible":
+        raise ModelTaskError(f"unsupported_provider_type:{provider_type}")
+
+    return ModelTaskRuntime(
+        task_name=task_name,
+        provider_id=str(provider_payload.get("id", "")),
+        provider_type=provider_type,
+        base_url=_normalize_base_url(str(provider_payload.get("base_url", "https://api.openai.com/v1"))),
+        timeout_seconds=int(provider_payload.get("timeout_seconds", 45)),
+        api_key=str(provider_payload.get("api_key", "")).strip() or "no-key-required",
+        model=str(task_payload.get("model", "")).strip(),
+        prompt=str(task_payload.get("prompt", "")).strip(),
+    )
+
+
+def _create_client(runtime: ModelTaskRuntime) -> OpenAI:
+    """Builds an OpenAI SDK client for OpenAI-compatible provider endpoints."""
+
+    return OpenAI(
+        api_key=runtime.api_key,
+        base_url=runtime.base_url,
+        timeout=runtime.timeout_seconds,
+    )
+
+
+def complete_text_task(task_name: str, user_text: str, prompt_override: str | None = None) -> str:
+    """Runs a text-only task against the configured provider and returns plain output text."""
+
+    runtime = resolve_task_runtime(task_name)
+    client = _create_client(runtime)
+    prompt = (prompt_override or runtime.prompt).strip() or runtime.prompt
+
+    try:
+        response = client.responses.create(
+            model=runtime.model,
+            input=[
+                {
+                    "role": "system",
+                    "content": [
+                        {
+                            "type": "input_text",
+                            "text": prompt,
+                        }
+                    ],
+                },
+                {
+                    "role": "user",
+                    "content": [
+                        {
+                            "type": "input_text",
+                            "text": user_text,
+                        }
+                    ],
+                },
+            ],
+        )
+        text = _extract_text_from_response(response)
+        if text:
+            return text
+    except APITimeoutError as error:
+        raise ModelTaskTimeoutError(f"task_timeout:{task_name}") from error
+    except APIConnectionError as error:
+        raise ModelTaskError(f"task_error:{task_name}:{error}") from error
+    except APIError as error:
+        if not _should_fallback_to_chat(error):
+            raise ModelTaskError(f"task_error:{task_name}:{error}") from error
+    except Exception as error:
+        if not _should_fallback_to_chat(error):
+            raise ModelTaskError(f"task_error:{task_name}:{error}") from error
+
+    try:
+        fallback = client.chat.completions.create(
+            model=runtime.model,
+            messages=[
+                {
+                    "role": "system",
+                    "content": prompt,
+                },
+                {
+                    "role": "user",
+                    "content": user_text,
+                },
+            ],
+        )
+        return _extract_text_from_chat_response(fallback)
+    except APITimeoutError as error:
+        raise ModelTaskTimeoutError(f"task_timeout:{task_name}") from error
+    except (APIConnectionError, APIError) as error:
+        raise ModelTaskError(f"task_error:{task_name}:{error}") from error
+    except Exception as error:
+        raise ModelTaskError(f"task_error:{task_name}:{error}") from error
@@ -0,0 +1,192 @@
+"""Persistence helpers for writing and querying processing pipeline log events."""
+
+from typing import Any
+from uuid import UUID
+
+from sqlalchemy import delete, func, select
+from sqlalchemy.orm import Session
+
+from app.models.document import Document
+from app.models.processing_log import ProcessingLogEntry
+
+
+MAX_STAGE_LENGTH = 64
+MAX_EVENT_LENGTH = 256
+MAX_LEVEL_LENGTH = 16
+MAX_PROVIDER_LENGTH = 128
+MAX_MODEL_LENGTH = 256
+MAX_DOCUMENT_FILENAME_LENGTH = 512
+MAX_PROMPT_LENGTH = 200000
+MAX_RESPONSE_LENGTH = 200000
+DEFAULT_KEEP_DOCUMENT_SESSIONS = 2
+DEFAULT_KEEP_UNBOUND_ENTRIES = 80
+PROCESSING_LOG_AUTOCOMMIT_SESSION_KEY = "processing_log_autocommit"
+
+
+def _trim(value: str | None, max_length: int) -> str | None:
+    """Normalizes and truncates text values for safe log persistence."""
+
+    if value is None:
+        return None
+    normalized = value.strip()
+    if not normalized:
+        return None
+    if len(normalized) <= max_length:
+        return normalized
+    return normalized[: max_length - 3] + "..."
+
+
+def _safe_payload(payload_json: dict[str, Any] | None) -> dict[str, Any]:
+    """Ensures payload values are persisted as dictionaries."""
+
+    return payload_json if isinstance(payload_json, dict) else {}
+
+
+def set_processing_log_autocommit(session: Session, enabled: bool) -> None:
+    """Toggles per-session immediate commit behavior for processing log events."""
+
+    session.info[PROCESSING_LOG_AUTOCOMMIT_SESSION_KEY] = bool(enabled)
+
+
+def is_processing_log_autocommit_enabled(session: Session) -> bool:
+    """Returns whether processing logs are committed immediately for the current session."""
+
+    return bool(session.info.get(PROCESSING_LOG_AUTOCOMMIT_SESSION_KEY, False))
+
+
+def log_processing_event(
+    session: Session,
+    stage: str,
+    event: str,
+    *,
+    level: str = "info",
+    document: Document | None = None,
+    document_id: UUID | None = None,
+    document_filename: str | None = None,
+    provider_id: str | None = None,
+    model_name: str | None = None,
+    prompt_text: str | None = None,
+    response_text: str | None = None,
+    payload_json: dict[str, Any] | None = None,
+) -> None:
+    """Persists one processing log entry linked to an optional document context."""
+
+    resolved_document_id = document.id if document is not None else document_id
+    resolved_document_filename = document.original_filename if document is not None else document_filename
+
+    entry = ProcessingLogEntry(
+        level=_trim(level, MAX_LEVEL_LENGTH) or "info",
+        stage=_trim(stage, MAX_STAGE_LENGTH) or "pipeline",
+        event=_trim(event, MAX_EVENT_LENGTH) or "event",
+        document_id=resolved_document_id,
+        document_filename=_trim(resolved_document_filename, MAX_DOCUMENT_FILENAME_LENGTH),
+        provider_id=_trim(provider_id, MAX_PROVIDER_LENGTH),
+        model_name=_trim(model_name, MAX_MODEL_LENGTH),
+        prompt_text=_trim(prompt_text, MAX_PROMPT_LENGTH),
+        response_text=_trim(response_text, MAX_RESPONSE_LENGTH),
+        payload_json=_safe_payload(payload_json),
+    )
+    session.add(entry)
+    if is_processing_log_autocommit_enabled(session):
+        session.commit()
+
+
+def count_processing_logs(session: Session, document_id: UUID | None = None) -> int:
+    """Counts persisted processing logs, optionally restricted to one document."""
+
+    statement = select(func.count()).select_from(ProcessingLogEntry)
+    if document_id is not None:
+        statement = statement.where(ProcessingLogEntry.document_id == document_id)
+    return int(session.execute(statement).scalar_one())
+
+
+def list_processing_logs(
+    session: Session,
+    *,
+    limit: int,
+    offset: int,
+    document_id: UUID | None = None,
+) -> list[ProcessingLogEntry]:
+    """Lists processing logs newest-first, optionally filtered to one document."""
+
+    statement = select(ProcessingLogEntry)
+    if document_id is not None:
+        statement = statement.where(ProcessingLogEntry.document_id == document_id)
+    statement = statement.order_by(ProcessingLogEntry.created_at.desc(), ProcessingLogEntry.id.desc()).offset(offset).limit(limit)
+    return list(session.execute(statement).scalars().all())
+
+
+def cleanup_processing_logs(
+    session: Session,
+    *,
+    keep_document_sessions: int = DEFAULT_KEEP_DOCUMENT_SESSIONS,
+    keep_unbound_entries: int = DEFAULT_KEEP_UNBOUND_ENTRIES,
+) -> dict[str, int]:
+    """Deletes old log entries while keeping recent document sessions and unbound events."""
+
+    normalized_keep_sessions = max(0, keep_document_sessions)
+    normalized_keep_unbound = max(0, keep_unbound_entries)
+    deleted_document_entries = 0
+    deleted_unbound_entries = 0
+
+    recent_document_rows = session.execute(
+        select(
+            ProcessingLogEntry.document_id,
+            func.max(ProcessingLogEntry.created_at).label("last_seen"),
+        )
+        .where(ProcessingLogEntry.document_id.is_not(None))
+        .group_by(ProcessingLogEntry.document_id)
+        .order_by(func.max(ProcessingLogEntry.created_at).desc())
+        .limit(normalized_keep_sessions)
+    ).all()
+    keep_document_ids = [row[0] for row in recent_document_rows if row[0] is not None]
+
+    if keep_document_ids:
+        deleted_document_entries = int(
+            session.execute(
+                delete(ProcessingLogEntry).where(
+                    ProcessingLogEntry.document_id.is_not(None),
+                    ProcessingLogEntry.document_id.notin_(keep_document_ids),
+                )
+            ).rowcount
+            or 0
+        )
+    else:
+        deleted_document_entries = int(
+            session.execute(delete(ProcessingLogEntry).where(ProcessingLogEntry.document_id.is_not(None))).rowcount or 0
+        )
+
+    keep_unbound_rows = session.execute(
+        select(ProcessingLogEntry.id)
+        .where(ProcessingLogEntry.document_id.is_(None))
+        .order_by(ProcessingLogEntry.created_at.desc(), ProcessingLogEntry.id.desc())
+        .limit(normalized_keep_unbound)
+    ).all()
+    keep_unbound_ids = [row[0] for row in keep_unbound_rows]
+
+    if keep_unbound_ids:
+        deleted_unbound_entries = int(
+            session.execute(
+                delete(ProcessingLogEntry).where(
+                    ProcessingLogEntry.document_id.is_(None),
+                    ProcessingLogEntry.id.notin_(keep_unbound_ids),
+                )
+            ).rowcount
+            or 0
+        )
+    else:
+        deleted_unbound_entries = int(
+            session.execute(delete(ProcessingLogEntry).where(ProcessingLogEntry.document_id.is_(None))).rowcount or 0
+        )
+
+    return {
+        "deleted_document_entries": deleted_document_entries,
+        "deleted_unbound_entries": deleted_unbound_entries,
+    }
+
+
+def clear_processing_logs(session: Session) -> dict[str, int]:
+    """Deletes all persisted processing log entries and returns deletion count."""
+
+    deleted_entries = int(session.execute(delete(ProcessingLogEntry)).rowcount or 0)
+    return {"deleted_entries": deleted_entries}
@@ -0,0 +1,59 @@
+"""File storage utilities for persistence, retrieval, and checksum calculation."""
+
+import hashlib
+import uuid
+from datetime import UTC, datetime
+from pathlib import Path
+
+from app.core.config import get_settings
+
+
+settings = get_settings()
+
+
+def ensure_storage() -> None:
+    """Ensures required storage directories exist at service startup."""
+
+    for relative in ["originals", "derived/previews", "tmp"]:
+        (settings.storage_root / relative).mkdir(parents=True, exist_ok=True)
+
+
+def compute_sha256(data: bytes) -> str:
+    """Computes a SHA-256 hex digest for raw file bytes."""
+
+    return hashlib.sha256(data).hexdigest()
+
+
+def store_bytes(filename: str, data: bytes) -> str:
+    """Stores file content under a unique path and returns its storage-relative location."""
+
+    stamp = datetime.now(UTC).strftime("%Y/%m/%d")
+    safe_ext = Path(filename).suffix.lower()
+    target_dir = settings.storage_root / "originals" / stamp
+    target_dir.mkdir(parents=True, exist_ok=True)
+    target_name = f"{uuid.uuid4()}{safe_ext}"
+    target_path = target_dir / target_name
+    target_path.write_bytes(data)
+    return str(target_path.relative_to(settings.storage_root))
+
+
+def read_bytes(relative_path: str) -> bytes:
+    """Reads and returns bytes from a storage-relative path."""
+
+    return (settings.storage_root / relative_path).read_bytes()
+
+
+def absolute_path(relative_path: str) -> Path:
+    """Returns the absolute filesystem path for a storage-relative location."""
+
+    return settings.storage_root / relative_path
+
+
+def write_preview(document_id: str, data: bytes, suffix: str = ".jpg") -> str:
+    """Writes preview bytes and returns the preview path relative to storage root."""
+
+    target_dir = settings.storage_root / "derived" / "previews"
+    target_dir.mkdir(parents=True, exist_ok=True)
+    target_path = target_dir / f"{document_id}{suffix}"
+    target_path.write_bytes(data)
+    return str(target_path.relative_to(settings.storage_root))
@@ -0,0 +1,257 @@
+"""Typesense indexing and semantic-neighbor retrieval for document routing."""
+
+from dataclasses import dataclass
+from typing import Any
+
+import typesense
+
+from app.core.config import get_settings
+from app.models.document import Document, DocumentStatus
+
+
+settings = get_settings()
+MAX_TYPESENSE_QUERY_CHARS = 600
+
+
+@dataclass
+class SimilarDocument:
+    """Represents one nearest-neighbor document returned by Typesense semantic search."""
+
+    document_id: str
+    document_name: str
+    summary_text: str
+    logical_path: str
+    tags: list[str]
+    vector_distance: float
+
+
+def _build_client() -> typesense.Client:
+    """Builds a Typesense API client using configured host and credentials."""
+
+    return typesense.Client(
+        {
+            "nodes": [
+                {
+                    "host": settings.typesense_host,
+                    "port": str(settings.typesense_port),
+                    "protocol": settings.typesense_protocol,
+                }
+            ],
+            "api_key": settings.typesense_api_key,
+            "connection_timeout_seconds": settings.typesense_timeout_seconds,
+            "num_retries": settings.typesense_num_retries,
+        }
+    )
+
+
+_client: typesense.Client | None = None
+
+
+def get_typesense_client() -> typesense.Client:
+    """Returns a cached Typesense client for repeated indexing and search operations."""
+
+    global _client
+    if _client is None:
+        _client = _build_client()
+    return _client
+
+
+def _collection() -> Any:
+    """Returns the configured Typesense collection handle."""
+
+    client = get_typesense_client()
+    return client.collections[settings.typesense_collection_name]
+
+
+def ensure_typesense_collection() -> None:
+    """Creates the document semantic collection when it does not already exist."""
+
+    collection = _collection()
+    try:
+        collection.retrieve()
+        return
+    except Exception as error:
+        message = str(error).lower()
+        if "404" not in message and "not found" not in message:
+            raise
+
+    schema = {
+        "name": settings.typesense_collection_name,
+        "fields": [
+            {
+                "name": "document_name",
+                "type": "string",
+            },
+            {
+                "name": "summary_text",
+                "type": "string",
+            },
+            {
+                "name": "logical_path",
+                "type": "string",
+                "facet": True,
+            },
+            {
+                "name": "tags",
+                "type": "string[]",
+                "facet": True,
+            },
+            {
+                "name": "status",
+                "type": "string",
+                "facet": True,
+            },
+            {
+                "name": "mime_type",
+                "type": "string",
+                "optional": True,
+                "facet": True,
+            },
+            {
+                "name": "extension",
+                "type": "string",
+                "optional": True,
+                "facet": True,
+            },
+            {
+                "name": "created_at",
+                "type": "int64",
+            },
+            {
+                "name": "has_labels",
+                "type": "bool",
+                "facet": True,
+            },
+            {
+                "name": "embedding",
+                "type": "float[]",
+                "embed": {
+                    "from": [
+                        "document_name",
+                        "summary_text",
+                    ],
+                    "model_config": {
+                        "model_name": "ts/e5-small-v2",
+                        "indexing_prefix": "passage:",
+                        "query_prefix": "query:",
+                    },
+                },
+            },
+        ],
+        "default_sorting_field": "created_at",
+    }
+    client = get_typesense_client()
+    client.collections.create(schema)
+
+
+def _has_labels(document: Document) -> bool:
+    """Determines whether a document has usable human-assigned routing metadata."""
+
+    if document.logical_path.strip() and document.logical_path.strip().lower() != "inbox":
+        return True
+    return any(tag.strip() for tag in document.tags)
+
+
+def upsert_document_index(document: Document, summary_text: str) -> None:
+    """Upserts one document into Typesense for semantic retrieval and routing examples."""
+
+    ensure_typesense_collection()
+    collection = _collection()
+    payload = {
+        "id": str(document.id),
+        "document_name": document.original_filename,
+        "summary_text": summary_text[:50000],
+        "logical_path": document.logical_path,
+        "tags": [tag for tag in document.tags if tag.strip()][:50],
+        "status": document.status.value,
+        "mime_type": document.mime_type,
+        "extension": document.extension,
+        "created_at": int(document.created_at.timestamp()),
+        "has_labels": _has_labels(document) and document.status != DocumentStatus.TRASHED,
+    }
+    collection.documents.upsert(payload)
+
+
+def delete_document_index(document_id: str) -> None:
+    """Deletes one document from Typesense by identifier."""
+
+    collection = _collection()
+    try:
+        collection.documents[document_id].delete()
+    except Exception as error:
+        message = str(error).lower()
+        if "404" in message or "not found" in message:
+            return
+        raise
+
+
+def delete_many_documents_index(document_ids: list[str]) -> None:
+    """Deletes many documents from Typesense by identifiers."""
+
+    for document_id in document_ids:
+        delete_document_index(document_id)
+
+
+def query_similar_documents(summary_text: str, limit: int, exclude_document_id: str | None = None) -> list[SimilarDocument]:
+    """Returns semantic nearest neighbors among labeled non-trashed indexed documents."""
+
+    ensure_typesense_collection()
+    collection = _collection()
+    normalized_query = " ".join(summary_text.strip().split())
+    query_text = normalized_query[:MAX_TYPESENSE_QUERY_CHARS] if normalized_query else "document"
+    search_payload = {
+        "q": query_text,
+        "query_by": "embedding",
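+        "vector_query": f"embedding:([], k:{max(1, limit) + (1 if exclude_document_id else 0)})",
+        "exclude_fields": "embedding",
+        "per_page": max(1, limit) + (1 if exclude_document_id else 0),  # one extra hit covers the excluded document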
+        "filter_by": "has_labels:=true && status:!=trashed",
+    }
+
+    try:
+        response = collection.documents.search(search_payload)
+    except Exception as error:
+        message = str(error).lower()
+        if "query string exceeds max allowed length" not in message:
+            raise
+        fallback_payload = dict(search_payload)
+        fallback_payload["q"] = "document"
+        response = collection.documents.search(fallback_payload)
+    hits = response.get("hits", []) if isinstance(response, dict) else []
+
+    neighbors: list[SimilarDocument] = []
+    for hit in hits:
+        if not isinstance(hit, dict):
+            continue
+        document = hit.get("document", {})
+        if not isinstance(document, dict):
+            continue
+
+        document_id = str(document.get("id", "")).strip()
+        if not document_id:
+            continue
+        if exclude_document_id and document_id == exclude_document_id:
+            continue
+
+        raw_tags = document.get("tags", [])
+        tags = [str(tag).strip() for tag in raw_tags if str(tag).strip()] if isinstance(raw_tags, list) else []
+        try:
+            distance = float(hit.get("vector_distance", 2.0))
+        except (TypeError, ValueError):
+            distance = 2.0
+
+        neighbors.append(
+            SimilarDocument(
+                document_id=document_id,
+                document_name=str(document.get("document_name", "")).strip(),
+                summary_text=str(document.get("summary_text", "")).strip(),
+                logical_path=str(document.get("logical_path", "")).strip(),
+                tags=tags,
+                vector_distance=distance,
+            )
+        )
+
+        if len(neighbors) >= limit:
+            break
+
+    return neighbors
@@ -0,0 +1 @@
+"""Background worker package for queueing and document processing tasks."""
@@ -0,0 +1,21 @@
+"""Queue connection helpers used by API and worker processes."""
+
+from redis import Redis
+from rq import Queue
+
+from app.core.config import get_settings
+
+
+settings = get_settings()
+
+
+def get_redis() -> Redis:
+    """Creates a Redis connection from configured URL."""
+
+    return Redis.from_url(settings.redis_url)
+
+
+def get_processing_queue() -> Queue:
+    """Returns the named queue for document processing jobs."""
+
+    return Queue("dcm", connection=get_redis())
@@ -0,0 +1,544 @@
+"""Background worker tasks for extraction, indexing, and archive fan-out."""
+
+import uuid
+from datetime import UTC, datetime
+from pathlib import Path
+
+from sqlalchemy import select
+
+from app.db.base import SessionLocal
+from app.models.document import Document, DocumentStatus
+from app.services.app_settings import read_handwriting_provider_settings, read_handwriting_style_settings
+from app.services.extractor import (
+    IMAGE_EXTENSIONS,
+    extract_archive_members,
+    extract_text_content,
+    is_supported_for_extraction,
+    sniff_mime,
+)
+from app.services.handwriting import IMAGE_TEXT_TYPE_HANDWRITING
+from app.services.handwriting_style import (
+    assign_handwriting_style,
+    delete_handwriting_style_document,
+)
+from app.services.processing_logs import cleanup_processing_logs, log_processing_event, set_processing_log_autocommit
+from app.services.routing_pipeline import (
+    apply_routing_decision,
+    classify_document_routing,
+    summarize_document,
+    upsert_semantic_index,
+)
+from app.services.storage import absolute_path, compute_sha256, store_bytes, write_preview
+from app.worker.queue import get_processing_queue
+
+
+def _create_archive_member_document(
+    parent: Document,
+    member_name: str,
+    member_data: bytes,
+    mime_type: str,
+) -> Document:
+    """Creates a child document entity for a file extracted from an uploaded archive."""
+
+    extension = Path(member_name).suffix.lower()
+    stored_relative_path = store_bytes(member_name, member_data)
+    return Document(
+        original_filename=Path(member_name).name,
+        source_relative_path=f"{parent.source_relative_path}/{member_name}".strip("/"),
+        stored_relative_path=stored_relative_path,
+        mime_type=mime_type,
+        extension=extension,
+        sha256=compute_sha256(member_data),
+        size_bytes=len(member_data),
+        logical_path=parent.logical_path,
+        tags=list(parent.tags),
+        metadata_json={"origin": "archive", "parent": str(parent.id)},
+        is_archive_member=True,
+        archived_member_path=member_name,
+        parent_document_id=parent.id,
+    )
+
+
+def process_document_task(document_id: str) -> None:
+    """Processes one queued document and updates extraction and suggestion fields."""
+
+    with SessionLocal() as session:
+        set_processing_log_autocommit(session, True)
+        queue = get_processing_queue()
+        document = session.execute(
+            select(Document).where(Document.id == uuid.UUID(document_id))
+        ).scalar_one_or_none()
+        if document is None:
+            return
+        log_processing_event(
+            session=session,
+            stage="worker",
+            event="Document processing started",
+            level="info",
+            document=document,
+            payload_json={"status": document.status.value},
+        )
+        if document.status == DocumentStatus.TRASHED:
+            log_processing_event(
+                session=session,
+                stage="worker",
+                event="Document skipped because it is trashed",
+                level="warning",
+                document=document,
+            )
+            session.commit()
+            return
+
+        source_path = absolute_path(document.stored_relative_path)
+        data = source_path.read_bytes()
+
+        if document.extension == ".zip":
+            child_ids: list[str] = []
+            log_processing_event(
+                session=session,
+                stage="archive",
+                event="Archive extraction started",
+                level="info",
+                document=document,
+                payload_json={"size_bytes": len(data)},
+            )
+            try:
+                members = extract_archive_members(data)
+                for member in members:
+                    mime_type = sniff_mime(member.data)
+                    child = _create_archive_member_document(
+                        parent=document,
+                        member_name=member.name,
+                        member_data=member.data,
+                        mime_type=mime_type,
+                    )
+                    session.add(child)
+                    session.flush()
+                    child_ids.append(str(child.id))
+                    log_processing_event(
+                        session=session,
+                        stage="archive",
+                        event="Archive member extracted and queued",
+                        level="info",
+                        document=child,
+                        payload_json={
+                            "parent_document_id": str(document.id),
+                            "member_name": member.name,
+                            "member_size_bytes": len(member.data),
+                            "mime_type": mime_type,
+                        },
+                    )
+                document.status = DocumentStatus.PROCESSED
+                document.extracted_text = f"archive with {len(members)} files"
+                log_processing_event(
+                    session=session,
+                    stage="archive",
+                    event="Archive extraction completed",
+                    level="info",
+                    document=document,
+                    payload_json={"member_count": len(members)},
+                )
+            except Exception as exc:
+                document.status = DocumentStatus.ERROR
+                document.metadata_json = {**document.metadata_json, "error": str(exc)}
+                log_processing_event(
+                    session=session,
+                    stage="archive",
+                    event="Archive extraction failed",
+                    level="error",
+                    document=document,
+                    response_text=str(exc),
+                )
+
+            if document.status == DocumentStatus.PROCESSED:
+                try:
+                    summary_text = summarize_document(session=session, document=document)
+                    metadata_json = dict(document.metadata_json)
+                    metadata_json["summary_text"] = summary_text[:20000]
+                    document.metadata_json = metadata_json
+                    routing_decision = classify_document_routing(session=session, document=document, summary_text=summary_text)
+                    apply_routing_decision(document=document, decision=routing_decision, session=session)
+                    routing_metadata = document.metadata_json.get("routing", {})
+                    log_processing_event(
+                        session=session,
+                        stage="routing",
+                        event="Routing decision applied",
+                        level="info",
+                        document=document,
+                        payload_json=routing_metadata if isinstance(routing_metadata, dict) else {},
+                    )
+                    log_processing_event(
+                        session=session,
+                        stage="indexing",
+                        event="Typesense upsert started",
+                        level="info",
+                        document=document,
+                    )
+                    upsert_semantic_index(document=document, summary_text=summary_text)
+                    log_processing_event(
+                        session=session,
+                        stage="indexing",
+                        event="Typesense upsert completed",
+                        level="info",
+                        document=document,
+                    )
+                except Exception as exc:
+                    document.metadata_json = {
+                        **document.metadata_json,
+                        "routing_error": str(exc),
+                    }
+                    log_processing_event(
+                        session=session,
+                        stage="routing",
+                        event="Routing or indexing failed for archive document",
+                        level="error",
+                        document=document,
+                        response_text=str(exc),
+                    )
+            document.processed_at = datetime.now(UTC)
+            log_processing_event(
+                session=session,
+                stage="worker",
+                event="Document processing completed",
+                level="info",
+                document=document,
+                payload_json={"status": document.status.value},
+            )
+            cleanup_processing_logs(session=session, keep_document_sessions=2, keep_unbound_entries=80)
+            session.commit()
+            for child_id in child_ids:
+                queue.enqueue("app.worker.tasks.process_document_task", child_id)
+            for child_id in child_ids:
+                log_processing_event(
+                    session=session,
+                    stage="archive",
+                    event="Archive child job enqueued",
+                    level="info",
+                    document_id=uuid.UUID(child_id),
+                    payload_json={"parent_document_id": str(document.id)},
+                )
+            session.commit()
+            return
+
+        if not is_supported_for_extraction(document.extension, document.mime_type):
+            document.status = DocumentStatus.UNSUPPORTED
+            document.processed_at = datetime.now(UTC)
+            log_processing_event(
+                session=session,
+                stage="extraction",
+                event="Document type unsupported for extraction",
+                level="warning",
+                document=document,
+                payload_json={"extension": document.extension, "mime_type": document.mime_type},
+            )
+            log_processing_event(
+                session=session,
+                stage="worker",
+                event="Document processing completed",
+                level="info",
+                document=document,
+                payload_json={"status": document.status.value},
+            )
+            cleanup_processing_logs(session=session, keep_document_sessions=2, keep_unbound_entries=80)
+            session.commit()
+            return
+
+        if document.extension in IMAGE_EXTENSIONS:
+            ocr_settings = read_handwriting_provider_settings()
+            log_processing_event(
+                session=session,
+                stage="ocr",
+                event="OCR request started",
+                level="info",
+                document=document,
+                provider_id=str(ocr_settings.get("provider_id", "")),
+                model_name=str(ocr_settings.get("openai_model", "")),
+                prompt_text=str(ocr_settings.get("prompt", "")),
+                payload_json={"mime_type": document.mime_type},
+            )
+        else:
+            log_processing_event(
+                session=session,
+                stage="extraction",
+                event="Text extraction started",
+                level="info",
+                document=document,
+                payload_json={"extension": document.extension, "mime_type": document.mime_type},
+            )
+
+        extraction = extract_text_content(document.original_filename, data, document.mime_type)
+        if extraction.preview_bytes and extraction.preview_suffix:
+            preview_relative_path = write_preview(str(document.id), extraction.preview_bytes, extraction.preview_suffix)
+            document.metadata_json = {**document.metadata_json, "preview_relative_path": preview_relative_path}
+            document.preview_available = True
+            log_processing_event(
+                session=session,
+                stage="extraction",
+                event="Preview generated",
+                level="info",
+                document=document,
+                payload_json={"preview_relative_path": preview_relative_path},
+            )
+        if extraction.metadata_json:
+            document.metadata_json = {**document.metadata_json, **extraction.metadata_json}
+        if document.extension in IMAGE_EXTENSIONS:
+            image_text_type = extraction.metadata_json.get("image_text_type")
+            if isinstance(image_text_type, str) and image_text_type.strip():
+                document.image_text_type = image_text_type.strip()
+            else:
+                document.image_text_type = None
+        else:
+            document.image_text_type = None
+            document.handwriting_style_id = None
+
+        if extraction.status == "error":
+            document.status = DocumentStatus.ERROR
+            document.metadata_json = {**document.metadata_json, "error": "extraction_failed"}
+            if document.extension in IMAGE_EXTENSIONS:
+                document.handwriting_style_id = None
+                metadata_json = dict(document.metadata_json)
+                metadata_json.pop("handwriting_style", None)
+                document.metadata_json = metadata_json
+                try:
+                    delete_handwriting_style_document(str(document.id))
+                except Exception:
+                    pass  # Best-effort cleanup; a failed delete must not fail the whole job.
+            document.processed_at = datetime.now(UTC)
+            log_processing_event(
+                session=session,
+                stage="extraction",
+                event="Extraction failed",
+                level="error",
+                document=document,
+                response_text=str(extraction.metadata_json.get("error", "extraction_failed")),
+                payload_json=extraction.metadata_json,
+            )
+            if "transcription_error" in extraction.metadata_json:
+                log_processing_event(
+                    session=session,
+                    stage="ocr",
+                    event="OCR request failed",
+                    level="error",
+                    document=document,
+                    response_text=str(extraction.metadata_json.get("transcription_error", "")),
+                )
+            log_processing_event(
+                session=session,
+                stage="worker",
+                event="Document processing completed",
+                level="info",
+                document=document,
+                payload_json={"status": document.status.value},
+            )
+            cleanup_processing_logs(session=session, keep_document_sessions=2, keep_unbound_entries=80)
+            session.commit()
+            return
+
+        if extraction.status == "unsupported":
+            document.status = DocumentStatus.UNSUPPORTED
+            if document.extension in IMAGE_EXTENSIONS:
+                document.handwriting_style_id = None
+                metadata_json = dict(document.metadata_json)
+                metadata_json.pop("handwriting_style", None)
+                document.metadata_json = metadata_json
+                try:
+                    delete_handwriting_style_document(str(document.id))
+                except Exception:
+                    pass  # Best-effort cleanup; a failed delete must not fail the whole job.
+            document.processed_at = datetime.now(UTC)
+            log_processing_event(
+                session=session,
+                stage="extraction",
+                event="Extraction returned unsupported",
+                level="warning",
+                document=document,
+                payload_json=extraction.metadata_json,
+            )
+            log_processing_event(
+                session=session,
+                stage="worker",
+                event="Document processing completed",
+                level="info",
+                document=document,
+                payload_json={"status": document.status.value},
+            )
+            cleanup_processing_logs(session=session, keep_document_sessions=2, keep_unbound_entries=80)
+            session.commit()
+            return
+
+        if document.extension in IMAGE_EXTENSIONS:
+            image_text_type = document.image_text_type or ""
+            if image_text_type == IMAGE_TEXT_TYPE_HANDWRITING:
+                style_settings = read_handwriting_style_settings()
+                if not bool(style_settings.get("enabled", True)):
+                    document.handwriting_style_id = None
+                    metadata_json = dict(document.metadata_json)
+                    metadata_json.pop("handwriting_style", None)
+                    metadata_json["handwriting_style_disabled"] = True
+                    document.metadata_json = metadata_json
+                    log_processing_event(
+                        session=session,
+                        stage="style",
+                        event="Handwriting style clustering disabled",
+                        level="warning",
+                        document=document,
+                        payload_json={
+                            "enabled": False,
+                            "embed_model": style_settings.get("embed_model"),
+                        },
+                    )
+                else:
+                    try:
+                        assignment = assign_handwriting_style(
+                            session=session,
+                            document=document,
+                            image_data=data,
+                        )
+                        document.handwriting_style_id = assignment.style_cluster_id
+                        metadata_json = dict(document.metadata_json)
+                        metadata_json["handwriting_style"] = {
+                            "style_cluster_id": assignment.style_cluster_id,
+                            "matched_existing": assignment.matched_existing,
+                            "similarity": assignment.similarity,
+                            "vector_distance": assignment.vector_distance,
+                            "compared_neighbors": assignment.compared_neighbors,
+                            "match_min_similarity": assignment.match_min_similarity,
+                            "bootstrap_match_min_similarity": assignment.bootstrap_match_min_similarity,
+                        }
+                        metadata_json.pop("handwriting_style_disabled", None)
+                        document.metadata_json = metadata_json
+                        log_processing_event(
+                            session=session,
+                            stage="style",
+                            event="Handwriting style assigned",
+                            level="info",
+                            document=document,
+                            payload_json=metadata_json["handwriting_style"],
+                        )
+                    except Exception as style_error:
+                        document.handwriting_style_id = None
+                        metadata_json = dict(document.metadata_json)
+                        metadata_json["handwriting_style_error"] = str(style_error)
+                        metadata_json.pop("handwriting_style", None)
+                        metadata_json.pop("handwriting_style_disabled", None)
+                        document.metadata_json = metadata_json
+                        log_processing_event(
+                            session=session,
+                            stage="style",
+                            event="Handwriting style assignment failed",
+                            level="error",
+                            document=document,
+                            response_text=str(style_error),
+                        )
+            else:
+                document.handwriting_style_id = None
+                metadata_json = dict(document.metadata_json)
+                metadata_json.pop("handwriting_style", None)
+                metadata_json.pop("handwriting_style_disabled", None)
+                document.metadata_json = metadata_json
+                try:
+                    delete_handwriting_style_document(str(document.id))
+                except Exception:
+                    pass  # Best-effort cleanup; a failed delete must not fail the whole job.
+
+        if document.extension in IMAGE_EXTENSIONS:
+            log_processing_event(
+                session=session,
+                stage="ocr",
+                event="OCR response received",
+                level="info",
+                document=document,
+                provider_id=str(
+                    extraction.metadata_json.get(
+                        "transcription_provider",
+                        extraction.metadata_json.get("image_text_type_provider", ""),
+                    )
+                ),
+                model_name=str(
+                    extraction.metadata_json.get(
+                        "transcription_model",
+                        extraction.metadata_json.get("image_text_type_model", ""),
+                    )
+                ),
+                response_text=extraction.text,
+                payload_json={
+                    "image_text_type": document.image_text_type,
+                    "image_text_type_confidence": extraction.metadata_json.get("image_text_type_confidence"),
+                    "transcription_skipped": extraction.metadata_json.get("transcription_skipped"),
+                    "uncertainty_count": len(
+                        extraction.metadata_json.get("transcription_uncertainties", [])
+                        if isinstance(extraction.metadata_json.get("transcription_uncertainties", []), list)
+                        else []
+                    ),
+                },
+            )
+        else:
+            log_processing_event(
+                session=session,
+                stage="extraction",
+                event="Text extraction completed",
+                level="info",
+                document=document,
+                response_text=extraction.text,
+                payload_json={"text_length": len(extraction.text)},
+            )
+
+        document.extracted_text = extraction.text
+
+        try:
+            summary_text = summarize_document(session=session, document=document)
+            routing_decision = classify_document_routing(session=session, document=document, summary_text=summary_text)
+            apply_routing_decision(document=document, decision=routing_decision, session=session)
+            routing_metadata = document.metadata_json.get("routing", {})
+            log_processing_event(
+                session=session,
+                stage="routing",
+                event="Routing decision applied",
+                level="info",
+                document=document,
+                payload_json=routing_metadata if isinstance(routing_metadata, dict) else {},
+            )
+            log_processing_event(
+                session=session,
+                stage="indexing",
+                event="Typesense upsert started",
+                level="info",
+                document=document,
+            )
+            upsert_semantic_index(document=document, summary_text=summary_text)
+            log_processing_event(
+                session=session,
+                stage="indexing",
+                event="Typesense upsert completed",
+                level="info",
+                document=document,
+            )
+            metadata_json = dict(document.metadata_json)
+            metadata_json["summary_text"] = summary_text[:20000]
+            document.metadata_json = metadata_json
+        except Exception as exc:
+            document.metadata_json = {
+                **document.metadata_json,
+                "routing_error": str(exc),
+            }
+            log_processing_event(
+                session=session,
+                stage="routing",
+                event="Routing or indexing failed",
+                level="error",
+                document=document,
+                response_text=str(exc),
+            )
+
+        document.status = DocumentStatus.PROCESSED
+        document.processed_at = datetime.now(UTC)
+        log_processing_event(
+            session=session,
+            stage="worker",
+            event="Document processing completed",
+            level="info",
+            document=document,
+            payload_json={"status": document.status.value},
+        )
+        cleanup_processing_logs(session=session, keep_document_sessions=2, keep_unbound_entries=80)
+        session.commit()
@@ -0,0 +1,18 @@
+fastapi==0.116.1
+uvicorn[standard]==0.35.0
+sqlalchemy==2.0.39
+psycopg[binary]==3.2.9
+pydantic-settings==2.10.1
+python-multipart==0.0.20
+redis==6.4.0
+rq==2.3.2
+python-magic==0.4.27
+pillow==11.3.0
+pypdf==5.9.0
+pymupdf==1.26.4
+python-docx==1.2.0
+openpyxl==3.1.5
+orjson==3.11.3
+openai==1.107.2
+typesense==1.1.1
+tiktoken==0.11.0
@@ -0,0 +1,15 @@
+# Documentation
+
+This is the documentation entrypoint for the DCM DMS.
+
+## Available Documents
+
+- Project setup and operations: `../README.md`
+- Frontend visual system and compact UI rules: `frontend-design-foundation.md`
+- Handwriting style implementation plan: `../PLAN.md`
+
+## Planned Additions
+
+- Architecture overview
+- Data model reference
+- API contract details
@@ -0,0 +1,49 @@
+# Frontend Design Foundation
+
+## Direction
+
+The DCM frontend follows a compact command-deck direction:
+- dark layered surfaces with strong separation between sections
+- tight spacing and small radii to maximize information density
+- consistent control primitives across buttons, inputs, selects, and panels
+- high-legibility typography tuned for metadata-heavy workflows
+
+## Token Source
+
+Use `frontend/src/design-foundation.css` as the single token source for:
+- typography (`--font-display`, `--font-body`, `--font-mono`)
+- color system (`--color-*`)
+- spacing (`--space-*`)
+- radii and shadows (`--radius-*`, `--shadow-*`)
+- interaction timing (`--transition-*`)
+
+Do not hardcode new palette or spacing values in component styles when a token already exists.
+
+## Layout Principles
+
+- The top bar is sticky and should remain compact at all breakpoints.
+- Documents and viewer operate as a two-pane layout on desktop and collapse to one pane on narrower screens.
+- Toolbar rows should keep critical actions visible without forcing large vertical gaps.
+- Settings sections should preserve dense form grouping while remaining keyboard friendly.
+
+## Control Standards
+
+- Global input, select, textarea, and button styles are defined once in `frontend/src/styles.css`.
+- Variant button classes (`secondary-action`, `active-view-button`, `warning-action`, `danger-action`) are the only approved way to change button colors.
+- Tag chips, routing pills, card chips, and icon buttons must stay within the compact radius and spacing scale.
+- Focus states use `:focus-visible` and tokenized focus color to preserve keyboard discoverability.
+
+## Motion Rules
+
+- Use `rise-in` for section entry and `pulse-border` for card selection emphasis.
+- Keep transitions brief and functional.
+- Avoid decorative animation loops outside explicit status indicators such as the terminal caret blink.
+
+## Extension Checklist
+
+When adding or redesigning a UI area:
+1. Start from existing tokens in `frontend/src/design-foundation.css`.
+2. Add missing tokens in `frontend/src/design-foundation.css`, not in per-component styles.
+3. Implement component styles in `frontend/src/styles.css` using existing layout and variant conventions.
+4. Validate responsive behavior at `1240px`, `1040px`, `760px`, and `560px` breakpoints.
+5. Verify keyboard focus visibility and text contrast before merging.
@@ -0,0 +1,111 @@
+services:
+  db:
+    image: postgres:16-alpine
+    environment:
+      POSTGRES_USER: dcm
+      POSTGRES_PASSWORD: dcm
+      POSTGRES_DB: dcm
+    ports:
+      - "5432:5432"
+    volumes:
+      - db-data:/var/lib/postgresql/data
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U dcm -d dcm"]
+      interval: 10s
+      timeout: 5s
+      retries: 10
+
+  redis:
+    image: redis:7-alpine
+    ports:
+      - "6379:6379"
+    volumes:
+      - redis-data:/data
+
+  typesense:
+    image: typesense/typesense:29.0
+    command:
+      - "--data-dir=/data"
+      - "--api-key=dcm-typesense-key"
+      - "--enable-cors"
+    ports:
+      - "8108:8108"
+    volumes:
+      - typesense-data:/data
+
+  api:
+    build:
+      context: ./backend
+    environment:
+      APP_ENV: development
+      DATABASE_URL: postgresql+psycopg://dcm:dcm@db:5432/dcm
+      REDIS_URL: redis://redis:6379/0
+      STORAGE_ROOT: /data/storage
+      OCR_LANGUAGES: eng,deu
+      PUBLIC_BASE_URL: http://192.168.2.5:8000
+      CORS_ORIGINS: '["http://localhost:5173","http://localhost:3000","http://192.168.2.5:5173"]'
+      TYPESENSE_PROTOCOL: http
+      TYPESENSE_HOST: typesense
+      TYPESENSE_PORT: 8108
+      TYPESENSE_API_KEY: dcm-typesense-key
+      TYPESENSE_COLLECTION_NAME: documents
+    ports:
+      - "8000:8000"
+    volumes:
+      - ./backend/app:/app/app
+      - dcm-storage:/data
+    depends_on:
+      db:
+        condition: service_healthy
+      redis:
+        condition: service_started
+      typesense:
+        condition: service_started
+
+  worker:
+    build:
+      context: ./backend
+    command: ["rq", "worker", "dcm", "--url", "redis://redis:6379/0"]
+    environment:
+      APP_ENV: development
+      DATABASE_URL: postgresql+psycopg://dcm:dcm@db:5432/dcm
+      REDIS_URL: redis://redis:6379/0
+      STORAGE_ROOT: /data/storage
+      OCR_LANGUAGES: eng,deu
+      PUBLIC_BASE_URL: http://localhost:8000
+      TYPESENSE_PROTOCOL: http
+      TYPESENSE_HOST: typesense
+      TYPESENSE_PORT: 8108
+      TYPESENSE_API_KEY: dcm-typesense-key
+      TYPESENSE_COLLECTION_NAME: documents
+    volumes:
+      - ./backend/app:/app/app
+      - dcm-storage:/data
+    depends_on:
+      db:
+        condition: service_healthy
+      redis:
+        condition: service_started
+      typesense:
+        condition: service_started
+
+  frontend:
+    build:
+      context: ./frontend
+    environment:
+      VITE_API_BASE: http://192.168.2.5:8000/api/v1
+    ports:
+      - "5173:5173"
+    volumes:
+      - ./frontend/src:/app/src
+      - ./frontend/index.html:/app/index.html
+      - ./frontend/vite.config.ts:/app/vite.config.ts
+    depends_on:
+      api:
+        condition: service_started
+
+volumes:
+  db-data:
+  redis-data:
+  dcm-storage:
+  typesense-data:
@@ -0,0 +1,16 @@
+FROM node:22-alpine
+
+WORKDIR /app
+
+COPY package.json /app/package.json
+RUN npm install
+
+COPY tsconfig.json /app/tsconfig.json
+COPY tsconfig.node.json /app/tsconfig.node.json
+COPY vite.config.ts /app/vite.config.ts
+COPY index.html /app/index.html
+COPY src /app/src
+
+EXPOSE 5173
+
+CMD ["npm", "run", "dev", "--", "--host", "0.0.0.0", "--port", "5173"]
@@ -0,0 +1,12 @@
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>DCM DMS</title>
+  </head>
+  <body>
+    <div id="root"></div>
+    <script type="module" src="/src/main.tsx"></script>
+  </body>
+</html>
@@ -0,0 +1,22 @@
+{
+  "name": "dcm-dms-frontend",
+  "version": "0.1.0",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "dev": "vite",
+    "build": "tsc -b && vite build",
+    "preview": "vite preview --host 0.0.0.0 --port 4173"
+  },
+  "dependencies": {
+    "lucide-react": "latest",
+    "react": "19.1.1",
+    "react-dom": "19.1.1"
+  },
+  "devDependencies": {
+    "@types/react": "19.1.11",
+    "@types/react-dom": "19.1.7",
+    "typescript": "5.9.2",
+    "vite": "7.1.5"
+  }
+}
@@ -0,0 +1,795 @@
+/**
+ * Main application layout and orchestration for document and settings workspaces.
+ */
+import { type JSX, useCallback, useEffect, useMemo, useRef, useState } from 'react';
+
+import ActionModal from './components/ActionModal';
+import DocumentGrid from './components/DocumentGrid';
+import DocumentViewer from './components/DocumentViewer';
+import PathInput from './components/PathInput';
+import ProcessingLogPanel from './components/ProcessingLogPanel';
+import SearchFiltersBar from './components/SearchFiltersBar';
+import SettingsScreen from './components/SettingsScreen';
+import UploadSurface from './components/UploadSurface';
+import {
+  clearProcessingLogs,
+  deleteDocument,
+  exportContentsMarkdown,
+  getAppSettings,
+  listDocuments,
+  listPaths,
+  listProcessingLogs,
+  listTags,
+  listTypes,
+  resetAppSettings,
+  searchDocuments,
+  trashDocument,
+  updateAppSettings,
+  uploadDocuments,
+} from './lib/api';
+import type { AppSettings, AppSettingsUpdate, DmsDocument, ProcessingLogEntry } from './types';
+
+type AppScreen = 'documents' | 'settings';
+type DocumentView = 'active' | 'trash';
+
+interface DialogOption {
+  key: string;
+  label: string;
+  tone?: 'neutral' | 'primary' | 'warning' | 'danger';
+}
+
+interface DialogState {
+  title: string;
+  message: string;
+  options: DialogOption[];
+}
+
+/**
+ * Defines the root DMS frontend component.
+ */
+export default function App(): JSX.Element {
+  const DEFAULT_PAGE_SIZE = 12;
+  const [screen, setScreen] = useState<AppScreen>('documents');
+  const [documentView, setDocumentView] = useState<DocumentView>('active');
+  const [documents, setDocuments] = useState<DmsDocument[]>([]);
+  const [totalDocuments, setTotalDocuments] = useState<number>(0);
+  const [currentPage, setCurrentPage] = useState<number>(1);
+  const [isLoading, setIsLoading] = useState<boolean>(false);
+  const [isUploading, setIsUploading] = useState<boolean>(false);
+  const [searchText, setSearchText] = useState<string>('');
+  const [activeSearchQuery, setActiveSearchQuery] = useState<string>('');
+  const [selectedDocumentId, setSelectedDocumentId] = useState<string | null>(null);
+  const [selectedDocumentIds, setSelectedDocumentIds] = useState<string[]>([]);
+  const [exportPathInput, setExportPathInput] = useState<string>('');
+  const [tagFilter, setTagFilter] = useState<string>('');
+  const [typeFilter, setTypeFilter] = useState<string>('');
+  const [pathFilter, setPathFilter] = useState<string>('');
+  const [processedFrom, setProcessedFrom] = useState<string>('');
+  const [processedTo, setProcessedTo] = useState<string>('');
+  const [knownTags, setKnownTags] = useState<string[]>([]);
+  const [knownPaths, setKnownPaths] = useState<string[]>([]);
+  const [knownTypes, setKnownTypes] = useState<string[]>([]);
+  const [appSettings, setAppSettings] = useState<AppSettings | null>(null);
+  const [settingsSaveAction, setSettingsSaveAction] = useState<(() => Promise<void>) | null>(null);
+  const [processingLogs, setProcessingLogs] = useState<ProcessingLogEntry[]>([]);
+  const [isLoadingLogs, setIsLoadingLogs] = useState<boolean>(false);
+  const [isClearingLogs, setIsClearingLogs] = useState<boolean>(false);
+  const [processingLogError, setProcessingLogError] = useState<string | null>(null);
+  const [isSavingSettings, setIsSavingSettings] = useState<boolean>(false);
+  const [isRunningBulkAction, setIsRunningBulkAction] = useState<boolean>(false);
+  const [error, setError] = useState<string | null>(null);
+  const [dialogState, setDialogState] = useState<DialogState | null>(null);
+  const dialogResolverRef = useRef<((value: string) => void) | null>(null);
+
+  const pageSize = useMemo(() => {
+    const configured = appSettings?.display?.cards_per_page;
+    if (!configured || Number.isNaN(configured)) {
+      return DEFAULT_PAGE_SIZE;
+    }
+    return Math.max(1, Math.min(200, configured));
+  }, [appSettings]);
+
+  const presentDialog = useCallback((title: string, message: string, options: DialogOption[]): Promise<string> => {
+    setDialogState({ title, message, options });
+    return new Promise<string>((resolve) => {
+      dialogResolverRef.current = resolve;
+    });
+  }, []);
+
+  const requestConfirmation = useCallback(
+    async (title: string, message: string, confirmLabel = 'Confirm'): Promise<boolean> => {
+      const choice = await presentDialog(title, message, [
+        { key: 'cancel', label: 'Cancel', tone: 'neutral' },
+        { key: 'confirm', label: confirmLabel, tone: 'danger' },
+      ]);
+      return choice === 'confirm';
+    },
+    [presentDialog],
+  );
+
+  const closeDialog = useCallback((key: string): void => {
+    const resolver = dialogResolverRef.current;
+    dialogResolverRef.current = null;
+    setDialogState(null);
+    if (resolver) {
+      resolver(key);
+    }
+  }, []);
+
+  const downloadBlob = useCallback((blob: Blob, filename: string): void => {
+    const objectUrl = URL.createObjectURL(blob);
+    const anchor = document.createElement('a');
+    anchor.href = objectUrl;
+    anchor.download = filename;
+    anchor.click();
+    URL.revokeObjectURL(objectUrl);
+  }, []);
+
+  const loadCatalogs = useCallback(async (): Promise<void> => {
+    const [tags, paths, types] = await Promise.all([listTags(true), listPaths(true), listTypes(true)]);
+    setKnownTags(tags);
+    setKnownPaths(paths);
+    setKnownTypes(types);
+  }, []);
+
+  const loadDocuments = useCallback(async (options?: { silent?: boolean }): Promise<void> => {
+    const silent = options?.silent ?? false;
+    if (!silent) {
+      setIsLoading(true);
+      setError(null);
+    }
+    try {
+      const offset = (currentPage - 1) * pageSize;
+      const search = activeSearchQuery.trim();
+      const filters = {
+        pathFilter,
+        tagFilter,
+        typeFilter,
+        processedFrom,
+        processedTo,
+      };
+      const payload =
+        search.length > 0
+          ? await searchDocuments(search, {
+              limit: pageSize,
+              offset,
+              onlyTrashed: documentView === 'trash',
+              ...filters,
+            })
+          : await listDocuments({
+              limit: pageSize,
+              offset,
+              onlyTrashed: documentView === 'trash',
+              ...filters,
+            });
+
+      setDocuments(payload.items);
+      setTotalDocuments(payload.total);
+      if (payload.items.length === 0) {
+        setSelectedDocumentId(null);
+      } else if (!payload.items.some((item) => item.id === selectedDocumentId)) {
+        setSelectedDocumentId(payload.items[0].id);
+      }
+      setSelectedDocumentIds((current) => current.filter((documentId) => payload.items.some((item) => item.id === documentId)));
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to load documents');
+    } finally {
+      if (!silent) {
+        setIsLoading(false);
+      }
+    }
+  }, [
+    activeSearchQuery,
+    currentPage,
+    documentView,
+    pageSize,
+    pathFilter,
+    processedFrom,
+    processedTo,
+    selectedDocumentId,
+    tagFilter,
+    typeFilter,
+  ]);
+
+  const loadSettings = useCallback(async (): Promise<void> => {
+    setError(null);
+    try {
+      const payload = await getAppSettings();
+      setAppSettings(payload);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to load settings');
+    }
+  }, []);
+
+  const loadProcessingTimeline = useCallback(async (options?: { silent?: boolean }): Promise<void> => {
+    const silent = options?.silent ?? false;
+    if (!silent) {
+      setIsLoadingLogs(true);
+    }
+    try {
+      const payload = await listProcessingLogs({ limit: 180 });
+      setProcessingLogs(payload.items);
+      setProcessingLogError(null);
+    } catch (caughtError) {
+      setProcessingLogError(caughtError instanceof Error ? caughtError.message : 'Failed to load processing logs');
+    } finally {
+      if (!silent) {
+        setIsLoadingLogs(false);
+      }
+    }
+  }, []);
+
+  useEffect(() => {
+    const bootstrap = async (): Promise<void> => {
+      try {
+        await Promise.all([loadDocuments(), loadCatalogs(), loadSettings(), loadProcessingTimeline()]);
+      } catch (caughtError) {
+        setError(caughtError instanceof Error ? caughtError.message : 'Failed to initialize application');
+      }
+    };
+    void bootstrap();
+  }, [loadCatalogs, loadDocuments, loadProcessingTimeline, loadSettings]);
+
+  useEffect(() => {
+    setSelectedDocumentIds([]);
+    setCurrentPage(1);
+  }, [documentView, pageSize]);
+
+  useEffect(() => {
+    if (screen !== 'documents') {
+      return;
+    }
+    void loadDocuments();
+  }, [loadDocuments, screen]);
+
+  useEffect(() => {
+    if (screen !== 'documents') {
+      return;
+    }
+    const pollInterval = window.setInterval(() => {
+      void loadDocuments({ silent: true });
+    }, 3000);
+    return () => window.clearInterval(pollInterval);
+  }, [loadDocuments, screen]);
+
+  useEffect(() => {
+    if (screen !== 'documents') {
+      return;
+    }
+    void loadProcessingTimeline();
+    const pollInterval = window.setInterval(() => {
+      void loadProcessingTimeline({ silent: true });
+    }, 1500);
+    return () => window.clearInterval(pollInterval);
+  }, [loadProcessingTimeline, screen]);
+
+  const selectedDocument = useMemo(
+    () => documents.find((document) => document.id === selectedDocumentId) ?? null,
+    [documents, selectedDocumentId],
+  );
+  const totalPages = useMemo(() => Math.max(1, Math.ceil(totalDocuments / pageSize)), [pageSize, totalDocuments]);
+  const allVisibleSelected = useMemo(
+    () => documents.length > 0 && documents.every((document) => selectedDocumentIds.includes(document.id)),
+    [documents, selectedDocumentIds],
+  );
+  const isProcessingActive = useMemo(() => documents.some((document) => document.status === 'queued'), [documents]);
+  const typingAnimationEnabled = appSettings?.display?.log_typing_animation_enabled ?? true;
+  const hasActiveSearch = Boolean(
+    activeSearchQuery.trim() || tagFilter || typeFilter || pathFilter || processedFrom || processedTo,
+  );
+
+  const handleUpload = useCallback(async (files: File[]): Promise<void> => {
+    if (files.length === 0) {
+      return;
+    }
+    setIsUploading(true);
+    setError(null);
+    try {
+      const uploadDefaults = appSettings?.upload_defaults ?? { logical_path: 'Inbox', tags: [] };
+      const tagsCsv = uploadDefaults.tags.join(',');
+      const firstAttempt = await uploadDocuments(files, {
+        logicalPath: uploadDefaults.logical_path,
+        tags: tagsCsv,
+        conflictMode: 'ask',
+      });
+
+      if (firstAttempt.conflicts.length > 0) {
+        const choice = await presentDialog(
+          'Upload Conflicts Detected',
+          `${firstAttempt.conflicts.length} file(s) already exist. Replace existing records or keep duplicates?`,
+          [
+            { key: 'duplicate', label: 'Keep Duplicates', tone: 'neutral' },
+            { key: 'replace', label: 'Replace Existing', tone: 'warning' },
+          ],
+        );
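+        // Dismissing the dialog closes it with 'cancel'; only retry the upload
+        // when the user explicitly picked one of the two offered options.
+        if (choice === 'replace' || choice === 'duplicate') {
+          await uploadDocuments(files, { logicalPath: uploadDefaults.logical_path, tags: tagsCsv, conflictMode: choice === 'replace' ? 'replace' : 'duplicate' });
+        }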
+      }
+
+      await Promise.all([loadDocuments(), loadCatalogs(), loadProcessingTimeline()]);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Upload failed');
+    } finally {
+      setIsUploading(false);
+    }
+  }, [appSettings, loadCatalogs, loadDocuments, loadProcessingTimeline, presentDialog]);
+
+  const handleSearch = useCallback(async (): Promise<void> => {
+    setSelectedDocumentIds([]);
+    setCurrentPage(1);
+    setActiveSearchQuery(searchText.trim());
+  }, [searchText]);
+
+  const handleResetSearch = useCallback((): void => {
+    setSearchText('');
+    setActiveSearchQuery('');
+    setTagFilter('');
+    setTypeFilter('');
+    setPathFilter('');
+    setProcessedFrom('');
+    setProcessedTo('');
+    setCurrentPage(1);
+    setSelectedDocumentIds([]);
+  }, []);
+
+  const handleDocumentUpdated = useCallback((updated: DmsDocument): void => {
+    setDocuments((current) => {
+      const shouldAppear = documentView === 'trash' ? updated.status === 'trashed' : updated.status !== 'trashed';
+      if (!shouldAppear) {
+        return current.filter((document) => document.id !== updated.id);
+      }
+      const exists = current.some((document) => document.id === updated.id);
+      if (!exists) {
+        return [updated, ...current];
+      }
+      return current.map((document) => (document.id === updated.id ? updated : document));
+    });
+    // Drop the document from the selection when it no longer belongs to the
+    // current view: restored while viewing trash, or trashed while viewing
+    // active documents.
+    const leftCurrentView =
+      (documentView === 'trash' && updated.status !== 'trashed') ||
+      (documentView === 'active' && updated.status === 'trashed');
+    if (leftCurrentView) {
+      setSelectedDocumentIds((current) => current.filter((id) => id !== updated.id));
+      if (selectedDocumentId === updated.id) {
+        setSelectedDocumentId(null);
+      }
+    }
+    void loadCatalogs();
+  }, [documentView, loadCatalogs, selectedDocumentId]);
+
+  const handleDocumentDeleted = useCallback((documentId: string): void => {
+    setDocuments((current) => current.filter((document) => document.id !== documentId));
+    setSelectedDocumentIds((current) => current.filter((id) => id !== documentId));
+    if (selectedDocumentId === documentId) {
+      setSelectedDocumentId(null);
+    }
+    void loadCatalogs();
+  }, [loadCatalogs, selectedDocumentId]);
+
+  const handleToggleChecked = useCallback((documentId: string, checked: boolean): void => {
+    setSelectedDocumentIds((current) => {
+      if (checked && !current.includes(documentId)) {
+        return [...current, documentId];
+      }
+      if (!checked) {
+        return current.filter((item) => item !== documentId);
+      }
+      return current;
+    });
+  }, []);
+
+  const handleToggleSelectAllVisible = useCallback((): void => {
+    if (documents.length === 0) {
+      return;
+    }
+    if (allVisibleSelected) {
+      setSelectedDocumentIds([]);
+      return;
+    }
+    setSelectedDocumentIds(documents.map((document) => document.id));
+  }, [allVisibleSelected, documents]);
+
+  const handleTrashSelected = useCallback(async (): Promise<void> => {
+    if (selectedDocumentIds.length === 0) {
+      return;
+    }
+    setIsRunningBulkAction(true);
+    setError(null);
+    try {
+      await Promise.all(selectedDocumentIds.map((documentId) => trashDocument(documentId)));
+      setSelectedDocumentIds([]);
+      await Promise.all([loadDocuments(), loadCatalogs()]);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to trash selected documents');
+    } finally {
+      setIsRunningBulkAction(false);
+    }
+  }, [loadCatalogs, loadDocuments, selectedDocumentIds]);
+
+  const handleTrashDocumentCard = useCallback(async (documentId: string): Promise<void> => {
+    if (documentView === 'trash') {
+      return;
+    }
+    setError(null);
+    try {
+      await trashDocument(documentId);
+      setSelectedDocumentIds((current) => current.filter((id) => id !== documentId));
+      if (selectedDocumentId === documentId) {
+        setSelectedDocumentId(null);
+      }
+      await Promise.all([loadDocuments(), loadCatalogs()]);
+    } catch (caughtError) {
+      const message = caughtError instanceof Error ? caughtError.message : 'Failed to trash document';
+      setError(message);
+      throw caughtError instanceof Error ? caughtError : new Error(message);
+    }
+  }, [documentView, loadCatalogs, loadDocuments, selectedDocumentId]);
+
+  const handleDeleteSelected = useCallback(async (): Promise<void> => {
+    if (selectedDocumentIds.length === 0) {
+      return;
+    }
+    const confirmed = await requestConfirmation(
+      'Delete Selected Documents Permanently',
+      'This removes selected documents and stored files permanently.',
+      'Delete Permanently',
+    );
+    if (!confirmed) {
+      return;
+    }
+    setIsRunningBulkAction(true);
+    setError(null);
+    try {
+      await Promise.all(selectedDocumentIds.map((documentId) => deleteDocument(documentId)));
+      setSelectedDocumentIds([]);
+      await Promise.all([loadDocuments(), loadCatalogs()]);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to delete selected documents');
+    } finally {
+      setIsRunningBulkAction(false);
+    }
+  }, [loadCatalogs, loadDocuments, requestConfirmation, selectedDocumentIds]);
+
+  const handleExportSelected = useCallback(async (): Promise<void> => {
+    if (selectedDocumentIds.length === 0) {
+      return;
+    }
+    setIsRunningBulkAction(true);
+    setError(null);
+    try {
+      const result = await exportContentsMarkdown({
+        document_ids: selectedDocumentIds,
+        only_trashed: documentView === 'trash',
+        include_trashed: false,
+      });
+      downloadBlob(result.blob, result.filename);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to export selected markdown files');
+    } finally {
+      setIsRunningBulkAction(false);
+    }
+  }, [documentView, downloadBlob, selectedDocumentIds]);
+
+  const handleExportPath = useCallback(async (): Promise<void> => {
+    const trimmedPrefix = exportPathInput.trim();
+    if (!trimmedPrefix) {
+      setError('Enter a path prefix before exporting by path');
+      return;
+    }
+    setIsRunningBulkAction(true);
+    setError(null);
+    try {
+      const result = await exportContentsMarkdown({
+        path_prefix: trimmedPrefix,
+        only_trashed: documentView === 'trash',
+        include_trashed: false,
+      });
+      downloadBlob(result.blob, result.filename);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to export path markdown files');
+    } finally {
+      setIsRunningBulkAction(false);
+    }
+  }, [documentView, downloadBlob, exportPathInput]);
+
+  const handleSaveSettings = useCallback(async (payload: AppSettingsUpdate): Promise<void> => {
+    setIsSavingSettings(true);
+    setError(null);
+    try {
+      const updated = await updateAppSettings(payload);
+      setAppSettings(updated);
+      await loadCatalogs();
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to save settings');
+      throw caughtError;
+    } finally {
+      setIsSavingSettings(false);
+    }
+  }, [loadCatalogs]);
+
+  const handleSaveSettingsFromHeader = useCallback(async (): Promise<void> => {
+    if (!settingsSaveAction) {
+      setError('Settings are still loading');
+      return;
+    }
+    await settingsSaveAction();
+  }, [settingsSaveAction]);
+
+  const handleRegisterSettingsSaveAction = useCallback((action: (() => Promise<void>) | null): void => {
+    setSettingsSaveAction(() => action);
+  }, []);
+
+  const handleResetSettings = useCallback(async (): Promise<void> => {
+    const confirmed = await requestConfirmation(
+      'Reset Settings',
+      'This resets all settings to defaults and overwrites current values.',
+      'Reset Settings',
+    );
+    if (!confirmed) {
+      return;
+    }
+    setIsSavingSettings(true);
+    setError(null);
+    try {
+      const updated = await resetAppSettings();
+      setAppSettings(updated);
+      await loadCatalogs();
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to reset settings');
+    } finally {
+      setIsSavingSettings(false);
+    }
+  }, [loadCatalogs, requestConfirmation]);
+
+  const handleClearProcessingLogs = useCallback(async (): Promise<void> => {
+    const confirmed = await requestConfirmation(
+      'Clear Processing Log',
+      'This clears the full diagnostics timeline.',
+      'Clear Logs',
+    );
+    if (!confirmed) {
+      return;
+    }
+    setIsClearingLogs(true);
+    try {
+      await clearProcessingLogs();
+      await loadProcessingTimeline();
+      setProcessingLogError(null);
+    } catch (caughtError) {
+      setProcessingLogError(caughtError instanceof Error ? caughtError.message : 'Failed to clear processing logs');
+    } finally {
+      setIsClearingLogs(false);
+    }
+  }, [loadProcessingTimeline, requestConfirmation]);
+
+  const handleFilterPathFromCard = useCallback((pathValue: string): void => {
+    setActiveSearchQuery('');
+    setSearchText('');
+    setTagFilter('');
+    setTypeFilter('');
+    setProcessedFrom('');
+    setProcessedTo('');
+    setPathFilter(pathValue);
+    setCurrentPage(1);
+  }, []);
+
+  const handleFilterTagFromCard = useCallback((tagValue: string): void => {
+    setActiveSearchQuery('');
+    setSearchText('');
+    setPathFilter('');
+    setTypeFilter('');
+    setProcessedFrom('');
+    setProcessedTo('');
+    setTagFilter(tagValue);
+    setCurrentPage(1);
+  }, []);
+
+  return (
+    <main className="app-shell">
+      <header className="topbar">
+        <div>
+          <h1>LedgerDock</h1>
+          <p>Document command deck for OCR, routing intelligence, and controlled metadata ops.</p>
+        </div>
+        <div className="topbar-controls">
+          <div className="topbar-nav-group">
+            <button
+              type="button"
+              className={screen === 'documents' && documentView === 'active' ? 'active-view-button' : 'secondary-action'}
+              onClick={() => {
+                setScreen('documents');
+                setDocumentView('active');
+              }}
+            >
+              Documents
+            </button>
+            <button
+              type="button"
+              className={screen === 'documents' && documentView === 'trash' ? 'active-view-button' : 'secondary-action'}
+              onClick={() => {
+                setScreen('documents');
+                setDocumentView('trash');
+              }}
+            >
+              Trash
+            </button>
+            <button
+              type="button"
+              className={screen === 'settings' ? 'active-view-button' : 'secondary-action'}
+              onClick={() => setScreen('settings')}
+            >
+              Settings
+            </button>
+          </div>
+
+          {screen === 'documents' && (
+            <div className="topbar-document-group">
+              <UploadSurface onUploadRequested={handleUpload} isUploading={isUploading} variant="inline" />
+            </div>
+          )}
+
+          {screen === 'settings' && (
+            <div className="topbar-settings-group">
+              <button type="button" className="secondary-action" onClick={() => void handleResetSettings()} disabled={isSavingSettings}>
+                Reset To Defaults
+              </button>
+              <button type="button" onClick={() => void handleSaveSettingsFromHeader()} disabled={isSavingSettings || !settingsSaveAction}>
+                {isSavingSettings ? 'Saving Settings...' : 'Save Settings'}
+              </button>
+            </div>
+          )}
+        </div>
+      </header>
+
+      {error && <p className="error-banner">{error}</p>}
+
+      {screen === 'settings' && (
+        <SettingsScreen
+          settings={appSettings}
+          isSaving={isSavingSettings}
+          knownTags={knownTags}
+          knownPaths={knownPaths}
+          onSave={handleSaveSettings}
+          onRegisterSaveAction={handleRegisterSettingsSaveAction}
+        />
+      )}
+
+      {screen === 'documents' && (
+        <>
+          <section className="layout-grid">
+            <div>
+              <div className="panel-header document-panel-header">
+                <div className="document-panel-title-row">
+                  <h2>{documentView === 'trash' ? 'Trashed Documents' : 'Documents'}</h2>
+                  <p>{isLoading ? 'Loading...' : `${totalDocuments} document(s)`}</p>
+                </div>
+                <SearchFiltersBar
+                  searchText={searchText}
+                  onSearchTextChange={setSearchText}
+                  onSearchSubmit={() => void handleSearch()}
+                  onReset={handleResetSearch}
+                  hasActiveSearch={hasActiveSearch}
+                  knownTags={knownTags}
+                  knownPaths={knownPaths}
+                  knownTypes={knownTypes}
+                  tagFilter={tagFilter}
+                  onTagFilterChange={(value) => {
+                    setTagFilter(value);
+                    setCurrentPage(1);
+                  }}
+                  typeFilter={typeFilter}
+                  onTypeFilterChange={(value) => {
+                    setTypeFilter(value);
+                    setCurrentPage(1);
+                  }}
+                  pathFilter={pathFilter}
+                  onPathFilterChange={(value) => {
+                    setPathFilter(value);
+                    setCurrentPage(1);
+                  }}
+                  processedFrom={processedFrom}
+                  onProcessedFromChange={(value) => {
+                    setProcessedFrom(value);
+                    setCurrentPage(1);
+                  }}
+                  processedTo={processedTo}
+                  onProcessedToChange={(value) => {
+                    setProcessedTo(value);
+                    setCurrentPage(1);
+                  }}
+                  isLoading={isLoading}
+                />
+                <div className="document-toolbar-row">
+                  <div className="document-toolbar-pagination compact-pagination">
+                    <button type="button" className="secondary-action" onClick={() => setCurrentPage(1)} disabled={isLoading || currentPage <= 1}>
+                      First
+                    </button>
+                    <button type="button" className="secondary-action" onClick={() => setCurrentPage((current) => Math.max(1, current - 1))} disabled={isLoading || currentPage <= 1}>
+                      Prev
+                    </button>
+                    <span className="small">Page {currentPage} / {totalPages}</span>
+                    <button type="button" className="secondary-action" onClick={() => setCurrentPage((current) => Math.min(totalPages, current + 1))} disabled={isLoading || currentPage >= totalPages}>
+                      Next
+                    </button>
+                    <button type="button" className="secondary-action" onClick={() => setCurrentPage(totalPages)} disabled={isLoading || currentPage >= totalPages}>
+                      Last
+                    </button>
+                  </div>
+                </div>
+                <div className="document-toolbar-row">
+                  <div className="document-toolbar-selection">
+                    <span className="small">Select:</span>
+                    <button type="button" className="secondary-action" onClick={handleToggleSelectAllVisible} disabled={documents.length === 0}>
+                      {allVisibleSelected ? 'Unselect Page' : 'Select Page'}
+                    </button>
+                    <span className="small">Selected {selectedDocumentIds.length}</span>
+                    {documentView !== 'trash' && (
+                      <button type="button" className="warning-action" onClick={() => void handleTrashSelected()} disabled={isRunningBulkAction || selectedDocumentIds.length === 0}>
+                        Move To Trash
+                      </button>
+                    )}
+                    {documentView === 'trash' && (
+                      <button type="button" className="danger-action" onClick={() => void handleDeleteSelected()} disabled={isRunningBulkAction || selectedDocumentIds.length === 0}>
+                        Delete Permanently
+                      </button>
+                    )}
+                    <button type="button" className="secondary-action" onClick={() => void handleExportSelected()} disabled={isRunningBulkAction || selectedDocumentIds.length === 0}>
+                      Export Selected MD
+                    </button>
+                  </div>
+                  <div className="document-toolbar-export-path">
+                    <PathInput value={exportPathInput} onChange={setExportPathInput} placeholder="Export by path prefix" suggestions={knownPaths} />
+                    <button type="button" className="secondary-action" onClick={() => void handleExportPath()} disabled={isRunningBulkAction}>
+                      Export Path MD
+                    </button>
+                  </div>
+                </div>
+              </div>
+              <DocumentGrid
+                documents={documents}
+                selectedDocumentId={selectedDocumentId}
+                selectedDocumentIds={selectedDocumentIds}
+                isTrashView={documentView === 'trash'}
+                onSelect={(document) => setSelectedDocumentId(document.id)}
+                onToggleChecked={handleToggleChecked}
+                onTrashDocument={handleTrashDocumentCard}
+                onFilterPath={handleFilterPathFromCard}
+                onFilterTag={handleFilterTagFromCard}
+              />
+            </div>
+            <DocumentViewer
+              document={selectedDocument}
+              isTrashView={documentView === 'trash'}
+              existingTags={knownTags}
+              existingPaths={knownPaths}
+              onDocumentUpdated={handleDocumentUpdated}
+              onDocumentDeleted={handleDocumentDeleted}
+              requestConfirmation={requestConfirmation}
+            />
+          </section>
+          {processingLogError && <p className="error-banner">{processingLogError}</p>}
+          <ProcessingLogPanel
+            entries={processingLogs}
+            isLoading={isLoadingLogs}
+            isClearing={isClearingLogs}
+            selectedDocumentId={selectedDocumentId}
+            isProcessingActive={isProcessingActive}
+            typingAnimationEnabled={typingAnimationEnabled}
+            onClear={() => void handleClearProcessingLogs()}
+          />
+        </>
+      )}
+
+      <ActionModal
+        isOpen={dialogState !== null}
+        title={dialogState?.title ?? ''}
+        message={dialogState?.message ?? ''}
+        options={dialogState?.options ?? []}
+        onSelect={closeDialog}
+        onDismiss={() => closeDialog('cancel')}
+      />
+    </main>
+  );
+}
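@@ -0,0 +1,86 @@
+/**
+ * Reusable modal for confirmations and multi-action prompts.
+ */
+
+/**
+ * Describes one selectable action button rendered in the modal.
+ */
+interface ActionModalOption {
+  key: string;
+  label: string;
+  tone?: 'neutral' | 'primary' | 'warning' | 'danger';
+}
+
+/**
+ * Defines props for the action modal, including selection and dismissal callbacks.
+ */
+interface ActionModalProps {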
+  isOpen: boolean;
+  title: string;
+  message: string;
+  options: ActionModalOption[];
+  onSelect: (key: string) => void;
+  onDismiss: () => void;
+}
+
+/**
+ * Renders a centered modal dialog with configurable action buttons.
+ */
+export default function ActionModal({
+  isOpen,
+  title,
+  message,
+  options,
+  onSelect,
+  onDismiss,
+}: ActionModalProps): JSX.Element | null {
+  if (!isOpen) {
+    return null;
+  }
+
+  return (
+    <div
+      className="modal-backdrop"
+      role="button"
+      tabIndex={0}
+      aria-label="Close dialog"
+      onClick={onDismiss}
+      onKeyDown={(event) => {
+        if (event.key === 'Escape') {
+          onDismiss();
+        }
+      }}
+    >
+      <section
+        className="action-modal"
+        role="dialog"
+        aria-modal="true"
+        aria-labelledby="action-modal-title"
+        onClick={(event) => event.stopPropagation()}
+      >
+        <h3 id="action-modal-title">{title}</h3>
+        <p>{message}</p>
+        <div className="action-modal-buttons">
+          {options.map((option) => (
+            <button
+              key={option.key}
+              type="button"
+              className={
+                option.tone === 'danger'
+                  ? 'danger-action'
+                  : option.tone === 'warning'
+                    ? 'warning-action'
+                    : option.tone === 'neutral'
+                      ? 'secondary-action'
+                      : ''
+              }
+              onClick={() => onSelect(option.key)}
+            >
+              {option.label}
+            </button>
+          ))}
+        </div>
+      </section>
+    </div>
+  );
+}
@@ -0,0 +1,220 @@
+/**
+ * Card view for displaying document summary, preview, and metadata.
+ */
+import { useState } from 'react';
+import { Download, FileText, Trash2 } from 'lucide-react';
+
+import type { DmsDocument } from '../types';
+import { contentMarkdownUrl, downloadUrl, thumbnailUrl } from '../lib/api';
+
+/**
+ * Defines properties accepted by the document card component.
+ */
+interface DocumentCardProps {
+  document: DmsDocument;
+  isSelected: boolean;
+  isChecked: boolean;
+  isTrashView: boolean;
+  onSelect: (document: DmsDocument) => void;
+  onToggleChecked: (documentId: string, checked: boolean) => void;
+  onTrashDocument: (documentId: string) => Promise<void>;
+  onFilterPath: (path: string) => void;
+  onFilterTag: (tag: string) => void;
+}
+
+/**
+ * Defines visual processing status variants rendered in the card header indicator.
+ */
+type StatusTone = 'success' | 'progress' | 'failed';
+
+/**
+ * Resolves status tone and tooltip text from backend document status values.
+ */
+function statusPresentation(status: DmsDocument['status']): { tone: StatusTone; tooltip: string } {
+  if (status === 'processed') {
+    return { tone: 'success', tooltip: 'Processing status: success' };
+  }
+  if (status === 'queued') {
+    return { tone: 'progress', tooltip: 'Processing status: in progress' };
+  }
+  if (status === 'error') {
+    return { tone: 'failed', tooltip: 'Processing status: failed' };
+  }
+  if (status === 'unsupported') {
+    return { tone: 'failed', tooltip: 'Processing status: failed (unsupported type)' };
+  }
+  return { tone: 'success', tooltip: 'Processing status: success (moved to trash)' };
+}
+
+/**
+ * Limits logical-path length while preserving start and end context with middle ellipsis.
+ */
+function compactLogicalPath(path: string, maxChars = 180): string {
+  const normalized = path.trim();
+  if (!normalized) {
+    return '';
+  }
+  if (normalized.length <= maxChars) {
+    return normalized;
+  }
+  const keepChars = Math.max(12, maxChars - 3);
+  const headChars = Math.ceil(keepChars * 0.6);
+  const tailChars = keepChars - headChars;
+  return `${normalized.slice(0, headChars)}...${normalized.slice(-tailChars)}`;
+}
+
+/**
+ * Renders one document card with optional image preview and searchable metadata.
+ */
+export default function DocumentCard({
+  document,
+  isSelected,
+  isChecked,
+  isTrashView,
+  onSelect,
+  onToggleChecked,
+  onTrashDocument,
+  onFilterPath,
+  onFilterTag,
+}: DocumentCardProps): JSX.Element {
+  const [isTrashing, setIsTrashing] = useState<boolean>(false);
+  const createdDate = new Date(document.created_at).toLocaleString();
+  const status = statusPresentation(document.status);
+  const compactPath = compactLogicalPath(document.logical_path, 180);
+  const trashDisabled = isTrashView || document.status === 'trashed' || isTrashing;
+  const trashTitle = isTrashing ? 'Moving to trash...' : trashDisabled ? 'Already in trash' : 'Move to trash';
+
+  return (
+    <article
+      className={`document-card ${isSelected ? 'selected' : ''}`}
+      role="button"
+      tabIndex={0}
+      onClick={() => onSelect(document)}
+      onKeyDown={(event) => {
+        if (event.currentTarget !== event.target) {
+          return;
+        }
+        if (event.key === 'Enter' || event.key === ' ') {
+          event.preventDefault();
+          onSelect(document);
+        }
+      }}
+    >
+      <header className="document-card-header">
+        <div
+          className={`card-status-indicator ${status.tone}`}
+          title={status.tooltip}
+          aria-label={status.tooltip}
+        />
+        <label className="card-checkbox card-checkbox-compact" onClick={(event) => event.stopPropagation()}>
+          <input
+            type="checkbox"
+            checked={isChecked}
+            onChange={(event) => onToggleChecked(document.id, event.target.checked)}
+            onClick={(event) => event.stopPropagation()}
+            aria-label={`Select ${document.original_filename}`}
+            title="Select document"
+          />
+        </label>
+      </header>
+      <div className="document-preview">
+        {document.preview_available ? (
+          <img src={thumbnailUrl(document.id)} alt={document.original_filename} loading="lazy" />
+        ) : (
+          <div className="document-preview-fallback">{document.extension || 'file'}</div>
+        )}
+      </div>
+      <div className="document-content document-card-body">
+        <h3 title={`${document.logical_path}/${document.original_filename}`}>
+          <span className="document-title-path">{compactPath}/</span>
+          <span className="document-title-name">{document.original_filename}</span>
+        </h3>
+        <p className="document-date">{createdDate}</p>
+      </div>
+      <footer className="document-card-footer">
+        <div className="card-footer-discovery">
+          <button
+            type="button"
+            className="card-chip path-chip"
+            onClick={(event) => {
+              event.preventDefault();
+              event.stopPropagation();
+              onFilterPath(document.logical_path);
+            }}
+            title={`Filter by path: ${document.logical_path}`}
+          >
+            {document.logical_path}
+          </button>
+          <div className="card-chip-row">
+            {document.tags.slice(0, 4).map((tag) => (
+              <button
+                key={`${document.id}-${tag}`}
+                type="button"
+                className="card-chip"
+                onClick={(event) => {
+                  event.preventDefault();
+                  event.stopPropagation();
+                  onFilterTag(tag);
+                }}
+                title={`Filter by tag: ${tag}`}
+              >
+                #{tag}
+              </button>
+            ))}
+          </div>
+        </div>
+        <div className="card-action-row">
+          <button
+            type="button"
+            className="card-icon-button"
+            aria-label="Download original"
+            title="Download original"
+            onClick={(event) => {
+              event.preventDefault();
+              event.stopPropagation();
+              window.open(downloadUrl(document.id), '_blank', 'noopener,noreferrer');
+            }}
+          >
+            <Download aria-hidden="true" />
+          </button>
+          <button
+            type="button"
+            className="card-icon-button"
+            aria-label="Export recognized text as markdown"
+            title="Export recognized text as markdown"
+            onClick={(event) => {
+              event.preventDefault();
+              event.stopPropagation();
+              window.open(contentMarkdownUrl(document.id), '_blank', 'noopener,noreferrer');
+            }}
+          >
+            <FileText aria-hidden="true" />
+          </button>
+          <button
+            type="button"
+            className="card-icon-button danger"
+            aria-label={trashTitle}
+            title={trashTitle}
+            disabled={trashDisabled}
+            onClick={async (event) => {
+              event.preventDefault();
+              event.stopPropagation();
+              if (trashDisabled) {
+                return;
+              }
+              setIsTrashing(true);
+              try {
+                await onTrashDocument(document.id);
+              } catch { // Intentionally ignored: the onTrashDocument handler already surfaces the error.
+              } finally {
+                setIsTrashing(false);
+              }
+            }}
+          >
+            <Trash2 aria-hidden="true" />
+          </button>
+        </div>
+      </footer>
+    </article>
+  );
+}
@@ -0,0 +1,54 @@
+/**
+ * Grid renderer for document collections.
+ */
+import type { DmsDocument } from '../types';
+import DocumentCard from './DocumentCard';
+
+/**
+ * Defines props for document grid rendering.
+ */
+interface DocumentGridProps {
+  documents: DmsDocument[];
+  selectedDocumentId: string | null;
+  selectedDocumentIds: string[];
+  isTrashView: boolean;
+  onSelect: (document: DmsDocument) => void;
+  onToggleChecked: (documentId: string, checked: boolean) => void;
+  onTrashDocument: (documentId: string) => Promise<void>;
+  onFilterPath: (path: string) => void;
+  onFilterTag: (tag: string) => void;
+}
+
+/**
+ * Renders cards in a responsive grid with selection state.
+ */
+export default function DocumentGrid({
+  documents,
+  selectedDocumentId,
+  selectedDocumentIds,
+  isTrashView,
+  onSelect,
+  onToggleChecked,
+  onTrashDocument,
+  onFilterPath,
+  onFilterTag,
+}: DocumentGridProps): JSX.Element {
+  return (
+    <section className="document-grid">
+      {documents.map((document) => (
+        <DocumentCard
+          key={document.id}
+          document={document}
+          onSelect={onSelect}
+          isSelected={selectedDocumentId === document.id}
+          isChecked={selectedDocumentIds.includes(document.id)}
+          onToggleChecked={onToggleChecked}
+          isTrashView={isTrashView}
+          onTrashDocument={onTrashDocument}
+          onFilterPath={onFilterPath}
+          onFilterTag={onFilterTag}
+        />
+      ))}
+    </section>
+  );
+}
@@ -0,0 +1,585 @@
+/**
+ * Embedded document viewer panel for preview, metadata updates, and lifecycle actions.
+ */
+import { useEffect, useMemo, useState } from 'react';
+
+import {
+  contentMarkdownUrl,
+  deleteDocument,
+  getDocumentDetails,
+  previewUrl,
+  reprocessDocument,
+  restoreDocument,
+  trashDocument,
+  updateDocumentMetadata,
+} from '../lib/api';
+import type { DmsDocument, DmsDocumentDetail } from '../types';
+import PathInput from './PathInput';
+import TagInput from './TagInput';
+
+/**
+ * Defines props for the selected document viewer panel.
+ */
+interface DocumentViewerProps {
+  document: DmsDocument | null;
+  isTrashView: boolean;
+  existingTags: string[];
+  existingPaths: string[];
+  onDocumentUpdated: (document: DmsDocument) => void;
+  onDocumentDeleted: (documentId: string) => void;
+  requestConfirmation: (title: string, message: string, confirmLabel?: string) => Promise<boolean>;
+}
+
+/**
+ * Renders selected document preview with editable metadata and lifecycle controls.
+ */
+export default function DocumentViewer({
+  document,
+  isTrashView,
+  existingTags,
+  existingPaths,
+  onDocumentUpdated,
+  onDocumentDeleted,
+  requestConfirmation,
+}: DocumentViewerProps): JSX.Element {
+  const [documentDetail, setDocumentDetail] = useState<DmsDocumentDetail | null>(null);
+  const [isLoadingDetails, setIsLoadingDetails] = useState<boolean>(false);
+  const [originalFilename, setOriginalFilename] = useState<string>('');
+  const [logicalPath, setLogicalPath] = useState<string>('');
+  const [tags, setTags] = useState<string[]>([]);
+  const [isSaving, setIsSaving] = useState<boolean>(false);
+  const [isReprocessing, setIsReprocessing] = useState<boolean>(false);
+  const [isTrashing, setIsTrashing] = useState<boolean>(false);
+  const [isRestoring, setIsRestoring] = useState<boolean>(false);
+  const [isDeleting, setIsDeleting] = useState<boolean>(false);
+  const [isMetadataDirty, setIsMetadataDirty] = useState<boolean>(false);
+  const [error, setError] = useState<string | null>(null);
+
+  /**
+   * Syncs editable metadata fields whenever selection changes.
+   */
+  useEffect(() => {
+    if (!document) {
+      setDocumentDetail(null);
+      setIsMetadataDirty(false);
+      return;
+    }
+    setOriginalFilename(document.original_filename);
+    setLogicalPath(document.logical_path);
+    setTags(document.tags);
+    setIsMetadataDirty(false);
+    setError(null);
+  }, [document?.id]);
+
+  /**
+   * Refreshes editable metadata from list updates only while form is clean.
+   */
+  useEffect(() => {
+    if (!document || isMetadataDirty) {
+      return;
+    }
+    setOriginalFilename(document.original_filename);
+    setLogicalPath(document.logical_path);
+    setTags(document.tags);
+  }, [
+    document?.id,
+    document?.original_filename,
+    document?.logical_path,
+    document?.tags,
+    isMetadataDirty,
+  ]);
+
+  /**
+   * Loads full selected-document details for extracted text and metadata display.
+   */
+  useEffect(() => {
+    if (!document) {
+      return;
+    }
+    let cancelled = false;
+    const loadDocumentDetails = async (): Promise<void> => {
+      setIsLoadingDetails(true);
+      try {
+        const payload = await getDocumentDetails(document.id);
+        if (!cancelled) {
+          setDocumentDetail(payload);
+        }
+      } catch (caughtError) {
+        if (!cancelled) {
+          setError(caughtError instanceof Error ? caughtError.message : 'Failed to load document details');
+        }
+      } finally {
+        if (!cancelled) {
+          setIsLoadingDetails(false);
+        }
+      }
+    };
+
+    void loadDocumentDetails();
+    return () => {
+      cancelled = true;
+    };
+  }, [document?.id]);
+
+  /**
+   * Resolves whether selected document should render as an image element in preview.
+   */
+  const isImageDocument = useMemo(() => {
+    if (!document) {
+      return false;
+    }
+    return document.mime_type.startsWith('image/');
+  }, [document]);
+
+  /**
+   * Extracts provider/transcription errors from document metadata for user visibility.
+   */
+  const transcriptionError = useMemo(() => {
+    const value = documentDetail?.metadata_json?.transcription_error;
+    return typeof value === 'string' ? value : '';
+  }, [documentDetail]);
+
+  /**
+   * Extracts routing errors from metadata to surface classification issues.
+   */
+  const routingError = useMemo(() => {
+    const value = documentDetail?.metadata_json?.routing_error;
+    return typeof value === 'string' ? value : '';
+  }, [documentDetail]);
+
+  /**
+   * Builds a compact routing status summary for user visibility.
+   */
+  const routingSummary = useMemo(() => {
+    const value = documentDetail?.metadata_json?.routing;
+    if (!value || typeof value !== 'object') {
+      return '';
+    }
+
+    const routing = value as Record<string, unknown>;
+    const confidence = typeof routing.confidence === 'number' ? routing.confidence : null;
+    const similarity = typeof routing.neighbor_similarity === 'number' ? routing.neighbor_similarity : null;
+    const confidenceThreshold =
+      typeof routing.auto_apply_confidence_threshold === 'number'
+        ? routing.auto_apply_confidence_threshold
+        : null;
+    const autoApplied = typeof routing.auto_applied === 'boolean' ? routing.auto_applied : null;
+    const autoAppliedPath = typeof routing.auto_applied_path === 'boolean' ? routing.auto_applied_path : null;
+    const autoAppliedTags = typeof routing.auto_applied_tags === 'boolean' ? routing.auto_applied_tags : null;
+    const blockedReasonsRaw = routing.auto_apply_blocked_reasons;
+    const blockedReasons =
+      Array.isArray(blockedReasonsRaw) && blockedReasonsRaw.length > 0
+        ? blockedReasonsRaw
+            .map((reason) => String(reason))
+            .map((reason) => {
+              if (reason === 'missing_chosen_path') {
+                return 'no chosen path';
+              }
+              if (reason === 'confidence_below_threshold') {
+                return 'confidence below threshold';
+              }
+              if (reason === 'neighbor_similarity_below_threshold') {
+                return 'neighbor similarity below threshold';
+              }
+              return reason;
+            })
+        : [];
+
+    const parts: string[] = [];
+    if (autoApplied !== null) {
+      parts.push(`Auto Applied: ${autoApplied ? 'yes' : 'no'}`);
+    }
+    if (autoApplied) {
+      const appliedTargets: string[] = [];
+      if (autoAppliedPath) {
+        appliedTargets.push('path');
+      }
+      if (autoAppliedTags) {
+        appliedTargets.push('tags');
+      }
+      if (appliedTargets.length > 0) {
+        parts.push(`Applied: ${appliedTargets.join(' + ')}`);
+      }
+    }
+    if (confidence !== null) {
+      if (confidenceThreshold !== null) {
+        parts.push(`Confidence: ${confidence.toFixed(2)} / ${confidenceThreshold.toFixed(2)}`);
+      } else {
+        parts.push(`Confidence: ${confidence.toFixed(2)}`);
+      }
+    }
+    if (similarity !== null) {
+      parts.push(`Neighbor Similarity (info): ${similarity.toFixed(2)}`);
+    }
+    if (autoApplied === false && blockedReasons.length > 0) {
+      parts.push(`Blocked: ${blockedReasons.join(', ')}`);
+    }
+    return parts.join(' | ');
+  }, [documentDetail]);
+
+  /**
+   * Resolves whether routing already auto-applied path and tags.
+   */
+  const routingAutoApplyState = useMemo(() => {
+    const value = documentDetail?.metadata_json?.routing;
+    if (!value || typeof value !== 'object') {
+      return {
+        autoAppliedPath: false,
+        autoAppliedTags: false,
+      };
+    }
+
+    const routing = value as Record<string, unknown>;
+    return {
+      autoAppliedPath: routing.auto_applied_path === true,
+      autoAppliedTags: routing.auto_applied_tags === true,
+    };
+  }, [documentDetail]);
+
+  /**
+   * Resolves whether any routing suggestion still needs manual application.
+   */
+  const hasPathSuggestion = Boolean(document?.suggested_path) && !routingAutoApplyState.autoAppliedPath;
+  const hasTagSuggestions = (document?.suggested_tags.length ?? 0) > 0 && !routingAutoApplyState.autoAppliedTags;
+  const canApplyAllSuggestions = hasPathSuggestion || hasTagSuggestions;
+
+  /**
+   * Applies suggested path value to editable metadata field.
+   */
+  const applySuggestedPath = (): void => {
+    if (!hasPathSuggestion || !document?.suggested_path) {
+      return;
+    }
+    setLogicalPath(document.suggested_path);
+    setIsMetadataDirty(true);
+  };
+
+  /**
+   * Applies one suggested tag to editable metadata field.
+   */
+  const applySuggestedTag = (tag: string): void => {
+    if (!hasTagSuggestions || tags.includes(tag)) {
+      return;
+    }
+    setTags((current) => (current.includes(tag) ? current : [...current, tag]));
+    setIsMetadataDirty(true);
+  };
+
+  /**
+   * Applies all suggested routing values into editable metadata fields.
+   */
+  const applyAllSuggestions = (): void => {
+    if (hasPathSuggestion && document?.suggested_path) {
+      setLogicalPath(document.suggested_path);
+    }
+    if (hasTagSuggestions && document?.suggested_tags.length) {
+      const nextTags = [...tags];
+      for (const tag of document.suggested_tags) {
+        if (!nextTags.includes(tag)) {
+          nextTags.push(tag);
+        }
+      }
+      setTags(nextTags);
+    }
+    setIsMetadataDirty(true);
+  };
+
+  /**
+   * Persists metadata changes to backend.
+   */
+  const handleSave = async (): Promise<void> => {
+    if (!document) {
+      return;
+    }
+    setIsSaving(true);
+    setError(null);
+    try {
+      const updated = await updateDocumentMetadata(document.id, {
+        original_filename: originalFilename,
+        logical_path: logicalPath,
+        tags,
+      });
+      setOriginalFilename(updated.original_filename);
+      setLogicalPath(updated.logical_path);
+      setTags(updated.tags);
+      setIsMetadataDirty(false);
+      onDocumentUpdated(updated);
+      const payload = await getDocumentDetails(document.id);
+      setDocumentDetail(payload);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to save metadata');
+    } finally {
+      setIsSaving(false);
+    }
+  };
+
+  /**
+   * Re-runs extraction and routing logic for the currently selected document.
+   */
+  const handleReprocess = async (): Promise<void> => {
+    if (!document) {
+      return;
+    }
+    setIsReprocessing(true);
+    setError(null);
+    try {
+      const updated = await reprocessDocument(document.id);
+      onDocumentUpdated(updated);
+      const payload = await getDocumentDetails(document.id);
+      setDocumentDetail(payload);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to reprocess document');
+    } finally {
+      setIsReprocessing(false);
+    }
+  };
+
+  /**
+   * Moves the selected document to trash state.
+   */
+  const handleTrash = async (): Promise<void> => {
+    if (!document) {
+      return;
+    }
+    setIsTrashing(true);
+    setError(null);
+    try {
+      const updated = await trashDocument(document.id);
+      onDocumentUpdated(updated);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to trash document');
+    } finally {
+      setIsTrashing(false);
+    }
+  };
+
+  /**
+   * Restores the selected document from trash.
+   */
+  const handleRestore = async (): Promise<void> => {
+    if (!document) {
+      return;
+    }
+    setIsRestoring(true);
+    setError(null);
+    try {
+      const updated = await restoreDocument(document.id);
+      onDocumentUpdated(updated);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to restore document');
+    } finally {
+      setIsRestoring(false);
+    }
+  };
+
+  /**
+   * Permanently deletes the selected document and associated files.
+   */
+  const handleDelete = async (): Promise<void> => {
+    if (!document) {
+      return;
+    }
+    const confirmed = await requestConfirmation(
+      'Delete Document Permanently',
+      'This removes the document record and stored file from the system.',
+      'Delete Permanently',
+    );
+    if (!confirmed) {
+      return;
+    }
+
+    setIsDeleting(true);
+    setError(null);
+    try {
+      await deleteDocument(document.id);
+      onDocumentDeleted(document.id);
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to delete document');
+    } finally {
+      setIsDeleting(false);
+    }
+  };
+
+  if (!document) {
+    return (
+      <aside className="document-viewer empty">
+        <h2>Document Details</h2>
+        <p>Select a document to preview and manage metadata.</p>
+      </aside>
+    );
+  }
+
+  const isTrashed = document.status === 'trashed' || isTrashView;
+  const metadataDisabled = isTrashed || isSaving || isTrashing || isRestoring || isDeleting;
+
+  return (
+    <aside className="document-viewer">
+      <h2>{document.original_filename}</h2>
+      <p className="small">Status: {document.status}</p>
+      <div className="viewer-preview">
+        {isImageDocument ? (
+          <img src={previewUrl(document.id)} alt={document.original_filename} />
+        ) : (
+          <iframe src={previewUrl(document.id)} title={document.original_filename} />
+        )}
+      </div>
+      <label>
+        File Name
+        <input
+          value={originalFilename}
+          onChange={(event) => {
+            setOriginalFilename(event.target.value);
+            setIsMetadataDirty(true);
+          }}
+          disabled={metadataDisabled}
+        />
+      </label>
+      <label>
+        Destination Path
+        <PathInput
+          value={logicalPath}
+          onChange={(value) => {
+            setLogicalPath(value);
+            setIsMetadataDirty(true);
+          }}
+          suggestions={existingPaths}
+          disabled={metadataDisabled}
+        />
+      </label>
+      <label>
+        Tags
+        <TagInput
+          value={tags}
+          onChange={(value) => {
+            setTags(value);
+            setIsMetadataDirty(true);
+          }}
+          suggestions={existingTags}
+          disabled={metadataDisabled}
+        />
+      </label>
+      {(document.suggested_path || document.suggested_tags.length > 0 || routingSummary || routingError) && (
+        <section className="routing-suggestions-panel">
+          <div className="routing-suggestions-header">
+            <h3>Routing Suggestions</h3>
+            {canApplyAllSuggestions && (
+              <button
+                type="button"
+                className="secondary-action"
+                onClick={applyAllSuggestions}
+                disabled={metadataDisabled}
+              >
+                Apply All
+              </button>
+            )}
+          </div>
+          {routingError && <p className="small error">{routingError}</p>}
+          {routingSummary && <p className="small">{routingSummary}</p>}
+          {hasPathSuggestion && document.suggested_path && (
+            <div className="routing-suggestion-group">
+              <p className="small">Suggested Path</p>
+              <button
+                type="button"
+                className="routing-pill"
+                onClick={applySuggestedPath}
+                disabled={metadataDisabled}
+              >
+                {document.suggested_path}
+              </button>
+            </div>
+          )}
+          {hasTagSuggestions && document.suggested_tags.length > 0 && (
+            <div className="routing-suggestion-group">
+              <p className="small">Suggested Tags</p>
+              <div className="routing-pill-row">
+                {document.suggested_tags.map((tag) => (
+                  <button
+                    key={tag}
+                    type="button"
+                    className="routing-pill"
+                    onClick={() => applySuggestedTag(tag)}
+                    disabled={metadataDisabled}
+                  >
+                    {tag}
+                  </button>
+                ))}
+              </div>
+            </div>
+          )}
+        </section>
+      )}
+      <section className="extracted-text-panel">
+        <h3>Extracted Text</h3>
+        {transcriptionError && <p className="small error">{transcriptionError}</p>}
+        {isLoadingDetails ? (
+          <p className="small">Loading extracted text...</p>
+        ) : documentDetail?.extracted_text.trim() ? (
+          <pre>{documentDetail.extracted_text}</pre>
+        ) : (
+          <p className="small">No extracted text available for this document yet.</p>
+        )}
+      </section>
+      {error && <p className="error">{error}</p>}
+      <div className="viewer-actions">
+        {!isTrashed && (
+          <button type="button" onClick={handleSave} disabled={metadataDisabled}>
+            {isSaving ? 'Saving...' : 'Save Metadata'}
+          </button>
+        )}
+        {!isTrashed && (
+          <button
+            type="button"
+            className="secondary-action"
+            onClick={handleReprocess}
+            disabled={metadataDisabled || isReprocessing}
+            title="Re-runs OCR/extraction, summary generation, routing suggestion, and indexing for this document."
+          >
+            {isReprocessing ? 'Reprocessing...' : 'Reprocess Document'}
+          </button>
+        )}
+        {!isTrashed && (
+          <button
+            type="button"
+            className="warning-action"
+            onClick={handleTrash}
+            disabled={metadataDisabled || isTrashing}
+          >
+            {isTrashing ? 'Trashing...' : 'Move To Trash'}
+          </button>
+        )}
+        {isTrashed && (
+          <button
+            type="button"
+            className="secondary-action"
+            onClick={handleRestore}
+            disabled={isRestoring || isDeleting}
+          >
+            {isRestoring ? 'Restoring...' : 'Restore Document'}
+          </button>
+        )}
+        <button
+          type="button"
+          className="secondary-action"
+          onClick={() => window.open(contentMarkdownUrl(document.id), '_blank', 'noopener,noreferrer')}
+          disabled={isDeleting}
+          title="Downloads recognized/extracted content as Markdown for this document."
+        >
+          Download Recognized MD
+        </button>
+        {isTrashed && (
+          <button
+            type="button"
+            className="danger-action"
+            onClick={handleDelete}
+            disabled={isDeleting || isRestoring}
+          >
+            {isDeleting ? 'Deleting...' : 'Delete Permanently'}
+          </button>
+        )}
+      </div>
+      <p className="viewer-inline-help">
+        Reprocess runs OCR/extraction, updates summary, refreshes routing suggestions, and re-indexes search.
+      </p>
+    </aside>
+  );
+}
@@ -0,0 +1,73 @@
+/**
+ * Path editor with suggestion dropdown for scalable logical-path selection.
+ */
+import { useMemo, useState } from 'react';
+
+/**
+ * Defines properties for the reusable path input component.
+ */
+interface PathInputProps {
+  value: string;
+  suggestions: string[];
+  placeholder?: string;
+  disabled?: boolean;
+  onChange: (nextValue: string) => void;
+}
+
+/**
+ * Renders a text input with filtered clickable path suggestions.
+ */
+export default function PathInput({
+  value,
+  suggestions,
+  placeholder = 'Destination path',
+  disabled = false,
+  onChange,
+}: PathInputProps): JSX.Element {
+  const [isFocused, setIsFocused] = useState<boolean>(false);
+
+  /**
+   * Calculates filtered suggestions based on current input value.
+   */
+  const filteredSuggestions = useMemo(() => {
+    const normalized = value.trim().toLowerCase();
+    if (!normalized) {
+      return suggestions.slice(0, 20);
+    }
+    return suggestions.filter((candidate) => candidate.toLowerCase().includes(normalized)).slice(0, 20);
+  }, [suggestions, value]);
+
+  return (
+    <div className={`path-input ${disabled ? 'disabled' : ''}`}>
+      <input
+        value={value}
+        onChange={(event) => onChange(event.target.value)}
+        onFocus={() => setIsFocused(true)}
+        onBlur={() => {
+          window.setTimeout(() => setIsFocused(false), 120);
+        }}
+        placeholder={placeholder}
+        disabled={disabled}
+      />
+      {isFocused && filteredSuggestions.length > 0 && (
+        <div className="path-suggestions" role="listbox" aria-label="Path suggestions">
+          {filteredSuggestions.map((suggestion) => (
+            <button
+              key={suggestion}
+              type="button"
+              className="path-suggestion-item"
+              onMouseDown={(event) => {
+                event.preventDefault();
+                onChange(suggestion);
+                setIsFocused(false);
+              }}
+              disabled={disabled}
+            >
+              {suggestion}
+            </button>
+          ))}
+        </div>
+      )}
+    </div>
+  );
+}
@@ -0,0 +1,203 @@
+/**
+ * Processing log timeline panel for upload, OCR, summarization, routing, and indexing events.
+ */
+import { useEffect, useMemo, useRef, useState } from 'react';
+
+import type { ProcessingLogEntry } from '../types';
+
+interface ProcessingLogPanelProps {
+  entries: ProcessingLogEntry[];
+  isLoading: boolean;
+  isClearing: boolean;
+  selectedDocumentId: string | null;
+  isProcessingActive: boolean;
+  typingAnimationEnabled: boolean;
+  onClear: () => void;
+}
+
+/**
+ * Renders processing events in a terminal-style stream with optional typed headers.
+ */
+export default function ProcessingLogPanel({
+  entries,
+  isLoading,
+  isClearing,
+  selectedDocumentId,
+  isProcessingActive,
+  typingAnimationEnabled,
+  onClear,
+}: ProcessingLogPanelProps): JSX.Element {
+  const timeline = useMemo(() => [...entries].reverse(), [entries]);
+  const [typedEntryIds, setTypedEntryIds] = useState<Set<number>>(() => new Set());
+  const [typingEntryId, setTypingEntryId] = useState<number | null>(null);
+  const [typingHeader, setTypingHeader] = useState<string>('');
+  const [expandedIds, setExpandedIds] = useState<Set<number>>(() => new Set());
+  const timerRef = useRef<number | null>(null);
+
+  const formatTimestamp = (value: string): string => {
+    const parsed = new Date(value);
+    if (Number.isNaN(parsed.getTime())) {
+      return value;
+    }
+    return parsed.toLocaleString();
+  };
+
+  const payloadText = (payload: Record<string, unknown>): string => {
+    try {
+      return JSON.stringify(payload, null, 2);
+    } catch (error) {
+      return String(error);
+    }
+  };
+
+  const renderHeader = (entry: ProcessingLogEntry): string => {
+    const headerParts = [formatTimestamp(entry.created_at), entry.level.toUpperCase(), entry.stage];
+    if (entry.document_filename) {
+      headerParts.push(entry.document_filename);
+    }
+    if (selectedDocumentId !== null && selectedDocumentId === entry.document_id) {
+      headerParts.push('selected-document');
+    }
+    return `[${headerParts.join(' | ')}] ${entry.event}`;
+  };
+
+  useEffect(() => {
+    const knownIds = new Set(typedEntryIds);
+    if (typingEntryId !== null) {
+      knownIds.add(typingEntryId);
+    }
+    const nextUntyped = timeline.find((entry) => !knownIds.has(entry.id));
+    if (!nextUntyped) {
+      return;
+    }
+    if (!typingAnimationEnabled) {
+      setTypedEntryIds((current) => {
+        const next = new Set(current);
+        next.add(nextUntyped.id);
+        return next;
+      });
+      return;
+    }
+    if (typingEntryId !== null) {
+      return;
+    }
+
+    const fullHeader = renderHeader(nextUntyped);
+    setTypingEntryId(nextUntyped.id);
+    setTypingHeader('');
+    let cursor = 0;
+    timerRef.current = window.setInterval(() => {
+      cursor += 1;
+      setTypingHeader(fullHeader.slice(0, cursor));
+      if (cursor >= fullHeader.length) {
+        if (timerRef.current !== null) {
+          window.clearInterval(timerRef.current);
+          timerRef.current = null;
+        }
+        setTypedEntryIds((current) => {
+          const next = new Set(current);
+          next.add(nextUntyped.id);
+          return next;
+        });
+        setTypingEntryId(null);
+      }
+    }, 10);
+  }, [timeline, typedEntryIds, typingAnimationEnabled, typingEntryId]);
+
+  useEffect(() => {
+    return () => {
+      if (timerRef.current !== null) {
+        window.clearInterval(timerRef.current);
+      }
+    };
+  }, []);
+
+  return (
+    <section className="processing-log-panel">
+      <div className="panel-header">
+        <h2>Processing Log</h2>
+        <div className="processing-log-header-actions">
+          <p>{isLoading ? 'Refreshing...' : `${entries.length} recent event(s)`}</p>
+          <button type="button" className="secondary-action" onClick={onClear} disabled={isLoading || isClearing}>
+            {isClearing ? 'Clearing...' : 'Clear All Logs'}
+          </button>
+        </div>
+      </div>
+      <div className="processing-log-terminal-wrap">
+        <div className="processing-log-terminal">
+          {timeline.length === 0 && <p className="terminal-empty">No processing events yet.</p>}
+          {timeline.map((entry, index) => {
+            const groupKey = `${entry.document_id ?? 'unbound'}:${entry.stage}`;
+            const previousGroupKey = index > 0 ? `${timeline[index - 1].document_id ?? 'unbound'}:${timeline[index - 1].stage}` : null;
+            const showSeparator = index > 0 && groupKey !== previousGroupKey;
+            const isTyping = entry.id === typingEntryId;
+            const isTyped = typedEntryIds.has(entry.id) || (!typingAnimationEnabled && !isTyping);
+            const isExpanded = expandedIds.has(entry.id);
+            const providerModel = [entry.provider_id, entry.model_name].filter(Boolean).join(' / ');
+            const hasDetails =
+              providerModel.length > 0 ||
+              Object.keys(entry.payload_json).length > 0 ||
+              Boolean(entry.prompt_text) ||
+              Boolean(entry.response_text);
+            return (
+              <div key={entry.id}>
+                {showSeparator && <div className="terminal-separator">------</div>}
+                <div className="terminal-row-header">
+                  <span>{isTyping ? typingHeader : renderHeader(entry)}</span>
+                  {hasDetails && isTyped && (
+                    <button
+                      type="button"
+                      className="terminal-unfold-button"
+                      onClick={() =>
+                        setExpandedIds((current) => {
+                          const next = new Set(current);
+                          if (next.has(entry.id)) {
+                            next.delete(entry.id);
+                          } else {
+                            next.add(entry.id);
+                          }
+                          return next;
+                        })
+                      }
+                    >
+                      {isExpanded ? 'Fold' : 'Unfold'}
+                    </button>
+                  )}
+                </div>
+                {isExpanded && isTyped && (
+                  <div className="terminal-row-details">
+                    {providerModel && <div>provider/model: {providerModel}</div>}
+                    {Object.keys(entry.payload_json).length > 0 && (
+                      <>
+                        <div>payload:</div>
+                        <pre>{payloadText(entry.payload_json)}</pre>
+                      </>
+                    )}
+                    {entry.prompt_text && (
+                      <>
+                        <div>prompt:</div>
+                        <pre>{entry.prompt_text}</pre>
+                      </>
+                    )}
+                    {entry.response_text && (
+                      <>
+                        <div>response:</div>
+                        <pre>{entry.response_text}</pre>
+                      </>
+                    )}
+                  </div>
+                )}
+              </div>
+            );
+          })}
+          {isProcessingActive && typingEntryId === null && (
+            <div className="terminal-idle-prompt">
+              <span className="terminal-caret">&gt;</span>
+              <span className="terminal-caret-blink">_</span>
+            </div>
+          )}
+        </div>
+      </div>
+    </section>
+  );
+}
@@ -0,0 +1,107 @@
+/**
+ * Compact search and filter controls for document discovery.
+ */
+interface SearchFiltersBarProps {
+  searchText: string;
+  onSearchTextChange: (value: string) => void;
+  onSearchSubmit: () => void;
+  onReset: () => void;
+  hasActiveSearch: boolean;
+  knownTags: string[];
+  knownPaths: string[];
+  knownTypes: string[];
+  tagFilter: string;
+  onTagFilterChange: (value: string) => void;
+  typeFilter: string;
+  onTypeFilterChange: (value: string) => void;
+  pathFilter: string;
+  onPathFilterChange: (value: string) => void;
+  processedFrom: string;
+  onProcessedFromChange: (value: string) => void;
+  processedTo: string;
+  onProcessedToChange: (value: string) => void;
+  isLoading: boolean;
+}
+
+/**
+ * Renders dense search, filter, and quick reset controls.
+ */
+export default function SearchFiltersBar({
+  searchText,
+  onSearchTextChange,
+  onSearchSubmit,
+  onReset,
+  hasActiveSearch,
+  knownTags,
+  knownPaths,
+  knownTypes,
+  tagFilter,
+  onTagFilterChange,
+  typeFilter,
+  onTypeFilterChange,
+  pathFilter,
+  onPathFilterChange,
+  processedFrom,
+  onProcessedFromChange,
+  processedTo,
+  onProcessedToChange,
+  isLoading,
+}: SearchFiltersBarProps): JSX.Element {
+  return (
+    <div className="search-filters-bar">
+      <input
+        value={searchText}
+        onChange={(event) => onSearchTextChange(event.target.value)}
+        placeholder="Search across name, text, path, tags"
+        onKeyDown={(event) => {
+          if (event.key === 'Enter') {
+            event.preventDefault();
+            onSearchSubmit();
+          }
+        }}
+      />
+      <select value={tagFilter} onChange={(event) => onTagFilterChange(event.target.value)}>
+        <option value="">All Tags</option>
+        {knownTags.map((tag) => (
+          <option key={tag} value={tag}>
+            {tag}
+          </option>
+        ))}
+      </select>
+      <select value={typeFilter} onChange={(event) => onTypeFilterChange(event.target.value)}>
+        <option value="">All Types</option>
+        {knownTypes.map((typeValue) => (
+          <option key={typeValue} value={typeValue}>
+            {typeValue}
+          </option>
+        ))}
+      </select>
+      <select value={pathFilter} onChange={(event) => onPathFilterChange(event.target.value)}>
+        <option value="">All Paths</option>
+        {knownPaths.map((path) => (
+          <option key={path} value={path}>
+            {path}
+          </option>
+        ))}
+      </select>
+      <input
+        type="date"
+        value={processedFrom}
+        onChange={(event) => onProcessedFromChange(event.target.value)}
+        title="Processed from"
+      />
+      <input
+        type="date"
+        value={processedTo}
+        onChange={(event) => onProcessedToChange(event.target.value)}
+        title="Processed to"
+      />
+      <button type="button" onClick={onSearchSubmit} disabled={isLoading}>
+        Search
+      </button>
+      <button type="button" className="secondary-action" onClick={onReset} disabled={!hasActiveSearch || isLoading}>
+        Reset
+      </button>
+    </div>
+  );
+}
@@ -0,0 +1,721 @@
+/**
+ * Dedicated settings screen for providers, task model bindings, and catalog controls.
+ */
+import { useCallback, useEffect, useMemo, useState } from 'react';
+
+import PathInput from './PathInput';
+import TagInput from './TagInput';
+import type {
+  AppSettings,
+  AppSettingsUpdate,
+  DisplaySettings,
+  HandwritingStyleClusteringSettings,
+  OcrTaskSettings,
+  PredefinedPathEntry,
+  PredefinedTagEntry,
+  ProviderSettings,
+  RoutingTaskSettings,
+  SummaryTaskSettings,
+  UploadDefaultsSettings,
+} from '../types';
+
+interface EditableProvider extends ProviderSettings {
+  row_id: string;
+  api_key: string;
+  clear_api_key: boolean;
+}
+
+interface SettingsScreenProps {
+  settings: AppSettings | null;
+  isSaving: boolean;
+  knownTags: string[];
+  knownPaths: string[];
+  onSave: (payload: AppSettingsUpdate) => Promise<void>;
+  onRegisterSaveAction?: (action: (() => Promise<void>) | null) => void;
+}
+
+function clampCardsPerPage(value: number): number {
+  return Math.max(1, Math.min(200, value));
+}
+
+function parseCardsPerPageInput(input: string, fallback: number): number {
+  const parsed = Number.parseInt(input, 10);
+  if (Number.isNaN(parsed)) {
+    return clampCardsPerPage(fallback);
+  }
+  return clampCardsPerPage(parsed);
+}
+
+/**
+ * Renders editable controls for workspace defaults, catalog presets, providers, and task runtime settings.
+ */
+export default function SettingsScreen({
+  settings,
+  isSaving,
+  knownTags,
+  knownPaths,
+  onSave,
+  onRegisterSaveAction,
+}: SettingsScreenProps): JSX.Element {
+  const [providers, setProviders] = useState<EditableProvider[]>([]);
+  const [ocrTask, setOcrTask] = useState<OcrTaskSettings | null>(null);
+  const [summaryTask, setSummaryTask] = useState<SummaryTaskSettings | null>(null);
+  const [routingTask, setRoutingTask] = useState<RoutingTaskSettings | null>(null);
+  const [handwritingStyle, setHandwritingStyle] = useState<HandwritingStyleClusteringSettings | null>(null);
+  const [predefinedPaths, setPredefinedPaths] = useState<PredefinedPathEntry[]>([]);
+  const [predefinedTags, setPredefinedTags] = useState<PredefinedTagEntry[]>([]);
+  const [newPredefinedPath, setNewPredefinedPath] = useState<string>('');
+  const [newPredefinedTag, setNewPredefinedTag] = useState<string>('');
+  const [uploadDefaults, setUploadDefaults] = useState<UploadDefaultsSettings | null>(null);
+  const [displaySettings, setDisplaySettings] = useState<DisplaySettings | null>(null);
+  const [cardsPerPageInput, setCardsPerPageInput] = useState<string>('12');
+  const [error, setError] = useState<string | null>(null);
+
+  useEffect(() => {
+    if (!settings) {
+      return;
+    }
+    setProviders(
+      settings.providers.map((provider) => ({
+        ...provider,
+        row_id: `${provider.id}-${Math.random().toString(36).slice(2, 9)}`,
+        api_key: '',
+        clear_api_key: false,
+      })),
+    );
+    setOcrTask(settings.tasks.ocr_handwriting);
+    setSummaryTask(settings.tasks.summary_generation);
+    setRoutingTask(settings.tasks.routing_classification);
+    setHandwritingStyle(settings.handwriting_style_clustering);
+    setPredefinedPaths(settings.predefined_paths);
+    setPredefinedTags(settings.predefined_tags);
+    setUploadDefaults(settings.upload_defaults);
+    setDisplaySettings(settings.display);
+    setCardsPerPageInput(String(settings.display.cards_per_page));
+    setError(null);
+  }, [settings]);
+
+  const fallbackProviderId = useMemo(() => providers[0]?.id ?? '', [providers]);
+
+  const addProvider = (): void => {
+    const sequence = providers.length + 1;
+    setProviders((current) => [
+      ...current,
+      {
+        row_id: `provider-row-${Date.now()}-${sequence}`,
+        id: `provider-${sequence}`,
+        label: `Provider ${sequence}`,
+        provider_type: 'openai_compatible',
+        base_url: 'http://localhost:11434/v1',
+        timeout_seconds: 45,
+        api_key_set: false,
+        api_key_masked: '',
+        api_key: '',
+        clear_api_key: false,
+      },
+    ]);
+  };
+
+  const removeProvider = (rowId: string): void => {
+    const target = providers.find((provider) => provider.row_id === rowId);
+    if (!target || providers.length <= 1) {
+      return;
+    }
+    const remaining = providers.filter((provider) => provider.row_id !== rowId);
+    const fallback = remaining[0]?.id ?? '';
+    setProviders(remaining);
+    if (ocrTask?.provider_id === target.id && fallback) {
+      setOcrTask({ ...ocrTask, provider_id: fallback });
+    }
+    if (summaryTask?.provider_id === target.id && fallback) {
+      setSummaryTask({ ...summaryTask, provider_id: fallback });
+    }
+    if (routingTask?.provider_id === target.id && fallback) {
+      setRoutingTask({ ...routingTask, provider_id: fallback });
+    }
+  };
+
+  const addPredefinedPath = (): void => {
+    const value = newPredefinedPath.trim().replace(/^\/+|\/+$/g, '');
+    if (!value) {
+      return;
+    }
+    if (predefinedPaths.some((entry) => entry.value.toLowerCase() === value.toLowerCase())) {
+      setNewPredefinedPath('');
+      return;
+    }
+    setPredefinedPaths([...predefinedPaths, { value, global_shared: false }]);
+    setNewPredefinedPath('');
+  };
+
+  const addPredefinedTag = (): void => {
+    const value = newPredefinedTag.trim();
+    if (!value) {
+      return;
+    }
+    if (predefinedTags.some((entry) => entry.value.toLowerCase() === value.toLowerCase())) {
+      setNewPredefinedTag('');
+      return;
+    }
+    setPredefinedTags([...predefinedTags, { value, global_shared: false }]);
+    setNewPredefinedTag('');
+  };
+
+  const handleSave = useCallback(async (): Promise<void> => {
+    if (!ocrTask || !summaryTask || !routingTask || !handwritingStyle || !uploadDefaults || !displaySettings) {
+      setError('Settings are not fully loaded yet');
+      return;
+    }
+    if (providers.length === 0) {
+      setError('At least one provider is required');
+      return;
+    }
+
+    setError(null);
+    try {
+      const resolvedCardsPerPage = parseCardsPerPageInput(cardsPerPageInput, displaySettings.cards_per_page);
+      setDisplaySettings({ ...displaySettings, cards_per_page: resolvedCardsPerPage });
+      setCardsPerPageInput(String(resolvedCardsPerPage));
+
+      await onSave({
+        upload_defaults: {
+          logical_path: uploadDefaults.logical_path.trim(),
+          tags: uploadDefaults.tags,
+        },
+        display: {
+          cards_per_page: resolvedCardsPerPage,
+          log_typing_animation_enabled: displaySettings.log_typing_animation_enabled,
+        },
+        predefined_paths: predefinedPaths,
+        predefined_tags: predefinedTags,
+        handwriting_style_clustering: {
+          enabled: handwritingStyle.enabled,
+          embed_model: handwritingStyle.embed_model.trim(),
+          neighbor_limit: handwritingStyle.neighbor_limit,
+          match_min_similarity: handwritingStyle.match_min_similarity,
+          bootstrap_match_min_similarity: handwritingStyle.bootstrap_match_min_similarity,
+          bootstrap_sample_size: handwritingStyle.bootstrap_sample_size,
+          image_max_side: handwritingStyle.image_max_side,
+        },
+        providers: providers.map((provider) => ({
+          id: provider.id.trim(),
+          label: provider.label.trim(),
+          provider_type: provider.provider_type,
+          base_url: provider.base_url.trim(),
+          timeout_seconds: provider.timeout_seconds,
+          api_key: provider.api_key.trim() || undefined,
+          clear_api_key: provider.clear_api_key,
+        })),
+        tasks: {
+          ocr_handwriting: {
+            enabled: ocrTask.enabled,
+            provider_id: ocrTask.provider_id,
+            model: ocrTask.model.trim(),
+            prompt: ocrTask.prompt,
+          },
+          summary_generation: {
+            enabled: summaryTask.enabled,
+            provider_id: summaryTask.provider_id,
+            model: summaryTask.model.trim(),
+            prompt: summaryTask.prompt,
+            max_input_tokens: summaryTask.max_input_tokens,
+          },
+          routing_classification: {
+            enabled: routingTask.enabled,
+            provider_id: routingTask.provider_id,
+            model: routingTask.model.trim(),
+            prompt: routingTask.prompt,
+            neighbor_count: routingTask.neighbor_count,
+            neighbor_min_similarity: routingTask.neighbor_min_similarity,
+            auto_apply_confidence_threshold: routingTask.auto_apply_confidence_threshold,
+            auto_apply_neighbor_similarity_threshold: routingTask.auto_apply_neighbor_similarity_threshold,
+            neighbor_path_override_enabled: routingTask.neighbor_path_override_enabled,
+            neighbor_path_override_min_similarity: routingTask.neighbor_path_override_min_similarity,
+            neighbor_path_override_min_gap: routingTask.neighbor_path_override_min_gap,
+            neighbor_path_override_max_confidence: routingTask.neighbor_path_override_max_confidence,
+          },
+        },
+      });
+    } catch (caughtError) {
+      setError(caughtError instanceof Error ? caughtError.message : 'Failed to save settings');
+    }
+  }, [
+    cardsPerPageInput,
+    displaySettings,
+    handwritingStyle,
+    ocrTask,
+    onSave,
+    predefinedPaths,
+    predefinedTags,
+    providers,
+    routingTask,
+    summaryTask,
+    uploadDefaults,
+  ]);
+
+  useEffect(() => {
+    if (!onRegisterSaveAction) {
+      return;
+    }
+    if (!settings || !ocrTask || !summaryTask || !routingTask || !handwritingStyle || !uploadDefaults || !displaySettings) {
+      onRegisterSaveAction(null);
+      return;
+    }
+    onRegisterSaveAction(() => handleSave());
+    return () => onRegisterSaveAction(null);
+  }, [displaySettings, handleSave, handwritingStyle, ocrTask, onRegisterSaveAction, routingTask, settings, summaryTask, uploadDefaults]);
+
+  if (!settings || !ocrTask || !summaryTask || !routingTask || !handwritingStyle || !uploadDefaults || !displaySettings) {
+    return (
+      <section className="settings-layout">
+        <div className="settings-card">
+          <h2>Settings</h2>
+          <p>Loading settings...</p>
+        </div>
+      </section>
+    );
+  }
+
+  return (
+    <section className="settings-layout">
+      {error && <p className="error-banner">{error}</p>}
+
+      <div className="settings-card settings-section">
+        <div className="settings-section-header">
+          <h3>Workspace</h3>
+          <p className="small">Defaults and display behavior for document operations.</p>
+        </div>
+        <div className="settings-field-grid">
+          <label className="settings-field settings-field-wide">
+            Default Path
+            <PathInput
+              value={uploadDefaults.logical_path}
+              onChange={(nextPath) => setUploadDefaults({ ...uploadDefaults, logical_path: nextPath })}
+              suggestions={knownPaths}
+            />
+          </label>
+          <label className="settings-field settings-field-wide">
+            Default Tags
+            <TagInput
+              value={uploadDefaults.tags}
+              onChange={(nextTags) => setUploadDefaults({ ...uploadDefaults, tags: nextTags })}
+              suggestions={knownTags}
+            />
+          </label>
+          <label className="settings-field">
+            Cards Per Page
+            <input
+              type="number"
+              min={1}
+              max={200}
+              value={cardsPerPageInput}
+              onChange={(event) => setCardsPerPageInput(event.target.value)}
+            />
+          </label>
+          <label className="inline-checkbox settings-checkbox-field">
+            <input
+              type="checkbox"
+              checked={displaySettings.log_typing_animation_enabled}
+              onChange={(event) =>
+                setDisplaySettings({ ...displaySettings, log_typing_animation_enabled: event.target.checked })
+              }
+            />
+            Processing log typing animation enabled
+          </label>
+        </div>
+      </div>
+
+      <div className="settings-card settings-section">
+        <div className="settings-section-header">
+          <h3>Catalog Presets</h3>
+          <p className="small">Pre-register allowed paths and tags. Marking an entry as Global cannot be undone.</p>
+        </div>
+        <div className="settings-catalog-grid">
+          <section className="settings-catalog-card">
+            <h4>Predefined Paths</h4>
+            <div className="settings-catalog-add-row">
+              <input
+                placeholder="Add path"
+                value={newPredefinedPath}
+                onChange={(event) => setNewPredefinedPath(event.target.value)}
+              />
+              <button type="button" className="secondary-action" onClick={addPredefinedPath}>
+                Add
+              </button>
+            </div>
+            <div className="settings-catalog-list">
+              {predefinedPaths.map((entry) => (
+                <div key={entry.value} className="settings-catalog-row">
+                  <span>{entry.value}</span>
+                  <label className="inline-checkbox">
+                    <input
+                      type="checkbox"
+                      checked={entry.global_shared}
+                      disabled={entry.global_shared}
+                      onChange={(event) =>
+                        setPredefinedPaths((current) =>
+                          current.map((item) =>
+                            item.value === entry.value
+                              ? { ...item, global_shared: item.global_shared || event.target.checked }
+                              : item,
+                          ),
+                        )
+                      }
+                    />
+                    Global
+                  </label>
+                  <button
+                    type="button"
+                    className="secondary-action"
+                    onClick={() => setPredefinedPaths((current) => current.filter((item) => item.value !== entry.value))}
+                  >
+                    Remove
+                  </button>
+                </div>
+              ))}
+            </div>
+          </section>
+          <section className="settings-catalog-card">
+            <h4>Predefined Tags</h4>
+            <div className="settings-catalog-add-row">
+              <input
+                placeholder="Add tag"
+                value={newPredefinedTag}
+                onChange={(event) => setNewPredefinedTag(event.target.value)}
+              />
+              <button type="button" className="secondary-action" onClick={addPredefinedTag}>
+                Add
+              </button>
+            </div>
+            <div className="settings-catalog-list">
+              {predefinedTags.map((entry) => (
+                <div key={entry.value} className="settings-catalog-row">
+                  <span>{entry.value}</span>
+                  <label className="inline-checkbox">
+                    <input
+                      type="checkbox"
+                      checked={entry.global_shared}
+                      disabled={entry.global_shared}
+                      onChange={(event) =>
+                        setPredefinedTags((current) =>
+                          current.map((item) =>
+                            item.value === entry.value
+                              ? { ...item, global_shared: item.global_shared || event.target.checked }
+                              : item,
+                          ),
+                        )
+                      }
+                    />
+                    Global
+                  </label>
+                  <button
+                    type="button"
+                    className="secondary-action"
+                    onClick={() => setPredefinedTags((current) => current.filter((item) => item.value !== entry.value))}
+                  >
+                    Remove
+                  </button>
+                </div>
+              ))}
+            </div>
+          </section>
+        </div>
+      </div>
+
+      <div className="settings-card settings-section">
+        <div className="settings-section-header">
+          <h3>Providers</h3>
+          <p className="small">Configure OpenAI-compatible model endpoints.</p>
+        </div>
+        <div className="provider-list">
+          {providers.map((provider, index) => (
+            <div key={provider.row_id} className="provider-grid">
+              <div className="provider-header">
+                <h4>{provider.label || `Provider ${index + 1}`}</h4>
+                <button
+                  type="button"
+                  className="danger-action"
+                  onClick={() => removeProvider(provider.row_id)}
+                  disabled={providers.length <= 1 || isSaving}
+                >
+                  Remove
+                </button>
+              </div>
+              <div className="settings-field-grid">
+                <label className="settings-field">
+                  Provider ID
+                  <input
+                    value={provider.id}
+                    onChange={(event) =>
+                      setProviders((current) =>
+                        current.map((item) => (item.row_id === provider.row_id ? { ...item, id: event.target.value } : item)),
+                      )
+                    }
+                  />
+                </label>
+                <label className="settings-field">
+                  Label
+                  <input
+                    value={provider.label}
+                    onChange={(event) =>
+                      setProviders((current) =>
+                        current.map((item) =>
+                          item.row_id === provider.row_id ? { ...item, label: event.target.value } : item,
+                        ),
+                      )
+                    }
+                  />
+                </label>
+                <label className="settings-field">
+                  Timeout Seconds
+                  <input
+                    type="number"
+                    value={provider.timeout_seconds}
+                    onChange={(event) => {
+                      const nextTimeout = Number.parseInt(event.target.value, 10);
+                      if (Number.isNaN(nextTimeout)) {
+                        return;
+                      }
+                      setProviders((current) =>
+                        current.map((item) =>
+                          item.row_id === provider.row_id ? { ...item, timeout_seconds: nextTimeout } : item,
+                        ),
+                      );
+                    }}
+                  />
+                </label>
+                <label className="settings-field settings-field-wide">
+                  Base URL
+                  <input
+                    value={provider.base_url}
+                    onChange={(event) =>
+                      setProviders((current) =>
+                        current.map((item) =>
+                          item.row_id === provider.row_id ? { ...item, base_url: event.target.value } : item,
+                        ),
+                      )
+                    }
+                  />
+                </label>
+                <label className="settings-field settings-field-wide">
+                  API Key
+                  <input
+                    type="password"
+                    placeholder={provider.api_key_set ? `Stored: ${provider.api_key_masked}` : 'Optional API key'}
+                    value={provider.api_key}
+                    onChange={(event) =>
+                      setProviders((current) =>
+                        current.map((item) =>
+                          item.row_id === provider.row_id ? { ...item, api_key: event.target.value } : item,
+                        ),
+                      )
+                    }
+                  />
+                </label>
+                <label className="inline-checkbox settings-checkbox-field">
+                  <input
+                    type="checkbox"
+                    checked={provider.clear_api_key}
+                    onChange={(event) =>
+                      setProviders((current) =>
+                        current.map((item) =>
+                          item.row_id === provider.row_id ? { ...item, clear_api_key: event.target.checked } : item,
+                        ),
+                      )
+                    }
+                  />
+                  Clear Stored API Key
+                </label>
+              </div>
+            </div>
+          ))}
+        </div>
+        <div className="settings-section-actions">
+          <button type="button" className="secondary-action" onClick={addProvider} disabled={isSaving}>
+            Add Provider
+          </button>
+        </div>
+      </div>
+
+      <div className="settings-card settings-section">
+        <div className="settings-section-header">
+          <h3>Task Runtime</h3>
+          <p className="small">Bind providers and tune OCR, summary, routing, and handwriting style behavior.</p>
+        </div>
+
+        <div className="task-settings-block">
+          <div className="task-block-header">
+            <h4>OCR Handwriting</h4>
+            <label className="inline-checkbox settings-toggle">
+              <input type="checkbox" checked={ocrTask.enabled} onChange={(event) => setOcrTask({ ...ocrTask, enabled: event.target.checked })} />
+              Enabled
+            </label>
+          </div>
+          <div className="settings-field-grid">
+            <label className="settings-field">
+              Provider
+              <select value={ocrTask.provider_id} onChange={(event) => setOcrTask({ ...ocrTask, provider_id: event.target.value || fallbackProviderId })}>
+                {providers.map((provider) => (
+                  <option key={provider.row_id} value={provider.id}>
+                    {provider.label} ({provider.id})
+                  </option>
+                ))}
+              </select>
+            </label>
+            <label className="settings-field">
+              Model
+              <input value={ocrTask.model} onChange={(event) => setOcrTask({ ...ocrTask, model: event.target.value })} />
+            </label>
+            <label className="settings-field settings-field-wide">
+              OCR Prompt
+              <textarea value={ocrTask.prompt} onChange={(event) => setOcrTask({ ...ocrTask, prompt: event.target.value })} />
+            </label>
+          </div>
+        </div>
+
+        <div className="task-settings-block">
+          <div className="task-block-header">
+            <h4>Summary Generation</h4>
+            <label className="inline-checkbox settings-toggle">
+              <input type="checkbox" checked={summaryTask.enabled} onChange={(event) => setSummaryTask({ ...summaryTask, enabled: event.target.checked })} />
+              Enabled
+            </label>
+          </div>
+          <div className="settings-field-grid">
+            <label className="settings-field">
+              Provider
+              <select value={summaryTask.provider_id} onChange={(event) => setSummaryTask({ ...summaryTask, provider_id: event.target.value || fallbackProviderId })}>
+                {providers.map((provider) => (
+                  <option key={provider.row_id} value={provider.id}>
+                    {provider.label} ({provider.id})
+                  </option>
+                ))}
+              </select>
+            </label>
+            <label className="settings-field">
+              Model
+              <input value={summaryTask.model} onChange={(event) => setSummaryTask({ ...summaryTask, model: event.target.value })} />
+            </label>
+            <label className="settings-field">
+              Max Input Tokens
+              <input
+                type="number"
+                min={512}
+                max={64000}
+                value={summaryTask.max_input_tokens}
+                onChange={(event) => {
+                  const nextValue = Number.parseInt(event.target.value, 10);
+                  if (!Number.isNaN(nextValue)) {
+                    setSummaryTask({ ...summaryTask, max_input_tokens: nextValue });
+                  }
+                }}
+              />
+            </label>
+            <label className="settings-field settings-field-wide">
+              Summary Prompt
+              <textarea value={summaryTask.prompt} onChange={(event) => setSummaryTask({ ...summaryTask, prompt: event.target.value })} />
+            </label>
+          </div>
+        </div>
+
+        <div className="task-settings-block">
+          <div className="task-block-header">
+            <h4>Routing Classification</h4>
+            <label className="inline-checkbox settings-toggle">
+              <input type="checkbox" checked={routingTask.enabled} onChange={(event) => setRoutingTask({ ...routingTask, enabled: event.target.checked })} />
+              Enabled
+            </label>
+          </div>
+          <div className="settings-field-grid">
+            <label className="settings-field">
+              Provider
+              <select value={routingTask.provider_id} onChange={(event) => setRoutingTask({ ...routingTask, provider_id: event.target.value || fallbackProviderId })}>
+                {providers.map((provider) => (
+                  <option key={provider.row_id} value={provider.id}>
+                    {provider.label} ({provider.id})
+                  </option>
+                ))}
+              </select>
+            </label>
+            <label className="settings-field">
+              Model
+              <input value={routingTask.model} onChange={(event) => setRoutingTask({ ...routingTask, model: event.target.value })} />
+            </label>
+            <label className="settings-field">
+              Neighbor Count
+              <input type="number" value={routingTask.neighbor_count} onChange={(event) => setRoutingTask({ ...routingTask, neighbor_count: Number.parseInt(event.target.value, 10) || routingTask.neighbor_count })} />
+            </label>
+            <label className="settings-field">
+              Min Neighbor Similarity
+              <input type="number" step="0.01" min="0" max="1" value={routingTask.neighbor_min_similarity} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setRoutingTask({ ...routingTask, neighbor_min_similarity: next }); } }} />
+            </label>
+            <label className="settings-field">
+              Auto Apply Confidence
+              <input type="number" step="0.01" min="0" max="1" value={routingTask.auto_apply_confidence_threshold} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setRoutingTask({ ...routingTask, auto_apply_confidence_threshold: next }); } }} />
+            </label>
+            <label className="settings-field">
+              Auto Apply Neighbor Similarity
+              <input type="number" step="0.01" min="0" max="1" value={routingTask.auto_apply_neighbor_similarity_threshold} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setRoutingTask({ ...routingTask, auto_apply_neighbor_similarity_threshold: next }); } }} />
+            </label>
+            <label className="inline-checkbox settings-checkbox-field">
+              <input type="checkbox" checked={routingTask.neighbor_path_override_enabled} onChange={(event) => setRoutingTask({ ...routingTask, neighbor_path_override_enabled: event.target.checked })} />
+              Dominant neighbor path override enabled
+            </label>
+            <label className="settings-field">
+              Override Min Similarity
+              <input type="number" step="0.01" min="0" max="1" value={routingTask.neighbor_path_override_min_similarity} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setRoutingTask({ ...routingTask, neighbor_path_override_min_similarity: next }); } }} />
+            </label>
+            <label className="settings-field">
+              Override Min Gap
+              <input type="number" step="0.01" min="0" max="1" value={routingTask.neighbor_path_override_min_gap} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setRoutingTask({ ...routingTask, neighbor_path_override_min_gap: next }); } }} />
+            </label>
+            <label className="settings-field">
+              Override Max LLM Confidence
+              <input type="number" step="0.01" min="0" max="1" value={routingTask.neighbor_path_override_max_confidence} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setRoutingTask({ ...routingTask, neighbor_path_override_max_confidence: next }); } }} />
+            </label>
+            <label className="settings-field settings-field-wide">
+              Routing Prompt
+              <textarea value={routingTask.prompt} onChange={(event) => setRoutingTask({ ...routingTask, prompt: event.target.value })} />
+            </label>
+          </div>
+        </div>
+
+        <div className="task-settings-block">
+          <div className="task-block-header">
+            <h4>Handwriting Style Clustering</h4>
+            <label className="inline-checkbox settings-toggle">
+              <input type="checkbox" checked={handwritingStyle.enabled} onChange={(event) => setHandwritingStyle({ ...handwritingStyle, enabled: event.target.checked })} />
+              Enabled
+            </label>
+          </div>
+          <div className="settings-field-grid">
+            <label className="settings-field settings-field-wide">
+              Typesense Embedding Model Slug
+              <input value={handwritingStyle.embed_model} onChange={(event) => setHandwritingStyle({ ...handwritingStyle, embed_model: event.target.value })} />
+            </label>
+            <label className="settings-field">
+              Neighbor Limit
+              <input type="number" min={1} max={32} value={handwritingStyle.neighbor_limit} onChange={(event) => setHandwritingStyle({ ...handwritingStyle, neighbor_limit: Number.parseInt(event.target.value, 10) || handwritingStyle.neighbor_limit })} />
+            </label>
+            <label className="settings-field">
+              Match Min Similarity
+              <input type="number" step="0.01" min="0" max="1" value={handwritingStyle.match_min_similarity} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setHandwritingStyle({ ...handwritingStyle, match_min_similarity: next }); } }} />
+            </label>
+            <label className="settings-field">
+              Bootstrap Match Min Similarity
+              <input type="number" step="0.01" min="0" max="1" value={handwritingStyle.bootstrap_match_min_similarity} onChange={(event) => { const next = Number.parseFloat(event.target.value); if (!Number.isNaN(next)) { setHandwritingStyle({ ...handwritingStyle, bootstrap_match_min_similarity: next }); } }} />
+            </label>
+            <label className="settings-field">
+              Bootstrap Sample Size
+              <input type="number" min={1} max={30} value={handwritingStyle.bootstrap_sample_size} onChange={(event) => setHandwritingStyle({ ...handwritingStyle, bootstrap_sample_size: Number.parseInt(event.target.value, 10) || handwritingStyle.bootstrap_sample_size })} />
+            </label>
+            <label className="settings-field">
+              Max Image Side (px)
+              <input type="number" min={256} max={4096} value={handwritingStyle.image_max_side} onChange={(event) => setHandwritingStyle({ ...handwritingStyle, image_max_side: Number.parseInt(event.target.value, 10) || handwritingStyle.image_max_side })} />
+            </label>
+          </div>
+        </div>
+      </div>
+    </section>
+  );
+}
@@ -0,0 +1,123 @@
+/**
+ * Tag editor with suggestion dropdown and keyboard-friendly chip interactions.
+ */
+import { useMemo, useState } from 'react';
+import type { KeyboardEvent } from 'react';
+
+/**
+ * Defines properties for the reusable tag input component.
+ */
+interface TagInputProps {
+  value: string[];
+  suggestions: string[];
+  placeholder?: string;
+  disabled?: boolean;
+  onChange: (tags: string[]) => void;
+}
+
+/**
+ * Renders a chip-based tag editor with inline suggestions.
+ */
+export default function TagInput({
+  value,
+  suggestions,
+  placeholder = 'Add tag',
+  disabled = false,
+  onChange,
+}: TagInputProps): JSX.Element {
+  const [draft, setDraft] = useState<string>('');
+
+  /**
+   * Calculates filtered suggestions based on current draft and selected tags.
+   */
+  const filteredSuggestions = useMemo(() => {
+    const normalized = draft.trim().toLowerCase();
+    return suggestions
+      .filter((candidate) => !value.includes(candidate))
+      .filter((candidate) => (normalized ? candidate.toLowerCase().includes(normalized) : false))
+      .slice(0, 8);
+  }, [draft, suggestions, value]);
+
+  /**
+   * Adds a tag to the selected value list when valid.
+   */
+  const addTag = (tag: string): void => {
+    const normalized = tag.trim();
+    if (!normalized) {
+      return;
+    }
+    if (value.includes(normalized)) {
+      setDraft('');
+      return;
+    }
+    onChange([...value, normalized]);
+    setDraft('');
+  };
+
+  /**
+   * Removes one tag by value.
+   */
+  const removeTag = (tag: string): void => {
+    onChange(value.filter((candidate) => candidate !== tag));
+  };
+
+  /**
+   * Handles keyboard interactions for quick tag editing.
+   */
+  const handleKeyDown = (event: KeyboardEvent<HTMLInputElement>): void => {
+    if (event.key === 'Enter' || event.key === ',') {
+      event.preventDefault();
+      addTag(draft);
+      return;
+    }
+    if (event.key === 'Backspace' && draft.length === 0 && value.length > 0) {
+      event.preventDefault();
+      onChange(value.slice(0, -1));
+    }
+  };
+
+  return (
+    <div className={`tag-input ${disabled ? 'disabled' : ''}`}>
+      <div className="tag-chip-row">
+        {value.map((tag) => (
+          <button
+            key={tag}
+            type="button"
+            className="tag-chip"
+            onClick={() => removeTag(tag)}
+            disabled={disabled}
+            title="Remove tag"
+          >
+            {tag}
+          </button>
+        ))}
+      </div>
+      <input
+        value={draft}
+        onChange={(event) => setDraft(event.target.value)}
+        onKeyDown={handleKeyDown}
+        onBlur={() => addTag(draft)}
+        placeholder={placeholder}
+        disabled={disabled}
+      />
+      {filteredSuggestions.length > 0 && (
+        <div className="tag-suggestions" role="listbox" aria-label="Tag suggestions">
+          {filteredSuggestions.map((suggestion) => (
+            <button
+              key={suggestion}
+              type="button"
+              className="tag-suggestion-item"
+              onMouseDown={(event) => {
+                event.preventDefault();
+                addTag(suggestion);
+              }}
+              disabled={disabled}
+            >
+              {suggestion}
+            </button>
+          ))}
+        </div>
+      )}
+    </div>
+  );
+}
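The tag-adding rule in `addTag` above (trim the input, ignore empty input, skip exact duplicates) can be sketched as a pure function, which makes the behaviour easy to check outside React. This is a minimal sketch: the name `addTagSketch` is hypothetical and not part of the commit. Like the component, its duplicate check is case-sensitive, whereas the suggestion filter compares lowercased text.

```typescript
// Hypothetical pure helper mirroring the component's addTag logic.
// Returns the same array instance when nothing should change.
function addTagSketch(current: string[], raw: string): string[] {
  const normalized = raw.trim();
  if (!normalized) {
    return current; // blank input is ignored
  }
  if (current.includes(normalized)) {
    return current; // exact (case-sensitive) duplicates are ignored
  }
  return [...current, normalized];
}

const exampleTags = ['invoice'];
const withTax = addTagSketch(exampleTags, '  tax  '); // appends the trimmed value 'tax'
const unchanged = addTagSketch(exampleTags, 'invoice'); // returns exampleTags itself
console.log(withTax, unchanged === exampleTags);
```

Returning the original array for no-op cases would let a caller skip `onChange` when nothing changed, though the component in this commit does not rely on that.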
@@ -0,0 +1,127 @@
+/**
+ * Upload surface that supports global drag-and-drop and file/folder picking.
+ */
+import { useEffect, useMemo, useRef, useState } from 'react';
+import type { ChangeEvent } from 'react';
+
+/**
+ * Defines properties for the upload surface, including the callback for queued file uploads.
+ */
+interface UploadSurfaceProps {
+  onUploadRequested: (files: File[]) => Promise<void>;
+  isUploading: boolean;
+  variant?: 'panel' | 'inline';
+}
+
+/**
+ * Renders upload actions and drag overlay for dropping documents anywhere.
+ */
+export default function UploadSurface({
+  onUploadRequested,
+  isUploading,
+  variant = 'panel',
+}: UploadSurfaceProps): JSX.Element {
+  const [isDragging, setIsDragging] = useState<boolean>(false);
+  const fileInputRef = useRef<HTMLInputElement | null>(null);
+  const folderInputRef = useRef<HTMLInputElement | null>(null);
+
+  /**
+   * Installs folder-selection attributes unsupported by default React typings.
+   */
+  useEffect(() => {
+    if (folderInputRef.current) {
+      folderInputRef.current.setAttribute('webkitdirectory', '');
+      folderInputRef.current.setAttribute('directory', '');
+      folderInputRef.current.setAttribute('multiple', '');
+    }
+  }, []);
+
+  /**
+   * Registers global drag listeners so users can drop files anywhere in the app.
+   */
+  useEffect(() => {
+    const onDragOver = (event: DragEvent): void => {
+      event.preventDefault();
+      setIsDragging(true);
+    };
+    const onDragLeave = (event: DragEvent): void => {
+      event.preventDefault();
+      if (!event.relatedTarget) {
+        setIsDragging(false);
+      }
+    };
+    const onDrop = async (event: DragEvent): Promise<void> => {
+      event.preventDefault();
+      setIsDragging(false);
+      const droppedFiles = Array.from(event.dataTransfer?.files ?? []);
+      if (droppedFiles.length > 0) {
+        await onUploadRequested(droppedFiles);
+      }
+    };
+
+    window.addEventListener('dragover', onDragOver);
+    window.addEventListener('dragleave', onDragLeave);
+    window.addEventListener('drop', onDrop);
+    return () => {
+      window.removeEventListener('dragover', onDragOver);
+      window.removeEventListener('dragleave', onDragLeave);
+      window.removeEventListener('drop', onDrop);
+    };
+  }, [onUploadRequested]);
+
+  /**
+   * Provides helper text based on current upload activity.
+   */
+  const statusLabel = useMemo(() => {
+    if (isUploading) {
+      return 'Uploading and scheduling processing...';
+    }
+    return 'Drop files anywhere or use file/folder upload.';
+  }, [isUploading]);
+
+  /**
+   * Handles manual file and folder input selection events.
+   */
+  const handlePickedFiles = async (event: ChangeEvent<HTMLInputElement>): Promise<void> => {
+    const pickedFiles = Array.from(event.target.files ?? []);
+    if (pickedFiles.length > 0) {
+      await onUploadRequested(pickedFiles);
+    }
+    event.target.value = '';
+  };
+
+  if (variant === 'inline') {
+    return (
+      <>
+        {isDragging && <div className="drop-overlay">Drop to upload</div>}
+        <div className="upload-actions upload-actions-inline">
+          <button type="button" onClick={() => fileInputRef.current?.click()} disabled={isUploading}>
+            Upload Files
+          </button>
+          <button type="button" onClick={() => folderInputRef.current?.click()} disabled={isUploading}>
+            Upload Folder
+          </button>
+        </div>
+        <input ref={fileInputRef} type="file" multiple hidden onChange={handlePickedFiles} />
+        <input ref={folderInputRef} type="file" hidden onChange={handlePickedFiles} />
+      </>
+    );
+  }
+
+  return (
+    <section className="upload-surface">
+      {isDragging && <div className="drop-overlay">Drop to upload</div>}
+      <div className="upload-actions">
+        <button type="button" onClick={() => fileInputRef.current?.click()} disabled={isUploading}>
+          Upload Files
+        </button>
+        <button type="button" onClick={() => folderInputRef.current?.click()} disabled={isUploading}>
+          Upload Folder
+        </button>
+      </div>
+      <p>{statusLabel}</p>
+      <input ref={fileInputRef} type="file" multiple hidden onChange={handlePickedFiles} />
+      <input ref={folderInputRef} type="file" hidden onChange={handlePickedFiles} />
+    </section>
+  );
+}
@@ -0,0 +1,119 @@
+/**
+ * Foundational compact tokens and primitives for the LedgerDock frontend.
+ */
+@import url('https://fonts.googleapis.com/css2?family=Archivo:wght@500;600;700&family=IBM+Plex+Mono:wght@400;500&family=IBM+Plex+Sans:wght@400;500;600&display=swap');
+
+:root {
+  --font-display: 'Archivo', sans-serif;
+  --font-body: 'IBM Plex Sans', sans-serif;
+  --font-mono: 'IBM Plex Mono', monospace;
+
+  --color-bg-0: #0b111b;
+  --color-bg-1: #101827;
+  --color-panel: #141e2f;
+  --color-panel-strong: #1b273a;
+  --color-panel-elevated: #1f2d44;
+  --color-border: #2f3f5a;
+  --color-border-strong: #46597a;
+  --color-text: #e4ebf7;
+  --color-text-muted: #9aa8c1;
+  --color-accent: #3f8dff;
+  --color-accent-strong: #2e70cf;
+  --color-success: #3bb07f;
+  --color-warning: #d89a42;
+  --color-danger: #d56a6a;
+  --color-focus: #79adff;
+
+  --radius-xs: 4px;
+  --radius-sm: 6px;
+  --radius-md: 8px;
+  --radius-lg: 10px;
+
+  --shadow-soft: 0 10px 24px rgba(0, 0, 0, 0.24);
+  --shadow-strong: 0 16px 34px rgba(0, 0, 0, 0.34);
+
+  --space-1: 0.25rem;
+  --space-2: 0.5rem;
+  --space-3: 0.75rem;
+  --space-4: 1rem;
+  --space-5: 1.5rem;
+
+  --transition-fast: 140ms ease;
+  --transition-base: 200ms ease;
+}
+
+* {
+  box-sizing: border-box;
+}
+
+html,
+body,
+#root {
+  min-height: 100%;
+}
+
+body {
+  margin: 0;
+  color: var(--color-text);
+  font-family: var(--font-body);
+  line-height: 1.45;
+  background:
+    radial-gradient(circle at 15% -5%, rgba(63, 141, 255, 0.24), transparent 38%),
+    radial-gradient(circle at 90% -15%, rgba(130, 166, 229, 0.15), transparent 35%),
+    linear-gradient(180deg, var(--color-bg-0) 0%, var(--color-bg-1) 100%);
+}
+
+body::before {
+  content: '';
+  position: fixed;
+  inset: 0;
+  pointer-events: none;
+  z-index: -1;
+  opacity: 0.35;
+  background-image:
+    linear-gradient(rgba(139, 162, 196, 0.08) 1px, transparent 1px),
+    linear-gradient(90deg, rgba(139, 162, 196, 0.08) 1px, transparent 1px);
+  background-size: 34px 34px;
+}
+
+button,
+input,
+select,
+textarea {
+  font: inherit;
+}
+
+input[type='checkbox'] {
+  accent-color: var(--color-accent);
+}
+
+:focus-visible {
+  outline: 2px solid var(--color-focus);
+  outline-offset: 1px;
+}
+
+@keyframes rise-in {
+  from {
+    opacity: 0;
+    transform: translateY(8px);
+  }
+  to {
+    opacity: 1;
+    transform: translateY(0);
+  }
+}
+
+@keyframes pulse-border {
+  from {
+    box-shadow: 0 0 0 0 rgba(121, 173, 255, 0.36);
+  }
+  to {
+    box-shadow: 0 0 0 8px rgba(121, 173, 255, 0);
+  }
+}
+
+@keyframes terminal-blink {
+  50% {
+    opacity: 0;
+  }
+}
@@ -0,0 +1,411 @@
+/**
+ * API client for backend DMS endpoints.
+ */
+import type {
+  AppSettings,
+  AppSettingsUpdate,
+  DocumentListResponse,
+  DmsDocument,
+  DmsDocumentDetail,
+  ProcessingLogListResponse,
+  SearchResponse,
+  TypeListResponse,
+  UploadResponse,
+} from '../types';
+
+/**
+ * Resolves backend base URL from environment with localhost fallback.
+ */
+const API_BASE = import.meta.env.VITE_API_BASE ?? 'http://localhost:8000/api/v1';
+
+/**
+ * Encodes query parameters while skipping undefined, null, and empty-string values.
+ */
+function buildQuery(params: Record<string, string | number | boolean | undefined | null>): string {
+  const searchParams = new URLSearchParams();
+  Object.entries(params).forEach(([key, value]) => {
+    if (value === undefined || value === null || value === '') {
+      return;
+    }
+    searchParams.set(key, String(value));
+  });
+  const encoded = searchParams.toString();
+  return encoded ? `?${encoded}` : '';
+}
+
+/**
+ * Extracts a filename from content-disposition headers with fallback support.
+ */
+function responseFilename(response: Response, fallback: string): string {
+  const disposition = response.headers.get('content-disposition') ?? '';
+  const match = disposition.match(/filename="?([^";]+)"?/i);
+  if (!match || !match[1]) {
+    return fallback;
+  }
+  return match[1];
+}
+
+/**
+ * Loads documents from the backend list endpoint.
+ */
+export async function listDocuments(options?: {
+  limit?: number;
+  offset?: number;
+  includeTrashed?: boolean;
+  onlyTrashed?: boolean;
+  pathPrefix?: string;
+  pathFilter?: string;
+  tagFilter?: string;
+  typeFilter?: string;
+  processedFrom?: string;
+  processedTo?: string;
+}): Promise<DocumentListResponse> {
+  const query = buildQuery({
+    limit: options?.limit ?? 100,
+    offset: options?.offset ?? 0,
+    include_trashed: options?.includeTrashed,
+    only_trashed: options?.onlyTrashed,
+    path_prefix: options?.pathPrefix,
+    path_filter: options?.pathFilter,
+    tag_filter: options?.tagFilter,
+    type_filter: options?.typeFilter,
+    processed_from: options?.processedFrom,
+    processed_to: options?.processedTo,
+  });
+  const response = await fetch(`${API_BASE}/documents${query}`);
+  if (!response.ok) {
+    throw new Error('Failed to load documents');
+  }
+  return response.json() as Promise<DocumentListResponse>;
+}
+
+/**
+ * Executes free-text search against backend search endpoint.
+ */
+export async function searchDocuments(
+  queryText: string,
+  options?: {
+    limit?: number;
+    offset?: number;
+    includeTrashed?: boolean;
+    onlyTrashed?: boolean;
+    pathFilter?: string;
+    tagFilter?: string;
+    typeFilter?: string;
+    processedFrom?: string;
+    processedTo?: string;
+  },
+): Promise<SearchResponse> {
+  const query = buildQuery({
+    query: queryText,
+    limit: options?.limit ?? 100,
+    offset: options?.offset ?? 0,
+    include_trashed: options?.includeTrashed,
+    only_trashed: options?.onlyTrashed,
+    path_filter: options?.pathFilter,
+    tag_filter: options?.tagFilter,
+    type_filter: options?.typeFilter,
+    processed_from: options?.processedFrom,
+    processed_to: options?.processedTo,
+  });
+  const response = await fetch(`${API_BASE}/search${query}`);
+  if (!response.ok) {
+    throw new Error('Search failed');
+  }
+  return response.json() as Promise<SearchResponse>;
+}
+
+/**
+ * Loads processing logs for recent upload, OCR, summarization, routing, and indexing steps.
+ */
+export async function listProcessingLogs(options?: {
+  limit?: number;
+  offset?: number;
+  documentId?: string;
+}): Promise<ProcessingLogListResponse> {
+  const query = buildQuery({
+    limit: options?.limit ?? 120,
+    offset: options?.offset ?? 0,
+    document_id: options?.documentId,
+  });
+  const response = await fetch(`${API_BASE}/processing/logs${query}`);
+  if (!response.ok) {
+    throw new Error('Failed to load processing logs');
+  }
+  return response.json() as Promise<ProcessingLogListResponse>;
+}
+
+/**
+ * Trims persisted processing logs while keeping recent document sessions.
+ */
+export async function trimProcessingLogs(options?: {
+  keepDocumentSessions?: number;
+  keepUnboundEntries?: number;
+}): Promise<{ deleted_document_entries: number; deleted_unbound_entries: number }> {
+  const query = buildQuery({
+    keep_document_sessions: options?.keepDocumentSessions ?? 2,
+    keep_unbound_entries: options?.keepUnboundEntries ?? 80,
+  });
+  const response = await fetch(`${API_BASE}/processing/logs/trim${query}`, {
+    method: 'POST',
+  });
+  if (!response.ok) {
+    throw new Error('Failed to trim processing logs');
+  }
+  return response.json() as Promise<{ deleted_document_entries: number; deleted_unbound_entries: number }>;
+}
+
+/**
+ * Clears all persisted processing logs.
+ */
+export async function clearProcessingLogs(): Promise<{ deleted_entries: number }> {
+  const response = await fetch(`${API_BASE}/processing/logs/clear`, {
+    method: 'POST',
+  });
+  if (!response.ok) {
+    throw new Error('Failed to clear processing logs');
+  }
+  return response.json() as Promise<{ deleted_entries: number }>;
+}
+
+/**
+ * Returns existing tags for suggestion UIs.
+ */
+export async function listTags(includeTrashed = false): Promise<string[]> {
+  const query = buildQuery({ include_trashed: includeTrashed });
+  const response = await fetch(`${API_BASE}/documents/tags${query}`);
+  if (!response.ok) {
+    throw new Error('Failed to load tags');
+  }
+  const payload = (await response.json()) as { tags: string[] };
+  return payload.tags;
+}
+
+/**
+ * Returns existing logical paths for suggestion UIs.
+ */
+export async function listPaths(includeTrashed = false): Promise<string[]> {
+  const query = buildQuery({ include_trashed: includeTrashed });
+  const response = await fetch(`${API_BASE}/documents/paths${query}`);
+  if (!response.ok) {
+    throw new Error('Failed to load paths');
+  }
+  const payload = (await response.json()) as { paths: string[] };
+  return payload.paths;
+}
+
+/**
+ * Returns distinct type values from extension, MIME, and image text categories.
+ */
+export async function listTypes(includeTrashed = false): Promise<string[]> {
+  const query = buildQuery({ include_trashed: includeTrashed });
+  const response = await fetch(`${API_BASE}/documents/types${query}`);
+  if (!response.ok) {
+    throw new Error('Failed to load document types');
+  }
+  const payload = (await response.json()) as TypeListResponse;
+  return payload.types;
+}
+
+/**
+ * Uploads files with optional logical path and tags.
+ */
+export async function uploadDocuments(
+  files: File[],
+  options: {
+    logicalPath: string;
+    tags: string;
+    conflictMode: 'ask' | 'replace' | 'duplicate';
+  },
+): Promise<UploadResponse> {
+  const formData = new FormData();
+  files.forEach((file) => {
+    formData.append('files', file, file.name);
+    const relativePath = (file as File & { webkitRelativePath?: string }).webkitRelativePath || file.name;
+    formData.append('relative_paths', relativePath);
+  });
+  formData.append('logical_path', options.logicalPath);
+  formData.append('tags', options.tags);
+  formData.append('conflict_mode', options.conflictMode);
+
+  const response = await fetch(`${API_BASE}/documents/upload`, {
+    method: 'POST',
+    body: formData,
+  });
+  if (!response.ok) {
+    throw new Error('Upload failed');
+  }
+  return response.json() as Promise<UploadResponse>;
+}
+
+/**
+ * Updates document metadata and optionally trains routing suggestions.
+ */
+export async function updateDocumentMetadata(
+  documentId: string,
+  payload: { original_filename?: string; logical_path?: string; tags?: string[] },
+): Promise<DmsDocument> {
+  const response = await fetch(`${API_BASE}/documents/${documentId}`, {
+    method: 'PATCH',
+    headers: {
+      'Content-Type': 'application/json',
+    },
+    body: JSON.stringify(payload),
+  });
+  if (!response.ok) {
+    throw new Error('Failed to update document metadata');
+  }
+  return response.json() as Promise<DmsDocument>;
+}
+
+/**
+ * Moves a document to trash state without removing stored files.
+ */
+export async function trashDocument(documentId: string): Promise<DmsDocument> {
+  const response = await fetch(`${API_BASE}/documents/${documentId}/trash`, { method: 'POST' });
+  if (!response.ok) {
+    throw new Error('Failed to trash document');
+  }
+  return response.json() as Promise<DmsDocument>;
+}
+
+/**
+ * Restores a document from trash to active state.
+ */
+export async function restoreDocument(documentId: string): Promise<DmsDocument> {
+  const response = await fetch(`${API_BASE}/documents/${documentId}/restore`, { method: 'POST' });
+  if (!response.ok) {
+    throw new Error('Failed to restore document');
+  }
+  return response.json() as Promise<DmsDocument>;
+}
+
+/**
+ * Permanently deletes a document record and associated stored files.
+ */
+export async function deleteDocument(documentId: string): Promise<{ deleted_documents: number; deleted_files: number }> {
+  const response = await fetch(`${API_BASE}/documents/${documentId}`, { method: 'DELETE' });
+  if (!response.ok) {
+    throw new Error('Failed to delete document');
+  }
+  return response.json() as Promise<{ deleted_documents: number; deleted_files: number }>;
+}
+
+/**
+ * Loads full details for one document, including extracted text content.
+ */
+export async function getDocumentDetails(documentId: string): Promise<DmsDocumentDetail> {
+  const response = await fetch(`${API_BASE}/documents/${documentId}`);
+  if (!response.ok) {
+    throw new Error('Failed to load document details');
+  }
+  return response.json() as Promise<DmsDocumentDetail>;
+}
+
+/**
+ * Re-enqueues one document for extraction and classification processing.
+ */
+export async function reprocessDocument(documentId: string): Promise<DmsDocument> {
+  const response = await fetch(`${API_BASE}/documents/${documentId}/reprocess`, {
+    method: 'POST',
+  });
+  if (!response.ok) {
+    throw new Error('Failed to reprocess document');
+  }
+  return response.json() as Promise<DmsDocument>;
+}
+
+/**
+ * Builds preview URL for a specific document.
+ */
+export function previewUrl(documentId: string): string {
+  return `${API_BASE}/documents/${documentId}/preview`;
+}
+
+/**
+ * Builds thumbnail URL for dashboard card rendering.
+ */
+export function thumbnailUrl(documentId: string): string {
+  return `${API_BASE}/documents/${documentId}/thumbnail`;
+}
+
+/**
+ * Builds download URL for a specific document.
+ */
+export function downloadUrl(documentId: string): string {
+  return `${API_BASE}/documents/${documentId}/download`;
+}
+
+/**
+ * Builds direct markdown-content download URL for one document.
+ */
+export function contentMarkdownUrl(documentId: string): string {
+  return `${API_BASE}/documents/${documentId}/content-md`;
+}
+
+/**
+ * Exports extracted content markdown files for selected documents or path filters.
+ */
+export async function exportContentsMarkdown(payload: {
+  document_ids?: string[];
+  path_prefix?: string;
+  include_trashed?: boolean;
+  only_trashed?: boolean;
+}): Promise<{ blob: Blob; filename: string }> {
+  const response = await fetch(`${API_BASE}/documents/content-md/export`, {
+    method: 'POST',
+    headers: {
+      'Content-Type': 'application/json',
+    },
+    body: JSON.stringify(payload),
+  });
+  if (!response.ok) {
+    throw new Error('Failed to export markdown contents');
+  }
+  const blob = await response.blob();
+  return {
+    blob,
+    filename: responseFilename(response, 'document-contents-md.zip'),
+  };
+}
+
+/**
+ * Retrieves persisted application settings from backend.
+ */
+export async function getAppSettings(): Promise<AppSettings> {
+  const response = await fetch(`${API_BASE}/settings`);
+  if (!response.ok) {
+    throw new Error('Failed to load application settings');
+  }
+  return response.json() as Promise<AppSettings>;
+}
+
+/**
+ * Updates persisted application settings, including provider and task settings for OpenAI-compatible model execution.
+ */
+export async function updateAppSettings(payload: AppSettingsUpdate): Promise<AppSettings> {
+  const response = await fetch(`${API_BASE}/settings`, {
+    method: 'PATCH',
+    headers: {
+      'Content-Type': 'application/json',
+    },
+    body: JSON.stringify(payload),
+  });
+  if (!response.ok) {
+    throw new Error('Failed to update settings');
+  }
+  return response.json() as Promise<AppSettings>;
+}
+
+/**
+ * Resets persisted provider and task settings to backend defaults.
+ */
+export async function resetAppSettings(): Promise<AppSettings> {
+  const response = await fetch(`${API_BASE}/settings/reset`, {
+    method: 'POST',
+  });
+  if (!response.ok) {
+    throw new Error('Failed to reset settings');
+  }
+  return response.json() as Promise<AppSettings>;
+}
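The query-building rule used by the list and search helpers above can be illustrated with a standalone sketch. It assumes the same behaviour as `buildQuery` in this file: `undefined`, `null`, and empty-string values are dropped, while `0` and `false` are kept and stringified. The names `buildQuerySketch`, `QueryValue`, and `listQuery` are hypothetical and exist only for this example.

```typescript
// Hypothetical standalone copy of the query builder, for illustration only.
type QueryValue = string | number | boolean | undefined | null;

function buildQuerySketch(params: Record<string, QueryValue>): string {
  const searchParams = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    // Skip values that carry no filter information.
    if (value === undefined || value === null || value === '') {
      continue;
    }
    searchParams.set(key, String(value));
  }
  const encoded = searchParams.toString();
  return encoded ? `?${encoded}` : '';
}

// Falsy-but-meaningful values (0, false) survive; unset filters are omitted.
const listQuery = buildQuerySketch({
  limit: 100,
  offset: 0,
  include_trashed: false,
  tag_filter: undefined,
  path_prefix: '',
});
console.log(listQuery); // ?limit=100&offset=0&include_trashed=false
```

One consequence of this design is that a caller cannot send an explicitly empty filter value; if the backend ever needs to distinguish "empty" from "absent", the empty-string check would have to be relaxed.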
@@ -0,0 +1,18 @@
+/**
+ * Frontend application bootstrap for React rendering.
+ */
+import { StrictMode } from 'react';
+import { createRoot } from 'react-dom/client';
+
+import App from './App';
+import './design-foundation.css';
+import './styles.css';
+
+/**
+ * Mounts the root React application into the document.
+ */
+createRoot(document.getElementById('root')!).render(
+  <StrictMode>
+    <App />
+  </StrictMode>,
+);
@@ -0,0 +1,292 @@
+/**
+ * Shared TypeScript API contracts used by frontend components.
+ */
+
+/**
+ * Enumerates backend document lifecycle states.
+ */
+export type DocumentStatus = 'queued' | 'processed' | 'unsupported' | 'error' | 'trashed';
+
+/**
+ * Represents one document row returned by backend APIs.
+ */
+export interface DmsDocument {
+  id: string;
+  original_filename: string;
+  source_relative_path: string;
+  mime_type: string;
+  extension: string;
+  size_bytes: number;
+  sha256: string;
+  logical_path: string;
+  suggested_path: string | null;
+  image_text_type: string | null;
+  handwriting_style_id: string | null;
+  tags: string[];
+  suggested_tags: string[];
+  status: DocumentStatus;
+  preview_available: boolean;
+  is_archive_member: boolean;
+  archived_member_path: string | null;
+  parent_document_id: string | null;
+  replaces_document_id: string | null;
+  created_at: string;
+  processed_at: string | null;
+}
+
+/**
+ * Represents full document detail payload including extracted text and metadata.
+ */
+export interface DmsDocumentDetail extends DmsDocument {
+  extracted_text: string;
+  metadata_json: Record<string, unknown>;
+}
+
+/**
+ * Represents paginated document list payload.
+ */
+export interface DocumentListResponse {
+  total: number;
+  items: DmsDocument[];
+}
+
+/**
+ * Represents search result payload.
+ */
+export interface SearchResponse {
+  total: number;
+  items: DmsDocument[];
+}
+
+/**
+ * Represents distinct document type values available for filter controls.
+ */
+export interface TypeListResponse {
+  types: string[];
+}
+
+/**
+ * Represents one processing pipeline event entry returned by the backend.
+ */
+export interface ProcessingLogEntry {
+  id: number;
+  created_at: string;
+  level: string;
+  stage: string;
+  event: string;
+  document_id: string | null;
+  document_filename: string | null;
+  provider_id: string | null;
+  model_name: string | null;
+  prompt_text: string | null;
+  response_text: string | null;
+  payload_json: Record<string, unknown>;
+}
+
+/**
+ * Represents paginated processing log response payload.
+ */
+export interface ProcessingLogListResponse {
+  total: number;
+  items: ProcessingLogEntry[];
+}
+
+/**
+ * Represents upload conflict information.
+ */
+export interface UploadConflict {
+  original_filename: string;
+  sha256: string;
+  existing_document_id: string;
+}
+
+/**
+ * Represents upload response payload.
+ */
+export interface UploadResponse {
+  uploaded: DmsDocument[];
+  conflicts: UploadConflict[];
+}
+
+/**
+ * Represents one model provider binding served by the backend.
+ */
+export interface ProviderSettings {
+  id: string;
+  label: string;
+  provider_type: string;
+  base_url: string;
+  timeout_seconds: number;
+  api_key_set: boolean;
+  api_key_masked: string;
+}
+
+/**
+ * Represents OCR task settings served by the backend.
+ */
+export interface OcrTaskSettings {
+  enabled: boolean;
+  provider_id: string;
+  model: string;
+  prompt: string;
+}
+
+/**
+ * Represents summarization task settings served by the backend.
+ */
+export interface SummaryTaskSettings {
+  enabled: boolean;
+  provider_id: string;
+  model: string;
+  prompt: string;
+  max_input_tokens: number;
+}
+
+/**
+ * Represents routing task settings served by the backend.
+ */
+export interface RoutingTaskSettings {
+  enabled: boolean;
+  provider_id: string;
+  model: string;
+  prompt: string;
+  neighbor_count: number;
+  neighbor_min_similarity: number;
+  auto_apply_confidence_threshold: number;
+  auto_apply_neighbor_similarity_threshold: number;
+  neighbor_path_override_enabled: boolean;
+  neighbor_path_override_min_similarity: number;
+  neighbor_path_override_min_gap: number;
+  neighbor_path_override_max_confidence: number;
+}
+
+/**
+ * Represents default upload destination and tags.
+ */
+export interface UploadDefaultsSettings {
+  logical_path: string;
+  tags: string[];
+}
+
+/**
+ * Represents display preferences for document listings.
+ */
+export interface DisplaySettings {
+  cards_per_page: number;
+  log_typing_animation_enabled: boolean;
+}
+
+/**
+ * Represents one predefined logical path and discoverability scope.
+ */
+export interface PredefinedPathEntry {
+  value: string;
+  global_shared: boolean;
+}
+
+/**
+ * Represents one predefined tag and discoverability scope.
+ */
+export interface PredefinedTagEntry {
+  value: string;
+  global_shared: boolean;
+}
+
+/**
+ * Represents handwriting-style clustering settings for Typesense image embeddings.
+ */
+export interface HandwritingStyleClusteringSettings {
+  enabled: boolean;
+  embed_model: string;
+  neighbor_limit: number;
+  match_min_similarity: number;
+  bootstrap_match_min_similarity: number;
+  bootstrap_sample_size: number;
+  image_max_side: number;
+}
+
+/**
+ * Represents all task-level settings served by the backend.
+ */
+export interface TaskSettings {
+  ocr_handwriting: OcrTaskSettings;
+  summary_generation: SummaryTaskSettings;
+  routing_classification: RoutingTaskSettings;
+}
+
+/**
+ * Represents runtime settings served by the backend.
+ */
+export interface AppSettings {
+  upload_defaults: UploadDefaultsSettings;
+  display: DisplaySettings;
+  handwriting_style_clustering: HandwritingStyleClusteringSettings;
+  predefined_paths: PredefinedPathEntry[];
+  predefined_tags: PredefinedTagEntry[];
+  providers: ProviderSettings[];
+  tasks: TaskSettings;
+}
+
+/**
+ * Represents provider settings update input payload.
+ */
+export interface ProviderSettingsUpdate {
+  id: string;
+  label: string;
+  provider_type: string;
+  base_url: string;
+  timeout_seconds: number;
+  api_key?: string;
+  clear_api_key?: boolean;
+}
+
+/**
+ * Represents task settings update input payload.
+ */
+export interface TaskSettingsUpdate {
+  ocr_handwriting?: Partial<OcrTaskSettings>;
+  summary_generation?: Partial<SummaryTaskSettings>;
+  routing_classification?: Partial<RoutingTaskSettings>;
+}
+
+/**
+ * Represents upload defaults update input payload.
+ */
+export interface UploadDefaultsSettingsUpdate {
+  logical_path?: string;
+  tags?: string[];
+}
+
+/**
+ * Represents display settings update input payload.
+ */
+export interface DisplaySettingsUpdate {
+  cards_per_page?: number;
+  log_typing_animation_enabled?: boolean;
+}
+
+/**
+ * Represents handwriting-style clustering settings update payload.
+ */
+export interface HandwritingStyleClusteringSettingsUpdate {
+  enabled?: boolean;
+  embed_model?: string;
+  neighbor_limit?: number;
+  match_min_similarity?: number;
+  bootstrap_match_min_similarity?: number;
+  bootstrap_sample_size?: number;
+  image_max_side?: number;
+}
+
+/**
+ * Represents app settings update payload sent to backend.
+ */
+export interface AppSettingsUpdate {
+  upload_defaults?: UploadDefaultsSettingsUpdate;
+  display?: DisplaySettingsUpdate;
+  handwriting_style_clustering?: HandwritingStyleClusteringSettingsUpdate;
+  predefined_paths?: PredefinedPathEntry[];
+  predefined_tags?: PredefinedTagEntry[];
+  providers?: ProviderSettingsUpdate[];
+  tasks?: TaskSettingsUpdate;
+}
@@ -0,0 +1,19 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "useDefineForClassFields": true,
+    "lib": ["ES2022", "DOM", "DOM.Iterable"],
+    "module": "ESNext",
+    "skipLibCheck": true,
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": false,
+    "resolveJsonModule": true,
+    "isolatedModules": true,
+    "noEmit": true,
+    "jsx": "react-jsx",
+    "strict": true,
+    "noFallthroughCasesInSwitch": true,
+    "types": ["vite/client"]
+  },
+  "include": ["src"]
+}
@@ -0,0 +1,9 @@
+{
+  "compilerOptions": {
+    "composite": true,
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowSyntheticDefaultImports": true
+  },
+  "include": ["vite.config.ts"]
+}
@@ -0,0 +1,14 @@
+/**
+ * Vite configuration for the DMS frontend application.
+ */
+import { defineConfig } from 'vite';
+
+/**
+ * Exports frontend build and dev-server settings.
+ */
+export default defineConfig({
+  server: {
+    host: '0.0.0.0',
+    port: 5173,
+  },
+});
@@ -0,0 +1 @@
+"""Backend application package for the DMS service."""
@@ -0,0 +1 @@
+"""API package containing route modules and router registration."""
@@ -0,0 +1 @@
+"""Core settings and shared configuration package."""
@@ -0,0 +1 @@
+"""Database package exposing engine and session utilities."""
@@ -0,0 +1 @@
+"""Pydantic schema package for API request and response models."""
@@ -0,0 +1 @@
+"""Domain services package for storage, extraction, and classification logic."""
@@ -0,0 +1 @@
+"""Background worker package for queueing and document processing tasks."""