Data Model Reference
Primary SQLAlchemy models are defined in backend/app/models/.
documents
Model: Document in backend/app/models/document.py
Purpose:
- Stores source file identity, storage location, extracted content, lifecycle status, and classification metadata.
Core fields:
- Identity and source: id, original_filename, source_relative_path, stored_relative_path
- File attributes: mime_type, extension, sha256, size_bytes
- Organization: logical_path, suggested_path, tags, suggested_tags
- Processing outputs: extracted_text, image_text_type, handwriting_style_id, preview_available
- Lifecycle and relations: status, is_archive_member, archived_member_path, parent_document_id, replaces_document_id
- Metadata and timestamps: metadata_json, created_at, processed_at, updated_at
Enum DocumentStatus:
- queued
- processed
- unsupported
- error
- trashed
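As a rough illustration of how these five states could be expressed in Python, here is a minimal sketch using the standard library `enum` module. The member names and the `str` mixin are assumptions; only the five values come from this reference, and the actual definition in backend/app/models/document.py may differ.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    """Lifecycle states listed for the Document model (sketch only)."""

    QUEUED = "queued"
    PROCESSED = "processed"
    UNSUPPORTED = "unsupported"
    ERROR = "error"
    TRASHED = "trashed"


# Values round-trip from the plain strings a database column would hold.
status = DocumentStatus("queued")

# The three outcomes a worker can set after processing.
worker_outcomes = {
    DocumentStatus.PROCESSED,
    DocumentStatus.UNSUPPORTED,
    DocumentStatus.ERROR,
}
is_finished = status in worker_outcomes
```

Mixing in `str` keeps members comparable to raw strings, which can be convenient when serializing to JSON, though whether the project does this is not stated here.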
Relationships:
- Self-referential parent_document relationship for archive extraction trees.
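To make the archive tree concrete, the following sketch models the self-reference with the standard library `sqlite3` module instead of SQLAlchemy, so it runs without dependencies. The table is a cut-down stand-in for documents, the integer ids and sample filenames are assumptions, and `descendant_ids` is a hypothetical helper, not a function from the codebase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        original_filename TEXT NOT NULL,
        is_archive_member INTEGER NOT NULL DEFAULT 0,
        parent_document_id INTEGER REFERENCES documents(id)
    )
    """
)
# One uploaded archive containing a file and a nested archive.
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        (1, "bundle.zip", 0, None),
        (2, "report.pdf", 1, 1),
        (3, "nested.zip", 1, 1),
        (4, "scan.png", 1, 3),
    ],
)


def descendant_ids(conn, root_id):
    """Return ids of every document extracted, directly or not, from root_id."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(id) AS (
            SELECT id FROM documents WHERE parent_document_id = ?
            UNION ALL
            SELECT d.id FROM documents d JOIN tree t ON d.parent_document_id = t.id
        )
        SELECT id FROM tree ORDER BY id
        """,
        (root_id,),
    ).fetchall()
    return [row[0] for row in rows]


members = descendant_ids(conn, 1)
```

A traversal like this is one way to collect the archive descendants that a whole-tree operation needs to touch.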
processing_logs
Model: ProcessingLogEntry in backend/app/models/processing_log.py
Purpose:
- Stores timestamped pipeline events for upload, extraction, OCR, routing, indexing, and errors.
Core fields:
- Event identity and timing: id, created_at
- Event classification: level, stage, event
- Document linkage: document_id, document_filename
- Model context: provider_id, model_name
- Prompt or response traces: prompt_text, response_text
- Structured event payload: payload_json
Foreign keys:
- document_id references documents.id with ON DELETE SET NULL.
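The effect of ON DELETE SET NULL can be seen in a small standalone sketch. It uses `sqlite3` from the standard library with reduced versions of both tables. The integer ids and the sample stage, event, and filename values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, original_filename TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE processing_logs (
        id INTEGER PRIMARY KEY,
        stage TEXT NOT NULL,
        event TEXT NOT NULL,
        document_id INTEGER REFERENCES documents(id) ON DELETE SET NULL,
        document_filename TEXT
    )
    """
)
conn.execute("INSERT INTO documents VALUES (1, 'invoice.pdf')")
conn.execute(
    "INSERT INTO processing_logs VALUES (1, 'ocr', 'completed', 1, 'invoice.pdf')"
)

# Deleting the document keeps the log row but clears its document_id.
conn.execute("DELETE FROM documents WHERE id = 1")
remaining = conn.execute(
    "SELECT document_id, document_filename FROM processing_logs"
).fetchall()
```

After the delete, the log entry survives with a null document_id, while the separately stored document_filename still identifies which file the event concerned.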
Model Lifecycle Notes
- Upload inserts a Document row in queued state and enqueues background processing.
- Worker updates extraction results and final status (processed, unsupported, or error).
- Trash and restore operations toggle status while preserving source files until permanent delete.
- Permanent delete removes the document tree (including archive descendants) and associated stored files.
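The status changes described in these notes can be summarized as a small transition table. This is a sketch under stated assumptions: the notes do not say which status a restore returns to or whether a queued document can be trashed, so both choices below are guesses, and `ALLOWED_TRANSITIONS` and `transition` are hypothetical names.

```python
# Outcomes a worker may set once processing finishes.
WORKER_OUTCOMES = {"processed", "unsupported", "error"}

# Allowed status changes read off the lifecycle notes.
# Assumption: a restore may return to any non-trashed status.
# Assumption: a queued document may be trashed before processing.
ALLOWED_TRANSITIONS = {
    "queued": WORKER_OUTCOMES | {"trashed"},
    "processed": {"trashed"},
    "unsupported": {"trashed"},
    "error": {"trashed"},
    "trashed": WORKER_OUTCOMES | {"queued"},
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the change is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal status change: {current} -> {new}")
    return new


# Upload, successful processing, trash, then restore.
status = "queued"
status = transition(status, "processed")
status = transition(status, "trashed")
status = transition(status, "processed")
```

Permanent delete is deliberately absent from the table because it removes the row instead of changing its status.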