Fix predefined catalog visibility and port security must-know guidance
This commit is contained in:
75
README.md
75
README.md
@@ -75,34 +75,69 @@ docker compose down
|
|||||||
|
|
||||||
## Security Must-Know Before Real User Deployment
|
## Security Must-Know Before Real User Deployment
|
||||||
|
|
||||||
This repository starts in a development-friendly mode. Before exposing it to real users or untrusted networks, verify these controls:
|
The items below port the `MUST KNOW User-Dependent Risks` from `REPORT.md` into explicit operator actions.
|
||||||
|
|
||||||
1. Environment mode and host binding:
|
### High: Development-first defaults can be promoted to production
|
||||||
|
|
||||||
|
Avoid:
|
||||||
- Set `APP_ENV=production`.
|
- Set `APP_ENV=production`.
|
||||||
- Keep `HOST_BIND_IP=127.0.0.1` and publish through an HTTPS reverse proxy instead of broad host bind.
|
- Set `PROVIDER_BASE_URL_ALLOW_HTTP=false`.
|
||||||
|
- Set `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=false`.
|
||||||
|
- Set a strict non-empty `PROVIDER_BASE_URL_ALLOWLIST` for approved provider hosts only.
|
||||||
|
- Set `PUBLIC_BASE_URL` to HTTPS.
|
||||||
|
- Restrict `CORS_ORIGINS` to exact production frontend origins.
|
||||||
|
- Use `REDIS_URL` with `rediss://`.
|
||||||
|
- Set `REDIS_SECURITY_MODE=strict`.
|
||||||
|
- Set `REDIS_TLS_MODE=required`.
|
||||||
|
- Keep `HOST_BIND_IP=127.0.0.1` and expose services only through an HTTPS reverse proxy.
|
||||||
|
|
||||||
2. Bootstrap credentials:
|
Remedy:
|
||||||
- Replace all `AUTH_BOOTSTRAP_*` values with strong unique passwords before first public deployment.
|
- Immediately correct the values above and redeploy `api` and `worker` (`docker compose up -d api worker`).
|
||||||
- Disable optional bootstrap user credentials unless they are needed.
|
- Rotate `AUTH_BOOTSTRAP_*` credentials, provider API keys, and Redis credentials if insecure values were used in a reachable environment.
|
||||||
|
- Re-check `.env.example` and `docker-compose.yml` before each production promotion.
|
||||||
|
|
||||||
3. Processing log text persistence:
|
### Medium: Login throttle IP identity depends on proxy trust model
|
||||||
- Keep `PROCESSING_LOG_STORE_MODEL_IO_TEXT=false` and `PROCESSING_LOG_STORE_PAYLOAD_TEXT=false` unless temporary debugging is required.
|
|
||||||
- Enabling these values can store sensitive prompt, response, and payload text.
|
|
||||||
|
|
||||||
4. Provider outbound restrictions:
|
Current behavior:
|
||||||
- Keep `PROVIDER_BASE_URL_ALLOW_HTTP=false` and `PROVIDER_BASE_URL_ALLOW_PRIVATE_NETWORK=false`.
|
- Login throttle identity currently uses `request.client.host` directly.
|
||||||
- Set a strict `PROVIDER_BASE_URL_ALLOWLIST` containing only approved provider hosts.
|
|
||||||
|
|
||||||
5. Public URL and CORS posture:
|
Avoid:
|
||||||
- Use HTTPS in `PUBLIC_BASE_URL`.
|
- Deploy so the backend receives true client IP addresses and does not collapse all traffic to one proxy source IP.
|
||||||
- Restrict `CORS_ORIGINS` to exact production frontend origins only.
|
- Validate lockout behavior with multiple client IPs before going live behind a proxy.
|
||||||
|
|
||||||
6. Redis transport security:
|
Remedy:
|
||||||
- For live deployments, use `REDIS_URL` with `rediss://`, set `REDIS_SECURITY_MODE=strict`, and set `REDIS_TLS_MODE=required`.
|
- If lockouts affect many users at once, temporarily increase `AUTH_LOGIN_FAILURE_LIMIT` and tune lockout timings to reduce impact while mitigation is in progress.
|
||||||
|
- Update network and proxy topology so client IP identity is preserved for the backend, then re-run lockout validation tests.
|
||||||
|
|
||||||
7. Development compose defaults:
|
### Medium: API documentation endpoints are exposed by default
|
||||||
- Review `.env.example` and `docker-compose.yml` security-related defaults before deployment.
|
|
||||||
- Do not promote development defaults unchanged into production.
|
Avoid:
|
||||||
|
- Block public access to `/docs`, `/redoc`, and `/openapi.json` at the reverse proxy or edge firewall.
|
||||||
|
- Keep docs endpoints reachable only from trusted internal/admin networks.
|
||||||
|
|
||||||
|
Remedy:
|
||||||
|
- Add deny rules for those paths immediately and reload the proxy.
|
||||||
|
- Verify those routes return `403` or `404` from untrusted networks.
|
||||||
|
|
||||||
|
### Medium: Bearer token is stored in browser `sessionStorage`
|
||||||
|
|
||||||
|
Avoid:
|
||||||
|
- Enforce strict CSP and disallow inline script execution where possible.
|
||||||
|
- Avoid rendering untrusted HTML or script-capable content in the frontend.
|
||||||
|
- Keep dependencies patched to reduce known XSS vectors.
|
||||||
|
|
||||||
|
Remedy:
|
||||||
|
- If XSS is suspected, revoke active sessions, rotate privileged credentials, and redeploy frontend fixes before restoring user access.
|
||||||
|
- Treat exposed browser sessions as compromised until revocation and credential rotation are complete.
|
||||||
|
|
||||||
|
### Low: Typesense transport defaults to HTTP on internal network
|
||||||
|
|
||||||
|
Avoid:
|
||||||
|
- Keep Typesense on isolated internal networks only.
|
||||||
|
- Do not expose Typesense service ports directly to untrusted networks.
|
||||||
|
|
||||||
|
Remedy:
|
||||||
|
- For cross-host or untrusted network paths, terminate TLS in front of Typesense (or use equivalent secure service networking) and require encrypted transport for all clients.
|
||||||
|
|
||||||
## Common Operations
|
## Common Operations
|
||||||
|
|
||||||
|
|||||||
@@ -50,6 +50,31 @@ def _scope_document_statement_for_auth_context(statement, auth_context: AuthCont
|
|||||||
return statement.where(Document.owner_user_id == auth_context.user_id)
|
return statement.where(Document.owner_user_id == auth_context.user_id)
|
||||||
|
|
||||||
|
|
||||||
|
def _is_predefined_entry_visible_to_auth_context(entry: dict[str, object], auth_context: AuthContext) -> bool:
|
||||||
|
"""Returns whether one predefined catalog entry is visible to the active caller role."""
|
||||||
|
|
||||||
|
if auth_context.role == UserRole.ADMIN:
|
||||||
|
return True
|
||||||
|
return bool(entry.get("global_shared", False))
|
||||||
|
|
||||||
|
|
||||||
|
def _collect_visible_predefined_values(
|
||||||
|
entries: list[dict[str, object]],
|
||||||
|
*,
|
||||||
|
auth_context: AuthContext,
|
||||||
|
) -> set[str]:
|
||||||
|
"""Collects normalized predefined values visible for the active caller role."""
|
||||||
|
|
||||||
|
visible_values: set[str] = set()
|
||||||
|
for entry in entries:
|
||||||
|
if not _is_predefined_entry_visible_to_auth_context(entry, auth_context):
|
||||||
|
continue
|
||||||
|
normalized = str(entry.get("value", "")).strip()
|
||||||
|
if normalized:
|
||||||
|
visible_values.add(normalized)
|
||||||
|
return visible_values
|
||||||
|
|
||||||
|
|
||||||
def _ensure_document_access(document: Document, auth_context: AuthContext) -> None:
|
def _ensure_document_access(document: Document, auth_context: AuthContext) -> None:
|
||||||
"""Enforces owner-level access for non-admin users and raises not-found on violations."""
|
"""Enforces owner-level access for non-admin users and raises not-found on violations."""
|
||||||
|
|
||||||
@@ -397,9 +422,10 @@ def list_tags(
|
|||||||
rows = session.execute(statement).scalars().all()
|
rows = session.execute(statement).scalars().all()
|
||||||
tags = {tag for row in rows for tag in row if tag}
|
tags = {tag for row in rows for tag in row if tag}
|
||||||
tags.update(
|
tags.update(
|
||||||
str(item.get("value", "")).strip()
|
_collect_visible_predefined_values(
|
||||||
for item in read_predefined_tags_settings()
|
read_predefined_tags_settings(),
|
||||||
if str(item.get("value", "")).strip()
|
auth_context=auth_context,
|
||||||
|
)
|
||||||
)
|
)
|
||||||
tags = sorted(tags)
|
tags = sorted(tags)
|
||||||
return {"tags": tags}
|
return {"tags": tags}
|
||||||
@@ -421,9 +447,10 @@ def list_paths(
|
|||||||
rows = session.execute(statement).scalars().all()
|
rows = session.execute(statement).scalars().all()
|
||||||
paths = {row for row in rows if row}
|
paths = {row for row in rows if row}
|
||||||
paths.update(
|
paths.update(
|
||||||
str(item.get("value", "")).strip()
|
_collect_visible_predefined_values(
|
||||||
for item in read_predefined_paths_settings()
|
read_predefined_paths_settings(),
|
||||||
if str(item.get("value", "")).strip()
|
auth_context=auth_context,
|
||||||
|
)
|
||||||
)
|
)
|
||||||
paths = sorted(paths)
|
paths = sorted(paths)
|
||||||
return {"paths": paths}
|
return {"paths": paths}
|
||||||
|
|||||||
@@ -63,6 +63,22 @@ if "fastapi" not in sys.modules:
|
|||||||
|
|
||||||
return decorator
|
return decorator
|
||||||
|
|
||||||
|
def patch(self, *_args: object, **_kwargs: object): # type: ignore[no-untyped-def]
|
||||||
|
"""Returns no-op decorator for PATCH route declarations."""
|
||||||
|
|
||||||
|
def decorator(func): # type: ignore[no-untyped-def]
|
||||||
|
return func
|
||||||
|
|
||||||
|
return decorator
|
||||||
|
|
||||||
|
def delete(self, *_args: object, **_kwargs: object): # type: ignore[no-untyped-def]
|
||||||
|
"""Returns no-op decorator for DELETE route declarations."""
|
||||||
|
|
||||||
|
def decorator(func): # type: ignore[no-untyped-def]
|
||||||
|
return func
|
||||||
|
|
||||||
|
return decorator
|
||||||
|
|
||||||
class _Request:
|
class _Request:
|
||||||
"""Minimal request placeholder for route function import compatibility."""
|
"""Minimal request placeholder for route function import compatibility."""
|
||||||
|
|
||||||
@@ -88,13 +104,52 @@ if "fastapi" not in sys.modules:
|
|||||||
|
|
||||||
return dependency
|
return dependency
|
||||||
|
|
||||||
|
def _query(default=None, **_kwargs): # type: ignore[no-untyped-def]
|
||||||
|
"""Returns FastAPI-like query defaults for dependency-light route imports."""
|
||||||
|
|
||||||
|
return default
|
||||||
|
|
||||||
|
def _file(default=None, **_kwargs): # type: ignore[no-untyped-def]
|
||||||
|
"""Returns FastAPI-like file defaults for dependency-light route imports."""
|
||||||
|
|
||||||
|
return default
|
||||||
|
|
||||||
|
def _form(default=None, **_kwargs): # type: ignore[no-untyped-def]
|
||||||
|
"""Returns FastAPI-like form defaults for dependency-light route imports."""
|
||||||
|
|
||||||
|
return default
|
||||||
|
|
||||||
|
class _UploadFile:
|
||||||
|
"""Minimal UploadFile placeholder for route import compatibility."""
|
||||||
|
|
||||||
fastapi_stub.APIRouter = _APIRouter
|
fastapi_stub.APIRouter = _APIRouter
|
||||||
fastapi_stub.Depends = _depends
|
fastapi_stub.Depends = _depends
|
||||||
|
fastapi_stub.File = _file
|
||||||
|
fastapi_stub.Form = _form
|
||||||
fastapi_stub.HTTPException = _HTTPException
|
fastapi_stub.HTTPException = _HTTPException
|
||||||
|
fastapi_stub.Query = _query
|
||||||
fastapi_stub.Request = _Request
|
fastapi_stub.Request = _Request
|
||||||
|
fastapi_stub.UploadFile = _UploadFile
|
||||||
fastapi_stub.status = _Status()
|
fastapi_stub.status = _Status()
|
||||||
sys.modules["fastapi"] = fastapi_stub
|
sys.modules["fastapi"] = fastapi_stub
|
||||||
|
|
||||||
|
if "fastapi.responses" not in sys.modules:
|
||||||
|
fastapi_responses_stub = ModuleType("fastapi.responses")
|
||||||
|
|
||||||
|
class _Response:
|
||||||
|
"""Minimal response placeholder for route import compatibility."""
|
||||||
|
|
||||||
|
class _FileResponse(_Response):
|
||||||
|
"""Minimal file response placeholder for route import compatibility."""
|
||||||
|
|
||||||
|
class _StreamingResponse(_Response):
|
||||||
|
"""Minimal streaming response placeholder for route import compatibility."""
|
||||||
|
|
||||||
|
fastapi_responses_stub.Response = _Response
|
||||||
|
fastapi_responses_stub.FileResponse = _FileResponse
|
||||||
|
fastapi_responses_stub.StreamingResponse = _StreamingResponse
|
||||||
|
sys.modules["fastapi.responses"] = fastapi_responses_stub
|
||||||
|
|
||||||
if "fastapi.security" not in sys.modules:
|
if "fastapi.security" not in sys.modules:
|
||||||
fastapi_security_stub = ModuleType("fastapi.security")
|
fastapi_security_stub = ModuleType("fastapi.security")
|
||||||
|
|
||||||
@@ -267,8 +322,14 @@ if "app.services.handwriting_style" not in sys.modules:
|
|||||||
|
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
def _delete_many_handwriting_style_documents(*_args: object, **_kwargs: object) -> None:
|
||||||
|
"""No-op bulk style document delete stub for route import compatibility."""
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
handwriting_style_stub.assign_handwriting_style = _assign_handwriting_style
|
handwriting_style_stub.assign_handwriting_style = _assign_handwriting_style
|
||||||
handwriting_style_stub.delete_handwriting_style_document = _delete_handwriting_style_document
|
handwriting_style_stub.delete_handwriting_style_document = _delete_handwriting_style_document
|
||||||
|
handwriting_style_stub.delete_many_handwriting_style_documents = _delete_many_handwriting_style_documents
|
||||||
sys.modules["app.services.handwriting_style"] = handwriting_style_stub
|
sys.modules["app.services.handwriting_style"] = handwriting_style_stub
|
||||||
|
|
||||||
if "app.services.routing_pipeline" not in sys.modules:
|
if "app.services.routing_pipeline" not in sys.modules:
|
||||||
@@ -304,6 +365,7 @@ from fastapi import HTTPException
|
|||||||
|
|
||||||
from app.api.auth import AuthContext, require_admin
|
from app.api.auth import AuthContext, require_admin
|
||||||
from app.api import routes_auth as auth_routes_module
|
from app.api import routes_auth as auth_routes_module
|
||||||
|
from app.api import routes_documents as documents_routes_module
|
||||||
from app.core import config as config_module
|
from app.core import config as config_module
|
||||||
from app.models.auth import UserRole
|
from app.models.auth import UserRole
|
||||||
from app.models.processing_log import sanitize_processing_log_payload_value, sanitize_processing_log_text
|
from app.models.processing_log import sanitize_processing_log_payload_value, sanitize_processing_log_text
|
||||||
@@ -359,6 +421,125 @@ class AuthDependencyTests(unittest.TestCase):
|
|||||||
self.assertEqual(resolved.role, UserRole.ADMIN)
|
self.assertEqual(resolved.role, UserRole.ADMIN)
|
||||||
|
|
||||||
|
|
||||||
|
class DocumentCatalogVisibilityTests(unittest.TestCase):
|
||||||
|
"""Verifies predefined tag and path discovery visibility by caller role."""
|
||||||
|
|
||||||
|
class _ScalarSequence:
|
||||||
|
"""Provides SQLAlchemy-like scalar result chaining for route unit tests."""
|
||||||
|
|
||||||
|
def __init__(self, values: list[object]) -> None:
|
||||||
|
self._values = values
|
||||||
|
|
||||||
|
def scalars(self) -> "DocumentCatalogVisibilityTests._ScalarSequence":
|
||||||
|
"""Returns self to emulate `.scalars().all()` call chains."""
|
||||||
|
|
||||||
|
return self
|
||||||
|
|
||||||
|
def all(self) -> list[object]:
|
||||||
|
"""Returns deterministic sequence values for route helper queries."""
|
||||||
|
|
||||||
|
return list(self._values)
|
||||||
|
|
||||||
|
class _SessionStub:
|
||||||
|
"""Returns a fixed scalar sequence for route metadata queries."""
|
||||||
|
|
||||||
|
def __init__(self, values: list[object]) -> None:
|
||||||
|
self._values = values
|
||||||
|
|
||||||
|
def execute(self, _statement: object) -> "DocumentCatalogVisibilityTests._ScalarSequence":
|
||||||
|
"""Ignores query details and returns deterministic scalar sequence results."""
|
||||||
|
|
||||||
|
return DocumentCatalogVisibilityTests._ScalarSequence(self._values)
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _auth_context(role: UserRole) -> AuthContext:
|
||||||
|
"""Builds deterministic auth context fixtures for document discovery tests."""
|
||||||
|
|
||||||
|
return AuthContext(
|
||||||
|
user_id=uuid.uuid4(),
|
||||||
|
username=f"{role.value}-user",
|
||||||
|
role=role,
|
||||||
|
session_id=uuid.uuid4(),
|
||||||
|
expires_at=datetime.now(UTC),
|
||||||
|
)
|
||||||
|
|
||||||
|
def test_non_admin_only_receives_global_shared_predefined_tags_and_paths(self) -> None:
|
||||||
|
"""User role receives only globally shared predefined values in discovery responses."""
|
||||||
|
|
||||||
|
session = self._SessionStub(
|
||||||
|
values=[
|
||||||
|
["owner-tag", ""],
|
||||||
|
["owner-only-tag"],
|
||||||
|
]
|
||||||
|
)
|
||||||
|
predefined_tags = [
|
||||||
|
{"value": "SharedTag", "global_shared": True},
|
||||||
|
{"value": "InternalTag", "global_shared": False},
|
||||||
|
{"value": "ImplicitPrivateTag"},
|
||||||
|
]
|
||||||
|
predefined_paths = [
|
||||||
|
{"value": "Shared/Path", "global_shared": True},
|
||||||
|
{"value": "Internal/Path", "global_shared": False},
|
||||||
|
{"value": "Implicit/Private"},
|
||||||
|
]
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(documents_routes_module, "read_predefined_tags_settings", return_value=predefined_tags),
|
||||||
|
patch.object(documents_routes_module, "read_predefined_paths_settings", return_value=predefined_paths),
|
||||||
|
):
|
||||||
|
tags_response = documents_routes_module.list_tags(
|
||||||
|
include_trashed=False,
|
||||||
|
auth_context=self._auth_context(UserRole.USER),
|
||||||
|
session=session,
|
||||||
|
)
|
||||||
|
paths_response = documents_routes_module.list_paths(
|
||||||
|
include_trashed=False,
|
||||||
|
auth_context=self._auth_context(UserRole.USER),
|
||||||
|
session=self._SessionStub(values=["Owner/Path"]),
|
||||||
|
)
|
||||||
|
|
||||||
|
self.assertEqual(tags_response["tags"], ["SharedTag", "owner-only-tag", "owner-tag"])
|
||||||
|
self.assertEqual(paths_response["paths"], ["Owner/Path", "Shared/Path"])
|
||||||
|
|
||||||
|
def test_admin_receives_full_predefined_tags_and_paths_catalog(self) -> None:
|
||||||
|
"""Admin role receives full predefined values regardless of global-sharing scope."""
|
||||||
|
|
||||||
|
predefined_tags = [
|
||||||
|
{"value": "SharedTag", "global_shared": True},
|
||||||
|
{"value": "InternalTag", "global_shared": False},
|
||||||
|
{"value": "ImplicitPrivateTag"},
|
||||||
|
]
|
||||||
|
predefined_paths = [
|
||||||
|
{"value": "Shared/Path", "global_shared": True},
|
||||||
|
{"value": "Internal/Path", "global_shared": False},
|
||||||
|
{"value": "Implicit/Private"},
|
||||||
|
]
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(documents_routes_module, "read_predefined_tags_settings", return_value=predefined_tags),
|
||||||
|
patch.object(documents_routes_module, "read_predefined_paths_settings", return_value=predefined_paths),
|
||||||
|
):
|
||||||
|
tags_response = documents_routes_module.list_tags(
|
||||||
|
include_trashed=False,
|
||||||
|
auth_context=self._auth_context(UserRole.ADMIN),
|
||||||
|
session=self._SessionStub(values=[["admin-tag"]]),
|
||||||
|
)
|
||||||
|
paths_response = documents_routes_module.list_paths(
|
||||||
|
include_trashed=False,
|
||||||
|
auth_context=self._auth_context(UserRole.ADMIN),
|
||||||
|
session=self._SessionStub(values=["Admin/Path"]),
|
||||||
|
)
|
||||||
|
|
||||||
|
self.assertEqual(
|
||||||
|
tags_response["tags"],
|
||||||
|
["ImplicitPrivateTag", "InternalTag", "SharedTag", "admin-tag"],
|
||||||
|
)
|
||||||
|
self.assertEqual(
|
||||||
|
paths_response["paths"],
|
||||||
|
["Admin/Path", "Implicit/Private", "Internal/Path", "Shared/Path"],
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class _FakeRedisPipeline:
|
class _FakeRedisPipeline:
|
||||||
"""Provides deterministic Redis pipeline behavior for login throttle tests."""
|
"""Provides deterministic Redis pipeline behavior for login throttle tests."""
|
||||||
|
|
||||||
|
|||||||
@@ -61,9 +61,15 @@ Ownership rules:
|
|||||||
- `GET /documents/tags`
|
- `GET /documents/tags`
|
||||||
- Query: `include_trashed`
|
- Query: `include_trashed`
|
||||||
- Response: `{ "tags": string[] }`
|
- Response: `{ "tags": string[] }`
|
||||||
|
- Behavior:
|
||||||
|
- all document-assigned tags visible to caller scope are included
|
||||||
|
- predefined tags are role-filtered: `admin` receives full catalog, `user` receives only entries with `global_shared=true`
|
||||||
- `GET /documents/paths`
|
- `GET /documents/paths`
|
||||||
- Query: `include_trashed`
|
- Query: `include_trashed`
|
||||||
- Response: `{ "paths": string[] }`
|
- Response: `{ "paths": string[] }`
|
||||||
|
- Behavior:
|
||||||
|
- all document-assigned logical paths visible to caller scope are included
|
||||||
|
- predefined paths are role-filtered: `admin` receives full catalog, `user` receives only entries with `global_shared=true`
|
||||||
- `GET /documents/types`
|
- `GET /documents/types`
|
||||||
- Query: `include_trashed`
|
- Query: `include_trashed`
|
||||||
- Response: `{ "types": string[] }`
|
- Response: `{ "types": string[] }`
|
||||||
|
|||||||
Reference in New Issue
Block a user