Initial commit

2026-05-16 12:05:36 -03:00
parent 0ce972a361
commit e82cee97a7
65 changed files with 9051 additions and 5 deletions
@@ -0,0 +1,29 @@
.git
.gitignore
.dockerignore
__pycache__/
*.py[cod]
.pytest_cache/
.mypy_cache/
.ruff_cache/
.coverage
.coverage.*
htmlcov/
.venv/
.venv*/
venv/
env/
ENV/
.env
.env.*
!.env.example
config/config.yml
config/*.local.yml
data/
logs/
docker-compose.override.yml
@@ -0,0 +1,16 @@
TUKUTOI_IMAP_USER=REPLACE_WITH_IMAP_USERNAME
TUKUTOI_IMAP_PASSWORD=REPLACE_WITH_IMAP_PASSWORD
DASHBOARD_USERNAME=beda
DASHBOARD_PASSWORD=REPLACE_WITH_DASHBOARD_PASSWORD
HOMEPAGE_API_TOKEN=REPLACE_WITH_HOMEPAGE_API_TOKEN
OPENAI_API_KEY=REPLACE_WITH_OPENAI_API_KEY
# Only needed if alerts.email.enabled is true in config/config.yml.
ALERT_SMTP_HOST=
ALERT_SMTP_PORT=587
ALERT_SMTP_USER=
ALERT_SMTP_PASSWORD=
ALERT_EMAIL_FROM=
ALERT_EMAIL_TO=
@@ -0,0 +1,44 @@
# Python bytecode and caches
__pycache__/
*.py[cod]
*$py.class
.pytest_cache/
.mypy_cache/
.ruff_cache/
.coverage
.coverage.*
htmlcov/
# Virtual environments
.venv/
.venv*/
venv/
env/
ENV/
# Packaging and build output
build/
dist/
*.egg-info/
pip-wheel-metadata/
# Local environment and secrets
.env
.env.*
!.env.example
# Runtime state mounted by Docker Compose
data/*
!data/.gitkeep
logs/*
!logs/.gitkeep
# Editor, OS, and tool metadata
.DS_Store
.idea/
.vscode/
*.swp
*.swo
# Docker/local deploy scratch files
docker-compose.override.yml
@@ -0,0 +1,24 @@
FROM python:3.12-slim
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
WORKDIR /app
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential curl \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app ./app
RUN mkdir -p /app/config /app/data /app/logs
RUN groupadd --gid 1000 app \
    && useradd --uid 1000 --gid app --create-home --shell /usr/sbin/nologin app \
    && chown -R app:app /app
USER app
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
@@ -0,0 +1,210 @@
# DMARC Sentinel
DMARC Sentinel is a self-hosted Docker application for monitoring DMARC aggregate reports. It reads report emails from configured IMAP folders, extracts XML reports, stores normalized telemetry in SQLite, runs deterministic security and deliverability analysis, generates alerts, and uses an LLM to turn already-derived facts into human-readable summaries and recommendations.
Detection is deterministic. The LLM is mandatory for explanations, daily summaries, weekly summaries, and dashboard text, but it is never the source of truth for whether an alert exists or how severe it is.
## What It Does
- Connects to one or more IMAP inboxes and reads only the configured DMARC folder by default.
- Supports `.xml`, `.xml.gz`, `.gz`, and `.zip` report attachments.
- Parses DMARC aggregate XML into reports, records, and authentication results.
- Deduplicates reports by raw XML SHA256.
- Classifies known senders using CIDR allowlists, DKIM auth domains, SPF auth domains, and aligned DKIM evidence.
- Creates deterministic alerts for unknown failures, known sender failures, quarantine/reject dispositions, new sources, high failure rates, missing reporters, and first-seen policies/reporters.
- Uses OpenAI to produce alert explanations and daily/weekly operational summaries from sanitized JSON facts.
- Exposes a server-rendered FastAPI dashboard and gethomepage Custom API widget endpoints.
- Sends email alerts and digests when SMTP settings are configured.
## First Run
```bash
cp .env.example .env
cp config/config.example.yml config/config.yml
nano .env
nano config/config.yml
docker compose up -d --build
```
Then process the existing mailbox backlog:
```bash
docker compose exec dmarc-sentinel python -m app.cli backlog --inbox tukutoi --folder DMARC
```
Visit the app through your Nginx Proxy Manager (NPM) reverse proxy:
```text
https://sentinel.tukutoi.com
```
## mailcow / SOGo Expectation
No installation is required on the mailcow server. The app logs into the existing mailbox over IMAP.
Expected mailbox setup:
```yaml
mail_account: hello@tukutoi.com
catch_all: true
reports_recipient: dmarcreports@tukutoi.com
imap_folder: DMARC
primary_domain: tukutoi.com
```
Move or filter DMARC aggregate report messages into the `DMARC` folder in SOGo/mailcow. DMARC Sentinel reads that folder unless a CLI/API backlog option overrides it.
## Docker Compose
The compose file binds the dashboard to `127.0.0.1:8000` for local human-in-the-loop (HITL) testing and also attaches the container to the external NPM proxy network with a static container IP:
```yaml
networks:
  npm_proxy:
    external: true
```
NPM should proxy to:
```text
http://192.168.99.18:8000
```
The app listens inside the container on `0.0.0.0:8000`; on a local Docker host it is also available at:
```text
http://127.0.0.1:8000
```
## Configuration
Runtime config is read from `config/config.yml`; use `config/config.example.yml` as the template.
Important sections:
- `app`: base URL, timezone, poll interval, database URL, log level, attachment limits.
- `security`: Basic Auth and homepage bearer-token settings.
- `llm`: OpenAI model, retry, timeout, and data-sending controls. Raw XML and raw email are disabled by default.
- `inboxes`: IMAP host, folder, recipient, processed/failed folder behavior, and env var names for credentials.
- `known_senders`: deterministic sender classification rules per domain.
- `alerts`: SMTP env var names and deterministic thresholds.
Secrets live in `.env`, not in `config.yml`.
`OPENAI_API_KEY` is required when `llm.provider` is `openai`. The app only bypasses this when `DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS=true`, which is intended for tests.
Adding another inbox or monitored domain should only require adding entries to `inboxes` and `known_senders`.
## Backlog Command
```bash
docker compose exec dmarc-sentinel python -m app.cli backlog --inbox tukutoi
```
Options:
```text
--folder DMARC
--since YYYY-MM-DD
--before YYYY-MM-DD
--limit 500
--dry-run
--reprocess
--mark-seen
```
Backlog mode scans all matching messages and skips XML hashes that were already imported unless `--reprocess` is passed. It does not modify messages unless the inbox is configured to do so or a flag such as `--mark-seen` is passed explicitly.
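The skip-unless-`--reprocess` behavior can be pictured with a small sketch. It assumes the set of imported hashes is held in memory; the `should_import` helper and the `seen_hashes` set are hypothetical names, and the application itself keeps this state in SQLite.

```python
import hashlib


def should_import(raw_xml: bytes, seen_hashes: set[str], reprocess: bool = False) -> bool:
    """Decide whether a report should be imported, based on the SHA256 of its raw XML."""
    digest = hashlib.sha256(raw_xml).hexdigest()
    if digest in seen_hashes and not reprocess:
        # Same bytes were imported before and reprocessing was not requested.
        return False
    seen_hashes.add(digest)
    return True
```

Hashing the raw bytes means two reports count as duplicates only when their XML is byte-for-byte identical.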
## gethomepage Widget
Endpoint:
```http
GET /api/homepage
Authorization: Bearer YOUR_HOMEPAGE_API_TOKEN
```
Example gethomepage config:
```yaml
- Monitoring:
    - DMARC Sentinel:
        href: https://sentinel.tukutoi.com
        description: DMARC monitoring
        widget:
          type: customapi
          url: https://sentinel.tukutoi.com/api/homepage
          headers:
            Authorization: Bearer YOUR_HOMEPAGE_API_TOKEN
          mappings:
            - field: status
              label: Status
            - field: dmarc_pass_rate
              label: Pass
            - field: critical_alerts
              label: Critical
            - field: warnings
              label: Warnings
            - field: summary
              label: Summary
```
Domain-specific endpoint:
```http
GET /api/homepage/tukutoi.com
```
## Dashboard
The dashboard is protected with Basic Auth using `DASHBOARD_USERNAME` and `DASHBOARD_PASSWORD`. It includes:
- `/` overview with monitored domains, daily volume, pass rate, alerts, unknown sources, last check, and latest LLM daily summary.
- `/domains/{domain}` with trends, top sources, known vs unknown sender data, reports, alerts, and LLM summary.
- `/reports/{report_id}` with normalized report evidence and record table.
- `/alerts` with acknowledge, resolve, and reopen actions.
- `/inboxes` with status and manual process/backlog buttons.
## HTTP API
Implemented endpoints:
```text
GET /health
GET /api/homepage
GET /api/homepage/{domain}
GET /api/domains
GET /api/domains/{domain}/summary
GET /api/domains/{domain}/reports
GET /api/domains/{domain}/sources
GET /api/reports/{id}
GET /api/alerts
POST /api/alerts/{id}/ack
POST /api/alerts/{id}/resolve
POST /api/alerts/{id}/reopen
POST /api/admin/process-now
POST /api/admin/backlog
```
`/health` is public. Homepage API routes use bearer auth when enabled. Dashboard and admin/API management routes use Basic Auth.
## Troubleshooting
- `OPENAI_API_KEY is required`: set it in `.env`. Do not use the test bypass in production.
- IMAP folder errors: confirm the folder is exactly named `DMARC` and exists for `hello@tukutoi.com`.
- No reports imported: check that messages contain valid DMARC aggregate XML attachments and that they are in the configured folder.
- Duplicate reports skipped: the raw XML SHA256 has already been imported.
- NPM cannot connect: confirm the `npm_proxy` Docker network exists and the container has `192.168.99.18`.
- Dashboard login fails: confirm `DASHBOARD_USERNAME` and `DASHBOARD_PASSWORD` in `.env`.
- Homepage widget returns 401: confirm `HOMEPAGE_API_TOKEN` and the `Authorization: Bearer ...` header.
- Email alerts do not send: verify SMTP host, port, username, password, sender, and recipient env vars.
## Development
Run tests:
```bash
DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS=true pytest
```
The XML parser uses `defusedxml`, ZIP extraction rejects path traversal and nested archives, and decompressed attachment sizes are capped by config.
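The decompressed-size cap can be illustrated with a short sketch. It shows the general technique of reading at most one byte more than the limit, so that an oversized payload is detected without being fully inflated; `gunzip_capped` is a hypothetical helper name, not a function from this repository.

```python
import gzip
import io


def gunzip_capped(payload: bytes, max_bytes: int) -> bytes:
    """Decompress gzip data, refusing output larger than max_bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as stream:
        # Reading max_bytes + 1 is enough to tell whether the cap was exceeded.
        data = stream.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError(f"decompressed payload exceeds {max_bytes} bytes")
    return data
```

Because the read is bounded, a compression bomb costs at most `max_bytes + 1` bytes of memory before it is rejected.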
@@ -0,0 +1 @@
__version__ = "0.1.0"
@@ -0,0 +1,91 @@
from __future__ import annotations
import json
import logging
import os
import smtplib
from email.message import EmailMessage
from app.config import Settings
from app.models import Alert
logger = logging.getLogger(__name__)


def send_alert_email(settings: Settings, alert: Alert, severity_increased: bool = False) -> bool:
    """Email a warning or critical alert; return True only when delivery succeeds."""
    if alert.severity not in {"warning", "critical"}:
        return False
    if not settings.alerts.email.enabled:
        return False
    host = os.getenv(settings.alerts.email.smtp_host_env)
    to_addr = os.getenv(settings.alerts.email.to_env)
    from_addr = os.getenv(settings.alerts.email.from_env)
    if not host or not to_addr or not from_addr:
        logger.warning("Email alert not sent because SMTP environment is incomplete")
        return False

    msg = EmailMessage()
    msg["Subject"] = f"[DMARC Sentinel] {alert.severity.upper()} {alert.title}"
    msg["From"] = from_addr
    msg["To"] = to_addr
    # Tolerate malformed stored JSON so a bad details payload cannot block delivery.
    try:
        parsed_details = json.loads(alert.details_json or "{}")
    except json.JSONDecodeError:
        parsed_details = {}
    details = json.dumps(parsed_details, indent=2, sort_keys=True)
    msg.set_content(
        "\n".join(
            [
                f"Severity: {alert.severity}",
                f"Domain: {alert.domain}",
                f"Alert type: {alert.type}",
                f"Title: {alert.title}",
                "",
                "Deterministic facts:",
                details,
                "",
                f"LLM summary: {alert.llm_summary or alert.summary}",
                f"LLM risk: {alert.llm_risk or 'Unavailable'}",
                f"LLM recommended action: {alert.llm_recommended_action or 'Review the deterministic facts.'}",
                "",
                f"Dashboard: {settings.app.base_url}/alerts",
                f"Severity increased: {severity_increased}",
            ]
        )
    )

    user = os.getenv(settings.alerts.email.smtp_user_env)
    password = os.getenv(settings.alerts.email.smtp_password_env)
    try:
        # An empty port variable falls back to 587 instead of raising on int("").
        port = int(os.getenv(settings.alerts.email.smtp_port_env) or "587")
        with smtplib.SMTP(host, port, timeout=30) as smtp:
            smtp.starttls()
            if user and password:
                smtp.login(user, password)
            smtp.send_message(msg)
        logger.info("Email alert delivered for %s", alert.fingerprint)
        return True
    except Exception as exc:
        logger.warning("Email alert delivery failed for %s: %s", alert.fingerprint, exc)
        return False
def send_digest_email(settings: Settings, subject: str, body: str) -> bool:
    """Email a digest; return True only when delivery succeeds."""
    if not settings.alerts.email.enabled:
        return False
    host = os.getenv(settings.alerts.email.smtp_host_env)
    to_addr = os.getenv(settings.alerts.email.to_env)
    from_addr = os.getenv(settings.alerts.email.from_env)
    if not host or not to_addr or not from_addr:
        logger.warning("Digest email not sent because SMTP environment is incomplete")
        return False

    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg.set_content(body)
    try:
        # An empty port variable falls back to 587 instead of raising on int("").
        port = int(os.getenv(settings.alerts.email.smtp_port_env) or "587")
        with smtplib.SMTP(host, port, timeout=30) as smtp:
            smtp.starttls()
            user = os.getenv(settings.alerts.email.smtp_user_env)
            password = os.getenv(settings.alerts.email.smtp_password_env)
            if user and password:
                smtp.login(user, password)
            smtp.send_message(msg)
        return True
    except Exception as exc:
        logger.warning("Digest email delivery failed: %s", exc)
        return False
@@ -0,0 +1,514 @@
from __future__ import annotations
import json
import logging
from datetime import datetime, timedelta, timezone
from typing import Any
from sqlalchemy import func, select
from sqlalchemy.orm import Session
from app.config import Settings
from app.llm import LLMClient
from app.models import Alert, Record, Report, utcnow
logger = logging.getLogger(__name__)
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


def _as_utc(value: datetime | str | None) -> datetime | None:
    """Parse ISO strings and normalize datetimes to timezone-aware UTC."""
    if value is None:
        return None
    if isinstance(value, str):
        try:
            value = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            return None
    if value.tzinfo is None:
        # Naive values are assumed to already be in UTC.
        return value.replace(tzinfo=timezone.utc)
    # Convert aware values so day boundaries are computed in UTC.
    return value.astimezone(timezone.utc)


def _fingerprint(domain: str, alert_type: str, key: str) -> str:
    return f"{domain}:{alert_type}:{key}"
def _merge_details(existing: str, incoming: dict[str, Any]) -> str:
    """Merge incoming alert details into the stored JSON, keeping cumulative evidence."""
    try:
        data = json.loads(existing or "{}")
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    existing_range = data.get("date_range") if isinstance(data.get("date_range"), dict) else {}
    incoming_range = incoming.get("date_range") if isinstance(incoming.get("date_range"), dict) else {}
    report_ids = list(data.get("report_db_ids") or [])
    if data.get("report_db_id") and data["report_db_id"] not in report_ids:
        report_ids.append(data["report_db_id"])
    if incoming.get("report_db_id") and incoming["report_db_id"] not in report_ids:
        report_ids.append(incoming["report_db_id"])
    # Capture the stored count before update() overwrites it; otherwise the max()
    # below would compare the incoming count with itself.
    previous_count = data.get("count")
    data.update(incoming)
    if previous_count is not None and "count" in incoming:
        data["count"] = max(int(previous_count), int(incoming["count"]))
    if existing_range or incoming_range:
        begins = [item for item in [existing_range.get("begin"), incoming_range.get("begin")] if item]
        ends = [item for item in [existing_range.get("end"), incoming_range.get("end")] if item]
        data["date_range"] = {
            "begin": min(begins) if begins else None,
            "end": max(ends) if ends else None,
        }
    if report_ids:
        # Keep only the 25 most recent linked reports.
        data["report_db_ids"] = report_ids[-25:]
        data["report_db_id"] = incoming.get("report_db_id") or data.get("report_db_id")
    return json.dumps(data, sort_keys=True, default=str)
def create_or_update_alert(
    session: Session,
    *,
    inbox_id: str,
    domain: str,
    severity: str,
    alert_type: str,
    key: str,
    title: str,
    summary: str,
    details: dict[str, Any],
) -> tuple[Alert, bool, bool]:
    """Return (alert, is_new, severity_increased) for the open alert with this fingerprint."""
    fp = _fingerprint(domain, alert_type, key)
    alert = session.scalar(select(Alert).where(Alert.fingerprint == fp, Alert.status == "open"))
    now = utcnow()
    if alert:
        previous = alert.severity
        # Severity only ever escalates on an open alert.
        if SEVERITY_RANK[severity] > SEVERITY_RANK[alert.severity]:
            alert.severity = severity
        alert.title = title
        alert.summary = summary
        alert.details_json = _merge_details(alert.details_json, details)
        alert.last_seen_at = now
        alert.updated_at = now
        return alert, False, SEVERITY_RANK[alert.severity] > SEVERITY_RANK[previous]
    alert = Alert(
        fingerprint=fp,
        inbox_id=inbox_id,
        domain=domain,
        severity=severity,
        type=alert_type,
        title=title,
        summary=summary,
        details_json=json.dumps(details, sort_keys=True, default=str),
        first_seen_at=now,
        last_seen_at=now,
    )
    session.add(alert)
    session.flush()
    return alert, True, False
def _record_details(record: Record, report: Report) -> dict[str, Any]:
    return {
        "source_ip": record.source_ip,
        "count": record.count,
        "spf_aligned": record.spf_aligned,
        "dkim_aligned": record.dkim_aligned,
        "dmarc_pass": record.dmarc_pass,
        "disposition": record.disposition,
        "known_sender": record.is_known_sender,
        "known_sender_id": record.known_sender_id,
        "reporting_orgs": [report.org_name] if report.org_name else [],
        "report_db_id": report.id,
        "report_id": report.report_id,
        "date_range": {
            "begin": report.date_begin.isoformat() if report.date_begin else None,
            "end": report.date_end.isoformat() if report.date_end else None,
        },
    }
def _new_authenticated_path_alert(session: Session, record: Record, report: Report, details: dict[str, Any]) -> tuple[Alert, bool, bool] | None:
    if not record.dmarc_pass or record.is_known_sender:
        return None
    if record.dkim_aligned and not record.spf_aligned:
        return create_or_update_alert(
            session=session,
            inbox_id=report.inbox_id,
            domain=report.domain,
            severity="info",
            alert_type="dkim_authenticated_relay",
            key=record.source_ip,
            title=f"DKIM-authenticated relay observed for {report.domain}",
            summary=(
                f"A receiver observed {record.source_ip} transmitting mail claiming to be from {report.domain}. "
                "SPF did not align for that observed hop, but DKIM aligned, so DMARC passed. "
                "This commonly represents forwarding or an intermediary mail gateway, not a sender to add to SPF."
            ),
            details=details,
        )
    if record.spf_aligned and record.dkim_aligned:
        return create_or_update_alert(
            session=session,
            inbox_id=report.inbox_id,
            domain=report.domain,
            severity="warning",
            alert_type="new_authenticated_source",
            key=record.source_ip,
            title=f"New authenticated source observed for {report.domain}",
            summary=(
                f"{record.source_ip} is newly observed and passed DMARC with both SPF and DKIM alignment. "
                "Confirm whether this is an expected direct sender path before classifying it."
            ),
            details=details,
        )
    if record.spf_aligned:
        return create_or_update_alert(
            session=session,
            inbox_id=report.inbox_id,
            domain=report.domain,
            severity="warning",
            alert_type="new_spf_authenticated_source",
            key=record.source_ip,
            title=f"New SPF-authenticated source observed for {report.domain}",
            summary=(
                f"{record.source_ip} is newly observed and passed DMARC through SPF alignment. "
                "Confirm whether this is an expected direct sender path before classifying it."
            ),
            details=details,
        )
    return None
def _report_time(report: Report) -> datetime:
    return _as_utc(report.date_end or report.date_begin or report.created_at) or utcnow()


def _report_day(report: Report) -> datetime:
    return _report_time(report).replace(hour=0, minute=0, second=0, microsecond=0)


def _report_evidence(report: Report, *, link_report: bool = True) -> dict[str, Any]:
    evidence = {
        "reporting_orgs": [report.org_name] if report.org_name else [],
        "date_range": {
            "begin": report.date_begin.isoformat() if report.date_begin else None,
            "end": report.date_end.isoformat() if report.date_end else None,
        },
    }
    if link_report:
        evidence["report_db_id"] = report.id
        evidence["report_id"] = report.report_id
    return evidence
def analyze_report(session: Session, settings: Settings, report: Report, llm: LLMClient | None = None) -> list[tuple[Alert, bool, bool]]:
    """Run the deterministic checks for one report, then attach LLM explanations if enabled."""
    created: list[tuple[Alert, bool, bool]] = []
    thresholds = settings.alerts.thresholds
    for record in report.records:
        details = _record_details(record, report)
        if not record.is_known_sender and not record.spf_aligned and not record.dkim_aligned and record.count >= thresholds.unknown_source_fail_count:
            created.append(
                create_or_update_alert(
                    session,
                    inbox_id=report.inbox_id,
                    domain=report.domain,
                    severity="critical",
                    alert_type="unknown_source_failed_both",
                    key=record.source_ip,
                    title=f"Unknown source failed SPF and DKIM for {report.domain}",
                    summary=f"{record.source_ip} sent {record.count} messages that failed SPF and DKIM alignment.",
                    details=details,
                )
            )
        if record.is_known_sender and not record.dmarc_pass and record.count >= thresholds.min_messages_for_rate_alert:
            created.append(
                create_or_update_alert(
                    session,
                    inbox_id=report.inbox_id,
                    domain=report.domain,
                    severity="critical",
                    alert_type="known_sender_dmarc_failure",
                    key=record.known_sender_id or record.source_ip,
                    title=f"Known sender failed DMARC for {report.domain}",
                    summary=f"{record.known_sender_name or record.source_ip} failed DMARC for {record.count} messages.",
                    details=details,
                )
            )
        if record.disposition in {"quarantine", "reject"} and record.count > 0:
            created.append(
                create_or_update_alert(
                    session,
                    inbox_id=report.inbox_id,
                    domain=report.domain,
                    severity="critical",
                    alert_type="quarantine_or_reject_seen",
                    key=f"{record.disposition}:{record.source_ip}",
                    title=f"{record.disposition.title()} disposition seen for {report.domain}",
                    summary=f"Receiver applied {record.disposition} to {record.count} messages.",
                    details=details,
                )
            )
        # Count earlier records from the same IP to decide whether this source is new.
        existing_source = session.scalar(
            select(func.count(Record.id))
            .join(Report)
            .where(Report.domain == report.domain, Record.source_ip == record.source_ip, Record.id != record.id)
        )
        if not existing_source and not record.dmarc_pass:
            created.append(
                create_or_update_alert(
                    session,
                    inbox_id=report.inbox_id,
                    domain=report.domain,
                    severity="warning",
                    alert_type="new_unknown_source",
                    key=record.source_ip,
                    title=f"New unknown failing source for {report.domain}",
                    summary=f"{record.source_ip} is newly observed and failed DMARC.",
                    details=details,
                )
            )
        if not existing_source and record.dmarc_pass and not record.is_known_sender:
            alert = _new_authenticated_path_alert(session, record, report, details)
            if alert:
                created.append(alert)
    created.extend(_rate_alerts(session, settings, report))
    created.extend(_reporter_alerts(session, settings, report))
    if llm and settings.llm.generate_alert_explanations:
        for alert, is_new, severity_increased in created:
            if (is_new or severity_increased) and alert.severity in {"warning", "critical"}:
                explanation = llm.explain_alert(alert)
                alert.llm_summary = explanation.summary
                alert.llm_risk = explanation.risk
                alert.llm_recommended_action = explanation.recommended_action
    return created
def _rate_alerts(session: Session, settings: Settings, report: Report) -> list[tuple[Alert, bool, bool]]:
    thresholds = settings.alerts.thresholds
    period_start = _as_utc(report.date_begin) or _report_time(report).replace(hour=0, minute=0, second=0, microsecond=0)
    period_end = _as_utc(report.date_end) or (period_start + timedelta(days=1))
    current_rows = session.execute(
        select(Record, Report)
        .join(Report)
        .where(
            Report.domain == report.domain,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) >= period_start,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) <= period_end,
        )
    ).all()
    current_records = [row for row, _ in current_rows]
    total = sum(row.count for row in current_records)
    if total < thresholds.min_messages_for_rate_alert:
        # Too little traffic for rate checks; only the repeated-failure check applies.
        return _repeated_failure_alerts(session, settings, report, current_records)

    alerts: list[tuple[Alert, bool, bool]] = []
    evidence = _report_evidence(report, link_report=False)
    unknown_fail = sum(row.count for row in current_records if not row.is_known_sender and not row.dmarc_pass)
    unknown_fail_rate = unknown_fail / total * 100 if total else 0
    if unknown_fail >= thresholds.unknown_source_fail_count and unknown_fail_rate >= thresholds.unknown_source_fail_rate_percent:
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity="critical",
                alert_type="high_unknown_source_failure_rate",
                key="global",
                title=f"High unknown source failure rate for {report.domain}",
                summary=f"Unknown sources failed DMARC for {unknown_fail} of {total} messages ({unknown_fail_rate:.1f}%).",
                details={**evidence, "failed_messages": unknown_fail, "total_messages": total, "failure_rate_percent": unknown_fail_rate},
            )
        )

    known_total = sum(row.count for row in current_records if row.is_known_sender)
    known_fail = sum(row.count for row in current_records if row.is_known_sender and not row.dmarc_pass)
    known_fail_rate = known_fail / known_total * 100 if known_total else 0
    if known_total >= thresholds.min_messages_for_rate_alert and known_fail_rate >= thresholds.known_source_fail_rate_percent:
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity="critical",
                alert_type="high_known_source_failure_rate",
                key="global",
                title=f"High known sender failure rate for {report.domain}",
                summary=f"Known senders failed DMARC for {known_fail} of {known_total} messages ({known_fail_rate:.1f}%).",
                details={
                    **evidence,
                    "failed_messages": known_fail,
                    "known_sender_messages": known_total,
                    "failure_rate_percent": known_fail_rate,
                },
            )
        )

    # Compare the most recent 24 hours with the 7 days that precede them.
    report_time = _report_time(report)
    recent_start = report_time - timedelta(days=1)
    trailing_start = report_time - timedelta(days=8)
    trend_rows = session.execute(
        select(Record, Report)
        .join(Report)
        .where(
            Report.domain == report.domain,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) >= trailing_start,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) <= report_time,
        )
    ).all()
    current_unknown = sum(
        row.count
        for row, row_report in trend_rows
        if not row.is_known_sender and not row.dmarc_pass and _report_time(row_report) >= recent_start
    )
    trailing = sum(
        row.count
        for row, row_report in trend_rows
        if not row.is_known_sender
        and not row.dmarc_pass
        and trailing_start <= _report_time(row_report) < recent_start
    )
    trailing_avg = trailing / 7 if trailing else 0
    if trailing_avg and current_unknown > thresholds.total_volume_spike_multiplier * trailing_avg and current_unknown >= thresholds.unknown_source_fail_count:
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity="critical",
                alert_type="sudden_unknown_failure_spike",
                key="global",
                title=f"Unknown failure spike for {report.domain}",
                summary=f"Unknown failed volume is {current_unknown}, above the trailing 7-day average of {trailing_avg:.1f}.",
                details={**evidence, "current_24h": current_unknown, "trailing_7d_avg": trailing_avg},
            )
        )

    trailing_volume = sum(row.count for row, row_report in trend_rows if trailing_start <= _report_time(row_report) < recent_start)
    trailing_volume_avg = trailing_volume / 7 if trailing_volume else 0
    drop_threshold = max(0, 1 - thresholds.total_volume_drop_percent / 100)
    if trailing_volume_avg and total <= trailing_volume_avg * drop_threshold:
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity="warning",
                alert_type="total_volume_drop",
                key="global",
                title=f"DMARC report volume dropped for {report.domain}",
                summary=f"Current report volume is {total}, below the trailing 7-day average of {trailing_volume_avg:.1f}.",
                details={**evidence, "current_messages": total, "trailing_7d_avg": trailing_volume_avg},
            )
        )

    alerts.extend(_repeated_failure_alerts(session, settings, report, current_records))
    return alerts
def _repeated_failure_alerts(
    session: Session,
    settings: Settings,
    report: Report,
    current_records: list[Record],
) -> list[tuple[Alert, bool, bool]]:
    thresholds = settings.alerts.thresholds
    days = max(1, thresholds.repeated_failure_days)
    if days <= 1:
        return []
    report_day = _report_day(report)
    start = report_day - timedelta(days=days - 1)
    end = report_day + timedelta(days=1)
    alerts: list[tuple[Alert, bool, bool]] = []
    sources = {row.source_ip: row for row in current_records if not row.dmarc_pass}
    for source_ip, current_record in sources.items():
        rows = session.execute(
            select(Record, Report)
            .join(Report)
            .where(
                Report.domain == report.domain,
                Record.source_ip == source_ip,
                Record.dmarc_pass.is_(False),
                func.coalesce(Report.date_end, Report.date_begin, Report.created_at) >= start,
                func.coalesce(Report.date_end, Report.date_begin, Report.created_at) < end,
            )
        ).all()
        # Alert only when the source failed on every day of the window.
        failure_days = sorted({_report_day(row_report).date().isoformat() for _, row_report in rows})
        if len(failure_days) < days:
            continue
        failed_messages = sum(row.count for row, _ in rows)
        severity = "critical" if current_record.is_known_sender else "warning"
        sender_label = current_record.known_sender_name or source_ip
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity=severity,
                alert_type="repeated_dmarc_failure",
                key=source_ip,
                title=f"Repeated DMARC failure for {sender_label}",
                summary=f"{sender_label} failed DMARC on {len(failure_days)} report days in the last {days} days.",
                details={
                    **_record_details(current_record, report),
                    "failure_days": failure_days,
                    "window_days": days,
                    "failed_messages": failed_messages,
                },
            )
        )
    return alerts
def _reporter_alerts(session: Session, settings: Settings, report: Report) -> list[tuple[Alert, bool, bool]]:
    alerts: list[tuple[Alert, bool, bool]] = []
    if report.org_name:
        existing = session.scalar(
            select(func.count(Report.id)).where(Report.domain == report.domain, Report.org_name == report.org_name, Report.id != report.id)
        )
        if not existing:
            alerts.append(
                create_or_update_alert(
                    session,
                    inbox_id=report.inbox_id,
                    domain=report.domain,
                    severity="info",
                    alert_type="new_reporter",
                    key=report.org_name,
                    title=f"New DMARC reporter for {report.domain}",
                    summary=f"{report.org_name} sent its first observed aggregate report.",
                    details={**_report_evidence(report), "reporter": report.org_name},
                )
            )

    # Despite its name, this holds the count of other reports for the domain; zero means first report.
    first_domain_report = session.scalar(select(func.count(Report.id)).where(Report.domain == report.domain, Report.id != report.id))
    if not first_domain_report:
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity="info",
                alert_type="policy_seen",
                key="policy",
                title=f"DMARC policy seen for {report.domain}",
                summary=f"Published policy p={report.policy_p}, sp={report.policy_sp}, pct={report.policy_pct}.",
                details={**_report_evidence(report), "policy_p": report.policy_p, "policy_sp": report.policy_sp, "policy_pct": report.policy_pct},
            )
        )

    missing_after = max(1, settings.alerts.thresholds.missing_reporter_days)
    cutoff = _report_time(report) - timedelta(days=missing_after)
    reporter_rows = session.execute(
        select(Report.org_name, func.max(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)))
        .where(Report.domain == report.domain, Report.org_name.is_not(None))
        .group_by(Report.org_name)
    ).all()
    for org_name, last_seen in reporter_rows:
        last_seen_at = _as_utc(last_seen)
        if not org_name or not last_seen_at or last_seen_at >= cutoff:
            continue
        missed_days = (_report_time(report) - last_seen_at).days
        alerts.append(
            create_or_update_alert(
                session,
                inbox_id=report.inbox_id,
                domain=report.domain,
                severity="warning",
                alert_type="missing_reporter",
                key=org_name,
                title=f"DMARC reporter missing for {report.domain}",
                summary=f"{org_name} has not sent a DMARC aggregate report for {missed_days} days.",
                details={**_report_evidence(report, link_report=False), "reporter": org_name, "last_seen_at": last_seen_at.isoformat()},
            )
        )
    return alerts
@@ -0,0 +1,162 @@
from __future__ import annotations
import gzip
import hashlib
import io
import zipfile
from dataclasses import dataclass
from email.message import Message
from pathlib import PurePosixPath
class AttachmentExtractionError(Exception):
    pass


@dataclass(frozen=True)
class ExtractedReport:
    filename: str
    payload: bytes
    sha256: str


ARCHIVE_SUFFIXES = (".zip", ".gz")
XML_MIME_HINTS = {"text/xml", "application/xml", "application/dmarc+xml"}
GZIP_MIME_HINTS = {"application/gzip", "application/x-gzip"}
ZIP_MIME_HINTS = {"application/zip", "application/x-zip-compressed"}


def _max_bytes(max_mb: int) -> int:
    return max_mb * 1024 * 1024


def _sha(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def _ensure_size(payload: bytes, max_mb: int, filename: str) -> None:
    if len(payload) > _max_bytes(max_mb):
        raise AttachmentExtractionError(f"{filename} exceeds decompressed limit of {max_mb} MB")


def _ensure_compressed_size(payload: bytes, max_mb: int, filename: str) -> None:
    if len(payload) > _max_bytes(max_mb):
        raise AttachmentExtractionError(f"{filename} exceeds compressed limit of {max_mb} MB")


def _ensure_ratio(compressed_size: int, decompressed_size: int, max_ratio: int, filename: str) -> None:
    if compressed_size <= 0:
        return
    ratio = decompressed_size / compressed_size
    if ratio > max_ratio:
        raise AttachmentExtractionError(f"{filename} exceeds compression ratio limit of {max_ratio}:1")


def _safe_zip_name(name: str) -> bool:
    # Reject absolute paths and any parent-directory component.
    path = PurePosixPath(name)
    return not path.is_absolute() and ".." not in path.parts
def _extract_zip(filename: str, payload: bytes, max_mb: int, max_reports: int, max_ratio: int) -> list[ExtractedReport]:
    reports: list[ExtractedReport] = []
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if not _safe_zip_name(info.filename):
                raise AttachmentExtractionError(f"{filename} contains unsafe zip path {info.filename}")
            lower = info.filename.lower()
            if lower.endswith(ARCHIVE_SUFFIXES):
                raise AttachmentExtractionError(f"{filename} contains nested archive {info.filename}")
            if not lower.endswith(".xml"):
                continue
            if len(reports) >= max_reports:
                raise AttachmentExtractionError(f"{filename} exceeds archive XML report limit of {max_reports}")
            with archive.open(info) as handle:
                # Read one byte past the cap so oversized members are detected without full inflation.
                xml = handle.read(_max_bytes(max_mb) + 1)
            _ensure_size(xml, max_mb, info.filename)
            _ensure_ratio(info.compress_size, len(xml), max_ratio, info.filename)
            reports.append(ExtractedReport(info.filename, xml, _sha(xml)))
    return reports


def _extract_gzip(filename: str, payload: bytes, max_mb: int, max_ratio: int) -> list[ExtractedReport]:
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as gz:
        xml = gz.read(_max_bytes(max_mb) + 1)
    _ensure_size(xml, max_mb, filename)
    _ensure_ratio(len(payload), len(xml), max_ratio, filename)
    out_name = filename[:-3] if filename.lower().endswith(".gz") else f"{filename}.xml"
    return [ExtractedReport(out_name, xml, _sha(xml))]
def extract_payload(
    filename: str,
    content_type: str | None,
    payload: bytes,
    max_mb: int,
    *,
    max_compressed_mb: int = 10,
    max_reports_per_archive: int = 20,
    max_compression_ratio: int = 100,
) -> list[ExtractedReport]:
    _ensure_compressed_size(payload, max_compressed_mb, filename)
    lower = filename.lower()
    mime = (content_type or "").lower()
    # Dispatch on the file extension first, then fall back to the MIME type.
    if lower.endswith(".zip") or mime in ZIP_MIME_HINTS:
        return _extract_zip(filename, payload, max_mb, max_reports_per_archive, max_compression_ratio)
    if lower.endswith(".gz") or mime in GZIP_MIME_HINTS:
        return _extract_gzip(filename, payload, max_mb, max_compression_ratio)
    if lower.endswith(".xml") or mime in XML_MIME_HINTS:
        _ensure_size(payload, max_mb, filename)
        return [ExtractedReport(filename, payload, _sha(payload))]
    return []


def message_has_candidate_attachment(message: Message) -> bool:
    for part in message.walk():
        filename = part.get_filename() or ""
        content_type = (part.get_content_type() or "").lower()
        lower = filename.lower()
        if lower.endswith((".xml", ".xml.gz", ".gz", ".zip")):
            return True
        if content_type in XML_MIME_HINTS | GZIP_MIME_HINTS | ZIP_MIME_HINTS:
            return True
    return False
def extract_dmarc_attachments(
    message: Message,
    max_mb: int,
    *,
    max_compressed_mb: int = 10,
    max_attachments: int = 20,
    max_reports_per_message: int = 20,
    max_reports_per_archive: int = 20,
    max_compression_ratio: int = 100,
) -> list[ExtractedReport]:
    reports: list[ExtractedReport] = []
    attachment_count = 0
    for part in message.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename() or "attachment"
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        # Every non-empty leaf part counts toward the attachment limit.
        attachment_count += 1
        if attachment_count > max_attachments:
            raise AttachmentExtractionError(f"message exceeds attachment limit of {max_attachments}")
        reports.extend(
            extract_payload(
                filename,
                part.get_content_type(),
                payload,
                max_mb,
                max_compressed_mb=max_compressed_mb,
                max_reports_per_archive=max_reports_per_archive,
                max_compression_ratio=max_compression_ratio,
            )
        )
        if len(reports) > max_reports_per_message:
            raise AttachmentExtractionError(f"message exceeds extracted report limit of {max_reports_per_message}")
    return reports
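The gzip path above reads at most one byte past the decompressed limit and then checks the compression ratio. A minimal standalone sketch of that guard follows; `TooLarge` and `bounded_gunzip` are hypothetical names used only for illustration and are not part of the module.

```python
import gzip
import io


class TooLarge(Exception):
    """Hypothetical stand-in for AttachmentExtractionError."""


def bounded_gunzip(payload: bytes, max_bytes: int, max_ratio: int) -> bytes:
    # Ask for one byte more than allowed: getting it back proves the stream
    # is too large, without decompressing the rest.
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as gz:
        data = gz.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise TooLarge(f"decompressed size exceeds {max_bytes} bytes")
    # The ratio check catches highly compressible input that still fits the size cap.
    if payload and len(data) / len(payload) > max_ratio:
        raise TooLarge(f"compression ratio exceeds {max_ratio}:1")
    return data


small = gzip.compress(b"<feedback/>")
bomb = gzip.compress(b"\0" * 1_000_000)  # about 1 MB of zeros, compresses to roughly 1 KB
```

With these inputs, `bounded_gunzip(small, 1024, 100)` returns the original bytes, while `bomb` is rejected either by the size cap or by the ratio cap, depending on which limit is tighter.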
@@ -0,0 +1,70 @@
from __future__ import annotations
import os
import secrets
from urllib.parse import urlparse
from fastapi import Depends, HTTPException, Request, status
from fastapi.security import HTTPBasic, HTTPBasicCredentials, HTTPBearer, HTTPAuthorizationCredentials
from app.config import Settings, get_settings
basic = HTTPBasic(auto_error=False)
bearer = HTTPBearer(auto_error=False)
def require_dashboard_auth(
    credentials: HTTPBasicCredentials | None = Depends(basic),
    settings: Settings = Depends(get_settings),
) -> None:
    if not settings.security.dashboard_auth_enabled:
        return
    username = os.getenv(settings.security.dashboard_username_env)
    password = os.getenv(settings.security.dashboard_password_env)
    if not username or not password:
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Dashboard authentication is enabled but credentials are not configured.",
        )
    # compare_digest raises TypeError for non-ASCII str values, so compare UTF-8 bytes.
    valid = (
        credentials is not None
        and secrets.compare_digest(credentials.username.encode("utf-8"), username.encode("utf-8"))
        and secrets.compare_digest(credentials.password.encode("utf-8"), password.encode("utf-8"))
    )
    if not valid:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Authentication required",
            headers={"WWW-Authenticate": "Basic"},
        )


def require_homepage_token(
    credentials: HTTPAuthorizationCredentials | None = Depends(bearer),
    settings: Settings = Depends(get_settings),
) -> None:
    if not settings.security.api_token_required:
        return
    expected = os.getenv(settings.security.homepage_token_env, "")
    # Compare as bytes: a non-ASCII token in the header would otherwise raise TypeError.
    if (
        not credentials
        or not expected
        or not secrets.compare_digest(credentials.credentials.encode("utf-8"), expected.encode("utf-8"))
    ):
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid bearer token")
def _same_origin(candidate: str, allowed_hosts: set[str]) -> bool:
    parsed = urlparse(candidate)
    return bool(parsed.scheme in {"http", "https"} and parsed.netloc in allowed_hosts)


def require_admin_csrf(request: Request, settings: Settings = Depends(get_settings)) -> None:
    if not settings.security.dashboard_auth_enabled or request.method in {"GET", "HEAD", "OPTIONS", "TRACE"}:
        return
    allowed_hosts = {host for host in {request.headers.get("host"), urlparse(settings.app.base_url).netloc} if host}
    # Prefer Origin, fall back to Referer, and finally accept an explicit XHR marker.
    origin = request.headers.get("origin")
    if origin:
        if _same_origin(origin, allowed_hosts):
            return
        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Cross-site admin POST rejected.")
    referer = request.headers.get("referer")
    if referer:
        if _same_origin(referer, allowed_hosts):
            return
        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Cross-site admin POST rejected.")
    if request.headers.get("x-requested-with") == "XMLHttpRequest":
        return
    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Admin POST requires same-origin headers.")
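The credential checks above rely on `secrets.compare_digest`, which accepts `str` arguments only when both are ASCII. A minimal sketch of a bytes-based comparison, assuming hypothetical helper names `constant_time_equal` and `credentials_match` that do not exist in the module:

```python
import secrets


def constant_time_equal(supplied: str, expected: str) -> bool:
    # Encode first so non-ASCII values are compared instead of raising TypeError.
    return secrets.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


def credentials_match(user: str, password: str, expected_user: str, expected_password: str) -> bool:
    # Evaluate both comparisons before combining them, so a wrong username
    # does not skip the password comparison.
    user_ok = constant_time_equal(user, expected_user)
    password_ok = constant_time_equal(password, expected_password)
    return user_ok and password_ok
```

Evaluating both comparisons unconditionally is a design choice of this sketch, not something the module does; it keeps the amount of work independent of which field is wrong.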
@@ -0,0 +1,79 @@
from __future__ import annotations
import argparse
from datetime import datetime
from app.config import configure_logging, get_settings
from app.db import init_db, session_scope
from app.inbox_locks import inbox_run_locks
from app.message_processor import process_inbox
def _date(value: str | None):
    # Accepts YYYY-MM-DD; returns None when the option was not given.
    return datetime.strptime(value, "%Y-%m-%d").date() if value else None


def backlog(args: argparse.Namespace) -> int:
    settings = get_settings()
    configure_logging(settings)
    init_db()
    inbox = settings.get_inbox(args.inbox)
    # Non-blocking acquire: bail out if another run already holds this inbox.
    lease = inbox_run_locks.acquire(inbox.id, blocking=False)
    if not lease:
        print(f"Inbox {inbox.id} is already processing.")
        return 1
    with session_scope() as session:
        with lease:
            summary = process_inbox(
                session,
                settings,
                inbox,
                folder=args.folder or inbox.folder,
                mode="backlog",
                since=_date(args.since),
                before=_date(args.before),
                limit=args.limit,
                dry_run=args.dry_run,
                reprocess=args.reprocess,
                mark_seen=args.mark_seen,
            )
    print("Backlog run complete")
    print(f"Inbox: {summary.inbox_id}")
    print(f"Folder: {summary.folder}")
    print(f"Scanned messages: {summary.scanned_messages}")
    print(f"Candidate messages: {summary.candidate_messages}")
    print(f"Valid reports imported: {summary.valid_reports_imported}")
    print(f"Duplicate messages skipped: {summary.duplicate_messages_skipped}")
    print(f"Duplicate report payloads skipped: {summary.duplicate_reports_skipped}")
    print(f"Rejected messages: {summary.rejected_messages}")
    print(f"Failed messages: {summary.failed_messages}")
    print(f"Records imported: {summary.records_imported}")
    print(f"Alerts created: {summary.alerts_created}")
    print(f"LLM explanations generated: {summary.llm_explanations_generated}")
    return 0
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="python -m app.cli")
    sub = parser.add_subparsers(dest="command", required=True)
    backlog_parser = sub.add_parser("backlog")
    backlog_parser.add_argument("--inbox", required=True)
    backlog_parser.add_argument("--folder")
    backlog_parser.add_argument("--since")
    backlog_parser.add_argument("--before")
    backlog_parser.add_argument("--limit", type=int, default=500)
    backlog_parser.add_argument("--dry-run", action="store_true")
    backlog_parser.add_argument("--reprocess", action="store_true")
    backlog_parser.add_argument("--mark-seen", action="store_true")
    backlog_parser.set_defaults(func=backlog)
    return parser


def main() -> int:
    parser = build_parser()
    args = parser.parse_args()
    return args.func(args)


if __name__ == "__main__":
    raise SystemExit(main())
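The `backlog` subcommand maps dashed options such as `--dry-run` onto underscore attributes on the parsed namespace. A reduced sketch of that parser, with a hypothetical inbox id `example-inbox` and only a subset of the options:

```python
import argparse
from datetime import datetime

parser = argparse.ArgumentParser(prog="python -m app.cli")
sub = parser.add_subparsers(dest="command", required=True)
backlog_parser = sub.add_parser("backlog")
backlog_parser.add_argument("--inbox", required=True)
backlog_parser.add_argument("--since")
backlog_parser.add_argument("--limit", type=int, default=500)
backlog_parser.add_argument("--dry-run", action="store_true")

# Equivalent to: python -m app.cli backlog --inbox example-inbox --since 2026-05-01 --dry-run
args = parser.parse_args(["backlog", "--inbox", "example-inbox", "--since", "2026-05-01", "--dry-run"])

# Dates are given as YYYY-MM-DD and converted to date objects.
since = datetime.strptime(args.since, "%Y-%m-%d").date()
```

Here `args.dry_run` is `True`, `args.limit` keeps its default of 500, and `since` is a `date` for 1 May 2026.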
@@ -0,0 +1,200 @@
from __future__ import annotations
import logging
import os
from functools import lru_cache
from pathlib import Path
from typing import Any
import yaml
from pydantic import BaseModel, Field
class AppConfig(BaseModel):
    name: str = "DMARC Sentinel"
    base_url: str = "https://sentinel.tukutoi.com"
    timezone: str = "Europe/Zurich"
    poll_interval_minutes: int = 30
    database_url: str = "sqlite:////app/data/dmarc-sentinel.sqlite3"
    log_level: str = "INFO"
    max_attachment_decompressed_mb: int = 20
    max_attachment_compressed_mb: int = 10
    max_attachments_per_message: int = 20
    max_reports_per_message: int = 20
    max_reports_per_archive: int = 20
    max_archive_compression_ratio: int = 100
    max_xml_records_per_report: int = 10000
    max_record_count: int = 10000000
    max_report_future_days: int = 3
    max_report_past_days: int = 3650
    max_reports_per_poll: int = 200


class SecurityConfig(BaseModel):
    dashboard_auth_enabled: bool = True
    dashboard_username_env: str = "DASHBOARD_USERNAME"
    dashboard_password_env: str = "DASHBOARD_PASSWORD"
    api_token_required: bool = True
    homepage_token_env: str = "HOMEPAGE_API_TOKEN"


class LLMConfig(BaseModel):
    provider: str = "openai"
    api_key_env: str = "OPENAI_API_KEY"
    model: str = "gpt-4.1-mini"
    temperature: float = 0.2
    timeout_seconds: int = 45
    max_retries: int = 2
    generate_alert_explanations: bool = True
    generate_daily_summary: bool = True
    generate_weekly_summary: bool = True
    store_llm_outputs: bool = True
    send_raw_xml_to_llm: bool = False
    send_raw_email_to_llm: bool = False
    system_prompt_path: str = "config/prompts/system.md"
    alert_prompt_path: str = "config/prompts/alert_explanation.md"
    digest_prompt_path: str = "config/prompts/posture_digest.md"
    weekly_prompt_path: str = "config/prompts/weekly_summary.md"
class InboxConfig(BaseModel):
    id: str
    label: str
    domain: str
    imap_host: str
    imap_port: int = 993
    imap_ssl: bool = True
    username_env: str
    password_env: str
    folder: str = "DMARC"
    recipient: str
    processed_folder: str | None = None
    failed_folder: str | None = None
    move_after_success: bool = False
    move_after_failure: bool = False
    mark_seen_after_success: bool = True
    enabled: bool = True

    # The config stores environment variable names; secrets are resolved on access.
    @property
    def username(self) -> str | None:
        return os.getenv(self.username_env)

    @property
    def password(self) -> str | None:
        return os.getenv(self.password_env)


class KnownSenderConfig(BaseModel):
    id: str
    name: str
    ip_allowlist: list[str] = Field(default_factory=list)
    dkim_domains: list[str] = Field(default_factory=list)
    spf_domains: list[str] = Field(default_factory=list)
class EmailAlertConfig(BaseModel):
    enabled: bool = True
    smtp_host_env: str = "ALERT_SMTP_HOST"
    smtp_port_env: str = "ALERT_SMTP_PORT"
    smtp_user_env: str = "ALERT_SMTP_USER"
    smtp_password_env: str = "ALERT_SMTP_PASSWORD"
    from_env: str = "ALERT_EMAIL_FROM"
    to_env: str = "ALERT_EMAIL_TO"


class AlertThresholds(BaseModel):
    unknown_source_fail_count: int = 10
    unknown_source_fail_rate_percent: float = 5
    known_source_fail_rate_percent: float = 2
    total_volume_spike_multiplier: float = 3
    total_volume_drop_percent: float = 80
    min_messages_for_rate_alert: int = 20
    repeated_failure_days: int = 2
    missing_reporter_days: int = 3


class AlertsConfig(BaseModel):
    email: EmailAlertConfig = Field(default_factory=EmailAlertConfig)
    thresholds: AlertThresholds = Field(default_factory=AlertThresholds)


class Settings(BaseModel):
    app: AppConfig = Field(default_factory=AppConfig)
    security: SecurityConfig = Field(default_factory=SecurityConfig)
    llm: LLMConfig = Field(default_factory=LLMConfig)
    inboxes: list[InboxConfig] = Field(default_factory=list)
    known_senders: dict[str, list[KnownSenderConfig]] = Field(default_factory=dict)
    alerts: AlertsConfig = Field(default_factory=AlertsConfig)

    def enabled_inboxes(self) -> list[InboxConfig]:
        return [inbox for inbox in self.inboxes if inbox.enabled]

    def get_inbox(self, inbox_id: str) -> InboxConfig:
        for inbox in self.inboxes:
            if inbox.id == inbox_id:
                return inbox
        raise KeyError(f"Unknown inbox: {inbox_id}")
def _default_config_path() -> Path:
    explicit = os.getenv("DMARC_SENTINEL_CONFIG")
    if explicit:
        return Path(explicit)
    return Path("config/config.yml")


def load_settings(path: str | Path | None = None) -> Settings:
    config_path = Path(path) if path else _default_config_path()
    if not config_path.exists():
        raise FileNotFoundError(
            f"Runtime config not found at {config_path}. "
            "Create config/config.yml from config/config.example.yml or set DMARC_SENTINEL_CONFIG."
        )
    with config_path.open("r", encoding="utf-8") as handle:
        raw: dict[str, Any] = yaml.safe_load(handle) or {}
    settings = Settings.model_validate(raw)
    validate_llm_environment(settings)
    return settings


def validate_llm_environment(settings: Settings) -> None:
    if settings.llm.provider != "openai":
        return
    # The API key is only required when at least one LLM feature is enabled.
    if not any(
        [
            settings.llm.generate_alert_explanations,
            settings.llm.generate_daily_summary,
            settings.llm.generate_weekly_summary,
        ]
    ):
        return
    if os.getenv("DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS", "").lower() == "true":
        return
    if not os.getenv(settings.llm.api_key_env):
        raise RuntimeError(
            f"{settings.llm.api_key_env} is required when llm.provider=openai. "
            "Set DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS=true only for tests."
        )


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    return load_settings()
def configure_logging(settings: Settings) -> None:
    level = getattr(logging, settings.app.log_level.upper(), logging.INFO)
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    root = logging.getLogger()
    root.setLevel(level)
    root.handlers.clear()
    stream = logging.StreamHandler()
    stream.setFormatter(formatter)
    root.addHandler(stream)
    try:
        # Create the directory inside the guard so an unwritable working directory
        # falls back to stream-only logging instead of aborting startup.
        Path("logs").mkdir(exist_ok=True)
        file_handler = logging.FileHandler("logs/dmarc-sentinel.log")
        file_handler.setFormatter(formatter)
        root.addHandler(file_handler)
    except OSError:
        logging.getLogger(__name__).warning("Could not open logs/dmarc-sentinel.log")
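The config models above keep only the names of environment variables and resolve the secret values on access. A minimal stdlib sketch of that indirection, using a plain dataclass instead of pydantic; `EnvBackedCredentials` and the `EXAMPLE_*` variable names are hypothetical:

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvBackedCredentials:
    # Names of environment variables, never the secret values themselves.
    username_env: str
    password_env: str

    @property
    def username(self) -> Optional[str]:
        return os.getenv(self.username_env)

    @property
    def password(self) -> Optional[str]:
        return os.getenv(self.password_env)


os.environ["EXAMPLE_IMAP_USER"] = "reports@example.com"  # hypothetical value
creds = EnvBackedCredentials("EXAMPLE_IMAP_USER", "EXAMPLE_IMAP_PASSWORD")
```

Because the lookup happens at access time, the YAML file can be committed or shared without containing secrets, and an unset variable surfaces as `None` for the caller to handle.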
@@ -0,0 +1,58 @@
from __future__ import annotations
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator
from sqlalchemy import create_engine, text
from sqlalchemy.orm import DeclarativeBase, Session, sessionmaker
from app.config import get_settings
class Base(DeclarativeBase):
    pass


def _engine_url() -> str:
    url = get_settings().app.database_url
    # Only relative SQLite paths (three slashes) get their parent directory created here.
    if url.startswith("sqlite:///") and not url.startswith("sqlite:////"):
        db_path = url.removeprefix("sqlite:///")
        if db_path and db_path != ":memory:":
            Path(db_path).parent.mkdir(parents=True, exist_ok=True)
    return url
engine = create_engine(_engine_url(), future=True, pool_pre_ping=True)
SessionLocal = sessionmaker(bind=engine, autoflush=False, autocommit=False, future=True)
def init_db() -> None:
    # Imported for its side effect of registering the models on Base.metadata.
    from app import models  # noqa: F401

    Base.metadata.create_all(bind=engine)


def get_db() -> Iterator[Session]:
    with SessionLocal() as session:
        yield session


@contextmanager
def session_scope() -> Iterator[Session]:
    with SessionLocal() as session:
        try:
            yield session
            session.commit()
        except Exception:
            session.rollback()
            raise


def database_ok() -> bool:
    try:
        with engine.connect() as conn:
            conn.execute(text("select 1"))
        return True
    except Exception:
        return False
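`session_scope` commits when the block finishes normally and rolls back when it raises. The same pattern can be sketched with the stdlib `sqlite3` module instead of SQLAlchemy; the `transaction` helper and the table `t` are hypothetical:

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def transaction(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    try:
        yield conn
        conn.commit()  # reached only if the block did not raise
    except Exception:
        conn.rollback()  # discard partial writes, then re-raise for the caller
        raise


conn = sqlite3.connect(":memory:")
conn.execute("create table t (x integer)")
conn.commit()

with transaction(conn):
    conn.execute("insert into t values (1)")

try:
    with transaction(conn):
        conn.execute("insert into t values (2)")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

rows = conn.execute("select x from t").fetchall()
```

After both blocks run, only the first insert is visible in `rows`; the second was rolled back by the failing block.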
@@ -0,0 +1,231 @@
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from ipaddress import ip_address
from typing import Iterable
from defusedxml import ElementTree as ET
class DMARCParseError(Exception):
    pass


@dataclass
class ParsedAuthResult:
    auth_type: str
    domain: str | None = None
    selector: str | None = None
    scope: str | None = None
    result: str | None = None
    human_result: str | None = None


@dataclass
class ParsedRecord:
    source_ip: str
    count: int
    disposition: str | None
    policy_dkim: str | None
    policy_spf: str | None
    dkim_aligned: bool
    spf_aligned: bool
    dmarc_pass: bool
    header_from: str | None
    reason_type: str | None
    reason_comment: str | None
    auth_results: list[ParsedAuthResult] = field(default_factory=list)


@dataclass
class ParsedReport:
    org_name: str | None
    org_email: str | None
    extra_contact_info: str | None
    report_id: str | None
    date_begin: datetime | None
    date_end: datetime | None
    domain: str
    adkim: str | None
    aspf: str | None
    policy_p: str | None
    policy_sp: str | None
    policy_pct: int | None
    fo: str | None
    records: list[ParsedRecord]
def _strip_namespace(tag: str) -> str:
    return tag.rsplit("}", 1)[-1] if "}" in tag else tag


def _children(element: ET.Element, name: str) -> Iterable[ET.Element]:
    for child in list(element):
        if _strip_namespace(child.tag) == name:
            yield child


def _child(element: ET.Element, path: str) -> ET.Element | None:
    # Walk a "/"-separated path, taking the first matching child at each step.
    current = element
    for piece in path.split("/"):
        found = None
        for child in _children(current, piece):
            found = child
            break
        if found is None:
            return None
        current = found
    return current


def _text(element: ET.Element, path: str) -> str | None:
    found = _child(element, path)
    if found is None or found.text is None:
        return None
    value = found.text.strip()
    return value or None


def _int(value: str | None) -> int | None:
    if value in (None, ""):
        return None
    try:
        return int(value)
    except ValueError:
        return None


def _dt(value: str | None) -> datetime | None:
    number = _int(value)
    if number is None:
        return None
    try:
        return datetime.fromtimestamp(number, tz=timezone.utc)
    except (OverflowError, OSError, ValueError) as exc:
        # An out-of-range epoch value would otherwise escape as a non-parse error.
        raise DMARCParseError(f"Invalid report timestamp: {value}") from exc
def _validate_report_dates(date_begin: datetime | None, date_end: datetime | None, max_future_days: int, max_past_days: int) -> None:
    now = datetime.now(timezone.utc)
    earliest = now - timedelta(days=max_past_days)
    latest = now + timedelta(days=max_future_days)
    for label, value in {"begin": date_begin, "end": date_end}.items():
        if value is None:
            continue
        if value < earliest:
            raise DMARCParseError(f"Report {label} date is older than {max_past_days} days")
        if value > latest:
            raise DMARCParseError(f"Report {label} date is more than {max_future_days} days in the future")
    if date_begin and date_end and date_begin > date_end:
        raise DMARCParseError("Report begin date is after end date")
def parse_dmarc_xml(
    payload: bytes,
    *,
    max_records: int | None = None,
    max_record_count: int | None = None,
    max_future_days: int = 3,
    max_past_days: int = 3650,
) -> ParsedReport:
    try:
        root = ET.fromstring(payload)
    except Exception as exc:
        raise DMARCParseError(f"Invalid XML: {exc}") from exc
    if _strip_namespace(root.tag) != "feedback":
        raise DMARCParseError("Root element is not feedback")
    metadata = _child(root, "report_metadata")
    policy = _child(root, "policy_published")
    if metadata is None or policy is None:
        raise DMARCParseError("Missing report_metadata or policy_published")
    domain = _text(policy, "domain")
    if not domain:
        raise DMARCParseError("Missing policy domain")
    date_begin = _dt(_text(metadata, "date_range/begin"))
    date_end = _dt(_text(metadata, "date_range/end"))
    _validate_report_dates(date_begin, date_end, max_future_days, max_past_days)
    parsed_records: list[ParsedRecord] = []
    for record in _children(root, "record"):
        if max_records is not None and len(parsed_records) >= max_records:
            raise DMARCParseError(f"Report exceeds record limit of {max_records}")
        row = _child(record, "row")
        if row is None:
            continue
        policy_eval = _child(row, "policy_evaluated")
        source_ip = _text(row, "source_ip")
        count = _int(_text(row, "count")) or 0
        if not source_ip:
            continue
        try:
            ip_address(source_ip)
        except ValueError as exc:
            raise DMARCParseError(f"Invalid source IP: {source_ip}") from exc
        if count < 0:
            raise DMARCParseError(f"Negative message count for source {source_ip}")
        if max_record_count is not None and count > max_record_count:
            raise DMARCParseError(f"Record count {count} exceeds limit of {max_record_count}")
        policy_dkim = _text(policy_eval, "dkim") if policy_eval is not None else None
        policy_spf = _text(policy_eval, "spf") if policy_eval is not None else None
        dkim_aligned = policy_dkim == "pass"
        spf_aligned = policy_spf == "pass"
        reason = _child(policy_eval, "reason") if policy_eval is not None else None
        auth_results: list[ParsedAuthResult] = []
        auth = _child(record, "auth_results")
        if auth is not None:
            for dkim in _children(auth, "dkim"):
                auth_results.append(
                    ParsedAuthResult(
                        auth_type="dkim",
                        domain=_text(dkim, "domain"),
                        selector=_text(dkim, "selector"),
                        result=_text(dkim, "result"),
                        human_result=_text(dkim, "human_result"),
                    )
                )
            for spf in _children(auth, "spf"):
                auth_results.append(
                    ParsedAuthResult(
                        auth_type="spf",
                        domain=_text(spf, "domain"),
                        scope=_text(spf, "scope"),
                        result=_text(spf, "result"),
                    )
                )
        parsed_records.append(
            ParsedRecord(
                source_ip=source_ip,
                count=count,
                disposition=_text(policy_eval, "disposition") if policy_eval is not None else None,
                policy_dkim=policy_dkim,
                policy_spf=policy_spf,
                dkim_aligned=dkim_aligned,
                spf_aligned=spf_aligned,
                # DMARC passes when either aligned mechanism passes.
                dmarc_pass=dkim_aligned or spf_aligned,
                header_from=_text(record, "identifiers/header_from"),
                reason_type=_text(reason, "type") if reason is not None else None,
                reason_comment=_text(reason, "comment") if reason is not None else None,
                auth_results=auth_results,
            )
        )
    if not parsed_records:
        raise DMARCParseError("No valid DMARC records found")
    return ParsedReport(
        org_name=_text(metadata, "org_name"),
        org_email=_text(metadata, "email"),
        extra_contact_info=_text(metadata, "extra_contact_info"),
        report_id=_text(metadata, "report_id"),
        date_begin=date_begin,
        date_end=date_end,
        domain=domain,
        adkim=_text(policy, "adkim"),
        aspf=_text(policy, "aspf"),
        policy_p=_text(policy, "p"),
        policy_sp=_text(policy, "sp"),
        policy_pct=_int(_text(policy, "pct")),
        fo=_text(policy, "fo"),
        records=parsed_records,
    )
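The parser above expects a `feedback` root with `report_metadata`, `policy_published`, and one or more `record` elements. A reduced sketch of that shape and of the pass rule (DKIM or SPF aligned) follows. It uses the stdlib `xml.etree.ElementTree` only because the input is a trusted literal, it ignores XML namespaces, and the `SAMPLE` document and `summarize` helper are made up for illustration:

```python
import xml.etree.ElementTree as ET  # acceptable here only because SAMPLE is a trusted literal

SAMPLE = b"""<?xml version="1.0"?>
<feedback>
  <report_metadata>
    <org_name>example-reporter</org_name>
    <report_id>r-1</report_id>
    <date_range><begin>1767225600</begin><end>1767312000</end></date_range>
  </report_metadata>
  <policy_published><domain>example.com</domain><p>reject</p></policy_published>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>3</count>
      <policy_evaluated><disposition>none</disposition><dkim>fail</dkim><spf>pass</spf></policy_evaluated>
    </row>
    <identifiers><header_from>example.com</header_from></identifiers>
  </record>
</feedback>"""


def summarize(payload: bytes) -> dict:
    root = ET.fromstring(payload)
    row = root.find("record/row")
    evaluated = row.find("policy_evaluated")
    dkim_aligned = evaluated.findtext("dkim") == "pass"
    spf_aligned = evaluated.findtext("spf") == "pass"
    return {
        "domain": root.findtext("policy_published/domain"),
        "source_ip": row.findtext("source_ip"),
        "count": int(row.findtext("count")),
        # One aligned mechanism is enough for a DMARC pass.
        "dmarc_pass": dkim_aligned or spf_aligned,
    }


summary = summarize(SAMPLE)
```

In this sample DKIM fails but SPF passes, so the record counts as a DMARC pass. Untrusted report payloads should keep going through `defusedxml`, as the module does.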
@@ -0,0 +1,261 @@
from __future__ import annotations
from datetime import date, datetime, timedelta, timezone
from sqlalchemy import func, select
from sqlalchemy.orm import Session
from app.models import Alert, DailyStat, InboxStatus, LLMReport, Record, Report
def _pct(pass_count: int, total: int) -> str:
    return f"{(pass_count / total * 100):.1f}%" if total else "0.0%"


def _as_utc(value: datetime | str | None) -> datetime | None:
    if value is None:
        return None
    if isinstance(value, str):
        try:
            value = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            return None
    # Naive datetimes are assumed to already be in UTC.
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value


def report_timestamp(report: Report) -> datetime:
    return _as_utc(report.date_end or report.date_begin or report.created_at) or datetime.now(timezone.utc)


def _report_date_expr():
    return func.coalesce(Report.date_end, Report.date_begin, Report.created_at)


def _display_day(value: date) -> str:
    return value.strftime("%d/%m/%Y")


def report_bounds(session: Session, domain: str | None = None) -> tuple[datetime | None, datetime | None]:
    stmt = select(func.min(_report_date_expr()), func.max(_report_date_expr()))
    if domain:
        stmt = stmt.where(Report.domain == domain)
    start, end = session.execute(stmt).one()
    return _as_utc(start), _as_utc(end)
def resolve_date_range(
    session: Session,
    *,
    period: str = "all",
    domain: str | None = None,
    date_from: str | None = None,
    date_to: str | None = None,
) -> tuple[datetime | None, datetime | None, str]:
    first, latest = report_bounds(session, domain)
    if not latest:
        return None, None, "No reports"
    if period == "custom":
        start = _as_utc(f"{date_from}T00:00:00+00:00") if date_from else first
        end = _as_utc(f"{date_to}T23:59:59+00:00") if date_to else latest
        return start, end, "Custom range"
    # Relative periods are anchored to the latest imported report, not to the current time.
    if period == "24h":
        return latest - timedelta(days=1), latest, "Latest 24h"
    if period == "7d":
        return latest - timedelta(days=7), latest, "Latest 7 days"
    if period == "30d":
        return latest - timedelta(days=30), latest, "Latest 30 days"
    if period == "365d":
        return latest - timedelta(days=365), latest, "Latest year"
    return first, latest, "All imported reports"


def _range_filter(stmt, start: datetime | None, end: datetime | None):
    expr = _report_date_expr()
    if start:
        stmt = stmt.where(expr >= start)
    if end:
        stmt = stmt.where(expr <= end)
    return stmt
def latest_summary(session: Session, domain: str | None = None) -> str:
    posture_stmt = select(LLMReport).where(LLMReport.report_type == "posture").order_by(LLMReport.created_at.desc(), LLMReport.period_end.desc())
    if domain:
        posture_stmt = posture_stmt.where(LLMReport.domain == domain)
    else:
        posture_stmt = posture_stmt.where(LLMReport.domain == "__all__")
    report = session.scalar(posture_stmt.limit(1))
    if not report:
        # Fall back to the most recent daily summary when no posture digest exists.
        stmt = select(LLMReport).where(LLMReport.report_type == "daily").order_by(LLMReport.period_end.desc(), LLMReport.created_at.desc())
        if domain:
            stmt = stmt.where(LLMReport.domain == domain)
        else:
            stmt = stmt.where(LLMReport.domain == "__all__")
        report = session.scalar(stmt.limit(1))
    if report:
        return report.plain_text
    if domain:
        return "No domain posture digest has been generated yet."
    return "No portfolio posture digest has been generated yet."
def homepage_summary(
    session: Session,
    *,
    period: str = "all",
    domain: str | None = None,
    date_from: str | None = None,
    date_to: str | None = None,
) -> dict:
    start, end, scope_label = resolve_date_range(session, period=period, domain=domain, date_from=date_from, date_to=date_to)
    domains = session.scalar(select(func.count(func.distinct(Report.domain)))) or 0
    reports_stmt = select(func.count(Report.id))
    records_stmt = select(Record).join(Report)
    unknown_stmt = select(func.count(func.distinct(Record.source_ip))).join(Report).where(Record.is_known_sender.is_(False))
    if domain:
        reports_stmt = reports_stmt.where(Report.domain == domain)
        records_stmt = records_stmt.where(Report.domain == domain)
        unknown_stmt = unknown_stmt.where(Report.domain == domain)
    reports_stmt = _range_filter(reports_stmt, start, end)
    records_stmt = _range_filter(records_stmt, start, end)
    unknown_stmt = _range_filter(unknown_stmt, start, end)
    reports_in_range = session.scalar(reports_stmt) or 0
    records = session.execute(records_stmt).scalars().all()
    messages_in_range = sum(row.count for row in records)
    pass_count = sum(row.count for row in records if row.dmarc_pass)
    critical = session.scalar(select(func.count(Alert.id)).where(Alert.status == "open", Alert.severity == "critical")) or 0
    warnings = session.scalar(select(func.count(Alert.id)).where(Alert.status == "open", Alert.severity == "warning")) or 0
    unknown_sources = session.scalar(unknown_stmt) or 0
    last_check = session.scalar(select(func.max(InboxStatus.last_success_at)))
    return {
        "status": "critical" if critical else "warning" if warnings else "ok",
        "domains": domains,
        # The "*_today" keys hold totals for the resolved date range.
        "reports_today": reports_in_range,
        "messages_today": messages_in_range,
        "dmarc_pass_count": pass_count,
        "dmarc_fail_count": messages_in_range - pass_count,
        "dmarc_pass_rate": _pct(pass_count, messages_in_range),
        "dmarc_pass_rate_value": round(pass_count / messages_in_range * 100, 1) if messages_in_range else None,
        "critical_alerts": critical,
        "warnings": warnings,
        "unknown_sources": unknown_sources,
        "last_check": last_check.isoformat() if last_check else None,
        "report_day": None,
        "scope_label": scope_label,
        "scope_start": start.isoformat() if start else None,
        "scope_end": end.isoformat() if end else None,
        "summary": latest_summary(session, domain),
    }
def domain_metrics(session: Session, domain: str) -> dict:
    records = session.execute(select(Record).join(Report).where(Report.domain == domain)).scalars().all()
    # Every aggregate is weighted by the per-record message count.
    total = sum(row.count for row in records)
    dmarc_pass = sum(row.count for row in records if row.dmarc_pass)
    spf_aligned = sum(row.count for row in records if row.spf_aligned)
    dkim_aligned = sum(row.count for row in records if row.dkim_aligned)
    return {
        "messages": total,
        "dmarc_pass": dmarc_pass,
        "dmarc_fail": total - dmarc_pass,
        "pass_rate": _pct(dmarc_pass, total),
        "spf_aligned": spf_aligned,
        "spf_rate": _pct(spf_aligned, total),
        "dkim_aligned": dkim_aligned,
        "dkim_rate": _pct(dkim_aligned, total),
        "unknown_sources": len({row.source_ip for row in records if not row.is_known_sender}),
    }
def traffic_distribution(
    session: Session,
    *,
    period: str = "all",
    domain: str | None = None,
    date_from: str | None = None,
    date_to: str | None = None,
    buckets: int | None = None,
) -> list[dict]:
    start, now, _ = resolve_date_range(session, period=period, domain=domain, date_from=date_from, date_to=date_to)
    if not start or not now:
        return []
    default_buckets = {"24h": 12, "7d": 7, "30d": 10, "365d": 12}.get(period, 14)
    bucket_count = buckets or default_buckets
    duration = (now - start).total_seconds()
    bucket_seconds = max(1, duration / bucket_count)
    rows = []
    stmt = select(Record, Report).join(Report).where(_report_date_expr() >= start, _report_date_expr() <= now)
    if domain:
        stmt = stmt.where(Report.domain == domain)
    for record, report in session.execute(stmt).all():
        created = report_timestamp(report)
        index = int((created - start).total_seconds() / bucket_seconds)
        # Clamp so a report stamped exactly at the range end lands in the last bucket.
        index = min(bucket_count - 1, max(0, index))
        rows.append((index, record))
    data = []
    for i in range(bucket_count):
        bucket_start = start + timedelta(seconds=bucket_seconds * i)
        bucket_end = start + timedelta(seconds=bucket_seconds * (i + 1))
        if i == bucket_count - 1:
            bucket_end = now
        start_day = bucket_start.date()
        end_day = bucket_end.date()
        if start_day == end_day:
            label = _display_day(start_day)
        else:
            label = f"{_display_day(start_day)} to {_display_day(end_day)}"
        data.append(
            {
                "label": label,
                "date_from": start_day.isoformat(),
                "date_to": end_day.isoformat(),
                "valid": 0,
                "failed": 0,
                "total": 0,
            }
        )
    for index, record in rows:
        key = "valid" if record.dmarc_pass else "failed"
        data[index][key] += record.count
        data[index]["total"] += record.count
    # Heights are percentages of the tallest bucket, split into failed and valid parts.
    max_total = max([item["total"] for item in data] or [0])
    for item in data:
        item["height"] = round(item["total"] / max_total * 100) if max_total else 0
        item["failed_height"] = round(item["failed"] / item["total"] * item["height"]) if item["total"] else 0
        item["valid_height"] = max(0, item["height"] - item["failed_height"])
    return data
def domain_homepage_summary(session: Session, domain: str) -> dict:
    # Reuse the shared coalesce expression instead of repeating it inline.
    report_date = _report_date_expr()
    latest = _as_utc(session.scalar(select(func.max(report_date)).where(Report.domain == domain)))
    end = latest or datetime.now(timezone.utc)
    since = end - timedelta(days=1)
    records = session.execute(
        select(Record)
        .join(Report)
        .where(Report.domain == domain, report_date >= since, report_date <= end)
    ).scalars().all()
    total = sum(row.count for row in records)
    passed = sum(row.count for row in records if row.dmarc_pass)
    failed = total - passed
    unknown = len({row.source_ip for row in records if not row.is_known_sender})
    critical = session.scalar(
        select(func.count(Alert.id)).where(Alert.status == "open", Alert.domain == domain, Alert.severity == "critical")
    ) or 0
    warnings = session.scalar(
        select(func.count(Alert.id)).where(Alert.status == "open", Alert.domain == domain, Alert.severity == "warning")
    ) or 0
    return {
        "status": "critical" if critical else "warning" if warnings else "ok",
        "domain": domain,
        "messages_24h": total,
        "pass_rate": _pct(passed, total),
        "failed": failed,
        "unknown_sources": unknown,
        "critical_alerts": critical,
        "warnings": warnings,
        "summary": latest_summary(session, domain),
    }
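`traffic_distribution` assigns each report to a time bucket by dividing its offset from the range start by the bucket width, then clamping the result. A standalone sketch of that index calculation; the function name `bucket_index` and the sample dates are hypothetical:

```python
from datetime import datetime, timedelta, timezone


def bucket_index(stamp: datetime, start: datetime, end: datetime, bucket_count: int) -> int:
    # Bucket width in seconds, never below one second to avoid division by zero.
    bucket_seconds = max(1, (end - start).total_seconds() / bucket_count)
    index = int((stamp - start).total_seconds() / bucket_seconds)
    # Clamp so a stamp equal to `end` falls into the last bucket.
    return min(bucket_count - 1, max(0, index))


start = datetime(2026, 5, 1, tzinfo=timezone.utc)
end = start + timedelta(days=7)
midweek = start + timedelta(days=3, hours=12)
```

With seven one-day buckets, `start` maps to bucket 0, `midweek` to bucket 3, and `end` is clamped into bucket 6 instead of a nonexistent bucket 7.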
@@ -0,0 +1,92 @@
from __future__ import annotations
import email
import imaplib
import logging
from dataclasses import dataclass
from datetime import date
from email.message import Message
from app.config import InboxConfig
logger = logging.getLogger(__name__)
class IMAPError(Exception):
    pass


@dataclass
class IMAPMessage:
    uid: str
    raw: bytes
    seen: bool
    message: Message
class IMAPClient:
def __init__(self, inbox: InboxConfig):
self.inbox = inbox
self.conn: imaplib.IMAP4 | imaplib.IMAP4_SSL | None = None
def __enter__(self) -> "IMAPClient":
if not self.inbox.username or not self.inbox.password:
raise IMAPError(f"Missing IMAP credentials for inbox {self.inbox.id}")
cls = imaplib.IMAP4_SSL if self.inbox.imap_ssl else imaplib.IMAP4
self.conn = cls(self.inbox.imap_host, self.inbox.imap_port)
typ, _ = self.conn.login(self.inbox.username, self.inbox.password)
if typ != "OK":
raise IMAPError(f"IMAP login failed for {self.inbox.id}")
logger.info("IMAP connection succeeded for inbox %s", self.inbox.id)
return self
def __exit__(self, exc_type, exc, tb) -> None:
if self.conn is None:
return
try:
self.conn.logout()
except Exception:
pass
def select_folder(self, folder: str) -> None:
assert self.conn is not None
typ, data = self.conn.select(f'"{folder}"', readonly=False)
if typ != "OK":
raise IMAPError(f"IMAP folder does not exist or cannot be selected: {folder}")
def search_uids(self, *, unread_only: bool, since: date | None = None, before: date | None = None, limit: int | None = None) -> list[str]:
assert self.conn is not None
criteria: list[str] = ["UNSEEN" if unread_only else "ALL"]
if since:
criteria.extend(["SINCE", since.strftime("%d-%b-%Y")])
if before:
criteria.extend(["BEFORE", before.strftime("%d-%b-%Y")])
typ, data = self.conn.uid("SEARCH", None, *criteria)
if typ != "OK":
raise IMAPError("IMAP UID search failed")
uids = data[0].decode().split() if data and data[0] else []
return uids[:limit] if limit else uids
def fetch_message(self, uid: str) -> IMAPMessage:
assert self.conn is not None
typ, data = self.conn.uid("FETCH", uid, "(RFC822 FLAGS)")
if typ != "OK" or not data:
raise IMAPError(f"Could not fetch UID {uid}")
raw = b""
flags = b""
for item in data:
if isinstance(item, tuple):
flags += item[0]
raw = item[1]
return IMAPMessage(uid=uid, raw=raw, seen=b"\\Seen" in flags, message=email.message_from_bytes(raw))
def mark_seen(self, uid: str) -> None:
assert self.conn is not None
self.conn.uid("STORE", uid, "+FLAGS", "(\\Seen)")
def move(self, uid: str, folder: str) -> None:
assert self.conn is not None
typ, _ = self.conn.uid("COPY", uid, f'"{folder}"')
if typ == "OK":
self.conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
self.conn.expunge()
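imaplib hands back a FETCH response as a list that mixes `(header, literal)` tuples with bare `bytes` fragments, so the flag list can sit on either side of the message literal. The following is a minimal sketch of collecting both parts. It runs against a hand-written sample rather than a live server, and `split_fetch_response` and `sample` are illustrative names that do not exist in the module above.

```python
def split_fetch_response(data):
    """Return (raw_message, flag_bytes) from an imaplib-style FETCH payload."""
    raw = b""
    flags = b""
    for item in data:
        if isinstance(item, tuple):
            # Tuple form: response text up to the literal, then the literal itself.
            flags += item[0]
            raw = item[1]
        elif isinstance(item, bytes):
            # Trailing fragment after the literal, which may carry FLAGS.
            flags += item
    return raw, flags


# Hand-written sample shaped like imaplib output; not captured from a real server.
sample = [(b"1 (UID 7 RFC822 {11}", b"Subject: hi"), b" FLAGS (\\Seen))"]
raw, flags = split_fetch_response(sample)
seen = b"\\Seen" in flags
```

Entries that are neither tuples nor bytes (imaplib can return `None` for an empty result) are ignored, which leaves `raw` empty and lets the caller decide how to treat a missing message.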
+45
@@ -0,0 +1,45 @@
from __future__ import annotations

import threading
from dataclasses import dataclass


@dataclass
class InboxRunLease:
    inbox_id: str
    _lock: threading.Lock
    _released: bool = False

    def release(self) -> None:
        if not self._released:
            self._released = True
            self._lock.release()

    def __enter__(self) -> "InboxRunLease":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.release()


class InboxRunLocks:
    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: dict[str, threading.Lock] = {}

    def acquire(self, inbox_id: str, *, blocking: bool = False) -> InboxRunLease | None:
        with self._guard:
            lock = self._locks.setdefault(inbox_id, threading.Lock())
        # Acquire outside the guard so a blocking wait never stalls other inboxes.
        if not lock.acquire(blocking=blocking):
            return None
        return InboxRunLease(inbox_id=inbox_id, _lock=lock)

    def active(self, inbox_id: str) -> bool:
        # Inspect the lock instead of briefly acquiring it: a probe that takes the
        # lock can make a concurrent non-blocking acquire() fail spuriously.
        with self._guard:
            lock = self._locks.get(inbox_id)
        return lock is not None and lock.locked()


inbox_run_locks = InboxRunLocks()
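The per-inbox locking above boils down to one non-blocking lock per key. A minimal sketch of that pattern follows, assuming a reduced stand-in class: `KeyedLocks` and `try_acquire` are hypothetical names used only for illustration, not part of the module.

```python
import threading


class KeyedLocks:
    """Hypothetical stand-in: one lock per key, handed out without blocking."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def try_acquire(self, key):
        # The guard only protects the dictionary, not the per-key lock itself.
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        return lock if lock.acquire(blocking=False) else None


locks = KeyedLocks()
first = locks.try_acquire("inbox-a")   # free, so the lock is returned
second = locks.try_acquire("inbox-a")  # same key is already held
other = locks.try_acquire("inbox-b")   # a different key is independent
first.release()
third = locks.try_acquire("inbox-a")   # available again after release
```

Returning `None` instead of waiting lets a caller skip an inbox that is already being processed rather than queue up behind it.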
+94
@@ -0,0 +1,94 @@
from __future__ import annotations

import threading
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import Callable
from uuid import uuid4

from app.models import utcnow


@dataclass
class ImportJob:
    id: str
    inbox_id: str
    action: str
    status: str = "queued"
    scanned_messages: int = 0
    processed_messages: int = 0
    candidate_messages: int = 0
    valid_reports_imported: int = 0
    duplicate_messages_skipped: int = 0
    duplicate_reports_skipped: int = 0
    failed_messages: int = 0
    rejected_messages: int = 0
    records_imported: int = 0
    alerts_created: int = 0
    duplicate_report_samples: list[dict] | None = None
    error: str | None = None
    started_at: datetime = field(default_factory=utcnow)
    updated_at: datetime = field(default_factory=utcnow)
    completed_at: datetime | None = None

    @property
    def progress_percent(self) -> int | None:
        if not self.scanned_messages:
            return None
        return min(100, round(self.processed_messages / self.scanned_messages * 100))

    def to_dict(self) -> dict:
        data = asdict(self)
        data["started_at"] = self.started_at.isoformat()
        data["updated_at"] = self.updated_at.isoformat()
        data["completed_at"] = self.completed_at.isoformat() if self.completed_at else None
        data["progress_percent"] = self.progress_percent
        data["duplicates_skipped"] = self.duplicate_messages_skipped + self.duplicate_reports_skipped
        return data


class ImportJobStore:
    def __init__(self) -> None:
        self._jobs: dict[str, ImportJob] = {}
        self._lock = threading.Lock()

    def active_for_inbox(self, inbox_id: str) -> ImportJob | None:
        with self._lock:
            for job in sorted(self._jobs.values(), key=lambda item: item.started_at, reverse=True):
                if job.inbox_id == inbox_id and job.status in {"queued", "running"}:
                    return job
            return None

    def latest_for_inbox(self, inbox_id: str) -> ImportJob | None:
        with self._lock:
            jobs = [job for job in self._jobs.values() if job.inbox_id == inbox_id]
            return max(jobs, key=lambda item: item.started_at) if jobs else None

    def create(self, inbox_id: str, action: str) -> ImportJob:
        job = ImportJob(id=uuid4().hex, inbox_id=inbox_id, action=action)
        with self._lock:
            self._jobs[job.id] = job
        return job

    def get(self, job_id: str) -> ImportJob | None:
        with self._lock:
            return self._jobs.get(job_id)

    def list(self, inbox_id: str | None = None) -> list[ImportJob]:
        with self._lock:
            jobs = list(self._jobs.values())
        if inbox_id:
            jobs = [job for job in jobs if job.inbox_id == inbox_id]
        return sorted(jobs, key=lambda item: item.started_at, reverse=True)

    def update(self, job_id: str, mutator: Callable[[ImportJob], None]) -> ImportJob | None:
        with self._lock:
            job = self._jobs.get(job_id)
            if not job:
                return None
            mutator(job)
            job.updated_at = utcnow()
            return job


import_jobs = ImportJobStore()
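The `update(job_id, mutator)` method applies a caller-supplied function while the store lock is held, so counter changes made from several threads do not interleave. A minimal sketch of the same idea follows, assuming simplified stand-ins: `Counter`, `CounterStore`, `bump`, and `worker` are hypothetical and carry none of the real job fields.

```python
from __future__ import annotations

import threading
from dataclasses import dataclass
from typing import Callable


@dataclass
class Counter:
    processed: int = 0


class CounterStore:
    """Hypothetical stand-in for a job store with mutate-under-lock updates."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._items: dict[str, Counter] = {}

    def create(self, key: str) -> Counter:
        with self._lock:
            return self._items.setdefault(key, Counter())

    def update(self, key: str, mutator: Callable[[Counter], None]) -> Counter | None:
        with self._lock:
            item = self._items.get(key)
            if item is None:
                return None
            # The read-modify-write inside the mutator runs with the lock held.
            mutator(item)
            return item


def bump(item: Counter) -> None:
    item.processed += 1


def worker(store: CounterStore) -> None:
    for _ in range(100):
        store.update("job-1", bump)


store = CounterStore()
store.create("job-1")
threads = [threading.Thread(target=worker, args=(store,)) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
final = store.update("job-1", lambda item: None)
```

One trade-off of this shape is that a slow mutator blocks every other store operation, so mutators are best kept to simple field assignments.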
+47
@@ -0,0 +1,47 @@
from __future__ import annotations

from dataclasses import dataclass
from ipaddress import ip_address, ip_network

from app.config import KnownSenderConfig, Settings
from app.dmarc_parser import ParsedRecord


@dataclass(frozen=True)
class SenderMatch:
    id: str | None
    name: str | None
    is_known: bool


def _domain_equal(a: str | None, b: str | None) -> bool:
    return (a or "").lower().rstrip(".") == (b or "").lower().rstrip(".")


def _ip_matches(source_ip: str, sender: KnownSenderConfig) -> bool:
    try:
        ip = ip_address(source_ip)
    except ValueError:
        return False
    for cidr in sender.ip_allowlist:
        try:
            if ip in ip_network(cidr, strict=False):
                return True
        except ValueError:
            continue
    return False


def classify_record(settings: Settings, domain: str, record: ParsedRecord) -> SenderMatch:
    senders = settings.known_senders.get(domain, [])
    for sender in senders:
        if _ip_matches(record.source_ip, sender):
            return SenderMatch(sender.id, sender.name, True)
        # A sender with an IP allowlist is matched by IP only.
        if sender.ip_allowlist:
            continue
        for auth in record.auth_results:
            if auth.auth_type == "dkim" and any(_domain_equal(auth.domain, item) for item in sender.dkim_domains):
                return SenderMatch(sender.id, sender.name, True)
            if auth.auth_type == "spf" and any(_domain_equal(auth.domain, item) for item in sender.spf_domains):
                return SenderMatch(sender.id, sender.name, True)
    return SenderMatch(None, None, False)
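The allowlist check relies on the standard-library `ipaddress` module for CIDR membership. A minimal sketch of that check on its own follows. It takes a plain list of CIDR strings instead of a `KnownSenderConfig`, and `ip_in_allowlist` is a hypothetical name; the addresses come from the reserved documentation ranges.

```python
from ipaddress import ip_address, ip_network


def ip_in_allowlist(source_ip, allowlist):
    """Return True when source_ip falls inside any CIDR block in allowlist."""
    try:
        ip = ip_address(source_ip)
    except ValueError:
        # Unparseable source addresses never match.
        return False
    for cidr in allowlist:
        try:
            # strict=False accepts entries with host bits set, e.g. 192.0.2.5/24.
            if ip in ip_network(cidr, strict=False):
                return True
        except ValueError:
            # Skip malformed allowlist entries instead of failing the whole check.
            continue
    return False


inside = ip_in_allowlist("192.0.2.10", ["192.0.2.5/24"])
outside = ip_in_allowlist("198.51.100.7", ["192.0.2.0/24"])
garbage = ip_in_allowlist("not-an-ip", ["192.0.2.0/24"])
bad_entry = ip_in_allowlist("192.0.2.10", ["bogus", "192.0.2.0/24"])
mixed = ip_in_allowlist("2001:db8::1", ["192.0.2.0/24", "2001:db8::/32"])
```

An IPv6 address tested against an IPv4 network simply evaluates to not-contained, so a mixed allowlist needs no special handling.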
+253
@@ -0,0 +1,253 @@
from __future__ import annotations
import json
import logging
import os
from pathlib import Path
from typing import Any
from openai import OpenAI
from pydantic import BaseModel, ValidationError
from app.config import Settings
from app.models import Alert
logger = logging.getLogger(__name__)
SYSTEM_PROMPT = (
"You are an expert email authentication and DMARC operations analyst. You explain deterministic DMARC telemetry "
"to a business owner/admin. You must not invent facts. You must distinguish confirmed facts from likely "
"interpretations. You must never claim an account is compromised solely from DMARC aggregate failures. You must "
"provide practical next steps. Output only valid JSON matching the requested schema."
)
ALERT_PROMPT = (
"Explain this DMARC alert to a business owner/admin. Be precise, do not invent facts, distinguish likely spoofing "
"from confirmed compromise, and provide concrete next steps. DMARC aggregate source IPs are observed transmitting "
"IPs from the reporter's point of view and may be final-hop relays, forwarders, mailing lists, or gateways. If SPF "
"fails but DKIM aligns and DMARC passes, do not frame the IP as a threat or as something to add to SPF; explain that "
"forwarding commonly breaks SPF while DKIM can still prove authorization. If a source is not legitimate, say not to "
"add it to known senders, keep it unauthorized, preserve or tighten DMARC enforcement after legitimate senders are "
"aligned, and investigate whether any internal system is leaking mail through that source. Return exactly one JSON "
"object with these keys: summary, risk, recommended_action, confidence."
)
POSTURE_DIGEST_PROMPT = (
"Write a current DMARC posture report for the admin using all supplied deterministic telemetry and all open alerts. "
"Base the report on unresolved/open risk, not only one report day. Mention exact counts/rates, important failing or "
"unknown sources, relevant reporters, and concrete remediation. DMARC aggregate source IPs are observed transmitting "
"IPs from the reporter's point of view and may be final-hop relays, forwarders, mailing lists, or gateways. For "
"SPF-fail, DKIM-pass, DMARC-pass observations, explain that this commonly indicates forwarding or an intermediary "
"relay and do not recommend adding those observed relay IPs to SPF solely because they appear in aggregate reports. "
"For unknown failing sources, explain both branches: if legitimate, authorize/fix SPF/DKIM/alignment and classify; "
"if not legitimate, do not authorize it, leave it unknown, and use DMARC enforcement such as quarantine/reject once "
"legitimate senders are aligned. Do not claim mailbox compromise from aggregate data alone. Return only JSON "
"matching required_json_schema."
)
WEEKLY_PROMPT = (
"Include high-level posture, trend changes, new senders, persistent failures, whether DMARC policy posture looks "
"safe, and recommended operational actions. Only say consider stricter policy if the metrics support it."
)
class AlertExplanation(BaseModel):
summary: str
risk: str
recommended_action: str
confidence: str = "medium"
class SummaryOutput(BaseModel):
headline: str
summary: str
action_items: list[str] = []
business_risk: str
def _stringify_action(value: Any) -> str:
if isinstance(value, list):
return "; ".join(str(item) for item in value if item)
if value is None:
return ""
return str(value)
def normalize_alert_explanation(output: dict[str, Any], alert: Alert | Any) -> AlertExplanation:
if {"summary", "risk", "recommended_action"}.issubset(output):
return AlertExplanation.model_validate(output)
explanation = output.get("explanation")
if isinstance(explanation, dict):
source = {**explanation, **{key: value for key, value in output.items() if key != "explanation"}}
else:
source = dict(output)
if explanation and "summary" not in source:
source["summary"] = str(explanation)
summary = source.get("summary") or source.get("headline") or getattr(alert, "summary", "")
risk = source.get("risk") or source.get("business_risk") or source.get("impact")
action = (
source.get("recommended_action")
or source.get("recommendation")
or source.get("next_step")
or source.get("next_steps")
or source.get("action_items")
)
if not risk:
risk = "Review the deterministic facts. DMARC aggregate data alone does not prove mailbox compromise."
if not action:
action = "Review the deterministic facts before changing DNS or sender classification; do not add relay or forwarding IPs to SPF solely because they appear in aggregate reports."
return AlertExplanation(
summary=str(summary),
risk=str(risk),
recommended_action=_stringify_action(action),
confidence=str(source.get("confidence") or "medium"),
)
def normalize_summary_output(output: dict[str, Any], payload: dict[str, Any]) -> SummaryOutput:
metrics = payload.get("metrics") or {}
headline = output.get("headline") or output.get("title")
summary = output.get("summary") or output.get("explanation") or output.get("analysis")
risk = output.get("business_risk") or output.get("risk") or output.get("impact")
actions = output.get("action_items") or output.get("recommended_actions") or output.get("recommendations") or output.get("next_steps") or []
if isinstance(actions, str):
actions = [item.strip() for item in actions.split(";") if item.strip()]
if not isinstance(actions, list):
actions = []
if not headline:
headline = f"DMARC posture for {payload.get('domain', 'domain')} on {payload.get('period', 'the selected period')}"
if not summary:
total = metrics.get("total_messages", 0)
pass_rate = metrics.get("dmarc_pass_rate", 0)
failed = metrics.get("dmarc_failed", 0)
unknown = metrics.get("unknown_sources", 0)
summary = (
f"{payload.get('domain', 'The domain')} processed {total} DMARC-observed messages with a {pass_rate}% "
f"DMARC pass rate. {failed} messages failed DMARC and {unknown} unknown sources were observed."
)
if not risk:
risk = "Review failures and unknown senders before changing policy. DMARC aggregate data alone does not prove mailbox compromise."
if not actions:
top_sources = payload.get("top_sources") or []
source = top_sources[0]["source_ip"] if top_sources and isinstance(top_sources[0], dict) and top_sources[0].get("source_ip") else "the top unknown or failing sources"
actions = [
f"Review {source}; if legitimate, fix SPF/DKIM alignment and classify it as approved, and if not legitimate, do not authorize it and rely on DMARC enforcement after legitimate senders are aligned."
]
return SummaryOutput(
headline=str(headline),
summary=str(summary),
action_items=[str(item) for item in actions if str(item).strip()],
business_risk=str(risk),
)
def fallback_alert_explanation(alert: Alert | Any) -> AlertExplanation:
return AlertExplanation(
summary=getattr(alert, "summary", "DMARC Sentinel created a deterministic alert."),
risk="Review the deterministic facts. DMARC aggregate data alone does not prove mailbox compromise.",
recommended_action="Review the deterministic facts before changing DNS or sender classification; do not add relay or forwarding IPs to SPF solely because they appear in aggregate reports.",
confidence="fallback",
)
class LLMClient:
def __init__(self, settings: Settings):
self.settings = settings
self.client = None
if settings.llm.provider == "openai" and os.getenv(settings.llm.api_key_env):
self.client = OpenAI(api_key=os.getenv(settings.llm.api_key_env), timeout=settings.llm.timeout_seconds)
def _prompt(self, path: str, fallback: str) -> str:
try:
prompt_path = Path(path)
if prompt_path.exists():
return prompt_path.read_text(encoding="utf-8").strip()
except OSError as exc:
logger.warning("Could not read prompt file %s: %s", path, exc)
return fallback
def _json_call(self, payload: dict[str, Any]) -> dict[str, Any]:
if self.client is None:
raise RuntimeError("OpenAI client is not configured")
last_error: Exception | None = None
for attempt in range(self.settings.llm.max_retries + 1):
try:
response = self.client.chat.completions.create(
model=self.settings.llm.model,
temperature=self.settings.llm.temperature,
response_format={"type": "json_object"},
messages=[
{"role": "system", "content": self._prompt(self.settings.llm.system_prompt_path, SYSTEM_PROMPT)},
{"role": "user", "content": json.dumps(payload, sort_keys=True)},
],
)
text = response.choices[0].message.content or "{}"
return json.loads(text)
except Exception as exc:
last_error = exc
logger.warning("LLM call failed on attempt %s: %s", attempt + 1, exc)
raise RuntimeError(f"LLM call failed: {last_error}")
def explain_alert(self, alert: Alert) -> AlertExplanation:
payload = {
"task": "explain_dmarc_alert",
"domain": alert.domain,
"severity": alert.severity,
"alert_type": alert.type,
"facts": json.loads(alert.details_json or "{}"),
"required_json_schema": {
"summary": "string, one concise sentence based only on the supplied facts",
"risk": "string, business/operational risk without claiming compromise from aggregate data alone",
"recommended_action": "string, concrete next step for the admin",
"confidence": "low|medium|high",
},
"instruction": self._prompt(self.settings.llm.alert_prompt_path, ALERT_PROMPT),
}
last_error: Exception | None = None
for _ in range(2):
try:
output = self._json_call(payload)
return normalize_alert_explanation(output, alert)
except (RuntimeError, ValidationError, json.JSONDecodeError) as exc:
last_error = exc
logger.warning("LLM alert explanation validation failed for %s: %s", alert.fingerprint, exc)
logger.warning("Using fallback LLM alert explanation for %s: %s", alert.fingerprint, last_error)
return fallback_alert_explanation(alert)
def daily_summary(self, payload: dict[str, Any]) -> SummaryOutput:
try:
enriched = {
**payload,
"required_json_schema": payload.get("required_json_schema")
or {
"headline": "string",
"summary": "string",
"action_items": "array of strings",
"business_risk": "string",
},
"instruction": payload.get("instruction") or self._prompt(self.settings.llm.digest_prompt_path, POSTURE_DIGEST_PROMPT),
}
output = self._json_call(enriched)
return normalize_summary_output(output, enriched)
except Exception as exc:
logger.warning("Using fallback daily summary: %s", exc)
return normalize_summary_output({}, payload)
def weekly_summary(self, payload: dict[str, Any]) -> SummaryOutput:
try:
output = self._json_call({**payload, "instruction": payload.get("instruction") or self._prompt(self.settings.llm.weekly_prompt_path, WEEKLY_PROMPT)})
return SummaryOutput.model_validate(output)
except Exception as exc:
logger.warning("Using fallback weekly summary: %s", exc)
return SummaryOutput(
headline="Weekly DMARC posture summary generated from deterministic telemetry.",
summary="Review trend changes, new senders, and persistent failures before changing DMARC policy.",
action_items=["Classify legitimate new senders.", "Investigate persistent failures."],
business_risk="Unknown",
)
+953
@@ -0,0 +1,953 @@
from __future__ import annotations
import json
import threading
from datetime import date, datetime, timedelta, timezone
import os
from pathlib import Path
from types import SimpleNamespace
from fastapi import Depends, FastAPI, HTTPException, Request
from fastapi.responses import HTMLResponse
from fastapi.staticfiles import StaticFiles
from fastapi.templating import Jinja2Templates
from sqlalchemy import desc, func, select
from sqlalchemy.orm import Session, selectinload
from app import __version__
from app.auth import require_admin_csrf, require_dashboard_auth, require_homepage_token
from app.config import Settings, configure_logging, get_settings
from app.db import database_ok, get_db, init_db, session_scope
from app.homepage import domain_homepage_summary, domain_metrics, homepage_summary, latest_summary, resolve_date_range, traffic_distribution
from app.inbox_locks import InboxRunLease, inbox_run_locks
from app.jobs import import_jobs
from app.models import Alert, DailyStat, InboxStatus, LLMReport, Record, Report, SkippedReportPayload, utcnow
from app.scheduler import generate_open_posture_summaries, scheduler_ok, start_scheduler
from app.schemas import BacklogRequest, ProcessNowRequest
from app.message_processor import process_inbox
from app.validation import parse_positive_int_ids
settings = get_settings()
configure_logging(settings)
init_db()
app = FastAPI(
title=settings.app.name,
version=__version__,
docs_url=None,
redoc_url=None,
openapi_url=None,
)
templates = Jinja2Templates(directory="app/templates")
Path("app/static").mkdir(parents=True, exist_ok=True)
app.mount("/static", StaticFiles(directory="app/static"), name="static")
dashboard_auth = [Depends(require_dashboard_auth)]
dashboard_post_auth = [Depends(require_dashboard_auth), Depends(require_admin_csrf)]
def _format_display_datetime(value, fallback: str = "never") -> str:
if not value:
return fallback
if isinstance(value, str):
parsed = _parse_dt(value)
if parsed:
value = parsed
else:
return value
if isinstance(value, datetime):
return value.strftime("%d/%m/%Y %H:%M:%S")
if isinstance(value, date):
return value.strftime("%d/%m/%Y")
return str(value)
def _format_display_date(value, fallback: str = "never") -> str:
if not value:
return fallback
if isinstance(value, str):
parsed = _parse_dt(value)
if parsed:
return parsed.strftime("%d/%m/%Y")
try:
return date.fromisoformat(value).strftime("%d/%m/%Y")
except ValueError:
return value
if isinstance(value, datetime):
return value.strftime("%d/%m/%Y")
if isinstance(value, date):
return value.strftime("%d/%m/%Y")
return str(value)
templates.env.filters["fmt_dt"] = _format_display_datetime
templates.env.filters["fmt_date"] = _format_display_date
@app.on_event("startup")
def _startup() -> None:
start_scheduler(settings)
@app.get("/health")
def health():
if not database_ok():
raise HTTPException(status_code=500, detail={"status": "error", "database": "failed"})
return {"status": "ok", "database": "ok", "scheduler": "ok" if scheduler_ok() else "stopped", "version": __version__}
@app.get("/", response_class=HTMLResponse, dependencies=[Depends(require_dashboard_auth)])
def index(request: Request, session: Session = Depends(get_db)):
data = homepage_summary(session, period="all")
domains = session.execute(select(Report.domain).distinct().order_by(Report.domain)).scalars().all()
alerts_total = session.scalar(select(func.count(Alert.id)).where(Alert.status == "open")) or 0
alerts = _alert_views(session.execute(select(Alert).where(Alert.status == "open")).scalars().all(), session)[:5]
traffic = traffic_distribution(session, period="all")
return templates.TemplateResponse(
"index.html",
{
"request": request,
"data": data,
"domains": domains,
"alerts": alerts,
"alerts_total": alerts_total,
"traffic": traffic,
"traffic_label": f'{data["scope_label"]} · All domains',
},
)
def _parse_dt(value: str | None) -> datetime | None:
if not value:
return None
try:
parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
except ValueError:
return None
if parsed.tzinfo is None:
return parsed.replace(tzinfo=timezone.utc)
return parsed
def _alert_details(alert: Alert) -> dict:
try:
return json.loads(alert.details_json or "{}")
except json.JSONDecodeError:
return {}
def _alert_report_time(alert: Alert) -> datetime:
details = _alert_details(alert)
date_range = details.get("date_range") if isinstance(details.get("date_range"), dict) else {}
return (
_parse_dt(date_range.get("end"))
or _parse_dt(date_range.get("begin"))
or _parse_dt(alert.last_seen_at.isoformat() if alert.last_seen_at else None)
or _parse_dt(alert.created_at.isoformat() if alert.created_at else None)
or datetime.now(timezone.utc)
)
def _severity_class(severity: str) -> str:
return {
"critical": "critical",
"warning": "warning",
"info": "info",
}.get(severity, "info")
def _infer_alert_report_details(session: Session | None, alert: Alert, details: dict) -> dict:
date_range = details.get("date_range") if isinstance(details.get("date_range"), dict) else {}
if date_range.get("begin") or date_range.get("end") or not session:
return details
source_ip = details.get("source_ip")
if not source_ip:
parts = alert.fingerprint.split(":", 2)
source_ip = parts[2] if len(parts) == 3 and parts[2] != "global" else None
report = None
is_aggregate = alert.type in {
"sudden_unknown_failure_spike",
}
if source_ip and not is_aggregate:
report = session.scalar(
select(Report)
.join(Record)
.where(Report.domain == alert.domain, Record.source_ip == source_ip)
.order_by(desc(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)))
.limit(1)
)
if not report:
report = session.scalar(
select(Report)
.where(Report.domain == alert.domain)
.order_by(desc(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)))
.limit(1)
)
if not report:
return details
enriched = dict(details)
enriched["source_ip"] = source_ip
if not is_aggregate:
enriched["report_db_id"] = report.id
enriched["date_range"] = {
"begin": report.date_begin.isoformat() if report.date_begin else None,
"end": report.date_end.isoformat() if report.date_end else None,
}
return enriched
def _alert_view(alert: Alert, session: Session | None = None) -> SimpleNamespace:
details = _alert_details(alert)
details = _infer_alert_report_details(session, alert, details)
date_range = details.get("date_range") if isinstance(details.get("date_range"), dict) else {}
report_db_id = details.get("report_db_id")
report_db_ids = details.get("report_db_ids") if isinstance(details.get("report_db_ids"), list) else []
if not report_db_id and report_db_ids:
report_db_id = report_db_ids[-1]
if alert.type in {"sudden_unknown_failure_spike"}:
report_db_id = None
report_time = (
_parse_dt(date_range.get("end"))
or _parse_dt(date_range.get("begin"))
or _alert_report_time(alert)
)
return SimpleNamespace(
id=alert.id,
fingerprint=alert.fingerprint,
inbox_id=alert.inbox_id,
domain=alert.domain,
severity=alert.severity,
severity_class=_severity_class(alert.severity),
type=alert.type,
title=alert.title,
summary=alert.summary,
details_json=alert.details_json,
llm_summary=alert.llm_summary,
llm_risk=alert.llm_risk,
llm_recommended_action=alert.llm_recommended_action,
status=alert.status,
first_seen_at=alert.first_seen_at,
last_seen_at=alert.last_seen_at,
created_at=alert.created_at,
updated_at=alert.updated_at,
report_start=_parse_dt(date_range.get("begin")),
report_end=_parse_dt(date_range.get("end")),
report_time=report_time,
report_db_id=report_db_id,
report_db_ids=report_db_ids,
source_ip=details.get("source_ip"),
source_history=_source_history(session, alert.domain, details.get("source_ip"), alert.type, report_db_id) if session else None,
)
def _alert_views(alerts: list[Alert], session: Session | None = None) -> list[SimpleNamespace]:
return sorted((_alert_view(alert, session) for alert in alerts), key=lambda item: item.report_time, reverse=True)
def _prompt_settings(settings: Settings) -> list[SimpleNamespace]:
items = [
("System", settings.llm.system_prompt_path),
("Alert Explanation", settings.llm.alert_prompt_path),
("Posture Digest", settings.llm.digest_prompt_path),
("Weekly Summary", settings.llm.weekly_prompt_path),
]
prompts = []
for label, path in items:
prompt_path = Path(path)
try:
content = prompt_path.read_text(encoding="utf-8") if prompt_path.exists() else ""
except OSError:
content = ""
prompts.append(SimpleNamespace(label=label, path=path, exists=prompt_path.exists(), content=content))
return prompts
def _domain_trend(session: Session, domain: str) -> list[SimpleNamespace]:
rows = session.execute(select(Record, Report).join(Report).where(Report.domain == domain)).all()
by_day: dict[date, dict[str, int]] = {}
for record, report in rows:
stamp = report.date_end or report.date_begin or report.created_at
day = stamp.date()
bucket = by_day.setdefault(day, {"total": 0, "pass": 0, "fail": 0})
bucket["total"] += record.count
if record.dmarc_pass:
bucket["pass"] += record.count
else:
bucket["fail"] += record.count
return [
SimpleNamespace(date=day, total_messages=data["total"], dmarc_pass_count=data["pass"], dmarc_fail_count=data["fail"])
for day, data in sorted(by_day.items(), reverse=True)
]
def _domain_sources(session: Session, domain: str) -> list[SimpleNamespace]:
rows = session.execute(
select(Record)
.options(selectinload(Record.auth_results))
.join(Report)
.where(Report.domain == domain)
).scalars().all()
sources: dict[str, dict[str, object]] = {}
for record in rows:
source = sources.setdefault(
record.source_ip,
{"source_ip": record.source_ip, "count": 0, "pass": 0, "fail": 0, "known": None, "dkim": set()},
)
source["count"] = int(source["count"]) + record.count
source["pass"] = int(source["pass"]) + (record.count if record.dmarc_pass else 0)
source["fail"] = int(source["fail"]) + (0 if record.dmarc_pass else record.count)
if record.known_sender_name:
source["known"] = record.known_sender_name
for auth in record.auth_results:
if auth.auth_type == "dkim" and auth.domain:
source["dkim"].add(auth.domain)
return [
SimpleNamespace(
source_ip=str(item["source_ip"]),
count=int(item["count"]),
pass_count=int(item["pass"]),
fail_count=int(item["fail"]),
known_sender_name=item["known"],
dkim_domains=", ".join(sorted(item["dkim"])) or "none reported",
dmarc_pass=int(item["pass"]) >= int(item["fail"]),
)
for item in sorted(sources.values(), key=lambda entry: int(entry["count"]), reverse=True)
]
def _source_history(session: Session, domain: str, source_ip: str | None, alert_type: str, report_db_id: int | None) -> str | None:
if not source_ip:
return None
rows = session.execute(
select(
Report.id,
Record.count,
Record.dmarc_pass,
func.coalesce(Report.date_end, Report.date_begin, Report.created_at),
)
.select_from(Record)
.join(Report)
.where(Report.domain == domain, Record.source_ip == source_ip)
).all()
if not rows:
return None
by_report: dict[int, dict[str, object]] = {}
for report_id, count, dmarc_pass, stamp in rows:
parsed = _parse_dt(stamp.isoformat() if hasattr(stamp, "isoformat") else stamp)
item = by_report.setdefault(report_id, {"time": parsed, "messages": 0, "failed": 0})
item["messages"] = int(item["messages"]) + int(count or 0)
item["failed"] = int(item["failed"]) + (0 if dmarc_pass else int(count or 0))
reports = sorted(by_report.items(), key=lambda item: item[1]["time"] or datetime.min.replace(tzinfo=timezone.utc))
report_count = len(reports)
total_messages = sum(int(item["messages"]) for _, item in reports)
failed_messages = sum(int(item["failed"]) for _, item in reports)
first_day = reports[0][1]["time"]
last_day = reports[-1][1]["time"]
linked_time = by_report.get(report_db_id, {}).get("time") if report_db_id else first_day
linked_messages = int(by_report.get(report_db_id, {}).get("messages", 0)) if report_db_id else int(reports[0][1]["messages"])
later_reports = [
item for item in reports
if linked_time and item[1]["time"] and item[1]["time"] > linked_time
]
if alert_type in {"new_unknown_source", "dkim_authenticated_relay", "new_authenticated_source", "new_spf_authenticated_source"}:
noun = "relay path" if alert_type == "dkim_authenticated_relay" else "source"
if later_reports:
later_failed = sum(int(item["failed"]) for _, item in later_reports)
return f"First observed {noun}: {linked_messages} messages in the source report. Later seen in {len(later_reports)} more reports; {later_failed} failed messages afterward."
return f"First observed {noun}: {linked_messages or total_messages} messages in 1 report."
if report_count <= 1:
return f"First seen source: {total_messages} messages in 1 report."
date_text = f" since {_format_display_date(first_day)}" if first_day else ""
last_text = f"; latest {_format_display_date(last_day)}" if last_day else ""
return f"Repeat offender: seen in {report_count} reports{date_text}{last_text}; {failed_messages} failed messages."
def _record_auth_tooltip(record: Record, auth_type: str) -> str:
items = []
for auth in record.auth_results:
if auth.auth_type != auth_type:
continue
parts = []
if auth.domain:
parts.append(f"domain={auth.domain}")
if auth.selector:
parts.append(f"selector={auth.selector}")
if auth.scope:
parts.append(f"scope={auth.scope}")
if auth.result:
parts.append(f"result={auth.result}")
items.append(", ".join(parts) if parts else f"{auth_type.upper()} result without reported domain")
return "; ".join(items) if items else f"No {auth_type.upper()} auth result domain reported."
@app.get("/domains/{domain}", response_class=HTMLResponse, dependencies=[Depends(require_dashboard_auth)])
def domain_page(domain: str, request: Request, source_page: int = 1, alert_page: int = 1, report_page: int = 1, trend_page: int = 1, session: Session = Depends(get_db)):
metrics = domain_metrics(session, domain)
all_stats = _domain_trend(session, domain)
trend_page_size = 14
trend_page = max(1, trend_page)
trend_total = len(all_stats)
stats = all_stats[(trend_page - 1) * trend_page_size : trend_page * trend_page_size]
all_sources = _domain_sources(session, domain)
source_page_size = 25
source_page = max(1, source_page)
source_total = len(all_sources)
records = all_sources[(source_page - 1) * source_page_size : source_page * source_page_size]
alert_page_size = 10
alert_page = max(1, alert_page)
alert_total = session.scalar(select(func.count(Alert.id)).where(Alert.domain == domain, Alert.status == "open")) or 0
all_alerts = _alert_views(session.execute(select(Alert).where(Alert.domain == domain, Alert.status == "open")).scalars().all(), session)
alerts = all_alerts[(alert_page - 1) * alert_page_size : alert_page * alert_page_size]
report_page_size = 20
report_page = max(1, report_page)
report_total = session.scalar(select(func.count(Report.id)).where(Report.domain == domain)) or 0
reports = session.execute(
select(Report)
.where(Report.domain == domain)
.order_by(desc(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)))
.offset((report_page - 1) * report_page_size)
.limit(report_page_size)
).scalars().all()
reporters = session.execute(
select(Report.org_name, func.count(Report.id)).where(Report.domain == domain).group_by(Report.org_name).order_by(desc(func.count(Report.id))).limit(10)
).all()
dispositions = session.execute(
select(Record.disposition, func.sum(Record.count)).join(Report).where(Report.domain == domain).group_by(Record.disposition)
).all()
known_unknown = session.execute(
select(Record.is_known_sender, func.sum(Record.count)).join(Report).where(Report.domain == domain).group_by(Record.is_known_sender)
).all()
return templates.TemplateResponse(
"domain.html",
{
"request": request,
"domain": domain,
"metrics": metrics,
"stats": stats,
"trend_page": trend_page,
"trend_page_size": trend_page_size,
"trend_total": trend_total,
"records": records,
"source_page": source_page,
"source_page_size": source_page_size,
"source_total": source_total,
"alerts": alerts,
"alert_page": alert_page,
"alert_page_size": alert_page_size,
"alert_total": alert_total,
"reports": reports,
"report_page": report_page,
"report_page_size": report_page_size,
"report_total": report_total,
"reporters": reporters,
"dispositions": dispositions,
"known_unknown": known_unknown,
"summary": latest_summary(session, domain),
},
)
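# Illustrative sketch, not part of this commit: domain_page() repeats the same
# clamp-and-slice arithmetic for trends, sources and alerts. A helper along these
# lines could express it once. The name `paginate` and its signature are
# assumptions made for this example only.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], page: int, page_size: int) -> tuple[list[T], int]:
    """Return (items on the 1-based page, total item count).

    Pages below 1 are clamped to 1; pages past the end yield an empty list.
    """
    page = max(1, page)
    start = (page - 1) * page_size
    return list(items[start : start + page_size]), len(items)


# Third page of 60 rows at 25 per page holds the last 10 rows.
rows, total = paginate(list(range(60)), page=3, page_size=25)
```

# With such a helper, `records, source_total = paginate(all_sources, source_page, 25)`
# would replace the three separate statements used for each list above.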
@app.get("/reports/{report_id}", response_class=HTMLResponse, dependencies=[Depends(require_dashboard_auth)])
def report_page(report_id: int, request: Request, session: Session = Depends(get_db)):
report = session.scalar(select(Report).options(selectinload(Report.records).selectinload(Record.auth_results)).where(Report.id == report_id))
if not report:
raise HTTPException(status_code=404)
for record in report.records:
record.dkim_auth_tooltip = _record_auth_tooltip(record, "dkim")
record.spf_auth_tooltip = _record_auth_tooltip(record, "spf")
domain_alerts = session.execute(select(Alert).where(Alert.domain == report.domain)).scalars().all()
related_alerts = []
for view in _alert_views(domain_alerts, session):
same_report = view.report_db_id == report.id or report.id in view.report_db_ids
if same_report:
related_alerts.append(view)
return templates.TemplateResponse("report.html", {"request": request, "report": report, "alerts": related_alerts})
@app.get("/alerts", response_class=HTMLResponse, dependencies=[Depends(require_dashboard_auth)])
def alerts_page(
    request: Request,
    page: int = 1,
    domain: str | None = None,
    alert_type: str | None = None,
    severity: str | None = None,
    status: str | None = "open",
    date_from: str | None = None,
    date_to: str | None = None,
    session: Session = Depends(get_db),
):
    page = max(1, page)
    page_size = 25
    # Column filters run in SQL. Date filtering and the total are computed on the
    # alert views in memory, so no separate COUNT statement is needed.
    stmt = select(Alert)
    filters = []
    if domain:
        filters.append(Alert.domain == domain)
    if alert_type:
        filters.append(Alert.type == alert_type)
    if severity:
        filters.append(Alert.severity == severity)
    if status:
        filters.append(Alert.status == status)
    for item in filters:
        stmt = stmt.where(item)
    filtered_alerts = _alert_views(session.execute(stmt).scalars().all(), session)
    # The date range is inclusive and interpreted as UTC calendar days.
    start = _parse_dt(f"{date_from}T00:00:00+00:00") if date_from else None
    end = _parse_dt(f"{date_to}T23:59:59+00:00") if date_to else None
    if start:
        filtered_alerts = [alert for alert in filtered_alerts if alert.report_time >= start]
    if end:
        filtered_alerts = [alert for alert in filtered_alerts if alert.report_time <= end]
    total = len(filtered_alerts)
    alerts = filtered_alerts[(page - 1) * page_size : page * page_size]
    # Options for the filter dropdowns, narrowed by the current domain and status.
    domains = session.execute(select(Alert.domain).distinct().order_by(Alert.domain)).scalars().all()
    type_stmt = select(Alert.type).distinct().order_by(Alert.type)
    if domain:
        type_stmt = type_stmt.where(Alert.domain == domain)
    if status:
        type_stmt = type_stmt.where(Alert.status == status)
    alert_types = session.execute(type_stmt).scalars().all()
    severity_stmt = select(Alert.severity).distinct().order_by(Alert.severity)
    if domain:
        severity_stmt = severity_stmt.where(Alert.domain == domain)
    if status:
        severity_stmt = severity_stmt.where(Alert.status == status)
    severities = session.execute(severity_stmt).scalars().all()
    return templates.TemplateResponse(
        "alerts.html",
        {
            "request": request,
            "alerts": alerts,
            "domains": domains,
            "alert_types": alert_types,
            "severities": severities,
            "page": page,
            "page_size": page_size,
            "total": total,
            "selected_domain": domain or "",
            "selected_type": alert_type or "",
            "selected_severity": severity or "",
            "selected_status": status or "",
            "selected_date_from": date_from or "",
            "selected_date_to": date_to or "",
        },
    )
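# Illustrative sketch, not part of this commit: alerts_page() builds its inclusive
# date range by formatting "T00:00:00" and "T23:59:59" strings. One alternative is
# a half-open range [start, end) built from date objects, which also covers the
# final second of the last day. `day_bounds` and `in_range` are hypothetical
# names, and UTC is assumed because the original strings use a +00:00 offset.

```python
from __future__ import annotations

from datetime import date, datetime, time, timedelta, timezone


def day_bounds(date_from: str | None, date_to: str | None) -> tuple[datetime | None, datetime | None]:
    """Turn ISO dates (YYYY-MM-DD) into UTC bounds: start inclusive, end exclusive."""
    start = None
    end = None
    if date_from:
        start = datetime.combine(date.fromisoformat(date_from), time.min, tzinfo=timezone.utc)
    if date_to:
        # The exclusive end is midnight at the start of the following day.
        next_day = date.fromisoformat(date_to) + timedelta(days=1)
        end = datetime.combine(next_day, time.min, tzinfo=timezone.utc)
    return start, end


def in_range(moment: datetime, start: datetime | None, end: datetime | None) -> bool:
    """True when moment falls inside [start, end); a missing bound is open."""
    return (start is None or moment >= start) and (end is None or moment < end)


start, end = day_bounds("2026-05-01", "2026-05-02")
last_instant = datetime(2026, 5, 2, 23, 59, 59, 999999, tzinfo=timezone.utc)
```

# date.fromisoformat() raises ValueError on malformed input, which a route could
# translate into a 400 response.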
@app.get("/inboxes", response_class=HTMLResponse, dependencies=[Depends(require_dashboard_auth)])
def inboxes_page(request: Request, session: Session = Depends(get_db)):
statuses = {
status.inbox_id: status
for status in session.execute(select(InboxStatus).order_by(InboxStatus.inbox_id)).scalars().all()
}
inboxes = []
for configured in settings.inboxes:
status = statuses.pop(configured.id, None)
inboxes.append(
SimpleNamespace(
inbox_id=configured.id,
label=configured.label,
domain=configured.domain,
folder=configured.folder,
recipient=configured.recipient,
enabled=configured.enabled,
last_check_at=status.last_check_at if status else None,
last_success_at=status.last_success_at if status else None,
last_error=status.last_error if status else None,
last_new_messages=status.last_new_messages if status else 0,
last_reports_imported=status.last_reports_imported if status else 0,
)
)
inboxes.extend(statuses.values())
jobs = {}
for job in import_jobs.list():
jobs.setdefault(job.inbox_id, job.to_dict())
inbox_ids = [inbox.inbox_id for inbox in inboxes]
skipped_payloads = {inbox_id: [] for inbox_id in inbox_ids}
if inbox_ids:
skipped_rows = session.execute(
select(SkippedReportPayload)
.where(SkippedReportPayload.inbox_id.in_(inbox_ids))
.order_by(desc(SkippedReportPayload.created_at), desc(SkippedReportPayload.id))
.limit(500)
).scalars().all()
for row in skipped_rows:
skipped_payloads.setdefault(row.inbox_id, []).append(row)
return templates.TemplateResponse(
"inboxes.html",
{"request": request, "inboxes": inboxes, "jobs": jobs, "skipped_payloads": skipped_payloads},
)
def _inbox_status_payload(inbox_id: str, session: Session) -> dict:
try:
configured = settings.get_inbox(inbox_id)
except KeyError:
raise HTTPException(status_code=404, detail=f"Unknown inbox: {inbox_id}") from None
status = session.scalar(select(InboxStatus).where(InboxStatus.inbox_id == inbox_id))
return {
"inbox_id": configured.id,
"label": configured.label,
"domain": configured.domain,
"folder": configured.folder,
"recipient": configured.recipient,
"enabled": configured.enabled,
"last_check_at": status.last_check_at.isoformat() if status and status.last_check_at else None,
"last_success_at": status.last_success_at.isoformat() if status and status.last_success_at else None,
"last_error_at": status.last_error_at.isoformat() if status and status.last_error_at else None,
"last_error": status.last_error if status else None,
"last_new_messages": status.last_new_messages if status else 0,
"last_reports_imported": status.last_reports_imported if status else 0,
}
@app.get("/settings", response_class=HTMLResponse, dependencies=[Depends(require_dashboard_auth)])
def settings_page(request: Request):
env_status = {
settings.security.dashboard_username_env: bool(os.getenv(settings.security.dashboard_username_env)),
settings.security.dashboard_password_env: bool(os.getenv(settings.security.dashboard_password_env)),
settings.security.homepage_token_env: bool(os.getenv(settings.security.homepage_token_env)),
settings.llm.api_key_env: bool(os.getenv(settings.llm.api_key_env)),
}
for inbox in settings.inboxes:
env_status[inbox.username_env] = bool(os.getenv(inbox.username_env))
env_status[inbox.password_env] = bool(os.getenv(inbox.password_env))
if settings.alerts.email.enabled:
email = settings.alerts.email
for name in [
email.smtp_host_env,
email.smtp_port_env,
email.smtp_user_env,
email.smtp_password_env,
email.from_env,
email.to_env,
]:
env_status[name] = bool(os.getenv(name))
return templates.TemplateResponse(
"settings.html",
{
"request": request,
"settings": settings,
"env_status": env_status,
"config_path": os.getenv("DMARC_SENTINEL_CONFIG") or "config/config.yml",
"prompts": _prompt_settings(settings),
},
)
@app.get("/api/homepage", dependencies=[Depends(require_homepage_token)])
def api_homepage(session: Session = Depends(get_db)):
return homepage_summary(session)
@app.get("/api/homepage/{domain}", dependencies=[Depends(require_homepage_token)])
def api_homepage_domain(domain: str, session: Session = Depends(get_db)):
return domain_homepage_summary(session, domain)
@app.get("/api/domains", dependencies=[Depends(require_dashboard_auth)])
def api_domains(session: Session = Depends(get_db)):
return {"domains": session.execute(select(Report.domain).distinct()).scalars().all()}
def _overview_payload(session: Session, period: str = "all", domain: str | None = None, date_from: str | None = None, date_to: str | None = None) -> dict:
data = homepage_summary(session, period=period, domain=domain or None, date_from=date_from, date_to=date_to)
traffic = traffic_distribution(session, period=period, domain=domain or None, date_from=date_from, date_to=date_to)
return {
"period": period,
"period_label": data["scope_label"],
"domain": domain,
"metrics": data,
"buckets": traffic,
}
@app.get("/api/overview", dependencies=[Depends(require_dashboard_auth)])
def api_overview(period: str = "all", domain: str | None = None, date_from: str | None = None, date_to: str | None = None, session: Session = Depends(get_db)):
return _overview_payload(session, period=period, domain=domain, date_from=date_from, date_to=date_to)
@app.get("/api/traffic", dependencies=[Depends(require_dashboard_auth)])
def api_traffic(period: str = "all", domain: str | None = None, date_from: str | None = None, date_to: str | None = None, session: Session = Depends(get_db)):
payload = _overview_payload(session, period=period, domain=domain, date_from=date_from, date_to=date_to)
return {key: payload[key] for key in ["period", "period_label", "domain", "buckets"]}
def _latest_report_day(session: Session) -> date | None:
latest = session.scalar(select(func.max(func.coalesce(Report.date_end, Report.date_begin, Report.created_at))))
if isinstance(latest, str):
latest = _parse_dt(latest)
return latest.date() if latest else None
@app.post("/api/admin/scheduler/daily-summary", dependencies=dashboard_post_auth)
def api_generate_daily_summary(session: Session = Depends(get_db)):
if not settings.llm.generate_daily_summary:
raise HTTPException(status_code=400, detail="Daily LLM summaries are disabled.")
target_day = _latest_report_day(session)
if not target_day:
raise HTTPException(status_code=400, detail="No reports have been imported yet.")
generate_open_posture_summaries(settings, force=True)
summary = latest_summary(session)
return {"ok": True, "target_day": target_day.isoformat(), "summary": summary}
@app.get("/api/domains/{domain}/summary", dependencies=[Depends(require_dashboard_auth)])
def api_domain_summary(domain: str, session: Session = Depends(get_db)):
return domain_homepage_summary(session, domain)
@app.get("/api/domains/{domain}/reports", dependencies=[Depends(require_dashboard_auth)])
def api_domain_reports(domain: str, session: Session = Depends(get_db)):
reports = session.execute(
select(Report)
.where(Report.domain == domain)
.order_by(desc(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)))
.limit(100)
).scalars().all()
return {"reports": [{"id": r.id, "org_name": r.org_name, "report_id": r.report_id, "date_begin": r.date_begin, "date_end": r.date_end} for r in reports]}
@app.get("/api/domains/{domain}/sources", dependencies=[Depends(require_dashboard_auth)])
def api_domain_sources(domain: str, session: Session = Depends(get_db)):
rows = session.execute(select(Record.source_ip, func.sum(Record.count), func.max(Record.is_known_sender)).join(Report).where(Report.domain == domain).group_by(Record.source_ip)).all()
return {"sources": [{"source_ip": ip, "count": count, "known": bool(known)} for ip, count, known in rows]}
@app.get("/api/reports/{report_id}", dependencies=[Depends(require_dashboard_auth)])
def api_report(report_id: int, session: Session = Depends(get_db)):
report = session.scalar(select(Report).options(selectinload(Report.records)).where(Report.id == report_id))
if not report:
raise HTTPException(status_code=404)
return {
"id": report.id,
"domain": report.domain,
"org_name": report.org_name,
"report_id": report.report_id,
"records": [
{
"source_ip": row.source_ip,
"count": row.count,
"spf_aligned": row.spf_aligned,
"dkim_aligned": row.dkim_aligned,
"dmarc_pass": row.dmarc_pass,
"known_sender": row.known_sender_name,
"disposition": row.disposition,
}
for row in report.records
],
}
@app.get("/api/alerts", dependencies=[Depends(require_dashboard_auth)])
def api_alerts(session: Session = Depends(get_db)):
alerts = _alert_views(session.execute(select(Alert)).scalars().all(), session)
return {"alerts": [{"id": a.id, "severity": a.severity, "type": a.type, "title": a.title, "status": a.status, "llm_summary": a.llm_summary} for a in alerts]}
def _set_alert_status(alert_id: int, status: str, session: Session) -> dict:
alert = session.get(Alert, alert_id)
if not alert:
raise HTTPException(status_code=404)
alert.status = status
session.commit()
return {"ok": True, "status": status}
def _copy_summary_to_job(job_id: str, summary) -> None:
def mutate(job):
job.status = "running"
job.scanned_messages = summary.scanned_messages
job.processed_messages = summary.processed_messages
job.candidate_messages = summary.candidate_messages
job.valid_reports_imported = summary.valid_reports_imported
job.duplicate_messages_skipped = summary.duplicate_messages_skipped
job.duplicate_reports_skipped = summary.duplicate_reports_skipped
job.failed_messages = summary.failed_messages
job.rejected_messages = summary.rejected_messages
job.records_imported = summary.records_imported
job.alerts_created = summary.alerts_created
job.duplicate_report_samples = summary.duplicate_report_samples
import_jobs.update(job_id, mutate)
def _run_import_job(job_id: str, action: str, body: ProcessNowRequest | BacklogRequest, lease: InboxRunLease) -> None:
def mark_running(job):
job.status = "running"
import_jobs.update(job_id, mark_running)
try:
inbox = settings.get_inbox(body.inbox_id)
with session_scope() as session:
if action == "backlog":
assert isinstance(body, BacklogRequest)
summary = process_inbox(
session,
settings,
inbox,
folder=body.folder or inbox.folder,
mode="backlog",
since=body.since,
before=body.before,
limit=body.limit,
dry_run=body.dry_run,
reprocess=body.reprocess,
mark_seen=body.mark_seen,
progress_callback=lambda item: _copy_summary_to_job(job_id, item),
)
else:
assert isinstance(body, ProcessNowRequest)
summary = process_inbox(
session,
settings,
inbox,
mode=body.mode,
limit=body.limit,
progress_callback=lambda item: _copy_summary_to_job(job_id, item),
)
_copy_summary_to_job(job_id, summary)
def mark_done(job):
job.status = "succeeded"
job.completed_at = utcnow()
import_jobs.update(job_id, mark_done)
except Exception as exc:
error = str(exc)
def mark_failed(job):
job.status = "failed"
job.error = error
job.completed_at = utcnow()
import_jobs.update(job_id, mark_failed)
finally:
lease.release()
def _start_import_job(action: str, body: ProcessNowRequest | BacklogRequest) -> dict:
try:
settings.get_inbox(body.inbox_id)
except KeyError:
raise HTTPException(status_code=404, detail=f"Unknown inbox: {body.inbox_id}") from None
active = import_jobs.active_for_inbox(body.inbox_id)
if active:
return active.to_dict()
lease = inbox_run_locks.acquire(body.inbox_id, blocking=False)
if not lease:
raise HTTPException(status_code=409, detail=f"Inbox {body.inbox_id} is already processing.")
try:
job = import_jobs.create(body.inbox_id, action)
thread = threading.Thread(target=_run_import_job, args=(job.id, action, body, lease), daemon=True)
thread.start()
return job.to_dict()
except Exception:
lease.release()
raise
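# Illustrative sketch, not part of this commit: _start_import_job() takes a
# non-blocking per-inbox lease in the request thread and hands it to a worker
# thread, which releases it in a `finally` block. The real `inbox_run_locks` and
# `InboxRunLease` are not shown in this part of the diff; `KeyedLocks` and `Lease`
# below are hypothetical stand-ins that show one way the pattern can work.

```python
from __future__ import annotations

import threading


class Lease:
    """Handle for one held lock; release() is idempotent."""

    def __init__(self, lock: threading.Lock) -> None:
        self._lock = lock
        self._released = False

    def release(self) -> None:
        if not self._released:
            self._released = True
            self._lock.release()

    def __enter__(self) -> "Lease":
        return self

    def __exit__(self, *exc_info) -> None:
        self.release()


class KeyedLocks:
    """One lock per key, created on first use."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: dict[str, threading.Lock] = {}

    def acquire(self, key: str, blocking: bool = True) -> Lease | None:
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        # threading.Lock (unlike RLock) may be released by a different thread,
        # which is what allows the worker thread to release the lease.
        return Lease(lock) if lock.acquire(blocking=blocking) else None


locks = KeyedLocks()
first = locks.acquire("inbox-a", blocking=False)
second = locks.acquire("inbox-a", blocking=False)  # None while `first` is held
worker = threading.Thread(target=first.release)
worker.start()
worker.join()
third = locks.acquire("inbox-a", blocking=False)
```

# A plain Lock is used here rather than an RLock because an RLock can only be
# released by the thread that acquired it.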
@app.post("/api/alerts/{alert_id}/ack", dependencies=dashboard_post_auth)
def api_alert_ack(alert_id: int, session: Session = Depends(get_db)):
return _set_alert_status(alert_id, "acknowledged", session)
@app.post("/api/alerts/{alert_id}/resolve", dependencies=dashboard_post_auth)
def api_alert_resolve(alert_id: int, session: Session = Depends(get_db)):
return _set_alert_status(alert_id, "resolved", session)
@app.post("/api/alerts/{alert_id}/reopen", dependencies=dashboard_post_auth)
def api_alert_reopen(alert_id: int, session: Session = Depends(get_db)):
return _set_alert_status(alert_id, "open", session)
@app.post("/api/alerts/bulk", dependencies=dashboard_post_auth)
async def api_alert_bulk(request: Request, session: Session = Depends(get_db)):
    try:
        payload = await request.json()
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError (undecodable body bytes)
        # are both ValueError subclasses.
        raise HTTPException(status_code=400, detail="Request body must be valid JSON") from None
    if not isinstance(payload, dict):
        raise HTTPException(status_code=400, detail="Request body must be a JSON object")
    ids = parse_positive_int_ids(payload.get("ids", []))
    status = payload.get("status")
    if status not in {"open", "acknowledged", "resolved"}:
        raise HTTPException(status_code=400, detail="Invalid alert status")
    updated = 0
    if ids:
        alerts = session.execute(select(Alert).where(Alert.id.in_(ids))).scalars().all()
        for alert in alerts:
            alert.status = status
            updated += 1
        session.commit()
    return {"ok": True, "status": status, "updated": updated}
@app.post("/api/admin/import-jobs/process-now", dependencies=dashboard_post_auth)
def api_start_process_now(body: ProcessNowRequest):
return _start_import_job("process-now", body)
@app.post("/api/admin/import-jobs/backlog", dependencies=dashboard_post_auth)
def api_start_backlog(body: BacklogRequest):
return _start_import_job("backlog", body)
@app.get("/api/admin/import-jobs", dependencies=[Depends(require_dashboard_auth)])
def api_import_jobs(inbox_id: str | None = None):
return {"jobs": [job.to_dict() for job in import_jobs.list(inbox_id)]}
@app.get("/api/admin/import-jobs/{job_id}", dependencies=[Depends(require_dashboard_auth)])
def api_import_job(job_id: str):
job = import_jobs.get(job_id)
if not job:
raise HTTPException(status_code=404)
return job.to_dict()
@app.get("/api/admin/inboxes/{inbox_id}/status", dependencies=[Depends(require_dashboard_auth)])
def api_inbox_status(inbox_id: str, session: Session = Depends(get_db)):
return _inbox_status_payload(inbox_id, session)
@app.post("/api/admin/process-now", dependencies=dashboard_post_auth)
def api_process_now(body: ProcessNowRequest, session: Session = Depends(get_db)):
try:
inbox = settings.get_inbox(body.inbox_id)
except KeyError:
raise HTTPException(status_code=404, detail=f"Unknown inbox: {body.inbox_id}") from None
lease = inbox_run_locks.acquire(inbox.id, blocking=False)
if not lease:
raise HTTPException(status_code=409, detail=f"Inbox {inbox.id} is already processing.")
with lease:
summary = process_inbox(session, settings, inbox, mode=body.mode, limit=body.limit)
return summary.__dict__
@app.post("/api/admin/backlog", dependencies=dashboard_post_auth)
def api_backlog(body: BacklogRequest, session: Session = Depends(get_db)):
try:
inbox = settings.get_inbox(body.inbox_id)
except KeyError:
raise HTTPException(status_code=404, detail=f"Unknown inbox: {body.inbox_id}") from None
lease = inbox_run_locks.acquire(inbox.id, blocking=False)
if not lease:
raise HTTPException(status_code=409, detail=f"Inbox {inbox.id} is already processing.")
with lease:
summary = process_inbox(
session,
settings,
inbox,
folder=body.folder or inbox.folder,
mode="backlog",
since=body.since,
before=body.before,
limit=body.limit,
dry_run=body.dry_run,
reprocess=body.reprocess,
mark_seen=body.mark_seen,
)
return summary.__dict__
+414
@@ -0,0 +1,414 @@
from __future__ import annotations
import logging
import hashlib
import json
from dataclasses import dataclass
from datetime import date
from email.message import Message
from email.utils import getaddresses, parsedate_to_datetime
from typing import Callable
from sqlalchemy import select
from sqlalchemy.orm import Session
from app.alerts import send_alert_email
from app.analyzer import analyze_report
from app.attachment_extractor import AttachmentExtractionError, extract_dmarc_attachments, message_has_candidate_attachment
from app.config import InboxConfig, Settings
from app.dmarc_parser import DMARCParseError, parse_dmarc_xml
from app.imap_client import IMAPClient
from app.known_senders import classify_record
from app.llm import LLMClient
from app.models import Alert, AuthResult, InboxStatus, MailMessage, Record, Report, SkippedReportPayload, utcnow
logger = logging.getLogger(__name__)
@dataclass
class ProcessingSummary:
inbox_id: str
folder: str
scanned_messages: int = 0
processed_messages: int = 0
candidate_messages: int = 0
valid_reports_imported: int = 0
duplicate_messages_skipped: int = 0
duplicate_reports_skipped: int = 0
failed_messages: int = 0
records_imported: int = 0
alerts_created: int = 0
llm_explanations_generated: int = 0
rejected_messages: int = 0
duplicate_report_samples: list[dict[str, str | int | None]] | None = None
@property
def duplicates_skipped(self) -> int:
return self.duplicate_messages_skipped + self.duplicate_reports_skipped
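# Illustrative sketch, not part of this commit: the admin endpoints return
# `summary.__dict__`, which holds only the dataclass fields, so the derived
# `duplicates_skipped` property is not serialized. If the combined figure is
# wanted in a payload, it has to be added explicitly. `Summary` is a trimmed,
# hypothetical stand-in for ProcessingSummary.

```python
from dataclasses import asdict, dataclass


@dataclass
class Summary:
    inbox_id: str
    duplicate_messages_skipped: int = 0
    duplicate_reports_skipped: int = 0

    @property
    def duplicates_skipped(self) -> int:
        return self.duplicate_messages_skipped + self.duplicate_reports_skipped


summary = Summary("main", duplicate_messages_skipped=2, duplicate_reports_skipped=3)
# asdict() copies the fields only; properties must be merged in by hand.
payload = {**asdict(summary), "duplicates_skipped": summary.duplicates_skipped}
```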
def ensure_inbox_status(session: Session, inbox: InboxConfig) -> InboxStatus:
status = session.scalar(select(InboxStatus).where(InboxStatus.inbox_id == inbox.id))
if not status:
status = InboxStatus(
inbox_id=inbox.id,
label=inbox.label,
domain=inbox.domain,
folder=inbox.folder,
recipient=inbox.recipient,
enabled=inbox.enabled,
)
session.add(status)
session.flush()
else:
status.label = inbox.label
status.domain = inbox.domain
status.folder = inbox.folder
status.recipient = inbox.recipient
status.enabled = inbox.enabled
return status
def _headers(message: Message, names: list[str]) -> str:
return " ".join(str(message.get(name, "")) for name in names)
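def is_candidate_message(message: Message, inbox: InboxConfig) -> bool:
    """Cheap pre-filter: does this message look like it carries a DMARC report?"""
    recipients = _headers(message, ["To", "Cc", "Bcc", "Delivered-To", "X-Original-To", "Envelope-To"]).lower()
    subject = str(message.get("Subject", "")).lower()
    recipient = (inbox.recipient or "").lower()
    domain = (inbox.domain or "").lower()
    # An empty string is a substring of every string, so a blank recipient or
    # domain in the config must not turn every message into a candidate.
    return (
        bool(recipient and recipient in recipients)
        or "dmarc" in subject
        or bool(domain and domain in subject)
        or "report domain" in subject
        or message_has_candidate_attachment(message)
    )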
def _message_date(message: Message):
try:
parsed = parsedate_to_datetime(message.get("Date"))
return parsed if parsed.tzinfo else parsed.replace(tzinfo=utcnow().tzinfo)
except Exception:
return None
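def _recipient(message: Message) -> str | None:
    # Pass every header value to getaddresses() as its own list item. Joining
    # different headers with spaces can fuse neighbouring addresses into one
    # malformed entry.
    names = ["To", "Cc", "Bcc", "Delivered-To", "X-Original-To"]
    values = [str(value) for name in names for value in message.get_all(name, [])]
    addrs = [addr for _, addr in getaddresses(values) if addr]
    return ", ".join(addrs) or None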
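# Illustrative sketch, not part of this commit: collecting recipient addresses
# from several headers with the standard library. Each header value is passed to
# getaddresses() as a separate list item so that addresses from different headers
# are never run together. The message and addresses are invented for the example.

```python
from email.message import EmailMessage
from email.utils import getaddresses

msg = EmailMessage()
msg["To"] = "Reports <dmarc@example.com>"
msg["Cc"] = "ops@example.com, Second <second@example.com>"

# get_all() returns every occurrence of a header, or the fallback list if absent.
header_values = [str(value) for name in ("To", "Cc", "Bcc") for value in msg.get_all(name, [])]
addresses = [addr for _, addr in getaddresses(header_values) if addr]
```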
def _upsert_mail_message(session: Session, inbox: InboxConfig, folder: str, imap_message, status: str = "skipped") -> MailMessage:
existing = session.scalar(
select(MailMessage).where(
MailMessage.inbox_id == inbox.id,
MailMessage.folder == folder,
MailMessage.imap_uid == imap_message.uid,
)
)
if existing:
return existing
message = imap_message.message
mail = MailMessage(
inbox_id=inbox.id,
imap_uid=imap_message.uid,
message_id=message.get("Message-ID"),
folder=folder,
subject=message.get("Subject"),
sender=message.get("From"),
recipient=_recipient(message),
message_date=_message_date(message),
seen=imap_message.seen,
status=status,
)
session.add(mail)
session.flush()
return mail
def _domain_equal(a: str | None, b: str | None) -> bool:
return (a or "").lower().rstrip(".") == (b or "").lower().rstrip(".")
def _record_ingestion_rejection(
session: Session,
inbox: InboxConfig,
mail: MailMessage,
reason: str,
*,
stage: str,
) -> tuple[Alert, bool]:
digest = hashlib.sha256(f"{mail.inbox_id}:{mail.folder}:{mail.imap_uid}:{stage}:{reason}".encode()).hexdigest()[:24]
fingerprint = f"{inbox.domain}:ingestion_rejected:{digest}"
details = {
"stage": stage,
"reason": reason,
"inbox_id": inbox.id,
"folder": mail.folder,
"imap_uid": mail.imap_uid,
"message_id": mail.message_id,
"subject": mail.subject,
"sender": mail.sender,
}
existing = session.scalar(select(Alert).where(Alert.fingerprint == fingerprint, Alert.status == "open"))
now = utcnow()
if existing:
existing.last_seen_at = now
existing.updated_at = now
existing.details_json = json.dumps(details, sort_keys=True, default=str)
return existing, False
alert = Alert(
fingerprint=fingerprint,
inbox_id=inbox.id,
domain=inbox.domain,
severity="warning",
type="ingestion_rejected",
title=f"DMARC payload rejected for {inbox.label}",
summary=f"A message in {inbox.folder} was rejected during {stage}: {reason}",
details_json=json.dumps(details, sort_keys=True, default=str),
first_seen_at=now,
last_seen_at=now,
)
session.add(alert)
session.flush()
return alert, True
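# Illustrative sketch, not part of this commit: the rejection fingerprint is a
# truncated SHA-256 over the message coordinates plus stage and reason, so the
# same rejection of the same message maps to the same open alert. The function
# name `rejection_fingerprint` and the sample values are assumptions for this
# example; the format string mirrors _record_ingestion_rejection().

```python
import hashlib


def rejection_fingerprint(domain: str, inbox_id: str, folder: str, imap_uid: int, stage: str, reason: str) -> str:
    """Build a stable alert key from the message identity and the failure."""
    raw = f"{inbox_id}:{folder}:{imap_uid}:{stage}:{reason}"
    digest = hashlib.sha256(raw.encode()).hexdigest()[:24]
    return f"{domain}:ingestion_rejected:{digest}"


a = rejection_fingerprint("example.com", "main", "INBOX", 42, "DMARC XML validation", "too many records")
b = rejection_fingerprint("example.com", "main", "INBOX", 42, "DMARC XML validation", "too many records")
c = rejection_fingerprint("example.com", "main", "INBOX", 43, "DMARC XML validation", "too many records")
```

# Because the reason text is part of the hash, a differently worded error for the
# same message produces a separate alert.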
def _duplicate_report_sample(existing: Report, mail: MailMessage) -> dict[str, str | int | None]:
return {
"existing_report_db_id": existing.id,
"existing_report_id": existing.report_id,
"reporting_org": existing.org_name,
"report_date": (existing.date_end or existing.date_begin).date().isoformat() if (existing.date_end or existing.date_begin) else None,
"duplicate_message_uid": mail.imap_uid,
"duplicate_message_id": mail.message_id,
}
def _record_duplicate_report_payload(session: Session, inbox: InboxConfig, mail: MailMessage, existing: Report, sha256: str) -> None:
skipped = session.scalar(
select(SkippedReportPayload).where(
SkippedReportPayload.inbox_id == inbox.id,
SkippedReportPayload.folder == mail.folder,
SkippedReportPayload.imap_uid == mail.imap_uid,
SkippedReportPayload.raw_xml_sha256 == sha256,
SkippedReportPayload.reason == "duplicate_report_payload",
)
)
report_date = (existing.date_end or existing.date_begin).date() if (existing.date_end or existing.date_begin) else None
if not skipped:
skipped = SkippedReportPayload(
inbox_id=inbox.id,
folder=mail.folder,
imap_uid=mail.imap_uid,
message_id=mail.message_id,
mail_message_id=mail.id,
reason="duplicate_report_payload",
raw_xml_sha256=sha256,
)
session.add(skipped)
skipped.message_id = mail.message_id
skipped.mail_message_id = mail.id
skipped.existing_report_id = existing.id
skipped.report_identifier = existing.report_id
skipped.reporting_org = existing.org_name
skipped.report_date = report_date
def _store_report(session: Session, settings: Settings, inbox: InboxConfig, mail: MailMessage, extracted) -> tuple[Report | None, Report | None]:
existing = session.scalar(select(Report).where(Report.raw_xml_sha256 == extracted.sha256))
if existing:
return None, existing
parsed = parse_dmarc_xml(
extracted.payload,
max_records=settings.app.max_xml_records_per_report,
max_record_count=settings.app.max_record_count,
max_future_days=settings.app.max_report_future_days,
max_past_days=settings.app.max_report_past_days,
)
if not _domain_equal(parsed.domain, inbox.domain):
raise DMARCParseError(f"Report domain {parsed.domain} does not match inbox domain {inbox.domain}")
report = Report(
inbox_id=inbox.id,
mail_message_id=mail.id,
raw_xml_sha256=extracted.sha256,
report_id=parsed.report_id,
org_name=parsed.org_name,
org_email=parsed.org_email,
extra_contact_info=parsed.extra_contact_info,
domain=parsed.domain,
date_begin=parsed.date_begin,
date_end=parsed.date_end,
policy_p=parsed.policy_p,
policy_sp=parsed.policy_sp,
policy_pct=parsed.policy_pct,
adkim=parsed.adkim,
aspf=parsed.aspf,
fo=parsed.fo,
)
session.add(report)
session.flush()
for parsed_record in parsed.records:
match = classify_record(settings, parsed.domain, parsed_record)
record = Record(
report=report,
source_ip=parsed_record.source_ip,
count=parsed_record.count,
disposition=parsed_record.disposition,
policy_dkim=parsed_record.policy_dkim,
policy_spf=parsed_record.policy_spf,
dkim_aligned=parsed_record.dkim_aligned,
spf_aligned=parsed_record.spf_aligned,
dmarc_pass=parsed_record.dmarc_pass,
header_from=parsed_record.header_from,
reason_type=parsed_record.reason_type,
reason_comment=parsed_record.reason_comment,
known_sender_id=match.id,
known_sender_name=match.name,
is_known_sender=match.is_known,
)
session.add(record)
session.flush()
for auth in parsed_record.auth_results:
session.add(
AuthResult(
record=record,
auth_type=auth.auth_type,
domain=auth.domain,
selector=auth.selector,
scope=auth.scope,
result=auth.result,
human_result=auth.human_result,
)
)
session.flush()
return report, None
def process_inbox(
session: Session,
settings: Settings,
inbox: InboxConfig,
*,
folder: str | None = None,
mode: str = "new",
since: date | None = None,
before: date | None = None,
limit: int | None = None,
dry_run: bool = False,
reprocess: bool = False,
mark_seen: bool = False,
progress_callback: Callable[[ProcessingSummary], None] | None = None,
) -> ProcessingSummary:
folder = folder or inbox.folder
summary = ProcessingSummary(inbox_id=inbox.id, folder=folder)
status = ensure_inbox_status(session, inbox)
status.last_check_at = utcnow()
llm = LLMClient(settings)
try:
with IMAPClient(inbox) as client:
client.select_folder(folder)
uids = client.search_uids(unread_only=mode == "new", since=since, before=before, limit=limit or settings.app.max_reports_per_poll)
summary.scanned_messages = len(uids)
if progress_callback:
progress_callback(summary)
for uid in uids:
try:
imap_message = client.fetch_message(uid)
mail = _upsert_mail_message(session, inbox, folder, imap_message)
if mail.status == "success" and not reprocess:
summary.duplicate_messages_skipped += 1
continue
if not is_candidate_message(imap_message.message, inbox):
mail.status = "skipped"
mail.processed_at = utcnow()
continue
summary.candidate_messages += 1
if dry_run:
continue
reports = extract_dmarc_attachments(
imap_message.message,
settings.app.max_attachment_decompressed_mb,
max_compressed_mb=settings.app.max_attachment_compressed_mb,
max_attachments=settings.app.max_attachments_per_message,
max_reports_per_message=settings.app.max_reports_per_message,
max_reports_per_archive=settings.app.max_reports_per_archive,
max_compression_ratio=settings.app.max_archive_compression_ratio,
)
imported_any = False
for extracted in reports:
report, duplicate_report = _store_report(session, settings, inbox, mail, extracted)
if duplicate_report:
_record_duplicate_report_payload(session, inbox, mail, duplicate_report, extracted.sha256)
summary.duplicate_reports_skipped += 1
if summary.duplicate_report_samples is None:
summary.duplicate_report_samples = []
if len(summary.duplicate_report_samples) < 100:
summary.duplicate_report_samples.append(_duplicate_report_sample(duplicate_report, mail))
continue
if report:
imported_any = True
summary.valid_reports_imported += 1
summary.records_imported += len(report.records)
alerts = analyze_report(session, settings, report, llm=llm)
new_alerts = [item for item in alerts if item[1]]
summary.alerts_created += len(new_alerts)
summary.llm_explanations_generated += len([item for item in alerts if item[1] and item[0].severity in {"warning", "critical"}])
for alert, is_new, severity_increased in alerts:
if is_new or severity_increased:
send_alert_email(settings, alert, severity_increased=severity_increased)
mail.status = "success" if imported_any else "skipped"
mail.error = None
mail.processed_at = utcnow()
if inbox.mark_seen_after_success or mark_seen:
client.mark_seen(uid)
if imported_any and inbox.move_after_success and inbox.processed_folder:
client.move(uid, inbox.processed_folder)
session.commit()
except Exception as exc:
session.rollback()
summary.failed_messages += 1
logger.exception("Message UID %s failed: %s", uid, exc)
try:
imap_message = client.fetch_message(uid)
mail = _upsert_mail_message(session, inbox, folder, imap_message)
mail.status = "failed"
mail.error = str(exc)
mail.processed_at = utcnow()
if isinstance(exc, (AttachmentExtractionError, DMARCParseError)):
alert, is_new = _record_ingestion_rejection(
session,
inbox,
mail,
str(exc),
stage="attachment extraction" if isinstance(exc, AttachmentExtractionError) else "DMARC XML validation",
)
summary.rejected_messages += 1
if is_new:
summary.alerts_created += 1
send_alert_email(settings, alert)
session.commit()
if inbox.move_after_failure and inbox.failed_folder:
client.move(uid, inbox.failed_folder)
except Exception:
session.rollback()
logger.exception("Could not record failure state for message UID %s", uid)
finally:
summary.processed_messages += 1
if progress_callback:
progress_callback(summary)
status.last_success_at = utcnow()
status.last_error = None
status.last_new_messages = summary.scanned_messages
status.last_reports_imported = summary.valid_reports_imported
session.commit()
logger.info("Poll complete for %s: %s", inbox.id, summary)
except Exception as exc:
session.rollback()
status = ensure_inbox_status(session, inbox)
status.last_error_at = utcnow()
status.last_error = str(exc)
session.commit()
logger.exception("Inbox processing failed for %s: %s", inbox.id, exc)
raise
return summary
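The loop above isolates each message: a failure rolls back only that message's work, is counted, and the loop moves on, while `finally` keeps the progress counter accurate. A minimal standalone sketch of that pattern follows, assuming a session object with `add`/`commit`/`rollback`; `FakeSession`, `Summary`, `process_all`, and `double_positive` are hypothetical stand-ins, not the application's classes.

```python
from dataclasses import dataclass, field


@dataclass
class Summary:
    processed: int = 0
    imported: int = 0
    failed: int = 0


@dataclass
class FakeSession:
    """Hypothetical stand-in for a database session."""

    pending: list = field(default_factory=list)
    committed: list = field(default_factory=list)

    def add(self, item):
        self.pending.append(item)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


def process_all(session, items, handler):
    """Process items one by one; one bad item never aborts the batch."""
    summary = Summary()
    for item in items:
        try:
            session.add(handler(item))
            session.commit()
            summary.imported += 1
        except Exception:
            # Discard this item's partial work and keep going.
            session.rollback()
            summary.failed += 1
        finally:
            # Runs on success and on failure, so progress stays accurate.
            summary.processed += 1
    return summary


def double_positive(value):
    if value <= 0:
        raise ValueError("value must be positive")
    return value * 2


session = FakeSession()
result = process_all(session, [1, -1, 3], double_positive)
```

Committing per item keeps earlier successes durable when a later item fails, at the cost of one transaction per message.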
+205
View File
@@ -0,0 +1,205 @@
from __future__ import annotations

from datetime import date, datetime, timezone

from sqlalchemy import Boolean, Date, DateTime, Float, ForeignKey, Integer, String, Text, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column, relationship

from app.db import Base


def utcnow() -> datetime:
    return datetime.now(timezone.utc)


class InboxStatus(Base):
    __tablename__ = "inbox_statuses"

    id: Mapped[int] = mapped_column(primary_key=True)
    inbox_id: Mapped[str] = mapped_column(String(120), unique=True, index=True)
    label: Mapped[str] = mapped_column(String(200))
    domain: Mapped[str] = mapped_column(String(255), index=True)
    folder: Mapped[str] = mapped_column(String(255))
    recipient: Mapped[str] = mapped_column(String(320))
    enabled: Mapped[bool] = mapped_column(Boolean, default=True)
    last_check_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    last_success_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    last_error_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    last_error: Mapped[str | None] = mapped_column(Text)
    last_new_messages: Mapped[int] = mapped_column(Integer, default=0)
    last_reports_imported: Mapped[int] = mapped_column(Integer, default=0)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow, onupdate=utcnow)


class MailMessage(Base):
    __tablename__ = "mail_messages"
    __table_args__ = (UniqueConstraint("inbox_id", "folder", "imap_uid", name="uq_message_uid"),)

    id: Mapped[int] = mapped_column(primary_key=True)
    inbox_id: Mapped[str] = mapped_column(String(120), index=True)
    imap_uid: Mapped[str] = mapped_column(String(120))
    message_id: Mapped[str | None] = mapped_column(String(500))
    folder: Mapped[str] = mapped_column(String(255))
    subject: Mapped[str | None] = mapped_column(Text)
    sender: Mapped[str | None] = mapped_column(Text)
    recipient: Mapped[str | None] = mapped_column(Text)
    message_date: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    seen: Mapped[bool] = mapped_column(Boolean, default=False)
    status: Mapped[str] = mapped_column(String(40), default="skipped")
    error: Mapped[str | None] = mapped_column(Text)
    processed_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)

    reports: Mapped[list["Report"]] = relationship(back_populates="mail_message")


class Report(Base):
    __tablename__ = "reports"

    id: Mapped[int] = mapped_column(primary_key=True)
    inbox_id: Mapped[str] = mapped_column(String(120), index=True)
    mail_message_id: Mapped[int | None] = mapped_column(ForeignKey("mail_messages.id"))
    raw_xml_sha256: Mapped[str] = mapped_column(String(64), unique=True, index=True)
    report_id: Mapped[str | None] = mapped_column(String(500), index=True)
    org_name: Mapped[str | None] = mapped_column(String(255), index=True)
    org_email: Mapped[str | None] = mapped_column(String(320))
    extra_contact_info: Mapped[str | None] = mapped_column(Text)
    domain: Mapped[str] = mapped_column(String(255), index=True)
    date_begin: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), index=True)
    date_end: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), index=True)
    policy_p: Mapped[str | None] = mapped_column(String(40))
    policy_sp: Mapped[str | None] = mapped_column(String(40))
    policy_pct: Mapped[int | None] = mapped_column(Integer)
    adkim: Mapped[str | None] = mapped_column(String(20))
    aspf: Mapped[str | None] = mapped_column(String(20))
    fo: Mapped[str | None] = mapped_column(String(80))
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)

    mail_message: Mapped[MailMessage | None] = relationship(back_populates="reports")
    records: Mapped[list["Record"]] = relationship(back_populates="report", cascade="all, delete-orphan")


class Record(Base):
    __tablename__ = "records"

    id: Mapped[int] = mapped_column(primary_key=True)
    report_id: Mapped[int] = mapped_column(ForeignKey("reports.id"), index=True)
    source_ip: Mapped[str] = mapped_column(String(80), index=True)
    source_reverse_dns: Mapped[str | None] = mapped_column(String(255))
    source_asn: Mapped[str | None] = mapped_column(String(80))
    source_country: Mapped[str | None] = mapped_column(String(80))
    count: Mapped[int] = mapped_column(Integer, default=0)
    disposition: Mapped[str | None] = mapped_column(String(40), index=True)
    policy_dkim: Mapped[str | None] = mapped_column(String(40))
    policy_spf: Mapped[str | None] = mapped_column(String(40))
    dkim_aligned: Mapped[bool] = mapped_column(Boolean, default=False)
    spf_aligned: Mapped[bool] = mapped_column(Boolean, default=False)
    dmarc_pass: Mapped[bool] = mapped_column(Boolean, default=False)
    header_from: Mapped[str | None] = mapped_column(String(255), index=True)
    reason_type: Mapped[str | None] = mapped_column(String(120))
    reason_comment: Mapped[str | None] = mapped_column(Text)
    known_sender_id: Mapped[str | None] = mapped_column(String(120), index=True)
    known_sender_name: Mapped[str | None] = mapped_column(String(255))
    is_known_sender: Mapped[bool] = mapped_column(Boolean, default=False, index=True)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)

    report: Mapped[Report] = relationship(back_populates="records")
    auth_results: Mapped[list["AuthResult"]] = relationship(back_populates="record", cascade="all, delete-orphan")


class AuthResult(Base):
    __tablename__ = "auth_results"

    id: Mapped[int] = mapped_column(primary_key=True)
    record_id: Mapped[int] = mapped_column(ForeignKey("records.id"), index=True)
    auth_type: Mapped[str] = mapped_column(String(20), index=True)
    domain: Mapped[str | None] = mapped_column(String(255), index=True)
    selector: Mapped[str | None] = mapped_column(String(120))
    scope: Mapped[str | None] = mapped_column(String(120))
    result: Mapped[str | None] = mapped_column(String(120))
    human_result: Mapped[str | None] = mapped_column(Text)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)

    record: Mapped[Record] = relationship(back_populates="auth_results")


class SkippedReportPayload(Base):
    __tablename__ = "skipped_report_payloads"
    __table_args__ = (
        UniqueConstraint("inbox_id", "folder", "imap_uid", "raw_xml_sha256", "reason", name="uq_skipped_report_payload"),
    )

    id: Mapped[int] = mapped_column(primary_key=True)
    inbox_id: Mapped[str] = mapped_column(String(120), index=True)
    folder: Mapped[str] = mapped_column(String(255))
    imap_uid: Mapped[str] = mapped_column(String(120))
    message_id: Mapped[str | None] = mapped_column(String(500))
    mail_message_id: Mapped[int | None] = mapped_column(ForeignKey("mail_messages.id"))
    reason: Mapped[str] = mapped_column(String(80), index=True)
    raw_xml_sha256: Mapped[str | None] = mapped_column(String(64), index=True)
    existing_report_id: Mapped[int | None] = mapped_column(ForeignKey("reports.id"))
    report_identifier: Mapped[str | None] = mapped_column(String(500), index=True)
    reporting_org: Mapped[str | None] = mapped_column(String(255), index=True)
    report_date: Mapped[date | None] = mapped_column(Date, index=True)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)


class Alert(Base):
    __tablename__ = "alerts"

    id: Mapped[int] = mapped_column(primary_key=True)
    fingerprint: Mapped[str] = mapped_column(String(500), unique=True, index=True)
    inbox_id: Mapped[str] = mapped_column(String(120), index=True)
    domain: Mapped[str] = mapped_column(String(255), index=True)
    severity: Mapped[str] = mapped_column(String(40), index=True)
    type: Mapped[str] = mapped_column(String(120), index=True)
    title: Mapped[str] = mapped_column(String(500))
    summary: Mapped[str] = mapped_column(Text)
    details_json: Mapped[str] = mapped_column(Text, default="{}")
    llm_summary: Mapped[str | None] = mapped_column(Text)
    llm_risk: Mapped[str | None] = mapped_column(Text)
    llm_recommended_action: Mapped[str | None] = mapped_column(Text)
    status: Mapped[str] = mapped_column(String(40), default="open", index=True)
    first_seen_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
    last_seen_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow, onupdate=utcnow)


class DailyStat(Base):
    __tablename__ = "daily_stats"
    __table_args__ = (UniqueConstraint("domain", "date", name="uq_daily_stat_domain_date"),)

    id: Mapped[int] = mapped_column(primary_key=True)
    domain: Mapped[str] = mapped_column(String(255), index=True)
    date: Mapped[date] = mapped_column(Date, index=True)
    total_messages: Mapped[int] = mapped_column(Integer, default=0)
    dmarc_pass_count: Mapped[int] = mapped_column(Integer, default=0)
    dmarc_fail_count: Mapped[int] = mapped_column(Integer, default=0)
    spf_aligned_count: Mapped[int] = mapped_column(Integer, default=0)
    spf_failed_count: Mapped[int] = mapped_column(Integer, default=0)
    dkim_aligned_count: Mapped[int] = mapped_column(Integer, default=0)
    dkim_failed_count: Mapped[int] = mapped_column(Integer, default=0)
    unknown_source_count: Mapped[int] = mapped_column(Integer, default=0)
    known_source_count: Mapped[int] = mapped_column(Integer, default=0)
    quarantine_count: Mapped[int] = mapped_column(Integer, default=0)
    reject_count: Mapped[int] = mapped_column(Integer, default=0)
    top_reporters_json: Mapped[str] = mapped_column(Text, default="[]")
    top_sources_json: Mapped[str] = mapped_column(Text, default="[]")
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow, onupdate=utcnow)


class LLMReport(Base):
    __tablename__ = "llm_reports"

    id: Mapped[int] = mapped_column(primary_key=True)
    domain: Mapped[str] = mapped_column(String(255), index=True)
    period_start: Mapped[datetime] = mapped_column(DateTime(timezone=True), index=True)
    period_end: Mapped[datetime] = mapped_column(DateTime(timezone=True), index=True)
    report_type: Mapped[str] = mapped_column(String(40), index=True)
    input_json: Mapped[str] = mapped_column(Text)
    output_json: Mapped[str] = mapped_column(Text)
    plain_text: Mapped[str] = mapped_column(Text)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
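`Report.raw_xml_sha256` is declared unique with a 64-character column, so a given XML payload can be stored only once. A minimal sketch of content-hash deduplication follows, assuming the digest is a hex SHA-256 over the raw XML bytes (the hashing code itself is not part of this file); `payload_digest` and `dedupe` are hypothetical names used only for illustration.

```python
import hashlib


def payload_digest(raw_xml: bytes) -> str:
    """Return the hex SHA-256 of a payload (64 characters)."""
    return hashlib.sha256(raw_xml).hexdigest()


def dedupe(payloads):
    """Split payloads into first-seen entries and a duplicate count."""
    seen = set()
    unique = []
    duplicates = 0
    for raw in payloads:
        digest = payload_digest(raw)
        if digest in seen:
            # Same bytes were already accepted; skip instead of storing twice.
            duplicates += 1
            continue
        seen.add(digest)
        unique.append((digest, raw))
    return unique, duplicates


unique, duplicates = dedupe(
    [
        b"<feedback>a</feedback>",
        b"<feedback>b</feedback>",
        b"<feedback>a</feedback>",
    ]
)
```

In the database the unique constraint plays the role of the `seen` set, so a second insert of the same digest is rejected rather than silently duplicated.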
+541
View File
@@ -0,0 +1,541 @@
from __future__ import annotations

import json
import logging
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

from apscheduler.schedulers.background import BackgroundScheduler
from sqlalchemy import desc, func, select
from sqlalchemy.orm import Session

from app.alerts import send_digest_email
from app.config import Settings
from app.db import session_scope
from app.inbox_locks import inbox_run_locks
from app.llm import LLMClient
from app.message_processor import process_inbox
from app.models import Alert, DailyStat, LLMReport, Record, Report, utcnow

logger = logging.getLogger(__name__)
scheduler: BackgroundScheduler | None = None


def _as_utc(value: datetime | str | None) -> datetime | None:
    if value is None:
        return None
    if isinstance(value, str):
        try:
            value = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            return None
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value


def poll_all(settings: Settings) -> None:
    logger.info("Poll start")
    with session_scope() as session:
        for inbox in settings.enabled_inboxes():
            lease = inbox_run_locks.acquire(inbox.id, blocking=False)
            if not lease:
                logger.info("Skipping inbox %s because another import is already running", inbox.id)
                continue
            try:
                with lease:
                    process_inbox(session, settings, inbox, mode="new")
            except Exception as exc:
                logger.warning("Polling inbox %s failed: %s", inbox.id, exc)
    logger.info("Poll end")


def _domain_records(session: Session, domain: str, start: datetime, end: datetime) -> list[Record]:
    return session.execute(
        select(Record)
        .join(Report)
        .where(
            Report.domain == domain,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) >= start,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) < end,
        )
    ).scalars().all()


def aggregate_daily_stats(session: Session, domain: str, day: date) -> DailyStat:
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    records = _domain_records(session, domain, start, end)
    total = sum(row.count for row in records)
    reporters = session.execute(
        select(Report.org_name, func.count(Report.id))
        .where(
            Report.domain == domain,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) >= start,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) < end,
        )
        .group_by(Report.org_name)
    ).all()
    # Sum counts per source IP so an address reported by several receivers
    # appears once in the top list instead of once per record.
    source_totals: dict[str, int] = {}
    for row in records:
        source_totals[row.source_ip] = source_totals.get(row.source_ip, 0) + row.count
    sources = sorted(source_totals.items(), key=lambda item: item[1], reverse=True)[:10]
    stat = session.scalar(select(DailyStat).where(DailyStat.domain == domain, DailyStat.date == day))
    if not stat:
        stat = DailyStat(domain=domain, date=day)
        session.add(stat)
    stat.total_messages = total
    stat.dmarc_pass_count = sum(row.count for row in records if row.dmarc_pass)
    stat.dmarc_fail_count = total - stat.dmarc_pass_count
    stat.spf_aligned_count = sum(row.count for row in records if row.spf_aligned)
    stat.spf_failed_count = total - stat.spf_aligned_count
    stat.dkim_aligned_count = sum(row.count for row in records if row.dkim_aligned)
    stat.dkim_failed_count = total - stat.dkim_aligned_count
    stat.unknown_source_count = len({row.source_ip for row in records if not row.is_known_sender})
    stat.known_source_count = len({row.source_ip for row in records if row.is_known_sender})
    stat.quarantine_count = sum(row.count for row in records if row.disposition == "quarantine")
    stat.reject_count = sum(row.count for row in records if row.disposition == "reject")
    stat.top_reporters_json = json.dumps([{"org": org, "reports": count} for org, count in reporters if org])
    stat.top_sources_json = json.dumps([{"source_ip": ip, "count": count} for ip, count in sources])
    return stat


def _summary_payload(session: Session, domain: str, day: date, stat: DailyStat) -> dict:
    period_start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    period_end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=timezone.utc)
    critical = session.scalar(select(func.count(Alert.id)).where(Alert.domain == domain, Alert.status == "open", Alert.severity == "critical")) or 0
    warnings = session.scalar(select(func.count(Alert.id)).where(Alert.domain == domain, Alert.status == "open", Alert.severity == "warning")) or 0
    # Severity is stored as a string, so ordering by it in SQL is lexicographic
    # ("warning" would sort ahead of "critical"). Rank it explicitly instead;
    # severities missing from the map sort last.
    severity_rank = {"critical": 2, "warning": 1}
    open_alerts = session.execute(
        select(Alert)
        .where(Alert.domain == domain, Alert.status == "open")
        .order_by(Alert.updated_at.desc())
    ).scalars().all()
    alerts = sorted(open_alerts, key=lambda alert: severity_rank.get(alert.severity, 0), reverse=True)[:10]
    reports = session.execute(
        select(Report.org_name, func.count(Report.id))
        .where(
            Report.domain == domain,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) >= period_start,
            func.coalesce(Report.date_end, Report.date_begin, Report.created_at) < period_end,
        )
        .group_by(Report.org_name)
        .order_by(desc(func.count(Report.id)))
        .limit(10)
    ).all()
    total = stat.total_messages
    return {
        "task": "daily_dmarc_summary",
        "domain": domain,
        "period": day.isoformat(),
        "required_json_schema": {
            "headline": "string, specific concise headline for the report period",
            "summary": "string, 2-4 sentences using the supplied metrics, sources, reporters and alerts",
            "action_items": "array of specific action strings based on the telemetry",
            "business_risk": "string, concise risk statement; do not claim compromise from DMARC aggregate data alone",
        },
        "metrics": {
            "total_messages": total,
            "dmarc_passed": stat.dmarc_pass_count,
            "dmarc_failed": stat.dmarc_fail_count,
            "dmarc_pass_rate": round(stat.dmarc_pass_count / total * 100, 2) if total else 0,
            "spf_alignment_rate": round(stat.spf_aligned_count / total * 100, 2) if total else 0,
            "dkim_alignment_rate": round(stat.dkim_aligned_count / total * 100, 2) if total else 0,
            "unknown_sources": stat.unknown_source_count,
            "critical_alerts": critical,
            "warnings": warnings,
        },
        "top_sources": json.loads(stat.top_sources_json or "[]"),
        "reporters": [{"org": org or "unknown", "reports": count} for org, count in reports],
        "alerts": [
            {
                "severity": alert.severity,
                "type": alert.type,
                "title": alert.title,
                "summary": alert.summary,
                "details": json.loads(alert.details_json or "{}"),
            }
            for alert in alerts
        ],
        "instruction": (
            "Write an actual operational DMARC daily summary for the admin. Mention exact pass/fail counts and rates, "
            "important unknown or failing sources, relevant reporters, and concrete next actions. Do not provide generic "
            "advice if the telemetry supports a specific recommendation. Return only JSON matching required_json_schema."
        ),
    }


def _posture_payload(session: Session, domain: str) -> tuple[dict, datetime, datetime]:
    bounds = session.execute(
        select(
            func.min(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)),
            func.max(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)),
        ).where(Report.domain == domain)
    ).one()
    period_start = _as_utc(bounds[0]) or datetime.now(timezone.utc)
    period_end = _as_utc(bounds[1]) or period_start
    records = session.execute(select(Record).join(Report).where(Report.domain == domain)).scalars().all()
    reports = session.execute(
        select(Report.org_name, func.count(Report.id))
        .where(Report.domain == domain)
        .group_by(Report.org_name)
        .order_by(desc(func.count(Report.id)))
        .limit(10)
    ).all()
    # Severity is stored as a string, so ordering by it in SQL is lexicographic
    # ("warning" would sort ahead of "critical"). Rank it explicitly instead;
    # severities missing from the map sort last.
    severity_rank = {"critical": 2, "warning": 1}
    open_alerts = session.execute(
        select(Alert)
        .where(Alert.domain == domain, Alert.status == "open")
        .order_by(Alert.updated_at.desc())
    ).scalars().all()
    alerts = sorted(open_alerts, key=lambda alert: severity_rank.get(alert.severity, 0), reverse=True)
    total = sum(row.count for row in records)
    dmarc_pass = sum(row.count for row in records if row.dmarc_pass)
    spf_aligned = sum(row.count for row in records if row.spf_aligned)
    dkim_aligned = sum(row.count for row in records if row.dkim_aligned)
    unknown_records = [row for row in records if not row.is_known_sender]
    failing_unknown = [row for row in unknown_records if not row.dmarc_pass]
    top_sources = sorted(records, key=lambda row: row.count, reverse=True)[:12]
    return (
        {
            "task": "current_dmarc_open_posture_summary",
            "domain": domain,
            "period": {"start": period_start.isoformat(), "end": period_end.isoformat()},
            "required_json_schema": {
                "headline": "string, specific concise headline for the current posture",
                "summary": "string, 2-5 sentences based on all imported telemetry and open alerts",
                "action_items": "array of specific action strings with if-legitimate and if-not-legitimate remediation where relevant",
                "business_risk": "string, concise risk statement; do not claim compromise from DMARC aggregate data alone",
            },
            "metrics": {
                "total_reports": session.scalar(select(func.count(Report.id)).where(Report.domain == domain)) or 0,
                "total_messages": total,
                "dmarc_passed": dmarc_pass,
                "dmarc_failed": total - dmarc_pass,
                "dmarc_pass_rate": round(dmarc_pass / total * 100, 2) if total else 0,
                "spf_alignment_rate": round(spf_aligned / total * 100, 2) if total else 0,
                "dkim_alignment_rate": round(dkim_aligned / total * 100, 2) if total else 0,
                "unknown_sources": len({row.source_ip for row in unknown_records}),
                "unknown_failing_sources": len({row.source_ip for row in failing_unknown}),
                "open_critical_alerts": len([alert for alert in alerts if alert.severity == "critical"]),
                "open_warnings": len([alert for alert in alerts if alert.severity == "warning"]),
            },
            "top_sources": [
                {
                    "source_ip": row.source_ip,
                    "count": row.count,
                    "dmarc_pass": row.dmarc_pass,
                    "known_sender": row.known_sender_name,
                    "spf_aligned": row.spf_aligned,
                    "dkim_aligned": row.dkim_aligned,
                }
                for row in top_sources
            ],
            "reporters": [{"org": org or "unknown", "reports": count} for org, count in reports],
            "open_alerts": [
                {
                    "severity": alert.severity,
                    "type": alert.type,
                    "title": alert.title,
                    "summary": alert.summary,
                    "details": json.loads(alert.details_json or "{}"),
                }
                for alert in alerts[:25]
            ],
            "instruction": (
                "Write a current DMARC posture report from all imported telemetry and all open alerts. Do not focus only "
                "on the latest report day. For failing unknown sources, state what to do if they are legitimate and what "
                "to do if they are not legitimate. Make the DMARC enforcement relationship explicit: quarantine/reject "
                "helps receivers handle unauthorized spoofing only after legitimate senders are aligned. Return only JSON."
            ),
        },
        period_start,
        period_end,
    )


def _portfolio_posture_payload(session: Session) -> tuple[dict, datetime, datetime] | None:
    domains = session.execute(select(Report.domain).distinct().order_by(Report.domain)).scalars().all()
    if not domains:
        return None
    bounds = session.execute(
        select(
            func.min(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)),
            func.max(func.coalesce(Report.date_end, Report.date_begin, Report.created_at)),
        )
    ).one()
    period_start = _as_utc(bounds[0]) or datetime.now(timezone.utc)
    period_end = _as_utc(bounds[1]) or period_start
    records = session.execute(select(Record).join(Report)).scalars().all()
    # Severity is stored as a string, so ordering by it in SQL is lexicographic
    # ("warning" would sort ahead of "critical"). Rank it explicitly instead;
    # severities missing from the map sort last.
    severity_rank = {"critical": 2, "warning": 1}
    open_alerts = session.execute(
        select(Alert)
        .where(Alert.status == "open")
        .order_by(Alert.updated_at.desc())
    ).scalars().all()
    alerts = sorted(open_alerts, key=lambda alert: severity_rank.get(alert.severity, 0), reverse=True)
    total = sum(row.count for row in records)
    dmarc_pass = sum(row.count for row in records if row.dmarc_pass)
    unknown_records = [row for row in records if not row.is_known_sender]
    failing_unknown = [row for row in unknown_records if not row.dmarc_pass]
    domain_rows = []
    for domain in domains:
        domain_records = session.execute(select(Record).join(Report).where(Report.domain == domain)).scalars().all()
        domain_total = sum(row.count for row in domain_records)
        domain_pass = sum(row.count for row in domain_records if row.dmarc_pass)
        domain_alerts = [alert for alert in alerts if alert.domain == domain]
        domain_rows.append(
            {
                "domain": domain,
                "reports": session.scalar(select(func.count(Report.id)).where(Report.domain == domain)) or 0,
                "messages": domain_total,
                "dmarc_pass_rate": round(domain_pass / domain_total * 100, 2) if domain_total else 0,
                "unknown_sources": len({row.source_ip for row in domain_records if not row.is_known_sender}),
                "open_critical_alerts": len([alert for alert in domain_alerts if alert.severity == "critical"]),
                "open_warnings": len([alert for alert in domain_alerts if alert.severity == "warning"]),
            }
        )
    top_alerts = [
        {
            "domain": alert.domain,
            "severity": alert.severity,
            "type": alert.type,
            "title": alert.title,
            "summary": alert.summary,
            "details": json.loads(alert.details_json or "{}"),
        }
        for alert in alerts[:12]
    ]
    return (
        {
            "task": "current_dmarc_portfolio_posture_summary",
            "scope": "all_domains",
            "domains": domains,
            "period": {"start": period_start.isoformat(), "end": period_end.isoformat()},
            "required_json_schema": {
                "headline": "string, concise portfolio headline across all domains",
                "summary": "string, 2-3 sentences covering all domains without per-record verbosity",
                "action_items": "array of 1-4 specific cross-domain or domain-named action strings",
                "business_risk": "string, concise portfolio-level risk statement",
            },
            "metrics": {
                "domains": len(domains),
                "total_reports": session.scalar(select(func.count(Report.id))) or 0,
                "total_messages": total,
                "dmarc_passed": dmarc_pass,
                "dmarc_failed": total - dmarc_pass,
                "dmarc_pass_rate": round(dmarc_pass / total * 100, 2) if total else 0,
                "unknown_sources": len({row.source_ip for row in unknown_records}),
                "unknown_failing_sources": len({row.source_ip for row in failing_unknown}),
                "open_critical_alerts": len([alert for alert in alerts if alert.severity == "critical"]),
                "open_warnings": len([alert for alert in alerts if alert.severity == "warning"]),
            },
            "domain_posture": domain_rows,
            "top_open_alerts": top_alerts,
            "instruction": (
                "Write a compact all-domain DMARC portfolio posture for the overview page. Compare domains only where "
                "there is a meaningful difference. Keep it shorter than a single-domain detail report. Mention exact "
                "domain names only for domains that need attention. Return only JSON."
            ),
        },
        period_start,
        period_end,
    )


def generate_open_posture_summaries(settings: Settings, *, force: bool = True) -> list[LLMReport]:
    if not settings.llm.generate_daily_summary:
        logger.info("Open posture summaries skipped because daily LLM summaries are disabled")
        return []
    llm = LLMClient(settings)
    generated: list[LLMReport] = []
    with session_scope() as session:
        portfolio = _portfolio_posture_payload(session)
        if portfolio:
            payload, period_start, period_end = portfolio
            existing = session.scalar(
                select(LLMReport).where(
                    LLMReport.domain == "__all__",
                    LLMReport.report_type == "posture",
                    LLMReport.period_start == period_start,
                    LLMReport.period_end == period_end,
                )
            )
            if existing and not force:
                generated.append(existing)
            else:
                report = existing or LLMReport(
                    domain="__all__",
                    period_start=period_start,
                    period_end=period_end,
                    report_type="posture",
                    input_json="{}",
                    output_json="{}",
                    plain_text="",
                )
                if settings.llm.store_llm_outputs and not existing:
                    session.add(report)
                output = llm.daily_summary(payload)
                plain = f"{output.headline}\n\n{output.summary}\n\nActions: " + "; ".join(output.action_items)
                if settings.llm.store_llm_outputs:
                    report.input_json = json.dumps(payload, sort_keys=True, default=str)
                    report.output_json = output.model_dump_json()
                    report.plain_text = plain
                generated.append(report)
                send_digest_email(settings, "DMARC Sentinel portfolio posture summary", plain)
        domains = session.execute(select(Report.domain).distinct()).scalars().all()
        for domain in domains:
            payload, period_start, period_end = _posture_payload(session, domain)
            existing = session.scalar(
                select(LLMReport).where(
                    LLMReport.domain == domain,
                    LLMReport.report_type == "posture",
                    LLMReport.period_start == period_start,
                    LLMReport.period_end == period_end,
                )
            )
            if existing and not force:
                generated.append(existing)
                continue
            report = existing or LLMReport(
                domain=domain,
                period_start=period_start,
                period_end=period_end,
                report_type="posture",
                input_json="{}",
                output_json="{}",
                plain_text="",
            )
            if settings.llm.store_llm_outputs and not existing:
                session.add(report)
            output = llm.daily_summary(payload)
            plain = f"{output.headline}\n\n{output.summary}\n\nActions: " + "; ".join(output.action_items)
            if settings.llm.store_llm_outputs:
                report.input_json = json.dumps(payload, sort_keys=True, default=str)
                report.output_json = output.model_dump_json()
                report.plain_text = plain
            generated.append(report)
            send_digest_email(settings, f"DMARC Sentinel posture summary for {domain}", plain)
    logger.info("Open posture summaries generated")
    return generated


def generate_daily_summaries(settings: Settings, target_day: date | None = None, *, force: bool = False) -> list[LLMReport]:
    if not settings.llm.generate_daily_summary:
        logger.info("Daily summaries skipped because daily LLM summaries are disabled")
        return []
    # The period bounds below are built in UTC, so "yesterday" is taken from the
    # UTC date rather than the host's local date.
    target_day = target_day or (datetime.now(timezone.utc).date() - timedelta(days=1))
    llm = LLMClient(settings)
    generated: list[LLMReport] = []
    with session_scope() as session:
        domains = session.execute(select(Report.domain).distinct()).scalars().all()
        for domain in domains:
            stat = aggregate_daily_stats(session, domain, target_day)
            payload = _summary_payload(session, domain, target_day, stat)
            period_start = datetime.combine(target_day, time.min, tzinfo=timezone.utc)
            period_end = datetime.combine(target_day + timedelta(days=1), time.min, tzinfo=timezone.utc)
            existing = session.scalar(
                select(LLMReport).where(
                    LLMReport.domain == domain,
                    LLMReport.report_type == "daily",
                    LLMReport.period_start == period_start,
                    LLMReport.period_end == period_end,
                )
            )
            if existing:
                if not force:
                    generated.append(existing)
                    continue
                report = existing
            else:
                report = LLMReport(
                    domain=domain,
                    period_start=period_start,
                    period_end=period_end,
                    report_type="daily",
                    input_json="{}",
                    output_json="{}",
                    plain_text="",
                )
                if settings.llm.store_llm_outputs:
                    session.add(report)
            output = llm.daily_summary(payload)
            plain = f"{output.headline}\n\n{output.summary}\n\nActions: " + "; ".join(output.action_items)
            if settings.llm.store_llm_outputs:
                report.input_json = json.dumps(payload, sort_keys=True, default=str)
                report.output_json = output.model_dump_json()
                report.plain_text = plain
            generated.append(report)
            send_digest_email(settings, f"DMARC Sentinel daily summary for {domain}", plain)
    logger.info("Daily summaries generated")
    return generated


def generate_weekly_summaries(settings: Settings) -> list[LLMReport]:
    if not settings.llm.generate_weekly_summary:
        logger.info("Weekly summaries skipped because weekly LLM summaries are disabled")
        return []
    end = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=7)
    llm = LLMClient(settings)
    generated: list[LLMReport] = []
    with session_scope() as session:
        domains = session.execute(select(Report.domain).distinct()).scalars().all()
        for domain in domains:
            records = _domain_records(session, domain, start, end)
            existing = session.scalar(
                select(LLMReport).where(
                    LLMReport.domain == domain,
                    LLMReport.report_type == "weekly",
                    LLMReport.period_start == start,
                    LLMReport.period_end == end,
                )
            )
            if existing:
                continue
            total = sum(row.count for row in records)
            pass_count = sum(row.count for row in records if row.dmarc_pass)
            payload = {
                "task": "weekly_dmarc_summary",
                "domain": domain,
                "period": {"start": start.isoformat(), "end": end.isoformat()},
                "metrics": {
                    "total_messages": total,
                    "dmarc_pass_rate": round(pass_count / total * 100, 2) if total else 0,
                    "new_senders": len({row.source_ip for row in records if not row.is_known_sender and row.dmarc_pass}),
                    "persistent_failures": len({row.source_ip for row in records if not row.dmarc_pass}),
                    "critical_known_sender_failures": len({row.known_sender_id for row in records if row.is_known_sender and not row.dmarc_pass}),
                },
                "instruction": (
                    "Include high-level posture, trend changes, new senders, persistent failures, whether DMARC policy "
                    "posture looks safe, and recommended operational actions. Only say consider stricter policy if the "
                    "metrics support it."
                ),
            }
            output = llm.weekly_summary(payload)
            plain = f"{output.headline}\n\n{output.summary}\n\nActions: " + "; ".join(output.action_items)
            report = LLMReport(
                domain=domain,
                period_start=start,
                period_end=end,
                report_type="weekly",
                input_json=json.dumps(payload, sort_keys=True),
                output_json=output.model_dump_json(),
                plain_text=plain,
            )
            if settings.llm.store_llm_outputs:
                session.add(report)
            generated.append(report)
            send_digest_email(settings, f"DMARC Sentinel weekly summary for {domain}", plain)
    logger.info("Weekly summaries generated")
    return generated


def start_scheduler(settings: Settings) -> BackgroundScheduler:
    global scheduler
    tz = ZoneInfo(settings.app.timezone)
    scheduler = BackgroundScheduler(timezone=tz)
    scheduler.add_job(poll_all, "interval", minutes=settings.app.poll_interval_minutes, args=[settings], id="poll", replace_existing=True)
    if settings.llm.generate_daily_summary:
        scheduler.add_job(generate_daily_summaries, "cron", hour=7, minute=0, args=[settings], id="daily", replace_existing=True)
    if settings.llm.generate_weekly_summary:
        scheduler.add_job(generate_weekly_summaries, "cron", day_of_week="mon", hour=7, minute=30, args=[settings], id="weekly", replace_existing=True)
    scheduler.start()
    return scheduler


def scheduler_ok() -> bool:
    return bool(scheduler and scheduler.running)
+27
@@ -0,0 +1,27 @@
from __future__ import annotations

from datetime import date

from pydantic import BaseModel, Field, field_validator


class ProcessNowRequest(BaseModel):
    inbox_id: str
    mode: str = Field(pattern="^(new|backlog)$")
    limit: int | None = None


class BacklogRequest(BaseModel):
    inbox_id: str
    folder: str | None = None
    since: date | None = None
    before: date | None = None
    limit: int | None = 200
    dry_run: bool = False
    reprocess: bool = False
    mark_seen: bool = False

    @field_validator("since", "before", mode="before")
    @classmethod
    def empty_date_is_missing(cls, value):
        return None if value == "" else value
+402
@@ -0,0 +1,402 @@
from __future__ import annotations

import json
from datetime import date, datetime, time, timedelta, timezone

from sqlalchemy import delete, select
from sqlalchemy.orm import Session

from app.db import init_db, session_scope
from app.models import Alert, AuthResult, DailyStat, InboxStatus, LLMReport, MailMessage, Record, Report, utcnow

DOMAIN = "tukutoi.com"
INBOX = "tukutoi"


def _dt(days_ago: int, hour: int = 0) -> datetime:
    target = date.today() - timedelta(days=days_ago)
    return datetime.combine(target, time(hour=hour), tzinfo=timezone.utc)


def _purge_smoke(session: Session) -> None:
    smoke_reports = session.execute(select(Report.id).where(Report.report_id.like("smoke-%"))).scalars().all()
    if smoke_reports:
        smoke_records = session.execute(select(Record.id).where(Record.report_id.in_(smoke_reports))).scalars().all()
        if smoke_records:
            session.execute(delete(AuthResult).where(AuthResult.record_id.in_(smoke_records)))
            session.execute(delete(Record).where(Record.id.in_(smoke_records)))
        session.execute(delete(Report).where(Report.id.in_(smoke_reports)))
    smoke_messages = session.execute(select(MailMessage.id).where(MailMessage.message_id.like("<smoke-%"))).scalars().all()
    if smoke_messages:
        session.execute(delete(MailMessage).where(MailMessage.id.in_(smoke_messages)))
    session.execute(delete(Alert).where(Alert.details_json.like('%"smoke": true%')))
    session.execute(delete(LLMReport).where(LLMReport.input_json.like('%"smoke": true%')))
    # NOTE: this filter matches on domain only, so it removes every daily stat for DOMAIN, not just seeded rows.
    session.execute(delete(DailyStat).where(DailyStat.domain == DOMAIN))
    session.commit()


def _mail(session: Session, uid: str, subject: str, days_ago: int) -> MailMessage:
    mail = MailMessage(
        inbox_id=INBOX,
        imap_uid=uid,
        message_id=f"<smoke-{uid}@dmarc-sentinel.local>",
        folder="DMARC",
        subject=subject,
        sender="reports@example.net",
        recipient="dmarcreports@tukutoi.com",
        message_date=_dt(days_ago, 6),
        seen=True,
        status="success",
        processed_at=utcnow(),
    )
    session.add(mail)
    session.flush()
    return mail


def _report(
    session: Session,
    *,
    mail: MailMessage,
    days_ago: int,
    org: str,
    report_id: str,
    sha: str,
    policy: str = "none",
) -> Report:
    report = Report(
        inbox_id=INBOX,
        mail_message_id=mail.id,
        raw_xml_sha256=sha,
        report_id=report_id,
        org_name=org,
        org_email=f"dmarc@{org}",
        extra_contact_info=f"https://{org}/dmarc",
        domain=DOMAIN,
        date_begin=_dt(days_ago, 0),
        date_end=_dt(days_ago - 1, 0) if days_ago else utcnow(),
        policy_p=policy,
        policy_sp=policy,
        policy_pct=100,
        adkim="r",
        aspf="r",
        fo="1",
    )
    session.add(report)
    session.flush()
    return report


def _record(
    session: Session,
    *,
    report: Report,
    source_ip: str,
    count: int,
    disposition: str,
    spf: bool,
    dkim: bool,
    known_id: str | None,
    known_name: str | None,
    header_from: str = DOMAIN,
    reason_type: str | None = None,
    reason_comment: str | None = None,
    dkim_domain: str | None = DOMAIN,
    spf_domain: str | None = DOMAIN,
) -> Record:
    record = Record(
        report_id=report.id,
        source_ip=source_ip,
        count=count,
        disposition=disposition,
        policy_dkim="pass" if dkim else "fail",
        policy_spf="pass" if spf else "fail",
        dkim_aligned=dkim,
        spf_aligned=spf,
        dmarc_pass=dkim or spf,
        header_from=header_from,
        reason_type=reason_type,
        reason_comment=reason_comment,
        known_sender_id=known_id,
        known_sender_name=known_name,
        is_known_sender=known_id is not None,
    )
    session.add(record)
    session.flush()
    session.add(
        AuthResult(
            record_id=record.id,
            auth_type="dkim",
            domain=dkim_domain,
            selector="default",
            result="pass" if dkim else "fail",
            human_result="synthetic smoke data",
        )
    )
    session.add(
        AuthResult(
            record_id=record.id,
            auth_type="spf",
            domain=spf_domain,
            scope="mfrom",
            result="pass" if spf else "fail",
        )
    )
    return record


def _alert(
    session: Session,
    *,
    fingerprint: str,
    severity: str,
    alert_type: str,
    title: str,
    summary: str,
    details: dict,
    # The LLM fields are optional: deterministic alerts may be seeded without LLM commentary.
    llm_summary: str | None,
    llm_risk: str | None,
    llm_action: str | None,
    days_ago: int,
) -> None:
    now = utcnow()
    session.add(
        Alert(
            fingerprint=fingerprint,
            inbox_id=INBOX,
            domain=DOMAIN,
            severity=severity,
            type=alert_type,
            title=title,
            summary=summary,
            details_json=json.dumps({"smoke": True, **details}, sort_keys=True),
            llm_summary=llm_summary,
            llm_risk=llm_risk,
            llm_recommended_action=llm_action,
            status="open",
            first_seen_at=_dt(days_ago, 8),
            last_seen_at=now,
        )
    )


def seed_smoke_data() -> None:
    init_db()
    with session_scope() as session:
        _purge_smoke(session)
        status = session.scalar(select(InboxStatus).where(InboxStatus.inbox_id == INBOX))
        if not status:
            status = InboxStatus(
                inbox_id=INBOX,
                label="Tukutoi",
                domain=DOMAIN,
                folder="DMARC",
                recipient="dmarcreports@tukutoi.com",
                enabled=True,
            )
            session.add(status)
        status.last_check_at = utcnow()
        status.last_success_at = utcnow()
        status.last_new_messages = 18
        status.last_reports_imported = 15
        status.last_error = None
        reporters = ["google.com", "yahoo.com", "outlook.com", "proton.me"]
        for i, days_ago in enumerate(range(13, -1, -1), start=1):
            mail = _mail(session, str(9000 + i), f"DMARC aggregate report for {DOMAIN}", days_ago)
            report = _report(
                session,
                mail=mail,
                days_ago=days_ago,
                org=reporters[i % len(reporters)],
                report_id=f"smoke-report-{i}",
                sha=f"{i:064x}",
                policy="none" if days_ago > 3 else "quarantine",
            )
            base = 2800 + i * 110
            _record(
                session,
                report=report,
                source_ip="198.51.100.20",
                count=base,
                disposition="none",
                spf=True,
                dkim=True,
                known_id="mailcow",
                known_name="mailcow outbound",
            )
            _record(
                session,
                report=report,
                source_ip="203.0.113.40",
                count=420 + i * 12,
                disposition="none",
                spf=True,
                dkim=False,
                known_id="google_workspace",
                known_name="Google Workspace",
                dkim_domain="tukutoi.com",
                spf_domain="_spf.google.com",
            )
            if i >= 9:
                _record(
                    session,
                    report=report,
                    source_ip="203.0.113.99",
                    count=18 + i * 4,
                    disposition="none",
                    spf=False,
                    dkim=False,
                    known_id=None,
                    known_name=None,
                    header_from=DOMAIN,
                    reason_type="local_policy",
                    reason_comment="Unrecognized source failed both aligned SPF and DKIM.",
                    dkim_domain="bad-sender.example",
                    spf_domain="bad-sender.example",
                )
            if i in {12, 13}:
                _record(
                    session,
                    report=report,
                    source_ip="192.0.2.77",
                    count=9 + i,
                    disposition="quarantine",
                    spf=False,
                    dkim=False,
                    known_id=None,
                    known_name=None,
                    header_from=DOMAIN,
                    reason_type="sampled_out",
                    reason_comment="Receiver applied quarantine to a small unauthorized sample.",
                    dkim_domain="newsletter.invalid",
                    spf_domain="newsletter.invalid",
                )
        for days_ago in range(13, -1, -1):
            day = date.today() - timedelta(days=days_ago)
            total = 3600 + (13 - days_ago) * 160
            fail = 12 + max(0, 6 - days_ago) * 11
            stat = DailyStat(
                domain=DOMAIN,
                date=day,
                total_messages=total,
                dmarc_pass_count=total - fail,
                dmarc_fail_count=fail,
                spf_aligned_count=total - fail - 18,
                spf_failed_count=fail + 18,
                dkim_aligned_count=total - fail - 35,
                dkim_failed_count=fail + 35,
                unknown_source_count=1 if days_ago < 6 else 0,
                known_source_count=2,
                quarantine_count=22 if days_ago in {0, 1} else 0,
                reject_count=0,
                top_reporters_json=json.dumps(
                    [
                        {"org": "google.com", "reports": 5},
                        {"org": "yahoo.com", "reports": 4},
                        {"org": "outlook.com", "reports": 3},
                    ]
                ),
                top_sources_json=json.dumps(
                    [
                        {"source_ip": "198.51.100.20", "count": total - 600},
                        {"source_ip": "203.0.113.40", "count": 520},
                        {"source_ip": "203.0.113.99", "count": fail},
                    ]
                ),
            )
            session.add(stat)
        _alert(
            session,
            fingerprint=f"{DOMAIN}:unknown_source_failed_both:203.0.113.99:smoke",
            severity="critical",
            alert_type="unknown_source_failed_both",
            title=f"Unknown source failed SPF and DKIM for {DOMAIN}",
            summary="203.0.113.99 sent a growing volume of mail that failed both SPF and DKIM alignment.",
            details={"source_ip": "203.0.113.99", "count": 74, "spf_aligned": False, "dkim_aligned": False, "dmarc_pass": False},
            llm_summary="Unknown infrastructure is sending mail that claims to be from tukutoi.com and fails both aligned SPF and DKIM.",
            llm_risk="This is likely spoofing or unauthorized sending. It does not by itself prove a mailbox compromise.",
            llm_action="Confirm whether the IP belongs to an approved sender. If not, monitor volume and keep legitimate senders passing before considering stricter policy.",
            days_ago=4,
        )
        _alert(
            session,
            fingerprint=f"{DOMAIN}:quarantine_or_reject_seen:192.0.2.77:smoke",
            severity="critical",
            alert_type="quarantine_or_reject_seen",
            title=f"Quarantine disposition seen for {DOMAIN}",
            summary="Receivers quarantined a small number of messages from an unknown source.",
            details={"source_ip": "192.0.2.77", "count": 22, "disposition": "quarantine"},
            llm_summary="Some mail claiming to be from tukutoi.com is now being quarantined by receivers.",
            llm_risk="The impacted traffic appears unauthorized in this sample, but verify whether any legitimate sender is missing from known senders.",
            llm_action="Review the quarantined source and classify it only if it is an approved sender.",
            days_ago=1,
        )
        _alert(
            session,
            fingerprint=f"{DOMAIN}:dkim_authenticated_relay:203.0.113.40:smoke",
            severity="info",
            alert_type="dkim_authenticated_relay",
            title=f"DKIM-authenticated relay observed for {DOMAIN}",
            summary="A receiver observed 203.0.113.40 transmitting mail for tukutoi.com. SPF failed for that hop, DKIM aligned, and DMARC passed.",
            details={"source_ip": "203.0.113.40", "count": 612, "spf_aligned": False, "dkim_aligned": True, "dmarc_pass": True},
            llm_summary=None,
            llm_risk=None,
            llm_action=None,
            days_ago=2,
        )
        today = date.today()
        daily_input = {"smoke": True, "task": "daily_dmarc_summary", "domain": DOMAIN, "period": today.isoformat()}
        session.add(
            LLMReport(
                domain=DOMAIN,
                period_start=datetime.combine(today, time.min, tzinfo=timezone.utc),
                period_end=datetime.combine(today + timedelta(days=1), time.min, tzinfo=timezone.utc),
                report_type="daily",
                input_json=json.dumps(daily_input),
                output_json=json.dumps(
                    {
                        "headline": "DMARC is mostly healthy, with one unauthorized source to review.",
                        "summary": "Legitimate mail is passing consistently. A small but increasing unknown source is failing both SPF and DKIM, and receivers quarantined a small sample.",
                        "action_items": [
                            "Review 203.0.113.99 and confirm it is not an approved sender.",
                            "Classify 203.0.113.40 if it belongs to an approved platform.",
                        ],
                        "business_risk": "Medium",
                    }
                ),
                plain_text=(
                    "DMARC is mostly healthy, with one unauthorized source to review.\n\n"
                    "Legitimate mail is passing consistently. A small but increasing unknown source is failing both SPF and DKIM, "
                    "and receivers quarantined a small sample.\n\n"
                    "Actions: Review 203.0.113.99; classify 203.0.113.40 if approved."
                ),
            )
        )
        week_start = today - timedelta(days=7)
        session.add(
            LLMReport(
                domain=DOMAIN,
                period_start=datetime.combine(week_start, time.min, tzinfo=timezone.utc),
                period_end=datetime.combine(today, time.min, tzinfo=timezone.utc),
                report_type="weekly",
                input_json=json.dumps({"smoke": True, "task": "weekly_dmarc_summary", "domain": DOMAIN}),
                output_json=json.dumps(
                    {
                        "headline": "Weekly posture is stable with one spoofing pattern.",
                        "summary": "Known senders continue to pass. Unknown failures appeared late in the week and should be watched before any policy change.",
                        "action_items": ["Verify new sources.", "Keep policy at quarantine until known sender coverage is confirmed."],
                        "business_risk": "Medium",
                    }
                ),
                plain_text="Weekly posture is stable with one spoofing pattern.\n\nKnown senders continue to pass. Unknown failures appeared late in the week.",
            )
        )


if __name__ == "__main__":
    seed_smoke_data()
    print("Smoke data seeded")
+1610
File diff suppressed because it is too large
+229
@@ -0,0 +1,229 @@
{% extends "base.html" %}
{% block content %}
<header class="mb-stack-lg">
<h1 class="text-headline-xl-mobile font-bold text-on-background md:text-headline-xl">Alerts</h1>
<p class="mt-1 text-body-base text-on-surface-variant">Filter, triage, acknowledge, resolve, or reopen deterministic DMARC alerts.</p>
</header>
<form class="alerts-filter-bar" method="get" id="alerts-filter-form">
<label>
<span class="label-caps">Domain</span>
<select name="domain">
<option value="">All domains</option>
{% for item in domains %}
<option value="{{ item }}" {{ "selected" if item == selected_domain else "" }}>{{ item }}</option>
{% endfor %}
</select>
</label>
<label>
<span class="label-caps">Type</span>
<select name="alert_type">
<option value="">All types</option>
{% for item in alert_types %}
<option value="{{ item }}" {{ "selected" if item == selected_type else "" }}>{{ item }}</option>
{% endfor %}
</select>
</label>
<label>
<span class="label-caps">Severity</span>
<select name="severity">
<option value="">All severities</option>
{% for item in severities %}
<option value="{{ item }}" {{ "selected" if item == selected_severity else "" }}>{{ item }}</option>
{% endfor %}
</select>
</label>
<label>
<span class="label-caps">State</span>
<select name="status">
{% for item in ["open", "acknowledged", "resolved", ""] %}
<option value="{{ item }}" {{ "selected" if item == selected_status else "" }}>{{ item or "all states" }}</option>
{% endfor %}
</select>
</label>
<label>
<span class="label-caps">Report From</span>
<input type="date" name="date_from" value="{{ selected_date_from }}">
</label>
<label>
<span class="label-caps">Report To</span>
<input type="date" name="date_to" value="{{ selected_date_to }}">
</label>
</form>
<section class="alerts-bulk-bar">
<div>
<strong id="alerts-selected-count">0 selected</strong>
<span class="dw-muted">{{ total }} alerts match current filters</span>
</div>
<div class="flex flex-wrap gap-stack-sm">
<button class="button-secondary js-bulk-alert-action" type="button" data-status="acknowledged" disabled>
<span class="material-symbols-outlined text-[18px]">done</span>
Acknowledge
</button>
<button class="button-secondary js-bulk-alert-action" type="button" data-status="resolved" disabled>
<span class="material-symbols-outlined text-[18px]">task_alt</span>
Resolve
</button>
<button class="button-secondary js-bulk-alert-action" type="button" data-status="open" disabled>
<span class="material-symbols-outlined text-[18px]">restart_alt</span>
Reopen
</button>
</div>
</section>
<section class="surface-card overflow-hidden">
{% for alert in alerts %}
{% set is_critical = alert.severity == "critical" %}
<article class="alert-row border-b border-outline-variant p-stack-md last:border-b-0 is-{{ alert.severity_class }}" data-alert-id="{{ alert.id }}" data-status="{{ alert.status }}">
<div class="flex flex-col items-start justify-between gap-stack-md xl:flex-row">
<div class="min-w-0 flex-1">
<div class="mb-1 flex flex-wrap items-center gap-stack-sm">
<label class="alert-select">
<input class="js-alert-checkbox" type="checkbox" value="{{ alert.id }}">
<span></span>
</label>
<span class="status-chip chip-{{ alert.severity_class }}">{{ alert.severity }}</span>
<span class="status-chip js-alert-status {{ 'chip-pass' if alert.status == 'resolved' else ('chip-warning' if alert.status == 'acknowledged' else 'chip-fail') }}">{{ alert.status }}</span>
<span class="label-caps">• {{ alert.type }}</span>
</div>
<h2 class="text-headline-md font-semibold text-on-surface">{{ alert.title }}</h2>
<div class="mt-stack-sm flex flex-wrap items-center gap-stack-sm text-body-sm text-on-surface-variant">
<code class="rounded bg-surface-container px-2 py-0.5 font-mono text-data-mono text-secondary">{{ alert.domain }}</code>
<span>Report period: {{ alert.report_start | fmt_dt }}{% if alert.report_end and alert.report_end != alert.report_start %} to {{ alert.report_end | fmt_dt }}{% endif %}</span>
{% if alert.source_history %}
<span>{{ alert.source_history }}</span>
{% endif %}
{% if alert.report_db_id %}
<a class="dw-inline-link" href="/reports/{{ alert.report_db_id }}">Source report</a>
{% endif %}
</div>
<p class="mt-stack-md text-body-base text-on-surface-variant">{{ alert.llm_summary or alert.summary }}</p>
{% if alert.llm_recommended_action %}
<p class="mt-stack-sm text-body-sm italic text-secondary">{{ alert.llm_recommended_action }}</p>
{% endif %}
</div>
<div class="alert-actions flex shrink-0 flex-wrap gap-stack-sm">
<button class="button-secondary js-alert-action" type="button" data-status="acknowledged" {{ "disabled" if alert.status == "acknowledged" else "" }}>
<span class="material-symbols-outlined text-[18px]">done</span>
Acknowledge
</button>
<button class="button-secondary js-alert-action" type="button" data-status="resolved" {{ "disabled" if alert.status == "resolved" else "" }}>
<span class="material-symbols-outlined text-[18px]">task_alt</span>
Resolve
</button>
<button class="button-secondary js-alert-action" type="button" data-status="open" {{ "disabled" if alert.status == "open" else "" }}>
<span class="material-symbols-outlined text-[18px]">restart_alt</span>
Reopen
</button>
</div>
</div>
</article>
{% else %}
<div class="p-gutter text-on-surface-variant">No alerts match these filters.</div>
{% endfor %}
</section>
<div class="dw-table-footer alerts-pager">
<span>{{ ((page - 1) * page_size) + 1 if total else 0 }}-{{ [page * page_size, total] | min }} of {{ total }}</span>
<span class="dw-pager">
<a class="{{ 'is-disabled' if page <= 1 else '' }}" href="/alerts?page={{ page - 1 }}&domain={{ selected_domain }}&alert_type={{ selected_type }}&severity={{ selected_severity }}&status={{ selected_status }}&date_from={{ selected_date_from }}&date_to={{ selected_date_to }}"><span class="material-symbols-outlined">chevron_left</span></a>
<a class="{{ 'is-disabled' if page * page_size >= total else '' }}" href="/alerts?page={{ page + 1 }}&domain={{ selected_domain }}&alert_type={{ selected_type }}&severity={{ selected_severity }}&status={{ selected_status }}&date_from={{ selected_date_from }}&date_to={{ selected_date_to }}"><span class="material-symbols-outlined">chevron_right</span></a>
</span>
</div>
<script>
(() => {
const rows = Array.from(document.querySelectorAll(".alert-row"));
const boxes = Array.from(document.querySelectorAll(".js-alert-checkbox"));
const selectedCount = document.getElementById("alerts-selected-count");
const bulkButtons = Array.from(document.querySelectorAll(".js-bulk-alert-action"));
const filterForm = document.getElementById("alerts-filter-form");
let lastChecked = null;
filterForm.querySelectorAll("select,input").forEach((control) => {
control.addEventListener("change", () => {
const page = filterForm.querySelector("input[name='page']");
if (page) page.value = "1";
filterForm.requestSubmit();
});
});
const selectedIds = () => boxes.filter((box) => box.checked).map((box) => Number(box.value));
const refreshBulk = () => {
const count = selectedIds().length;
selectedCount.textContent = `${count} selected`;
bulkButtons.forEach((button) => { button.disabled = count === 0; });
};
boxes.forEach((box, index) => {
box.addEventListener("click", (event) => {
if (event.shiftKey && lastChecked !== null) {
const start = Math.min(lastChecked, index);
const end = Math.max(lastChecked, index);
boxes.slice(start, end + 1).forEach((item) => { item.checked = box.checked; });
}
lastChecked = index;
refreshBulk();
});
});
rows.forEach((row) => {
row.addEventListener("click", (event) => {
if (event.target.closest("button,a,label,input")) return;
const box = row.querySelector(".js-alert-checkbox");
box.checked = !box.checked;
lastChecked = boxes.indexOf(box);
refreshBulk();
});
});
const applyStatus = (row, status) => {
row.dataset.status = status;
const chip = row.querySelector(".js-alert-status");
chip.textContent = status;
chip.className = `status-chip js-alert-status ${status === "resolved" ? "chip-pass" : (status === "acknowledged" ? "chip-warning" : "chip-fail")}`;
row.querySelectorAll(".js-alert-action").forEach((button) => {
button.disabled = button.dataset.status === status;
});
};
const postStatus = async (id, status) => {
const endpoint = status === "open" ? `/api/alerts/${id}/reopen` : `/api/alerts/${id}/${status === "resolved" ? "resolve" : "ack"}`;
const response = await fetch(endpoint, { method: "POST", headers: window.adminPostHeaders, credentials: "same-origin" });
if (!response.ok) throw new Error("Alert update failed.");
return response.json();
};
document.querySelectorAll(".js-alert-action").forEach((button) => {
button.addEventListener("click", async () => {
const row = button.closest(".alert-row");
button.disabled = true;
try {
await postStatus(row.dataset.alertId, button.dataset.status);
applyStatus(row, button.dataset.status);
} catch (error) {
button.disabled = false;
}
});
});
bulkButtons.forEach((button) => {
button.addEventListener("click", async () => {
const ids = selectedIds();
if (!ids.length) return;
const response = await fetch("/api/alerts/bulk", {
method: "POST",
headers: { ...window.adminPostHeaders, "Content-Type": "application/json" },
credentials: "same-origin",
body: JSON.stringify({ ids, status: button.dataset.status }),
});
if (!response.ok) return;
rows.filter((row) => ids.includes(Number(row.dataset.alertId))).forEach((row) => applyStatus(row, button.dataset.status));
boxes.forEach((box) => { box.checked = false; });
refreshBulk();
});
});
})();
</script>
{% endblock %}
+50
@@ -0,0 +1,50 @@
<!doctype html>
<html class="light" lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>{{ request.app.title }}</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600;700&family=JetBrains+Mono:wght@450&family=Material+Symbols+Outlined:wght,FILL@100..700,0..1&display=swap" rel="stylesheet">
<script src="https://cdn.tailwindcss.com?plugins=forms,container-queries"></script>
<script src="https://unpkg.com/htmx.org@1.9.12"></script>
<script>
window.adminPostHeaders = { "X-Requested-With": "XMLHttpRequest" };
</script>
<link rel="stylesheet" href="/static/app.css?v=9">
</head>
<body>
<header class="dw-topbar">
<div class="dw-topbar-inner">
<div class="dw-brand-row">
<a class="dw-brand" href="/">DMARC Sentinel</a>
{% set path = request.url.path %}
<nav class="dw-nav">
<a class="dw-nav-link {{ 'is-active' if path == '/' else '' }}" href="/">Overview</a>
<a class="dw-nav-link {{ 'is-active' if path.startswith('/alerts') else '' }}" href="/alerts">Alerts</a>
<a class="dw-nav-link {{ 'is-active' if path.startswith('/inboxes') else '' }}" href="/inboxes">Inboxes</a>
<a class="dw-nav-link {{ 'is-active' if path.startswith('/settings') else '' }}" href="/settings">Settings</a>
</nav>
</div>
<nav class="dw-top-actions dw-mobile-actions">
<a class="dw-icon-button" href="/settings" aria-label="Settings">
<span class="material-symbols-outlined">settings</span>
</a>
</nav>
</div>
</header>
<main class="dw-main">
{% block content %}{% endblock %}
</main>
<footer class="dw-footer">
<div class="dw-footer-inner">
<div class="dw-system-status">
<span class="dw-status-dot"></span>
<span>System Operational</span>
</div>
<div class="dw-footer-code">DMARC Sentinel</div>
</div>
</footer>
</body>
</html>
+230
@@ -0,0 +1,230 @@
{% extends "base.html" %}
{% block content %}
<header class="dw-page-header">
<h1>{{ domain }}</h1>
<p>Domain telemetry and alert evidence.</p>
</header>
<section class="dw-metrics-grid" aria-label="Domain metrics">
<article class="dw-metric-card">
<span class="dw-kicker">Messages</span>
<strong>{{ metrics.messages }}</strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">SPF Aligned</span>
<strong>{{ metrics.spf_aligned }} <small>{{ metrics.spf_rate }}</small></strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">DKIM Aligned</span>
<strong>{{ metrics.dkim_aligned }} <small>{{ metrics.dkim_rate }}</small></strong>
</article>
<article class="dw-metric-card dw-metric-card-critical">
<span class="dw-kicker">Unknown Sources</span>
<strong class="dw-danger-value">{{ metrics.unknown_sources }}</strong>
</article>
</section>
<section class="dw-domain-summary-section">
<h2 class="dw-sidebar-kicker">Latest LLM Posture Summary</h2>
<article class="dw-summary-card dw-domain-summary-card">
<div class="dw-summary-rail"></div>
<div class="dw-summary-copy">
{% set summary_parts = summary.split("Actions:", 1) %}
<div>{{ summary_parts[0] }}</div>
{% if summary_parts | length > 1 %}
<div class="dw-recommendations">
<span>Recommended Actions</span>
<ul>
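{# Strip only the trailing period from each action so dotted values such as IP addresses stay intact. #}
{% for action in summary_parts[1].split(";") %}
{% if action.strip().rstrip(".") %}
<li><span class="material-symbols-outlined">task_alt</span>{{ action.strip().rstrip(".") }}</li>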
{% endif %}
{% endfor %}
</ul>
</div>
{% endif %}
</div>
</article>
</section>
<section class="dw-domain-main-grid">
<div class="dw-domain-main-column" id="source-panel">
<h2 class="dw-panel-title">Top Observed IPs</h2>
<div class="dw-table-card">
<table class="dw-table">
<thead>
<tr>
<th title="DMARC aggregate source_ip: the IP observed by the reporting receiver. It may be a relay, forwarder, gateway, or direct sender.">Observed IP</th>
<th>Count</th>
<th>DKIM Domains</th>
<th>Known</th>
<th>DMARC</th>
</tr>
</thead>
<tbody>
{% for row in records %}
<tr>
<td><code>{{ row.source_ip }}</code></td>
<td>{{ row.count }}</td>
<td class="dw-muted">{{ row.dkim_domains }}</td>
<td class="dw-muted">{{ row.known_sender_name or "unknown" }}</td>
<td>
<span class="dw-chip {{ 'dw-chip-pass' if row.dmarc_pass else 'dw-chip-fail' }}">{{ "pass" if row.dmarc_pass else "fail" }}</span>
</td>
</tr>
{% else %}
<tr>
<td colspan="5" class="dw-muted">No observed IP records yet.</td>
</tr>
{% endfor %}
</tbody>
<tfoot>
<tr>
<td colspan="5">
<div class="dw-table-footer">
<span>Showing {{ records | length }} observed IPs</span>
<span class="dw-pager">
<a hx-get="/domains/{{ domain }}?source_page={{ source_page - 1 }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page }}" hx-select="#source-panel" hx-target="#source-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if source_page <= 1 else '' }}" href="/domains/{{ domain }}?source_page={{ source_page - 1 }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page }}"><span class="material-symbols-outlined">chevron_left</span></a>
<a hx-get="/domains/{{ domain }}?source_page={{ source_page + 1 }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page }}" hx-select="#source-panel" hx-target="#source-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if source_page * source_page_size >= source_total else '' }}" href="/domains/{{ domain }}?source_page={{ source_page + 1 }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page }}"><span class="material-symbols-outlined">chevron_right</span></a>
</span>
</div>
</td>
</tr>
</tfoot>
</table>
</div>
</div>
<aside class="dw-domain-alert-column" id="domain-alert-panel">
<h2 class="dw-panel-title">Open Alerts <span class="dw-muted">({{ alert_total }})</span></h2>
<div class="dw-alert-feed">
{% for alert in alerts %}
<a class="dw-alert-item is-{{ alert.severity_class }}" href="{{ '/reports/' ~ alert.report_db_id if alert.report_db_id else '/alerts?domain=' ~ domain }}">
<span class="dw-alert-row">
<span>{{ alert.severity }}</span>
<time>{{ (alert.report_end or alert.report_start or alert.report_time) | fmt_dt }}</time>
</span>
<strong>{{ alert.title }}</strong>
<p>{{ alert.llm_summary or alert.summary }}</p>
</a>
{% else %}
<div class="dw-alert-empty">No open alerts.</div>
{% endfor %}
</div>
{% if alert_total > alert_page_size %}
<div class="dw-table-footer">
<span>{{ ((alert_page - 1) * alert_page_size) + 1 }}-{{ [alert_page * alert_page_size, alert_total] | min }} of {{ alert_total }}</span>
<span class="dw-pager">
<a hx-get="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page - 1 }}&report_page={{ report_page }}&trend_page={{ trend_page }}" hx-select="#domain-alert-panel" hx-target="#domain-alert-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if alert_page <= 1 else '' }}" href="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page - 1 }}&report_page={{ report_page }}&trend_page={{ trend_page }}"><span class="material-symbols-outlined">chevron_left</span></a>
<a hx-get="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page + 1 }}&report_page={{ report_page }}&trend_page={{ trend_page }}" hx-select="#domain-alert-panel" hx-target="#domain-alert-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if alert_page * alert_page_size >= alert_total else '' }}" href="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page + 1 }}&report_page={{ report_page }}&trend_page={{ trend_page }}"><span class="material-symbols-outlined">chevron_right</span></a>
</span>
</div>
{% endif %}
</aside>
</section>
<section class="dw-domain-lower-grid">
<div id="trend-panel">
<h2 class="dw-panel-title">Daily DMARC and Volume Trend</h2>
<div class="dw-table-card">
<table class="dw-table dw-compact-table">
<thead>
<tr>
<th>Date</th>
<th>Messages</th>
<th>Pass</th>
<th>Fail</th>
</tr>
</thead>
<tbody>
{% for stat in stats %}
<tr>
<td>{{ stat.date | fmt_date }}</td>
<td>{{ stat.total_messages }}</td>
<td class="dw-success-text">{{ stat.dmarc_pass_count }}</td>
<td class="dw-danger-text">{{ stat.dmarc_fail_count }}</td>
</tr>
{% else %}
<tr>
<td colspan="4" class="dw-muted">No daily stats yet.</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% if trend_total > trend_page_size %}
<div class="dw-table-footer">
<span>{{ ((trend_page - 1) * trend_page_size) + 1 }}-{{ [trend_page * trend_page_size, trend_total] | min }} of {{ trend_total }}</span>
<span class="dw-pager">
<a hx-get="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page - 1 }}" hx-select="#trend-panel" hx-target="#trend-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if trend_page <= 1 else '' }}" href="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page - 1 }}"><span class="material-symbols-outlined">chevron_left</span></a>
<a hx-get="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page + 1 }}" hx-select="#trend-panel" hx-target="#trend-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if trend_page * trend_page_size >= trend_total else '' }}" href="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page }}&trend_page={{ trend_page + 1 }}"><span class="material-symbols-outlined">chevron_right</span></a>
</span>
</div>
{% endif %}
</div>
<div>
<h2 class="dw-panel-title">Top Report Organizations</h2>
<div class="dw-list-card">
{% for org, count in reporters %}
<div class="dw-list-row">
<span>{{ org or "unknown" }}</span>
<code>{{ count }}</code>
</div>
{% else %}
<div class="dw-list-empty">No reporting organizations yet.</div>
{% endfor %}
</div>
</div>
<div>
<h2 class="dw-panel-title">Disposition and Sender Mix</h2>
<div class="dw-list-card">
{% for disposition, count in dispositions %}
<div class="dw-list-row">
<span>{{ disposition or "none" }}</span>
<code>{{ count }}</code>
</div>
{% endfor %}
{% for known, count in known_unknown %}
<div class="dw-list-row">
<span>{{ "Known senders" if known else "Unknown senders" }}</span>
<code>{{ count }}</code>
</div>
{% else %}
{% if not dispositions %}
<div class="dw-list-empty">No disposition data yet.</div>
{% endif %}
{% endfor %}
</div>
</div>
</section>
<section class="dw-reports-section" id="reports-panel">
<h2 class="dw-panel-title">Recent Reports</h2>
<div class="dw-report-list">
{% for report in reports %}
<a href="/reports/{{ report.id }}" class="dw-report-row">
<span class="material-symbols-outlined">description</span>
<span class="dw-report-copy">
<strong>{{ report.org_name or "unknown" }} · {{ report.report_id or report.id }}</strong>
<code>{{ report.date_begin | fmt_dt }}{% if report.date_end %} to {{ report.date_end | fmt_dt }}{% endif %}</code>
</span>
<span class="material-symbols-outlined">arrow_forward_ios</span>
</a>
{% else %}
<div class="dw-list-empty">No reports imported for this domain yet.</div>
{% endfor %}
</div>
{% if report_total > report_page_size %}
<div class="dw-table-footer">
<span>{{ ((report_page - 1) * report_page_size) + 1 }}-{{ [report_page * report_page_size, report_total] | min }} of {{ report_total }}</span>
<span class="dw-pager">
<a hx-get="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page - 1 }}&trend_page={{ trend_page }}" hx-select="#reports-panel" hx-target="#reports-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if report_page <= 1 else '' }}" href="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page - 1 }}&trend_page={{ trend_page }}"><span class="material-symbols-outlined">chevron_left</span></a>
<a hx-get="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page + 1 }}&trend_page={{ trend_page }}" hx-select="#reports-panel" hx-target="#reports-panel" hx-swap="outerHTML" class="{{ 'is-disabled' if report_page * report_page_size >= report_total else '' }}" href="/domains/{{ domain }}?source_page={{ source_page }}&alert_page={{ alert_page }}&report_page={{ report_page + 1 }}&trend_page={{ trend_page }}"><span class="material-symbols-outlined">chevron_right</span></a>
</span>
</div>
{% endif %}
</section>
{% endblock %}
+299
View File
@@ -0,0 +1,299 @@
{% extends "base.html" %}
{% block content %}
<header class="mb-stack-lg">
<h1 class="text-headline-xl-mobile font-bold text-on-background md:text-headline-xl">Inboxes</h1>
<p class="mt-1 text-body-base text-on-surface-variant">Polling health and manual import controls.</p>
</header>
<section class="surface-card divide-y divide-outline-variant overflow-hidden">
{% for inbox in inboxes %}
{% set job = jobs.get(inbox.inbox_id) if jobs else none %}
{% set running = job and job.status in ["queued", "running"] %}
{% set skipped = skipped_payloads.get(inbox.inbox_id, []) if skipped_payloads else [] %}
<article class="p-stack-md" id="inbox-row-{{ inbox.inbox_id }}">
<div class="dw-inbox-row">
<div class="min-w-0">
<div class="flex flex-wrap items-center gap-stack-sm">
<h2 class="text-headline-md font-semibold text-on-background">{{ inbox.label }}</h2>
<code class="rounded bg-surface-container px-2 py-0.5 font-mono text-data-mono text-secondary">{{ inbox.inbox_id }}</code>
{% set status_label = "running" if running else ("disabled" if not inbox.enabled else ("error" if inbox.last_error else "ready")) %}
<span id="inbox-status-chip-{{ inbox.inbox_id }}" class="status-chip {{ 'chip-warning' if running or not inbox.enabled else ('chip-fail' if inbox.last_error else 'chip-pass') }}">{{ status_label }}</span>
</div>
<p class="mt-stack-sm text-body-sm text-on-surface-variant">{{ inbox.domain }} · {{ inbox.folder }} · {{ inbox.recipient }}</p>
<div class="dw-inbox-meta mt-stack-md">
<div><span class="label-caps block">Last Check</span><span id="inbox-last-check-{{ inbox.inbox_id }}">{{ inbox.last_check_at | fmt_dt }}</span></div>
<div><span class="label-caps block">Last Success</span><span id="inbox-last-success-{{ inbox.inbox_id }}">{{ inbox.last_success_at | fmt_dt }}</span></div>
<div><span class="label-caps block">New Messages</span><span id="inbox-new-messages-{{ inbox.inbox_id }}">{{ inbox.last_new_messages }}</span></div>
<div><span class="label-caps block">Imported</span><span id="inbox-imported-{{ inbox.inbox_id }}">{{ inbox.last_reports_imported }}</span></div>
</div>
<p id="inbox-last-error-{{ inbox.inbox_id }}" class="mt-stack-md border-l-4 border-l-error bg-error-container p-stack-sm text-body-sm text-on-error-container {{ '' if inbox.last_error else 'hidden' }}">{{ inbox.last_error or "" }}</p>
</div>
<div class="dw-inbox-work">
<div class="dw-inbox-actions">
<a class="button-secondary" href="/domains/{{ inbox.domain }}">
<span class="material-symbols-outlined text-[18px]">visibility</span>
View Domain
</a>
<button class="button-secondary js-inbox-action" type="button" data-action="process-now" data-inbox-id="{{ inbox.inbox_id }}" {% if not inbox.enabled or running %}disabled{% endif %}>
<span class="material-symbols-outlined text-[18px]">sync</span>
Process Now
</button>
<button class="button-secondary js-inbox-action" type="button" data-action="backlog" data-inbox-id="{{ inbox.inbox_id }}" {% if not inbox.enabled or running %}disabled{% endif %}>
<span class="material-symbols-outlined text-[18px]">manage_search</span>
Backlog Scan
</button>
</div>
<div class="dw-inbox-job">
<div class="inbox-action-result {{ 'is-running' if running else '' }}" id="inbox-action-result-{{ inbox.inbox_id }}" data-job-id="{{ job.id if job else '' }}" role="status" aria-live="polite">
{% if running %}
{{ job.processed_messages }} of {{ job.scanned_messages or "?" }} scanned · {{ job.valid_reports_imported }} imported
{% endif %}
</div>
<div class="inbox-progress {{ 'is-active' if running else '' }}" id="inbox-progress-{{ inbox.inbox_id }}">
<span style="width: {{ job.progress_percent if job and job.progress_percent is not none else 100 }}%;"></span>
</div>
</div>
</div>
<div class="inbox-duplicate-list {{ 'hidden' if not skipped else '' }}" id="inbox-duplicate-list-{{ inbox.inbox_id }}">
{% if skipped %}
<details>
<summary>
<span>Skipped report payloads</span>
<strong>{{ skipped|length }}</strong>
</summary>
<table class="inbox-duplicate-table">
<thead>
<tr>
<th>Reason</th>
<th>Reporter</th>
<th>Report ID</th>
<th>Report Date</th>
<th>Skipped IMAP UID</th>
</tr>
</thead>
<tbody>
{% for item in skipped %}
<tr>
<td>{{ item.reason.replace("_", " ") }}</td>
<td>{{ item.reporting_org or "unknown" }}</td>
<td>{{ item.report_identifier or ("DB #" ~ item.existing_report_id) }}</td>
<td>{{ item.report_date | fmt_date }}</td>
<td>{{ item.imap_uid }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</details>
{% endif %}
</div>
</div>
</article>
{% else %}
<div class="p-gutter text-on-surface-variant">No inboxes are configured.</div>
{% endfor %}
</section>
<script>
(() => {
const formatDate = (value) => {
if (!value) return "";
const parts = value.split("-");
return parts.length === 3 ? `${parts[2]}/${parts[1]}/${parts[0]}` : value;
};
const summarize = (data) => {
const imported = data.valid_reports_imported ?? 0;
const duplicateMessages = data.duplicate_messages_skipped ?? 0;
const duplicateReports = data.duplicate_reports_skipped ?? 0;
const rejected = data.rejected_messages ?? 0;
const failed = data.failed_messages ?? 0;
const notImported = [];
if (duplicateMessages) notImported.push(`${duplicateMessages} messages already processed earlier`);
if (duplicateReports) notImported.push(`${duplicateReports} duplicate report payloads`);
if (rejected) notImported.push(`${rejected} rejected by validation/guardrails`);
const skipped = notImported.length ? ` Not imported: ${notImported.join(", ")}.` : "";
return `Done: ${data.scanned_messages ?? 0} scanned, ${data.candidate_messages ?? 0} candidate messages, ${imported} new reports imported.${skipped} Failures: ${failed}.`;
};
const text = (id, value) => {
const element = document.getElementById(id);
if (element) {
element.textContent = value;
}
};
const formatDateTime = (value, fallback = "never") => {
if (!value) return fallback;
const date = new Date(value);
if (Number.isNaN(date.getTime())) return value;
const pad = (item) => String(item).padStart(2, "0");
return `${pad(date.getDate())}/${pad(date.getMonth() + 1)}/${date.getFullYear()} ${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}`;
};
const escapeHtml = (value) => String(value ?? "")
.replaceAll("&", "&amp;")
.replaceAll("<", "&lt;")
.replaceAll(">", "&gt;")
.replaceAll('"', "&quot;")
.replaceAll("'", "&#039;");
const renderDuplicateReports = (inboxId, samples = []) => {
const list = document.getElementById(`inbox-duplicate-list-${inboxId}`);
if (!list) return;
if (!samples.length) {
list.classList.add("hidden");
list.innerHTML = "";
return;
}
const rows = samples.map((item) => {
const reportLabel = item.existing_report_id || `DB #${item.existing_report_db_id}`;
const messageLabel = item.duplicate_message_uid || item.duplicate_message_id || "unknown message";
return `
<tr>
<td>duplicate report payload</td>
<td>${escapeHtml(item.reporting_org || "unknown")}</td>
<td>${escapeHtml(reportLabel)}</td>
<td>${escapeHtml(item.report_date ? formatDate(item.report_date) : "unknown")}</td>
<td>${escapeHtml(messageLabel)}</td>
</tr>`;
}).join("");
const wasOpen = list.querySelector("details")?.open || false;
list.classList.remove("hidden");
list.innerHTML = `
<details ${wasOpen ? "open" : ""}>
<summary>
<span>Skipped report payloads</span>
<strong>${samples.length}</strong>
</summary>
<table class="inbox-duplicate-table">
<thead>
<tr>
<th>Reason</th>
<th>Reporter</th>
<th>Report ID</th>
<th>Report Date</th>
<th>Skipped IMAP UID</th>
</tr>
</thead>
<tbody>${rows}</tbody>
</table>
</details>`;
};
const renderInboxStatus = async (inboxId, keepRunning = false) => {
const response = await fetch(`/api/admin/inboxes/${encodeURIComponent(inboxId)}/status`, { credentials: "same-origin" });
if (!response.ok) {
return;
}
const status = await response.json();
text(`inbox-last-check-${inboxId}`, formatDateTime(status.last_check_at));
text(`inbox-last-success-${inboxId}`, formatDateTime(status.last_success_at));
text(`inbox-new-messages-${inboxId}`, status.last_new_messages ?? 0);
text(`inbox-imported-${inboxId}`, status.last_reports_imported ?? 0);
const chip = document.getElementById(`inbox-status-chip-${inboxId}`);
if (chip) {
const label = keepRunning ? "running" : (!status.enabled ? "disabled" : (status.last_error ? "error" : "ready"));
chip.textContent = label;
chip.className = `status-chip ${keepRunning || !status.enabled ? "chip-warning" : (status.last_error ? "chip-fail" : "chip-pass")}`;
}
const error = document.getElementById(`inbox-last-error-${inboxId}`);
if (error) {
error.textContent = status.last_error || "";
error.classList.toggle("hidden", !status.last_error);
}
};
document.querySelectorAll(".js-inbox-action").forEach((button) => {
button.addEventListener("click", async () => {
const inboxId = button.dataset.inboxId;
const result = document.getElementById(`inbox-action-result-${inboxId}`);
const action = button.dataset.action;
const endpoint = action === "backlog" ? "/api/admin/import-jobs/backlog" : "/api/admin/import-jobs/process-now";
const payload = action === "backlog"
? { inbox_id: inboxId, limit: 200 }
: { inbox_id: inboxId, mode: "new", limit: 200 };
const original = button.innerHTML;
const progress = document.getElementById(`inbox-progress-${inboxId}`);
button.disabled = true;
button.innerHTML = `<span class="material-symbols-outlined text-[18px]">sync</span>Running`;
result.className = "inbox-action-result is-running";
result.textContent = "Starting import job...";
progress.classList.add("is-active");
try {
const response = await fetch(endpoint, {
method: "POST",
headers: { ...window.adminPostHeaders, "Content-Type": "application/json" },
credentials: "same-origin",
body: JSON.stringify(payload),
});
const data = await response.json();
if (!response.ok) {
throw new Error(typeof data.detail === "string" ? data.detail : JSON.stringify(data.detail || data));
}
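// The job was accepted: restore the label. renderJob keeps the button disabled while the job runs.
button.innerHTML = original;
pollJob(data.id, inboxId);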
} catch (error) {
result.className = "inbox-action-result is-error";
result.textContent = error.message || "Processing failed.";
button.disabled = false;
button.innerHTML = original;
}
});
});
const renderJob = (job, inboxId) => {
const result = document.getElementById(`inbox-action-result-${inboxId}`);
const progress = document.getElementById(`inbox-progress-${inboxId}`);
const fill = progress.querySelector("span");
const running = job.status === "queued" || job.status === "running";
const buttons = document.querySelectorAll(`.js-inbox-action[data-inbox-id="${inboxId}"]`);
buttons.forEach((button) => { button.disabled = running; });
progress.classList.toggle("is-active", running || job.status === "succeeded");
fill.style.width = `${job.progress_percent ?? (running ? 100 : 0)}%`;
progress.classList.toggle("is-indeterminate", running && job.progress_percent == null);
text(`inbox-last-check-${inboxId}`, formatDateTime(job.started_at, "running"));
text(`inbox-new-messages-${inboxId}`, job.scanned_messages ?? 0);
text(`inbox-imported-${inboxId}`, job.valid_reports_imported ?? 0);
if (running) {
result.className = "inbox-action-result is-running";
result.textContent = `${job.processed_messages} of ${job.scanned_messages || "?"} scanned · ${job.valid_reports_imported} imported · ${job.duplicate_reports_skipped ?? 0} duplicate reports · ${job.duplicate_messages_skipped ?? 0} already processed · ${job.rejected_messages ?? 0} rejected · ${job.alerts_created} alerts`;
renderDuplicateReports(inboxId, job.duplicate_report_samples || []);
const chip = document.getElementById(`inbox-status-chip-${inboxId}`);
if (chip) {
chip.textContent = "running";
chip.className = "status-chip chip-warning";
}
} else if (job.status === "succeeded") {
result.className = "inbox-action-result is-success";
result.textContent = summarize(job);
renderDuplicateReports(inboxId, job.duplicate_report_samples || []);
} else if (job.status === "failed") {
result.className = "inbox-action-result is-error";
result.textContent = job.error || "Processing failed.";
renderDuplicateReports(inboxId, []);
}
return running;
};
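const pollJob = async (jobId, inboxId) => {
try {
const response = await fetch(`/api/admin/import-jobs/${encodeURIComponent(jobId)}`, { credentials: "same-origin" });
if (!response.ok) throw new Error(`Job status request failed (HTTP ${response.status}).`);
const job = await response.json();
// Keep polling every 2 seconds while the job is queued or running.
if (renderJob(job, inboxId)) {
window.setTimeout(() => pollJob(jobId, inboxId), 2000);
} else {
renderInboxStatus(inboxId);
}
} catch (error) {
// Report the failure instead of leaving the row stuck in the running state.
const result = document.getElementById(`inbox-action-result-${inboxId}`);
result.className = "inbox-action-result is-error";
result.textContent = error.message || "Could not load job status.";
}
};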
document.querySelectorAll(".inbox-action-result[data-job-id]").forEach((result) => {
if (result.dataset.jobId) {
const inboxId = result.id.replace("inbox-action-result-", "");
pollJob(result.dataset.jobId, inboxId);
}
});
})();
</script>
{% endblock %}
+250
View File
@@ -0,0 +1,250 @@
{% extends "base.html" %}
{% block content %}
<header class="dw-page-header">
<h1>Operational Overview</h1>
<p>Deterministic detection with LLM-assisted reporting.</p>
</header>
<section class="dw-overview-filter" aria-label="Traffic filters">
<div class="dw-chart-controls">
<select id="traffic-period" aria-label="Traffic period">
<option value="all" selected>All reports</option>
<option value="24h">24h</option>
<option value="7d">7d</option>
<option value="30d">30d</option>
<option value="365d">Year</option>
<option value="custom">Custom</option>
</select>
<input id="traffic-from" type="date" aria-label="Traffic from date">
<input id="traffic-to" type="date" aria-label="Traffic to date">
<select id="traffic-domain" aria-label="Traffic domain">
<option value="">All domains</option>
{% for domain in domains %}
<option value="{{ domain }}">{{ domain }}</option>
{% endfor %}
</select>
</div>
</section>
<section class="dw-metrics-grid" aria-label="Operational metrics">
<a class="dw-metric-card dw-metric-link" id="monitored-domains-card" href="/inboxes">
<span class="dw-kicker">Monitored Domains</span>
<strong id="metric-domains">{{ data.domains }}</strong>
<small id="metric-domain-target">View inboxes</small>
</a>
<article class="dw-metric-card">
<span class="dw-kicker">DMARC Reports</span>
<strong id="metric-reports">{{ data.reports_today }}</strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">Reported Emails</span>
<strong id="metric-messages">{{ data.messages_today }}</strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">DMARC Pass Rate</span>
<strong id="metric-pass-rate" class="{{ 'dw-success-value' if data.dmarc_pass_rate_value is none or data.dmarc_pass_rate_value >= 95 else ('dw-warning-value' if data.dmarc_pass_rate_value >= 80 else 'dw-danger-value') }}">{{ data.dmarc_pass_rate }}</strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">Passing Emails</span>
<strong id="metric-pass-count">{{ data.dmarc_pass_count }}</strong>
</article>
<article class="dw-metric-card dw-metric-card-critical">
<span class="dw-kicker">Failed Emails</span>
<strong id="metric-fail-count">{{ data.dmarc_fail_count }}</strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">Unknown Sources</span>
<strong id="metric-unknown">{{ data.unknown_sources }}</strong>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">Last Successful Check</span>
<code>{{ data.last_check | fmt_dt }}</code>
</article>
</section>
<section class="dw-chart-card dw-overview-chart">
<div class="dw-card-head">
<h3>Traffic Distribution <span id="traffic-period-label">{{ traffic_label }}</span></h3>
<div class="dw-legend">
<span><i class="dw-dot-valid"></i>Valid</span>
<span><i class="dw-dot-failed"></i>Failed</span>
</div>
</div>
<div class="dw-bars" id="traffic-bars" aria-label="Traffic distribution for imported reports in the selected period">
{% for bucket in traffic %}
<a href="/alerts?status=&date_from={{ bucket.date_from | urlencode }}&date_to={{ bucket.date_to | urlencode }}" title="{{ bucket.label }} · {{ bucket.total }} messages, {{ bucket.failed }} failed" style="height: {{ [bucket.height, 3] | max }}%;" aria-label="Show alerts for {{ bucket.label }}">
<span class="dw-bar-valid" style="flex-grow: {{ bucket.valid }};"></span>
<span class="dw-bar-failed" style="flex-grow: {{ bucket.failed }};"></span>
</a>
{% else %}
<span style="height: 3%;"></span>
{% endfor %}
</div>
</section>
<section class="dw-overview-summary">
<div class="dw-section-heading">
<span class="material-symbols-outlined dw-filled-icon">auto_awesome</span>
<h2 id="summary-title">Portfolio DMARC posture</h2>
<button class="button-secondary dw-summary-run" type="button" id="run-daily-summary">
<span class="material-symbols-outlined text-[18px]">play_arrow</span>
Generate Digest
</button>
</div>
<article class="dw-summary-card">
<span class="dw-ai-label">AI Assisted</span>
<div class="dw-summary-copy">
{% set summary_parts = (data.summary or "").split("Actions:", 1) %}
<div>{{ summary_parts[0] }}</div>
{% if summary_parts | length > 1 %}
<div class="dw-recommendations">
<span>Recommended Actions</span>
<ul>
{% for action in summary_parts[1].strip().rstrip(".").split(";") %}
{% if action.strip() %}
<li><span class="material-symbols-outlined">task_alt</span>{{ action.strip() }}</li>
{% endif %}
{% endfor %}
</ul>
</div>
{% endif %}
</div>
<div class="inbox-action-result" id="daily-summary-result" role="status" aria-live="polite"></div>
</article>
</section>
<script>
(() => {
const period = document.getElementById("traffic-period");
const domain = document.getElementById("traffic-domain");
const bars = document.getElementById("traffic-bars");
const periodLabel = document.getElementById("traffic-period-label");
const summaryButton = document.getElementById("run-daily-summary");
const summaryResult = document.getElementById("daily-summary-result");
const summaryTitle = document.getElementById("summary-title");
const dateFrom = document.getElementById("traffic-from");
const dateTo = document.getElementById("traffic-to");
const summaryCopy = document.querySelector(".dw-summary-copy");
const monitoredDomainsCard = document.getElementById("monitored-domains-card");
const metricDomains = document.getElementById("metric-domains");
const metricDomainTarget = document.getElementById("metric-domain-target");
const metricReports = document.getElementById("metric-reports");
const metricMessages = document.getElementById("metric-messages");
const metricPassRate = document.getElementById("metric-pass-rate");
const metricPassCount = document.getElementById("metric-pass-count");
const metricFailCount = document.getElementById("metric-fail-count");
const metricUnknown = document.getElementById("metric-unknown");
const scopeText = (periodLabelValue) => `${periodLabelValue || period.options[period.selectedIndex].text} · ${domain.value || "All domains"}`;
const renderDomainCard = (totalDomains) => {
if (domain.value) {
monitoredDomainsCard.href = `/domains/${encodeURIComponent(domain.value)}`;
metricDomains.textContent = "1";
metricDomainTarget.textContent = domain.value;
} else {
monitoredDomainsCard.href = "/inboxes";
metricDomains.textContent = totalDomains;
metricDomainTarget.textContent = "View inboxes";
}
};
const render = (buckets) => {
bars.innerHTML = "";
if (!buckets.length) {
const bar = document.createElement("span");
bar.style.height = "3%";
bars.appendChild(bar);
return;
}
buckets.forEach((bucket) => {
const bar = document.createElement("a");
bar.style.height = `${Math.max(bucket.height, 3)}%`;
bar.title = `${bucket.label} · ${bucket.total} messages, ${bucket.failed} failed`;
bar.href = `/alerts?status=&date_from=${encodeURIComponent(bucket.date_from)}&date_to=${encodeURIComponent(bucket.date_to)}`;
bar.setAttribute("aria-label", `Show alerts for ${bucket.label}`);
const valid = document.createElement("span");
valid.className = "dw-bar-valid";
valid.style.flexGrow = bucket.valid || 0;
const failed = document.createElement("span");
failed.className = "dw-bar-failed";
failed.style.flexGrow = bucket.failed || 0;
bar.appendChild(valid);
bar.appendChild(failed);
bars.appendChild(bar);
});
};
const passClass = (value) => value == null || value >= 95 ? "dw-success-value" : (value >= 80 ? "dw-warning-value" : "dw-danger-value");
const formatDate = (value) => {
if (!value) return "";
const parts = value.split("-");
return parts.length === 3 ? `${parts[2]}/${parts[1]}/${parts[0]}` : value;
};
const renderSummary = (plain) => {
const parts = (plain || "").split("Actions:");
summaryCopy.innerHTML = "";
const body = document.createElement("div");
body.textContent = parts[0] || "";
summaryCopy.appendChild(body);
if (parts.length > 1) {
const rec = document.createElement("div");
rec.className = "dw-recommendations";
rec.innerHTML = "<span>Recommended Actions</span>";
const list = document.createElement("ul");
parts[1].replace(/\.$/, "").split(";").map((item) => item.trim()).filter(Boolean).forEach((item) => {
const li = document.createElement("li");
li.innerHTML = '<span class="material-symbols-outlined">task_alt</span>';
li.appendChild(document.createTextNode(item));
list.appendChild(li);
});
rec.appendChild(list);
summaryCopy.appendChild(rec);
}
};
const refresh = async () => {
const params = new URLSearchParams({ period: period.value });
if (domain.value) params.set("domain", domain.value);
if (period.value === "custom") {
if (dateFrom.value) params.set("date_from", dateFrom.value);
if (dateTo.value) params.set("date_to", dateTo.value);
}
const response = await fetch(`/api/overview?${params}`, { credentials: "same-origin" });
if (response.ok) {
const data = await response.json();
const label = scopeText(data.period_label);
periodLabel.textContent = label;
metricReports.textContent = data.metrics.reports_today;
metricMessages.textContent = data.metrics.messages_today;
metricPassCount.textContent = data.metrics.dmarc_pass_count;
metricFailCount.textContent = data.metrics.dmarc_fail_count;
metricUnknown.textContent = data.metrics.unknown_sources;
renderDomainCard(data.metrics.domains);
metricPassRate.textContent = data.metrics.dmarc_pass_rate;
metricPassRate.className = passClass(data.metrics.dmarc_pass_rate_value);
summaryTitle.textContent = domain.value ? `${domain.value} DMARC posture` : "Portfolio DMARC posture";
renderSummary(data.metrics.summary || "");
render(data.buckets || []);
}
};
period.addEventListener("change", refresh);
domain.addEventListener("change", refresh);
dateFrom.addEventListener("change", refresh);
dateTo.addEventListener("change", refresh);
summaryButton.addEventListener("click", async () => {
summaryButton.disabled = true;
summaryResult.className = "inbox-action-result is-running";
summaryResult.textContent = "Generating digest...";
try {
const response = await fetch("/api/admin/scheduler/daily-summary", { method: "POST", headers: window.adminPostHeaders, credentials: "same-origin" });
const data = await response.json();
if (!response.ok) throw new Error(typeof data.detail === "string" ? data.detail : "Digest generation failed.");
summaryResult.className = "inbox-action-result is-success";
summaryResult.textContent = "Digest generated.";
await refresh();
} catch (error) {
summaryResult.className = "inbox-action-result is-error";
summaryResult.textContent = error.message || "Digest generation failed.";
} finally {
summaryButton.disabled = false;
}
});
})();
</script>
{% endblock %}
+84
View File
@@ -0,0 +1,84 @@
{% extends "base.html" %}
{% block content %}
<header class="mb-stack-lg">
<h1 class="text-headline-xl-mobile font-bold text-on-background md:text-headline-xl">Report {{ report.id }}</h1>
<p class="mt-1 text-body-base text-on-surface-variant">{{ report.domain }} · {{ report.org_name or "unknown organization" }}</p>
</header>
<section class="mb-stack-lg grid grid-cols-1 gap-gutter md:grid-cols-2 xl:grid-cols-4">
<div class="metric-card">
<span class="label-caps">Report Org</span>
<span class="text-body-base font-bold">{{ report.org_name or "unknown" }}</span>
</div>
<div class="metric-card">
<span class="label-caps">Report ID</span>
<span class="break-all font-mono text-data-mono">{{ report.report_id or report.id }}</span>
</div>
<div class="metric-card">
<span class="label-caps">Date Range</span>
<span class="font-mono text-data-mono">{{ report.date_begin | fmt_dt }}<br>{{ report.date_end | fmt_dt }}</span>
</div>
<div class="metric-card">
<span class="label-caps">Published Policy</span>
<span class="font-mono text-data-mono">p={{ report.policy_p }}, sp={{ report.policy_sp }}, pct={{ report.policy_pct }}</span>
</div>
</section>
<section class="mb-stack-lg">
<h2 class="mb-stack-md text-headline-md font-semibold">Alerts From This Report</h2>
<div class="dw-alert-feed">
{% for alert in alerts %}
<a class="dw-alert-item is-{{ alert.severity_class }}" href="/alerts?domain={{ report.domain }}&alert_type={{ alert.type }}">
<span class="dw-alert-row">
<span>{{ alert.severity }}</span>
<time>{{ alert.status }}</time>
</span>
<strong>{{ alert.title }}</strong>
<p>{{ alert.llm_summary or alert.summary }}</p>
</a>
{% else %}
<div class="dw-alert-empty">No alerts are linked to this report.</div>
{% endfor %}
</div>
</section>
<section>
<h2 class="mb-stack-md text-headline-md font-semibold">Records</h2>
<div class="surface-card overflow-hidden">
<div class="data-table-wrap">
<table class="data-table">
<thead>
<tr>
<th title="DMARC aggregate source_ip: the IP observed by the reporting receiver. It may be a relay, forwarder, gateway, or direct sender.">Observed IP</th>
<th>Count</th>
<th>SPF</th>
<th>DKIM</th>
<th>DMARC</th>
<th>Known Sender</th>
<th>Applied Policy</th>
<th>Policy Override</th>
</tr>
</thead>
<tbody>
{% for row in report.records %}
<tr>
<td class="font-mono text-data-mono">{{ row.source_ip }}</td>
<td>{{ row.count }}</td>
<td title="{{ row.spf_auth_tooltip }}"><span class="status-chip {{ 'chip-pass' if row.policy_spf == 'pass' else 'chip-fail' }}">{{ row.policy_spf or "none" }}</span></td>
<td title="{{ row.dkim_auth_tooltip }}"><span class="status-chip {{ 'chip-pass' if row.policy_dkim == 'pass' else 'chip-fail' }}">{{ row.policy_dkim or "none" }}</span></td>
<td><span class="status-chip {{ 'chip-pass' if row.dmarc_pass else 'chip-fail' }}">{{ "pass" if row.dmarc_pass else "fail" }}</span></td>
<td><span class="status-chip {{ 'chip-pass' if row.known_sender_name else 'chip-info' }}" title="{{ row.known_sender_name or 'No configured sender matched this observed IP/authentication evidence.' }}">{{ row.known_sender_name or "unknown" }}</span></td>
<td><span class="status-chip {{ 'chip-pass' if not row.disposition or row.disposition == 'none' else ('chip-warning' if row.disposition == 'quarantine' else 'chip-fail') }}">{{ row.disposition or "none" }}</span></td>
<td><span class="status-chip {{ 'chip-info' if row.reason_type else 'chip-pass' }}" title="{{ row.reason_comment or 'No policy override reason reported.' }}">{{ row.reason_type or "none" }}</span></td>
</tr>
{% else %}
<tr>
<td colspan="8" class="text-on-surface-variant">No records found in this report.</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</section>
{% endblock %}
+247
View File
@@ -0,0 +1,247 @@
{% extends "base.html" %}
{% block content %}
{% set env_items = env_status.items() | list %}
{% set missing_env = env_items | selectattr("1", "equalto", false) | list %}
<header class="dw-page-header dw-settings-header">
<div>
<h1>Settings</h1>
<p>Read-only runtime configuration and operational posture.</p>
</div>
<code>{{ config_path }}</code>
</header>
<section class="dw-settings-metrics" aria-label="Settings summary">
<article class="dw-metric-card">
<span class="dw-kicker">Application</span>
<strong>{{ settings.app.name }}</strong>
<span class="dw-card-note">{{ settings.app.base_url }}</span>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">Polling</span>
<strong>{{ settings.app.poll_interval_minutes }} min</strong>
<span class="dw-card-note">{{ settings.app.timezone }}</span>
</article>
<article class="dw-metric-card">
<span class="dw-kicker">LLM</span>
<strong>{{ settings.llm.model }}</strong>
<span class="dw-card-note">{{ settings.llm.provider }}</span>
</article>
<article class="dw-metric-card {{ 'dw-metric-card-critical' if missing_env else '' }}">
<span class="dw-kicker">Environment</span>
<strong>{{ env_items | length - missing_env | length }}/{{ env_items | length }}</strong>
<span class="dw-card-note">{{ missing_env | length }} missing</span>
</article>
</section>
<section class="dw-settings-board">
<section class="dw-settings-panel">
<h2 class="dw-panel-title">Runtime</h2>
<div class="dw-info-card">
<div class="dw-info-row">
<span>Database</span>
<code>{{ settings.app.database_url }}</code>
</div>
<div class="dw-info-row">
<span>Log Level</span>
<strong>{{ settings.app.log_level }}</strong>
</div>
<div class="dw-info-row">
<span>Max Attachment Size</span>
<strong>{{ settings.app.max_attachment_decompressed_mb }} MB</strong>
</div>
<div class="dw-info-row">
<span>Max Reports Per Poll</span>
<strong>{{ settings.app.max_reports_per_poll }}</strong>
</div>
</div>
</section>
<section class="dw-settings-panel">
<h2 class="dw-panel-title">Inboxes</h2>
<div class="dw-inbox-grid">
{% for inbox in settings.inboxes %}
<article class="dw-settings-card">
<div class="dw-settings-card-head">
<div>
<h3>{{ inbox.label }}</h3>
<code>{{ inbox.id }}</code>
</div>
<span class="dw-chip {{ 'dw-chip-pass' if inbox.enabled else 'dw-chip-warning' }}">{{ "Enabled" if inbox.enabled else "Disabled" }}</span>
</div>
<div class="dw-info-list">
<div><span>Domain</span><strong>{{ inbox.domain }}</strong></div>
<div><span>Folder</span><strong>{{ inbox.folder }}</strong></div>
<div><span>Recipient</span><strong>{{ inbox.recipient }}</strong></div>
<div><span>IMAP</span><strong>{{ inbox.imap_host }}:{{ inbox.imap_port }} · {{ "SSL" if inbox.imap_ssl else "plain" }}</strong></div>
</div>
</article>
{% else %}
<div class="dw-list-empty">No inboxes configured.</div>
{% endfor %}
</div>
</section>
<section class="dw-settings-panel dw-settings-panel-wide">
<h2 class="dw-panel-title">Known Senders</h2>
<div class="dw-sender-domain-grid">
{% for domain, senders in settings.known_senders.items() %}
<article class="dw-settings-card dw-sender-domain">
<div class="dw-sender-domain-head">
<h3>{{ domain }}</h3>
<span>{{ senders | length }} senders</span>
</div>
<div class="dw-sender-list">
{% for sender in senders %}
<article class="dw-sender-row">
<div>
<strong>{{ sender.name }}</strong>
<code>{{ sender.id }}</code>
</div>
<div class="dw-sender-values">
<div>
<span>IP ranges</span>
<ul>
{% for item in sender.ip_allowlist %}
<li><code>{{ item }}</code></li>
{% else %}
<li class="dw-muted">None</li>
{% endfor %}
</ul>
</div>
<div>
<span>DKIM domains</span>
<ul>
{% for item in sender.dkim_domains %}
<li><code>{{ item }}</code></li>
{% else %}
<li class="dw-muted">None</li>
{% endfor %}
</ul>
</div>
<div>
<span>SPF domains</span>
<ul>
{% for item in sender.spf_domains %}
<li><code>{{ item }}</code></li>
{% else %}
<li class="dw-muted">None</li>
{% endfor %}
</ul>
</div>
</div>
</article>
{% endfor %}
</div>
</article>
{% else %}
<div class="dw-list-empty">No known senders configured.</div>
{% endfor %}
</div>
</section>
<section class="dw-settings-panel">
<h2 class="dw-panel-title">Security</h2>
<div class="dw-list-card">
<div class="dw-list-row">
<span>Dashboard Basic Auth</span>
<span class="dw-chip {{ 'dw-chip-pass' if settings.security.dashboard_auth_enabled else 'dw-chip-warning' }}">{{ "Enabled" if settings.security.dashboard_auth_enabled else "Disabled" }}</span>
</div>
<div class="dw-list-row">
<span>Homepage Token</span>
<span class="dw-chip {{ 'dw-chip-pass' if settings.security.api_token_required else 'dw-chip-warning' }}">{{ "Required" if settings.security.api_token_required else "Not required" }}</span>
</div>
<div class="dw-list-row">
<span>Email Alerts</span>
<span class="dw-chip {{ 'dw-chip-pass' if settings.alerts.email.enabled else 'dw-chip-warning' }}">{{ "Enabled" if settings.alerts.email.enabled else "Disabled" }}</span>
</div>
</div>
</section>
<section class="dw-settings-panel">
<h2 class="dw-panel-title">LLM Data Controls</h2>
<div class="dw-list-card">
<div class="dw-list-row">
<span>Alert Explanations</span>
<span class="dw-chip {{ 'dw-chip-pass' if settings.llm.generate_alert_explanations else 'dw-chip-warning' }}">{{ "On" if settings.llm.generate_alert_explanations else "Off" }}</span>
</div>
<div class="dw-list-row">
<span>Daily Summary</span>
<span class="dw-chip {{ 'dw-chip-pass' if settings.llm.generate_daily_summary else 'dw-chip-warning' }}">{{ "On" if settings.llm.generate_daily_summary else "Off" }}</span>
</div>
<div class="dw-list-row">
<span>Weekly Summary</span>
<span class="dw-chip {{ 'dw-chip-pass' if settings.llm.generate_weekly_summary else 'dw-chip-warning' }}">{{ "On" if settings.llm.generate_weekly_summary else "Off" }}</span>
</div>
<div class="dw-list-row">
<span>Raw XML to LLM</span>
<span class="dw-chip {{ 'dw-chip-warning' if settings.llm.send_raw_xml_to_llm else 'dw-chip-pass' }}">{{ "On" if settings.llm.send_raw_xml_to_llm else "Off" }}</span>
</div>
<div class="dw-list-row">
<span>Raw Email to LLM</span>
<span class="dw-chip {{ 'dw-chip-warning' if settings.llm.send_raw_email_to_llm else 'dw-chip-pass' }}">{{ "On" if settings.llm.send_raw_email_to_llm else "Off" }}</span>
</div>
<div class="dw-list-row">
<span>System Prompt</span>
<code>{{ settings.llm.system_prompt_path }}</code>
</div>
<div class="dw-list-row">
<span>Alert Prompt</span>
<code>{{ settings.llm.alert_prompt_path }}</code>
</div>
<div class="dw-list-row">
<span>Digest Prompt</span>
<code>{{ settings.llm.digest_prompt_path }}</code>
</div>
</div>
</section>
<section class="dw-settings-panel">
<h2 class="dw-panel-title">Alert Thresholds</h2>
<div class="dw-list-card">
{% for name, value in settings.alerts.thresholds.model_dump().items() %}
<div class="dw-list-row">
<span>{{ name.replace("_", " ") }}</span>
<code>{{ value }}</code>
</div>
{% endfor %}
</div>
</section>
</section>
<section class="dw-settings-env">
<div class="dw-sidebar-head">
<h2>LLM Prompts</h2>
<span class="dw-kicker">Read From Disk</span>
</div>
<div class="dw-prompt-grid">
{% for prompt in prompts %}
<article class="dw-prompt-card">
<div class="dw-settings-card-head">
<div>
<h3>{{ prompt.label }}</h3>
<code>{{ prompt.path }}</code>
</div>
<span class="dw-chip {{ 'dw-chip-pass' if prompt.exists else 'dw-chip-warning' }}">{{ "Loaded" if prompt.exists else "Fallback" }}</span>
</div>
<pre>{{ prompt.content or "Using built-in fallback prompt." }}</pre>
</article>
{% endfor %}
</div>
</section>
<section class="dw-settings-env">
<div class="dw-sidebar-head">
<h2>Environment</h2>
<span class="dw-kicker">{{ missing_env | length }} Missing</span>
</div>
<div class="dw-env-grid">
{% for name, present in env_items | sort %}
<div class="dw-env-item {{ 'is-missing' if not present else '' }}">
<code>{{ name }}</code>
<span class="dw-chip {{ 'dw-chip-pass' if present else 'dw-chip-fail' }}">{{ "Set" if present else "Missing" }}</span>
</div>
{% endfor %}
</div>
</section>
{% endblock %}
+23
View File
@@ -0,0 +1,23 @@
from __future__ import annotations

from fastapi import HTTPException


def parse_positive_int_ids(value, field_name: str = "ids") -> list[int]:
    detail = f"{field_name} must be a list of positive integers"
    if not isinstance(value, list):
        raise HTTPException(status_code=400, detail=detail)
    ids: list[int] = []
    for item in value:
        if isinstance(item, bool):
            raise HTTPException(status_code=400, detail=detail)
        if isinstance(item, int):
            item_id = item
        elif isinstance(item, str) and item.isdecimal():
            item_id = int(item)
        else:
            raise HTTPException(status_code=400, detail=detail)
        if item_id <= 0:
            raise HTTPException(status_code=400, detail=detail)
        ids.append(item_id)
    return ids
+103
View File
@@ -0,0 +1,103 @@
app:
  name: "DMARC Sentinel"
  base_url: "https://sentinel.tukutoi.com"
  timezone: "Europe/Zurich"
  poll_interval_minutes: 30
  database_url: "sqlite:////app/data/dmarc-sentinel.sqlite3"
  log_level: "INFO"
  max_attachment_decompressed_mb: 20
  max_attachment_compressed_mb: 10
  max_attachments_per_message: 20
  max_reports_per_message: 20
  max_reports_per_archive: 20
  max_archive_compression_ratio: 100
  max_xml_records_per_report: 10000
  max_record_count: 10000000
  max_report_future_days: 3
  max_report_past_days: 3650
  max_reports_per_poll: 200

security:
  dashboard_auth_enabled: true
  dashboard_username_env: "DASHBOARD_USERNAME"
  dashboard_password_env: "DASHBOARD_PASSWORD"
  api_token_required: true
  homepage_token_env: "HOMEPAGE_API_TOKEN"

llm:
  provider: "openai"
  api_key_env: "OPENAI_API_KEY"
  model: "gpt-4.1-mini"
  temperature: 0.2
  timeout_seconds: 45
  max_retries: 2
  generate_alert_explanations: true
  generate_daily_summary: true
  generate_weekly_summary: true
  store_llm_outputs: true
  send_raw_xml_to_llm: false
  send_raw_email_to_llm: false
  system_prompt_path: "config/prompts/system.md"
  alert_prompt_path: "config/prompts/alert_explanation.md"
  digest_prompt_path: "config/prompts/posture_digest.md"
  weekly_prompt_path: "config/prompts/weekly_summary.md"

inboxes:
  - id: "tukutoi"
    label: "Tukutoi"
    domain: "tukutoi.com"
    imap_host: "mail.tukutoi.com"
    imap_port: 993
    imap_ssl: true
    username_env: "TUKUTOI_IMAP_USER"
    password_env: "TUKUTOI_IMAP_PASSWORD"
    folder: "DMARC"
    recipient: "dmarcreports@tukutoi.com"
    processed_folder: "DMARC/Processed"
    failed_folder: "DMARC/Failed"
    move_after_success: false
    move_after_failure: false
    mark_seen_after_success: true
    enabled: true

known_senders:
  tukutoi.com:
    - id: "mailcow"
      name: "mailcow outbound"
      ip_allowlist:
        - "REPLACE_WITH_MAILCOW_OUTBOUND_IP/32"
      dkim_domains:
        - "tukutoi.com"
      spf_domains:
        - "tukutoi.com"
    - id: "google_workspace"
      name: "Google Workspace"
      ip_allowlist: []
      dkim_domains:
        - "tukutoi.com"
      spf_domains:
        - "_spf.google.com"
    - id: "mailchimp"
      name: "Mailchimp"
      ip_allowlist: []
      dkim_domains: []
      spf_domains: []

alerts:
  email:
    enabled: true
    smtp_host_env: "ALERT_SMTP_HOST"
    smtp_port_env: "ALERT_SMTP_PORT"
    smtp_user_env: "ALERT_SMTP_USER"
    smtp_password_env: "ALERT_SMTP_PASSWORD"
    from_env: "ALERT_EMAIL_FROM"
    to_env: "ALERT_EMAIL_TO"
  thresholds:
    unknown_source_fail_count: 10
    unknown_source_fail_rate_percent: 5
    known_source_fail_rate_percent: 2
    total_volume_spike_multiplier: 3
    total_volume_drop_percent: 80
    min_messages_for_rate_alert: 20
    repeated_failure_days: 2
    missing_reporter_days: 3
+13
View File
@@ -0,0 +1,13 @@
Explain this DMARC alert to a business owner/admin.
Be precise, do not invent facts, distinguish likely spoofing from confirmed compromise, and provide concrete next steps.
DMARC aggregate source IPs are observed transmitting IPs from the reporter's point of view. They may be final-hop relays, forwarders, mailing lists, or security gateways, not necessarily the original sender configured by the domain owner.
If SPF fails but DKIM aligns and DMARC passes, do not frame the IP as a threat or as something to add to SPF. Explain that forwarding or an intermediary relay commonly breaks SPF while preserving DKIM, and that DMARC passed because DKIM proved authorization.
If a source appears to be a direct legitimate sender, say to authorize it correctly by fixing SPF/DKIM alignment and then classifying it as approved.
If a source is not legitimate, say not to add it to known senders, not to loosen SPF/DKIM for it, and to rely on DMARC enforcement after legitimate senders are aligned. Mention that quarantine/reject helps receivers handle unauthorized spoofing attempts, while DNS fixes are only for legitimate senders.
Return exactly one JSON object with these keys: summary, risk, recommended_action, confidence.
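A minimal sketch of how a reply to the prompt above could be checked against the four keys it requests. This is illustrative only: `parse_alert_explanation` and `REQUIRED_KEYS` are hypothetical names, not functions from this repository, and the strictness shown here (rejecting extra keys) is an assumption. The application's own `normalize_alert_explanation` appears to be more lenient, since a test in this commit feeds it an `explanation`/`action_items` shape.

```python
import json

# Keys requested by the alert-explanation prompt.
REQUIRED_KEYS = {"summary", "risk", "recommended_action", "confidence"}


def parse_alert_explanation(raw: str) -> dict:
    """Parse a model reply and require exactly the four prompt keys (hypothetical helper)."""
    # json.loads raises a ValueError subclass on invalid JSON or trailing data.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a single JSON object")
    missing = REQUIRED_KEYS - set(data)
    extra = set(data) - REQUIRED_KEYS
    if missing or extra:
        raise ValueError(f"missing keys: {sorted(missing)}, unexpected keys: {sorted(extra)}")
    return data
```

A caller could catch `ValueError` and fall back to a deterministic explanation when the reply does not validate.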
+17
View File
@@ -0,0 +1,17 @@
Write a current DMARC posture report for the admin using all supplied deterministic telemetry and all open alerts.
Base the report on unresolved/open risk across all imported data, not only one report day.
Mention exact counts/rates, important failing or unknown sources, relevant reporters, and concrete remediation.
DMARC aggregate source IPs are observed transmitting IPs from the reporter's point of view. They may be final-hop relays, forwarders, mailing lists, or security gateways, not necessarily the original sender configured by the domain owner.
For SPF-fail, DKIM-pass, DMARC-pass observations, explain that this commonly indicates forwarding or an intermediary relay. Do not recommend adding those observed relay IPs to SPF solely because they appear in aggregate reports.
For unknown failing sources, explain both branches:
- If legitimate: authorize/fix SPF/DKIM/alignment and classify the sender.
- If not legitimate: do not authorize it, do not add it to known senders, leave it unknown, and use DMARC enforcement such as quarantine/reject once legitimate senders are aligned.
Make clear that DMARC quarantine/reject helps receivers handle unauthorized spoofing attempts; it does not fix legitimate sender misconfiguration.
Do not claim mailbox compromise from DMARC aggregate data alone. Return only JSON matching required_json_schema.
+5
View File
@@ -0,0 +1,5 @@
You are an expert email authentication and DMARC operations analyst.
Explain deterministic DMARC telemetry to a business owner/admin. Do not invent facts. Distinguish confirmed facts from likely interpretations. Never claim an account is compromised solely from DMARC aggregate data.
Provide practical next steps. Output only valid JSON matching the requested schema.
+5
View File
@@ -0,0 +1,5 @@
Write a weekly DMARC posture summary for the admin.
Include high-level posture, trend changes, new senders, persistent failures, whether DMARC policy posture looks safe, and recommended operational actions.
Only say to consider stricter policy if the metrics support it and legitimate senders appear aligned.
+1
View File
@@ -0,0 +1 @@
+20
View File
@@ -0,0 +1,20 @@
services:
  dmarc-sentinel:
    build: .
    container_name: dmarc-sentinel
    restart: unless-stopped
    env_file:
      - .env
    ports:
      - "127.0.0.1:8000:8000"
    volumes:
      - ./config:/app/config:ro
      - ./data:/app/data
      - ./logs:/app/logs
    networks:
      npm_proxy:
        ipv4_address: 192.168.99.18

networks:
  npm_proxy:
    external: true
@@ -5,7 +5,7 @@ Use this file as a quick orientation guide before editing the repository.
## Ground Rules
- Keep documentation and code claims aligned with implemented files.
-- Do not describe settings as editable through the web application. Runtime settings come from `config/config.yml` or `config/config.example.yml`, plus environment variables.
+- Do not describe settings as editable through the web application. Runtime settings come from `config/config.yml` or `DMARC_SENTINEL_CONFIG`, plus environment variables. `config/config.example.yml` is only a template.
- Treat `data/` and `logs/` as runtime output locations. The repository ignores SQLite databases and log files.
- Do not add compatibility, migration, or legacy-support behavior unless a task explicitly asks for an implemented change.
- UI work should follow the mockups in `design/`. If documenting UI icons, respect the project policy that icons must be material icons.
@@ -11,7 +11,7 @@ This map describes the implemented Python modules and their main responsibilitie
## Configuration and Auth
- `app/config.py` defines Pydantic models for `app`, `security`, `llm`, `inboxes`, `known_senders`, and `alerts`.
-- `load_settings()` reads `DMARC_SENTINEL_CONFIG` when set, otherwise `config/config.yml` if it exists, otherwise `config/config.example.yml`.
+- `load_settings()` reads `DMARC_SENTINEL_CONFIG` when set, otherwise `config/config.yml`; missing runtime config is a startup error.
- `validate_llm_environment()` requires the configured OpenAI API key when `llm.provider` is `openai`, except when `DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS=true`.
- `app/auth.py` protects dashboard/admin routes with Basic Auth when enabled and protects homepage API routes with bearer token auth when required.
- `app/templates/settings.html` renders runtime settings and environment-variable presence as read-only information.
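
The `validate_llm_environment()` rule described above can be illustrated with a short sketch. It is not the code in `app/config.py`: the name `check_llm_environment`, the `RuntimeError` type, and the case-insensitive comparison of the test flag are all assumptions made for illustration.

```python
import os


def check_llm_environment(provider: str, api_key_env: str, environ=None) -> None:
    """Require the configured API key for the openai provider (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Assumed escape hatch for test runs, as named in the documentation above.
    if env.get("DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS", "").strip().lower() == "true":
        return
    # The key itself lives in the environment variable named by llm.api_key_env.
    if provider == "openai" and not env.get(api_key_env):
        raise RuntimeError(f"{api_key_env} must be set when llm.provider is 'openai'")
```

Passing the environment as a mapping keeps the check easy to exercise without mutating `os.environ`.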
@@ -7,7 +7,7 @@ The repository contains a FastAPI application for monitoring DMARC aggregate rep
- `README.md`: current user-facing setup and operational notes.
- `requirements.txt`: Python runtime and test dependencies.
- `pytest.ini`: adds the repository root to `pythonpath`.
-- `Dockerfile`: Python 3.12 image that installs requirements, copies `app/` and `config/config.example.yml`, creates runtime directories, and starts Uvicorn.
+- `Dockerfile`: Python 3.12 image that installs requirements, copies `app/`, creates runtime directories, and starts Uvicorn.
- `docker-compose.yml`: builds the app service, mounts `config/` read-only, mounts `data/` and `logs/`, publishes port `8000`, and attaches to the external `npm_proxy` Docker network.
- `config/config.example.yml`: host-controlled runtime configuration template.
- `.env.example`: environment variable template for IMAP, dashboard auth, homepage token, OpenAI, and SMTP settings.
@@ -7,8 +7,9 @@ This file summarizes how the implemented repository runs and how developers can
`app/config.py` loads settings in this order:
1. `DMARC_SENTINEL_CONFIG`, when set;
-2. `config/config.yml`, when present;
-3. `config/config.example.yml`.
+2. `config/config.yml`.
+If neither path exists, startup fails with an instruction to create `config/config.yml` from `config/config.example.yml`.
Secrets are read from environment variables named by the loaded settings. The settings page renders loaded values and environment-variable presence as read-only status; the application does not implement settings writes through the dashboard.
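
The load order above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `app/config.py`: the helper name `resolve_config_path` is hypothetical, and the error wording is invented apart from naming `config/config.yml` and `config/config.example.yml`.

```python
import os
from pathlib import Path

DEFAULT_CONFIG = Path("config/config.yml")


def resolve_config_path(environ=None) -> Path:
    """Return the config file to load, or raise if it does not exist (hypothetical helper)."""
    env = os.environ if environ is None else environ
    override = env.get("DMARC_SENTINEL_CONFIG")
    # 1. An explicit DMARC_SENTINEL_CONFIG wins when set; 2. otherwise config/config.yml.
    path = Path(override) if override else DEFAULT_CONFIG
    if not path.is_file():
        # The example file is only a template, so it is never used as a fallback.
        raise FileNotFoundError(
            f"{path} not found; create config/config.yml from config/config.example.yml"
        )
    return path
```

Failing at startup rather than silently loading the template avoids running with placeholder values such as `REPLACE_WITH_MAILCOW_OUTBOUND_IP/32`.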
+1
View File
@@ -0,0 +1 @@
+13
View File
@@ -0,0 +1,13 @@
fastapi==0.115.6
uvicorn[standard]==0.34.0
SQLAlchemy==2.0.36
pydantic==2.10.4
pydantic-settings==2.7.1
PyYAML==6.0.2
Jinja2==3.1.5
python-multipart==0.0.20
APScheduler==3.10.4
defusedxml==0.7.1
openai==1.58.1
pytest==8.3.4
httpx==0.28.1
+8
View File
@@ -0,0 +1,8 @@
import os
os.environ.setdefault("DMARC_SENTINEL_ALLOW_NO_LLM_FOR_TESTS", "true")
os.environ.setdefault("OPENAI_API_KEY", "test")
os.environ.setdefault("DASHBOARD_USERNAME", "admin")
os.environ.setdefault("DASHBOARD_PASSWORD", "test")
os.environ.setdefault("HOMEPAGE_API_TOKEN", "test")
os.environ.setdefault("DMARC_SENTINEL_CONFIG", "tests/fixtures/config_test.yml")
+36
View File
@@ -0,0 +1,36 @@
app:
  name: "DMARC Sentinel"
  base_url: "https://sentinel.tukutoi.com"
  timezone: "Europe/Zurich"
  poll_interval_minutes: 30
  database_url: "sqlite:///data/test-main.sqlite3"
  log_level: "INFO"
  max_attachment_decompressed_mb: 20
  max_reports_per_poll: 200

security:
  dashboard_auth_enabled: false
  api_token_required: false

llm:
  provider: "openai"
  api_key_env: "OPENAI_API_KEY"
  model: "gpt-4.1-mini"

inboxes:
  - id: "tukutoi"
    label: "Tukutoi"
    domain: "tukutoi.com"
    imap_host: "mail.tukutoi.com"
    username_env: "TUKUTOI_IMAP_USER"
    password_env: "TUKUTOI_IMAP_PASSWORD"
    folder: "DMARC"
    recipient: "dmarcreports@tukutoi.com"
    enabled: true

known_senders:
  tukutoi.com: []

alerts:
  email:
    enabled: false
+53
View File
@@ -0,0 +1,53 @@
<?xml version="1.0" encoding="UTF-8"?>
<feedback>
<report_metadata>
<org_name>google.com</org_name>
<email>noreply-dmarc-support@google.com</email>
<extra_contact_info>https://support.google.com/a/answer/2466580</extra_contact_info>
<report_id>sample-report-1</report_id>
<date_range>
<begin>1778716800</begin>
<end>1778803200</end>
</date_range>
</report_metadata>
<policy_published>
<domain>tukutoi.com</domain>
<adkim>r</adkim>
<aspf>r</aspf>
<p>none</p>
<sp>none</sp>
<pct>100</pct>
<fo>1</fo>
</policy_published>
<record>
<row>
<source_ip>203.0.113.10</source_ip>
<count>25</count>
<policy_evaluated>
<disposition>none</disposition>
<dkim>fail</dkim>
<spf>fail</spf>
<reason>
<type>local_policy</type>
<comment>sample</comment>
</reason>
</policy_evaluated>
</row>
<identifiers>
<header_from>tukutoi.com</header_from>
</identifiers>
<auth_results>
<dkim>
<domain>bad.example</domain>
<selector>x</selector>
<result>fail</result>
<human_result>body hash did not verify</human_result>
</dkim>
<spf>
<domain>bad.example</domain>
<scope>mfrom</scope>
<result>fail</result>
</spf>
</auth_results>
</record>
</feedback>
+16
View File
@@ -0,0 +1,16 @@
import pytest

from fastapi import HTTPException

from app.validation import parse_positive_int_ids


def test_parse_alert_ids_accepts_positive_ints_and_decimal_strings():
    assert parse_positive_int_ids([1, "2"]) == [1, 2]


@pytest.mark.parametrize("value", [["abc"], [0], [-1], [True], "1"])
def test_parse_alert_ids_rejects_malformed_values(value):
    with pytest.raises(HTTPException) as exc:
        parse_positive_int_ids(value)
    assert exc.value.status_code == 400
+177
View File
@@ -0,0 +1,177 @@
import json
from datetime import datetime, timedelta, timezone

from sqlalchemy import create_engine
from sqlalchemy.orm import Session

from app.analyzer import analyze_report
from app.config import Settings
from app.db import Base
from app.models import Record, Report


def _session():
    engine = create_engine("sqlite:///:memory:", future=True)
    Base.metadata.create_all(engine)
    return Session(engine)


def _settings() -> Settings:
    return Settings.model_validate(
        {
            "inboxes": [],
            "known_senders": {
                "tukutoi.com": [
                    {"id": "mailcow", "name": "mailcow outbound", "ip_allowlist": ["198.51.100.5/32"], "dkim_domains": [], "spf_domains": []}
                ]
            },
            "alerts": {"email": {"enabled": False}},
        }
    )


def _report(
    session: Session,
    *,
    source_ip: str,
    count: int,
    known: bool,
    dmarc_pass: bool,
    spf_aligned: bool = False,
    dkim_aligned: bool | None = None,
    report_time: datetime | None = None,
    org_name: str = "google.com",
) -> Report:
    dkim_aligned = dmarc_pass if dkim_aligned is None else dkim_aligned
    report_time = report_time or datetime.now(timezone.utc)
    report = Report(
        inbox_id="tukutoi",
        raw_xml_sha256=f"sha-{source_ip}-{count}-{known}-{dmarc_pass}-{spf_aligned}-{dkim_aligned}-{report_time.isoformat()}-{org_name}",
        report_id=f"r-{source_ip}-{report_time.isoformat()}",
        org_name=org_name,
        domain="tukutoi.com",
        date_begin=report_time - timedelta(hours=1),
        date_end=report_time,
    )
    session.add(report)
    session.flush()
    session.add(
        Record(
            report=report,
            source_ip=source_ip,
            count=count,
            disposition="none",
            policy_dkim="pass" if dkim_aligned else "fail",
            policy_spf="pass" if spf_aligned else "fail",
            dkim_aligned=dkim_aligned,
            spf_aligned=spf_aligned,
            dmarc_pass=dmarc_pass,
            header_from="tukutoi.com",
            known_sender_id="mailcow" if known else None,
            known_sender_name="mailcow outbound" if known else None,
            is_known_sender=known,
        )
    )
    session.commit()
    return report


def test_unknown_source_failed_both_alert():
    session = _session()
    report = _report(session, source_ip="203.0.113.10", count=25, known=False, dmarc_pass=False)
    alerts = analyze_report(session, _settings(), report)
    assert any(alert.type == "unknown_source_failed_both" and alert.severity == "critical" for alert, _, _ in alerts)


def test_known_sender_failure_alert():
    session = _session()
    report = _report(session, source_ip="198.51.100.5", count=25, known=True, dmarc_pass=False)
    alerts = analyze_report(session, _settings(), report)
    assert any(alert.type == "known_sender_dmarc_failure" and alert.severity == "critical" for alert, _, _ in alerts)


def test_dkim_authenticated_relay_is_info_not_sender_warning():
    session = _session()
    report = _report(
        session,
        source_ip="209.85.220.69",
        count=1,
        known=False,
        dmarc_pass=True,
        spf_aligned=False,
        dkim_aligned=True,
    )
    alerts = analyze_report(session, _settings(), report)
    relay = next(alert for alert, _, _ in alerts if alert.type == "dkim_authenticated_relay")
    assert relay.severity == "info"
    assert "intermediary" in relay.summary
    assert "add to SPF" in relay.summary
    assert not any(alert.type == "new_passing_source" for alert, _, _ in alerts)


def test_alert_fingerprint_prevents_duplicate_open_alerts():
    session = _session()
    settings = _settings()
    report = _report(session, source_ip="203.0.113.10", count=25, known=False, dmarc_pass=False)
    first = analyze_report(session, settings, report)
    second = analyze_report(session, settings, report)
    created = [is_new for _, is_new, _ in first + second]
    assert created.count(True) >= 1
    assert created.count(False) >= 1


def test_unknown_failure_spike_uses_trailing_reports_outside_current_period():
    session = _session()
    settings = _settings()
    now = datetime(2026, 5, 16, 12, tzinfo=timezone.utc)
    for offset in range(2, 9):
        _report(session, source_ip=f"203.0.113.{offset}", count=10, known=False, dmarc_pass=False, report_time=now - timedelta(days=offset))
    report = _report(session, source_ip="203.0.113.200", count=40, known=False, dmarc_pass=False, report_time=now)
    alerts = analyze_report(session, settings, report)
    spike = next(alert for alert, _, _ in alerts if alert.type == "sudden_unknown_failure_spike")
    details = json.loads(spike.details_json)
    assert details["current_24h"] == 40
    assert details["trailing_7d_avg"] > 0


def test_configured_rate_thresholds_create_alerts():
    session = _session()
    settings = _settings()
    report = _report(session, source_ip="203.0.113.55", count=25, known=False, dmarc_pass=False)
    alerts = analyze_report(session, settings, report)
    assert any(alert.type == "high_unknown_source_failure_rate" for alert, _, _ in alerts)


def test_repeated_failure_days_threshold_creates_alert():
    session = _session()
    settings = _settings()
    now = datetime(2026, 5, 16, 12, tzinfo=timezone.utc)
    _report(session, source_ip="203.0.113.77", count=8, known=False, dmarc_pass=False, report_time=now - timedelta(days=1))
    report = _report(session, source_ip="203.0.113.77", count=8, known=False, dmarc_pass=False, report_time=now)
    alerts = analyze_report(session, settings, report)
    assert any(alert.type == "repeated_dmarc_failure" for alert, _, _ in alerts)


def test_missing_reporter_threshold_creates_alert():
    session = _session()
    settings = _settings()
    now = datetime(2026, 5, 16, 12, tzinfo=timezone.utc)
    _report(session, source_ip="203.0.113.88", count=1, known=False, dmarc_pass=True, report_time=now - timedelta(days=5), org_name="old-reporter")
    report = _report(session, source_ip="203.0.113.89", count=1, known=False, dmarc_pass=True, report_time=now, org_name="current-reporter")
    alerts = analyze_report(session, settings, report)
    assert any(alert.type == "missing_reporter" for alert, _, _ in alerts)
+9
View File
@@ -0,0 +1,9 @@
from app.main import app


def test_generated_api_documentation_is_disabled():
    paths = {route.path for route in app.routes}
    assert "/docs" not in paths
    assert "/redoc" not in paths
    assert "/openapi.json" not in paths
+62
View File
@@ -0,0 +1,62 @@
import gzip
import io
import zipfile
from email.message import EmailMessage
from pathlib import Path

import pytest

from app.attachment_extractor import AttachmentExtractionError, extract_dmarc_attachments, extract_payload


def _xml() -> bytes:
    return Path("tests/fixtures/sample_dmarc.xml").read_bytes()


def test_gzip_attachment_extraction():
    gz = gzip.compress(_xml())
    reports = extract_payload("report.xml.gz", "application/octet-stream", gz, 20)
    assert len(reports) == 1
    assert reports[0].payload.startswith(b"<?xml")
    assert len(reports[0].sha256) == 64


def test_zip_attachment_extraction_rejects_traversal():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("report.xml", _xml())
        archive.writestr("../evil.xml", _xml())
    with pytest.raises(AttachmentExtractionError, match="unsafe zip path"):
        extract_payload("reports.zip", "application/zip", buf.getvalue(), 20)


def test_zip_attachment_extraction_rejects_nested_archives():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("nested.zip", b"not allowed")
    with pytest.raises(AttachmentExtractionError, match="nested archive"):
        extract_payload("reports.zip", "application/zip", buf.getvalue(), 20)


def test_zip_attachment_extraction_caps_reports_per_archive():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("one.xml", _xml())
        archive.writestr("two.xml", _xml())
    with pytest.raises(AttachmentExtractionError, match="archive XML report limit"):
        extract_payload("reports.zip", "application/zip", buf.getvalue(), 20, max_reports_per_archive=1)


def test_message_attachment_detection_with_octet_stream_valid_filename():
    msg = EmailMessage()
    msg["Subject"] = "Report domain tukutoi.com"
    msg.set_content("attached")
    msg.add_attachment(gzip.compress(_xml()), maintype="application", subtype="octet-stream", filename="report.gz")
    reports = extract_dmarc_attachments(msg, 20)
    assert len(reports) == 1
+25
View File
@@ -0,0 +1,25 @@
import pytest
from fastapi import HTTPException
from fastapi.security import HTTPBasicCredentials

from app.auth import require_dashboard_auth
from app.config import Settings


def test_dashboard_auth_fails_closed_when_credentials_are_missing(monkeypatch):
    monkeypatch.delenv("DASHBOARD_USERNAME", raising=False)
    monkeypatch.delenv("DASHBOARD_PASSWORD", raising=False)
    settings = Settings.model_validate({"inboxes": [], "alerts": {"email": {"enabled": False}}})
    with pytest.raises(HTTPException) as exc:
        require_dashboard_auth(HTTPBasicCredentials(username="", password=""), settings)
    assert exc.value.status_code == 500


def test_dashboard_auth_accepts_configured_credentials(monkeypatch):
    monkeypatch.setenv("DASHBOARD_USERNAME", "admin")
    monkeypatch.setenv("DASHBOARD_PASSWORD", "secret")
    settings = Settings.model_validate({"inboxes": [], "alerts": {"email": {"enabled": False}}})
    require_dashboard_auth(HTTPBasicCredentials(username="admin", password="secret"), settings)
+22
View File
@@ -0,0 +1,22 @@
from pathlib import Path

import pytest

from app.config import load_settings


def test_default_config_requires_real_runtime_config(monkeypatch, tmp_path):
    monkeypatch.delenv("DMARC_SENTINEL_CONFIG", raising=False)
    monkeypatch.chdir(tmp_path)
    with pytest.raises(FileNotFoundError, match="config/config.yml"):
        load_settings()


def test_explicit_config_path_is_loaded(monkeypatch):
    path = Path("tests/fixtures/config_test.yml")
    monkeypatch.setenv("DMARC_SENTINEL_CONFIG", str(path))
    settings = load_settings()
    assert settings.inboxes[0].id == "tukutoi"
+79
View File
@@ -0,0 +1,79 @@
from datetime import datetime, timezone

from sqlalchemy import create_engine
from sqlalchemy.orm import Session

from app.db import Base
from app.homepage import homepage_summary, latest_summary
from app.models import Alert, LLMReport, Record, Report


def test_homepage_api_status_calculation():
    engine = create_engine("sqlite:///:memory:", future=True)
    Base.metadata.create_all(engine)
    session = Session(engine)
    report = Report(
        inbox_id="tukutoi",
        raw_xml_sha256="sha-homepage",
        report_id="r1",
        org_name="google.com",
        domain="tukutoi.com",
        date_begin=datetime.now(timezone.utc),
        date_end=datetime.now(timezone.utc),
    )
    session.add(report)
    session.flush()
    session.add(Record(report=report, source_ip="198.51.100.1", count=99, dmarc_pass=True, spf_aligned=True, dkim_aligned=True))
    session.add(Record(report=report, source_ip="203.0.113.10", count=1, dmarc_pass=False, spf_aligned=False, dkim_aligned=False))
    session.add(
        Alert(
            fingerprint="tukutoi.com:new_unknown_source:203.0.113.10",
            inbox_id="tukutoi",
            domain="tukutoi.com",
            severity="warning",
            type="new_unknown_source",
            title="New unknown source",
            summary="summary",
            details_json="{}",
        )
    )
    session.commit()
    data = homepage_summary(session)
    assert data["status"] == "warning"
    assert data["dmarc_pass_rate"] == "99.0%"
    assert data["warnings"] == 1


def test_overview_summary_prefers_portfolio_report():
    engine = create_engine("sqlite:///:memory:", future=True)
    Base.metadata.create_all(engine)
    session = Session(engine)
    now = datetime.now(timezone.utc)
    session.add_all(
        [
            LLMReport(
                domain="tukutoi.com",
                period_start=now,
                period_end=now,
                report_type="posture",
                input_json="{}",
                output_json="{}",
                plain_text="single domain",
            ),
            LLMReport(
                domain="__all__",
                period_start=now,
                period_end=now,
                report_type="posture",
                input_json="{}",
                output_json="{}",
                plain_text="portfolio",
            ),
        ]
    )
    session.commit()
    assert latest_summary(session) == "portfolio"
    assert latest_summary(session, "tukutoi.com") == "single domain"
+104
View File
@@ -0,0 +1,104 @@
from app.config import Settings
from app.dmarc_parser import ParsedAuthResult, ParsedRecord
from app.known_senders import classify_record


def _record(source_ip: str, *, dkim_domain: str = "tukutoi.com", spf_domain: str = "tukutoi.com") -> ParsedRecord:
    return ParsedRecord(
        source_ip=source_ip,
        count=1,
        disposition="none",
        policy_dkim="pass",
        policy_spf="pass",
        dkim_aligned=True,
        spf_aligned=True,
        dmarc_pass=True,
        header_from="tukutoi.com",
        reason_type=None,
        reason_comment=None,
        auth_results=[
            ParsedAuthResult(auth_type="dkim", domain=dkim_domain, result="pass"),
            ParsedAuthResult(auth_type="spf", domain=spf_domain, result="pass"),
        ],
    )


def test_ip_allowlisted_sender_requires_ip_match_even_when_auth_domain_matches():
    settings = Settings.model_validate(
        {
            "known_senders": {
                "tukutoi.com": [
                    {
                        "id": "mailcow",
                        "name": "mailcow outbound",
                        "ip_allowlist": ["45.148.30.200/32"],
                        "dkim_domains": ["tukutoi.com"],
                        "spf_domains": ["tukutoi.com"],
                    }
                ]
            },
            "alerts": {"email": {"enabled": False}},
        }
    )
    match = classify_record(settings, "tukutoi.com", _record("50.31.205.203"))
    assert match.is_known is False
    assert match.id is None
    assert match.name is None


def test_ip_allowlisted_sender_matches_configured_ip():
    settings = Settings.model_validate(
        {
            "known_senders": {
                "tukutoi.com": [
                    {
                        "id": "mailcow",
                        "name": "mailcow outbound",
                        "ip_allowlist": ["45.148.30.200/32"],
                        "dkim_domains": ["tukutoi.com"],
                        "spf_domains": ["tukutoi.com"],
                    }
                ]
            },
            "alerts": {"email": {"enabled": False}},
        }
    )
    match = classify_record(settings, "tukutoi.com", _record("45.148.30.200"))
    assert match.is_known is True
    assert match.id == "mailcow"


def test_domain_only_sender_still_matches_auth_domain_when_no_ip_allowlist_exists():
    settings = Settings.model_validate(
        {
            "known_senders": {
                "tukutoi.com": [
                    {
                        "id": "domain-only",
                        "name": "domain-only sender",
                        "ip_allowlist": [],
                        "dkim_domains": ["tukutoi.com"],
                        "spf_domains": [],
                    }
                ]
            },
            "alerts": {"email": {"enabled": False}},
        }
    )
    match = classify_record(settings, "tukutoi.com", _record("50.31.205.203"))
    assert match.is_known is True
    assert match.id == "domain-only"


def test_aligned_dkim_without_configured_sender_is_not_known_sender():
    settings = Settings.model_validate({"known_senders": {}, "alerts": {"email": {"enabled": False}}})
    match = classify_record(settings, "tukutoi.com", _record("50.31.205.203"))
    assert match.is_known is False
+49
@@ -0,0 +1,49 @@
from app.config import Settings
from app.llm import LLMClient, normalize_alert_explanation
from app.models import Alert


def test_llm_json_validation_fallback():
    client = LLMClient(Settings.model_validate({"alerts": {"email": {"enabled": False}}}))
    alert = Alert(
        fingerprint="x",
        inbox_id="tukutoi",
        domain="tukutoi.com",
        severity="critical",
        type="unknown_source_failed_both",
        title="Unknown source failed SPF and DKIM for tukutoi.com",
        summary="Deterministic summary",
        details_json="{}",
    )

    explanation = client.explain_alert(alert)

    assert explanation.confidence == "fallback"
    assert "DMARC aggregate data alone" in explanation.risk


def test_alert_explanation_accepts_explanation_action_items_shape():
    alert = Alert(
        fingerprint="x",
        inbox_id="tukutoi",
        domain="tukutoi.com",
        severity="warning",
        type="new_authenticated_source",
        title="New authenticated source observed for tukutoi.com",
        summary="Deterministic summary",
        details_json="{}",
    )

    explanation = normalize_alert_explanation(
        {
            "explanation": "A new authenticated source was observed for tukutoi.com.",
            "action_items": ["Confirm whether this source is authorized.", "Add it to known senders if approved."],
            "confidence": "high",
        },
        alert,
    )

    assert explanation.summary == "A new authenticated source was observed for tukutoi.com."
    assert "aggregate data alone" in explanation.risk
    assert "Confirm whether this source is authorized" in explanation.recommended_action
    assert explanation.confidence == "high"
+44
@@ -0,0 +1,44 @@
from pathlib import Path

import pytest

from app.dmarc_parser import DMARCParseError, parse_dmarc_xml


def test_parser_valid_dmarc_report():
    payload = Path("tests/fixtures/sample_dmarc.xml").read_bytes()
    report = parse_dmarc_xml(payload)

    assert report.org_name == "google.com"
    assert report.domain == "tukutoi.com"
    assert report.policy_p == "none"
    assert report.date_begin is not None
    assert len(report.records) == 1
    record = report.records[0]
    assert record.source_ip == "203.0.113.10"
    assert record.count == 25
    assert record.dkim_aligned is False
    assert record.spf_aligned is False
    assert record.dmarc_pass is False
    assert {auth.auth_type for auth in record.auth_results} == {"dkim", "spf"}


def test_parser_rejects_record_limit():
    payload = Path("tests/fixtures/sample_dmarc.xml").read_bytes()

    with pytest.raises(DMARCParseError, match="record limit"):
        parse_dmarc_xml(payload, max_records=0)


def test_parser_rejects_invalid_source_ip():
    payload = Path("tests/fixtures/sample_dmarc.xml").read_text().replace("203.0.113.10", "not-an-ip").encode()

    with pytest.raises(DMARCParseError, match="Invalid source IP"):
        parse_dmarc_xml(payload)


def test_parser_rejects_absurd_record_count():
    payload = Path("tests/fixtures/sample_dmarc.xml").read_text().replace("<count>25</count>", "<count>10000001</count>").encode()

    with pytest.raises(DMARCParseError, match="exceeds limit"):
        parse_dmarc_xml(payload, max_record_count=10000000)
+41
@@ -0,0 +1,41 @@
from app.config import Settings
from app.scheduler import generate_daily_summaries, generate_weekly_summaries, start_scheduler


def _settings(**llm):
    return Settings.model_validate({"alerts": {"email": {"enabled": False}}, "llm": llm})


def test_disabled_digest_jobs_do_not_instantiate_llm(monkeypatch):
    def fail_llm(*args, **kwargs):
        raise AssertionError("LLM should not be constructed when summaries are disabled")

    monkeypatch.setattr("app.scheduler.LLMClient", fail_llm)
    settings = _settings(generate_daily_summary=False, generate_weekly_summary=False)

    assert generate_daily_summaries(settings) == []
    assert generate_weekly_summaries(settings) == []


def test_scheduler_only_registers_enabled_digest_jobs(monkeypatch):
    created = []

    class FakeScheduler:
        running = True

        def __init__(self, timezone):
            self.timezone = timezone
            self.jobs = []
            created.append(self)

        def add_job(self, func, trigger, **kwargs):
            self.jobs.append(kwargs["id"])

        def start(self):
            pass

    monkeypatch.setattr("app.scheduler.BackgroundScheduler", FakeScheduler)

    scheduler = start_scheduler(_settings(generate_daily_summary=False, generate_weekly_summary=True))

    assert scheduler.jobs == ["poll", "weekly"]
+20
@@ -0,0 +1,20 @@
from datetime import date

import pytest
from pydantic import ValidationError

from app.schemas import BacklogRequest


def test_backlog_request_parses_iso_dates():
    request = BacklogRequest.model_validate(
        {"inbox_id": "tukutoi", "since": "2026-05-01", "before": "2026-05-16"}
    )

    assert request.since == date(2026, 5, 1)
    assert request.before == date(2026, 5, 16)


def test_backlog_request_rejects_malformed_dates():
    with pytest.raises(ValidationError):
        BacklogRequest.model_validate({"inbox_id": "tukutoi", "since": "not-a-date"})
+37
@@ -0,0 +1,37 @@
from email.message import EmailMessage
from pathlib import Path

from sqlalchemy import create_engine
from sqlalchemy.orm import Session

from app.config import Settings
from app.db import Base
from app.attachment_extractor import ExtractedReport
from app.message_processor import _store_report
from app.models import MailMessage


def test_duplicate_sha_detection():
    engine = create_engine("sqlite:///:memory:", future=True)
    Base.metadata.create_all(engine)
    session = Session(engine)
    mail = MailMessage(inbox_id="tukutoi", imap_uid="1", folder="DMARC", status="skipped")
    session.add(mail)
    session.commit()

    payload = Path("tests/fixtures/sample_dmarc.xml").read_bytes()
    extracted = ExtractedReport("report.xml", payload, "0" * 64)
    settings = Settings.model_validate({"alerts": {"email": {"enabled": False}}})
    report, duplicate = _store_report(session, settings, settings.inboxes[0] if settings.inboxes else _Inbox(), mail, extracted)
    session.commit()
    second, second_duplicate = _store_report(session, settings, settings.inboxes[0] if settings.inboxes else _Inbox(), mail, extracted)

    assert report is not None
    assert duplicate is None
    assert second is None
    assert second_duplicate == report


class _Inbox:
    id = "tukutoi"
    domain = "tukutoi.com"