linux_patch_manager/tasks/todo.md
Draco-Lunaris-Echo 88b190ac8d fix(security): restrict auth-config mutations to Admin role (#5)
Restrict manager-wide authentication configuration mutations (OIDC, SMTP, IP allowlist) to Admin role. Operators now receive 403 forbidden_role.

- New admin_required helper in settings.rs
- 4 gate changes: update_settings, discover_oidc, test_oidc, update_ip_whitelist
- 5 new AuditAction variants + migration 019
- SPA friendly error message on 403
- 3 admin_required unit tests pass (43/43)
- Full integration tests deferred to issue #15

Closes #5
2026-06-03 09:16:41 -05:00


SSO Implementation Fix Plan

Issues Identified

  1. No SSO Login Button — LoginPage.tsx missing "Sign in with Azure" button
  2. No SSO Callback Route — App.tsx missing frontend route to handle SSO callback
  3. authStore No SSO Support — authStore.ts has no method to store SSO tokens
  4. Backend Returns JSON Not Redirect — azure_sso.rs callback returns JSON tokens instead of redirecting to frontend
  5. No SSO Session Cleanup — sso_sessions DashMap has no expiry/cleanup task (memory leak)
  6. No JWT Signature Verification — id_token decoded without verifying Azure AD signature

Phases

Phase 1: Backend SSO Fixes (Issues 4, 5) — COMPLETE

  • 1a: Add SSO session cleanup task in main.rs (purge sessions older than 10 minutes)
  • 1b: Modify azure_sso.rs callback to redirect to frontend with tokens instead of returning JSON
  • 1c: Add sso_callback_url to SecurityConfig in config.rs with serde default
  • 1d: Update settings.rs to include sso_callback_url in settings response
  • 1e: Verify backend compiles with cargo check
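The purge step in 1a can be sketched as a pure function, which keeps the expiry rule testable without a runtime. This is a minimal sketch under stated assumptions: `SsoSession` and `purge_expired` are hypothetical names, and a `std` `HashMap` stands in for the `DashMap` used by the real store so the example stays self-contained.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical session record; the real fields in azure_sso.rs are not shown in this plan.
pub struct SsoSession {
    pub created_at: Instant,
}

/// Remove every session older than `max_age` and return how many were purged.
pub fn purge_expired(
    sessions: &mut HashMap<String, SsoSession>,
    now: Instant,
    max_age: Duration,
) -> usize {
    let before = sessions.len();
    // Keep only the sessions whose age is still within the limit.
    sessions.retain(|_, s| now.saturating_duration_since(s.created_at) <= max_age);
    before - sessions.len()
}
```

In the service this would presumably be called from a spawned background task on a fixed interval, with `max_age` set to the 10 minutes named in 1a.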

Phase 2: Frontend SSO Integration (Issues 1, 2, 3) — COMPLETE

  • 2a: Add SSO callback page component (SsoCallbackPage.tsx)
  • 2b: Add SSO callback route to App.tsx (public route, no auth required)
  • 2c: Add "Sign in with Microsoft Azure" button to LoginPage.tsx
  • 2d: Add SSO-related types and API methods to frontend
  • 2e: Verify frontend builds with TypeScript compilation

Phase 3: JWT Signature Verification (Issue 6) — COMPLETE

  • 3a: Add JWKS client dependency to pm-web/Cargo.toml
  • 3b: Implement id_token signature verification in azure_sso.rs
  • 3c: Verify backend compiles with cargo check

Phase 4: Integration Testing and Verification — COMPLETE

  • 4a: Backend code review — all changes verified manually
  • 4b: Frontend TypeScript compilation — passes cleanly
  • 4c: SSO login flow reviewed end-to-end (backend redirect → frontend callback → auth store)
  • 4d: SSO session cleanup verified (10-minute expiry, 60-second purge interval)
  • 4e: Settings page SSO config unchanged (sso_callback_url added as read-only)
  • 4f: Lessons captured below

Lessons Learned


WS Origin Allowlist — Implementation Plan (Issue #10)

Spec: tasks/ws-origin-check-spec.md (v0.1.0, awaiting sign-off)

Issues Identified

  1. No Origin check on WS upgrade: ws_handler in crates/pm-web/src/routes/ws.rs does not inspect the Origin header, leaving the /api/v1/ws/jobs endpoint exposed to Cross-Site WebSocket Hijacking (CSWSH) if a ticket ever leaks via logs / Referer / browser history / support bundles.
  2. No allowed_origins config field: SecurityConfig has no way to express the allowlist; defaults need to be derived from sso_callback_url to stay secure out of the box.
  3. No integration tests for ws.rs: there is no crates/pm-web/tests/ directory today, so the new behavior would land without automated coverage.

Phases

Phase 1: Config schema (Issue 2)

  • 1a: Add allowed_origins: Vec<String> to SecurityConfig in crates/pm-core/src/config.rs
  • 1b: Implement default_allowed_origins() that parses sso_callback_url to scheme://host[:port]
  • 1c: Emit tracing::warn! at startup if the derived allowlist ends up empty
  • 1d: Update Default for AppConfig to include the new field
  • 1e: Update config/config.example.toml with documented allowed_origins key
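Step 1b derives the default allowlist from sso_callback_url. The following is a minimal sketch of that derivation, assuming a hand-rolled parse with no URL crate. `origin_from_url` is a hypothetical helper, and lowercasing the scheme and host is an assumption rather than something the spec excerpt states.

```rust
/// Derive `scheme://host[:port]` from a URL, or `None` if it cannot be parsed.
/// Hypothetical helper, not taken from the codebase.
pub fn origin_from_url(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    if scheme.is_empty() {
        return None;
    }
    // The authority ends at the first '/', '?' or '#'.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    // Drop any "user:pass@" prefix.
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, hp)| hp);
    if host_port.is_empty() {
        return None;
    }
    Some(format!(
        "{}://{}",
        scheme.to_ascii_lowercase(),
        host_port.to_ascii_lowercase()
    ))
}

/// Zero or one origins, depending on whether the callback URL parses.
pub fn default_allowed_origins(sso_callback_url: &str) -> Vec<String> {
    origin_from_url(sso_callback_url).into_iter().collect()
}
```

An empty result is the case 1c is meant to surface with a startup warning, since an empty allowlist would reject every WebSocket upgrade.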

Phase 2: Handler change (Issue 1)

  • 2a: Add HeaderMap extractor to ws_handler
  • 2b: Implement hand-rolled Origin parser (scheme, host, port) with default-port normalization
  • 2c: Implement allowlist match (exact, case-insensitive host, case-sensitive scheme/port)
  • 2d: Reject missing / malformed / non-allowlisted Origin with 403 forbidden_origin before ticket validation
  • 2e: Augment the success tracing::info! with origin; add tracing::warn! on rejection (never log the ticket)
  • 2f: Verify cargo check -p pm-web and cargo clippy --all-targets pass
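Steps 2b to 2d can be sketched as a parser plus an exact-match check. This is an illustration under assumptions, not the handler itself: the `Origin` struct and both function names are hypothetical, only http and https origins are accepted, and the header is passed in as an `Option<&str>` rather than read from a `HeaderMap`.

```rust
/// Parsed Origin. Scheme and port are compared exactly; the host is stored lowercased.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Origin {
    pub scheme: String,
    pub host: String,
    pub port: u16,
}

/// Parse `scheme://host[:port]`, filling in 80 or 443 when the port is omitted.
pub fn parse_origin(raw: &str) -> Option<Origin> {
    let (scheme, rest) = raw.split_once("://")?;
    let default_port = match scheme {
        "http" => 80,
        "https" => 443,
        _ => return None, // assumption: only http(s) origins are accepted
    };
    // An Origin carries no path, query, fragment, userinfo or whitespace.
    if rest.is_empty()
        || rest.contains(|c: char| matches!(c, '/' | '?' | '#' | '@'))
        || rest.contains(char::is_whitespace)
    {
        return None;
    }
    // Split the port at the last ':' that sits outside an IPv6 bracket.
    let (host, port) = match rest.rfind(':') {
        Some(i) if !rest[i..].contains(']') => {
            (&rest[..i], rest[i + 1..].parse::<u16>().ok()?)
        }
        _ => (rest, default_port),
    };
    if host.is_empty() {
        return None;
    }
    Some(Origin {
        scheme: scheme.to_string(),
        host: host.to_ascii_lowercase(),
        port,
    })
}

/// Missing, malformed and non-allowlisted origins are all rejected.
pub fn origin_allowed(header: Option<&str>, allowlist: &[Origin]) -> bool {
    match header.and_then(|h| parse_origin(h)) {
        Some(origin) => allowlist.contains(&origin),
        None => false,
    }
}
```

Normalizing the default port at parse time means `https://host` and `https://host:443` compare equal without special cases in the match. A `false` result here corresponds to the 403 forbidden_origin response in 2d, returned before the ticket is examined.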

Phase 3: Tests (Issue 3)

  • 3a: Add crates/pm-web/tests/ and a build_test_app harness (no DB, minimal AppState)
  • 3b: Add ws_rejects_missing_origin test
  • 3c: Add ws_rejects_disallowed_origin test
  • 3d: Add ws_rejects_malformed_origin test
  • 3e: Add ws_allows_listed_origin_with_valid_ticket test (asserts ticket is consumed)
  • 3f: Add ws_default_origin_derived_from_sso_callback_url config-derivation test
  • 3g: Verify cargo test -p pm-web passes

Phase 4: Documentation

  • 4a: Update docs/security-review.md with a new control row for the WS Origin allowlist
  • 4b: (Optional, per Kelly) bump SPEC.md to 0.0.3 with a sentence in the Security section

Phase 5: Review

  • 5a: Self-review against the 10-point acceptance criteria in the spec
  • 5b: Commit on a feature branch (issue/10-ws-origin-check) per git-workflow skill
  • 5c: Lessons captured below

Lessons Learned (this issue)

(filled in at completion)

  • SSO callback must redirect, not return JSON — Browser OAuth2 flows require the backend to redirect to the frontend SPA, not return JSON tokens. The frontend must parse tokens from URL query parameters.
  • URLSearchParams.get() already decodes — Don't double-decode with decodeURIComponent() when using URLSearchParams.
  • JWKS caching prevents rate-limiting — Azure AD JWKS endpoint should be cached with TTL (1 hour) to avoid fetching on every SSO login.
  • tokio::sync::Mutex over std::sync::Mutex — Axum handler futures must be Send; a std::sync::MutexGuard is not Send, so it cannot be held across an .await point.
  • DashMap session cleanup — In-memory session stores (DashMap) need periodic cleanup tasks to prevent memory leaks. Pattern: tokio::spawn with interval + retain with time-based cutoff.

IP Allowlist Hardening — Implementation Plan (Issue #3)

Spec: tasks/ip-allowlist-spec.md (v0.1.0, awaiting sign-off)

Issues Identified

  1. Allowlist bypass via missing XFF: extract_remote_ip returns None when the header is absent, and the middleware's if let Some(ip) block has no else branch, so a request without X-Forwarded-For skips the check.
  2. Allowlist spoofing via XFF: extract_remote_ip reads the header unconditionally; any client can claim to be from a whitelisted IP.
  3. No trusted-proxy concept: there is no config field to declare which intermediate proxies are allowed to set X-Forwarded-For.
  4. No ConnectInfo<SocketAddr> wiring: the axum listeners in pm-web/src/main.rs do not use into_make_service_with_connect_info, so the middleware cannot access the real peer address.

Phases

Phase 1: Resolver helper in pm-auth

  • 1a: Add fn resolve_client_ip(headers, peer, trusted_proxies) -> Option<IpAddr>
  • 1b: Add 12 unit tests in crates/pm-auth/src/rbac.rs (cfg(test)) covering the resolution matrix (peer-only, XFF trusted/untrusted, multi-hop, IPv6, malformed, missing peer)
  • 1c: Run cargo test -p pm-auth and confirm green
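One way the resolution matrix in 1b could play out is sketched below. This is an assumption-laden illustration: the X-Forwarded-For value is passed as an `Option<&str>` instead of a header map, trusted proxies are exact addresses rather than the `IpNet` ranges planned in Phase 2, and the right-to-left walk with fallback to the peer on a malformed entry is one plausible reading of the spec, not a confirmed one.

```rust
use std::net::IpAddr;

/// Resolve the client IP from the socket peer and an optional X-Forwarded-For value.
pub fn resolve_client_ip(
    xff: Option<&str>,
    peer: Option<IpAddr>,
    trusted_proxies: &[IpAddr],
) -> Option<IpAddr> {
    // Without a peer address nothing can be resolved.
    let peer = peer?;
    // Only a trusted proxy may speak for another client.
    if !trusted_proxies.contains(&peer) {
        return Some(peer);
    }
    let xff = match xff {
        Some(value) => value,
        None => return Some(peer),
    };
    // Walk the hops right to left, skipping trusted proxies.
    for hop in xff.rsplit(',') {
        match hop.trim().parse::<IpAddr>() {
            Ok(ip) if trusted_proxies.contains(&ip) => continue,
            Ok(ip) => return Some(ip),
            // Malformed entry: fall back to the peer address.
            Err(_) => return Some(peer),
        }
    }
    // Every hop was a trusted proxy.
    Some(peer)
}
```

Ignoring the header whenever the peer is not a trusted proxy is what closes the spoofing issue: a direct client can send any X-Forwarded-For value it likes and still be identified by its socket address.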

Phase 2: AuthConfig + SecurityConfig schema

  • 2a: Add trusted_proxies: Arc<RwLock<Vec<IpNet>>> to AuthConfig
  • 2b: Add trusted_proxies: Vec<String> to SecurityConfig in crates/pm-core/src/config.rs
  • 2c: Update Default for AppConfig to include trusted_proxies: vec![]
  • 2d: Add update_trusted_proxies setter on AuthConfig (symmetric to update_ip_whitelist)
  • 2e: Update config/config.example.toml with a documented trusted_proxies entry and a reverse-proxy runbook comment block
  • 2f: Plumb trusted_proxies from SecurityConfig into AuthConfig::new in pm-web/src/main.rs
  • 2g: Run cargo check and cargo clippy --all-targets

Phase 3: Middleware change

  • 3a: Update require_auth to extract ConnectInfo<SocketAddr> from request extensions and call resolve_client_ip
  • 3b: Add fail-closed path: non-empty allowlist + unresolvable IP → 403 forbidden_ip
  • 3c: Replace forbidden("Access denied") with the new error code in IP-deny path
  • 3d: Add tracing::warn! with client_ip, peer, xff_present, reason
  • 3e: Remove the old extract_remote_ip (header-only) function
  • 3f: Run cargo check and cargo clippy --all-targets
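The fail-closed rule in 3b reduces to a small decision function. The sketch below assumes exact-address allowlist entries; `IpDecision`, `check_ip_allowlist` and the reason strings are hypothetical, while the forbidden_ip code comes from the plan.

```rust
use std::net::IpAddr;

/// Outcome of the allowlist check. `reason` is meant for the warn-level log line.
#[derive(Debug, PartialEq, Eq)]
pub enum IpDecision {
    Allow,
    Deny { code: &'static str, reason: &'static str },
}

pub fn check_ip_allowlist(allowlist: &[IpAddr], client_ip: Option<IpAddr>) -> IpDecision {
    // An empty allowlist means the feature is disabled.
    if allowlist.is_empty() {
        return IpDecision::Allow;
    }
    match client_ip {
        Some(ip) if allowlist.contains(&ip) => IpDecision::Allow,
        Some(_) => IpDecision::Deny {
            code: "forbidden_ip",
            reason: "not_in_allowlist",
        },
        // Fail closed: a configured allowlist with no resolvable IP is a denial.
        None => IpDecision::Deny {
            code: "forbidden_ip",
            reason: "unresolvable_ip",
        },
    }
}
```

Handling `None` as its own arm is the point of the change: the old `if let Some(ip)` block had no else branch, so an unresolvable IP fell through as an allow.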

Phase 4: pm-web listener wiring

  • 4a: Switch both TCP and TLS axum listeners in pm-web/src/main.rs to into_make_service_with_connect_info::<SocketAddr>()
  • 4b: Run cargo check -p pm-web

Phase 5: Middleware integration tests

  • 5a: Add TestApp harness in crates/pm-auth/src/rbac.rs cfg(test) (no DB, single-route router, tower::ServiceExt-style call)
  • 5b: Add 8 middleware integration tests per spec section 6.1 (allow empty, deny non-empty, allow in list, fail-closed no peer, spoofed XFF ignored, trusted proxy honors XFF, bad XFF fallback, no-JWT on deny)
  • 5c: Run cargo test -p pm-auth and confirm green

Phase 6: Documentation

  • 6a: Update docs/security-review.md — update existing IP-allowlist row and reference new code path + trusted_proxies field
  • 6b: Update SPEC.md Security section (one paragraph)
  • 6c: Add a "Reverse proxy deployment" runbook under docs/runbooks/ (optional, per Kelly)

Phase 7: Review & commit

  • 7a: Self-review against the 8 acceptance criteria in the spec
  • 7b: Run bash /a0/usr/skills/git-workflow/scripts/validate-push.sh
  • 7c: Commit on fix/3-ip-allowlist-bypass (per git-workflow skill)
  • 7d: Push to github/fix/3-ip-allowlist-bypass and open PR against master
  • 7e: Comment on issue #3 linking the PR; close issue on merge
  • 7f: Capture lessons in this file

Lessons Learned (this issue)

(filled in at completion)


Host Self-Enrollment Implementation Plan

Phases

Phase 1: Database & Core Models

  • 1a: Create SQL migration for enrollment_requests table
  • 1b: Define Rust data models for EnrollmentRequest in pm-core
  • 1c: Add DB interaction methods (insert, list, delete) in pm-core

Phase 2: Client-Facing API (pm-web)

  • 2a: Implement POST /api/v1/enroll to accept payloads and generate polling_token
  • 2b: Implement GET /api/v1/enroll/status/{token} to return pending/approved (PKI) statuses
  • 2c: Implement IP-based rate limiting for the /enroll endpoint
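The plan does not say which algorithm 2c should use. As one possibility, a fixed-window counter per IP is sketched below; `EnrollRateLimiter`, its fields, and the limits used in the usage note are all hypothetical.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Fixed-window limiter: at most `max_requests` per `window` for each IP.
pub struct EnrollRateLimiter {
    max_requests: u32,
    window: Duration,
    // Per-IP window start time and request count.
    hits: HashMap<IpAddr, (Instant, u32)>,
}

impl EnrollRateLimiter {
    pub fn new(max_requests: u32, window: Duration) -> Self {
        Self {
            max_requests,
            window,
            hits: HashMap::new(),
        }
    }

    /// Record a request from `ip` at `now` and report whether it is allowed.
    pub fn allow(&mut self, ip: IpAddr, now: Instant) -> bool {
        let entry = self.hits.entry(ip).or_insert((now, 0));
        // Start a fresh window once the previous one has elapsed.
        if now.saturating_duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.max_requests {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

With `EnrollRateLimiter::new(2, Duration::from_secs(60))`, a third request from the same IP inside one minute is refused while other IPs are unaffected. The map grows with the number of distinct IPs, so a real deployment would also need to evict stale entries.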

Phase 3: Admin-Facing API (pm-web)

  • 3a: Implement GET /api/v1/admin/enrollments to list pending queue
  • 3b: Implement POST /api/v1/admin/enrollments/{id}/approve (generate PKI via pm-ca, migrate to hosts table)
  • 3c: Implement DELETE /api/v1/admin/enrollments/{id}/deny to purge request

Phase 4: Background Workers (pm-worker)

  • 4a: Create a scheduled task to purge enrollment_requests older than 24 hours

Phase 5: Frontend UI (pm-web/React)

  • 5a: Add enrollment API methods and types to frontend
  • 5b: Update Hosts view to include "Pending Enrollments" filter and visual badge
  • 5c: Render pending hosts in the table with highlight styling
  • 5d: Add Approve/Deny action buttons to pending host rows
  • 5e: Implement "merge/overwrite" interactive modal for fqdn/ip_address collisions on approval

Issue #5: Admin-Only Manager-Wide Configuration (Authz Gate)

Spec: tasks/authz-gate-spec.md (v0.1.0)
Branch: fix/5-operator-can-modify-auth-config
Status: Draft spec — awaiting Kelly sign-off

Phase 1: admin_required helper + 3 unit tests

  • 1a: Add admin_required helper in crates/pm-web/src/routes/settings.rs (after write_access_required ~line 173). Returns 403 with code forbidden_role if not Admin.
  • 1b: Add 3 unit tests in cfg(test) module: admin_required_admin_passes, admin_required_operator_denied, admin_required_reporter_denied.
  • 1c: Run cargo test -p pm-web --bins --tests and confirm green.
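The gate in 1a is small enough to sketch in full. The `Role` variants follow the role model named later in this plan (Admin, Operator, Reporter); `ApiError` and the function signature are assumptions, since the real helper in settings.rs presumably takes the authenticated user or claims rather than a bare role.

```rust
/// Roles as described in this plan: Admin, Operator, Reporter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Admin,
    Operator,
    Reporter,
}

/// Hypothetical error shape; the real handler error type is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub struct ApiError {
    pub status: u16,
    pub code: &'static str,
}

/// Allow only Admin; every other role gets 403 forbidden_role.
pub fn admin_required(role: Role) -> Result<(), ApiError> {
    if role == Role::Admin {
        Ok(())
    } else {
        Err(ApiError {
            status: 403,
            code: "forbidden_role",
        })
    }
}
```

Comparing against `Role::Admin` instead of listing the denied roles means a role added later is denied by default.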

Phase 2: Gate changes + audit log calls

  • 2a: Replace write_access_required with admin_required in update_settings (line 336).
  • 2b: Replace write_access_required with admin_required in update_ip_whitelist (line 902).
  • 2c: Replace write_access_required with admin_required in discover_oidc (line 561).
  • 2d: Replace write_access_required with admin_required in test_oidc (line 619).
  • 2e: Create migrations/019_auth_config_audit_actions.sql with 5 new enum values.
  • 2f: Add 5 new variants to the AuditAction enum in crates/pm-core/src/audit.rs (or wherever defined).
  • 2g: Add write_audit_event calls in each of the 4 handlers, after successful mutations.
  • 2h: Run cargo fmt --check --all, cargo clippy --all-targets -- -D warnings, cargo test -p pm-web --bins --tests and confirm clean.

Phase 3: Integration tests (10 new)

  • 3a: update_settings_operator_denied — POST as Operator with OIDC fields → 403 forbidden_role.
  • 3b: update_settings_admin_allowed — POST as Admin with OIDC fields → 200 + audit row written.
  • 3c: update_settings_smtp_operator_denied — POST as Operator with SMTP fields → 403 forbidden_role.
  • 3d: update_settings_smtp_admin_allowed — POST as Admin with SMTP fields → 200 + audit row written.
  • 3e: update_ip_whitelist_operator_denied — POST as Operator → 403 forbidden_role.
  • 3f: update_ip_whitelist_admin_allowed — POST as Admin → 200 + audit row written + in-memory AuthConfig.ip_whitelist updated.
  • 3g: discover_oidc_operator_denied / discover_oidc_admin_allowed — 2 tests.
  • 3h: test_oidc_operator_denied / test_oidc_admin_allowed — 2 tests.
  • 3i: Run cargo test -p pm-web --bins --tests and confirm all green.

Phase 4: SPA error message + 1 test

  • 4a: Update frontend/src/pages/SettingsPage.tsx to detect error.code === 'forbidden_role' and show friendly message: "Only Admins can modify authentication configuration. Contact an Admin to make this change."
  • 4b: Create frontend/src/pages/__tests__/SettingsPage.test.tsx with 1 test: settings_page_forbidden_role_shows_friendly_message.
  • 4c: Run npm test in frontend/ and confirm green.

Phase 5: Documentation

  • 5a: Update docs/security-review.md §2.3 (Authorization / RBAC) with 2 new rows.
  • 5b: Annotate the 4 affected endpoints in docs/REST_API.md with "🔒 Admin only".
  • 5c: Add a project-specific lesson in tasks/lessons.md about the role model (Admin = Manager-wide, Operator = per-host, Reporter = read-only).

Phase 6: Review & commit

  • 6a: Self-review against the 9 acceptance criteria in the spec.
  • 6b: Manual pre-push checks (cargo fmt, cargo clippy, eslint, cargo test, npm test) — run all 6 from the recent lessons-learned entry.
  • 6c: Commit on fix/5-operator-can-modify-auth-config with conventional format.
  • 6d: Push to github/fix/5-operator-can-modify-auth-config via github-echo SSH alias.
  • 6e: Open PR against master and comment on issue #5.
  • 6f: Capture lessons in tasks/lessons.md (project-specific) and git-workflow/references/lessons-learned.md (skill-level).