Restrict manager-wide authentication configuration mutations (OIDC, SMTP, IP allowlist) to Admin role. Operators now receive 403 forbidden_role. - New admin_required helper in settings.rs - 4 gate changes: update_settings, discover_oidc, test_oidc, update_ip_whitelist - 5 new AuditAction variants + migration 019 - SPA friendly error message on 403 - 3 admin_required unit tests pass (43/43) - Full integration tests deferred to issue #15 Closes #5
15 KiB
15 KiB
SSO Implementation Fix Plan
Issues Identified
- No SSO Login Button — LoginPage.tsx missing "Sign in with Azure" button
- No SSO Callback Route — App.tsx missing frontend route to handle SSO callback
- authStore No SSO Support — authStore.ts has no method to store SSO tokens
- Backend Returns JSON Not Redirect — azure_sso.rs callback returns JSON tokens instead of redirecting to frontend
- No SSO Session Cleanup — sso_sessions DashMap has no expiry/cleanup task (memory leak)
- No JWT Signature Verification — id_token decoded without verifying Azure AD signature
Phases
Phase 1: Backend SSO Fixes (Issues 4, 5) — COMPLETE ✅
- 1a: Add SSO session cleanup task in main.rs (purge sessions older than 10 minutes)
- 1b: Modify azure_sso.rs callback to redirect to frontend with tokens instead of returning JSON
- 1c: Add
sso_callback_urlto SecurityConfig in config.rs with serde default - 1d: Update settings.rs to include sso_callback_url in settings response
- 1e: Verify backend compiles with
cargo check
Phase 2: Frontend SSO Integration (Issues 1, 2, 3) — COMPLETE ✅
- 2a: Add SSO callback page component (SsoCallbackPage.tsx)
- 2b: Add SSO callback route to App.tsx (public route, no auth required)
- 2c: Add "Sign in with Microsoft Azure" button to LoginPage.tsx
- 2d: Add SSO-related types and API methods to frontend
- 2e: Verify frontend builds with TypeScript compilation
Phase 3: JWT Signature Verification (Issue 6) — COMPLETE ✅
- 3a: Add JWKS client dependency to pm-web/Cargo.toml
- 3b: Implement id_token signature verification in azure_sso.rs
- 3c: Verify backend compiles with
cargo check
Phase 4: Integration Testing and Verification — COMPLETE ✅
- 4a: Backend code review — all changes verified manually
- 4b: Frontend TypeScript compilation — passes cleanly
- 4c: SSO login flow reviewed end-to-end (backend redirect → frontend callback → auth store)
- 4d: SSO session cleanup verified (10-minute expiry, 60-second purge interval)
- 4e: Settings page SSO config unchanged (sso_callback_url added as read-only)
- 4f: Lessons captured below
Lessons Learned
WS Origin Allowlist — Implementation Plan (Issue #10)
Spec: tasks/ws-origin-check-spec.md (v0.1.0, awaiting sign-off)
Issues Identified
- No Origin check on WS upgrade —
crates/pm-web/src/routes/ws.rsws_handlerdoes not inspect theOriginheader, leaving the/api/v1/ws/jobsendpoint exposed to Cross-Site WebSocket Hijacking (CSWSH) if a ticket ever leaks via logs /Referer/ browser history / support bundles. - No
allowed_originsconfig field —SecurityConfighas no way to express the allowlist; defaults need to be derived fromsso_callback_urlto stay secure out of the box. - No integration tests for ws.rs — there is no
crates/pm-web/tests/directory today, so the new behavior would land without automated coverage.
Phases
Phase 1: Config schema (Issue 2)
- 1a: Add
allowed_origins: Vec<String>toSecurityConfigincrates/pm-core/src/config.rs - 1b: Implement
default_allowed_origins()that parsessso_callback_urltoscheme://host[:port] - 1c: Emit
tracing::warn!at startup if the derived allowlist ends up empty - 1d: Update
Default for AppConfigto include the new field - 1e: Update
config/config.example.tomlwith documentedallowed_originskey
Phase 2: Handler change (Issue 1)
- 2a: Add
HeaderMapextractor tows_handler - 2b: Implement hand-rolled
Originparser (scheme, host, port) with default-port normalization - 2c: Implement allowlist match (exact, case-insensitive host, case-sensitive scheme/port)
- 2d: Reject missing / malformed / non-allowlisted
Originwith403 forbidden_originbefore ticket validation - 2e: Augment the success
tracing::info!withorigin; addtracing::warn!on rejection (never log the ticket) - 2f: Verify
cargo check -p pm-webandcargo clippy --all-targetspass
Phase 3: Tests (Issue 3)
- 3a: Add
crates/pm-web/tests/and abuild_test_appharness (no DB, minimalAppState) - 3b: Add
ws_rejects_missing_origintest - 3c: Add
ws_rejects_disallowed_origintest - 3d: Add
ws_rejects_malformed_origintest - 3e: Add
ws_allows_listed_origin_with_valid_tickettest (asserts ticket is consumed) - 3f: Add
ws_default_origin_derived_from_sso_callback_urlconfig-derivation test - 3g: Verify
cargo test -p pm-webpasses
Phase 4: Documentation
- 4a: Update
docs/security-review.mdwith a new control row for the WS Origin allowlist - 4b: (Optional, per Kelly) bump
SPEC.mdto 0.0.3 with a sentence in the Security section
Phase 5: Review
- 5a: Self-review against the 10-point acceptance criteria in the spec
- 5b: Commit on a feature branch (
issue/10-ws-origin-check) per git-workflow skill - 5c: Lessons captured below
Lessons Learned (this issue)
(filled in at completion)
- SSO callback must redirect, not return JSON — Browser OAuth2 flows require the backend to redirect to the frontend SPA, not return JSON tokens. The frontend must parse tokens from URL query parameters.
- URLSearchParams.get() already decodes — Don't double-decode with decodeURIComponent() when using URLSearchParams.
- JWKS caching prevents rate-limiting — Azure AD JWKS endpoint should be cached with TTL (1 hour) to avoid fetching on every SSO login.
- tokio::sync::Mutex over std::sync::Mutex — Axum handlers must be Send; std::sync::MutexGuard is not Send across await points.
- DashMap session cleanup — In-memory session stores (DashMap) need periodic cleanup tasks to prevent memory leaks. Pattern: tokio::spawn with interval + retain with time-based cutoff.
IP Allowlist Hardening — Implementation Plan (Issue #3)
Spec: tasks/ip-allowlist-spec.md (v0.1.0, awaiting sign-off)
Issues Identified
- Allowlist bypass via missing XFF —
extract_remote_ipreturnsNonewhen the header is absent, and the middleware'sif let Some(ip)block has noelsebranch, so a request withoutX-Forwarded-Forskips the check. - Allowlist spoofing via XFF —
extract_remote_ipreads the header unconditionally; any client can claim to be from a whitelisted IP. - No trusted-proxy concept — there is no config field to declare which
intermediate proxies are allowed to set
X-Forwarded-For. - No
ConnectInfo<SocketAddr>wiring — the axum listeners inpm-web/src/main.rsdo not useinto_make_service_with_connect_info, so the middleware cannot access the real peer address.
Phases
Phase 1: Resolver helper in pm-auth
- 1a: Add
fn resolve_client_ip(headers, peer, trusted_proxies) -> Option<IpAddr> - 1b: Add 12 unit tests in
crates/pm-auth/src/rbac.rs(cfg(test)) covering the resolution matrix (peer-only, XFF trusted/untrusted, multi-hop, IPv6, malformed, missing peer) - 1c: Run
cargo test -p pm-authand confirm green
Phase 2: AuthConfig + SecurityConfig schema
- 2a: Add
trusted_proxies: Arc<RwLock<Vec<IpNet>>>toAuthConfig - 2b: Add
trusted_proxies: Vec<String>toSecurityConfigincrates/pm-core/src/config.rs - 2c: Update
Default for AppConfigto includetrusted_proxies: vec![] - 2d: Add
update_trusted_proxiessetter onAuthConfig(symmetric toupdate_ip_whitelist) - 2e: Update
config/config.example.tomlwith a documentedtrusted_proxiesentry and a reverse-proxy runbook comment block - 2f: Plumb
trusted_proxiesfromSecurityConfigintoAuthConfig::newinpm-web/src/main.rs - 2g: Run
cargo checkandcargo clippy --all-targets
Phase 3: Middleware change
- 3a: Update
require_authto extractConnectInfo<SocketAddr>from request extensions and callresolve_client_ip - 3b: Add fail-closed path: non-empty allowlist + unresolvable IP →
403 forbidden_ip - 3c: Replace
forbidden("Access denied")with the new error code in IP-deny path - 3d: Add
tracing::warn!withclient_ip,peer,xff_present,reason - 3e: Remove the old
extract_remote_ip(header-only) function - 3f: Run
cargo checkandcargo clippy --all-targets
Phase 4: pm-web listener wiring
- 4a: Switch both TCP and TLS axum listeners in
pm-web/src/main.rstointo_make_service_with_connect_info::<SocketAddr>() - 4b: Run
cargo check -p pm-web
Phase 5: Middleware integration tests
- 5a: Add
TestAppharness incrates/pm-auth/src/rbac.rscfg(test) (no DB, single-route router,tower::ServiceExt-style call) - 5b: Add 8 middleware integration tests per spec section 6.1 (allow empty, deny non-empty, allow in list, fail-closed no peer, spoofed XFF ignored, trusted proxy honors XFF, bad XFF fallback, no-JWT on deny)
- 5c: Run
cargo test -p pm-authand confirm green
Phase 6: Documentation
- 6a: Update
docs/security-review.md— update existing IP-allowlist row and reference new code path +trusted_proxiesfield - 6b: Update
SPEC.mdSecurity section (one paragraph) - 6c: Add a "Reverse proxy deployment" runbook under
docs/runbooks/(optional, per Kelly)
Phase 7: Review & commit
- 7a: Self-review against the 8 acceptance criteria in the spec
- 7b: Run
bash /a0/usr/skills/git-workflow/scripts/validate-push.sh - 7c: Commit on
fix/3-ip-allowlist-bypass(per git-workflow skill) - 7d: Push to
github/fix/3-ip-allowlist-bypassand open PR againstmaster - 7e: Comment on issue #3 linking the PR; close issue on merge
- 7f: Capture lessons in this file
Lessons Learned (this issue)
(filled in at completion)
Host Self-Enrollment Implementation Plan
Phases
Phase 1: Database & Core Models
- 1a: Create SQL migration for
enrollment_requeststable - 1b: Define Rust data models for
EnrollmentRequestinpm-core - 1c: Add DB interaction methods (insert, list, delete) in
pm-core
Phase 2: Client-Facing API (pm-web)
- 2a: Implement
POST /api/v1/enrollto accept payloads and generatepolling_token - 2b: Implement
GET /api/v1/enroll/status/{token}to return pending/approved (PKI) statuses - 2c: Implement IP-based rate limiting for the
/enrollendpoint
Phase 3: Admin-Facing API (pm-web)
- 3a: Implement
GET /api/v1/admin/enrollmentsto list pending queue - 3b: Implement
POST /api/v1/admin/enrollments/{id}/approve(generate PKI viapm-ca, migrate tohoststable) - 3c: Implement
DELETE /api/v1/admin/enrollments/{id}/denyto purge request
Phase 4: Background Workers (pm-worker)
- 4a: Create a scheduled task to purge
enrollment_requestsolder than 24 hours
Phase 5: Frontend UI (pm-web/React)
- 5a: Add enrollment API methods and types to frontend
- 5b: Update
Hostsview to include "Pending Enrollments" filter and visual badge - 5c: Render pending hosts in the table with highlight styling
- 5d: Add Approve/Deny action buttons to pending host rows
- 5e: Implement "merge/overwrite" interactive modal for
fqdn/ip_addresscollisions on approval
Issue #5: Admin-Only Manager-Wide Configuration (Authz Gate)
Spec: tasks/authz-gate-spec.md (v0.1.0)
Branch: fix/5-operator-can-modify-auth-config
Status: Draft spec — awaiting Kelly sign-off
Phase 1: admin_required helper + 3 unit tests
- 1a: Add
admin_requiredhelper incrates/pm-web/src/routes/settings.rs(afterwrite_access_required~line 173). Returns 403 with codeforbidden_roleif not Admin. - 1b: Add 3 unit tests in cfg(test) module:
admin_required_admin_passes,admin_required_operator_denied,admin_required_reporter_denied. - 1c: Run
cargo test -p pm-web --bins --testsand confirm green.
Phase 2: Gate changes + audit log calls
- 2a: Replace
write_access_requiredwithadmin_requiredinupdate_settings(line 336). - 2b: Replace
write_access_requiredwithadmin_requiredinupdate_ip_whitelist(line 902). - 2c: Replace
write_access_requiredwithadmin_requiredindiscover_oidc(line 561). - 2d: Replace
write_access_requiredwithadmin_requiredintest_oidc(line 619). - 2e: Create
migrations/019_auth_config_audit_actions.sqlwith 5 new enum values. - 2f: Add 5 new variants to the
AuditActionenum incrates/pm-core/src/audit.rs(or wherever defined). - 2g: Add
write_audit_eventcalls in each of the 4 handlers, after successful mutations. - 2h: Run
cargo fmt --check --all,cargo clippy --all-targets -- -D warnings,cargo test -p pm-web --bins --testsand confirm clean.
Phase 3: Integration tests (8 new)
- 3a:
update_settings_operator_denied— POST as Operator with OIDC fields → 403forbidden_role. - 3b:
update_settings_admin_allowed— POST as Admin with OIDC fields → 200 + audit row written. - 3c:
update_settings_smtp_operator_denied— POST as Operator with SMTP fields → 403forbidden_role. - 3d:
update_settings_smtp_admin_allowed— POST as Admin with SMTP fields → 200 + audit row written. - 3e:
update_ip_whitelist_operator_denied— POST as Operator → 403forbidden_role. - 3f:
update_ip_whitelist_admin_allowed— POST as Admin → 200 + audit row written + in-memoryAuthConfig.ip_whitelistupdated. - 3g:
discover_oidc_operator_denied/discover_oidc_admin_allowed— 2 tests. - 3h:
test_oidc_operator_denied/test_oidc_admin_allowed— 2 tests. - 3i: Run
cargo test -p pm-web --bins --testsand confirm all green.
Phase 4: SPA error message + 1 test
- 4a: Update
frontend/src/pages/SettingsPage.tsxto detecterror.code === 'forbidden_role'and show friendly message: "Only Admins can modify authentication configuration. Contact an Admin to make this change." - 4b: Create
frontend/src/pages/__tests__/SettingsPage.test.tsxwith 1 test:settings_page_forbidden_role_shows_friendly_message. - 4c: Run
npm testinfrontend/and confirm green.
Phase 5: Documentation
- 5a: Update
docs/security-review.md§2.3 (Authorization / RBAC) with 2 new rows. - 5b: Annotate the 4 affected endpoints in
docs/REST_API.mdwith "🔒 Admin only". - 5c: Add a project-specific lesson in
tasks/lessons.mdabout the role model (Admin = Manager-wide, Operator = per-host, Reporter = read-only).
Phase 6: Review & commit
- 6a: Self-review against the 9 acceptance criteria in the spec.
- 6b: Manual pre-push checks (cargo fmt, cargo clippy, eslint, cargo test, npm test) — run all 6 from the recent lessons-learned entry.
- 6c: Commit on
fix/5-operator-can-modify-auth-configwith conventional format. - 6d: Push to
github/fix/5-operator-can-modify-auth-configviagithub-echoSSH alias. - 6e: Open PR against
masterand comment on issue #5. - 6f: Capture lessons in
tasks/lessons.md(project-specific) andgit-workflow/references/lessons-learned.md(skill-level).