linux_patch_manager/tasks/sso-token-handoff-spec.md
Draco-Lunaris-Echo f58d7a6f17 fix(security): stop embedding JWT tokens in SSO callback redirect URL (#4) (#14)
Replaces URL-embedded JWT tokens with a single-use, 60-second handoff code that the SPA exchanges via server-to-server POST. The URL now contains only `?handoff=<code>` — no tokens are placed in the browser history, proxy access logs, or Referer header.

Backend: new SsoHandoff store (DashMap, 60s TTL, atomic DashMap::remove for single-use), POST /api/v1/auth/sso/handoff endpoint, 7 new tests.

Frontend: SsoCallbackPage rewritten to use useSearchParams + POST exchange, with history.replaceState to clear the handoff code from the address bar. Switched from window.location.search to useSearchParams() for test compatibility. New Vitest infrastructure (vitest, @testing-library/react, jsdom) and 6 new tests.

CI fix in ccba9e3: cargo fmt --all and added searchParams to useEffect dep array to satisfy CI's Rust Format and Frontend Lint checks.

Refs: closes #4
2026-06-03 06:28:08 -05:00


SSO Token Handoff — Specification

Issue: #4
Component: crates/pm-web/src/routes/sso.rs, frontend/src/pages/SsoCallbackPage.tsx, frontend/src/store/authStore.ts
Spec version: 0.1.0 (draft)
Status: Awaiting Kelly sign-off


1. Goal

Stop embedding JWT access tokens, refresh tokens, and user objects in the SSO callback redirect URL. Today, after a successful OIDC login, the backend 302-redirects the browser to the SPA with the tokens in the query string:

https://app.example.com/auth/sso/callback
  ?access_token=<jwt>
  &refresh_token=<raw>
  &token_type=Bearer
  &expires_in=900
  &user=<urlencoded-json>

Tokens in URLs are written to browser history, intermediate proxy and load-balancer access logs, and may leak via the Referer header when the landing page loads third-party resources. The refresh token is the most sensitive value (long-lived, rotating) and gets the worst exposure.

Replace the URL-embedded tokens with a single-use, short-lived handoff code that the SPA exchanges for tokens via a POST to the backend, so the tokens travel only in the JSON response body. The URL then contains only the code, which expires in 60 seconds and is invalidated on first use.

2. Non-Goals

  • Changing the OIDC flow itself (Authorization Code + PKCE stays the same).
  • Changing the MFA verification path that runs after the OIDC callback.
  • Touching the WS ticket pattern (issue #10). This spec adds a new in-memory store for SSO handoff codes that mirrors, but is separate from, ws_tickets: Arc<DashMap<String, WsTicket>>.
  • Adding cookie-based or form_post delivery. The handoff code approach was selected over those (Kelly sign-off Q1).
  • Long-lived SSO sessions. The handoff code is single-use; subsequent SSO logins re-issue a new code.

3. Design Decisions (Kelly sign-off, 2026-06-02)

| #  | Question | Resolution |
|----|----------|------------|
| Q1 | Approach selection | Handoff code (option C in issue #4). Mirrors the existing WS-ticket pattern. URL contains only a single-use, 60s handoff_code. SPA POSTs to /api/v1/auth/sso/handoff and gets tokens in the JSON response. |
| Q2 | Cookie attributes | N/A: the handoff code approach uses no cookies. |
| Q3 | Rollout strategy | Hard cutover: remove the old query-string parsing in the same PR. No dual-read window. (Justification: security-critical fix, deploy window is short, no in-flight SSO logins survive a rolling restart because the auth state is in the user's browser, not on the server.) |
| Q4 | Secure cookie flag | N/A: the handoff code approach uses no cookies. Kelly's answer ("unconditionally secure") is noted for future cookie work but does not apply here. |

4. Design

4.1 Backend: SSO callback (crates/pm-web/src/routes/sso.rs)

The sso_callback handler currently constructs a redirect URL with all token values. Replace this with a handoff code generation step:

  1. After the access/refresh tokens and user_json are computed (the existing sso_callback logic is unchanged up to the point where the redirect is built), generate a cryptographically random handoff_code (32 bytes, base64url-encoded without padding, 43 chars).
  2. Store the handoff payload in a new in-memory map:
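    // Payload held between the OIDC callback and the SPA's exchange POST.
    pub struct SsoHandoff {
        pub access_token: String,
        pub raw_refresh: String,
        pub user_json: Value,
        pub access_ttl: u64,
        pub expires_at: Instant, // now + 60s
    }
    // New field on AppState, keyed by handoff_code.
    pub sso_handoffs: Arc<DashMap<String, SsoHandoff>>,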
    
    Mirrors the WsTicket struct (single-use, in-memory, TTL enforced on read). The map is added to AppState alongside ws_tickets.
  3. Build the redirect URL with ONLY the handoff code:
    let redirect_url = format!("{}?handoff={}", callback_url, handoff_code);
    Ok(Redirect::to(&redirect_url))
    
  4. Log the handoff creation (without the code value itself) for audit:
    tracing::info!(user_id = %user.id, auth_provider, "SSO handoff issued");
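
The code generation and redirect steps above can be sketched as follows. This is a minimal, dependency-free illustration, not the project's implementation: `base64url_no_pad` and `handoff_redirect_url` are hypothetical names, the encoder is hand-rolled only to avoid external crates, and the 32 random bytes are taken as an argument because the standard library has no CSPRNG. Real code would presumably fill them from the OS random source and use an established base64 crate.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const B64URL: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode `bytes` as base64url without `=` padding.
pub fn base64url_no_pad(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() * 4 + 2) / 3);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let mut buf = 0u32;
        for (i, b) in chunk.iter().enumerate() {
            buf |= (*b as u32) << (16 - 8 * i);
        }
        // 1 byte -> 2 chars, 2 bytes -> 3 chars, 3 bytes -> 4 chars.
        for i in 0..chunk.len() + 1 {
            let idx = (buf >> (18 - 6 * i)) & 0x3f;
            out.push(B64URL[idx as usize] as char);
        }
    }
    out
}

/// Build the callback redirect so that it carries only the handoff code.
/// ASSUMPTION: `random_bytes` was filled by a CSPRNG by the caller.
pub fn handoff_redirect_url(callback_url: &str, random_bytes: &[u8; 32]) -> String {
    format!("{}?handoff={}", callback_url, base64url_no_pad(random_bytes))
}
```

Because the base64url alphabet contains only letters, digits, `-` and `_`, the code can be placed in the query string without further percent-encoding.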
    

4.2 Backend: Handoff exchange endpoint

New handler POST /api/v1/auth/sso/handoff:

  • Request body: { "handoff_code": "<code>" }
  • Behavior:
    1. Look up handoff_code in sso_handoffs (DashMap read lock).
    2. If not found → 400 invalid_handoff.
    3. If found but expires_at < Instant::now() → remove the entry and return 400 invalid_handoff (the cleanup-on-expiry also prevents memory bloat from expired-but-unconsumed codes).
    4. Remove the entry atomically (DashMap remove is atomic) and build the response from the value that remove returns. This is the single-use guarantee: even if two requests race with the same code, only the one whose remove returns the entry wins.
    5. Return the payload as JSON:
      {
        "access_token": "<jwt>",
        "refresh_token": "<raw>",
        "token_type": "Bearer",
        "expires_in": 900,
        "user": { "id": "...", "username": "...", ... }
      }
      
  • Log:
    • On success: tracing::info!(user_id = %payload.user.id, "SSO handoff exchanged")
    • On failure: tracing::warn!(reason = %reason, "SSO handoff exchange failed")
    • Never log the handoff code value itself (it's a bearer secret with 60s window).
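
One way to get the single-use and expiry behavior described above is to remove the entry first and check the expiry on the removed value, so no request ever acts on an entry it has not taken out of the map. The sketch below illustrates this under stated assumptions: it uses `Mutex<HashMap>` from the standard library as a stand-in for the `Arc<DashMap<...>>` named in the spec, trims the payload to two fields, and introduces hypothetical names (`Handoff`, `issue`, `exchange`, `ExchangeError`). `now` is passed in to keep the logic testable.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Handoff lifetime from the spec.
pub const HANDOFF_TTL: Duration = Duration::from_secs(60);

/// Trimmed-down payload; the spec's SsoHandoff carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Handoff {
    pub access_token: String,
    pub raw_refresh: String,
    pub expires_at: Instant,
}

/// ASSUMPTION: std-only stand-in for Arc<DashMap<String, SsoHandoff>>.
pub type HandoffStore = Mutex<HashMap<String, Handoff>>;

#[derive(Debug, PartialEq)]
pub enum ExchangeError {
    /// Maps to the spec's `400 invalid_handoff`.
    InvalidHandoff,
}

/// Park a payload under `code` for HANDOFF_TTL.
pub fn issue(store: &HandoffStore, code: String, access_token: String, raw_refresh: String, now: Instant) {
    let handoff = Handoff { access_token, raw_refresh, expires_at: now + HANDOFF_TTL };
    store.lock().unwrap().insert(code, handoff);
}

/// Consume `code`. The entry is removed before the expiry check, so an
/// unknown, already-used, or expired code all yield InvalidHandoff and an
/// expired entry is gone from the map afterwards.
pub fn exchange(store: &HandoffStore, code: &str, now: Instant) -> Result<Handoff, ExchangeError> {
    let removed = store.lock().unwrap().remove(code);
    match removed {
        // The spec treats `expires_at < now` as expired.
        Some(h) if h.expires_at >= now => Ok(h),
        _ => Err(ExchangeError::InvalidHandoff),
    }
}
```

Returning the same error for every failure case keeps the endpoint from revealing whether a code ever existed.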

4.3 Backend: Cleanup task

Add a tokio::spawn cleanup task in main.rs (mirroring the existing WS-ticket cleanup if present, or the SSO-session cleanup that already runs per the codebase). Every 60 seconds, walk sso_handoffs and remove entries with expires_at < Instant::now(). This bounds memory growth even if the SPA never POSTs back.
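
The sweep itself reduces to a single `retain` pass. The sketch below shows only that pass, on a plain `HashMap` of code to deadline; `sweep_expired` and `SWEEP_INTERVAL` are hypothetical names, and the periodic driver (for example a loop around a tokio interval timer) is left out. DashMap also offers a `retain` method, so the same shape should carry over, but that is an assumption about how the real task would be written.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// How often the spec's cleanup task runs.
pub const SWEEP_INTERVAL: Duration = Duration::from_secs(60);

/// Remove every entry whose deadline is strictly before `now`
/// (the spec's `expires_at < Instant::now()`); return how many were dropped.
pub fn sweep_expired(deadlines: &mut HashMap<String, Instant>, now: Instant) -> usize {
    let before = deadlines.len();
    deadlines.retain(|_, expires_at| *expires_at >= now);
    before - deadlines.len()
}
```

Passing `now` as a parameter lets a test run "one tick" of the sweep against hand-picked deadlines without sleeping.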

4.4 Backend: Route registration

In pm-web/src/main.rs, add the new route to the public router (alongside /api/v1/ws/ticket, which is also public — no JWT required because the handoff code IS the credential):

.route("/api/v1/auth/sso/handoff", post(sso_handoff_exchange))

4.5 Frontend: SsoCallbackPage.tsx

Replace the URL-param parsing with a POST to the handoff endpoint:

useEffect(() => {
  const params = new URLSearchParams(window.location.search)
  const errorCode = params.get('error')
  if (errorCode) {
    // ... existing error handling unchanged ...
    return
  }

  const handoffCode = params.get('handoff')
  if (!handoffCode) {
    setError('Missing handoff code. Please try logging in again.')
    setProcessing(false)
    return
  }

  // Exchange handoff code for tokens
  fetch('/api/v1/auth/sso/handoff', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ handoff_code: handoffCode }),
  })
    .then(r => r.ok ? r.json() : r.json().then(e => Promise.reject(e)))
    .then(data => {
      setTokens(data.access_token, data.refresh_token)
      setUser(buildUser(data.user))
      // Clear the handoff code from the URL to prevent bookmarking/sharing
      window.history.replaceState({}, '', '/auth/sso/callback')
      navigate('/dashboard', { replace: true })
    })
    .catch(err => {
      setError(err?.error?.message || 'Failed to complete sign-in. Please try again.')
      setProcessing(false)
    })
}, [setTokens, setUser, navigate])

The buildUser helper mirrors the existing field-mapping logic (lines 54-67 of the current file).

4.6 Frontend: authStore.ts

No change required. The existing setTokens(access, refresh) and setUser(user) API is what the new code calls. The partialize config (line 74) already correctly persists only refreshToken and user — not accessToken — so the in-memory access token is never written to localStorage. This is the correct security posture and should be preserved.

5. Acceptance Criteria

  • SSO callback no longer places access_token, refresh_token, token_type, expires_in, or user in the redirect URL. The URL contains only handoff=<code> (plus the error params on failure, which are unchanged).
  • The handoff code carries at least 128 bits of entropy (the design uses 32 random bytes, i.e. 256 bits, base64url-encoded) and is generated with a CSPRNG.
  • The handoff code is single-use: a second exchange attempt with the same code returns 400 invalid_handoff and does NOT return the tokens again.
  • The handoff code expires after 60 seconds. An exchange attempt with an expired code returns 400 invalid_handoff and the entry is removed from the in-memory map.
  • The SPA successfully completes login: POST to the handoff endpoint receives the tokens, calls setTokens and setUser, and navigates to /dashboard.
  • authStore.ts is unchanged (its existing partialize already prevents access-token persistence; the handoff code approach doesn't change that contract).
  • cargo check and cargo clippy --all-targets pass.
  • cargo test -p pm-web passes with new tests for the handoff endpoint (create, exchange success, exchange duplicate=400, exchange expired=400, exchange unknown=400).
  • frontend builds cleanly (npm run build in frontend/).
  • No access or refresh token values appear in any URL or query string in the SSO flow. Manual verification: complete a login and grep the server access log for the callback URL — only the handoff code should be present.
  • docs/security-review.md §2.5 (Azure SSO) is updated to document the handoff code control.
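
The "URL contains only handoff=<code>" criterion can be checked mechanically, for instance in the callback test or against a captured access-log line. The helper below is a hypothetical, std-only sketch (`query_keys` and `carries_only_handoff` are illustrative names); it does simple string splitting and no percent-decoding, which is enough for comparing parameter names.

```rust
/// Return the query-parameter names of `url`, in order of appearance.
pub fn query_keys(url: &str) -> Vec<&str> {
    // Ignore any fragment, then take everything after the first '?'.
    let url = url.split('#').next().unwrap_or(url);
    match url.split_once('?') {
        None => Vec::new(),
        Some((_, query)) => query
            .split('&')
            .filter(|pair| !pair.is_empty())
            .map(|pair| pair.split_once('=').map_or(pair, |(key, _)| key))
            .collect(),
    }
}

/// True when the only query parameter is `handoff`.
pub fn carries_only_handoff(url: &str) -> bool {
    query_keys(url) == ["handoff"]
}
```

A test for the success path would assert `carries_only_handoff(redirect_url)`; the failure path, which keeps its error params, would need a separate assertion.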

6. Test Plan

6.1 Backend unit/integration tests (crates/pm-web/src/routes/sso.rs)

Using a small TestApp harness mirroring the WS-ticket test pattern (no real HTTP listener, no DB beyond the connection that's already mocked in the existing tests):

  1. handoff_exchange_success — create a handoff, POST to the exchange endpoint, expect 200 with the access/refresh/user fields.
  2. handoff_exchange_single_use — exchange once (success), exchange the same code again (expect 400 invalid_handoff).
  3. handoff_exchange_unknown_code — POST with a code that was never issued (expect 400 invalid_handoff).
  4. handoff_exchange_expired_code — create a handoff with expires_at = past, exchange (expect 400 invalid_handoff AND the entry is removed from the map).
  5. handoff_exchange_race — two concurrent POSTs with the same code (using tokio::join!); exactly one succeeds, the other gets 400.
  6. handoff_exchange_malformed_body — POST with invalid JSON or missing handoff_code field (expect 400 invalid_handoff).
  7. callback_redirect_contains_only_handoff — invoke sso_callback through a mock OIDC config and assert the resulting redirect URL contains only handoff=<code> and NO access_token / refresh_token / user query params.
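
The race test (item 5) comes down to one property: when several callers remove the same key at once, exactly one of them receives the value. The sketch below demonstrates that property with standard-library threads and a `Mutex<HashMap>` standing in for the DashMap store; the spec's actual test uses tokio::join! against the endpoint, and `race_for_code` is a hypothetical name.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Barrier, Mutex};
use std::thread;

/// Race `n` threads to consume the same handoff code and
/// return how many of them actually obtained the payload.
pub fn race_for_code(n: usize) -> usize {
    let store = Arc::new(Mutex::new(HashMap::from([(
        "code".to_string(),
        "payload".to_string(),
    )])));
    let barrier = Arc::new(Barrier::new(n));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let store = Arc::clone(&store);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Line the threads up so their removes contend.
                barrier.wait();
                let won = store.lock().unwrap().remove("code").is_some();
                won
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|won| *won)
        .count()
}
```

Whatever the thread interleaving, the count is 1, because the entry exists only once and `remove` hands it to a single caller.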

6.2 Backend cleanup test

  1. handoff_cleanup_removes_expired — create 3 handoffs with varying expires_at, run one tick of the cleanup task, assert only the non-expired ones remain.

6.3 Frontend tests (frontend/src/pages/SsoCallbackPage.tsx)

Add a Vitest + React Testing Library test suite (the frontend already uses Vitest — see frontend/package.json and frontend/vite.config.ts):

  1. renders_processing_state_initially — on mount with a handoff code, shows the spinner and "Completing sign-in…".
  2. calls_handoff_endpoint_on_mount — mocks fetch and asserts the POST goes to /api/v1/auth/sso/handoff with { handoff_code: <code> }.
  3. stores_tokens_and_user_on_success — mocks a successful response, asserts setTokens and setUser are called with the response payload, and the SPA navigates to /dashboard.
  4. shows_error_on_handoff_failure — mocks a 400 response, asserts the error message is rendered and the spinner stops.
  5. shows_error_when_handoff_code_missing — invokes the effect with no handoff code, asserts the "Missing handoff code" error is shown.
  6. clears_handoff_code_from_url_after_success — asserts window.history.replaceState is called to remove the ?handoff= param from the URL after a successful exchange.

7. Risk Analysis

  • Risk: regression in the SSO login flow. Mitigation: the test plan covers the callback redirect shape, the exchange endpoint behavior (success, single-use, expiry, race), and the frontend effect. Manual end-to-end test (completing a real Azure AD login) is required before merge — the new scripts/integration-test.sh should be extended or a new scripts/integration-test-sso.sh added to exercise the full flow against a mock OIDC provider.
  • Risk: in-flight SSO logins during deploy break. Per Kelly sign-off Q3, we accept hard cutover. Handoff codes live only in process memory, so a restart during deploy discards any code that was issued but not yet exchanged. A user caught in that window lands on /auth/sso/callback?handoff=<old-code>, the exchange returns 400 invalid_handoff, and the SPA shows "Please try logging in again." Because a code is valid for at most 60s, only logins that straddle the restart are affected. Worst case: a 30-second re-login. Acceptable for a security-critical fix.
  • Risk: handoff code leaked via browser history or Referer. The code is single-use and 60s TTL, so the blast radius is small even if logged. The SPA calls history.replaceState after a successful exchange to remove the code from the address bar (and the underlying history entry). The 60s window limits exposure to Referer leakage on subsequent navigations from the callback page.
  • Risk: memory growth from unconsumed handoffs. Mitigation: the cleanup task runs every 60s and removes expired entries. Worst case memory usage is O(active_logins) — typically single digits.
  • Risk: race condition in the single-use guarantee. Mitigation: DashMap::remove is atomic, so only one of two concurrent exchange attempts can succeed. Verified by the handoff_exchange_race test.

8. Documentation Updates

  • docs/security-review.md §2.5 (Azure SSO): add a new row documenting the handoff code control and explicitly state that no tokens appear in any URL.
  • frontend/src/pages/SsoCallbackPage.tsx: update the doc-comment to describe the POST-and-exchange flow instead of the URL-param parse.
  • docs/REST_API.md: document the new POST /api/v1/auth/sso/handoff endpoint.

9. Out of Scope / Follow-ups

  • Cookie-based SSO session (a future enhancement that would let the SPA refresh state without a new OIDC flow on every page load).
  • form_post response mode (a future enhancement if browsers standardize it more widely).
  • Rate limiting on the handoff endpoint (out of scope here; the existing governor-based rate limits on /auth/* may already cover this — verify during implementation).
  • Moving the in-memory sso_handoffs to Redis (out of scope; the single-instance design constraint in SPEC.md is fine for this control).