When a user already exists with auth_provider=local and the same email,
the SSO callback now links the existing user to the new SSO provider
instead of failing with a unique constraint violation on email.
Flow: 1) Try exact match (email+auth_provider), 2) If not found,
try email-only lookup, 3) If found with different provider, link
by updating auth_provider, azure_oid, oidc_sub, 4) If not found,
create new user as before.
The SSO button on the login page was not appearing because the settings
API requires authentication, but the login page cannot authenticate before
the user logs in.
Changes:
- Backend: Add GET /api/v1/auth/sso/config public endpoint that returns
only enabled, display_name, and auth_url (no secrets exposed)
- Backend: Mount sso::public_router() at /api/v1/auth/sso in main.rs
(was previously missing - only azure_compat_router was mounted)
- Frontend: Replace settingsApi.get() call in LoginPage.tsx with
ssoConfigApi.get() which calls the public endpoint
- Frontend: Add SsoConfigResponse interface and ssoConfigApi helper
to client.ts
- Frontend: Use auth_url from config response instead of hardcoded path
- Refactored azure_sso.rs to sso.rs with generic OIDC provider support
- Added OIDC discovery URL lookup with 1hr TTL caching
- Added PKCE for all providers, client_secret optional for public clients
- Added /api/v1/auth/sso/login and /api/v1/auth/sso/callback routes
- Added /api/v1/auth/azure/* backward-compatible routes
- Added POST /settings/sso/discover and POST /settings/sso/test endpoints
- Frontend: Provider dropdown (Keycloak/Azure AD/Custom OIDC)
- Frontend: Auto-fill discovery URL for Keycloak
- Frontend: Discover Endpoints and Test Connection buttons
- Frontend: Dynamic SSO button based on provider display name
- Made migration 014 idempotent with DO blocks and IF NOT EXISTS
- Fixed debian/install to use /usr/local/bin/ for binaries
- Fixed frontend file path in .deb package
- Reset admin password on dev server
- Fixed database permissions for oidc_config table
1. Compliance CSV/PDF: Replace non-existent pd.total_packages with
jsonb_array_length(pd.installed_packages) and pd.pending_patches
with pd.patch_count. Fix GROUP BY to match new columns.
2. Vulnerability CSV/PDF: Replace non-existent pd.cve_data with
jsonb_array_elements on pd.available_patches JSONB, extracting
cve_ids via nested lateral join. Replace pd.updated_at with
pd.polled_at (actual column name).
3. TypeScript: Remove duplicate PollingConfig interface declaration
in frontend/src/types/index.ts.
4. ReportsPage: Replace Group ID text field with Select dropdown
populated from GET /api/v1/groups, showing group names instead
of requiring UUID input.
5. PDF charts: Increase embed_image scale from 0.18 to 0.28 for
better visibility on A4 landscape pages.
6. Vulnerability CSV: Remove invalid (no data) comment row on
query failure; return header-only CSV instead to maintain valid
CSV format.
- Add target_host_id column to host_health_checks table (nullable UUID FK)
- Allow service checks to query a different host agent
- Backend models, API routes, and poller updated
- Frontend: host selector dropdown for service checks
- Validation: target host must exist and be healthy
- FK ON DELETE SET NULL: revert to own host if target deleted
The ip_address column is PostgreSQL inet type. When cast to text
(ip_address::text), it includes the CIDR mask (e.g., 192.168.0.166/32).
Rust IpAddr::parse() fails on CIDR notation, so the IP SAN was silently
skipped in server certificates.
Fix: use host(ip_address) in SQL queries to strip the CIDR mask,
returning just the IP address (e.g., 192.168.0.166).
Affected endpoints:
- POST /hosts/:id/certificates (issue_client_cert)
- POST /hosts/:id/certificates/reissue (reissue_host_cert)
Bug 1: status.output.as_deref() passes NULL when agent returns no output,
but patch_job_hosts.output has NOT NULL constraint.
Fix: use .unwrap_or("") to default to empty string.
Bug 2: sync_job_status passes String to job_status enum column,
PostgreSQL rejects implicit text-to-enum cast.
Fix: add ::job_status cast in SQL UPDATE queries.
- Fixed broken SQL in hosts.rs that was using Python string replacement
instead of proper CASE expression for health_check_status
- Added serde default for health_check_poll_interval_secs config
- Fixed missing AgentClient import in health_check_poller.rs
- Added /etc/patch-manager/keys to systemd ReadWritePaths
- Integration test verified: health check CRUD, service/HTTP checks work
- Added health_check_poller.rs: periodic service/HTTP health checks
- Added pre-patch health gate in job_executor.rs
- Added waiting_health_check job status (migration 008)
- Added health_check_status to HostSummary and hosts API
- Added health check types and API functions to frontend
- Added health check UI section to HostDetailPage
- Added health check status indicators to HostsPage and PatchDeploymentPage
- Added serde default for health_check_poll_interval_secs
- Fixed missing AgentClient import in health_check_poller.rs
- Fixed missing ws_relay import in main.rs
- Fixed missing closing paren in retry_pending_jobs SQL
- Added ReadWritePaths for /etc/patch-manager/keys in systemd services
- ws_relay.rs: Add ALPN protocol http/1.1 to rustls ClientConfig to prevent
HTTP/2 negotiation which breaks WebSocket upgrades (Sec-WebSocket-Accept mismatch)
- ws_relay.rs: Add detailed TLS error chain logging for debugging connection failures
- ws_relay.rs: Add HTTP polling fallback when WebSocket connection fails, using
AgentClient to poll /api/v1/jobs/{id} every ws_relay_poll_interval_secs
- config.rs: Add ws_relay_poll_interval_secs field (default: 10 seconds)
- config.example.toml: Add ws_relay_poll_interval_secs documentation
- jobs.rs: Fire pg_notify with event_type job on cancel
- job_executor.rs: Fire pg_notify with event_type job when parent job transitions
- ws_relay.rs: Add event_type field to NotifyPayload (host vs job events)
- Frontend: Add event_type, succeeded_count, failed_count, host_count to JobWsEvent
- Frontend: handleWsEvent distinguishes host vs job events for accurate status updates
Worker was crashing on startup with:
Could not automatically determine the process-level CryptoProvider
Added rustls::crypto:💍:default_provider().install_default()
to pm-worker main.rs, matching the pm-web initialization.
- Frontend: handleWsEvent now distinguishes host vs job events
- Host events only update detail rows + optimistic counters
- Job events (event_type=job) set authoritative status + counts
- Backend ws_relay: NotifyPayload now includes event_type field
- Host events: event_type=host
- update_parent_job_status fires pg_notify with event_type=job
- Backend job_executor: sync_job_status fires pg_notify with event_type=job
- Backend jobs cancel endpoint fires pg_notify with event_type=job
- Fixes jobs appearing stuck because host status was mapped to job status
BUG-15: Empty patch_selection sent to agent as-is, causing apply nothing
instead of apply all available patches per SPEC. When packages is empty,
now query host_patch_data and expand to full package list.
BUG-16: refresh_listener used INSERT instead of UPSERT for
host_patch_data, causing duplicate key constraint errors.
BUG-14: Patch poller was INSERT-ing a new row every poll cycle instead of
UPSERT-ing, creating 51 duplicate rows in host_patch_data for a single host.
Changes:
- patch_poller.rs: Changed INSERT to INSERT...ON CONFLICT (host_id) DO UPDATE
so each host only has one row that gets updated on each poll
- Migration 006: Added UNIQUE constraint on host_id, cleaned up 50 duplicate
rows keeping only the latest polled_at per host
The dashboard showing 174 pending patches and 0% compliance is expected
behavior - the patch data was collected before the job ran and the poller
runs every 30 minutes. The next poll cycle will refresh the data.
The linux_patch_api agent returns "completed" as its terminal success
status, but the worker only recognized "succeeded". This caused the
worker to log "unexpected agent status — ignoring" every 60 seconds
and never mark patch jobs as finished.
Changes:
- job_executor.rs: match "completed" alongside "succeeded" as terminal
success status, mapping both to patch_job_hosts.status = succeeded
- job_executor.rs: match "cancelled" as a terminal failure status,
routing to handle_host_failure with appropriate error message
- pm-agent-client types.rs: updated AgentJobStatus doc comment to
list all valid agent statuses: queued, running, succeeded, completed,
failed, cancelled
BUG-10: PostgreSQL inet type includes CIDR suffix (/32) when cast to text,
causing malformed agent URLs like https://127.0.0.1/32:12443.
Fixed by using host(ip_address)::text in all SQL queries across
pm-worker and pm-web modules, and adding a Rust-side safety
strip of CIDR notation in pm-agent-client.
Files changed:
- crates/pm-agent-client/src/client.rs: strip CIDR suffix from IP
- crates/pm-worker/src/health_poller.rs: host(ip_address)::text
- crates/pm-worker/src/patch_poller.rs: host(ip_address)::text
- crates/pm-worker/src/refresh_listener.rs: host(ip_address)::text
- crates/pm-worker/src/job_executor.rs: host(ip_address)::text (2 places)
- crates/pm-worker/src/ws_relay.rs: host(h.ip_address)::text
- crates/pm-web/src/routes/discovery.rs: host(ip_address)::text (2 places)
- crates/pm-web/src/routes/hosts.rs: host(ip_address)::text (3 places)
- docs/linux_patch_api_research.md: added research notes
BUG-6: Add TLS support via axum-server + rustls
- Added axum-server with tls-rustls feature to workspace and pm-web
- pm-web now serves HTTPS when TLS certs exist, falls back to HTTP with warning
- setup.sh generates self-signed ECDSA P-256 TLS cert with SANs
- Config already had web_tls_cert_path/web_tls_key_path fields
BUG-7: Fix audit chain integrity errors
- Migration 005 now TRUNCATEs audit_log after adding prev_hash column
- Existing rows had broken hash chains (inserted before prev_hash existed)
BUG-8: Disable WatchdogSec in patch-manager-web.service
- pm-web does not implement sd_notify, causing systemd to kill the service
BUG-9: Disable WatchdogSec in patch-manager-worker.service
- Same issue as BUG-8, worker does not implement sd_notify
Previous fixes (BUG-1 through BUG-5) also included:
- setup.sh: PostgreSQL 15+ schema GRANTs
- Axum route syntax :param → {param} (19 routes)
- DbUser struct role: String → UserRole enum mapping
- UserRole/AuthProvider Display trait implementations
- Seed admin password hash (Argon2id)
- Rename clippy.toml field to single-char-binding-names-threshold
- Add placeholder certificates for pm-agent-client doc tests
- Add .cargo/audit.toml to handle upstream security advisories
- Update CI to install Node.js 18 for frontend linting
- Pin all jobs to ubuntu-22.04 runner
- Use curl -sfL with secrets.GITEATOKEN for checkout
- Switch checkout URL to https://gitea-lxc.moon-dragon.us
- Install rustup with --default-toolchain stable --profile minimal
- Add cargo bin to GITHUB_PATH instead of sourcing per-step
- Enforce clippy -D warnings
- Ignore RUSTSEC-2025-0134 in cargo audit
- Pass GITEA_TOKEN via env for release step