Private
Public Access
1
0
Files
linux_patch_manager/REQUIREMENTS.md
git-echo 124b5b0e3b feat: add bump-version.sh script for version management
Automates version bumps across all version source files:
- Cargo.toml (PRIMARY - workspace.package.version)
- debian/changelog (prepend new entry)
- debian/control (update Version field)
- scripts/build-package.sh (update VERSION variable)
- frontend/package.json (update version field)
- Stale references check after bump

Usage: ./scripts/bump-version.sh <new_version> <old_version>
2026-05-28 10:52:16 -05:00

207 lines
10 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Linux_Patch_Manager — Requirements Document
## Document Control
| Field | Value |
|-------|-------|
| Title | Linux_Patch_Manager — Requirements Document |
| Version | 0.0.2 |
| Status | Draft |
| Last Updated | 2026-04-23 |
| Related Docs | `SPEC.md`, `ARCHITECTURE.md`, `README.md` |
### Revision History
| Version | Date | Summary |
|---------|------|---------|
| 0.0.1 | 2026-04-21 | Initial draft |
| 0.0.2 | 2026-04-23 | Aligned with SDD v0.0.3: hardware-host encryption at rest (no OS-level LUKS), Argon2id, EdDSA JWTs, Azure SSO configuration GUI, web-UI TLS cert strategy, SMTP runtime configurability |
---
## Project Overview
**Title:** Linux_Patch_Manager
**Description:** Enterprise-class, secure, web-based management interface for controlling patching and updates on Linux servers and workstations
**Version:** 0.0.2
**Status:** Draft
## Functional Requirements
### FR-01: Host Management
- Manual host registration by FQDN or IP address (FQDN resolved to IP at add time)
- On-demand auto-discovery targeting a CIDR subnet range (scans for Linux Patch API agents on port 12443)
- Host metadata tracked: hostname, IP, OS, kernel, agent version, last seen, health status
- Static group-based organization with many-to-many relationships (hosts can belong to multiple groups)
- Ungrouped hosts can be managed by any operator or admin
- Host removal with audit logging
### FR-02: Patch Monitoring
- Scheduled background polling: 5-minute intervals for health checks, 30-minute intervals for patch data
- On-demand refresh triggered by operator/admin from the UI
- Visual dashboard alerts for unhealthy or unreachable agents (red/yellow status indicators)
- CVE severity, patch priority, and reboot requirement display per host
### FR-03: Patch Deployment
- Patches queue for the next available maintenance window by default
- Immediate-apply override option for urgent patches
- No approval gate required — operator/admin triggers deployment directly
- Auto-retry failed patch jobs once if still within the maintenance window, then surface failure prominently
- Batch operations across multiple hosts with partial failure handling (auto-retry once, then report failures)
- Rollback support via upstream Linux Patch API rollback endpoint
### FR-04: Scheduling
- Maintenance windows are per-device (not per-group)
- Recurring schedules: daily, weekly, or monthly
- One-time maintenance windows
- Patch operations execute automatically when a maintenance window opens
### FR-05: Reporting
- Compliance report: percentage of hosts fully patched, by group or fleet-wide
- Patch history: log of all patch operations per host or per group
- Vulnerability exposure: hosts with known CVEs pending patches
- Audit trail: who did what, when (user actions, patch operations)
- Charts and graphs required in PDF exports (compliance trends, patch-status distributions)
- Export formats: CSV and PDF
### FR-06: User Management
- **Admin role**: Full access to manage all aspects of Linux Patch Manager
- **Operator role**: Can add/remove clients, manage schedules and patches only for devices in their group memberships
- Operators can belong to multiple groups
- Local accounts with MFA required (TOTP or WebAuthn)
- Azure SSO integration (optional, with Azure's built-in MFA)
- Group membership management for users and hosts
### FR-07: System Configuration
- Azure SSO configuration GUI in the Settings page (tenant ID, client ID, client secret, redirect URI, scopes)
- "Test connection" action in the Azure SSO config GUI that performs a round-trip against Azure AD and reports success/failure without enabling SSO
- SMTP configuration GUI (host, port, auth mode, username/password, TLS mode, from-address); disabled by default
- "Send test email" action in the SMTP config GUI
- Polling-interval tuning (health and patch pollers)
- Web UI TLS certificate strategy selection: self-signed from the internal CA (default) or operator-supplied certificate/key (e.g., existing infrastructure wildcard)
## Non-Functional Requirements
### NFR-01: Security
- Combination authentication: local accounts + Azure SSO
- MFA required for all users (TOTP or WebAuthn; Azure MFA for SSO users)
- Password hashing: **Argon2id**
- Session management: short-lived JWT access tokens (15 min, signed with **EdDSA / Ed25519**) + server-side opaque refresh tokens (1-hour inactivity timeout, rotated on use, revocable)
- JWT signing key rotation every 90 days with a 24-hour overlap window for in-flight tokens
- mTLS for all agent communication (certificate-based, **TLS 1.3 only**)
- HTTPS enforced for web UI (TLS 1.3 only)
- Internal CA managed by Patch Manager for mTLS certificate issuance and renewal
- Certificate distribution to managed clients is manual (server administrators responsible)
- RBAC with group-scoped access control
- IP whitelist enforcement on all connection points
### NFR-02: Performance
- Support 500 typical managed hosts, up to 2,500
- Dashboard load time under 5 seconds for full fleet view
- Background polling must not degrade UI responsiveness
- Concurrent batch operations (e.g., patch 500 hosts simultaneously) must not overwhelm the system
- Login latency budget: 250500 ms on target hardware (Intel Xeon, 4 cores, 16 GB RAM); Argon2id parameters calibrated to land in this window
- CIDR auto-discovery of a `/22` network (~1,024 hosts) across sites completes within 10 seconds wall-clock
### NFR-03: Scalability
- Single-instance design on bare metal/VM (Ubuntu 24.04)
- Manual horizontal scaling by dividing clients between multiple Patch Manager hosts if needed
- No automatic clustering or load balancing required
### NFR-04: Reliability
- Agent communication failures: retry with exponential backoff (3 retries, max 30 minutes between retries)
- Patch job failures: auto-retry once within maintenance window, then surface to operators
- Batch partial failures: auto-retry once, then report remaining failures to operator
- Continue processing healthy hosts regardless of individual host failures
### NFR-05: Usability
- 11-page web UI (React + TypeScript SPA)
- Responsive design for desktop/laptop screens
- Dark mode support
- Certificate download links integrated into dashboard (root CA) and host detail (host-specific mTLS)
- Long-running scans (CIDR discovery, full-fleet operations) must display progress and offer a cancel action
## Interface Requirements
### IR-01: Web Interface
- React + TypeScript SPA served by Axum backend
- Real-time job status via WebSocket relay (agent WebSocket → Patch Manager → browser)
- RESTful API backend for all UI operations
- Certificate download endpoints for root CA and host-specific mTLS certs
- Unauthenticated liveness endpoint at `/status/health` (minimal: process up, DB reachable, worker heartbeat fresh)
- Authenticated fleet-aggregate endpoint at `/api/v1/status/fleet` (counts of healthy / degraded / unreachable agents)
### IR-02: Linux Patch API Integration
- All managed device communication via Linux Patch API (upstream agent)
- mTLS client certificate authentication to each agent
- Base path: `/api/v1/`, Port: 12443, TLS 1.3 only
- Sync operations: GET endpoints (packages, patches, system info, health)
- Async operations: POST/PUT/DELETE endpoints (install, update, remove, patch apply, reboot)
- Job status tracking via `GET /api/v1/jobs/{id}` and WebSocket `/api/v1/ws/jobs`
- Rollback via `POST /api/v1/jobs/{id}/rollback`
## Data Requirements
- **Database:** PostgreSQL 16+
- **Operational data retention:** 30 days (host patch history, job history, health history)
- **Audit log retention:** 6 months
- **Data storage:** All data on Patch Manager host
## Compliance Requirements
### HIPAA (Health Insurance Portability and Accountability Act)
- **Audit Controls (§164.312(b)):** Comprehensive audit logging of all system activity (hash-chained rows for integrity)
- **Access Controls (§164.312(a)(1)):** RBAC with group-scoped access, unique user identification, MFA enforcement
- **Integrity Controls (§164.312(c)(1)):** Audit log integrity protection via hash chaining
- **Transmission Security (§164.312(e)(1)):** mTLS for all agent communication, HTTPS for web UI, TLS 1.3 minimum
- **Encryption at Rest:** Provided by the underlying hardware host (infrastructure-level full-disk encryption). The application does not manage disk encryption.
- **Automatic Logoff (§164.312(a)(2)(iii)):** 1-hour inactivity session timeout
### PCI-DSS (Payment Card Industry Data Security Standard)
- **Requirement 3:** Protect stored data — encryption at rest provided by the hardware host
- **Requirement 4:** Encrypt transmission — mTLS (TLS 1.3) for agent communication, HTTPS (TLS 1.3) for web UI
- **Requirement 6:** Vulnerability management — patch management is the core function; system tracks and enforces timely patching
- **Requirement 7:** Restrict access to need-to-know — RBAC with group-scoped operator access
- **Requirement 8:** Identify and authenticate users — MFA required, unique IDs, session timeouts
- **Requirement 10:** Track and monitor all access — comprehensive audit logging with 6-month retention
## Audit Logging
**Captured Events:**
- All user login/logout events (success and failure)
- All patch operations (who triggered, which hosts, what patches, queue vs. immediate)
- All host registration/removal events
- All group membership changes (hosts and users)
- All certificate operations (issue, renew, download, revoke)
- All maintenance window changes
- All configuration changes (including Azure SSO and SMTP configuration)
**Integrity:** Tamper-evident via hash-chained rows (`prev_hash`, `row_hash`). Periodic and on-demand integrity verification.
**Retention:** 6 months
## Constraints
- Single bare metal/VM host running Ubuntu 24.04
- Systemd service management
- Internal network only (no public internet exposure)
- Rust/Axum backend, React/TypeScript frontend, PostgreSQL 16+ database
- No direct permissions on managed clients
- Certificate distribution to clients is manual
- Encryption at rest is provided by the hardware host; the application does not configure or manage disk encryption