linux_patch_api/ARCHITECTURE.md
Echo eba8849986 M1: Complete all specification documents (kiro standards)
Completed comprehensive spec-driven documentation:
- SPEC.md (222 lines): Project scope, objectives, constraints
- ARCHITECTURE.md (290 lines): System design, components, data flow
- REQUIREMENTS.md (168 lines): Functional & non-functional requirements
- API_SPEC.md (556 lines): 15 API endpoints with schemas
- SECURITY.md (188 lines): STRIDE threat model, security controls
- ROADMAP.md (203 lines): 5 phases, 8 milestones, risk register

Total: 1,627 lines of specification documentation

Milestone M1 complete - Ready for Phase 0 (Rust scaffolding)
2026-04-09 13:49:00 +00:00


Linux_Patch_API - Architecture Document

System Overview

The Linux_Patch_API is a secure, single-host API service that enables remote package and patch management on Linux systems. Each instance runs as a systemd service on the managed host, providing a REST API over mTLS with strict IP whitelist enforcement.

Architecture Type: Agent Per Host (Option B)
Deployment: One instance per managed Linux host
Network: Internal network only (no internet exposure)


Component Architecture

Core Components

  1. API Layer (Actix-web/Axum)

    • HTTP/HTTPS endpoint handling
    • mTLS termination
    • IP whitelist enforcement
    • Request routing
    • WebSocket support for real-time job status
  2. Authentication Layer

    • Certificate validation (mTLS)
    • Client identity extraction from certificate
    • No session management (stateless, cert-based auth only)
  3. Authorization Layer

    • IP whitelist checking (deny by default)
    • No permission validation (whitelisted IP + valid cert = full access)
  4. Job Manager

    • Async job queue for long-running operations
    • Job status tracking in file-based storage (cleared on service restart)
    • WebSocket broadcast for real-time status updates
    • 30-minute timeout enforcement
    • Job cleanup and expiration
  5. Package Manager Backend (Pluggable)

    • apt/dpkg adapter (Debian/Ubuntu - primary)
    • dnf/yum adapter (RHEL/CentOS/Fedora)
    • apk adapter (Alpine)
    • pacman adapter (Arch)
    • Distribution detection and adapter selection
  6. Audit Logger

    • systemd journal integration (primary)
    • Optional remote syslog server
    • Local file fallback (/var/log/linux_patch_api/)
    • 30-day retention with daily rotation and gzip compression
  7. Configuration Manager

    • YAML config file watcher (/etc/linux_patch_api/config.yaml)
    • Auto-reload on file change
    • Config validation before reload (prevents service downtime)
    • Runtime settings access for all components
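
The distribution detection and adapter selection listed under the Package Manager Backend could be sketched as below. This is a minimal illustration, not the project's implementation: it assumes detection reads the `ID` and `ID_LIKE` fields of an os-release file (the document does not say how detection works), and the `Adapter` enum, the function names, and the ID-to-adapter mapping are all hypothetical.

```rust
/// Package-manager adapters named in the component list.
/// (Hypothetical type; the document does not define one.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Adapter {
    Apt,
    Dnf,
    Apk,
    Pacman,
}

/// Return the value of `key` from os-release style `KEY=value` text,
/// with optional surrounding quotes removed.
fn os_release_value<'a>(content: &'a str, key: &str) -> Option<&'a str> {
    content.lines().find_map(|line| {
        let (k, v) = line.split_once('=')?;
        if k.trim() == key {
            Some(v.trim().trim_matches(|c: char| c == '"' || c == '\''))
        } else {
            None
        }
    })
}

/// Assumed mapping from distribution identifiers to adapters.
fn adapter_for_id(id: &str) -> Option<Adapter> {
    match id {
        "debian" | "ubuntu" => Some(Adapter::Apt),
        "rhel" | "centos" | "fedora" => Some(Adapter::Dnf),
        "alpine" => Some(Adapter::Apk),
        "arch" => Some(Adapter::Pacman),
        _ => None,
    }
}

/// Select an adapter from the text of an os-release file: try `ID` first,
/// then each space-separated entry of `ID_LIKE` for derivative distributions.
pub fn detect_adapter(os_release: &str) -> Option<Adapter> {
    if let Some(adapter) = os_release_value(os_release, "ID").and_then(adapter_for_id) {
        return Some(adapter);
    }
    os_release_value(os_release, "ID_LIKE")?
        .split_whitespace()
        .find_map(adapter_for_id)
}
```

A caller would typically pass the contents of `/etc/os-release` and refuse to start when `detect_adapter` returns `None`, so that an unsupported host fails early rather than at the first package operation.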

External Integrations

  • Package Managers: apt, dnf, yum, apk, pacman (via system commands)
  • systemd: Service management and journal logging
  • Internal CA: Certificate validation against self-hosted CA
  • Remote Syslog: Optional external log aggregation

Technology Stack

Backend

  • Language: Rust
  • Framework: Actix-web or Axum
  • Database: None (file-based job storage)
  • mTLS: Rust TLS library (rustls or native-tls)

Infrastructure

  • Service Manager: systemd
  • Configuration: YAML
  • Logging: systemd journal + optional syslog

Deployment

  • Package Format: Native Linux packages (deb, rpm, apk, pkg.tar.zst)
  • Distribution: Via target system package manager (apt, dnf, apk, pacman)
  • Installation: Package installs binary, systemd service, and default config structure
  • Updates: Handled through system package manager

Security Architecture

Authentication

  • mTLS certificate-based authentication (required)
  • Internal self-hosted CA
  • Unique client certificates (1-year validity)
  • Silent drop for non-mTLS connections

Authorization

  • IP whitelist enforcement (block all by default)
  • No granular permissions (binary access: allowed or denied)
  • Whitelisted IP + valid cert = full API access
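
The binary access rule above (whitelisted IP plus valid certificate, deny by default) can be illustrated with a small std-only sketch. It assumes the whitelist holds exact addresses rather than CIDR ranges, which the document does not specify; `Whitelist`, `admit`, and the `cert_valid` flag are hypothetical names, and certificate validation itself is left to the TLS layer.

```rust
use std::net::{AddrParseError, IpAddr};

/// Deny-by-default allow list of exact client addresses.
pub struct Whitelist {
    allowed: Vec<IpAddr>,
}

impl Whitelist {
    /// Parse textual entries. A single unparsable entry rejects the whole
    /// list, so a typo is reported instead of being silently ignored.
    pub fn parse(entries: &[&str]) -> Result<Self, AddrParseError> {
        let allowed = entries
            .iter()
            .map(|entry| entry.trim().parse::<IpAddr>())
            .collect::<Result<Vec<_>, _>>()?;
        Ok(Self { allowed })
    }

    /// An address is allowed only if it appears in the list;
    /// an empty list therefore allows nobody.
    pub fn is_allowed(&self, peer: IpAddr) -> bool {
        self.allowed.contains(&peer)
    }
}

/// Full access requires both conditions; anything else is rejected.
pub fn admit(cert_valid: bool, peer: IpAddr, whitelist: &Whitelist) -> bool {
    cert_valid && whitelist.is_allowed(peer)
}
```

Because there is no permission model beyond this check, `admit` returning `true` is the only gate between a client and every endpoint, which is why an empty or unparsable whitelist should never be treated as "allow all".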

Process Security (systemd Hardening)

  • User: root (required for package management)
  • NoNewPrivileges: true (prevent privilege escalation)
  • ProtectSystem: strict (read-only filesystem except allowed paths)
  • ProtectHome: true (no access to /home, /root, /run/user)
  • PrivateTmp: true (isolated /tmp)
  • SystemCallFilter: Restrict to required syscalls only (application whitelist)
  • RestrictAddressFamilies: AF_INET, AF_INET6, AF_UNIX (network restrictions)
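  • CapabilityBoundingSet: CAP_NET_BIND_SERVICE, CAP_SYS_ADMIN (bounded capability set; note that CAP_SYS_ADMIN is broad, and CAP_NET_BIND_SERVICE is only needed for ports below 1024, not for 12443)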

Data Security

  • All communications encrypted via TLS
  • Certificates stored securely with restricted permissions
  • Audit logging of all operations

Certificate Storage (Option A: Separate Files)

/etc/linux_patch_api/certs/
├── ca.pem       (644) - CA certificate
├── server.pem   (644) - Server certificate
└── server.key   (600) - Server private key (restricted)

Rationale:

  • Tighter permissions on private key only (600)
  • Easier certificate rotation (replace cert without touching key)
  • Standard practice for TLS deployments
  • No extraction overhead
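
A startup check for the key permissions described above might look like the following sketch. It is an assumption that the service verifies file modes at all; the function names are hypothetical, and the check accepts any mode with no group or other bits set (for example 600 or 400).

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// True when no group or other permission bits are set.
/// File-type bits in the raw mode (e.g. 0o100600) are ignored by the mask.
pub fn key_mode_is_restricted(mode: u32) -> bool {
    (mode & 0o077) == 0
}

/// Read the mode of the file at `path` and report whether it is restricted.
pub fn check_private_key(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    Ok(key_mode_is_restricted(mode))
}
```

A service could call `check_private_key(Path::new("/etc/linux_patch_api/certs/server.key"))` before loading the TLS context and refuse to start if the result is `false`.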

File System Layout

/etc/linux_patch_api/
├── config.yaml          # Main configuration
├── whitelist.yaml       # IP whitelist
└── certs/
    ├── ca.pem          # CA certificate
    ├── server.pem      # Server certificate
    └── server.key      # Server private key

/var/lib/linux_patch_api/
├── jobs/               # Job storage (cleared on restart)
└── state/              # Runtime state

/var/log/linux_patch_api/
└── audit.log           # Local audit log fallback

/usr/bin/linux-patch-api  # Binary location
/etc/systemd/system/linux-patch-api.service  # Systemd service

Data Flow

Synchronous Request Flow (Quick Operations):

Client → [mTLS Handshake] → [IP Whitelist Check] → [API Layer]
         ↓
    [Auth: Cert Valid?] → No → Silent Drop
         ↓ Yes
    [Authz: IP Allowed?] → No → Silent Drop
         ↓ Yes
    [Route to Handler] → [Execute Package Op] → [Log to Audit]
         ↓
    [Return JSON Response] → Client

Asynchronous Request Flow (Long Operations):

Client → [mTLS + IP Check] → [API Layer] → [Create Job] → [Return Job ID]
                                           ↓
                                    [Job Manager Queue]
                                           ↓
                                    [Package Manager Backend]
                                           ↓
                                    [Update Job Status] → [WebSocket Broadcast]
                                           ↓
                                    [Job Complete/Timeout]
                                           ↓
                                    [Log to Audit]
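
The job lifecycle in the asynchronous flow, including the 30-minute timeout, can be sketched as a small state machine. The status names, the `Job` struct, and the numeric job ID are hypothetical; the document only fixes the timeout value and the fact that a job ends by completing or timing out.

```rust
use std::time::{Duration, Instant};

/// Timeout stated in the document: 30 minutes.
pub const JOB_TIMEOUT: Duration = Duration::from_secs(30 * 60);

/// Hypothetical status set for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobStatus {
    Queued,
    Running,
    Succeeded,
    Failed,
    TimedOut,
}

pub struct Job {
    pub id: u64,
    pub status: JobStatus,
    started_at: Option<Instant>,
}

impl Job {
    pub fn new(id: u64) -> Self {
        Self { id, status: JobStatus::Queued, started_at: None }
    }

    /// Queued -> Running; ignored in any other state.
    pub fn start(&mut self, now: Instant) {
        if self.status == JobStatus::Queued {
            self.status = JobStatus::Running;
            self.started_at = Some(now);
        }
    }

    /// Running -> Succeeded or Failed; ignored once the job is terminal.
    pub fn finish(&mut self, ok: bool) {
        if self.status == JobStatus::Running {
            self.status = if ok { JobStatus::Succeeded } else { JobStatus::Failed };
        }
    }

    /// Mark a running job as timed out once it has run for at least
    /// `JOB_TIMEOUT`. Returns true only when the status changed.
    pub fn enforce_timeout(&mut self, now: Instant) -> bool {
        match (self.status, self.started_at) {
            (JobStatus::Running, Some(started))
                if now.saturating_duration_since(started) >= JOB_TIMEOUT =>
            {
                self.status = JobStatus::TimedOut;
                true
            }
            _ => false,
        }
    }
}
```

Passing `now` explicitly keeps the timeout logic testable without waiting; a real job manager would also need to stop the underlying package operation, which this sketch does not attempt.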

Job Status Endpoint Flow:

Client → [mTLS + IP Check] → [API Layer] → [GET /jobs/{id}]
                                           ↓
                                    [Query Job Storage]
                                           ↓
                                    [Return Job Status JSON]

Configuration Reload Flow:

[Config File Changed] → [File Watcher Detects]
         ↓
    [Validate New Config] → Invalid → [Log Error, Keep Old Config]
         ↓ Valid
    [Swap Config in Memory] → [Notify Components] → [Log Reload Event]
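
The validate-then-swap step in this flow can be sketched with a shared handle that replaces the configuration only after validation succeeds. The `Config` fields, the validation rules, and the `ConfigHandle` type are hypothetical placeholders; file watching, YAML parsing, and component notification are out of scope here.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical subset of settings, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Config {
    pub port: u16,
    pub job_timeout_minutes: u64,
}

/// Example validation rules (assumed, not taken from the specification).
pub fn validate(candidate: &Config) -> Result<(), String> {
    if candidate.port == 0 {
        return Err("port must be non-zero".to_string());
    }
    if candidate.job_timeout_minutes == 0 {
        return Err("job timeout must be at least one minute".to_string());
    }
    Ok(())
}

/// Shared handle that components read from.
pub struct ConfigHandle {
    current: RwLock<Arc<Config>>,
}

impl ConfigHandle {
    pub fn new(initial: Config) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Cheap snapshot of the active configuration.
    /// Panics only if a writer panicked while holding the lock.
    pub fn get(&self) -> Arc<Config> {
        self.current.read().unwrap().clone()
    }

    /// Validate first; on error the old configuration stays active.
    pub fn try_reload(&self, candidate: Config) -> Result<(), String> {
        validate(&candidate)?;
        *self.current.write().unwrap() = Arc::new(candidate);
        Ok(())
    }
}
```

Readers that already hold an `Arc<Config>` snapshot keep using it until they call `get` again, so a reload never changes settings underneath an in-flight request.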

Certificate Renewal Flow:

[Cert File Updated] → [File Watcher Detects]
         ↓
    [Validate Certificate Chain] → Invalid → [Log Error, Keep Old Certs]
         ↓ Valid
    [Reload TLS Context] → [New Connections Use New Certs] → [Log Reload Event]

Rollback Execution Flow (Exclusive):

[Rollback Triggered] → [Set Exclusive Mode] → [Reject New Requests]
         ↓
    [Execute Rollback Operations] → [Log Each Step]
         ↓
    [Rollback Complete] → [Clear Exclusive Mode] → [Accept New Requests]
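
One way to model the exclusive mode in this flow is an atomic flag that is cleared automatically when the rollback finishes. This is a sketch under assumptions: `ExclusiveGate` and `RollbackGuard` are hypothetical names, and the handling of requests already in flight when a rollback starts is not covered by the document or by this example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Shared flag consulted by request handlers before doing any work.
pub struct ExclusiveGate {
    rollback_active: AtomicBool,
}

/// Held for the duration of a rollback; dropping it clears exclusive mode,
/// including when the rollback code returns early or panics.
pub struct RollbackGuard<'a> {
    gate: &'a ExclusiveGate,
}

impl ExclusiveGate {
    pub const fn new() -> Self {
        Self { rollback_active: AtomicBool::new(false) }
    }

    /// Handlers reject new requests while this returns false.
    pub fn accepts_requests(&self) -> bool {
        !self.rollback_active.load(Ordering::SeqCst)
    }

    /// Enter exclusive mode. Returns None if a rollback is already running,
    /// so two rollbacks can never overlap.
    pub fn begin_rollback(&self) -> Option<RollbackGuard<'_>> {
        self.rollback_active
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .ok()
            .map(|_| RollbackGuard { gate: self })
    }
}

impl Drop for RollbackGuard<'_> {
    fn drop(&mut self) {
        self.gate.rollback_active.store(false, Ordering::SeqCst);
    }
}
```

Tying the flag to a guard value means the "Clear Exclusive Mode" step cannot be forgotten on an error path, which matters because a stuck flag would leave the service rejecting every request.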

Key Behaviors:

  • Failed jobs are cleared on service restart (no persistence)
  • Rollback execution is exclusive - no new requests accepted until complete
  • Certificate renewal follows the same validation pattern as config reload
  • Status endpoint available (GET /jobs/{id}) in addition to WebSocket for job monitoring

API Design Principles

  • Pure REST (resources as nouns, HTTP verbs for actions)
  • JSON request/response with standard envelope
  • Hybrid execution model (sync for quick ops, async for long ops)
  • WebSocket for real-time job status streaming
  • GET /jobs/{id} endpoint for job status polling

Network Configuration

  • Bind Address: 0.0.0.0 (all interfaces)
  • Port: 12443 (HTTPS/mTLS)
  • Protocol: TLS 1.3 only
  • Firewall: Host-level firewall should restrict inbound to whitelisted IPs only

Health Checks

Endpoint: GET /health

Purpose: General service status check

Response (200 OK - Healthy):

{
  "success": true,
  "request_id": "uuid",
  "timestamp": "2026-04-09T13:04:02Z",
  "data": {
    "status": "healthy",
    "uptime_seconds": 12345,
    "version": "0.0.1"
  },
  "error": null
}

Health Check Criteria:

  • Service is listening on port 12443
  • mTLS is configured and valid
  • Config file is loaded and valid
  • Package manager backend is accessible
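
The four criteria above could be combined into the reported status roughly as follows. The struct, its field names, and the "unhealthy" label are assumptions for illustration; the document only shows the "healthy" response and does not define the failure case.

```rust
/// One flag per health criterion listed in the document.
#[derive(Debug, Clone, Copy)]
pub struct HealthChecks {
    pub listening: bool,
    pub mtls_valid: bool,
    pub config_valid: bool,
    pub backend_accessible: bool,
}

impl HealthChecks {
    /// Names of the criteria that currently fail, in document order.
    pub fn failing(&self) -> Vec<&'static str> {
        let checks = [
            ("listening", self.listening),
            ("mtls", self.mtls_valid),
            ("config", self.config_valid),
            ("package_backend", self.backend_accessible),
        ];
        checks
            .iter()
            .filter(|(_, ok)| !*ok)
            .map(|(name, _)| *name)
            .collect()
    }

    /// "healthy" only when every criterion passes.
    /// The "unhealthy" label is an assumed value.
    pub fn status(&self) -> &'static str {
        if self.failing().is_empty() {
            "healthy"
        } else {
            "unhealthy"
        }
    }
}
```

Keeping the list of failing criteria separate from the status string would let the audit log record why a check failed without widening the public response.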

NOT Required:

  • Metrics collection
  • Alerting integration
  • Prometheus/Grafana endpoints

Following kiro spec-driven development standards