M1: Complete all specification documents (kiro standards)

Completed comprehensive spec-driven documentation:

- SPEC.md (222 lines): Project scope, objectives, constraints
- ARCHITECTURE.md (290 lines): System design, components, data flow
- REQUIREMENTS.md (168 lines): Functional & non-functional requirements
- API_SPEC.md (556 lines): 15 API endpoints with schemas
- SECURITY.md (188 lines): STRIDE threat model, security controls
- ROADMAP.md (203 lines): 5 phases, 8 milestones, risk register

Total: 1,627 lines of specification documentation

Milestone M1 complete - Ready for Phase 0 (Rust scaffolding)

ARCHITECTURE.md
@@ -1,36 +1,290 @@
# Linux_Patch_API - Architecture Document

## System Overview

The Linux_Patch_API is a secure, single-host API service that enables remote package and patch management on Linux systems. Each instance runs as a systemd service on the managed host, providing a REST API over mTLS with strict IP whitelist enforcement.

**Architecture Type:** Agent Per Host (Option B)
**Deployment:** One instance per managed Linux host
**Network:** Internal network only (no internet exposure)

---

## Component Architecture

### Core Components

1. **API Layer (Actix-web/Axum)**
   - HTTP/HTTPS endpoint handling
   - mTLS termination
   - IP whitelist enforcement
   - Request routing
   - WebSocket support for real-time job status

2. **Authentication Layer**
   - Certificate validation (mTLS)
   - Client identity extraction from certificate
   - No session management (stateless, cert-based auth only)

3. **Authorization Layer**
   - IP whitelist checking (deny by default)
   - No permission validation (whitelisted IP + valid cert = full access)

4. **Job Manager**
   - Async job queue for long-running operations
   - Job status tracking with persistent storage
   - WebSocket broadcast for real-time status updates
   - 30-minute timeout enforcement
   - Job cleanup and expiration

5. **Package Manager Backend (Pluggable)**
   - apt/dpkg adapter (Debian/Ubuntu - primary)
   - dnf/yum adapter (RHEL/CentOS/Fedora)
   - apk adapter (Alpine)
   - pacman adapter (Arch)
   - Distribution detection and adapter selection

6. **Audit Logger**
   - systemd journal integration (primary)
   - Optional remote syslog server
   - Local file fallback (`/var/log/linux_patch_api/`)
   - 30-day retention with daily rotation and gzip compression

7. **Configuration Manager**
   - YAML config file watcher (`/etc/linux_patch_api/config.yaml`)
   - Auto-reload on file change
   - Config validation before reload (prevents service downtime)
   - Runtime settings access for all components

### External Integrations

- **Package Managers:** apt, dnf, yum, apk, pacman (via system commands)
- **systemd:** Service management and journal logging
- **Internal CA:** Certificate validation against self-hosted CA
- **Remote Syslog:** Optional external log aggregation
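
The "distribution detection and adapter selection" step of the Package Manager Backend could be sketched as below. This is a minimal illustration, not the project's implementation: the document does not say how detection works, so reading the `ID` and `ID_LIKE` fields of `/etc/os-release` content is an assumption, and the `Adapter` enum and `select_adapter` function are hypothetical names. The caller is expected to pass in the file contents as a string.

```rust
/// Package-manager adapters named in the component list (hypothetical enum).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Adapter {
    Apt,
    /// Covers the dnf/yum adapter for RHEL-family systems.
    Dnf,
    Apk,
    Pacman,
}

/// Extracts the value of `key` from os-release style `KEY=value` content.
/// Surrounding single or double quotes are stripped.
fn os_release_field<'a>(content: &'a str, key: &str) -> Option<&'a str> {
    content.lines().find_map(|line| {
        let (k, v) = line.split_once('=')?;
        if k.trim() == key {
            Some(v.trim().trim_matches(|c: char| c == '"' || c == '\''))
        } else {
            None
        }
    })
}

/// Assumed selection rule: try `ID` first, then each entry of `ID_LIKE`.
/// Returns `None` for distributions without a matching adapter.
pub fn select_adapter(os_release: &str) -> Option<Adapter> {
    let id = os_release_field(os_release, "ID").unwrap_or("");
    let id_like = os_release_field(os_release, "ID_LIKE").unwrap_or("");
    std::iter::once(id)
        .chain(id_like.split_whitespace())
        .find_map(|name| match name {
            "debian" | "ubuntu" => Some(Adapter::Apt),
            "rhel" | "centos" | "fedora" => Some(Adapter::Dnf),
            "alpine" => Some(Adapter::Apk),
            "arch" => Some(Adapter::Pacman),
            _ => None,
        })
}
```
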

---

## Technology Stack

### Backend
- **Language:** Rust
- **Framework:** Actix-web or Axum
- **Database:** None (file-based job storage)
- **mTLS:** Rust TLS library (rustls or native-tls)

### Infrastructure
- **Service Manager:** systemd
- **Configuration:** YAML
- **Logging:** systemd journal + optional syslog

### Deployment
- **Package Format:** Native Linux packages (deb, rpm, apk, pkg.tar.zst)
- **Distribution:** Via target system package manager (apt, dnf, apk, pacman)
- **Installation:** Package installs binary, systemd service, and default config structure
- **Updates:** Handled through system package manager

---

## Security Architecture

### Authentication
- mTLS certificate-based authentication (required)
- Internal self-hosted CA
- Unique client certificates (1-year validity)
- Silent drop for non-mTLS connections

### Authorization
- IP whitelist enforcement (block all by default)
- No granular permissions (binary access: allowed or denied)
- Whitelisted IP + valid cert = full API access
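
The binary access rule above (whitelisted IP plus valid certificate, deny otherwise) can be illustrated with a small sketch. It assumes the whitelist holds exact addresses only; whether `whitelist.yaml` also supports CIDR ranges is not stated in this document. `Whitelist` and `admit` are hypothetical names, and the `cert_valid` flag stands in for the result of the mTLS handshake.

```rust
use std::collections::HashSet;
use std::net::{AddrParseError, IpAddr};

/// Deny-by-default set of allowed peer addresses (exact matches only).
pub struct Whitelist {
    allowed: HashSet<IpAddr>,
}

impl Whitelist {
    /// Parses textual entries; any malformed entry rejects the whole list,
    /// so a typo cannot silently widen or narrow access.
    pub fn parse<'a>(
        entries: impl IntoIterator<Item = &'a str>,
    ) -> Result<Self, AddrParseError> {
        let allowed = entries
            .into_iter()
            .map(|entry| entry.trim().parse::<IpAddr>())
            .collect::<Result<HashSet<_>, _>>()?;
        Ok(Self { allowed })
    }

    /// An address is allowed only if it was listed explicitly.
    pub fn is_allowed(&self, peer: IpAddr) -> bool {
        self.allowed.contains(&peer)
    }
}

/// Full access requires both conditions; anything else is dropped.
pub fn admit(whitelist: &Whitelist, peer: IpAddr, cert_valid: bool) -> bool {
    cert_valid && whitelist.is_allowed(peer)
}
```

An empty whitelist admits nobody, which matches the "block all by default" rule.
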

### Process Security (systemd Hardening)
- **User:** root (required for package management)
- **NoNewPrivileges:** true (prevent privilege escalation)
- **ProtectSystem:** strict (read-only filesystem except allowed paths)
- **ProtectHome:** true (no access to /home, /root, /run/user)
- **PrivateTmp:** true (isolated /tmp)
- **SystemCallFilter:** Restrict to required syscalls only (application whitelist)
- **RestrictAddressFamilies:** AF_INET, AF_INET6, AF_UNIX (network restrictions)
- **CapabilityBoundingSet:** CAP_NET_BIND_SERVICE, CAP_SYS_ADMIN (minimal capabilities)

### Data Security
- All communications encrypted via TLS
- Certificates stored securely with restricted permissions
- Audit logging of all operations

### Certificate Storage (Option A: Separate Files)

```
/etc/linux_patch_api/certs/
├── ca.pem          (644) - CA certificate
├── server.pem      (644) - Server certificate
└── server.key      (600) - Server private key (restricted)
```

**Rationale:**
- Tighter permissions on private key only (600)
- Easier certificate rotation (replace cert without touching key)
- Standard practice for TLS deployments
- No extraction overhead

---

## File System Layout

```
/etc/linux_patch_api/
├── config.yaml              # Main configuration
├── whitelist.yaml           # IP whitelist
└── certs/
    ├── ca.pem               # CA certificate (or server.p12)
    ├── server.pem           # Server certificate
    └── server.key           # Server private key

/var/lib/linux_patch_api/
├── jobs/                    # Job storage (cleared on restart)
└── state/                   # Runtime state

/var/log/linux_patch_api/
└── audit.log                # Local audit log fallback

/usr/bin/linux-patch-api                      # Binary location
/etc/systemd/system/linux-patch-api.service   # Systemd service
```

---

## Data Flow

### Synchronous Request Flow (Quick Operations):

```
Client → [mTLS Handshake] → [IP Whitelist Check] → [API Layer]
↓
[Auth: Cert Valid?] → No → Silent Drop
↓ Yes
[Authz: IP Allowed?] → No → Silent Drop
↓ Yes
[Route to Handler] → [Execute Package Op] → [Log to Audit]
↓
[Return JSON Response] → Client
```

### Asynchronous Request Flow (Long Operations):

```
Client → [mTLS + IP Check] → [API Layer] → [Create Job] → [Return Job ID]
↓
[Job Manager Queue]
↓
[Package Manager Backend]
↓
[Update Job Status] → [WebSocket Broadcast]
↓
[Job Complete/Timeout]
↓
[Log to Audit]
```
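
The "Job Complete/Timeout" step, combined with the 30-minute limit listed for the Job Manager, could be modeled roughly as follows. This is a sketch under assumptions: the status names, the `Job` struct, and `enforce_timeout` are hypothetical, and the document does not say whether the timeout is measured from job creation or from execution start (the sketch uses execution start).

```rust
use std::time::{Duration, Instant};

/// The 30-minute limit from the Job Manager description.
pub const JOB_TIMEOUT: Duration = Duration::from_secs(30 * 60);

/// Hypothetical job lifecycle states.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum JobStatus {
    Queued,
    Running,
    Completed,
    Failed,
    TimedOut,
}

pub struct Job {
    pub id: u64,
    pub status: JobStatus,
    started_at: Option<Instant>,
}

impl Job {
    pub fn new(id: u64) -> Self {
        Self { id, status: JobStatus::Queued, started_at: None }
    }

    /// Moves the job to `Running` and records when execution began.
    pub fn start(&mut self, now: Instant) {
        self.status = JobStatus::Running;
        self.started_at = Some(now);
    }

    /// Marks a running job as timed out once it has run for `JOB_TIMEOUT`.
    /// Returns true only when the status changed on this call.
    pub fn enforce_timeout(&mut self, now: Instant) -> bool {
        match (self.status, self.started_at) {
            (JobStatus::Running, Some(start))
                if now.saturating_duration_since(start) >= JOB_TIMEOUT =>
            {
                self.status = JobStatus::TimedOut;
                true
            }
            _ => false,
        }
    }
}
```

Passing `now` in explicitly keeps the check deterministic, so a periodic sweep over all jobs can share one timestamp.
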

### Job Status Endpoint Flow:

```
Client → [mTLS + IP Check] → [API Layer] → [GET /jobs/{id}]
↓
[Query Job Storage]
↓
[Return Job Status JSON]
```
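
Because job storage is file-based, the "Query Job Storage" step has to turn the `{id}` path parameter into a file path. The sketch below shows one way to do that without allowing path traversal. The on-disk format is not specified in this document, so the one-JSON-file-per-job layout, the allowed ID characters, the 64-character limit, and the `job_file` function are all assumptions.

```rust
use std::path::{Path, PathBuf};

/// Maps a job ID from the URL to a file under the jobs directory.
/// Only ASCII letters, digits, and '-' are accepted, so inputs such as
/// "../etc/passwd" can never escape `jobs_dir`.
pub fn job_file(jobs_dir: &Path, job_id: &str) -> Option<PathBuf> {
    let valid = !job_id.is_empty()
        && job_id.len() <= 64
        && job_id.chars().all(|c| c.is_ascii_alphanumeric() || c == '-');
    if valid {
        Some(jobs_dir.join(format!("{}.json", job_id)))
    } else {
        None
    }
}
```

A handler would treat `None` the same as an unknown job rather than touching the filesystem.
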

### Configuration Reload Flow:

```
[Config File Changed] → [File Watcher Detects]
↓
[Validate New Config] → Invalid → [Log Error, Keep Old Config]
↓ Valid
[Swap Config in Memory] → [Notify Components] → [Log Reload Event]
```
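
The validate-then-swap step in this flow might look like the following sketch. It covers only the in-memory part: file watching, YAML parsing, and component notification are left out. The `Config` fields, the validation rules, and the `ConfigStore` type are hypothetical; the document does not list the actual settings.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical subset of runtime settings.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub port: u16,
    pub job_timeout_minutes: u64,
}

impl Config {
    /// Assumed validation rules, for illustration only.
    fn validate(&self) -> Result<(), String> {
        if self.port == 0 {
            return Err("port must be non-zero".into());
        }
        if self.job_timeout_minutes == 0 {
            return Err("job timeout must be at least one minute".into());
        }
        Ok(())
    }
}

/// Holds the active config; readers get a cheap `Arc` snapshot.
pub struct ConfigStore {
    current: RwLock<Arc<Config>>,
}

impl ConfigStore {
    pub fn new(initial: Config) -> Result<Self, String> {
        initial.validate()?;
        Ok(Self { current: RwLock::new(Arc::new(initial)) })
    }

    /// Returns a snapshot that stays valid even if a reload happens later.
    pub fn get(&self) -> Arc<Config> {
        let guard = self.current.read().expect("config lock poisoned");
        Arc::clone(&guard)
    }

    /// Validates first; on error the old config stays in place untouched.
    pub fn reload(&self, candidate: Config) -> Result<(), String> {
        candidate.validate()?;
        *self.current.write().expect("config lock poisoned") = Arc::new(candidate);
        Ok(())
    }
}
```

Because validation runs before the write lock is taken, a bad file never interrupts readers, which is the "prevents service downtime" property described for the Configuration Manager.
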

### Certificate Renewal Flow:

```
[Cert File Updated] → [File Watcher Detects]
↓
[Validate Certificate Chain] → Invalid → [Log Error, Keep Old Certs]
↓ Valid
[Reload TLS Context] → [New Connections Use New Certs] → [Log Reload Event]
```

### Rollback Execution Flow (Exclusive):

```
[Rollback Triggered] → [Set Exclusive Mode] → [Reject New Requests]
↓
[Execute Rollback Operations] → [Log Each Step]
↓
[Rollback Complete] → [Clear Exclusive Mode] → [Accept New Requests]
```
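
One way to implement the set/clear of exclusive mode is an atomic flag released by a guard, so the flag is cleared even if the rollback code returns early. This is a sketch only: `RollbackGate` and `RollbackGuard` are hypothetical names, and the sketch does not wait for requests that were already in flight when the rollback started, which the document does not address.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Shared flag consulted by request handlers before doing any work.
pub struct RollbackGate {
    exclusive: AtomicBool,
}

/// Held for the duration of a rollback; clears exclusive mode on drop.
pub struct RollbackGuard<'a> {
    gate: &'a RollbackGate,
}

impl RollbackGate {
    pub const fn new() -> Self {
        Self { exclusive: AtomicBool::new(false) }
    }

    /// Enters exclusive mode. Returns `None` if a rollback is already
    /// running, so two rollbacks can never overlap.
    pub fn begin_rollback(&self) -> Option<RollbackGuard<'_>> {
        self.exclusive
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| RollbackGuard { gate: self })
    }

    /// Handlers reject new requests while this returns false.
    pub fn accepts_requests(&self) -> bool {
        !self.exclusive.load(Ordering::Acquire)
    }
}

impl Drop for RollbackGuard<'_> {
    fn drop(&mut self) {
        self.gate.exclusive.store(false, Ordering::Release);
    }
}
```

Tying the flag to a guard's lifetime means a panic or early return during rollback still restores normal request handling.
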

### Key Behaviors:

- Failed jobs are cleared on service restart (no persistence)
- Rollback execution is exclusive - no new requests accepted until complete
- Certificate renewal follows same validation pattern as config reload
- Status endpoint available (GET /jobs/{id}) in addition to WebSocket for job monitoring

---

## API Design Principles

- Pure REST (resources as nouns, HTTP verbs for actions)
- JSON request/response with standard envelope
- Hybrid execution model (sync for quick ops, async for long ops)
- WebSocket for real-time job status streaming
- GET /jobs/{id} endpoint for job status polling

---

## Network Configuration

- **Bind Address:** 0.0.0.0 (all interfaces)
- **Port:** 12443 (HTTPS/mTLS)
- **Protocol:** TLS 1.3 only
- **Firewall:** Host-level firewall should restrict inbound to whitelisted IPs only

---

## Health Checks

### Endpoint: GET /health

**Purpose:** General service status check

**Response (200 OK - Healthy):**
```json
{
  "success": true,
  "request_id": "uuid",
  "timestamp": "2026-04-09T13:04:02Z",
  "data": {
    "status": "healthy",
    "uptime_seconds": 12345,
    "version": "0.0.1"
  },
  "error": null
}
```

**Health Check Criteria:**
- Service is listening on port 12443
- mTLS is configured and valid
- Config file is loaded and valid
- Package manager backend is accessible
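
The four criteria above can be combined into the `status` field with a small aggregation step, sketched here. The `HealthChecks` struct and the check names are hypothetical, and the `"unhealthy"` value is an assumption: this document only shows the `"healthy"` response. How each individual check is performed is outside the sketch.

```rust
/// Result of the four health criteria (hypothetical field names).
#[derive(Debug, Clone, Copy)]
pub struct HealthChecks {
    pub listening: bool,
    pub mtls_valid: bool,
    pub config_valid: bool,
    pub backend_accessible: bool,
}

impl HealthChecks {
    /// Names of the criteria that are currently failing, in a fixed order.
    pub fn failing(&self) -> Vec<&'static str> {
        [
            ("listener", self.listening),
            ("mtls", self.mtls_valid),
            ("config", self.config_valid),
            ("package_backend", self.backend_accessible),
        ]
        .iter()
        .filter(|(_, ok)| !*ok)
        .map(|(name, _)| *name)
        .collect()
    }

    /// "healthy" only when every criterion passes.
    pub fn status(&self) -> &'static str {
        if self.failing().is_empty() {
            "healthy"
        } else {
            "unhealthy"
        }
    }
}
```
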

**NOT Required:**
- Metrics collection
- Alerting integration
- Prometheus/Grafana endpoints

---

*Following kiro spec-driven development standards*