From a1b2d564e9d5b6710814636d42531536191be9a5 Mon Sep 17 00:00:00 2001
From: Echo
Date: Fri, 24 Apr 2026 01:30:28 +0000
Subject: [PATCH] docs: Add lessons learned from CI/CD runner troubleshooting

- CI/CD First: set up pipeline before manual builds
- Verify runner before creating workflows
- Dig deeper on infrastructure issues (cascading problems)
- Don't remove SSH keys without verifying current access path
---
 tasks/lessons.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tasks/lessons.md b/tasks/lessons.md
index 1cfab5b..9c48683 100644
--- a/tasks/lessons.md
+++ b/tasks/lessons.md
@@ -7,5 +7,15 @@
 
 ## 2026-04-24: Verify Runner Before Workflow
 **Pattern:** Before creating Gitea Actions workflows, verify the act-runner is registered and online.
-**Why:** A workflow file without a running runner is dead code. The runner at gitea-runner-lxc.moon-dragon.us needs to be verified as operational.
+**Why:** A workflow file without a running runner is dead code.
 **Action:** Check runner status via Gitea API (`/api/v1/repos/echo/linux_patch_manager/actions/runners`) or web UI before assuming CI/CD will work.
+
+## 2026-04-24: Dig Deeper on Infrastructure Issues
+**Pattern:** When troubleshooting infrastructure, investigate fully; don't stop at the surface error.
+**Why:** The runner was crash-looping with a content-type error. The surface cause was a wrong GITEA_INSTANCE_URL, but the deeper issues were: a corrupted `/home/§echo` directory from unresolved `§§secret()` substitution, corrupted authorized_keys entries (§echo comment, sh-ed25519 with missing 's'), and stale runner registration.
+**Action:** When troubleshooting, check for cascading issues: file system artifacts, config corruption, stale state. Don't fix one thing and declare victory.
+
+## 2026-04-24: Don't Remove SSH Keys Without Verifying Which Key You're Using
+**Pattern:** When cleaning up authorized_keys, verify which key is your current access path before removing entries.
+**Why:** I removed the '§echo' key entry thinking it was corrupted, but that was the key I was using to SSH into the runner LXC. Now I'm locked out.
+**Action:** Before modifying authorized_keys, check `ssh-add -l` or verify which key file maps to which entry. Never remove a key you're actively using.
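
Note on the last lesson: the "verify which key file maps to which entry" step can be sketched in Python. This is a minimal illustration, not part of the patch. The function names and the `KNOWN_KEY_TYPES` list are hypothetical, and it assumes `ssh-add -l` prints fingerprints in the `SHA256:` form. It relies on the fact that an entry's trailing comment (such as `§echo`) does not affect the key's fingerprint, so an entry with a garbled comment can still be the one in use, while a garbled key type (such as `sh-ed25519`) makes the line malformed.

```python
import base64
import hashlib

# Key types treated as valid here; this list is an illustrative assumption,
# not an exhaustive copy of what OpenSSH accepts.
KNOWN_KEY_TYPES = {
    "ssh-ed25519",
    "ssh-rsa",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def fingerprint(key_b64):
    """Return the SHA256 fingerprint string for a base64-encoded public key blob."""
    digest = hashlib.sha256(base64.b64decode(key_b64)).digest()
    # OpenSSH prints the digest as base64 with the trailing '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def parse_ssh_add_output(text):
    """Collect the SHA256 fingerprints from `ssh-add -l` style output."""
    return {tok for tok in text.split() if tok.startswith("SHA256:")}


def classify_entries(authorized_keys_text, active_fingerprints):
    """Label each key line as 'in-use', 'unused', or 'malformed'.

    Returns a list of (line_number, label, fingerprint_or_None) tuples.
    """
    results = []
    for lineno, line in enumerate(authorized_keys_text.splitlines(), start=1):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        # Options may precede the key type, so search for the first known type.
        idx = next((i for i, f in enumerate(fields) if f in KNOWN_KEY_TYPES), None)
        if idx is None or idx + 1 >= len(fields):
            results.append((lineno, "malformed", None))
            continue
        try:
            fp = fingerprint(fields[idx + 1])
        except ValueError:  # binascii.Error (bad base64) is a ValueError subclass
            results.append((lineno, "malformed", None))
            continue
        label = "in-use" if fp in active_fingerprints else "unused"
        results.append((lineno, label, fp))
    return results
```

To use it, pass the contents of `authorized_keys` as text and the set returned by `parse_ssh_add_output` on the captured output of `ssh-add -l`. Any line labelled `in-use` should stay until a second access path is confirmed. One caveat: `ssh-add -l` only lists keys loaded in the agent, so a key supplied directly from a file would not appear and would need its fingerprint added to the set by hand.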