For Agents
Living index of themes for this project. Each H2 is a topic; bullets are wikilinks to related notes. Updated by obsidian-documenter when documenting work. Read by historian at bootstrap. Topics kept alphabetical.
Agent Guides
- agent-guide-provision-new-vm — how-to for provisioning more VMs using the Spec 1 stack (architecture reality, prerequisites, pitfalls)
- agent-guide-configure-app-deploy — how-to for deploying an app onto ops-vm today (manual path) and the Spec 2 features that will replace it
Artifact Registry & CI
- gcp-app-deploy-design — Spec 2a Terraform modules/ar_wif/ (AR Docker repo apps in europe-west3 + WIF pool/provider + ci-pusher SA + per-repo IAM bindings); reusable build-push.yml GitHub Actions workflow called via workflow_call from each app repo
Ansible Configuration
- gcp-app-deploy-design — Spec 2a adds the apps role (compose + systemd templates) and extends the docker role with the AR credential helper
- gcp-vm-provisioning-design — the four-role Ansible layer (base, docker, github_keys, monitoring), ansible.cfg loading, explicit fact-gathering after wait_for_connection
- gcp-terraform-ansible-gotchas — ansible.cfg auto-load gotcha, gather_facts ordering gotcha
- spec-1-deployment-complete — Ansible roles all green against live ops-vm
- spec-1-operations-runbook — re-applying Ansible via make configure on the live VM
App Deploy
- gcp-app-deploy-design — Spec 2a approved design: AR + WIF + Ansible apps role (service and job runtime shapes) + central manifest + EU migration + first app (polymarket-fetch)
- spec-2-roadmap — Artifact Registry + docker compose + Cloudflare ZTNA + batch jobs + app logs/traces; the application-deploy layer on top of ops-vm
- agent-guide-configure-app-deploy — manual deploy path today (/opt/apps/<name>, named compose projects, OTel-via-localhost) and what Spec 2 replaces it with
Cost & Sizing
- spec-1-deployment-complete — ~$18/mo at e2-small + 20 GB pd-balanced + ephemeral external IPv4
- spec-1-operations-runbook — billing console URL, ~24-48h lag note, resize procedure
- spec-2-roadmap — open decision on Cloud NAT to drop the $3.65/mo external IPv4
EU Migration
- gcp-app-deploy-design — Spec 2a migrates the existing ops-vm from us-central1-a to europe-west3-a (destroy + reprovision; same vm_name so MagicDNS resolution stays unchanged)
Gotchas & Learnings
- gcp-terraform-ansible-gotchas — eleven reusable GCP / Terraform / Ansible / OTel / RTK traps from Spec 1 validation and live deployment
- spec-1-operations-runbook — the three gotchas most likely to bite on a re-run, with cross-links to the full reference
- spec-1-retrospective — meta-reflection on which gotchas the two-stage review caught vs which surfaced only at deployment time
How-to / Agent Guides
- agent-guide-provision-new-vm — agent-facing step-by-step for spinning up another VM
- agent-guide-configure-app-deploy — agent-facing step-by-step for deploying an application onto ops-vm
Networking & Tailscale
- gcp-vm-provisioning-design — dedicated custom VPC, no public SSH, Tailscale MagicDNS access, tag:cloud, Tailscale SSH ACL
- gcp-terraform-ansible-gotchas — default VPC open SSH rule, OS Login override, Tailscale SSH ACL autogroup:self vs tagged devices
- spec-1-deployment-complete — late refinement from key-based SSH to Tailscale SSH; ACL targets tag:cloud
- spec-1-operations-runbook — ssh ops@ops-vm and tailscale ssh ops@ops-vm access paths
Operations
- spec-1-operations-runbook — day-2 access, health checks, common operations (re-apply / resize / teardown), logs, where things live, cost monitoring, secret rotation
- agent-guide-configure-app-deploy — day-1 app deploy hygiene that survives into Spec 2 (per-app folders, named compose projects, localhost-bound ports)
Process
- spec-1-retrospective — RPI-style brainstorm → spec → plan → execute → validate workflow with subagent-driven-development; what worked, what needed mid-stream adjustment, surprises during live run
Provisioning & Design
- gcp-app-deploy-design — Spec 2a design: extends the Terraform root with modules/ar_wif/, adds the apps Ansible role, migrates the deployment to europe-west3
- levandor-infra — project overview, two-spec roadmap, Spec 1 deployed
- gcp-vm-provisioning-design — approved Spec 1: Terraform-provisions / Ansible-configures VM lifecycle
- spec-1-deployment-complete — live deployment state: ops-vm e2-small in us-central1-a, on the tailnet, Docker + fail2ban + OTel
- agent-guide-provision-new-vm — agent walkthrough for make preflight → plan → provision → verify against this Terraform root, plus the multi-VM refactor constraint
Roadmap
- gcp-app-deploy-design — Spec 2a approved (core app deploy + EU migration + first app)
- spec-2-roadmap — application-deploy layer (Artifact Registry, docker compose, Cloudflare ZTNA, batch jobs, app telemetry); open design decisions; next-task list
Secrets & Auth
- gcp-app-deploy-design — Spec 2a uses Workload Identity Federation (OIDC) for CI to AR auth (no long-lived service-account keys); per-app secrets/<app>.env files, gitignored, mode 0600, Ansible-copied to /opt/apps/<app>/.env
- gcp-vm-provisioning-design — ADC auth, Tailscale auth key with tag:cloud, SigNoz ingestion key, passwordless operation
- spec-1-deployment-complete — Tailscale SSH replaces SSH keypair + macOS Keychain step entirely
- spec-1-operations-runbook — secret rotation procedures for the Tailscale auth key and SigNoz ingestion key
Telemetry & Monitoring
- gcp-vm-provisioning-design — OpenTelemetry Collector host metrics to SigNoz Cloud (eu2)
- gcp-terraform-ansible-gotchas — OTel :8888/metrics as export-success signal, v0.152 receiver/exporter renames
- spec-1-deployment-complete — ~25k host-metric points sent to ingest.eu2.signoz.cloud, zero failures
- spec-1-operations-runbook — curl :8888/metrics health-check and OTel log location for live debugging
- spec-2-roadmap — extending otelcol-contrib with filelog for logs and OTLP receivers for traces
- agent-guide-configure-app-deploy — manual recipe for extending the collector config with otlp + filelog receivers for app logs/traces before Spec 2 lands
Tooling & Workflow
- gcp-terraform-ansible-gotchas — rtk proxy bypasses RTK output filtering when raw output is load-bearing (Terraform/Ansible/curl)
Troubleshooting
- spec-1-operations-runbook — common gotchas section keyed to the three most likely re-run traps; logs commands for OTel / fail2ban / Docker
Workflow
- spec-1-retrospective — two-stage subagent review (spec-compliance + code-quality) caught four design-stage bugs; carryover recommendations for Spec 2