Last mile fixes for distributed runner startup
Catch-all for the remaining issues blocking distributed runners from actually taking work, discovered during MIR-890 dogfooding on Garden.
Issues
1. Runner join writes localhost addresses into runner config
When a runner joins via miren runner join -c <external-ip>:8443, the coordinator responds with its bind address (0.0.0.0:8443) and local etcd (localhost:12379) instead of externally reachable addresses. The runner config ends up with:
coordinator_address: 0.0.0.0:8443
etcd_endpoints:
- https://localhost:12379
Expected: Should write the address the runner actually connected to (or an advertised external address).
Workaround: Manual sed on /var/lib/miren/runner/config.yaml after join.
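One possible shape for the fix, as a minimal sketch rather than the actual join handler: prefer an explicitly advertised address when it is usable, otherwise hand back the address the runner dialed. The function names and the idea of an `advertise` setting are hypothetical; only the standard library `net` calls are real. The etcd endpoints are URLs and would need the same check applied to the host part after parsing.

```go
package main

import (
	"fmt"
	"net"
)

// isRoutable reports whether addr ("host:port") names something a remote
// runner could plausibly reach. Wildcard and loopback hosts are rejected.
func isRoutable(addr string) bool {
	host, _, err := net.SplitHostPort(addr)
	if err != nil || host == "" || host == "localhost" {
		return false
	}
	if ip := net.ParseIP(host); ip != nil && (ip.IsUnspecified() || ip.IsLoopback()) {
		return false
	}
	return true
}

// runnerFacingAddr picks the coordinator address to write into the runner
// config. advertise is a hypothetical operator-configured address; dialed is
// the address the runner passed to -c and successfully connected to.
func runnerFacingAddr(advertise, dialed string) string {
	if advertise != "" && isRoutable(advertise) {
		return advertise
	}
	return dialed
}

func main() {
	dialed := "203.0.113.10:8443" // documentation-range IP, for illustration only

	fmt.Println(runnerFacingAddr("", dialed))
	fmt.Println(runnerFacingAddr("0.0.0.0:8443", dialed))
	fmt.Println(runnerFacingAddr("garden.example.internal:8443", dialed))
}
```

The bind address is deliberately never returned here: it describes where the coordinator listens, not how a remote machine reaches it.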
2. Runner doesn't find containerd in release directory
miren runner start calls exec.LookPath("containerd"), which searches only the system PATH, but containerd lives in /var/lib/miren/release/ alongside the miren binary. The server knows to look there; the runner doesn't.
Workaround: Add Environment="PATH=/var/lib/miren/release:..." to the systemd unit.
3. Runner needs full base release bundle, not just CLI
The bash installer / CLI tarball only includes the miren binary. The runner also needs containerd, runc, and the shims from miren-base-linux-*.tar.gz. There is no miren runner install command to handle this (see MIR-902).
Workaround: Manually download and extract miren-base-linux-amd64.tar.gz to /var/lib/miren/release/.
4. Runner registered but not showing active status
After fixing issues 1-3, the runner starts, containerd launches, etcd connections establish (verified via ss), but the coordinator's miren runner list still shows no STATUS or ADDRESS for the node. The runner appears to be connected but not heartbeating or reporting readiness.
5. Runner join requires TTY for join code input
miren runner join opens /dev/tty directly to read the join code, so the code can't be piped via stdin or supplied in non-interactive contexts (e.g. gcloud compute ssh --command). This blocks automation and makes the command awkward to operate: you have to open a fully interactive SSH session, run sudo su -, and then run the join.
Expected: Should accept the join code via stdin (or a --code flag) so it works in non-interactive shells. The TTY prompt is fine as a fallback when stdin is a terminal, but shouldn't be the only path.
Current state
- Runner-1: joined, service running, containerd running, etcd connections flowing, but not showing active on coordinator
- Runner-2: base bundle installed, not yet joined
- Garden: main:1226e9a, distributedrunners enabled, healthy
Environment
- Coordinator: miren-garden (main:1226e9a)
- Runners: miren-garden-runner-{1,2} (main:1226e9a)
- GCP project: miren-development