MIR-1139

apiAddresses() falls back to discovered WAN IP when netcheck reports zero reachable; bypasses MIR-1117 fix

Duplicate Bug public
phinze · Opened May 14, 2026 · Updated May 14, 2026

MIR-1117 / PR #139 closed one path to a bad outcome: a non-miren listener on 443 (UDM Pro admin UI, NAS, Plex) could complete a TLS handshake during netcheck, the cluster would mark itself "Reachable", advertise its WAN IP, and DNS would point cluster-X.miren.systems at the home router instead of the POP fleet. The fix verifies the peer cert is a miren-runtime self-signed cert.

There's a second path to the same bad outcome that the fix doesn't cover. When netcheck legitimately reports zero reachable addresses (e.g. port 443 isn't forwarded at all and the cluster isn't externally reachable on any address), apiAddresses() still falls back to advertising every entry in DiscoveredIPs, and that list includes the WAN IP.

What I observed

Brought up a fresh test cluster via make dev + miren server register + MIREN_LABS=globalrouter. Port 443 is not externally reachable from this network (verified from a Hetzner box outside the LAN: curl 65.79.152.117:443 times out, raw TCP says closed/filtered). Yet the cluster reports api_addresses including 65.79.152.117 to cloud, and cloud writes a direct A record:

cluster-o8qlrgj0mo3e.miren.systems A 65.79.152.117

External clients resolving that hostname hit a closed port. The POP fleet never gets a chance to serve the traffic. Same end state as the bug MIR-1117 set out to fix.

Root cause

runtime/components/coordinate/coordinate.go:1376 in apiAddresses():

// For discovered IPs, netcheck results replace discovered public IPs
// when netcheck found reachable addresses. If netcheck ran but found
// nothing reachable (e.g., firewalled), keep discovered public IPs
// as a fallback.
pubAddrs := c.publicAddresses()
for _, ip := range c.DiscoveredIPs {
    if len(pubAddrs) > 0 && !ip.IsLoopback() && !ip.IsPrivate() && !ip.IsLinkLocalUnicast() {
        continue
    }
    addrs = append(addrs, net.JoinHostPort(ip.String(), "8443"))
}

The fallback intent (per the comment) is "if netcheck ran but found nothing reachable, keep the discovered public IPs anyway as a fallback." But a firewalled cluster shouldn't be advertising its WAN IP; it should rely on POP routing. The fallback defeats the cert-verification gate by re-introducing the same WAN IP that #139's check would have rejected.
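To make the behavior concrete, here is a minimal standalone sketch of the loop quoted above. It is not the real apiAddresses(): the function name currentFallback, the parseIPs helper, the []string type for pubAddrs, and the 192.168.1.10 LAN address are all assumptions for illustration. Only the loop body is copied from coordinate.go.

```go
package main

import (
	"fmt"
	"net"
)

// parseIPs is a hypothetical helper for this sketch; it panics on bad input.
func parseIPs(ss ...string) []net.IP {
	ips := make([]net.IP, 0, len(ss))
	for _, s := range ss {
		ip := net.ParseIP(s)
		if ip == nil {
			panic("invalid IP: " + s)
		}
		ips = append(ips, ip)
	}
	return ips
}

// currentFallback mirrors the quoted loop as a free function.
// pubAddrs stands in for c.publicAddresses() (its real type is assumed),
// discovered stands in for c.DiscoveredIPs.
func currentFallback(pubAddrs []string, discovered []net.IP) []string {
	var addrs []string
	for _, ip := range discovered {
		// Public discovered IPs are skipped only when netcheck found something.
		if len(pubAddrs) > 0 && !ip.IsLoopback() && !ip.IsPrivate() && !ip.IsLinkLocalUnicast() {
			continue
		}
		addrs = append(addrs, net.JoinHostPort(ip.String(), "8443"))
	}
	return addrs
}

func main() {
	// Assumed LAN address plus the WAN IP from this report.
	discovered := parseIPs("192.168.1.10", "65.79.152.117")

	// Netcheck ran and found nothing reachable, so pubAddrs is empty.
	fmt.Println(currentFallback(nil, discovered))
	// prints [192.168.1.10:8443 65.79.152.117:8443]
}
```

With an empty pubAddrs the skip condition can never be true, so the WAN IP is always appended. That is Path B in a nutshell.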

PR #139 closed the cert-verification path. This issue is about closing the no-reachable-at-all path.

Two-path picture

| Path | When triggered | Fix status |
| --- | --- | --- |
| A: Netcheck false-positive on non-miren responder | Some service serves a cert on 443 that isn't miren's | Closed by PR #139 |
| B: Discovered-IP fallback when netcheck reports zero reachable | Port 443 isn't externally reachable at all | Open (this issue) |

Suggested fix

The conservative move: never include discovered public IPs as a fallback. The cluster only advertises a public IP when netcheck explicitly confirms it's reachable and serves the right cert.
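A rough sketch of what the conservative variant could look like, as a standalone function rather than a patch. The names conservativeAddresses and parseIPs are hypothetical, and the sketch assumes netcheck-confirmed addresses are appended elsewhere in apiAddresses(), outside the code quoted in this issue.

```go
package main

import (
	"fmt"
	"net"
)

// parseIPs is a hypothetical helper for this sketch; it panics on bad input.
func parseIPs(ss ...string) []net.IP {
	ips := make([]net.IP, 0, len(ss))
	for _, s := range ss {
		ip := net.ParseIP(s)
		if ip == nil {
			panic("invalid IP: " + s)
		}
		ips = append(ips, ip)
	}
	return ips
}

// conservativeAddresses never advertises a discovered public IP.
// Public addresses would only come from netcheck-confirmed results,
// which this sketch assumes are added by the caller.
func conservativeAddresses(discovered []net.IP) []string {
	var addrs []string
	for _, ip := range discovered {
		// Same "is public" predicate as the existing loop, minus the
		// len(pubAddrs) > 0 condition that enabled the fallback.
		if !ip.IsLoopback() && !ip.IsPrivate() && !ip.IsLinkLocalUnicast() {
			continue
		}
		addrs = append(addrs, net.JoinHostPort(ip.String(), "8443"))
	}
	return addrs
}

func main() {
	// Assumed LAN address plus the WAN IP from this report.
	discovered := parseIPs("192.168.1.10", "65.79.152.117")
	fmt.Println(conservativeAddresses(discovered))
	// prints [192.168.1.10:8443]
}
```

The tradeoff is that a cluster whose netcheck could not run at all also stops advertising its WAN IP, which is what the next variant tries to avoid.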

A less invasive variant: differentiate "netcheck ran and found zero" from "netcheck failed to run" (network error, cloud unavailable). Only fall back to DiscoveredIPs in the latter case. Today the code treats both as the same fallback condition.
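One way to sketch that distinction is an explicit three-state netcheck status instead of inferring everything from len(pubAddrs). The netcheckStatus type, its constants, addressesFor, and parseIPs are all hypothetical names. How the real code would learn that netcheck failed to run is not covered here.

```go
package main

import (
	"fmt"
	"net"
)

// netcheckStatus is a hypothetical tri-state for this sketch.
type netcheckStatus int

const (
	netcheckDidNotRun     netcheckStatus = iota // network error, cloud unavailable
	netcheckZeroReachable                       // ran, confirmed nothing reachable
	netcheckReachable                           // ran, found reachable addresses
)

// parseIPs is a hypothetical helper for this sketch; it panics on bad input.
func parseIPs(ss ...string) []net.IP {
	ips := make([]net.IP, 0, len(ss))
	for _, s := range ss {
		ip := net.ParseIP(s)
		if ip == nil {
			panic("invalid IP: " + s)
		}
		ips = append(ips, ip)
	}
	return ips
}

// addressesFor keeps discovered public IPs only when netcheck did not run.
func addressesFor(status netcheckStatus, discovered []net.IP) []string {
	var addrs []string
	for _, ip := range discovered {
		isPublic := !ip.IsLoopback() && !ip.IsPrivate() && !ip.IsLinkLocalUnicast()
		// If netcheck ran, its verdict is authoritative for public IPs:
		// either it supplies reachable addresses, or there are none.
		if isPublic && status != netcheckDidNotRun {
			continue
		}
		addrs = append(addrs, net.JoinHostPort(ip.String(), "8443"))
	}
	return addrs
}

func main() {
	// Assumed LAN address plus the WAN IP from this report.
	discovered := parseIPs("192.168.1.10", "65.79.152.117")

	fmt.Println(addressesFor(netcheckDidNotRun, discovered))
	// prints [192.168.1.10:8443 65.79.152.117:8443]
	fmt.Println(addressesFor(netcheckZeroReachable, discovered))
	// prints [192.168.1.10:8443]
}
```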

Either way, the goal is: a firewalled cluster sees its DNS routed through pop-global.miren.cloud, not stranded at an unreachable WAN IP.