MIR-916

Fixed-mode pools don't restore DesiredInstances after crash cooldown expires

Open public
phinze Opened Mar 27, 2026 Updated Apr 2, 2026

Bug

When a fixed-mode pool (e.g. num_instances=3) enters crash cooldown, the pool manager resets DesiredInstances to 1 to prevent runaway sandbox creation. When cooldown expires naturally, the crash counter fields are reset but DesiredInstances is never restored to the configured value.

This means a fixed-3 app that crashes and recovers will silently run at 1 instance until the next deploy.

Steps to reproduce

  1. Configure an app with mode = "fixed", num_instances = 3
  2. Deploy — observe 3 sandboxes running
  3. Trigger a crash loop (e.g. deploy a broken version, then roll back)
  4. Wait for cooldown to expire naturally
  5. Observe only 1 sandbox running instead of 3

Root cause

controllers/sandboxpool/manager.go lines ~118-130 set DesiredInstances = 1 during cooldown. Lines 138-147 reset the crash counter when a healthy sandbox is detected, but do not restore DesiredInstances.

The only code path that restores DesiredInstances for fixed-mode is the deployment launcher (controllers/deployment/launcher.go line ~363), which runs on deploy.
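
To make the sequence concrete, here is a minimal model of the behavior described above. It is not the code from manager.go: the cooldownPool type, its fields other than DesiredInstances, and both function names are hypothetical stand-ins chosen for illustration.

```go
package main

// cooldownPool is a hypothetical, simplified stand-in for the pool state.
// Only DesiredInstances is a name taken from this issue.
type cooldownPool struct {
	DesiredInstances int
	CrashCount       int  // assumed name for the crash counter
	InCooldown       bool // assumed name for the cooldown flag
}

// enterCooldown models the clamp described for manager.go ~118-130:
// the pool is reduced to a single instance while in crash cooldown.
func enterCooldown(p *cooldownPool) {
	p.InCooldown = true
	p.DesiredInstances = 1
}

// onHealthySandbox models the reset described for manager.go 138-147:
// the crash fields are cleared, but DesiredInstances is left untouched.
func onHealthySandbox(p *cooldownPool) {
	p.CrashCount = 0
	p.InCooldown = false
}
```

Starting from DesiredInstances = 3, calling enterCooldown and then onHealthySandbox leaves the pool at 1, which matches step 5 of the reproduction.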

Suggested fix

After resetting the crash counter (manager.go ~143), look up the service's concurrency config and restore DesiredInstances to the configured NumInstances for fixed-mode pools.

Auto-mode pools are unaffected because the activator manages their DesiredInstances based on traffic.
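
A rough sketch of what the suggested fix could look like, assuming the concurrency config is already available at the reset site. The Pool and ConcurrencyConfig types, the CrashCount and InCooldown fields, and the resetAfterCooldown function are hypothetical; only DesiredInstances, NumInstances, and the fixed/auto modes come from this issue.

```go
package main

// ConcurrencyConfig is a hypothetical view of the service's concurrency settings.
type ConcurrencyConfig struct {
	Mode         string // "fixed" or "auto"
	NumInstances int    // configured instance count for fixed mode
}

// Pool is a hypothetical, simplified stand-in for the sandbox pool state.
type Pool struct {
	DesiredInstances int
	CrashCount       int  // assumed name for the crash counter
	InCooldown       bool // assumed name for the cooldown flag
}

// resetAfterCooldown clears the crash state and, for fixed-mode pools only,
// restores DesiredInstances to the configured value. Auto-mode pools are
// left alone because the activator manages their DesiredInstances.
func resetAfterCooldown(p *Pool, cfg ConcurrencyConfig) {
	p.CrashCount = 0
	p.InCooldown = false

	if cfg.Mode == "fixed" && cfg.NumInstances > 0 {
		p.DesiredInstances = cfg.NumInstances
	}
}
```

The NumInstances > 0 guard is a defensive assumption so that a missing or zero-valued config never scales a recovering pool down to nothing.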

Context

Discovered while implementing MIR-905 (miren app restart). The restart handler works around this by explicitly restoring DesiredInstances from the config, but the natural cooldown expiry path still has the bug.