Deploy doesn't stop old stateful sandbox before starting new one (flock conflict)
Summary
When deploying a new version of an app with a `provider = "local"` disk, the old version's stateful sandbox keeps running while the new version's sandbox tries to start. Both mount the same local disk, causing flock conflicts.
Observed behavior
Deploying victoriametrics (which uses a local disk with flock):

- Old sandbox `victoriametrics-vCZbsR9x` (previous version) stayed running
- New sandbox `victoriametrics-vCZfG64h` tried to start and panicked: `FATAL: cannot acquire lock on file "/miren/data/local/victoria-metrics-data/flock.lock": resource temporarily unavailable; make sure a single process has exclusive access`
- New sandboxes crash-looped 3 times before the old sandbox was eventually stopped
- Once the old sandbox went dead, the new sandbox started successfully
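The lock contention above can be reproduced in miniature. This is a minimal sketch (assuming the usual `flock(2)` semantics behind the panic, not anything specific to the deploy system): two open file descriptions take an exclusive flock on the same lock file, standing in for the old and new sandboxes sharing one local disk. The second non-blocking attempt fails with EAGAIN, the errno behind "resource temporarily unavailable".

```python
import fcntl
import os
import tempfile

# Stand-in for flock.lock on the shared local disk.
lock_path = os.path.join(tempfile.mkdtemp(), "flock.lock")

def try_exclusive_lock(fd: int) -> bool:
    """Attempt a non-blocking exclusive flock; False means EAGAIN."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False

fd_old = os.open(lock_path, os.O_CREAT | os.O_RDWR)  # "old sandbox"
fd_new = os.open(lock_path, os.O_CREAT | os.O_RDWR)  # "new sandbox"

old_holds = try_exclusive_lock(fd_old)           # old version owns the lock
new_blocked = not try_exclusive_lock(fd_new)     # new version gets EAGAIN

fcntl.flock(fd_old, fcntl.LOCK_UN)               # "stopping" the old sandbox
new_after_release = try_exclusive_lock(fd_new)   # now the new one can start
```

Because the new sandbox can only proceed once the old one releases the lock, each crash-loop attempt before the old sandbox is stopped is guaranteed to fail.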
Expected behavior
For stateful services with local disks, the deploy should stop the old version's sandbox before starting the new one — a rolling deploy isn't possible when they share exclusive access to a local disk.
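The expected ordering can be sketched as a branch in the deploy logic. `Sandbox`, `deploy`, and the `uses_exclusive_local_disk` flag are hypothetical names for illustration, not the actual orchestrator API; the point is only the ordering: stop-then-start when the versions share an exclusively locked local disk, rolling otherwise.

```python
from dataclasses import dataclass
from typing import List

# Records the order of lifecycle operations so the ordering is observable.
events: List[str] = []

@dataclass
class Sandbox:
    name: str

    def start(self) -> None:
        events.append(f"start {self.name}")

    def stop(self) -> None:
        events.append(f"stop {self.name}")

def deploy(old: Sandbox, new: Sandbox, uses_exclusive_local_disk: bool) -> None:
    if uses_exclusive_local_disk:
        # Release the flock before the new version tries to acquire it.
        old.stop()
        new.start()
    else:
        # Rolling deploy: brief overlap of old and new is acceptable.
        new.start()
        old.stop()

deploy(Sandbox("victoriametrics-old"), Sandbox("victoriametrics-new"),
       uses_exclusive_local_disk=True)
```

The trade-off is availability: stop-then-start means a short window with no running instance, which is unavoidable when only one process may hold the disk's exclusive lock.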
Environment
- Cluster: Garden
- App: victoriametrics (`[[services.victoriametrics.disks]]` with `provider = "local"`)