MIR-957

Deploy doesn't stop old stateful sandbox before starting new one (flock conflict)

Done public
phinze Opened Apr 2, 2026 Updated Apr 3, 2026

Summary

When deploying a new version of an app that uses a disk with provider = "local", the old version's stateful sandbox keeps running while the new version's sandbox tries to start. Both mount the same local disk, so the new sandbox cannot acquire the flock that the old one still holds.

Observed behavior

Deploying victoriametrics (which uses a local disk with flock):

  1. Old sandbox victoriametrics-vCZbsR9x (previous version) stayed running

  2. New sandboxes victoriametrics-vCZfG64h tried to start and panicked:

    FATAL: cannot acquire lock on file "/miren/data/local/victoria-metrics-data/flock.lock": 
    resource temporarily unavailable; make sure a single process has exclusive access
    
  3. New sandboxes crash-looped 3 times before the old sandbox was eventually stopped

  4. Once old sandbox went dead, new sandbox started successfully
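The sequence above can be reproduced outside the cluster. The following is a minimal sketch, assuming the application takes a non-blocking exclusive flock on a lock file in its data directory (consistent with the error message, but not confirmed from the VictoriaMetrics source). The helper names `try_exclusive_lock` and `demo` are hypothetical, and a temporary directory stands in for the shared local disk.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Open path and attempt a non-blocking exclusive flock.

    Returns (fd, None) on success, or (None, errno_value) if the lock
    is already held through another open file description.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        return None, exc.errno
    return fd, None


def demo():
    """Simulate old and new sandboxes contending for one lock file."""
    with tempfile.TemporaryDirectory() as data_dir:
        lock_path = os.path.join(data_dir, "flock.lock")

        # Old sandbox: holds the lock for as long as it runs.
        old_fd, old_err = try_exclusive_lock(lock_path)

        # New sandbox starts while the old one is still running.
        new_fd, new_err = try_exclusive_lock(lock_path)
        if new_fd is not None:
            os.close(new_fd)

        # Old sandbox stops; closing its descriptor releases the lock.
        os.close(old_fd)

        # New sandbox restarts and now succeeds.
        retry_fd, retry_err = try_exclusive_lock(lock_path)
        os.close(retry_fd)

        return old_err, new_err, retry_err
```

The second attempt fails with `EAGAIN`, whose message text is "resource temporarily unavailable", matching the log line. The retry succeeds only after the first holder releases the lock, which corresponds to steps 3 and 4.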

Expected behavior

For stateful services with local disks, the deploy should stop the old version's sandbox before starting the new one. A rolling deploy isn't possible when both versions need exclusive access to the same local disk.
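One way to express the expected ordering is a sketch like the one below. It is illustrative only and not taken from the deploy code: the `Disk` and `Service` types, `needs_exclusive_disk`, `plan_deploy`, and the step names are all hypothetical. The only detail taken from this issue is that a disk with provider = "local" rules out a rolling deploy.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    name: str
    provider: str  # e.g. "local"; other provider values are not assumed here


@dataclass
class Service:
    name: str
    disks: list = field(default_factory=list)


def needs_exclusive_disk(service):
    """True if any disk is local, so two sandboxes cannot safely share it."""
    return any(disk.provider == "local" for disk in service.disks)


def plan_deploy(service, old_sandbox, new_sandbox):
    """Return the ordered (action, sandbox) steps for replacing a sandbox."""
    if needs_exclusive_disk(service):
        # Stop-then-start: the old sandbox must release the disk first.
        return [
            ("stop", old_sandbox),
            ("wait_stopped", old_sandbox),
            ("start", new_sandbox),
        ]
    # Rolling: bring the new sandbox up before retiring the old one.
    return [
        ("start", new_sandbox),
        ("wait_healthy", new_sandbox),
        ("stop", old_sandbox),
    ]
```

With a local disk, the plan stops `victoriametrics-vCZbsR9x` and waits for it to be dead before starting `victoriametrics-vCZfG64h`. That trades a short window of downtime for avoiding the crash loop.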

Environment

  • Cluster: Garden
  • App: victoriametrics ([[services.victoriametrics.disks]] with provider = "local")