MIR-596

Simplify activator cache architecture to reduce consistency bugs

Open public

phinze Opened Dec 18, 2025 Updated Apr 21, 2026

Problem

The activator maintains three separate caches that overlap in confusing ways:

versions (map[verKey]*versionPoolRef) - maps version+service → pool reference
pools (map[verKey]*poolState) - maps version+service → pool state with sentinel pattern
poolSandboxes (map[entity.Id]*poolSandboxes) - maps pool ID → sandboxes

These caches duplicate data:

Pool entity is stored in both pools[key].pool and poolSandboxes[poolID].pool
Strategy is stored in both versions[key].strategy and poolSandboxes[poolID].strategy
Service is duplicated across caches

All three are guarded by a single RWMutex, so the separation provides no concurrency benefit. It only adds cognitive overhead and causes consistency bugs when the caches get out of sync.

We recently hit a production issue where a deleted pool remained in the caches, causing "pool has reached maximum size" errors. This was fixed in https://github.com/mirendev/runtime/pull/498 by adding a watchPools goroutine, but that is a band-aid on a fundamentally fragile architecture.

Suggested Direction

Consolidate to a two-cache model:

versionToPool map[verKey]entity.Id    // Index for hot path lookup
pools map[entity.Id]*poolState        // Single source of truth for pool data

Where poolState contains everything:

Pool entity + revision
Sandboxes list
Strategy
Sentinel pattern fields (inProgress, done, err)
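A minimal sketch of what the consolidated types might look like. Everything except versionToPool, pools, verKey, poolState and the sentinel field names is an assumption made for illustration: entityID stands in for entity.Id, and the fields of verKey, sandbox and poolCache, as well as the put helper, are hypothetical rather than the activator's real definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// entityID stands in for entity.Id (assumption for this sketch).
type entityID string

// verKey identifies a version+service pair; the field names are hypothetical.
type verKey struct {
	version string
	service string
}

// sandbox is a placeholder for whatever the activator tracks per sandbox.
type sandbox struct {
	id entityID
}

// poolState holds everything known about one pool in a single place.
type poolState struct {
	poolID    entityID   // pool entity (reduced to its ID here)
	revision  int64      // pool entity revision
	sandboxes []*sandbox // sandboxes list
	strategy  string     // strategy (type is an assumption)

	// Sentinel pattern fields.
	inProgress bool
	done       chan struct{}
	err        error
}

// poolCache groups both maps under the single RWMutex.
type poolCache struct {
	mu            sync.RWMutex
	versionToPool map[verKey]entityID     // routing index only
	pools         map[entityID]*poolState // single source of truth
}

func newPoolCache() *poolCache {
	return &poolCache{
		versionToPool: make(map[verKey]entityID),
		pools:         make(map[entityID]*poolState),
	}
}

// put stores the pool state once and points the index entry at it.
func (c *poolCache) put(key verKey, ps *poolState) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pools[ps.poolID] = ps
	c.versionToPool[key] = ps.poolID
}

func main() {
	c := newPoolCache()
	c.put(verKey{version: "v1", service: "web"}, &poolState{
		poolID:   "pool-1",
		strategy: "fixed",
		done:     make(chan struct{}),
	})
	fmt.Println(len(c.pools), len(c.versionToPool))
}
```

Because the index stores only an ID, nothing in versionToPool can disagree with the pool data itself. The worst case is a dangling ID, which a lookup detects as a miss in pools.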

Benefits:

Pool data lives in one place - no sync bugs between pools and poolSandboxes
versionToPool is just a routing index, not duplicated data
Cleanup on pool deletion is simple: delete the entry from pools, then scan versionToPool and drop any keys that still reference the deleted pool ID
Hot path (AcquireLease) remains two map lookups under RLock
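The lookup and cleanup paths described in these benefits could be sketched as follows. This is an illustration under assumptions, not the activator's code: entityID stands in for entity.Id, the verKey and poolState fields are trimmed down, and the lookup and deletePool method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type entityID string // stands in for entity.Id (assumption)

type verKey struct{ version, service string } // hypothetical fields

type poolState struct {
	poolID entityID
}

type poolCache struct {
	mu            sync.RWMutex
	versionToPool map[verKey]entityID
	pools         map[entityID]*poolState
}

// lookup models the hot path: two map lookups under the read lock.
// A dangling index entry shows up as a miss on the second lookup.
func (c *poolCache) lookup(key verKey) (*poolState, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	id, ok := c.versionToPool[key]
	if !ok {
		return nil, false
	}
	ps, ok := c.pools[id]
	return ps, ok
}

// deletePool removes the pool and every index entry that references it.
func (c *poolCache) deletePool(id entityID) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.pools, id)
	for key, target := range c.versionToPool {
		if target == id {
			// Deleting map entries while ranging over the map is safe in Go.
			delete(c.versionToPool, key)
		}
	}
}

func main() {
	c := &poolCache{
		versionToPool: map[verKey]entityID{
			{version: "v1", service: "web"}: "pool-1",
			{version: "v2", service: "web"}: "pool-2",
		},
		pools: map[entityID]*poolState{
			"pool-1": {poolID: "pool-1"},
			"pool-2": {poolID: "pool-2"},
		},
	}

	c.deletePool("pool-1")

	_, ok := c.lookup(verKey{version: "v1", service: "web"})
	fmt.Println("v1 found after delete:", ok)
	_, ok = c.lookup(verKey{version: "v2", service: "web"})
	fmt.Println("v2 found after delete:", ok)
}
```

The scan in deletePool is linear in the number of index entries, which seems acceptable if pool deletion is rare compared with lease acquisition. A reverse index from pool ID to keys would avoid the scan, at the cost of reintroducing a second structure to keep in sync.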