MIR-482

Refactor BuildFromTar for testability and maintainability

Done public

phinze Opened Oct 28, 2025 Updated Mar 25, 2026

Problem

servers/build/build.go::BuildFromTar() is a 360-line monolithic method that handles everything from tar extraction to buildkit setup to image building to version management to pool scaling. This makes it:

Hard to test - Requires full buildkit + registry infrastructure for any testing
Hard to maintain - All logic in one massive method
Hard to debug - Difficult to isolate which phase is failing
Not reusable - Business logic buried in orchestration code

Concrete Example

PR #282 added pool scaling logic (lines 453-495) but we couldn't easily add unit tests because the code is embedded in a 360-line method with many dependencies (buildkit, entity store, app client, resolver, temp dirs, tar streams, status callbacks).

Proposal

Extract 4-5 logical phases into separate, testable methods:

1. `scaleDownOldVersionPools(ctx, oldVersionID)` (lines 453-495)

Pure business logic - no buildkit dependency
Easy to test - just needs entity store
Clear interface - versionID in, scaled pools out
High value - would have caught our recent pool scaling issue

2. `prepareAppVersion(ctx, tarFS, version)` (lines 345-435)

Handles Procfile parsing, service config, defaults resolution
90+ lines of business logic
Currently buried in the middle
No external dependencies

3. `buildAndPushImage(ctx, tarFS, buildStack, version)` (lines 296-343)

Encapsulates buildkit interaction
Clear boundaries

4. `extractAndValidateCode(ctx, tarStream, tempDir)` (lines 163-219)

Tar extraction + stack detection
Clear input/output

After Refactoring

Main method becomes high-level orchestration (~20 lines):

func (b *Builder) BuildFromTar(ctx, state) error {
    tempDir := b.setupTempDirectory()
    defer cleanup(tempDir)
    
    tarFS := b.extractAndValidateCode(ctx, state.Args().Tardata(), tempDir)
    buildStack := b.detectBuildStack(tarFS)
    
    appRec, newVersion := b.prepareNextVersion(ctx, state.Args().Application())
    oldVersion := appRec.ActiveVersion
    
    image := b.buildAndPushImage(ctx, tarFS, buildStack, newVersion)
    finalVersion := b.prepareAppVersion(ctx, tarFS, newVersion, image)
    
    b.activateVersion(ctx, appRec, finalVersion)
    
    if oldVersion != "" {
        b.scaleDownOldVersionPools(ctx, oldVersion)
    }
    
    return nil
}

Benefits

Testability - Each phase can be unit tested independently
Readability - Main method shows clear build pipeline flow
Maintainability - Each method has single responsibility
Reusability - Pool scaling could be used elsewhere (manual deploys, rollbacks)
Debugging - Easier to isolate which phase is failing

Example Test (After Refactor)

func TestScaleDownOldVersionPools(t *testing.T) {
    server := testutils.NewInMemEntityServer(t)
    builder := &Builder{
        ec: entityserver.NewClient(log, server.EAC),
        EAS: server.EAC,
    }
    
    // Create pools for different versions
    oldPool := createTestPool(t, server, "v1", "web")
    newPool := createTestPool(t, server, "v2", "web")
    
    // Scale down old version
    err := builder.scaleDownOldVersionPools(ctx, "v1")
    require.NoError(t, err)
    
    // Assert v1 pool scaled to zero, v2 untouched
    assertPoolDesiredInstances(t, server, oldPool.ID, 0)
    assertPoolDesiredInstances(t, server, newPool.ID, 1)
}

Scope

Extract 4-5 logical phases into separate methods
Add unit tests for extracted methods (especially scaleDownOldVersionPools)
Ensure no behavior change (pure refactoring)
Keep all existing integration tests passing

Non-Goals

Changing build behavior
Adding new features
Rewriting the entire build system

PR #282: Added pool scaling code that highlighted this testability gap
MIR-474: E2E deployment test (would complement these unit tests)