SkillHub

Build, Deploy, and Release: Trunk-Based Dev, Deployment Strategies, Zero-Downtime DB Migrations

How you deploy code is as important as how you write it. The gap between writing a feature and having it run reliably in production is where most engineering organizations lose velocity. This post covers the decisions that shape that gap.


Trunk-Based Development vs GitFlow

GitFlow

Long-lived branches: main, develop, feature branches, release branches, hotfix branches. Features are developed on branches, merged to develop, periodically merged to release branches, then to main.

GitFlow was designed for versioned software releases — desktop applications, mobile apps with app store releases, libraries with semantic versioning. The release branch model makes sense when you control when customers get updates.

GitFlow is wrong for continuously deployed web services. Long-lived feature branches create integration debt. The further a branch diverges from main, the more painful the merge. Release branches add ceremony without adding value when you deploy continuously.

Trunk-Based Development (TBD)

All engineers work on short-lived branches (< 1 day ideally, max 2 days) and merge to main frequently. Main is always deployable. CI runs on every merge. Deploy from main.

Why TBD works:

  • Continuous integration — conflicts surfaced when they’re small, not after 2 weeks of divergence
  • Always-releasable main branch — deployment is an operational decision, not a coordination event
  • Forces small, incremental changes, which are easier to review, test, and roll back
  • Matches the Git design intention — frequent small merges, not large infrequent ones

The prerequisite: Strong CI. Every merge to main must pass tests automatically. If CI is slow or unreliable, engineers avoid merging frequently — which defeats TBD.

Feature flags enable TBD at scale: Incomplete features are merged to main behind a flag. The code ships but is invisible to users until the flag is enabled.
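A minimal sketch of that pattern: the half-finished code path is merged to main but guarded by a flag that is off in production. The `FLAGS` dict and the function names are illustrative stand-ins for a real flag service or config table.

```python
# Gating an incomplete feature behind a flag (illustrative names).
# `FLAGS` stands in for a real flag service (LaunchDarkly, Unleash, a config table).
FLAGS = {"new-checkout-flow": False}  # merged to main, but off in production

def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)  # default off: unknown flags are safe

def legacy_checkout(cart):
    return {"path": "legacy", "items": len(cart)}

def new_checkout(cart):
    return {"path": "new", "items": len(cart)}

def checkout(cart):
    if is_enabled("new-checkout-flow"):
        return new_checkout(cart)   # shipped but invisible until the flag flips
    return legacy_checkout(cart)
```

The key property is that the merge and the release are now independent events: the code can sit in main, fully integrated and tested, while the flag stays off.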

The EM stance: For web services with continuous deployment, trunk-based development is the right default. GitFlow is appropriate for versioned software. Enforce short-lived feature branches by policy (auto-delete merged branches, flag any branch > 3 days old).
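The "flag any branch > 3 days old" policy is easy to automate. One way, sketched below: feed in the output of `git for-each-ref --format='%(refname:short) %(committerdate:unix)' refs/heads` and report branches past the age limit. The threshold and branch names here are illustrative.

```python
# Flags branches whose last commit is older than the policy limit.
# Input format matches: git for-each-ref --format='%(refname:short) %(committerdate:unix)' refs/heads
MAX_AGE_DAYS = 3

def stale_branches(git_output: str, now: int) -> list[str]:
    stale = []
    for line in git_output.strip().splitlines():
        name, ts = line.rsplit(" ", 1)
        age_days = (now - int(ts)) / 86400
        if name != "main" and age_days > MAX_AGE_DAYS:
            stale.append(name)  # candidate for a nudge in CI or chat
    return stale
```

Run on a schedule, this turns the policy from a norm into a mechanism: stale branches get flagged automatically instead of relying on reviewers to notice.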


Blue-Green vs Canary vs Rolling Deployments

Rolling Deployment

Gradually replace old instances with new ones. At any moment, some instances run the old version and some run the new.

Start: [v1, v1, v1, v1]
Step 1: [v2, v1, v1, v1]
Step 2: [v2, v2, v1, v1]
Step 3: [v2, v2, v2, v1]
Done:  [v2, v2, v2, v2]

Advantages: No extra infrastructure cost (no idle environment). Simple in Kubernetes (default strategy).

Disadvantages: Old and new versions run simultaneously — any API contract changes must be backwards compatible. Rollback requires rolling back all instances (takes time). Not suitable for migrations that break old code.
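The diagram above can be modeled as a loop that swaps instances one at a time, gating each step on a health check (a stand-in for a Kubernetes readiness probe). This toy sketch makes the central caveat visible: every intermediate state mixes old and new versions.

```python
# Toy model of a rolling deployment: replace one instance at a time,
# gating each step on a health check (stand-in for a readiness probe).
def rolling_deploy(instances: list[str], new: str, healthy=lambda v: True) -> list[list[str]]:
    states = [list(instances)]
    for i in range(len(instances)):
        if not healthy(new):       # failed check: stop, leaving a mixed fleet
            break
        candidate = list(states[-1])
        candidate[i] = new
        states.append(candidate)
    return states  # every intermediate state mixes old and new versions
```

Each intermediate state is a moment when both versions serve traffic — which is why every API and schema change in a rolling deploy must be backwards compatible.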

Blue-Green Deployment

Two identical environments: blue (current) and green (new). Switch traffic from blue to green atomically via load balancer update.

Before: traffic → Blue (v1)
Deploy: green (v2) warmed up, tested
Switch: traffic → Green (v2)
Blue: stands by for instant rollback

Advantages: Instant rollback (flip back to blue). No version mixing — all traffic goes to one version at a time. Blue environment can be used for smoke testing before cutover.

Disadvantages: Double infrastructure cost during deployment. Database migrations must be compatible with both blue and green simultaneously (if blue is in standby, rollback means old code runs against the new schema).
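A sketch of what makes blue-green rollback instant: the load balancer holds a single active-environment pointer, so the cutover is one atomic assignment and rollback is the same assignment in reverse. The class below is illustrative, not any particular load balancer's API.

```python
# Blue-green cutover as a single atomic pointer swap (illustrative).
class LoadBalancer:
    def __init__(self):
        self.active = "blue"        # blue serves v1

    def route(self, request):
        return self.active          # all traffic goes to exactly one environment

    def cutover(self, target: str) -> str:
        previous, self.active = self.active, target
        return previous             # keep the old environment warm for rollback

lb = LoadBalancer()
standby = lb.cutover("green")       # deploy v2 to green, smoke test, then flip
# if metrics degrade: lb.cutover(standby)  # instant rollback to blue
```

Contrast this with rolling deployment, where rollback means re-replacing every instance: here the old environment is still running, so reverting is as fast as the original switch.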

Canary Deployment

Send a small percentage of traffic to the new version, gradually increase if metrics are good.

Start:      100% v1
Canary:     95% v1, 5% v2 → observe metrics
Expand:     75% v1, 25% v2 → observe
Promote:    0% v1, 100% v2

Advantages: Real production traffic validates the new version. Failure impact is limited to the canary percentage. Automatic rollback when error rate exceeds threshold.

Disadvantages: Complex to implement (requires traffic splitting at ingress/load balancer level, or feature flags). Observability needed to compare v1 vs v2 metrics side by side. Not suitable for high-blast-radius changes.

Tools: Argo Rollouts (Kubernetes), Flagger, AWS CodeDeploy canary, LaunchDarkly.
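The "automatic rollback when error rate exceeds threshold" step amounts to a decision function run at each expansion stage. A minimal sketch, with illustrative thresholds — tools like Argo Rollouts and Flagger run this kind of analysis against real metrics providers:

```python
# Canary analysis sketch: compare canary error rate against the baseline
# and decide whether to promote or roll back. Thresholds are illustrative.
def canary_decision(baseline_errors: int, baseline_total: int,
                    canary_errors: int, canary_total: int,
                    max_ratio: float = 2.0, abs_ceiling: float = 0.05) -> str:
    base_rate = baseline_errors / max(baseline_total, 1)
    canary_rate = canary_errors / max(canary_total, 1)
    if canary_rate > abs_ceiling:            # hard ceiling regardless of baseline
        return "rollback"
    if canary_rate > base_rate * max_ratio:  # significantly worse than v1
        return "rollback"
    return "promote"                         # expand traffic to the next step
```

Note the two-part check: a relative comparison against v1 catches regressions even when absolute error rates are low, while the absolute ceiling catches the case where both versions are failing.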

When to Use Which

Scenario                                       Strategy
Low-risk changes, simple rollback acceptable   Rolling
High-risk change, need instant rollback        Blue-green
Gradual confidence building in prod            Canary
DB schema change, backwards-compat required    Rolling + expand-contract first
Full replacement with smoke testing            Blue-green

Feature Flags vs Branch-Based Releases

Feature flags decouple code deployment from feature activation. The code is deployed to production but the feature is inactive until the flag is enabled.

Feature flags solve:

  • Trunk-based development for incomplete features
  • A/B testing (enable for 50% of users)
  • Targeted rollout (enable for internal users first, then by country, then globally)
  • Kill switch — instantly disable a misbehaving feature without deployment
  • Separation of deployment (engineering event) from release (business event)

The trade-off: Flags accumulate. A codebase with 200 stale feature flags is hard to reason about. Establish a lifecycle: every flag has an owner and a removal date. Flags should be short-lived (days to weeks for launch flags, long-lived for kill switches and operational toggles).
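The owner-and-removal-date discipline can be enforced mechanically. A sketch of a flag registry that CI could scan for overdue launch flags — the flag names, teams, and dates below are illustrative:

```python
from datetime import date

# Flag lifecycle registry sketch: every flag carries an owner and a removal
# date; kill switches and operational toggles are exempt (remove_by=None).
FLAG_REGISTRY = {
    "new-checkout": {"owner": "payments-team", "remove_by": date(2024, 3, 1)},
    "search-kill-switch": {"owner": "platform", "remove_by": None},
}

def overdue_flags(today: date) -> list[str]:
    return [name for name, meta in FLAG_REGISTRY.items()
            if meta["remove_by"] is not None and today > meta["remove_by"]]
```

A CI job that fails (or pings the owner) when `overdue_flags` is non-empty keeps the 200-stale-flags scenario from creeping up on the codebase.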

Feature flag services: LaunchDarkly, Optimizely, AWS AppConfig, Unleash (open source), or a simple database/config table for basic use cases.
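For the "simple database/config table" end of that spectrum, percentage rollouts need one subtlety: hash the user id so each user gets a stable decision as the percentage ramps up (the users enabled at 5% stay enabled at 25%). A sketch, with an in-memory dict standing in for the config table:

```python
import hashlib

# Percentage rollout with stable per-user bucketing (illustrative names).
FLAG_TABLE = {"new-search": {"enabled": True, "percent": 25}}  # stand-in for a config table

def flag_on(flag: str, user_id: str) -> bool:
    row = FLAG_TABLE.get(flag)
    if not row or not row["enabled"]:
        return False
    # Stable bucket in [0, 100): same user, same flag -> same decision every call.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < row["percent"]
```

Hashing the flag name together with the user id also decorrelates flags: being in the 25% for one experiment says nothing about being in the 25% for another.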


Zero-Downtime Database Migrations

Database migrations are the hardest part of zero-downtime deployments. The standard approach is the expand-contract pattern (also called “parallel change”).

The problem: If you rename a column, the new code needs the new name, the old code needs the old name. During a rolling deployment, both versions run simultaneously — both must work against the same DB.

The Expand-Contract Pattern

Phase 1: Expand (backwards-compatible addition)

  • Add the new column (nullable, with default)
  • Start writing to both old and new columns
  • Deploy — old code reads old column, new code reads new column
  • Both coexist, database has both

Phase 2: Migrate data

  • Backfill the new column from the old column (use batched migration, not a single UPDATE that locks the table)
  • Verify data integrity
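A sketch of the batched backfill in Phase 2, using sqlite3 as a stand-in for the production database (table and column names are illustrative): copy `old_col` into `new_col` a batch at a time, committing between batches so no single transaction holds locks on the whole table.

```python
import sqlite3

# Batched backfill sketch: small UPDATEs in short transactions instead of
# one table-locking UPDATE. sqlite3 stands in for the production DB.
def backfill(conn, batch_size: int = 1000) -> int:
    total = 0
    while True:
        cur = conn.execute(
            """UPDATE users SET new_col = old_col
               WHERE rowid IN (SELECT rowid FROM users
                               WHERE new_col IS NULL LIMIT ?)""",
            (batch_size,),
        )
        conn.commit()              # short transactions keep lock windows small
        if cur.rowcount == 0:
            break                  # nothing left to backfill
        total += cur.rowcount
        # in production: sleep briefly here to limit replication lag / WAL pressure
    return total
```

On Postgres the same shape works with a primary-key range or ctid-based batching; the principle is identical — many small transactions, never one giant UPDATE.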

Phase 3: Contract (removal)

  • Deploy code that only uses the new column
  • Once all old-code instances are gone (rolling deployment complete), drop the old column in a separate migration

Total time: 2–3 deployments over days/weeks. Slower than a simple ALTER TABLE RENAME COLUMN, but zero downtime and instant rollback at every step.

Additive-Only Schema Changes (Safe in Rolling Deploys)

  • Adding a nullable column
  • Adding a new table
  • Adding an index (CONCURRENTLY in Postgres — no table lock)
  • Adding a new enum value (be careful — some ORMs break on unknown enum values)

Dangerous Schema Changes (Require Expand-Contract or Maintenance Window)

  • Renaming a column or table
  • Removing a column (old code still references it)
  • Changing a column type
  • Making a nullable column NOT NULL (without a default or backfill)

Tooling

Flyway / Liquibase: Version-controlled migration scripts. Run as part of deployment. Good for most teams — migrations are in source control alongside the code.

Best practice: Never run migrations during application startup. Run them as a separate init container or pre-deployment step. Application startup should be fast and deterministic; migrations can be slow and irreversible.


Monorepo vs Polyrepo

Monorepo

All services in one repository. Google, Meta, and Twitter (X) use large monorepos.

Advantages:

  • Atomic cross-service changes. Change the API contract and update all consumers in one commit.
  • Unified tooling, standards, and dependency management. One version of a library used everywhere.
  • Easier code sharing and refactoring across service boundaries.
  • Simpler discovery — one place to search all code.
  • Single CI/CD pipeline (with build graph optimization — only build changed services).

Disadvantages:

  • Scale challenges. Naive monorepo tooling (running all tests on every commit) breaks at scale. Need Bazel, Nx, Turborepo, or similar build graph tools.
  • Clone and IDE performance. A 10GB repository is slow to clone and index.
  • Access control is harder. Restricting who can modify what requires CODEOWNERS or custom checks.

Polyrepo

Each service in its own repository.

Advantages:

  • Simpler per-service tooling and CI.
  • Clear ownership boundaries (repo = team).
  • No build system complexity for incremental builds.

Disadvantages:

  • Cross-service changes require PRs across multiple repos — coordination overhead.
  • Dependency management is hard — keeping library versions consistent across repos.
  • Code discovery is harder — where does this function live?
  • Duplication of boilerplate and configuration (CI templates, linting config, etc.).

The EM take: The choice often depends on team scale and discipline. A small, cohesive team in a monorepo moves fast. A large organization with independent team ownership often works better with polyrepo (or a hybrid: monorepo per domain, polyrepo across domains). Don’t choose monorepo unless you’re prepared to invest in build tooling (Bazel, Nx, Turborepo).