Blue-Green and Canary Deployments

Understand deployment strategies and their QA implications. Learn blue-green, canary, rolling, and shadow deployments with testing approaches for each.

Why Deployment Strategies Matter for QA

How software is deployed directly affects how it should be tested. A big-bang deployment (replacing everything at once) requires different QA approaches than a gradual canary rollout. Understanding deployment strategies helps QA engineers design appropriate validation steps and rollback procedures.

Deployment Strategies Overview

Big-Bang Deployment

Replace the old version entirely with the new version at once. Simple but risky — if something goes wrong, all users are affected immediately.

QA approach: Because there is no gradual exposure, test thoroughly before deployment, validate comprehensively in staging, and run smoke tests immediately after deployment.

Blue-Green Deployment

Maintain two identical production environments:

Blue: Currently serving traffic (old version)
Green: New version deployed and tested

Once green is validated, the load balancer switches all traffic from blue to green in a single step. Blue stays running as the rollback target.

Before switch:     Users → [Load Balancer] → Blue (v1.0)
                                              Green (v1.1) ← QA validates

After switch:      Users → [Load Balancer] → Green (v1.1)
                                              Blue (v1.0) ← Rollback ready

QA validation steps:

Deploy new version to green environment
Run full smoke test suite against green
Verify database migrations apply cleanly and, if blue and green share a database, that blue (v1.0) still works against the migrated schema
Check health endpoints and monitoring
Switch traffic to green
Run production smoke tests immediately
Monitor metrics for 15-30 minutes
Keep blue running for fast rollback (typically 24-48 hours)
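
The switch and rollback steps above can be sketched in a few lines. This is a minimal in-memory model, not a real load balancer integration: the `BlueGreenRouter` class, its method names, and the `smoke_test` callable are all hypothetical names introduced for illustration. It shows the key property QA relies on: traffic moves only if the smoke test passes, and the previous environment remains available as the rollback target.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Hypothetical in-memory stand-in for a load balancer with one live environment."""
    versions: Dict[str, str]   # environment name -> deployed version
    live: str = "blue"         # environment currently serving users
    previous: str = ""         # rollback target, set after a successful switch

    def switch_to(self, target: str, smoke_test: Callable[[str], bool]) -> bool:
        """Switch traffic to target only if the smoke test passes against it."""
        if target not in self.versions:
            raise ValueError(f"unknown environment: {target}")
        if not smoke_test(target):
            return False       # the live environment is left untouched
        self.previous, self.live = self.live, target
        return True

    def rollback(self) -> None:
        """Point traffic back at the environment that was live before the switch."""
        if not self.previous:
            raise RuntimeError("no previous environment to roll back to")
        self.live, self.previous = self.previous, self.live
```

In a real setup the `smoke_test` callable would run the smoke suite against green's internal address, and `switch_to` would call whatever API the load balancer or router exposes.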

Canary Deployment

Route a small percentage of traffic to the new version, then increase that share step by step as long as metrics stay healthy.

Phase 1:  95% → v1.0    5% → v1.1   (canary)
Phase 2:  75% → v1.0   25% → v1.1
Phase 3:  50% → v1.0   50% → v1.1
Phase 4:   0% → v1.0  100% → v1.1   (full rollout)
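
One common way to implement the percentage split is to hash a stable identifier into a bucket, so that the same user keeps seeing the same version within a phase. The sketch below assumes routing by user ID; the function names `bucket_for` and `route` are hypothetical, and real traffic routers may split by request, cookie, or header instead.

```python
import hashlib


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(user_id: str, canary_percent: int) -> str:
    """Return 'canary' for roughly canary_percent of users, otherwise 'stable'."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if bucket_for(user_id) < canary_percent else "stable"
```

Because a user's bucket never changes, anyone routed to the canary at 5% is still routed to it at 25% and 50%. That keeps sessions consistent as the rollout widens, which makes canary-specific bug reports easier to reproduce.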

QA validation at each phase:

Error rates (compare canary vs. baseline)
Response times (P50, P95, P99)
Business metrics (conversion rates, user engagement)
Infrastructure metrics (CPU, memory, disk)
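
As an illustration of the response-time comparison, the sketch below computes P50, P95, and P99 with the nearest-rank method and flags any percentile where the canary is noticeably slower than the baseline. The function names and the 1.2x tolerance are assumptions chosen for the example, not recommended values.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def latency_regressions(
    baseline: Sequence[float],
    canary: Sequence[float],
    max_ratio: float = 1.2,  # assumed tolerance: canary may be up to 20% slower
) -> Dict[str, bool]:
    """Return, per percentile, whether the canary exceeds max_ratio times the baseline."""
    return {
        f"P{p}": percentile(canary, p) > max_ratio * percentile(baseline, p)
        for p in (50, 95, 99)
    }
```

Comparing the canary against a baseline measured over the same time window, rather than against a fixed number, helps separate a regression in the new version from a general slowdown affecting both.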

Rolling Deployment

Update instances one at a time (or in batches). At any point, some instances run the old version and some run the new version.

QA concerns:

Users may hit different versions within the same session
API version compatibility between old and new instances
Database schema must be compatible with both versions simultaneously
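
The compatibility concerns above are often handled by making readers tolerant of both shapes of data while the rollout is in progress. The following sketch assumes a hypothetical change in which v1.1 adds a `currency` field to order records; the record layout, the `read_order` function, and the `USD` default are all illustrative assumptions.

```python
from typing import Any, Dict

# Assumption: orders written by v1.0 were implicitly in this currency.
DEFAULT_CURRENCY = "USD"


def read_order(record: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize an order written by either version into one shape.

    v1.0 (hypothetical) writes: id, amount_cents
    v1.1 (hypothetical) writes: id, amount_cents, currency
    """
    return {
        "id": record["id"],
        "amount_cents": record["amount_cents"],
        # Absent on records written by v1.0 instances, so fall back to the default.
        "currency": record.get("currency", DEFAULT_CURRENCY),
    }
```

A useful QA check for a rolling deployment is to feed records produced by each version into the reader of the other version, since both directions occur while instances are mixed.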

Shadow Deployment

Deploy the new version alongside production but do not serve real traffic. Instead, mirror (copy) production traffic to the shadow environment and compare responses.

QA validation:

Compare response bodies between production and shadow
Check for errors in shadow that do not occur in production
Validate performance characteristics
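Confirm shadow responses are discarded and shadow side effects (such as outbound emails or payments) are disabled, so shadow results do not affect users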
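
Comparing response bodies usually requires ignoring fields that legitimately differ on every request. A minimal sketch, assuming JSON-like responses represented as dictionaries; the function name and the default ignored fields (`timestamp`, `request_id`) are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def diff_responses(
    production: Dict[str, Any],
    shadow: Dict[str, Any],
    ignore: Iterable[str] = ("timestamp", "request_id"),  # assumed volatile fields
) -> List[str]:
    """Return the sorted top-level field names whose values differ between the two responses."""
    missing = object()  # sentinel so an absent field differs from any real value
    keys = (set(production) | set(shadow)) - set(ignore)
    return sorted(
        key for key in keys
        if production.get(key, missing) != shadow.get(key, missing)
    )
```

An empty list means the shadow matched production for that request. Non-empty results are worth aggregating by field name, since a field that differs on every request is more likely a missing ignore rule than a bug.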

Rollback Procedures

Rollback Criteria

Define clear, measurable criteria before deployment:

Metric                  Threshold                        Action
Error rate              > 1% increase                    Automatic rollback
P95 response time       > 500ms (baseline: 200ms)        Alert + manual decision
Conversion rate         > 5% decrease                    Manual rollback
Health check failures   Any pod unhealthy for > 2 min    Automatic rollback
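
Criteria like these are easiest to apply consistently when they are written down as code before the deployment. The sketch below encodes the table under two stated assumptions: the error-rate threshold is read as an increase of more than 1 percentage point, and the conversion threshold as a relative drop of more than 5%. The `Snapshot` type and `evaluate` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Hypothetical metrics snapshot for one version over one time window."""
    error_rate_pct: float             # errors as a percentage of requests
    p95_ms: float                     # P95 response time in milliseconds
    conversion_rate_pct: float        # conversions as a percentage of sessions
    max_pod_unhealthy_seconds: float  # longest continuous unhealthy period of any pod


def evaluate(baseline: Snapshot, current: Snapshot) -> List[str]:
    """Return the actions triggered by comparing current metrics to the baseline."""
    actions: List[str] = []
    # Assumption: "> 1% increase" means more than 1 percentage point above baseline.
    if current.error_rate_pct - baseline.error_rate_pct > 1.0:
        actions.append("automatic rollback: error rate")
    if current.max_pod_unhealthy_seconds > 120:
        actions.append("automatic rollback: health checks")
    if current.p95_ms > 500:
        actions.append("alert + manual decision: P95 response time")
    # Assumption: "> 5% decrease" is relative to the baseline conversion rate.
    if baseline.conversion_rate_pct > 0:
        drop = (baseline.conversion_rate_pct - current.conversion_rate_pct)
        if drop / baseline.conversion_rate_pct * 100 > 5.0:
            actions.append("manual rollback: conversion rate")
    return actions
```

Whichever reading of each threshold the team chooses, stating it explicitly in code removes the ambiguity that tends to surface during an incident.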

Rollback Testing

Critical: Test the rollback procedure itself. A rollback that has never been tested is not a rollback plan — it is a hope.

Deploy new version to staging
Simulate a failure scenario
Execute the rollback procedure
Verify the old version is serving correctly
Verify no data was lost or corrupted during rollback
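
The drill above can be automated so that it runs regularly in staging. This is a rough sketch under the assumption that each step is available as a callable and that a set of rows can be read before and after; every name here (`fingerprint`, `run_rollback_drill`, and the parameters) is hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict, List


def fingerprint(rows: List[Dict]) -> str:
    """Order-independent hash of a set of rows, used to detect lost or changed data."""
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


def run_rollback_drill(
    deploy_new: Callable[[], None],
    simulate_failure: Callable[[], None],
    rollback: Callable[[], None],
    serving_version: Callable[[], str],
    read_rows: Callable[[], List[Dict]],
    old_version: str,
) -> Dict[str, bool]:
    """Run the drill steps in order and report whether the rollback succeeded."""
    before = fingerprint(read_rows())  # snapshot taken before the new version is deployed
    deploy_new()
    simulate_failure()
    rollback()
    return {
        "old_version_serving": serving_version() == old_version,
        "data_intact": fingerprint(read_rows()) == before,
    }
```

The fingerprint comparison only makes sense for data that should not change during the drill. Rows that are legitimately written while the new version is live need a separate check, for example confirming that they are still readable by the old version.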

Exercise: Design Deployment Validation

Your team is switching from big-bang deployments to canary. Design the validation plan for a 4-phase canary rollout of a payment processing update.

Solution

Pre-Deployment

Full regression suite passed in staging
Payment sandbox tests all pass
Database migration tested and reversible
Rollback procedure documented and tested

Phase 1: 1% Traffic (30 minutes)

Automated checks:

Payment success rate ≥ 99.5% (baseline: 99.7%)
API response time P95 < 300ms
Zero HTTP 500 errors from the new version

Manual checks:

Process test transactions through canary
Verify webhook deliveries
Check payment provider dashboard

Go/No-Go: All metrics within thresholds → proceed to Phase 2; any threshold breached → roll back

Phase 2: 10% Traffic (2 hours)

Automated checks:

Same checks as Phase 1, now with enough traffic for statistically significant comparisons
Compare canary vs. baseline error rates
Monitor database connection pool usage
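
One way to decide whether a difference in error rates is more than noise is a two-proportion z-test. The sketch below is one possible approach, not the only valid one; the function names are hypothetical, and the threshold of 1.645 corresponds to a one-sided test at the 5% level.

```python
import math


def two_proportion_z(
    canary_errors: int, canary_total: int,
    baseline_errors: int, baseline_total: int,
) -> float:
    """z-statistic for the canary error rate minus the baseline error rate."""
    if canary_total <= 0 or baseline_total <= 0:
        raise ValueError("both groups need at least one request")
    p_canary = canary_errors / canary_total
    p_baseline = baseline_errors / baseline_total
    pooled = (canary_errors + baseline_errors) / (canary_total + baseline_total)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / canary_total + 1 / baseline_total))
    if std_err == 0:
        return 0.0  # no variation at all (for example, zero errors in both groups)
    return (p_canary - p_baseline) / std_err


def canary_is_worse(
    canary_errors: int, canary_total: int,
    baseline_errors: int, baseline_total: int,
    z_threshold: float = 1.645,  # one-sided test at the 5% significance level
) -> bool:
    """True if the canary error rate is significantly higher than the baseline."""
    z = two_proportion_z(canary_errors, canary_total, baseline_errors, baseline_total)
    return z > z_threshold
```

For example, 30 errors in 1,000 canary requests against 90 errors in 9,000 baseline requests gives a z-statistic above 5, a clear signal. At 1% traffic the canary sample is often too small for this test to detect anything but large regressions, which is one reason to widen exposure in stages.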

Manual checks:

Verify refund processing works
Check email receipt delivery
Review payment logs for anomalies

Phase 3: 50% Traffic (4 hours)

Automated checks:

Full metric comparison
Load testing validation at expected traffic
Memory and CPU usage within bounds

Phase 4: 100% Traffic

Post-rollout:

Full smoke test suite
Monitor for 24 hours
Keep rollback ready for 48 hours
Close deployment ticket only after 48-hour observation
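
The four phases can also be expressed as data plus a small control loop, which makes the plan reviewable before the rollout starts. In this sketch the `Phase` type, the `PLAN` list, and the `set_traffic`, `wait`, and `gate` callables are hypothetical; the percentages and durations come from the plan above, with the 24-hour monitoring period used as the soak time for Phase 4.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Phase:
    percent: int        # share of traffic sent to the new version
    soak_minutes: int   # how long to observe before evaluating the gate


PLAN = [
    Phase(percent=1, soak_minutes=30),
    Phase(percent=10, soak_minutes=120),
    Phase(percent=50, soak_minutes=240),
    Phase(percent=100, soak_minutes=24 * 60),
]


def run_rollout(
    plan: Sequence[Phase],
    set_traffic: Callable[[int], None],
    wait: Callable[[int], None],
    gate: Callable[[Phase], bool],
) -> int:
    """Walk through the phases; return the final canary percent (0 if rolled back)."""
    for phase in plan:
        set_traffic(phase.percent)
        wait(phase.soak_minutes)
        if not gate(phase):
            set_traffic(0)  # Go/No-Go failed: send all traffic back to the old version
            return 0
    return plan[-1].percent if plan else 0
```

The `gate` callable is where the automated checks for each phase would run. Manual checks can be modeled the same way, for instance by having the gate wait for an explicit approval.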

Key Takeaways

Choose a deployment strategy based on risk tolerance: critical systems benefit most from canary's gradual exposure, while lower-risk internal tools can often use blue-green
Define rollback criteria before deploying — not during a production incident
Test the rollback itself — an untested rollback is not a plan
Monitor actively during rollout — do not deploy and walk away
Canary deployments are QA’s best friend — they limit blast radius and provide real production feedback
