Why Test in Production?
Pre-production environments, no matter how carefully configured, never perfectly replicate production. Production has real user data patterns, real traffic volumes, real third-party integrations, and real infrastructure complexity. Some bugs only surface under these conditions.
Testing in production does not mean abandoning pre-production testing. It means adding a layer of validation that catches what pre-production testing cannot.
Strategies for Safe Testing in Production
Synthetic Monitoring
Automated scripts that continuously execute critical user journeys against production:
```javascript
// Synthetic check: login flow
async function checkLogin() {
  const response = await fetch('https://api.example.com/auth/login', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }, // required for a JSON body
    body: JSON.stringify({
      email: 'synthetic-monitor@example.com', // Dedicated test account
      password: process.env.SYNTHETIC_PASSWORD,
    }),
  });

  if (response.status !== 200) {
    alert('Login flow broken!'); // placeholder: page on-call via your alerting system
    return; // no point parsing the body of a failed response
  }

  const data = await response.json();
  if (!data.token) {
    alert('Login returns no token!');
  }
}

// Run every 5 minutes, 24/7
```
Key rules for synthetic monitoring:
- Use dedicated test accounts, never real user accounts
- Tests must be non-destructive (read-only operations or use test payment methods)
- Run from multiple geographic locations to detect regional issues
- Monitor both success rate and response time
Dark Launching
Deploy new code to production but do not expose it to users. The new code processes real requests in the background, but its results are discarded:
```javascript
async function getProductRecommendations(userId) {
  // Current (live) path
  const currentResults = await currentEngine.recommend(userId);

  // New engine (dark launch) — runs but results are not shown to users
  try {
    const newResults = await newEngine.recommend(userId);
    // Log comparison for analysis
    metrics.compare('recommendations', currentResults, newResults);
  } catch (error) {
    // New engine errors do not affect users
    logger.warn('Dark launch error', error);
  }

  return currentResults; // Always return current results
}
```
Traffic Mirroring
Copy production traffic to a shadow environment:
```yaml
# Istio traffic mirroring configuration
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: product-service # name added for a valid manifest
spec:
  hosts:
    - product-service
  http:
    - route:
        - destination:
            host: product-service
            subset: v1
      mirror:
        host: product-service
        subset: v2-shadow
      mirrorPercentage:
        value: 10.0 # Mirror 10% of traffic
```
Canary Testing
Route a small percentage of real traffic to the new version (covered in detail in Lesson 9.11).
Observability-Driven Testing
Use production monitoring to continuously verify quality:
- Error budgets: Track how much of your error budget has been consumed
- Anomaly detection: Alert when metrics deviate from learned baselines
- Real user monitoring (RUM): Track actual user experience metrics
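The error-budget arithmetic behind the first bullet is simple: a 99.9% availability SLO allows 0.1% of requests to fail, and consumption is failures divided by that allowance. A sketch (the function name is illustrative):

```javascript
// Returns the fraction of the error budget consumed in a window.
// 1.0 means the budget is fully spent; > 1.0 means the SLO was violated.
function errorBudgetConsumed(sloTarget, totalRequests, failedRequests) {
  const allowedFailures = totalRequests * (1 - sloTarget);
  if (allowedFailures === 0) {
    return failedRequests > 0 ? Infinity : 0; // a 100% SLO has no budget
  }
  return failedRequests / allowedFailures;
}
```

For example, with a 99.9% SLO over one million requests, 500 failures consume half the budget; teams often freeze risky rollouts once consumption crosses an agreed threshold.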
When NOT to Test in Production
| Scenario | Risk | Alternative |
|---|---|---|
| Tests that create real orders | Financial impact | Use test accounts with sandbox payment |
| Tests that send real emails/SMS | User confusion | Use test notification channels |
| Load tests at full scale | Performance degradation | Run during low-traffic hours or use shadow env |
| Destructive database operations | Data loss | Never in production |
| Tests involving PII | Privacy violation | Use synthetic data |
Exercise: Design a Production Testing Strategy
Your team launches a new search engine for an e-commerce site. Design a production testing strategy that validates the new search without affecting users.
Solution
Phase 1: Dark Launch (Week 1)
- Deploy new search engine behind feature flag (off for all users)
- Mirror 5% of search queries to new engine
- Compare results: relevance, response time, error rate
- Log comparisons for analysis
- Criteria to proceed: new engine P95 < 200ms, zero errors, relevance score >= current
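One way to quantify the "relevance score >= current" comparison during the dark launch is overlap of the top-N result IDs between the two engines (Jaccard similarity). This is a sketch of one possible metric, not the only valid one; the function name is illustrative:

```javascript
// Fraction of result IDs shared between the current and new engines,
// relative to all distinct IDs returned by either (Jaccard similarity).
function topNOverlap(currentIds, newIds) {
  const a = new Set(currentIds);
  const b = new Set(newIds);
  let shared = 0;
  for (const id of a) {
    if (b.has(id)) shared++;
  }
  const unionSize = a.size + b.size - shared;
  return unionSize === 0 ? 1 : shared / unionSize;
}
```

Logging this per mirrored query gives a distribution to review before proceeding: a low overlap is not automatically bad (the new engine may rank better), but it flags queries worth inspecting by hand.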
Phase 2: Synthetic Monitoring (Week 2)
- 50 predefined search queries running every 10 minutes
- Verify result count, response time, and result relevance
- Alert if any synthetic check fails twice consecutively
- Run from 3 geographic regions
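The "fails twice consecutively" rule keeps a single transient blip from paging anyone. A minimal sketch of that debouncing logic (names are illustrative):

```javascript
// Wraps an alert callback so it only fires after `threshold`
// consecutive failures; any success resets the streak.
function makeConsecutiveFailureAlerter(threshold, alertFn) {
  let streak = 0;
  return function record(success) {
    streak = success ? 0 : streak + 1;
    if (streak >= threshold) {
      alertFn(streak);
      streak = 0; // reset so a sustained outage re-alerts periodically
    }
  };
}
```

Each synthetic check would call `record(true)` or `record(false)` after every run; with `threshold = 2`, the pattern fail, pass, fail, fail produces exactly one alert.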
Phase 3: Canary (Week 3)
- Enable new search for 1% of users via feature flag
- Compare metrics: click-through rate, conversion from search, bounce rate
- Monitor: error rate, response time, user complaints
- Gradually increase to 5%, 25%, 50%, 100%
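The gradual increase above can be gated on metrics rather than a fixed schedule: advance to the next stage only while the canary's error rate stays near the baseline, and roll back otherwise. A sketch with illustrative names and thresholds:

```javascript
// Rollout stages from the plan above.
const STAGES = [1, 5, 25, 50, 100];

// Returns the next rollout percentage, or 0 (canary disabled) if the
// canary's error rate exceeds the baseline by more than `tolerance`.
function nextStage(currentStage, canaryErrorRate, baselineErrorRate, tolerance = 0.001) {
  if (canaryErrorRate > baselineErrorRate + tolerance) {
    return 0; // roll back via the feature flag
  }
  const i = STAGES.indexOf(currentStage);
  return i === -1 || i === STAGES.length - 1 ? currentStage : STAGES[i + 1];
}
```

Running this check once per evaluation window (say, daily) reproduces the 1% to 100% progression when the canary is healthy, and turns the flag off the moment it is not.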
Phase 4: Ongoing Production Testing
- Synthetic monitoring: 24/7, 50 queries every 10 minutes
- A/B experiments for search relevance improvements
- Real user monitoring for search performance
- Weekly review of search quality metrics
Key Takeaways
- Production testing supplements, not replaces, pre-production testing
- Synthetic monitoring catches outages before users report them
- Dark launching validates new code with real traffic, zero user impact
- Traffic mirroring tests at production scale without risk
- Always have safeguards — test accounts, feature flags, non-destructive operations