Why Estimation Matters
Every sprint planning, every project kickoff, every stakeholder meeting includes the question: “How long will testing take?” Getting this answer wrong has real consequences:
- Underestimate: Testing is rushed, bugs escape to production, team burns out
- Overestimate: Budget is wasted, team credibility suffers, features are delayed unnecessarily
Good estimation is not about being perfectly accurate — it is about being close enough to make informed decisions.
Factors Affecting Test Estimates
Before applying any technique, understand what influences testing time:
| Factor | Impact on Estimate |
|---|---|
| Feature complexity | More complex = more test cases, more edge cases |
| Team experience | New team or new domain = longer testing |
| Requirements quality | Vague requirements = more exploratory testing, more rework |
| Test automation maturity | More automation = less manual execution time |
| Environment stability | Unstable environments = blocked testing, delays |
| Dependencies | External services, other teams = waiting time |
| Defect density | More bugs = more retesting and regression |
| Regulatory requirements | Compliance testing adds overhead |
Estimation Techniques
1. Work Breakdown Structure (WBS)
WBS breaks testing into smaller, estimable tasks. This is the most intuitive and widely used technique.
Steps:
- List all testing activities
- Break each activity into sub-tasks
- Estimate each sub-task in hours
- Sum up for total estimate
Example:
| Activity | Sub-tasks | Hours |
|---|---|---|
| Requirements review | Review 20 user stories | 8 |
| | Write acceptance criteria | 4 |
| Test case design | Write functional test cases (40 cases) | 16 |
| | Write negative test cases (20 cases) | 8 |
| | Prepare test data | 4 |
| Test execution | Execute manual tests (60 cases) | 24 |
| | Execute automated tests (setup + run) | 8 |
| Bug management | Log bugs, verify fixes, retest (est. 15 bugs) | 12 |
| Regression | Execute regression suite | 8 |
| Reporting | Daily status, final report | 4 |
| Total | | 96 (12 days) |
Pros: Transparent, easy to justify, catches hidden tasks. Cons: Time-consuming for large projects, accuracy depends on task granularity.
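The WBS arithmetic is simple enough to keep in a small script, which also gives you a reusable record for tracking estimates against actuals later. This is a minimal sketch using the example table above; the task names mirror that table, and the 8-hour working day is an assumption.

```python
# Hypothetical WBS sketch: activities map to sub-tasks with hour estimates.
# Values mirror the example table; HOURS_PER_DAY = 8 is an assumption.
wbs = {
    "Requirements review": {"Review 20 user stories": 8,
                            "Write acceptance criteria": 4},
    "Test case design": {"Functional test cases (40)": 16,
                         "Negative test cases (20)": 8,
                         "Prepare test data": 4},
    "Test execution": {"Manual tests (60 cases)": 24,
                       "Automated tests (setup + run)": 8},
    "Bug management": {"Log, verify, retest (~15 bugs)": 12},
    "Regression": {"Regression suite": 8},
    "Reporting": {"Daily status, final report": 4},
}

HOURS_PER_DAY = 8  # assumption: one tester-day = 8 hours

# Sum every sub-task estimate across all activities.
total = sum(h for subtasks in wbs.values() for h in subtasks.values())
print(f"Total: {total}h ({total / HOURS_PER_DAY:.0f} days)")  # Total: 96h (12 days)
```

Keeping the breakdown as data rather than a spreadsheet cell makes it easy to diff against a later re-estimate.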
2. Three-Point Estimation (PERT)
Three-Point Estimation accounts for uncertainty by using three values:
- O (Optimistic): Best case — everything goes perfectly
- M (Most Likely): Realistic case — normal conditions
- P (Pessimistic): Worst case — everything goes wrong
Formula (PERT):
Estimate = (O + 4M + P) / 6
Standard Deviation = (P - O) / 6
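The two formulas can be wrapped in a small helper so every task is computed the same way. A minimal sketch (the function name is illustrative):

```python
def pert_estimate(o: float, m: float, p: float) -> tuple[float, float]:
    """Return (expected hours, standard deviation) for one task,
    using the PERT weighted average: (O + 4M + P) / 6."""
    mean = (o + 4 * m + p) / 6
    sd = (p - o) / 6
    return mean, sd

# "Test case design" row from the example table: O=12h, M=16h, P=28h.
mean, sd = pert_estimate(12, 16, 28)
print(f"{mean:.1f}h ± {sd:.1f}h")  # 17.3h ± 2.7h
```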
Example:
| Task | Optimistic | Most Likely | Pessimistic | PERT Estimate |
|---|---|---|---|---|
| Test case design | 12h | 16h | 28h | 17.3h |
| Test execution | 16h | 24h | 40h | 25.3h |
| Bug management | 4h | 12h | 24h | 12.7h |
| Regression | 4h | 8h | 16h | 8.7h |
| Total | 36h | 60h | 108h | 64h (8 days) |
Pros: Accounts for uncertainty, statistically grounded, produces confidence ranges. Cons: Requires experience to set O/M/P values, team may default to pessimistic.
3. Wideband Delphi
A consensus-based technique where multiple experts estimate independently, then discuss and converge.
Steps:
- Present the testing scope to 3-5 experts
- Each expert estimates independently (no discussion)
- Collect estimates anonymously
- Reveal all estimates simultaneously
- Discuss outliers — why did someone estimate 5 days vs. 15 days?
- Re-estimate (independently again)
- Repeat until estimates converge (usually 2-3 rounds)
Example Round 1:
| Expert | Estimate |
|---|---|
| QA Engineer A | 8 days |
| QA Engineer B | 14 days |
| QA Engineer C | 10 days |
| Dev Lead | 6 days |
Discussion reveals: QA Engineer B included performance testing; Dev Lead forgot about regression. After adjusting scope understanding, Round 2 converges to 10-12 days.
Pros: Reduces individual bias, leverages team knowledge, builds consensus. Cons: Requires multiple experts, time-consuming, dominant personalities may sway the group.
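The "repeat until estimates converge" step benefits from an objective stopping rule. One common choice is the coefficient of variation (standard deviation divided by the mean); this sketch uses a 15% cutoff, which is an illustrative assumption rather than a standard value.

```python
import statistics

def converged(estimates: list[float], tolerance: float = 0.15) -> bool:
    """True once the spread of expert estimates (coefficient of
    variation) drops to the tolerance. 15% cutoff is an assumption."""
    cv = statistics.stdev(estimates) / statistics.mean(estimates)
    return cv <= tolerance

# Round 1 from the example (8, 14, 10, 6 days) is far apart;
# a hypothetical Round 2 after the scope discussion is much tighter.
round1 = [8, 14, 10, 6]
round2 = [10, 12, 11, 10]
print(converged(round1), converged(round2))  # False True
```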
4. Use Case Point Method
Estimates testing effort based on the number and complexity of use cases.
Steps:
- Count use cases by complexity (Simple, Medium, Complex)
- Assign weights: Simple = 5, Medium = 10, Complex = 15
- Calculate Unadjusted Use Case Points (UUCP)
- Apply technical and environmental complexity factors
- Convert to testing hours using a productivity factor
Example:
| Complexity | Count | Weight | Points |
|---|---|---|---|
| Simple | 5 | 5 | 25 |
| Medium | 8 | 10 | 80 |
| Complex | 3 | 15 | 45 |
| UUCP | | | 150 |
Adjusted UCP = 150 × complexity factor (e.g., 0.8) = 120
Testing hours = 120 × productivity factor (e.g., 0.5 hours/point) = 60 hours
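The same calculation as a sketch. The weights match the steps above; the complexity and productivity factors are the illustrative example values, which in practice you would calibrate from your own historical data.

```python
# Use Case Point arithmetic. Weights are from the technique description;
# the 0.8 and 0.5 factors are the example's illustrative values.
WEIGHTS = {"simple": 5, "medium": 10, "complex": 15}
counts = {"simple": 5, "medium": 8, "complex": 3}

uucp = sum(counts[k] * WEIGHTS[k] for k in counts)  # Unadjusted Use Case Points
adjusted_ucp = uucp * 0.8                           # technical/environmental factor
testing_hours = adjusted_ucp * 0.5                  # productivity: hours per point
print(uucp, adjusted_ucp, testing_hours)  # 150 120.0 60.0
```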
5. Function Point Analysis
Estimates based on the functional size of the software (inputs, outputs, inquiries, files, interfaces). It is more formal than the other techniques, but common in large organizations and government contracts.
6. Historical Data (Analogy-Based)
Use data from previous similar projects to estimate the current one.
Example: “The login module in Project A took 5 days to test. This project’s login module is similar but adds two-factor authentication. Estimate: 7 days.”
Pros: Grounded in reality, fast, easy to justify. Cons: Requires historical data, assumes similar conditions, past projects may not be comparable.
Estimation Accuracy Over Time
Estimates become more accurate as you learn more:
- Project start: ±50% accuracy
- After requirements: ±25% accuracy
- After design: ±15% accuracy
- During testing: ±5% accuracy
This is called the Cone of Uncertainty. Early estimates are inherently less accurate. Plan for this by providing ranges, not single numbers.
Exercise: Estimate Testing Effort Using Two Techniques
You are the QA lead for a project to build a hotel booking system. Features include:
- User registration and login (with social login)
- Hotel search (by location, dates, price, rating)
- Room booking with payment (credit card, PayPal)
- Booking management (view, modify, cancel)
- Review system (rate hotels, write reviews)
- Admin panel (manage hotels, view bookings, generate reports)
Team: 2 QA engineers (1 senior, 1 mid-level), both have experience with e-commerce but not hotel booking systems specifically.
Your task:
- Estimate using WBS — break testing into activities and sub-tasks with hour estimates
- Estimate using Three-Point Estimation — provide O/M/P for major testing activities
- Compare the two estimates and explain which you would present to stakeholders
- List 3 risks that could make your estimates wrong
Hint
Consider:
- Payment testing needs extra time for security testing (PCI considerations)
- Social login integration has external dependencies (Google, Facebook APIs)
- Search functionality needs performance testing (complex queries)
- Admin panel is often underestimated — it has many features
- The team has no hotel booking domain experience — add learning time
- Don’t forget regression testing, environment setup, and reporting
Sample Solution
WBS Estimate
| Activity | Sub-tasks | Senior QA (h) | Mid QA (h) | Total (h) |
|---|---|---|---|---|
| Requirements review | Review all features, write acceptance criteria | 12 | 8 | 20 |
| Test case design | Registration/login (15 cases) | 4 | 4 | 8 |
| | Search (20 cases) | 6 | 4 | 10 |
| | Booking + payment (25 cases) | 8 | 6 | 14 |
| | Booking management (15 cases) | 4 | 4 | 8 |
| | Reviews (10 cases) | 2 | 4 | 6 |
| | Admin panel (20 cases) | 6 | 4 | 10 |
| Test data preparation | Users, hotels, bookings, payment test data | 4 | 4 | 8 |
| Test execution | Manual functional testing (105 cases) | 20 | 24 | 44 |
| | Security testing (payments, auth) | 12 | 0 | 12 |
| | Performance testing (search, booking) | 8 | 0 | 8 |
| | Cross-browser/device testing | 4 | 8 | 12 |
| Bug management | Log, verify, retest (~25 bugs estimated) | 8 | 8 | 16 |
| Regression | Two regression cycles | 8 | 8 | 16 |
| Reporting | Status reports, final report | 4 | 2 | 6 |
| Total | | 110 | 88 | 198 (~25 days) |
Three-Point Estimate
| Activity | Optimistic | Most Likely | Pessimistic | PERT |
|---|---|---|---|---|
| Requirements + design | 30h | 56h | 80h | 56h |
| Test execution (all types) | 50h | 76h | 120h | 79h |
| Bug management | 8h | 16h | 32h | 17h |
| Regression | 10h | 16h | 28h | 17h |
| Environment + data | 8h | 12h | 24h | 13h |
| Reporting | 4h | 6h | 12h | 7h |
| Total | 110h | 182h | 296h | 189h (~24 days) |
Standard deviation: (296-110)/6 = 31h → 95% confidence range: 189 ± 62h = 127-251h (16-31 days)
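The range arithmetic above can be checked in a few lines. Note that this sample solution applies (P − O) / 6 to the summed totals, which is simpler and more conservative than the textbook alternative of summing per-task variances; the sketch reproduces the solution's approach.

```python
# Project-level totals from the three-point table above.
o_total, m_total, p_total = 110, 182, 296

pert = (o_total + 4 * m_total + p_total) / 6   # expected effort
sd = (p_total - o_total) / 6                   # standard deviation on totals
low, high = pert - 2 * sd, pert + 2 * sd       # rough 95% confidence range
print(f"{pert:.0f}h ± {2 * sd:.0f}h → {low:.0f}-{high:.0f}h")  # 189h ± 62h → 127-251h
```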
Comparison
- WBS estimate: 198 hours (25 days)
- PERT estimate: 189 hours (24 days)
Both estimates are close (~5% difference), which increases confidence. Present to stakeholders as: “24-25 working days, with a range of 20-31 days depending on defect density and environment stability.”
Risks That Could Affect Estimates
- Payment gateway integration issues: If the test sandbox is unreliable, payment testing could take 2x longer
- Higher-than-expected defect density: If we find 40 bugs instead of 25, bug management jumps from 16 to 28 hours
- Social login API changes: Google or Facebook could change their OAuth flow, requiring test case updates and new testing
Common Estimation Mistakes
- Forgetting non-testing tasks: Environment setup, meetings, documentation, learning time
- Estimating only happy paths: Negative testing often takes as long as positive testing
- Ignoring regression: Every bug fix needs regression testing
- Single-number estimates: Always provide ranges — “10 days” sounds precise but is misleading
- Not tracking actuals: Without historical data, estimates never improve
Pro Tips
Track your estimates vs. actuals. After every project, compare what you estimated with what actually happened. This builds your estimation accuracy over time.
Add a buffer for unknowns. For new domains or technologies, add 20-30% to your estimate. For familiar territory, 10-15%.
Present ranges, not points. “We estimate 15-20 days” is more honest and useful than “We estimate 17.3 days.”
Use Wideband Delphi for high-stakes estimates. When the estimate determines budget or headcount, involve multiple experts to reduce individual bias.
Estimate testing during planning, not after. If you estimate after development starts, you are already behind.