Property-Based Testing (PBT) changes how we think about test cases. Instead of manually crafting individual examples, you define properties that should always hold, and the framework automatically generates hundreds of test cases to check them. This approach often uncovers edge cases that developers did not anticipate.
What is Property-Based Testing?
Property-Based Testing focuses on specifying invariants—rules that must always be true regardless of inputs—rather than checking specific input-output pairs. A property-based testing framework generates random inputs, runs the code under test, and verifies that the properties hold.
Core Concepts
Property: An assertion about the system that should hold for all valid inputs.
Generator: A function that produces random test data matching specific constraints.
Shrinking: When a failing test is found, the framework automatically simplifies the input to find the minimal failing case.
Invariant: A condition that remains true throughout program execution or across transformations.
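To make these terms concrete, here is a minimal, framework-free sketch of the generate-and-check loop, using only Python's standard library. The names `gen_int_list` and `check_property` are made up for illustration and do not come from any PBT library; real frameworks add shrinking, smarter generation, and reporting on top of this idea.

```python
import random

def gen_int_list(rng, max_len=10):
    """Generator: a random list of small integers (may be empty)."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]

def check_property(prop, gen, runs=200, seed=0):
    """Run prop on `runs` generated inputs; return the first failing input, or None."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            return value
    return None

def reverse_twice_is_identity(xs):
    """A property that holds for every list."""
    return list(reversed(list(reversed(xs)))) == xs

def always_sorted(xs):
    """A property that is false in general: not every list is sorted."""
    return xs == sorted(xs)

holds = check_property(reverse_twice_is_identity, gen_int_list)
counterexample = check_property(always_sorted, gen_int_list)
```

Here `holds` is `None` because no generated list violates the first property, while `counterexample` is some unsorted list that refutes the second.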
Properties vs. Example-Based Tests
Traditional Example-Based Testing
def test_reverse_list():
    assert reverse([1, 2, 3]) == [3, 2, 1]
    assert reverse([]) == []
    assert reverse([42]) == [42]
Limitations: Only tests three specific cases; may miss edge cases like large lists, duplicates, or unusual values.
Property-Based Approach
from hypothesis import given
import hypothesis.strategies as st

@given(st.lists(st.integers()))
def test_reverse_property(lst):
    # Property: Reversing twice returns original
    assert reverse(reverse(lst)) == lst
    # Property: Length is preserved
    assert len(reverse(lst)) == len(lst)
    # Property: Last element becomes first
    if lst:
        assert reverse(lst)[0] == lst[-1]
Advantages: Tests hundreds of random lists automatically; discovers unexpected edge cases.
Property-Based Testing Frameworks
Hypothesis (Python)
Hypothesis is the most mature PBT framework for Python, offering sophisticated strategies and excellent shrinking.
from hypothesis import given, strategies as st

@given(st.integers(), st.integers())
def test_addition_commutative(a, b):
    # Property: Addition is commutative
    assert a + b == b + a

@given(st.lists(st.integers(), min_size=1))
def test_max_is_in_list(lst):
    # Property: Max value must be in the list
    assert max(lst) in lst

@given(st.text())
def test_encode_decode(s):
    # Property: Encoding then decoding returns original
    encoded = s.encode('utf-8')
    decoded = encoded.decode('utf-8')
    assert s == decoded
QuickCheck (Haskell)
QuickCheck is the original property-based testing framework and the inspiration for most later ones.
import Data.List (sort)

-- Property: Reverse is its own inverse
prop_reverseInverse :: [Int] -> Bool
prop_reverseInverse xs = reverse (reverse xs) == xs

-- Property: Sorted list has all original elements
prop_sortPreservesElements :: [Int] -> Bool
prop_sortPreservesElements xs =
  sort xs `sameElements` xs
  where
    sameElements a b = sort a == sort b

-- Property: Appending then taking length sums lengths
prop_appendLength :: [Int] -> [Int] -> Bool
prop_appendLength xs ys =
  length (xs ++ ys) == length xs + length ys
fast-check (JavaScript/TypeScript)
Property-based testing for the JavaScript ecosystem, with TypeScript support.
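import fc from 'fast-check';

// Property: Array map preserves length
// fc.property only describes a property; fc.assert actually runs it
fc.assert(
  fc.property(
    fc.array(fc.integer()),
    fc.func(fc.integer()),
    (arr, f) => arr.map(f).length === arr.length
  )
);

// Property: JSON serialization round-trip
// fc.jsonValue() is used instead of fc.anything(), which also yields values
// that JSON cannot represent (such as undefined)
fc.assert(
  fc.property(
    fc.jsonValue(),
    (value) => {
      const serialized = JSON.stringify(value);
      const deserialized = JSON.parse(serialized);
      // Re-serializing the parsed value must give the same JSON text
      return JSON.stringify(deserialized) === serialized;
    }
  )
);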
JSVerify (JavaScript)
JSVerify is an earlier JavaScript PBT library, simpler but less feature-rich than fast-check.
const jsc = require('jsverify');

// Property: String split then join returns original
const splitJoinProperty = jsc.forall(
  jsc.string,
  jsc.string,
  (str, sep) => {
    if (sep === '') return true; // Skip empty separator
    return str.split(sep).join(sep) === str;
  }
);

jsc.assert(splitJoinProperty);
Generators and Strategies
Generators produce random test data constrained to specific types and ranges.
Built-in Generators
from hypothesis import strategies as st
# Basic types
st.integers() # Any integer
st.integers(min_value=0, max_value=100) # Range 0-100
st.floats() # Any float, including NaN and infinities
st.text() # Unicode strings
st.booleans() # True/False
# Collections
st.lists(st.integers()) # Lists of integers
st.sets(st.text(), min_size=1) # Non-empty sets of strings
st.dictionaries(st.text(), st.integers()) # String->Int dicts
st.tuples(st.integers(), st.text()) # Fixed-size tuples
# Optionals and choices
st.none() # Always None
st.one_of(st.integers(), st.text()) # Either int or string
Custom Generators
import string
from hypothesis import given, strategies as st
from hypothesis.strategies import composite

# Generate simple, ASCII-only email addresses
# (explicit alphabets avoid non-ASCII letters that Unicode categories would allow)
@composite
def email_strategy(draw):
    username = draw(st.text(
        alphabet=string.ascii_lowercase + string.digits,
        min_size=1,
        max_size=20
    ))
    domain = draw(st.text(
        alphabet=string.ascii_lowercase,
        min_size=1,
        max_size=15
    ))
    tld = draw(st.sampled_from(['com', 'org', 'net', 'edu']))
    return f"{username}@{domain}.{tld}"

@given(email_strategy())
def test_email_validation(email):
    assert '@' in email
    assert '.' in email.split('@')[1]
Generator Composition
from hypothesis import strategies as st
from hypothesis.strategies import composite

# Generate shopping cart with realistic constraints
@composite
def shopping_cart_strategy(draw):
    items = draw(st.lists(
        st.tuples(
            st.text(min_size=1),                          # Product name
            st.integers(min_value=1, max_value=10),       # Quantity
            st.floats(min_value=0.01, max_value=1000.00)  # Price
        ),
        max_size=50  # Between 0 and 50 line items
    ))
    return {'items': items}
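As a rough mental model of what composition means, a generator can be treated as a function from a random source to a value, and combinators as functions that build larger generators from smaller ones. The sketch below uses only the standard library; `integers`, `floats`, `tuples`, and `lists` are hypothetical helpers written for this illustration and are not the Hypothesis strategies of the same name.

```python
import random

def integers(lo, hi):
    """Generator of integers in [lo, hi]."""
    return lambda rng: rng.randint(lo, hi)

def floats(lo, hi):
    """Generator of floats between lo and hi."""
    return lambda rng: rng.uniform(lo, hi)

def tuples(*gens):
    """Combine several generators into a generator of fixed-size tuples."""
    return lambda rng: tuple(g(rng) for g in gens)

def lists(gen, max_size):
    """Generator of lists with 0..max_size elements drawn from gen."""
    return lambda rng: [gen(rng) for _ in range(rng.randint(0, max_size))]

# Compose small generators into a cart of (quantity, price) line items.
line_item = tuples(integers(1, 10), floats(0.01, 1000.0))
cart = lists(line_item, max_size=50)

sample_cart = cart(random.Random(42))
```

Every element of `sample_cart` respects the constraints of the generators it was built from, which is the main benefit of composing generators rather than filtering raw random data afterwards.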
Shrinking: Minimal Failing Cases
When a property fails, shrinking automatically reduces the input to the smallest case that still fails.
Example: Shrinking in Action
@given(st.lists(st.integers()))
def test_no_duplicates(lst):
    # This property is false - lists CAN have duplicates
    assert len(lst) == len(set(lst))
Initial failure: [0, -1, 3, 0, -5, 2]
After shrinking: [0, 0]
The framework automatically reduces the failing case from a 6-element list to the minimal 2-element duplicate.
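The reduction above can be imitated with a simple greedy loop. This is only a sketch of the general idea, not how Hypothesis or any other framework implements shrinking; `shrink_candidates` and `shrink` are hypothetical names, and the two shrinking moves (drop an element, halve a value toward zero) are assumptions chosen for brevity.

```python
def has_no_duplicates(xs):
    """The (false) property from the example above."""
    return len(xs) == len(set(xs))

def shrink_candidates(xs):
    """Yield simpler variants of xs: drop one element, then halve one value toward 0."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        if x != 0:
            smaller = x // 2 if x > 0 else -((-x) // 2)
            yield xs[:i] + [smaller] + xs[i + 1:]

def shrink(xs, prop):
    """Greedily move to the first simpler variant that still fails prop, until none does."""
    current = xs
    improved = True
    while improved:
        improved = False
        for candidate in shrink_candidates(current):
            if not prop(candidate):
                current = candidate
                improved = True
                break
    return current

minimal = shrink([0, -1, 3, 0, -5, 2], has_no_duplicates)
```

For this input `minimal` is `[0, 0]`. A greedy shrinker like this can stop at a local minimum (for example, it leaves `[5, 5]` unchanged), which is one reason real frameworks use more thorough strategies.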
Shrinking Strategies
| Framework | Shrinking Approach | Quality |
|---|---|---|
| Hypothesis | Integrated reduction algorithms | Excellent |
| QuickCheck | Type-based shrinking | Excellent |
| fast-check | Custom shrinking per generator | Very Good |
| JSVerify | Basic shrinking | Good |
Common Property Patterns
Inverse Functions
Functions that undo each other should round-trip perfectly.
import base64

@given(st.text())
def test_base64_roundtrip(s):
    encoded = base64.b64encode(s.encode('utf-8'))
    decoded = base64.b64decode(encoded).decode('utf-8')
    assert s == decoded
Idempotence
Applying an operation multiple times has the same effect as applying it once.
@given(st.lists(st.integers()))
def test_sort_idempotent(lst):
    # Sorting twice equals sorting once
    assert sorted(sorted(lst)) == sorted(lst)

@given(st.sets(st.integers()))
def test_set_idempotent(s):
    # Converting to set twice equals once
    assert set(set(s)) == set(s)
Invariants
Certain properties remain true across transformations.
@given(st.lists(st.integers()))
def test_filter_preserves_order(lst):
    filtered = [x for x in lst if x > 0]
    # Every kept element satisfies the predicate
    assert all(x > 0 for x in filtered)
    # Kept elements appear in the same relative order as in the original
    # (`x in remaining` advances the iterator, so this is a subsequence check)
    remaining = iter(lst)
    assert all(x in remaining for x in filtered)

@given(st.dictionaries(st.text(), st.integers()))
def test_dict_keys_values_match(d):
    # Keys and values maintain correspondence
    assert len(d.keys()) == len(d.values())
    for key in d.keys():
        assert key in d
Oracle Comparison
Compare an implementation against a trusted reference, often one that is simpler but slower.
def quicksort(lst):
    # Implementation under test
    if len(lst) <= 1:
        return lst
    pivot = lst[0]
    left = [x for x in lst[1:] if x < pivot]
    right = [x for x in lst[1:] if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

@given(st.lists(st.integers()))
def test_quicksort_matches_builtin(lst):
    # Compare against Python's built-in sort, used here as the oracle
    assert quicksort(lst) == sorted(lst)
Metamorphic Relations
Relate the outputs for different inputs without knowing the exact expected output.
@given(st.lists(st.integers()), st.integers())
def test_search_after_insert(lst, value):
    # If we insert a value, searching for it must succeed
    lst_with_value = lst + [value]
    assert value in lst_with_value
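Metamorphic relations pay off most when the exact expected output is hard to state. The sketch below checks two such relations for a hypothetical `search` function, using a plain standard-library loop in place of a PBT framework; the function, the two-letter alphabet, and the run counts are all assumptions made for illustration.

```python
import random

def search(records, keyword):
    """Hypothetical function under test: the records that contain keyword."""
    return [r for r in records if keyword in r]

def random_word(rng, max_len=6):
    """Short random word over a tiny alphabet, so matches are common."""
    return "".join(rng.choice("ab") for _ in range(rng.randint(0, max_len)))

def check_metamorphic_relations(runs=300, seed=1):
    rng = random.Random(seed)
    for _ in range(runs):
        records = [random_word(rng) for _ in range(rng.randint(0, 8))]
        keyword = random_word(rng, max_len=3)
        base = search(records, keyword)

        # Relation 1: reordering the input must not change which records match.
        shuffled = records[:]
        rng.shuffle(shuffled)
        assert sorted(search(shuffled, keyword)) == sorted(base)

        # Relation 2: a longer keyword can only narrow the results.
        narrower = search(records, keyword + rng.choice("ab"))
        assert all(r in base for r in narrower)
    return True
```

Neither relation requires knowing what `search` should return for a given input; each one only relates the results of two runs.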
Stateful Property Testing
Test stateful systems by generating sequences of operations.
from hypothesis.stateful import RuleBasedStateMachine, rule, invariant
import hypothesis.strategies as st

class BankAccountMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.balance = 0

    @rule(amount=st.integers(min_value=1, max_value=1000))
    def deposit(self, amount):
        self.balance += amount

    @rule(amount=st.integers(min_value=1, max_value=1000))
    def withdraw(self, amount):
        if amount <= self.balance:
            self.balance -= amount

    @invariant()
    def balance_never_negative(self):
        assert self.balance >= 0

# Run stateful tests
TestBankAccount = BankAccountMachine.TestCase
Hypothesis Stateful Testing Example
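from collections import deque
from hypothesis.stateful import RuleBasedStateMachine, rule, invariant
import hypothesis.strategies as st

class QueueMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.queue = deque()  # Stand-in for the queue under test
        self.model = []       # Simple reference model

    @rule(value=st.integers())
    def enqueue(self, value):
        self.queue.append(value)
        self.model.append(value)

    @rule()
    def dequeue(self):
        if self.model:
            # Elements must come out in FIFO order
            assert self.queue.popleft() == self.model.pop(0)

    @invariant()
    def contents_match_model(self):
        assert list(self.queue) == self.model

TestQueue = QueueMachine.TestCase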
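Under the hood, stateful testing boils down to running random sequences of operations against the real implementation and a simple model, and checking that they stay in agreement. The following standard-library sketch shows that loop for a hypothetical fixed-capacity `RingQueue`; the class and `run_random_sequences` are assumptions for illustration, and a real framework would additionally shrink a failing sequence.

```python
import random

class RingQueue:
    """Hypothetical system under test: fixed-capacity FIFO queue on a circular buffer."""
    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._head = 0
        self._size = 0

    def enqueue(self, value):
        if self._size == len(self._buf):
            return False  # queue is full, value rejected
        self._buf[(self._head + self._size) % len(self._buf)] = value
        self._size += 1
        return True

    def dequeue(self):
        if self._size == 0:
            return None  # queue is empty
        value = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        self._size -= 1
        return value

    def __len__(self):
        return self._size

def run_random_sequences(runs=200, steps=50, capacity=4, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        queue, model = RingQueue(capacity), []  # model is a plain list
        for _ in range(steps):
            if rng.random() < 0.5:
                value = rng.randint(0, 99)
                accepted = queue.enqueue(value)
                assert accepted == (len(model) < capacity)
                if accepted:
                    model.append(value)
            else:
                expected = model.pop(0) if model else None
                assert queue.dequeue() == expected
            assert len(queue) == len(model)  # invariant checked after every step
    return True
```

The small capacity is deliberate: it makes the full and wrap-around cases occur often within short sequences.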
Assumptions and Preconditions
Use assume() to filter generated inputs to valid scenarios.
from hypothesis import given, assume
import hypothesis.strategies as st

@given(st.integers(), st.integers())
def test_division(a, b):
    assume(b != 0)  # Skip cases where b is zero
    quotient, remainder = divmod(a, b)
    # Integer division identity; a / b would lose precision for large integers
    assert quotient * b + remainder == a
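Warning: Overuse of assume() can make tests inefficient by discarding many generated inputs, and Hypothesis reports a health-check failure when too many are rejected. Where possible, constrain the strategy so that only valid inputs are generated.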
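The cost of filtering can be seen without any framework. The sketch below, using only the standard library, produces ordered pairs `(lo, hi)` with `lo < hi` in two ways: by rejecting invalid draws, which is what an assumption does, and by constructing valid pairs directly. Both function names are hypothetical.

```python
import random

def ordered_pair_by_rejection(rng, max_tries=1000):
    """Draw (lo, hi) and discard draws where lo >= hi; return the pair and the discard count."""
    discarded = 0
    for _ in range(max_tries):
        lo, hi = rng.randint(0, 100), rng.randint(0, 100)
        if lo < hi:
            return (lo, hi), discarded
        discarded += 1
    raise RuntimeError("too many rejected draws")

def ordered_pair_by_construction(rng):
    """Build a valid pair directly: pick lo, then pick hi above it. Nothing is discarded."""
    lo = rng.randint(0, 99)
    hi = rng.randint(lo + 1, 100)
    return lo, hi

rng = random.Random(7)
rejected = sum(ordered_pair_by_rejection(rng)[1] for _ in range(1000))
constructed = [ordered_pair_by_construction(rng) for _ in range(1000)]
```

Roughly half of the raw draws violate `lo < hi`, so `rejected` ends up on the order of the number of pairs requested, while the constructive version wastes nothing. The two approaches do not sample pairs with the same distribution, which is usually acceptable for testing.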
Benefits of Property-Based Testing
Discovers Unexpected Edge Cases
PBT finds bugs developers don’t anticipate.
Case Study: Hypothesis discovered a Unicode handling bug in a production JSON parser that had 95% line coverage from example-based tests.
Serves as Executable Specification
Properties document system behavior more comprehensively than examples.
Reduces Test Maintenance
Properties remain valid as implementation details change.
ROI Example: 50% reduction in test updates during refactoring compared to example-based tests.
Complements Example-Based Tests
Use PBT for complex logic; use examples for specific regression tests and readability.
Challenges and Limitations
Writing Good Properties
Identifying meaningful properties requires practice and deep understanding.
Mitigation: Start with simple properties (round-trip, idempotence); add complex invariants gradually.
Performance
Generating and running hundreds of test cases takes longer than example-based tests.
Mitigation: Configure example counts; use CI for comprehensive runs, fewer examples locally.
from hypothesis import given, settings
import hypothesis.strategies as st

@settings(max_examples=1000)  # Default is 100
@given(st.lists(st.integers()))
def test_with_more_examples(lst):
    assert len(lst) >= 0
Non-Deterministic Systems
Systems with external dependencies or randomness are harder to test with properties.
Mitigation: Use stateful testing with mocked dependencies; test at integration boundaries.
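One way to apply this mitigation to code that uses randomness is to inject the random source, so that the test controls it and any failure can be replayed. The sketch below uses only the standard library; `pick_backoff` is a hypothetical function under test, and the bounds it is checked against are assumptions of this example.

```python
import random

def pick_backoff(attempt, rng):
    """Hypothetical function under test: exponential backoff with jitter, capped at 60 s."""
    base = min(60.0, 2.0 ** attempt)
    return rng.uniform(0.5 * base, base)

def check_backoff_bounds(runs=500, seed=3):
    driver = random.Random(seed)  # drives input generation
    for _ in range(runs):
        attempt = driver.randint(0, 20)
        # Give the code under test its own seeded source so a failure replays exactly.
        injected = random.Random(driver.getrandbits(32))
        delay = pick_backoff(attempt, injected)
        assert 0.5 <= delay <= 60.0
        assert delay <= 2.0 ** attempt
    return True
```

Because the randomness is passed in rather than taken from a global source, the property is about bounds that hold for every seed, and the same seed always reproduces the same delay.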
Best Practices
Start with Simple Properties
Begin with universally applicable properties.
# Simple: Length preservation
@given(st.lists(st.integers()))
def test_map_preserves_length(lst):
    assert len(list(map(lambda x: x * 2, lst))) == len(lst)
Combine Multiple Properties
Test several properties in one test for comprehensive coverage.
from collections import Counter

@given(st.lists(st.integers()))
def test_sorting_properties(lst):
    sorted_lst = sorted(lst)
    # Property 1: Length preserved
    assert len(sorted_lst) == len(lst)
    # Property 2: Same elements with the same multiplicities
    # (comparing sets would miss dropped or added duplicates)
    assert Counter(sorted_lst) == Counter(lst)
    # Property 3: Ordered correctly
    for i in range(len(sorted_lst) - 1):
        assert sorted_lst[i] <= sorted_lst[i + 1]
Use Example Decorators for Regressions
Pin specific failing cases as examples while keeping property tests.
from hypothesis import given, example
import hypothesis.strategies as st

@given(st.lists(st.integers()))
@example([])         # Ensure empty list is always tested
@example([1, 2, 3])  # Pin specific regression case
def test_reverse(lst):
    assert reverse(reverse(lst)) == lst
Configure Timeouts Appropriately
from hypothesis import given, settings
import hypothesis.strategies as st

@settings(deadline=500)  # 500ms per test case
@given(st.lists(st.integers(), max_size=10000))
def test_large_lists(lst):
    process(lst)  # process() stands in for the code under test
Real-World Applications
JSON Parsing Library
JSON library uses PBT to verify round-trip serialization for all data types.
Results: Discovered 3 edge cases in Unicode handling and floating-point precision.
Sorting Algorithm Validation
Comparison-based sort tested against built-in sort as oracle.
Results: No mismatches against the oracle across 10,000 generated inputs per test run. This builds strong confidence, although random testing cannot prove correctness.
API Request Validation
REST API validator tested with generated payloads matching schema constraints.
Results: Found 5 edge cases in nested object validation that manual tests missed.
Conclusion
Property-Based Testing shifts testing focus from individual examples to universal truths about system behavior. While requiring a different mindset than traditional example-based testing, PBT provides superior edge case discovery and serves as living documentation of system invariants.
Success with PBT comes from starting simple, identifying meaningful properties incrementally, and combining property-based tests with carefully chosen example-based tests for regression coverage and readability.