TL;DR
- Use Batfish for pre-deployment network analysis—validate routing, ACLs, and reachability before any device is touched
- Test firewall rules as code: define expected allow/deny rules, then verify with InSpec or custom assertions
- Implement network topology tests that verify connectivity paths, not just individual resource existence
Best for: Teams managing cloud VPCs, multi-cloud networks, or complex on-premises configurations
Skip if: Your network is a single VPC with default routing (use your cloud provider’s built-in tools instead)
Read time: 14 minutes
Network configuration errors cause more outages than most teams realize. A misconfigured security group, an incorrect route table entry, or a missing firewall rule can take down production, and tests that only run after deployment catch these issues too late. This guide covers proactive network testing that validates configurations before deployment.
For broader infrastructure testing context, see Infrastructure as Code Testing and Terraform Testing Strategies.
AI-Assisted Approaches
AI assistants are useful for drafting network validation tests from requirements and for troubleshooting complex routing issues.
Generating Batfish queries from requirements:
I need to validate these network requirements using Batfish:
1. Production subnet (10.0.1.0/24) can reach database subnet (10.0.2.0/24) on port 5432
2. No subnet can reach 0.0.0.0/0 except through the NAT gateway
3. Load balancer subnet must be reachable from internet on ports 80 and 443
4. Management subnet (10.0.10.0/24) must NOT be reachable from any other subnet
Generate pybatfish Python code with proper assertions and clear failure messages.
Include snapshot creation and network initialization.
Debugging routing issues:
My VPC has unexpected connectivity. Configuration:
- VPC CIDR: 10.0.0.0/16
- Public subnet: 10.0.1.0/24 with IGW route
- Private subnet: 10.0.2.0/24 with NAT route
- Database subnet: 10.0.3.0/24 (should be isolated)
Problem: Database subnet can reach the internet, but shouldn't.
Route tables attached are: [paste route table configs]
Use traceroute analysis to identify where the unwanted path exists.
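The route-table reasoning this prompt asks for can also be sketched offline. The following is a minimal illustration of longest-prefix-match route selection using only the Python standard library. The route entries and target names (`local`, `nat-gateway`) are hypothetical stand-ins for the pasted route table, not output from any AWS API.

```python
import ipaddress


def resolve_route(routes, destination):
    """Return the target of the most specific route matching destination, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        # Longest prefix wins: keep the matching route with the largest prefix length.
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, target)
    return best[1] if best else None


# Hypothetical route table attached to the database subnet.
database_routes = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "nat-gateway"),  # a default route like this would explain the leak
]

print(resolve_route(database_routes, "10.0.2.15"))  # local
print(resolve_route(database_routes, "8.8.8.8"))    # nat-gateway
```

If the lookup for an internet address returns anything other than `None`, the subnet's route table still contains a default route and the subnet is not isolated.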
Creating InSpec tests for firewall rules:
Write InSpec controls to verify these AWS security group rules:
1. Web tier SG allows inbound 443 from ALB SG only
2. App tier SG allows inbound 8080 from web tier SG only
3. DB tier SG allows inbound 5432 from app tier SG only
4. No security group allows inbound 22 from 0.0.0.0/0
Include resource lookup by tag and proper skip conditions
for missing resources.
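As a point of comparison for whatever the assistant generates, rule 4 can be expressed as a small custom assertion. This sketch assumes security groups are available as dictionaries shaped like the `SecurityGroups` entries returned by the EC2 `DescribeSecurityGroups` API (`GroupId`, `IpPermissions`, `FromPort`, `ToPort`, `IpRanges`). The group IDs and sample data are made up for illustration.

```python
def open_ingress_violations(security_groups, port, cidr="0.0.0.0/0"):
    """Return IDs of groups that allow inbound traffic on port from cidr."""
    violations = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            # Protocol "-1" means all traffic; such rules carry no port range.
            all_traffic = rule.get("IpProtocol") == "-1"
            from_port = rule.get("FromPort")
            to_port = rule.get("ToPort")
            in_range = (
                from_port is not None
                and to_port is not None
                and from_port <= port <= to_port
            )
            open_to_cidr = any(
                r.get("CidrIp") == cidr for r in rule.get("IpRanges", [])
            )
            if (all_traffic or in_range) and open_to_cidr:
                violations.append(group["GroupId"])
                break  # one offending rule is enough to flag the group
    return violations


# Hypothetical sample data in the DescribeSecurityGroups shape.
sample_groups = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-bastion", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-legacy", "IpPermissions": [
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
]

print(open_ingress_violations(sample_groups, 22))  # ['sg-bastion', 'sg-legacy']
```

Note that the all-traffic rule on `sg-legacy` is flagged even though it never mentions port 22, a case that simple port-equality checks miss.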
When to Use Different Testing Approaches
Testing Strategy Decision Framework
| Test Type | Tool | When to Run | What It Catches |
|---|---|---|---|
| Configuration analysis | Batfish | Before deployment | Routing loops, unreachable hosts, ACL conflicts |
| Compliance checks | InSpec/Checkov | CI/CD pipeline | Policy violations, missing encryption |
| Integration tests | Terratest | After deployment | Actual connectivity, DNS resolution |
| Connectivity probes | Smoke tests | Post-deployment | Real traffic flow validation |
| Change impact analysis | Batfish differential | Before changes | Unintended side effects |
Use Batfish When
- Pre-deployment validation is critical: Catch issues before touching production
- Multi-vendor environments: Analyzing Cisco, Juniper, AWS, Azure configs together
- Complex routing: BGP, OSPF, or VRF configurations need validation
- Compliance requires proof: Generate reachability reports for auditors
Use Terratest/InSpec When
- Cloud-native infrastructure: AWS/GCP/Azure VPCs with simple routing
- Security group verification: Confirm actual firewall state matches expected
- Integration testing: Verify resources work together after deployment
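For the post-deployment connectivity probes in the table above, a smoke test can be as small as a TCP connect attempt. The sketch below uses only the standard library. The `check_endpoints` helper and the endpoint tuples it takes are illustrative names, not part of any tool mentioned in this guide.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Compare expected reachability with observed reachability.

    endpoints is an iterable of (host, port, should_be_open) tuples.
    Returns the list of tuples whose observed state differs from the expectation.
    """
    return [
        (host, port, expected)
        for host, port, expected in endpoints
        if tcp_port_open(host, port, timeout) != expected
    ]
```

Running `check_endpoints` from a host inside each subnet with both positive and negative expectations (ports that must be closed) covers the "deny" side of the rules as well as the "allow" side.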
Batfish for Network Analysis
Setting Up Batfish
# Run Batfish server in Docker
docker run -d -p 9997:9997 -p 9996:9996 batfish/batfish
# Install pybatfish client
pip install pybatfish
Basic Network Snapshot Analysis
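from pybatfish.client.commands import bf_init_snapshot, bf_set_network
from pybatfish.question import bfq
from pybatfish.question.question import load_questions

# Note: these are pybatfish's legacy module-level helpers; newer releases
# expose the same operations through pybatfish.client.session.Session.

# Initialize: load question templates, select the network, upload the configs
load_questions()
bf_set_network("my-network")
# overwrite=True lets the script be re-run against the same snapshot name
bf_init_snapshot("./configs", name="current-config", overwrite=True)

# Analyze configuration issues (parse warnings, conversion errors)
issues = bfq.initIssues().answer().frame()
print(issues[["Nodes", "Type", "Details"]])

# Check for undefined references
undefined = bfq.undefinedReferences().answer().frame()
assert len(undefined) == 0, f"Found undefined references: {undefined}"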
Reachability Testing
from pybatfish.datamodel.flow import HeaderConstraints, PathConstraints
from pybatfish.question import bfq

# Assumes a snapshot has already been initialized, as in the previous section.


def test_web_to_database_connectivity():
    """Verify web tier can reach database on PostgreSQL port."""
    result = bfq.reachability(
        pathConstraints=PathConstraints(
            startLocation="web-server",
            endLocation="database-server"
        ),
        headers=HeaderConstraints(
            dstPorts="5432",
            ipProtocols=["TCP"]
        ),
        actions="SUCCESS"
    ).answer().frame()

    # At least one successful flow must exist
    assert len(result) > 0, "No path found from web to database on port 5432"


def test_database_internet_isolation():
    """Verify database cannot reach internet."""
    result = bfq.reachability(
        pathConstraints=PathConstraints(
            startLocation="database-server",
            endLocation="internet"
        ),
        headers=HeaderConstraints(
            dstIps="0.0.0.0/0"
        ),
        actions="SUCCESS"
    ).answer().frame()

    # Any successful flow here is a violation
    assert len(result) == 0, "Database has unintended internet access"
Differential Analysis for Changes
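from pybatfish.client.commands import bf_init_snapshot
from pybatfish.datamodel.flow import PathConstraints
from pybatfish.question import bfq


def test_route_change_impact():
    """Analyze impact of route table changes before applying."""
    # Load current and proposed configs
    bf_init_snapshot("./configs/current", name="current", overwrite=True)
    bf_init_snapshot("./configs/proposed", name="proposed", overwrite=True)

    # Compare reachability: the answer lists flows whose outcome differs
    # between the two snapshots
    diff = bfq.differentialReachability(
        pathConstraints=PathConstraints(
            startLocation="/production.*/",
            endLocation="/database.*/"
        )
    ).answer(
        snapshot="proposed",
        reference_snapshot="current"
    ).frame()

    # Dispositions that Batfish treats as successful delivery
    success = {"ACCEPTED", "DELIVERED_TO_SUBNET", "EXITS_NETWORK"}

    def succeeds(traces):
        return any(trace.disposition in success for trace in traces)

    # Fail if a flow that worked in "current" no longer works in "proposed"
    broken = [
        row["Flow"] for _, row in diff.iterrows()
        if succeeds(row["Reference_Traces"]) and not succeeds(row["Snapshot_Traces"])
    ]
    assert len(broken) == 0, f"Proposed change breaks paths: {broken}"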
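Independent of Batfish's API, the idea behind differential analysis is a set comparison between the flows that succeed before and after a change. The following is a minimal sketch of that idea in plain Python. The flow tuples and node names are hypothetical; in practice they would be collected from reachability answers for each snapshot.

```python
def diff_reachability(current, proposed):
    """Compare two sets of successful (source, destination, port) flows."""
    return {
        "broken": sorted(current - proposed),  # worked before, fails after
        "added": sorted(proposed - current),   # newly permitted by the change
    }


# Hypothetical flow sets for the two snapshots.
current_flows = {
    ("production-web", "database-primary", 5432),
    ("production-api", "database-primary", 5432),
}
proposed_flows = {
    ("production-web", "database-primary", 5432),
    ("production-api", "database-primary", 22),
}

report = diff_reachability(current_flows, proposed_flows)
print(report["broken"])  # [('production-api', 'database-primary', 5432)]
print(report["added"])   # [('production-api', 'database-primary', 22)]
```

Reviewing the `added` set matters as much as the `broken` set: a change that silently opens a new path is a security regression even when nothing stops working.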
Terraform VPC Testing
Testing VPC Configuration with Terratest
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestVPCConfiguration(t *testing.T) {
    t.Parallel()

    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../modules/vpc",
        Vars: map[string]interface{}{
            "vpc_cidr":    "10.0.0.0/16",
            "environment": "test",
            "enable_nat":  true,
        },
    })

    // Tear down the test VPC even if an assertion fails
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Verify VPC created with correct CIDR
    vpcID := terraform.Output(t, terraformOptions, "vpc_id")
    vpc := aws.GetVpcById(t, vpcID, "us-east-1")
    assert.Equal(t, "10.0.0.0/16", *vpc.CidrBlock)

    // Verify DNS settings
    // NOTE: confirm these two helpers exist in your Terratest version; if they
    // do not, query DescribeVpcAttribute through the AWS SDK instead.
    assert.True(t, aws.IsVpcDnsEnabled(t, vpcID, "us-east-1"))
    assert.True(t, aws.IsVpcDnsHostnamesEnabled(t, vpcID, "us-east-1"))
}
Testing Subnet Routing
func TestSubnetRouting(t *testing.T) {
    t.Parallel()

    // After terraform apply... (terraformOptions is built as in
    // TestVPCConfiguration; this fragment also needs the "strings" import)
    publicSubnetID := terraform.Output(t, terraformOptions, "public_subnet_id")
    privateSubnetID := terraform.Output(t, terraformOptions, "private_subnet_id")

    // Verify public subnet has internet gateway route
    // NOTE: verify GetRouteTableForSubnet against your Terratest version;
    // aws.IsPublicSubnet performs the IGW check directly.
    publicRoutes := aws.GetRouteTableForSubnet(t, publicSubnetID, "us-east-1")
    hasIGWRoute := false
    for _, route := range publicRoutes.Routes {
        if route.GatewayId != nil && strings.HasPrefix(*route.GatewayId, "igw-") {
            hasIGWRoute = true
            break
        }
    }
    assert.True(t, hasIGWRoute, "Public subnet missing IGW route")

    // Verify private subnet has NAT gateway route
    privateRoutes := aws.GetRouteTableForSubnet(t, privateSubnetID, "us-east-1")
    hasNATRoute := false
    for _, route := range privateRoutes.Routes {
        if route.NatGatewayId != nil {
            hasNATRoute = true
            break
        }
    }
    assert.True(t, hasNATRoute, "Private subnet missing NAT route")
}
Security Group Testing
InSpec Controls for AWS Security Groups
# controls/security_groups.rb

control 'web-tier-sg' do
  impact 1.0
  title 'Web tier security group allows only expected traffic'

  web_sg = aws_security_group(group_name: 'web-tier-sg')

  describe web_sg do
    it { should exist }
    it { should allow_in(port: 443, ipv4_range: '0.0.0.0/0') }
    it { should allow_in(port: 80, ipv4_range: '0.0.0.0/0') }
    it { should_not allow_in(port: 22, ipv4_range: '0.0.0.0/0') }
    it { should_not allow_in(port: 3389, ipv4_range: '0.0.0.0/0') }
  end
end

control 'database-tier-sg' do
  impact 1.0
  title 'Database tier only accessible from app tier'

  db_sg = aws_security_group(group_name: 'database-tier-sg')
  app_sg = aws_security_group(group_name: 'app-tier-sg')

  describe db_sg do
    it { should exist }
    it { should allow_in_only(port: 5432, security_group: app_sg.group_id) }
    it { should_not allow_in(ipv4_range: '0.0.0.0/0') }
  end
end

control 'no-wide-open-egress' do
  impact 0.7
  title 'Security groups should not allow all egress'

  aws_security_groups.group_ids.each do |sg_id|
    describe aws_security_group(group_id: sg_id) do
      it { should_not allow_out(ipv4_range: '0.0.0.0/0', port: '0-65535') }
    end
  end
end
Terraform Security Group Tests
func TestSecurityGroupRules(t *testing.T) {
    // After terraform apply... (terraformOptions is built as in
    // TestVPCConfiguration)
    webSGID := terraform.Output(t, terraformOptions, "web_sg_id")
    dbSGID := terraform.Output(t, terraformOptions, "db_sg_id")

    // Verify web SG allows HTTPS from anywhere
    // NOTE: verify GetSecurityGroup against your Terratest version; the rule
    // fields below follow the AWS SDK's IpPermission shape.
    webSG := aws.GetSecurityGroup(t, webSGID, "us-east-1")
    httpsAllowed := false
    for _, rule := range webSG.IpPermissions {
        // FromPort/ToPort are nil on all-traffic rules, so guard before dereferencing
        if rule.FromPort == nil || rule.ToPort == nil {
            continue
        }
        if *rule.FromPort == 443 && *rule.ToPort == 443 {
            for _, ip := range rule.IpRanges {
                if ip.CidrIp != nil && *ip.CidrIp == "0.0.0.0/0" {
                    httpsAllowed = true
                }
            }
        }
    }
    assert.True(t, httpsAllowed, "Web SG should allow HTTPS from internet")

    // Verify DB SG has no rule open to the whole internet
    dbSG := aws.GetSecurityGroup(t, dbSGID, "us-east-1")
    for _, rule := range dbSG.IpPermissions {
        // Should not have any 0.0.0.0/0 rules
        for _, ip := range rule.IpRanges {
            if ip.CidrIp == nil {
                continue
            }
            assert.NotEqual(t, "0.0.0.0/0", *ip.CidrIp,
                "Database SG should not allow public access")
        }
    }
}
CI/CD Pipeline Integration
GitHub Actions Workflow
name: Network Configuration Tests
on:
  pull_request:
    paths:
      - 'terraform/networking/**'
      - 'configs/network/**'
jobs:
  batfish-analysis:
    runs-on: ubuntu-latest
    services:
      batfish:
        image: batfish/batfish
        ports:
          - 9997:9997
          - 9996:9996
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install pybatfish
        run: pip install pybatfish pytest
      - name: Run Batfish analysis
        run: |
          pytest tests/network/ -v --tb=short
        env:
          BATFISH_HOST: localhost
  terraform-tests:
    runs-on: ubuntu-latest
    needs: batfish-analysis
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - name: Terraform Validate
        # validate needs an initialized directory; the backend is not required here
        run: |
          terraform init -backend=false
          terraform validate
        working-directory: terraform/networking/
      - name: Run Checkov
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: terraform/networking/
          framework: terraform
          check: CKV_AWS_23,CKV_AWS_24,CKV_AWS_25
  inspec-compliance:
    runs-on: ubuntu-latest
    needs: terraform-tests
    if: github.event.pull_request.draft == false
    steps:
      - uses: actions/checkout@v4
      - name: Setup InSpec
        run: |
          curl https://omnitruck.chef.io/install.sh | sudo bash -s -- -P inspec
      - name: Run InSpec tests
        run: |
          inspec exec compliance/network \
            -t aws:// \
            --reporter cli json:inspec-results.json
        env:
          # Accept the Chef license non-interactively so the run does not stall in CI
          CHEF_LICENSE: accept-silent
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_REGION: us-east-1
Measuring Success
| Metric | Before Testing | After Testing | How to Track |
|---|---|---|---|
| Network-related outages | 3-4/month | <1/quarter | Incident reports |
| Security group misconfigs | Found in audits | Caught in CI | Pipeline metrics |
| Change rollbacks | 20% of changes | <5% of changes | Deployment logs |
| Mean time to detect | Days | Minutes | Alert timestamps |
Warning signs your network testing isn’t working:
- Still finding misconfigurations in production
- Tests pass but connectivity fails after deployment
- Batfish and reality diverge (outdated snapshots)
- Security groups have unexplained rules
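One way to catch the outdated-snapshot problem is to fingerprint the configuration directory when a snapshot is created and compare it before trusting later results. This is a minimal sketch using only the standard library. The function names and the idea of storing the fingerprint alongside the snapshot are assumptions for illustration, not a Batfish feature.

```python
import hashlib
from pathlib import Path


def config_fingerprint(config_dir):
    """Hash every file under config_dir (relative path and content) into one digest."""
    digest = hashlib.sha256()
    root = Path(config_dir)
    # Sort so the digest does not depend on filesystem iteration order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def snapshot_is_stale(config_dir, recorded_fingerprint):
    """Return True if the configs changed since the fingerprint was recorded."""
    return config_fingerprint(config_dir) != recorded_fingerprint
```

A pipeline could record `config_fingerprint("./configs")` when it initializes a snapshot and fail fast when `snapshot_is_stale` returns `True`, which forces a fresh snapshot before any reachability assertions run.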
Conclusion
Effective network configuration testing requires multiple layers:
- Pre-deployment analysis with Batfish catches routing and ACL issues before they reach devices
- Compliance testing with InSpec/Checkov validates security policies
- Integration testing with Terratest confirms actual connectivity
- Continuous monitoring detects drift from desired state
The key insight: test the network as a system, not just individual components. A security group might be configured correctly in isolation but create problems in combination with route tables and NACLs.
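To make the system-level point concrete, here is a toy model in which a flow must pass a route table, a security group, and a network ACL in turn. All CIDRs, layer rules, and function names are hypothetical; the sketch only illustrates that a flow permitted by one layer can still be dropped by another.

```python
import ipaddress


def in_cidr(ip, cidr):
    """Return True if ip falls inside cidr."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# Hypothetical per-layer checks for an app-to-database flow.
def route_table(flow):
    return in_cidr(flow["dst"], "10.0.0.0/16")  # local VPC route covers the target


def security_group(flow):
    return flow["port"] == 5432 and in_cidr(flow["src"], "10.0.1.0/24")


def network_acl(flow):
    # An overly narrow NACL that admits only the lower half of the app subnet.
    return in_cidr(flow["src"], "10.0.1.0/25")


LAYERS = [
    ("route table", route_table),
    ("security group", security_group),
    ("network ACL", network_acl),
]


def first_blocking_layer(flow, layers=LAYERS):
    """Return the name of the first layer that rejects the flow, or None if all allow it."""
    for name, check in layers:
        if not check(flow):
            return name
    return None


flow = {"src": "10.0.1.200", "dst": "10.0.2.10", "port": 5432}
print(security_group(flow))        # True: the security group alone looks correct
print(first_blocking_layer(flow))  # network ACL
```

A test that inspects only the security group would pass here, while an end-to-end check reports the NACL as the layer that drops the traffic.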
See Also
- Infrastructure as Code Testing - Foundational IaC testing concepts
- Terraform Testing Strategies - Complete Terraform testing pyramid
- AWS Infrastructure Testing with LocalStack - Local AWS testing
- Security Group Testing - Deep dive into firewall rule validation
- Kubernetes Testing Strategies - Network policies in K8s
