The Imperative of Infrastructure as Code Testing

Infrastructure as Code (IaC) has revolutionized how we provision and manage infrastructure, but it also concentrates risk. Without proper testing, IaC can propagate misconfigurations at scale, leading to security vulnerabilities, compliance violations, and catastrophic outages. Testing IaC is no longer optional; it is a critical requirement for maintaining reliable, secure, and compliant infrastructure.

The complexity of modern cloud environments demands sophisticated testing strategies that go beyond simple syntax validation. We need to verify security configurations, ensure compliance with organizational policies, validate cross-service integrations, and confirm that our infrastructure behaves correctly under various conditions. This comprehensive approach to IaC testing prevents costly mistakes and accelerates delivery while maintaining quality.

Testing Pyramid for Infrastructure as Code

Static Analysis and Linting

The foundation of IaC testing begins with static analysis. These tests run quickly and catch common issues early in the development cycle:

# terraform/modules/web-server/variables.tf
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"

  validation {
    condition = contains([
      "t3.micro",
      "t3.small",
      "t3.medium",
      "t3.large"
    ], var.instance_type)
    error_message = "Instance type must be one of the approved sizes."
  }
}

variable "subnet_ids" {
  description = "List of subnet IDs for deployment"
  type        = list(string)

  validation {
    condition = length(var.subnet_ids) >= 2
    error_message = "At least 2 subnets required for high availability."
  }
}

variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    condition = can(regex("^(dev|staging|prod)$", var.environment))
    error_message = "Environment must be dev, staging, or prod."
  }
}

Terraform Validation Pipeline

# .github/workflows/terraform-validation.yml
name: Terraform Validation Pipeline

on:
  pull_request:
    paths:
      - 'terraform/**'
      - '.github/workflows/terraform-validation.yml'

jobs:
  validate:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.5.0

      - name: Terraform Format Check
        run: |
          terraform fmt -check -recursive terraform/

      - name: TFLint - Terraform Linter
        run: |
          curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash

          # Configure TFLint
          cat > .tflint.hcl <<EOF
          config {
            module = true
            force = false
          }

          plugin "aws" {
            enabled = true
            version = "0.24.0"
            source  = "github.com/terraform-linters/tflint-ruleset-aws"
          }

          rule "aws_instance_invalid_type" {
            enabled = true
          }

          rule "aws_resource_missing_tags" {
            enabled = true
            tags = ["Environment", "Owner", "CostCenter"]
          }
          EOF

          tflint --init
          tflint --recursive

      - name: Checkov Security Scanning
        run: |
          pip install checkov
          # --soft-fail keeps this step from exiting early so the report can be
          # parsed below; the Python check decides whether the job fails
          checkov -d terraform/ --framework terraform --output json --output-file-path checkov-report.json --soft-fail

          # Parse the report and fail the job if any checks failed
          python3 <<EOF
          import json
          with open('checkov-report.json', 'r') as f:
              report = json.load(f)

          failed_checks = report.get('summary', {}).get('failed', 0)
          if failed_checks > 0:
              print(f"Found {failed_checks} security issues")
              for check in report.get('results', {}).get('failed_checks', []):
                  print(f"  - {check['check_id']}: {check['check_name']}")
              exit(1)
          EOF

      - name: Terraform Validate
        run: |
          for dir in terraform/environments/*/; do
            echo "Validating $dir"
            cd $dir
            terraform init -backend=false
            terraform validate
            cd -
          done

      - name: Generate Documentation
        run: |
          # Install terraform-docs
          curl -sSLo terraform-docs.tar.gz https://github.com/terraform-docs/terraform-docs/releases/download/v0.16.0/terraform-docs-v0.16.0-linux-amd64.tar.gz
          tar -xzf terraform-docs.tar.gz
          chmod +x terraform-docs

          # Generate docs for each module
          for module in terraform/modules/*/; do
            ./terraform-docs markdown table --output-file README.md --output-mode inject $module
          done

Unit Testing with Terratest

Comprehensive Terratest Implementation

// test/terraform_aws_vpc_test.go
package test

import (
    "fmt"
    "testing"
    "time"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/random"
    "github.com/gruntwork-io/terratest/modules/retry"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestTerraformAwsVpc(t *testing.T) {
    t.Parallel()

    // Generate unique identifiers
    uniqueId := random.UniqueId()
    region := "us-east-1"
    vpcCidr := "10.0.0.0/16"

    terraformOptions := &terraform.Options{
        TerraformDir: "../terraform/modules/vpc",

        Vars: map[string]interface{}{
            "vpc_cidr":     vpcCidr,
            "environment":  "test",
            "name":        fmt.Sprintf("test-vpc-%s", uniqueId),
            "region":      region,
            "enable_nat":  true,
            "single_nat":  false,
        },

        EnvVars: map[string]string{
            "AWS_DEFAULT_REGION": region,
        },
    }

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Validate outputs
    vpcId := terraform.Output(t, terraformOptions, "vpc_id")
    assert.NotEmpty(t, vpcId)

    publicSubnetIds := terraform.OutputList(t, terraformOptions, "public_subnet_ids")
    privateSubnetIds := terraform.OutputList(t, terraformOptions, "private_subnet_ids")

    assert.Equal(t, 3, len(publicSubnetIds))
    assert.Equal(t, 3, len(privateSubnetIds))

    // Verify VPC configuration
    vpc := aws.GetVpcById(t, vpcId, region)
    assert.Equal(t, vpcCidr, vpc.CidrBlock)
    assert.True(t, vpc.EnableDnsSupport)
    assert.True(t, vpc.EnableDnsHostnames)

    // Test subnet configurations
    for _, subnetId := range publicSubnetIds {
        subnet := aws.GetSubnetById(t, subnetId, region)
        assert.True(t, subnet.MapPublicIpOnLaunch)
        assert.Contains(t, subnet.AvailabilityZone, region)
    }

    for _, subnetId := range privateSubnetIds {
        subnet := aws.GetSubnetById(t, subnetId, region)
        assert.False(t, subnet.MapPublicIpOnLaunch)

        // Verify NAT gateway connectivity
        routeTable := aws.GetRouteTableForSubnet(t, subnet, region)
        hasNatRoute := false
        for _, route := range routeTable.Routes {
            if route.DestinationCidrBlock == "0.0.0.0/0" && route.NatGatewayId != "" {
                hasNatRoute = true
                break
            }
        }
        assert.True(t, hasNatRoute, "Private subnet should have NAT gateway route")
    }

    // Test network ACLs
    testNetworkAcls(t, vpcId, region)

    // Test security groups
    testSecurityGroups(t, terraformOptions, region)
}

func testNetworkAcls(t *testing.T, vpcId string, region string) {
    nacls := aws.GetNetworkAclsForVpc(t, vpcId, region)

    for _, nacl := range nacls {
        // Verify ingress rules
        for _, rule := range nacl.IngressRules {
            if rule.RuleNumber == 100 {
                assert.Equal(t, "tcp", rule.Protocol)
                assert.Equal(t, "0.0.0.0/0", rule.CidrBlock)
            }
        }

        // Verify egress rules
        for _, rule := range nacl.EgressRules {
            if rule.RuleNumber == 100 {
                assert.Equal(t, "-1", rule.Protocol) // All protocols
                assert.Equal(t, "0.0.0.0/0", rule.CidrBlock)
            }
        }
    }
}
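
// testSecurityGroups is a minimal sketch of the helper referenced above. It
// relies only on module outputs; the output names "default_security_group_id"
// and "web_security_group_id" are assumptions -- adjust them to whatever the
// vpc module actually exposes. Rule-level assertions would need the AWS SDK
// or additional Terratest helpers (which is why region is passed in).
func testSecurityGroups(t *testing.T, terraformOptions *terraform.Options, region string) {
    defaultSgId := terraform.Output(t, terraformOptions, "default_security_group_id")
    assert.NotEmpty(t, defaultSgId)

    webSgId := terraform.Output(t, terraformOptions, "web_security_group_id")
    assert.NotEmpty(t, webSgId)

    // The module should create its own security group rather than rely on the default one
    assert.NotEqual(t, defaultSgId, webSgId)
}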

func TestTerraformAwsEcs(t *testing.T) {
    t.Parallel()

    terraformOptions := &terraform.Options{
        TerraformDir: "../terraform/modules/ecs-cluster",

        Vars: map[string]interface{}{
            "cluster_name":         fmt.Sprintf("test-cluster-%s", random.UniqueId()),
            "capacity_providers":   []string{"FARGATE", "FARGATE_SPOT"},
            "container_insights":   true,
            "enable_execute_command": true,
        },
    }

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Test cluster creation
    clusterArn := terraform.Output(t, terraformOptions, "cluster_arn")
    assert.Contains(t, clusterArn, "cluster/test-cluster")

    // Test service deployment
    deployTestService(t, terraformOptions)
}

func deployTestService(t *testing.T, terraformOptions *terraform.Options) {
    serviceOptions := &terraform.Options{
        TerraformDir: "../terraform/modules/ecs-service",

        Vars: map[string]interface{}{
            "cluster_id":     terraform.Output(t, terraformOptions, "cluster_id"),
            "service_name":   "test-service",
            "task_cpu":       256,
            "task_memory":    512,
            "desired_count":  2,
            "container_port": 8080,
        },
    }

    defer terraform.Destroy(t, serviceOptions)
    terraform.InitAndApply(t, serviceOptions)

    // Wait for service to stabilize
    retry.DoWithRetry(t, "Wait for ECS service", 30, 10*time.Second, func() (string, error) {
        // Placeholder: poll the service (for example with the AWS SDK or
        // Terratest's aws module) and return an error until the running task
        // count matches desired_count.
        return "", nil
    })
}
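
The variable validation rules from the static-analysis section can also be covered by negative tests that expect Terraform to reject bad input. A minimal sketch follows; the test file name is hypothetical, and it assumes the web-server module path and variable names shown earlier, so the run should fail at plan time before any resources are created.

// test/terraform_variables_test.go (hypothetical negative test)
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/require"
)

func TestWebServerRejectsUnapprovedInstanceType(t *testing.T) {
    t.Parallel()

    terraformOptions := &terraform.Options{
        TerraformDir: "../terraform/modules/web-server",

        Vars: map[string]interface{}{
            "instance_type": "m5.24xlarge", // not in the approved list
            "subnet_ids":    []string{"subnet-aaa111", "subnet-bbb222"},
            "environment":   "dev",
        },
    }

    // Variable validation should reject the instance type during planning, so
    // InitAndApplyE is expected to return an error without creating anything.
    _, err := terraform.InitAndApplyE(t, terraformOptions)
    require.Error(t, err)
}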

Ansible Testing with Molecule

Molecule Test Configuration

# molecule/default/molecule.yml
---
dependency:
  name: galaxy

driver:
  name: docker

platforms:
  - name: ubuntu-2204
    image: ubuntu:22.04
    pre_build_image: false
    dockerfile: Dockerfile.j2
    privileged: true
    command: /lib/systemd/systemd
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    tmpfs:
      - /tmp
      - /run
    capabilities:
      - SYS_ADMIN

  - name: centos-8
    image: centos:8
    pre_build_image: false
    dockerfile: Dockerfile.j2
    privileged: true
    command: /usr/sbin/init
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    tmpfs:
      - /tmp
      - /run

provisioner:
  name: ansible
  config_options:
    defaults:
      callback_whitelist: profile_tasks
      fact_caching: jsonfile
      fact_caching_connection: /tmp/ansible_cache
  inventory:
    host_vars:
      ubuntu-2204:
        ansible_python_interpreter: /usr/bin/python3
  lint:
    name: ansible-lint
  playbooks:
    prepare: prepare.yml
    converge: converge.yml
    verify: verify.yml

verifier:
  name: testinfra
  options:
    verbose: true
  lint:
    name: flake8

scenario:
  name: default
  test_sequence:
    - dependency
    - lint
    - cleanup
    - destroy
    - syntax
    - create
    - prepare
    - converge
    - idempotence
    - side_effect
    - verify
    - cleanup
    - destroy

Ansible Playbook Testing

# molecule/default/tests/test_nginx.py
import os
import pytest

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']
).get_hosts('all')


def test_nginx_installed(host):
    """Test that nginx is installed"""
    nginx = host.package('nginx')
    assert nginx.is_installed
    assert nginx.version.startswith('1.')


def test_nginx_service(host):
    """Test nginx service is running and enabled"""
    nginx = host.service('nginx')
    assert nginx.is_running
    assert nginx.is_enabled


def test_nginx_config(host):
    """Test nginx configuration"""
    config = host.file('/etc/nginx/nginx.conf')
    assert config.exists
    assert config.is_file
    assert config.user == 'root'
    assert config.group == 'root'
    assert oct(config.mode) == '0o644'

    # Validate configuration syntax
    cmd = host.run('nginx -t')
    assert cmd.rc == 0
    assert 'syntax is ok' in cmd.stderr
    assert 'test is successful' in cmd.stderr


def test_nginx_ports(host):
    """Test nginx is listening on expected ports"""
    assert host.socket('tcp://0.0.0.0:80').is_listening

    # Test SSL if configured
    if host.file('/etc/nginx/sites-enabled/ssl').exists:
        assert host.socket('tcp://0.0.0.0:443').is_listening


def test_nginx_process(host):
    """Test nginx processes are running"""
    master = host.process.filter(comm='nginx', user='root')
    assert len(master) == 1

    workers = host.process.filter(comm='nginx', user='www-data')
    assert len(workers) >= 1


def test_nginx_log_files(host):
    """Test log files are created with correct permissions"""
    access_log = host.file('/var/log/nginx/access.log')
    error_log = host.file('/var/log/nginx/error.log')

    for log in [access_log, error_log]:
        assert log.exists
        assert log.is_file
        assert log.user == 'www-data'
        assert oct(log.mode) == '0o640'


def test_nginx_security_headers(host):
    """Test security headers are set"""
    response = host.run('curl -I http://localhost')
    assert response.rc == 0

    headers = response.stdout
    assert 'X-Frame-Options: SAMEORIGIN' in headers
    assert 'X-Content-Type-Options: nosniff' in headers
    assert 'X-XSS-Protection: 1; mode=block' in headers


@pytest.mark.parametrize('site', [
    'default',
    'app.example.com'
])
def test_nginx_sites(host, site):
    """Test nginx site configurations"""
    site_config = host.file(f'/etc/nginx/sites-available/{site}')
    assert site_config.exists

    site_enabled = host.file(f'/etc/nginx/sites-enabled/{site}')
    assert site_enabled.exists
    assert site_enabled.is_symlink
    assert site_enabled.linked_to == f'/etc/nginx/sites-available/{site}'

Policy as Code with Open Policy Agent

OPA Policies for Terraform

# policies/terraform/security.rego
package terraform.security

import future.keywords.if
import future.keywords.in

default allow = false

# Deny public S3 buckets
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket_public_access_block"
    resource.change.after.block_public_acls == false
    msg := sprintf("S3 bucket %s must block public ACLs", [resource.address])
}

# Require encryption for RDS instances
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_db_instance"
    not resource.change.after.storage_encrypted
    msg := sprintf("RDS instance %s must have encrypted storage", [resource.address])
}

# Enforce tagging requirements
required_tags := ["Environment", "Owner", "CostCenter", "Project"]

deny[msg] {
    resource := input.resource_changes[_]
    required_tag := required_tags[_]
    not resource.change.after.tags[required_tag]
    msg := sprintf("Resource %s missing required tag: %s", [resource.address, required_tag])
}

# Security group rules validation
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_security_group_rule"
    resource.change.after.type == "ingress"
    resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
    resource.change.after.from_port == 22
    msg := sprintf("Security group rule %s allows SSH from anywhere", [resource.address])
}

# Instance type restrictions
allowed_instance_types := [
    "t3.micro", "t3.small", "t3.medium",
    "m5.large", "m5.xlarge"
]

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    not resource.change.after.instance_type in allowed_instance_types
    msg := sprintf("Instance %s uses non-approved type: %s", [
        resource.address,
        resource.change.after.instance_type
    ])
}

# Network segmentation requirements
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    resource.change.after.associate_public_ip_address == true
    contains(resource.change.after.tags.Environment, "prod")
    msg := sprintf("Production instance %s cannot have public IP", [resource.address])
}
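
These deny rules operate on the JSON representation of a Terraform plan (terraform show -json). The CI pipeline later in this chapter evaluates them with opa eval; they can also be exercised from the Go test suite through OPA's Go API. A minimal sketch, assuming the rego package from github.com/open-policy-agent/opa, a plan exported to testdata/tfplan.json, and the policies living under policies/terraform:

// test/opa_policy_test.go (illustrative sketch)
package test

import (
    "context"
    "encoding/json"
    "os"
    "testing"

    "github.com/open-policy-agent/opa/rego"
    "github.com/stretchr/testify/require"
)

func TestTerraformSecurityPolicies(t *testing.T) {
    // Load a plan previously exported with: terraform show -json tfplan.binary
    raw, err := os.ReadFile("testdata/tfplan.json")
    require.NoError(t, err)

    var input interface{}
    require.NoError(t, json.Unmarshal(raw, &input))

    // Evaluate the deny rules from the terraform.security package against the plan
    r := rego.New(
        rego.Query("data.terraform.security.deny"),
        rego.Load([]string{"../policies/terraform"}, nil),
        rego.Input(input),
    )

    rs, err := r.Eval(context.Background())
    require.NoError(t, err)

    // Each expression value is the set of violation messages; it must be empty
    for _, result := range rs {
        for _, expr := range result.Expressions {
            require.Empty(t, expr.Value, "policy violations: %v", expr.Value)
        }
    }
}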

Sentinel Policies for Terraform Cloud

# policies/cost-control.sentinel
import "tfplan/v2" as tfplan
import "decimal"

# Maximum monthly cost threshold
max_monthly_cost = decimal.new(1000)

# Estimated monthly cost of the proposed run (cost estimates are exposed via the tfrun import)
estimated_cost = decimal.new(tfrun.cost_estimate.proposed_monthly_cost)

# Main rule
main = rule {
    estimated_cost.less_than_or_equal_to(max_monthly_cost)
}

# Instance cost limits by type
instance_cost_limits = {
    "t3.micro":  50,
    "t3.small":  100,
    "t3.medium": 200,
    "t3.large":  300,
    "m5.large":  400,
    "m5.xlarge": 800,
}

# Validate individual instance costs
validate_instance_costs = func() {
    validated = true

    for tfplan.resource_changes as _, rc {
        if rc.type is "aws_instance" and rc.change.actions contains "create" {
            instance_type = rc.change.after.instance_type

            # Only instance types with an approved cost limit may be created
            if instance_type not in keys(instance_cost_limits) {
                print("Instance", rc.address, "uses non-approved type", instance_type)
                validated = false
            }
        }
    }

    return validated
}

# Compliance checks
compliance_rule = rule {
    validate_instance_costs()
}

Integration Testing with Kitchen-Terraform

Kitchen Configuration

# .kitchen.yml
---
driver:
  name: terraform
  root_module_directory: test/fixtures/wrapper
  command_timeout: 1800

provisioner:
  name: terraform

verifier:
  name: terraform
  systems:
    - name: default
      backend: ssh
      hosts_output: public_ip
      user: ubuntu
      key_files:
        - test/assets/id_rsa

    - name: aws
      backend: awspec

platforms:
  - name: ubuntu-2204
    driver:
      variables:
        platform: ubuntu-2204

  - name: amazon-linux-2
    driver:
      variables:
        platform: amazon-linux-2

suites:
  - name: default
    driver:
      variables:
        instance_count: 2
        enable_monitoring: true
    verifier:
      inspec_tests:
        - test/integration/default
    lifecycle:
      pre_converge:
        - local: echo "Running pre-converge tasks"
      post_converge:
        - local: echo "Running post-converge tasks"

InSpec Integration Tests

# test/integration/default/controls/infrastructure.rb
control 'infrastructure-01' do
  title 'Verify VPC Configuration'
  desc 'Ensure VPC is configured correctly'

  describe aws_vpc(vpc_id: attribute('vpc_id')) do
    it { should exist }
    its('cidr_block') { should eq '10.0.0.0/16' }
    its('state') { should eq 'available' }
    its('enable_dns_support') { should eq true }
    its('enable_dns_hostnames') { should eq true }
  end
end

control 'infrastructure-02' do
  title 'Verify Security Groups'
  desc 'Ensure security groups are configured securely'

  aws_security_groups.where(vpc_id: attribute('vpc_id')).entries.each do |sg|
    describe aws_security_group(group_id: sg.group_id) do
      it { should_not allow_ingress_from_internet_on_port(22) }
      it { should_not allow_ingress_from_internet_on_port(3389) }

      # Custom matchers
      its('ingress_rules') { should_not include(
        from_port: 0,
        to_port: 65535,
        cidr_blocks: ['0.0.0.0/0']
      )}
    end
  end
end

control 'infrastructure-03' do
  title 'Verify EC2 Instances'
  desc 'Ensure EC2 instances meet security requirements'

  aws_ec2_instances.where(vpc_id: attribute('vpc_id')).entries.each do |instance|
    describe aws_ec2_instance(instance_id: instance.instance_id) do
      it { should be_running }
      it { should have_encrypted_root_volume }
      its('monitoring_state') { should eq 'enabled' }
      its('instance_type') { should be_in %w[t3.micro t3.small t3.medium] }

      # Check IMDSv2 is required
      its('metadata_options.http_tokens') { should eq 'required' }
      its('metadata_options.http_endpoint') { should eq 'enabled' }
    end
  end
end

control 'infrastructure-04' do
  title 'Verify S3 Buckets'
  desc 'Ensure S3 buckets are secure'

  aws_s3_buckets.entries.each do |bucket|
    describe aws_s3_bucket(bucket_name: bucket.name) do
      it { should have_default_encryption_enabled }
      it { should have_versioning_enabled }
      it { should_not be_public }

      its('bucket_acl.grants') { should_not include(
        grantee: { type: 'Group', uri: 'http://acs.amazonaws.com/groups/global/AllUsers' }
      )}
    end
  end
end

control 'infrastructure-05' do
  title 'Verify RDS Instances'
  desc 'Ensure RDS instances are configured securely'

  aws_rds_instances.entries.each do |db|
    describe aws_rds_instance(db_instance_identifier: db.db_instance_identifier) do
      it { should be_encrypted }
      it { should have_automated_backups_enabled }
      its('backup_retention_period') { should be >= 7 }
      its('multi_az') { should eq true }
      its('publicly_accessible') { should eq false }
      its('deletion_protection') { should eq true }
    end
  end
end

Continuous Compliance Pipeline

Complete CI/CD Pipeline for IaC

# .github/workflows/iac-pipeline.yml
name: Infrastructure as Code Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Daily compliance check

env:
  TF_VERSION: '1.5.0'
  ANSIBLE_VERSION: '2.15.0'
  AWS_DEFAULT_REGION: 'us-east-1'

jobs:
  static-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Tools
        run: |
          # Install required tools
          pip install checkov ansible-lint yamllint

          # Install tflint
          curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash

          # Install tfsec
          curl -s https://raw.githubusercontent.com/aquasecurity/tfsec/master/scripts/install_linux.sh | bash

      - name: YAML Lint
        run: yamllint -c .yamllint .

      - name: Ansible Lint
        run: ansible-lint ansible/

      - name: Terraform Security Scan
        run: |
          # --soft-fail on both scanners so each one runs and its report is
          # produced; findings are reviewed from the uploaded artifacts
          tfsec terraform/ --format json --out tfsec-report.json --soft-fail
          checkov -d terraform/ --output json --output-file-path checkov-report.json --soft-fail

      - name: Upload Security Reports
        uses: actions/upload-artifact@v3
        with:
          name: security-reports
          path: |
            tfsec-report.json
            checkov-report.json

  unit-tests:
    runs-on: ubuntu-latest
    needs: static-analysis

    strategy:
      matrix:
        test_suite: [Vpc, Ecs, Rds, S3]

    steps:
      - uses: actions/checkout@v3

      - name: Setup Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.20'

      - name: Run Terratest
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          cd test
          go mod download
          # Test functions follow the TestTerraformAws<Suite> naming convention used above
          go test -v -timeout 30m -run "TestTerraformAws${{ matrix.test_suite }}"

  policy-validation:
    runs-on: ubuntu-latest
    needs: static-analysis

    steps:
      - uses: actions/checkout@v3

      - name: Setup OPA
        run: |
          curl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64_static
          chmod +x opa
          sudo mv opa /usr/local/bin/

      - name: Terraform Plan
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          cd terraform/environments/dev
          terraform init
          terraform plan -out=tfplan.binary
          terraform show -json tfplan.binary > tfplan.json

      - name: OPA Policy Check
        run: |
          opa eval -d policies/ -i terraform/environments/dev/tfplan.json \
            "data.terraform.security.deny[_]" --format pretty

          # Fail if policies are violated
          VIOLATIONS=$(opa eval -d policies/ -i terraform/environments/dev/tfplan.json \
            "data.terraform.security.deny" --format json | jq '.result[0].expressions[0].value | length')

          if [ "$VIOLATIONS" -gt 0 ]; then
            echo "Policy violations found!"
            exit 1
          fi

  integration-tests:
    runs-on: ubuntu-latest
    needs: [unit-tests, policy-validation]
    if: github.ref == 'refs/heads/main'

    steps:
      - uses: actions/checkout@v3

      - name: Setup Ruby
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.0'
          bundler-cache: true

      - name: Run Kitchen Tests
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          bundle exec kitchen test --parallel

  compliance-scan:
    runs-on: ubuntu-latest
    needs: integration-tests

    steps:
      - uses: actions/checkout@v3

      - name: Run InSpec Compliance
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          # Install InSpec
          curl https://omnitruck.chef.io/install.sh | sudo bash -s -- -P inspec

          # Run compliance profiles
          inspec exec compliance/ --reporter json:compliance-report.json

      - name: Generate Compliance Report
        run: |
          python3 scripts/generate_compliance_report.py \
            --input compliance-report.json \
            --output compliance-report.html

      - name: Upload Compliance Report
        uses: actions/upload-artifact@v3
        with:
          name: compliance-report
          path: compliance-report.html

Conclusion

Testing Infrastructure as Code is not just about preventing outages—it’s about building confidence in our infrastructure changes, ensuring security and compliance, and enabling rapid, safe deployments. The comprehensive testing strategies presented here form a robust framework that catches issues at every level, from syntax errors to complex integration problems.

The combination of static analysis, unit testing, integration testing, and policy validation creates a safety net that allows teams to move fast without breaking things. By implementing these practices, organizations gain infrastructure that is self-documenting, continuously validated, and far less likely to drift out of compliance as it scales.

Remember that IaC testing is an evolving discipline. As your infrastructure grows in complexity, so should your testing strategies. Start with the basics—linting and validation—then gradually add more sophisticated testing layers. The investment in comprehensive IaC testing pays dividends in reduced incidents, faster deployments, and improved team confidence.