Infrastructure as Code (IaC) has revolutionized how we provision and manage infrastructure, treating infrastructure configuration as software. Just as we test application code, IaC requires rigorous testing to prevent costly mistakes, security vulnerabilities, and service disruptions. A single untested infrastructure change can bring down production systems, compromise security, or generate unexpected cloud costs.

This comprehensive guide explores testing strategies for major IaC tools including Terraform, Ansible, and CloudFormation, along with implementing Compliance as Code to ensure infrastructure meets organizational standards and regulatory requirements.

Understanding Infrastructure as Code Testing

Why Test Infrastructure Code?

Unlike application code, where a bug typically breaks a single feature, errors in IaC can:

  • Cause complete service outages: Invalid configurations can destroy production resources
  • Create security vulnerabilities: Misconfigured access controls expose systems to attacks
  • Generate massive costs: Unintended resource provisioning can cost thousands per hour
  • Violate compliance requirements: Non-compliant infrastructure risks legal and financial penalties
  • Create drift and inconsistency: Untested changes lead to configuration drift across environments

Levels of IaC Testing

IaC testing follows a pyramid similar to application testing:

| Level | Purpose | Tools | Speed | Coverage |
|---|---|---|---|---|
| Static Analysis | Syntax validation, linting, security scanning | tflint, ansible-lint, cfn-lint | Fast (seconds) | High |
| Unit Testing | Test individual modules/resources in isolation | Terratest, Molecule, TaskCat | Medium (minutes) | Medium |
| Integration Testing | Test resource interactions and dependencies | Terratest, Kitchen, InSpec | Slow (10-30 min) | Medium |
| End-to-End Testing | Deploy full infrastructure and validate | Terratest, Serverspec | Very Slow (30+ min) | Low |
| Compliance Testing | Validate against policies and standards | OPA, Sentinel, Cloud Custodian | Fast (seconds) | High |

Terraform Testing

Terraform is the most popular IaC tool, supporting multiple cloud providers with a declarative configuration language.

Static Analysis with tflint

tflint checks Terraform code for potential errors, deprecated syntax, and best practices.

Installation:

# macOS
brew install tflint

# Linux
curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash

# Windows
choco install tflint

Configuration (.tflint.hcl):

plugin "aws" {
  enabled = true
  version = "0.25.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

plugin "azurerm" {
  enabled = true
  version = "0.23.0"
  source  = "github.com/terraform-linters/tflint-ruleset-azurerm"
}

config {
  module = true
  force = false
}

rule "terraform_deprecated_interpolation" {
  enabled = true
}

rule "terraform_unused_declarations" {
  enabled = true
}

rule "terraform_comment_syntax" {
  enabled = true
}

rule "terraform_documented_outputs" {
  enabled = true
}

rule "terraform_documented_variables" {
  enabled = true
}

rule "terraform_typed_variables" {
  enabled = true
}

rule "terraform_module_pinned_source" {
  enabled = true
}

rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}

rule "terraform_required_version" {
  enabled = true
}

rule "terraform_required_providers" {
  enabled = true
}

rule "terraform_standard_module_structure" {
  enabled = true
}

Running tflint:

# Initialize plugins
tflint --init

# Lint current directory
tflint

# Lint specific directory
tflint /path/to/terraform

# Output in different formats
tflint --format=json
tflint --format=checkstyle > tflint-report.xml

# Recursive linting (recent tflint releases also support `tflint --recursive`)
find . -type f -name "*.tf" -exec dirname {} \; | sort -u | xargs -I {} tflint {}

Terraform Validate and Format

Built-in validation:

# Format code to canonical style
terraform fmt -recursive

# Validate syntax and internal consistency
terraform validate

# Check if code is properly formatted
terraform fmt -check -recursive

Pre-commit hook for Terraform:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.83.5
    hooks:
      - id: terraform_fmt
      - id: terraform_validate
      - id: terraform_docs
      - id: terraform_tflint
        args:
          - --args=--config=__GIT_WORKING_DIR__/.tflint.hcl
      - id: terraform_tfsec
        args:
          - --args=--format=json
      - id: terraform_checkov
        args:
          - --args=--quiet
          - --args=--framework=terraform

Security Scanning with tfsec

tfsec scans Terraform code for security issues and misconfigurations.

Installation:

# macOS
brew install tfsec

# Linux
curl -s https://raw.githubusercontent.com/aquasecurity/tfsec/master/scripts/install_linux.sh | bash

# Docker
docker run --rm -it -v "$(pwd):/src" aquasec/tfsec /src

Running tfsec:

# Scan current directory
tfsec .

# Output formats
tfsec . --format=json > tfsec-results.json
tfsec . --format=junit > tfsec-report.xml
tfsec . --format=sarif > tfsec-results.sarif

# Exclude specific checks
tfsec . --exclude=aws-s3-enable-bucket-encryption,aws-s3-enable-bucket-logging

# Set minimum severity
tfsec . --minimum-severity=HIGH

# Soft fail (exit 0 even with issues)
tfsec . --soft-fail

Example security issues detected:

Result #1 HIGH S3 Bucket does not have logging enabled.
────────────────────────────────────────────────────────────
  main.tf:15-20
────────────────────────────────────────────────────────────
   15  ┌ resource "aws_s3_bucket" "data" {
   16  │   bucket = "my-data-bucket"
   17  │   acl    = "private"
   18  │
   19  │   # Missing: logging configuration
   20  └ }
────────────────────────────────────────────────────────────
  Impact:     Audit trail of bucket access is not available
  Resolution: Enable logging for S3 buckets
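The JSON output shown earlier (`tfsec . --format=json`) can feed a CI gate that fails the build only above a chosen severity. A minimal sketch, assuming the report contains a top-level `results` array whose entries carry `severity` and `rule_id` fields (the embedded sample report is hypothetical):

```python
import json

# Severity ordering used to compare findings against a threshold
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}

def failing_findings(report, min_severity="HIGH"):
    """Return tfsec findings at or above min_severity."""
    threshold = SEVERITY_RANK[min_severity]
    return [
        finding for finding in report.get("results") or []
        if SEVERITY_RANK.get(finding.get("severity", "LOW"), 0) >= threshold
    ]

# Hypothetical report shaped like `tfsec . --format=json` output
sample_report = json.loads("""
{"results": [
  {"rule_id": "aws-s3-enable-bucket-logging", "severity": "HIGH",
   "description": "S3 Bucket does not have logging enabled."},
  {"rule_id": "aws-s3-specify-public-access-block", "severity": "LOW",
   "description": "No public access block is configured."}
]}
""")

blockers = failing_findings(sample_report, "HIGH")
for finding in blockers:
    print(f"{finding['severity']}: {finding['rule_id']}")
```

In a pipeline, exit non-zero whenever `blockers` is non-empty; lower-severity findings can still be surfaced as warnings.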

Unit Testing with Terratest

Terratest is a Go library for testing infrastructure code by actually deploying it to real environments.

Installation:

# Initialize Go module
go mod init github.com/company/terraform-tests

# Install Terratest
go get github.com/gruntwork-io/terratest/modules/terraform

Example Terratest (vpc_test.go):

package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/aws"
	"github.com/gruntwork-io/terratest/modules/random"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestVPCCreation(t *testing.T) {
	t.Parallel()

	// Pick a random AWS region
	awsRegion := aws.GetRandomStableRegion(t, nil, nil)

	// Generate unique name
	uniqueID := random.UniqueId()
	vpcName := "test-vpc-" + uniqueID

	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
		// Path to Terraform code
		TerraformDir: "../examples/vpc",

		// Variables to pass
		Vars: map[string]interface{}{
			"vpc_name":   vpcName,
			"vpc_cidr":   "10.0.0.0/16",
			"aws_region": awsRegion,
		},

		// Environment variables
		EnvVars: map[string]string{
			"AWS_DEFAULT_REGION": awsRegion,
		},
	})

	// Clean up resources after test
	defer terraform.Destroy(t, terraformOptions)

	// Deploy infrastructure
	terraform.InitAndApply(t, terraformOptions)

	// Validate outputs
	vpcID := terraform.Output(t, terraformOptions, "vpc_id")
	assert.NotEmpty(t, vpcID)

	// Validate VPC exists in AWS
	vpc := aws.GetVpcById(t, vpcID, awsRegion)
	assert.Equal(t, vpcName, aws.GetTagsForVpc(t, vpcID, awsRegion)["Name"])

	// Validate CIDR block
	assert.Equal(t, "10.0.0.0/16", *vpc.CidrBlock)

	// Validate subnets were created
	subnetIDs := terraform.OutputList(t, terraformOptions, "subnet_ids")
	assert.Equal(t, 3, len(subnetIDs))

	// Validate each subnet
	for _, subnetID := range subnetIDs {
		subnet := aws.GetSubnetById(t, subnetID, awsRegion)
		assert.Equal(t, vpcID, *subnet.VpcId)
	}

	// Validate NAT gateway
	natGatewayID := terraform.Output(t, terraformOptions, "nat_gateway_id")
	assert.NotEmpty(t, natGatewayID)

	// Validate internet gateway
	igwID := terraform.Output(t, terraformOptions, "internet_gateway_id")
	assert.NotEmpty(t, igwID)
}

func TestVPCWithCustomCIDR(t *testing.T) {
	t.Parallel()

	awsRegion := "us-west-2"

	terraformOptions := &terraform.Options{
		TerraformDir: "../examples/vpc",
		Vars: map[string]interface{}{
			"vpc_name":   "test-custom-cidr-vpc",
			"vpc_cidr":   "172.16.0.0/16",
			"aws_region": awsRegion,
		},
	}

	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	vpcID := terraform.Output(t, terraformOptions, "vpc_id")
	vpc := aws.GetVpcById(t, vpcID, awsRegion)

	assert.Equal(t, "172.16.0.0/16", *vpc.CidrBlock)
}

func TestVPCTagging(t *testing.T) {
	t.Parallel()

	awsRegion := "us-east-1"
	expectedTags := map[string]string{
		"Environment": "test",
		"ManagedBy":   "terraform",
		"Owner":       "platform-team",
	}

	terraformOptions := &terraform.Options{
		TerraformDir: "../examples/vpc",
		Vars: map[string]interface{}{
			"vpc_name":   "test-tagged-vpc",
			"vpc_cidr":   "10.1.0.0/16",
			"aws_region": awsRegion,
			"tags":       expectedTags,
		},
	}

	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	vpcID := terraform.Output(t, terraformOptions, "vpc_id")
	actualTags := aws.GetTagsForVpc(t, vpcID, awsRegion)

	for key, expectedValue := range expectedTags {
		assert.Equal(t, expectedValue, actualTags[key])
	}
}

Running Terratest:

# Run all tests
go test -v -timeout 30m

# Run specific test
go test -v -timeout 30m -run TestVPCCreation

# Run tests in parallel
go test -v -timeout 45m -parallel 3

# Generate test coverage
go test -v -timeout 30m -coverprofile=coverage.out
go tool cover -html=coverage.out -o coverage.html

Terraform Plan Testing

Test changes before applying them:

# Generate and save plan
terraform plan -out=tfplan

# Convert plan to JSON for programmatic analysis
terraform show -json tfplan > tfplan.json

# Validate plan with custom script
python scripts/validate_plan.py tfplan.json

Example plan validation script (validate_plan.py):

#!/usr/bin/env python3
import json
import sys

def validate_plan(plan_file):
    with open(plan_file) as f:
        plan = json.load(f)

    errors = []
    warnings = []

    # Check for resource deletion
    for change in plan.get('resource_changes', []):
        actions = change.get('change', {}).get('actions', [])
        if 'delete' in actions:
            resource_name = change.get('address')
            errors.append(f"Plan includes deletion of {resource_name}")

    # Check for unencrypted S3 buckets
    for change in plan.get('resource_changes', []):
        if change.get('type') == 'aws_s3_bucket':
            after = change.get('change', {}).get('after', {})
            if not after.get('server_side_encryption_configuration'):
                resource_name = change.get('address')
                warnings.append(f"S3 bucket {resource_name} lacks encryption")

    # Check for public access
    for change in plan.get('resource_changes', []):
        if change.get('type') == 'aws_security_group':
            after = change.get('change', {}).get('after', {})
            for ingress in after.get('ingress', []):
                if '0.0.0.0/0' in ingress.get('cidr_blocks', []):
                    resource_name = change.get('address')
                    warnings.append(f"Security group {resource_name} allows public access")

    # Print results
    if errors:
        print("ERRORS:")
        for error in errors:
            print(f"  ❌ {error}")

    if warnings:
        print("WARNINGS:")
        for warning in warnings:
            print(f"  ⚠️  {warning}")

    if errors:
        sys.exit(1)
    elif warnings:
        print("\n⚠️  Plan has warnings but will proceed")
        sys.exit(0)
    else:
        print("✅ Plan validation passed")
        sys.exit(0)

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Usage: validate_plan.py <plan.json>")
        sys.exit(1)

    validate_plan(sys.argv[1])

Ansible Playbook Testing

Ansible automates configuration management and application deployment using YAML playbooks.
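Before looking at the tooling, here is a minimal playbook of the kind the linters and test frameworks below operate on (the host group and file path are illustrative):

```
# playbooks/deploy.yml -- illustrative target for ansible-lint and Molecule
---
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Install Nginx
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Ensure Nginx is running and enabled
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Using fully qualified collection names (`ansible.builtin.*`) keeps playbooks clean under ansible-lint's production profile.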

Ansible Lint

ansible-lint checks playbooks for common mistakes and best practices.

Installation:

pip install ansible-lint

Configuration (.ansible-lint):

profile: production  # Or 'basic', 'moderate', 'safety', 'shared'

exclude_paths:
  - .cache/
  - .github/
  - test/fixtures/

# Enable/disable specific rules
skip_list:
  - experimental  # Experimental rules
  - galaxy  # Galaxy-specific rules

# Or enable only specific rules
# enable_list:
#   - yaml
#   - no-changed-when

# Rule-specific configuration
rules:
  line-length:
    max: 160
    allow-long-ansible-strings: true

  var-naming:
    allowed-names:
      - role_name
      - foo

  command-instead-of-module:
    enabled: true

# Tag management
tags:
  - skip_ansible_lint  # Ignore tasks with this tag

# Offline mode
offline: false

# Use colored output
colored: true

# Parseable output format
parseable: false

Running ansible-lint:

# Lint all playbooks in current directory
ansible-lint

# Lint specific playbook
ansible-lint playbooks/deploy.yml

# Lint with specific profile
ansible-lint --profile=safety

# Output formats
ansible-lint --format=json > lint-results.json
ansible-lint --format=codeclimate > lint-results.json

# Progressive mode (only new issues)
ansible-lint --progressive

# Generate rules documentation
ansible-lint --list-rules

Molecule: Ansible Testing Framework

Molecule provides a complete testing framework for Ansible roles and playbooks.

Installation:

pip install molecule molecule-docker ansible-lint

Initialize new role with Molecule:

molecule init role my_role --driver-name=docker

Molecule configuration (molecule/default/molecule.yml):

---
dependency:
  name: galaxy
  options:
    requirements-file: requirements.yml

driver:
  name: docker

platforms:
  - name: ubuntu-20
    image: geerlingguy/docker-ubuntu2004-ansible:latest
    pre_build_image: true
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    command: /lib/systemd/systemd

  - name: centos-8
    image: geerlingguy/docker-centos8-ansible:latest
    pre_build_image: true
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    command: /usr/sbin/init

provisioner:
  name: ansible
  config_options:
    defaults:
      callbacks_enabled: profile_tasks,timer
      stdout_callback: yaml
  inventory:
    host_vars:
      ubuntu-20:
        ansible_python_interpreter: /usr/bin/python3
      centos-8:
        ansible_python_interpreter: /usr/bin/python3
  playbooks:
    converge: converge.yml
    verify: verify.yml

verifier:
  name: ansible

lint: |
  set -e
  yamllint .
  ansible-lint .

scenario:
  name: default
  test_sequence:
    - dependency
    - lint
    - cleanup
    - destroy
    - syntax
    - create
    - prepare
    - converge
    - idempotence
    - side_effect
    - verify
    - cleanup
    - destroy

Molecule test playbook (molecule/default/verify.yml):

---
- name: Verify
  hosts: all
  gather_facts: true
  tasks:
    - name: Check if Nginx is installed
      package:
        name: nginx
        state: present
      check_mode: true
      register: nginx_install
      failed_when: nginx_install.changed

    - name: Check if Nginx service is running
      service:
        name: nginx
        state: started
        enabled: true
      check_mode: true
      register: nginx_service
      failed_when: nginx_service.changed

    - name: Verify Nginx is listening on port 80
      wait_for:
        port: 80
        timeout: 5
        state: started

    - name: Check Nginx configuration
      command: nginx -t
      changed_when: false
      register: nginx_config
      failed_when: nginx_config.rc != 0

    - name: Verify default site is accessible
      uri:
        url: http://localhost
        status_code: 200
        return_content: yes
      register: web_response

    - name: Check Nginx user
      # Pipes require the shell module; the command module does not support them
      shell: ps aux | grep nginx | grep -v grep
      register: nginx_process
      changed_when: false
      failed_when: "'www-data' not in nginx_process.stdout"

    - name: Verify log files exist
      stat:
        path: "{{ item }}"
      register: log_files
      failed_when: not log_files.stat.exists
      loop:
        - /var/log/nginx/access.log
        - /var/log/nginx/error.log

Running Molecule tests:

# Run full test sequence
molecule test

# Run specific test steps
molecule create      # Create test instances
molecule converge    # Apply playbook
molecule verify      # Run verification tests
molecule destroy     # Destroy test instances

# Test idempotence (playbook should not make changes on second run)
molecule idempotence

# Debug mode
molecule --debug test

# Test specific scenario
molecule test -s centos

# List available scenarios
molecule list

Ansible Unit Testing with ansible-test

For testing Ansible modules and plugins:

# Sanity tests
ansible-test sanity --docker default

# Unit tests
ansible-test units --docker default

# Integration tests
ansible-test integration --docker default

CloudFormation Testing

AWS CloudFormation uses JSON or YAML templates to define infrastructure.

CloudFormation Linting with cfn-lint

Installation:

pip install cfn-lint

Configuration (.cfnlintrc):

templates:
  - templates/**/*.yaml
  - templates/**/*.yml
  - templates/**/*.json

ignore_checks:
  - E3012  # Strict type checking of property values

regions:
  - us-east-1
  - us-west-2
  - eu-west-1

append_rules:
  - custom_rules/

override_spec: custom-spec.json

Running cfn-lint:

# Lint template
cfn-lint template.yaml

# Lint all templates
cfn-lint templates/**/*.yaml

# Output formats
cfn-lint template.yaml --format json
cfn-lint template.yaml --format junit > cfn-lint-results.xml

# Ignore specific rules
cfn-lint template.yaml --ignore-checks E1029,W2001

# Set regions to validate against
cfn-lint template.yaml --regions us-east-1,eu-west-1

CloudFormation Validation

# Validate template syntax
aws cloudformation validate-template --template-body file://template.yaml

# Estimate cost
aws cloudformation estimate-template-cost \
  --template-body file://template.yaml \
  --parameters file://parameters.json

TaskCat: CloudFormation Testing Tool

TaskCat tests CloudFormation templates by deploying them to multiple regions.

Installation:

pip install taskcat

Configuration (.taskcat.yml):

project:
  name: my-infrastructure
  owner: platform-team
  regions:
    - us-east-1
    - us-west-2
    - eu-west-1

  parameters:
    KeyPairName: my-keypair
    InstanceType: t3.micro

  tags:
    Environment: test
    ManagedBy: taskcat

tests:
  default:
    template: templates/main.yaml
    regions:
      - us-east-1
      - us-west-2
    parameters:
      EnvironmentName: test-default

  production-like:
    template: templates/main.yaml
    regions:
      - us-east-1
    parameters:
      EnvironmentName: test-prod
      InstanceType: t3.large
      EnableHA: "true"

Running TaskCat:

# Run tests
taskcat test run

# List tests
taskcat test list

# Upload results to S3
taskcat test run --upload

# Clean up resources
taskcat test clean

Compliance as Code

Compliance as Code automates policy enforcement, ensuring infrastructure meets security, regulatory, and organizational standards.

Open Policy Agent (OPA)

OPA provides policy-based control using the Rego language.

Installation:

# macOS
brew install opa

# Linux
curl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64
chmod +x opa

Example policy (terraform.rego):

package terraform.analysis

import future.keywords

# Deny S3 buckets without encryption
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    not resource.change.after.server_side_encryption_configuration
    msg := sprintf("S3 bucket '%s' must have encryption enabled", [resource.address])
}

# Deny security groups with unrestricted ingress
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_security_group"
    rule := resource.change.after.ingress[_]
    "0.0.0.0/0" in rule.cidr_blocks
    msg := sprintf("Security group '%s' allows unrestricted access", [resource.address])
}

# Warn about public S3 buckets
warn[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket_public_access_block"
    resource.change.after.block_public_acls == false
    msg := sprintf("S3 bucket '%s' allows public ACLs", [resource.address])
}

# Require specific tags
required_tags := ["Environment", "Owner", "CostCenter"]

deny[msg] {
    resource := input.resource_changes[_]
    tags := object.get(resource.change.after, "tags", {})
    missing := [tag | tag := required_tags[_]; not tags[tag]]
    count(missing) > 0
    msg := sprintf("Resource '%s' missing required tags: %v", [resource.address, missing])
}

Testing Terraform with OPA:

# Generate Terraform plan
terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json

# Evaluate policy
opa eval --data terraform.rego --input tfplan.json "data.terraform.analysis.deny"

# Or use conftest
conftest test tfplan.json -p terraform.rego
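The JSON that `opa eval` emits wraps rule results in a `result`/`expressions` envelope, so a small script can extract the deny messages and gate the pipeline. A minimal sketch (the embedded sample output is hypothetical but follows that envelope shape):

```python
import json

def deny_messages(opa_output):
    """Collect every value produced by the deny rule from `opa eval` JSON output."""
    messages = []
    for result in opa_output.get("result", []):
        for expression in result.get("expressions", []):
            messages.extend(expression.get("value") or [])
    return messages

# Hypothetical `opa eval ... "data.terraform.analysis.deny"` output
sample_output = json.loads("""
{"result": [{"expressions": [{
  "value": ["S3 bucket 'aws_s3_bucket.data' must have encryption enabled"],
  "text": "data.terraform.analysis.deny"}]}]}
""")

violations = deny_messages(sample_output)
for message in violations:
    print(f"DENY: {message}")
```

A CI wrapper would exit non-zero when `violations` is non-empty, blocking the apply step.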

HashiCorp Sentinel

Sentinel is HashiCorp’s policy-as-code framework, integrated with Terraform Cloud/Enterprise.

Example Sentinel policy:

import "tfplan/v2" as tfplan

# Allowed instance types
allowed_types = ["t3.micro", "t3.small", "t3.medium"]

# Find all EC2 instances
ec2_instances = filter tfplan.resource_changes as _, rc {
    rc.type is "aws_instance" and
    rc.mode is "managed" and
    (rc.change.actions contains "create" or rc.change.actions contains "update")
}

# Validate instance types
instance_type_valid = rule {
    all ec2_instances as _, instance {
        instance.change.after.instance_type in allowed_types
    }
}

# Main rule
main = rule {
    instance_type_valid
}

Checkov: Infrastructure Code Scanner

Checkov scans IaC for security and compliance issues.

Installation:

pip install checkov

Running Checkov:

# Scan Terraform
checkov -d .

# Scan specific file
checkov -f main.tf

# Scan CloudFormation
checkov -f template.yaml --framework cloudformation

# Output formats
checkov -d . --output json > checkov-results.json
checkov -d . --output junitxml > checkov-results.xml

# Skip specific checks
checkov -d . --skip-check CKV_AWS_18,CKV_AWS_19

# Soft fail
checkov -d . --soft-fail

# Scan Dockerfiles
checkov --framework dockerfile -f Dockerfile

Custom Checkov check (custom_checks/s3_versioning.py):

from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck
from checkov.common.models.enums import CheckResult, CheckCategories

class S3VersioningEnabled(BaseResourceCheck):
    def __init__(self):
        name = "Ensure S3 bucket versioning is enabled"
        id = "CKV_AWS_CUSTOM_001"
        supported_resources = ['aws_s3_bucket']
        categories = [CheckCategories.BACKUP_AND_RECOVERY]
        super().__init__(name=name, id=id, categories=categories, supported_resources=supported_resources)

    def scan_resource_conf(self, conf):
        """
        Looks for versioning configuration on S3 buckets
        """
        if 'versioning' in conf:
            versioning = conf['versioning'][0]
            if isinstance(versioning, dict):
                if versioning.get('enabled') == [True]:
                    return CheckResult.PASSED
        return CheckResult.FAILED

check = S3VersioningEnabled()

Cloud Custodian

Cloud Custodian provides governance-as-code for cloud environments.

Installation:

pip install c7n

Policy example (custodian.yml):

policies:
  - name: s3-enforce-encryption
    resource: s3
    filters:
      - type: bucket-encryption
        state: false
    actions:
      - type: set-bucket-encryption
        crypto: AES256

  - name: ec2-stop-after-hours
    resource: ec2
    filters:
      - type: value
        key: "tag:Environment"
        value: dev
      - type: offhour
        offhour: 18
        default_tz: est
    actions:
      - stop

  - name: unused-ebs-volumes
    resource: ebs
    filters:
      - State: available
      - type: value
        key: CreateTime
        value_type: age
        op: greater-than
        value: 30
    actions:
      - type: delete

Running Cloud Custodian:

# Dry run (no actions)
custodian run -s output custodian.yml --dryrun

# Apply policies
custodian run -s output custodian.yml

# Generate report
custodian report -s output custodian.yml --format grid

Integrating IaC Testing in CI/CD

Complete GitHub Actions workflow:

name: Infrastructure Testing

on:
  pull_request:
    paths:
      - 'terraform/**'
      - 'ansible/**'
  push:
    branches: [main]

jobs:
  terraform-validation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.5.0

      - name: Terraform Format Check
        run: terraform fmt -check -recursive
        working-directory: terraform

      - name: Terraform Init
        run: terraform init
        working-directory: terraform

      - name: Terraform Validate
        run: terraform validate
        working-directory: terraform

      - name: TFLint
        uses: terraform-linters/setup-tflint@v3
        with:
          tflint_version: latest

      - name: Run TFLint
        run: tflint --recursive
        working-directory: terraform

      - name: tfsec
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          working_directory: terraform
          format: sarif
          soft_fail: true

      - name: Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: terraform
          framework: terraform
          output_format: sarif
          soft_fail: true

  terraform-test:
    needs: terraform-validation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-go@v4
        with:
          go-version: '1.21'

      - name: Run Terratest
        run: go test -v -timeout 30m
        working-directory: test
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

  ansible-validation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: pip install ansible ansible-lint molecule molecule-docker

      - name: Ansible Lint
        run: ansible-lint playbooks/

      - name: Molecule Test
        run: molecule test
        working-directory: roles/webserver

Best Practices for IaC Testing

  1. Test in isolation: Use separate accounts/projects for testing
  2. Clean up resources: Always destroy test infrastructure
  3. Use realistic data: Test with production-like configurations
  4. Version control tests: Store tests alongside infrastructure code
  5. Automate everything: Run tests in CI/CD pipelines
  6. Monitor costs: Track spending on test infrastructure
  7. Parallelize tests: Speed up feedback with parallel execution
  8. Document policies: Make compliance requirements clear
  9. Regular policy updates: Keep security rules current
  10. Measure coverage: Track what’s tested and what’s not

Conclusion

Infrastructure as Code testing is essential for maintaining reliable, secure, and compliant infrastructure. By combining static analysis, unit testing, integration testing, and compliance-as-code practices, teams can catch issues early, prevent production disasters, and maintain high infrastructure quality standards.

The key is to implement multiple layers of testing—from quick static checks to comprehensive deployment tests—and automate them in CI/CD pipelines. Start with basic validation and linting, then progressively add more sophisticated testing as your infrastructure grows in complexity.

Key Takeaways:

  • IaC requires the same rigor as application code testing
  • Static analysis catches most issues before deployment
  • Unit and integration tests validate actual infrastructure behavior
  • Compliance as Code automates policy enforcement
  • Multiple testing layers provide defense in depth
  • Automation in CI/CD ensures consistent testing
  • Clean up test resources to control costs
  • Document and version control all tests and policies