Traditional documentation faces a persistent challenge: it becomes outdated the moment code changes. Living Documentation solves this problem by generating documentation directly from source code, tests, and executable specifications. This approach ensures documentation stays synchronized with implementation, reduces maintenance burden, and provides always-current insights into system behavior.
## The Problem with Static Documentation
Manual documentation suffers from inherent limitations:

- **Staleness**: Documentation quickly diverges from the actual implementation
- **Maintenance overhead**: Every code change requires a separate documentation update
- **Low trust**: Teams stop trusting outdated docs and read the code instead
- **Duplicated effort**: The same information lives in code, tests, and documentation
Living Documentation addresses these issues by treating documentation as a first-class build artifact, automatically generated from the single source of truth: your codebase.
## Core Principles of Living Documentation

### 1. Single Source of Truth

Documentation should be extracted from code, not duplicated in separate documents. Use:
- Code annotations and docstrings for API documentation
- Executable specifications (BDD scenarios) for business requirements
- Test results for feature coverage and system behavior
- Architecture decision records (ADRs) versioned with code
### 2. Automation and CI/CD Integration

Documentation generation should be part of your automated build pipeline:
```yaml
# Example: GitHub Actions workflow for documentation generation
name: Generate Living Documentation

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  generate-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install sphinx sphinx-rtd-theme
          pip install -r requirements.txt

      - name: Generate API docs with Sphinx
        run: |
          cd docs
          sphinx-apidoc -f -o source/api ../src
          make html

      - name: Run BDD tests and generate Cucumber reports
        run: |
          pytest --cucumber-json=cucumber.json
          node generate-cucumber-report.js

      - name: Generate OpenAPI spec
        run: |
          python generate_openapi.py > openapi.yaml

      - name: Deploy documentation
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs/_build/html
```
### 3. Documentation Verification
If documentation can be generated from code, it should also be tested:
- **Syntax validation**: Ensure docstrings and annotations are properly formatted
- **Link checking**: Verify internal and external references
- **Example execution**: Test that code examples in documentation actually work
- **Coverage metrics**: Track which APIs/features lack documentation
## API Documentation with OpenAPI/Swagger

### Generating OpenAPI Specifications from Code

Modern frameworks can auto-generate API documentation from code annotations.

**Python (FastAPI example):**
```python
from enum import Enum
from typing import List, Optional

from fastapi import FastAPI, HTTPException, Query
from pydantic import BaseModel, Field

app = FastAPI(
    title="E-commerce API",
    description="API for managing products, orders, and customers",
    version="2.0.0",
    docs_url="/api/docs",
    redoc_url="/api/redoc",
)


class ProductCategory(str, Enum):
    """Product category enumeration"""

    ELECTRONICS = "electronics"
    CLOTHING = "clothing"
    BOOKS = "books"


class Product(BaseModel):
    """Product model with validation"""

    id: Optional[int] = Field(None, description="Unique product identifier")
    name: str = Field(..., min_length=1, max_length=200, description="Product name")
    description: str = Field(..., description="Detailed product description")
    price: float = Field(..., gt=0, description="Product price in USD")
    category: ProductCategory = Field(..., description="Product category")
    in_stock: bool = Field(True, description="Availability status")

    class Config:
        schema_extra = {
            "example": {
                "id": 123,
                "name": "Laptop Pro 15",
                "description": "High-performance laptop with 16GB RAM",
                "price": 1299.99,
                "category": "electronics",
                "in_stock": True,
            }
        }


@app.get(
    "/products",
    response_model=List[Product],
    summary="List all products",
    description="Retrieve a paginated list of all products in the catalog",
    response_description="List of products with pagination metadata",
)
async def list_products(
    # Query (not Field) declares validated query parameters in FastAPI
    skip: int = Query(0, ge=0, description="Number of records to skip"),
    limit: int = Query(10, ge=1, le=100, description="Maximum records to return"),
    category: Optional[ProductCategory] = Query(None, description="Filter by category"),
):
    """
    List products with optional filtering.

    - **skip**: Pagination offset (default: 0)
    - **limit**: Page size, max 100 (default: 10)
    - **category**: Optional category filter

    Returns a list of products matching the criteria.
    """
    # Implementation here
    pass


@app.post(
    "/products",
    response_model=Product,
    status_code=201,
    summary="Create new product",
    description="Add a new product to the catalog",
    responses={
        201: {"description": "Product created successfully"},
        400: {"description": "Invalid product data"},
        409: {"description": "Product already exists"},
    },
)
async def create_product(product: Product):
    """
    Create a new product in the catalog.

    Required fields:
    - name: Product name (1-200 characters)
    - description: Detailed description
    - price: Price in USD (must be positive)
    - category: One of: electronics, clothing, books

    Returns the created product with assigned ID.
    """
    # Implementation here
    pass
```
This code automatically generates interactive API documentation at `/api/docs` (Swagger UI) and `/api/redoc` (ReDoc), including:

- Complete endpoint catalog
- Request/response schemas
- Validation rules
- Example payloads
- Interactive testing interface
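The CI workflow earlier invokes a `generate_openapi.py` script. FastAPI does not ship one, but since `app.openapi()` returns the complete specification as a plain dict, the script can be a few lines. This is one possible shape for it; the `from app import app` import in the usage comment assumes the module layout above:

```python
# generate_openapi.py -- one possible implementation of the script
# invoked in the CI workflow (the script itself is project-specific).
import json


def dump_spec(app) -> str:
    """Serialize a FastAPI app's OpenAPI specification to pretty-printed JSON."""
    # FastAPI builds (and caches) the full spec as a plain dict.
    return json.dumps(app.openapi(), indent=2)


# Usage in a real project (assumes the FastAPI `app` defined above):
#   from app import app
#   print(dump_spec(app))
```

JSON is itself a valid OpenAPI format; if the pipeline expects YAML (the workflow redirects output to `openapi.yaml`), swap `json.dumps` for PyYAML's `yaml.safe_dump`.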
### Extending Generated Documentation

Enhance auto-generated docs with additional context:
````python
from fastapi import FastAPI
from fastapi.openapi.utils import get_openapi


def custom_openapi():
    """Customize the OpenAPI schema with additional metadata."""
    if app.openapi_schema:
        return app.openapi_schema

    openapi_schema = get_openapi(
        title="E-commerce API",
        version="2.0.0",
        description="""
## Overview
RESTful API for e-commerce platform with product management,
order processing, and customer accounts.

## Authentication
All endpoints except `/health` require Bearer token authentication:

```
Authorization: Bearer <your_token>
```

## Rate Limiting
- Unauthenticated: 100 requests/hour
- Authenticated: 5000 requests/hour

## Environments
- Production: https://api.example.com
- Staging: https://api-staging.example.com
- Development: http://localhost:8000

## Support
Contact: api-support@example.com
""",
        routes=app.routes,
    )

    # Add security scheme
    openapi_schema["components"]["securitySchemes"] = {
        "BearerAuth": {
            "type": "http",
            "scheme": "bearer",
            "bearerFormat": "JWT",
        }
    }

    # Add server information
    openapi_schema["servers"] = [
        {"url": "https://api.example.com", "description": "Production"},
        {"url": "https://api-staging.example.com", "description": "Staging"},
        {"url": "http://localhost:8000", "description": "Development"},
    ]

    app.openapi_schema = openapi_schema
    return app.openapi_schema


app.openapi = custom_openapi
````
## BDD Documentation with Cucumber/Gherkin

### Executable Specifications as Documentation

Behavior-Driven Development (BDD) scenarios serve dual purposes: they’re both executable tests and human-readable documentation.

**Feature file example:**
```gherkin
# features/checkout.feature
Feature: Shopping Cart Checkout
  As a customer
  I want to complete purchases through a streamlined checkout process
  So that I can quickly and securely buy products

  Background:
    Given the product catalog contains:
      | id | name           | price   | stock |
      | 1  | Laptop Pro 15  | 1299.99 | 10    |
      | 2  | Wireless Mouse | 29.99   | 50    |
    And I am logged in as "john@example.com"

  @critical @payment
  Scenario: Successful checkout with credit card
    Given I have added the following items to my cart:
      | product_id | quantity |
      | 1          | 1        |
      | 2          | 2        |
    When I proceed to checkout
    And I enter shipping address:
      | field       | value         |
      | street      | 123 Main St   |
      | city        | San Francisco |
      | state       | CA            |
      | postal_code | 94102         |
    And I select "Standard Shipping" delivery method
    And I pay with credit card:
      | field  | value            |
      | number | 4111111111111111 |
      | expiry | 12/25            |
      | cvv    | 123              |
      | name   | John Doe         |
    Then the order should be confirmed
    And I should receive an order confirmation email
    And the total amount charged should be $1,369.97
    And my cart should be empty

  @edge-case
  Scenario: Checkout fails with insufficient stock
    Given product "Laptop Pro 15" has only 1 unit in stock
    When I attempt to add 2 units of "Laptop Pro 15" to cart
    Then I should see error message "Insufficient stock available"
    And the cart should contain 0 items

  @security
  Scenario: Checkout requires authentication
    Given I am not logged in
    When I attempt to access the checkout page
    Then I should be redirected to the login page
    And I should see message "Please log in to continue"
```
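Tools such as pytest-bdd or behave bind each Given/When/Then line to a step-definition function (omitted here); the domain code those steps drive can start as small as this sketch of the insufficient-stock rule from the `@edge-case` scenario (class and method names are illustrative):

```python
class InsufficientStockError(Exception):
    """Raised when an add-to-cart request exceeds available stock."""


class Cart:
    """Minimal cart model behind the 'insufficient stock' scenario."""

    def __init__(self, stock):
        self.stock = stock  # product name -> units available
        self.items = {}     # product name -> units in cart

    def add(self, product, quantity):
        if quantity > self.stock.get(product, 0):
            raise InsufficientStockError("Insufficient stock available")
        self.items[product] = self.items.get(product, 0) + quantity


# Mirrors the scenario: only 1 unit in stock, attempt to add 2.
cart = Cart(stock={"Laptop Pro 15": 1})
try:
    cart.add("Laptop Pro 15", 2)
    message = None
except InsufficientStockError as err:
    message = str(err)

assert message == "Insufficient stock available"
assert len(cart.items) == 0  # the cart still contains 0 items
```

Because the scenario text doubles as documentation, the error message asserted in the Then step and the message raised by the model must stay in sync, which is exactly the kind of drift executable specifications catch.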
### Generating Reports from BDD Tests

Convert test execution results into comprehensive documentation:
```javascript
// generate-cucumber-report.js
const reporter = require('cucumber-html-reporter');

const options = {
  theme: 'bootstrap',
  jsonFile: 'cucumber.json',
  output: 'docs/test-reports/cucumber-report.html',
  reportSuiteAsScenarios: true,
  scenarioTimestamp: true,
  launchReport: false,
  metadata: {
    "App Version": "2.0.0",
    "Test Environment": "Staging",
    "Browser": "Chrome 118",
    "Platform": "Ubuntu 22.04",
    "Executed": new Date().toISOString()
  },
  customData: {
    title: 'E-commerce Test Execution Report',
    data: [
      {label: 'Project', value: 'E-commerce Platform'},
      {label: 'Release', value: 'Sprint 24'},
      {label: 'Cycle', value: 'Regression Testing'}
    ]
  }
};

reporter.generate(options);
```
The generated report provides:

- **Feature overview**: All features with pass/fail status
- **Scenario details**: Step-by-step execution with screenshots
- **Metrics**: Pass rate, duration, trends over time
- **Tags**: Filter by @critical, @security, @regression, etc.
- **Search**: Find specific scenarios or steps
## Documentation as Code

### Treating Documentation Like Software

Apply software engineering practices to documentation:

#### 1. Version Control

Store documentation in Git alongside code:
```text
project/
├── docs/
│   ├── architecture/
│   │   ├── adr/                  # Architecture Decision Records
│   │   │   ├── 001-use-microservices.md
│   │   │   └── 002-choose-postgresql.md
│   │   ├── diagrams/
│   │   │   └── system-context.puml
│   │   └── README.md
│   ├── api/
│   │   ├── openapi.yaml          # Generated from code
│   │   └── changelog.md
│   ├── testing/
│   │   ├── test-strategy.md
│   │   └── reports/              # Generated test reports
│   └── user-guide/
│       └── getting-started.md
├── src/
└── tests/
```
#### 2. Documentation Testing

Validate documentation quality automatically:
```python
# tests/test_documentation.py
import re
from pathlib import Path

import pytest


def test_all_code_examples_are_valid():
    """Ensure all Python code examples in docs are syntactically correct."""
    docs_path = Path("docs")
    for doc_file in docs_path.rglob("*.md"):
        content = doc_file.read_text()
        # Extract fenced Python code blocks
        code_blocks = re.findall(r'```python\n(.*?)\n```', content, re.DOTALL)
        for i, code in enumerate(code_blocks):
            try:
                compile(code, f'{doc_file}:block-{i}', 'exec')
            except SyntaxError as e:
                pytest.fail(f"Invalid Python code in {doc_file}, block {i}: {e}")


def test_no_broken_internal_links():
    """Verify all internal documentation links are valid."""
    docs_path = Path("docs")
    all_files = {f.relative_to(docs_path) for f in docs_path.rglob("*.md")}
    for doc_file in docs_path.rglob("*.md"):
        content = doc_file.read_text()
        # Find markdown links
        links = re.findall(r'\[.*?\]\((.*?)\)', content)
        for link in links:
            if link.startswith('http'):
                continue  # Skip external links
            # Resolve the relative path against the linking file
            target = (doc_file.parent / link).resolve().relative_to(docs_path.resolve())
            if target not in all_files:
                pytest.fail(f"Broken link in {doc_file}: {link}")


def test_api_endpoints_documented():
    """Ensure all API endpoints have OpenAPI documentation."""
    from app import app  # Your FastAPI app

    # Load the generated OpenAPI spec
    openapi = app.openapi()
    documented_paths = set(openapi['paths'].keys())

    # Collect the routes actually registered on the app
    actual_paths = set()
    for route in app.routes:
        if hasattr(route, 'path'):
            actual_paths.add(route.path)

    undocumented = actual_paths - documented_paths
    assert not undocumented, f"Undocumented endpoints: {undocumented}"
```
### Architecture Decision Records (ADRs)

Document architectural choices in a structured, version-controlled format:
```markdown
# ADR-005: Adopt Living Documentation Approach

## Status
Accepted

## Context
Our team struggles with documentation becoming outdated quickly.
Developers rarely update docs after code changes, leading to:

- Low trust in documentation accuracy
- Wasted time debugging based on incorrect docs
- Onboarding friction for new team members

We need documentation that stays synchronized with code automatically.

## Decision
We will adopt Living Documentation principles:

1. Generate API docs from code annotations (OpenAPI/Swagger)
2. Use BDD scenarios as executable requirements documentation
3. Auto-generate test reports in the CI/CD pipeline
4. Store ADRs in version control alongside code
5. Implement automated documentation validation tests

## Consequences

### Positive
- Documentation always reflects the current implementation
- Reduced manual documentation maintenance
- Single source of truth (code)
- Documentation becomes testable and verifiable
- Better onboarding with accurate, up-to-date docs

### Negative
- Initial setup effort for tooling and CI/CD integration
- Team needs training on documentation-as-code practices
- Some documentation still requires manual writing (user guides, tutorials)

## Implementation
- Weeks 1-2: Set up Swagger/OpenAPI generation
- Weeks 3-4: Integrate Cucumber reporting in CI/CD
- Weeks 5-6: Implement documentation tests
- Week 7: Team training and migration of existing docs

## Related Decisions
- ADR-003: API-first development approach
- ADR-004: Adopt BDD for requirement specification
```
## Tool Ecosystem for Living Documentation

### Sphinx and Read the Docs

For Python projects, Sphinx generates comprehensive documentation from docstrings:
```python
# src/payment_processor.py
class PaymentProcessor:
    """
    Handles payment processing for various payment methods.

    This class provides a unified interface for processing payments
    through different providers (Stripe, PayPal, etc.) and manages
    the transaction lifecycle including authorization, capture, and refund.

    Attributes:
        provider (str): Payment gateway provider name
        api_key (str): Authentication key for the provider
        timeout (int): Request timeout in seconds (default: 30)

    Example:
        >>> processor = PaymentProcessor('stripe', api_key='sk_test_...')
        >>> result = processor.charge(amount=99.99, currency='USD', card_token='tok_...')
        >>> print(result.status)
        succeeded

    See Also:
        :class:`RefundProcessor`: For processing refunds
        :class:`PaymentValidator`: For validating payment data
    """

    def charge(self, amount, currency, card_token, idempotency_key=None):
        """
        Charge a payment method.

        Args:
            amount (float): Amount to charge in the specified currency
            currency (str): Three-letter ISO currency code (e.g., 'USD', 'EUR')
            card_token (str): Tokenized card identifier from the payment provider
            idempotency_key (str, optional): Unique key to prevent duplicate charges

        Returns:
            PaymentResult: Object containing transaction details

        Raises:
            InvalidAmountError: If amount is negative or zero
            CardDeclinedError: If payment is declined by the issuer
            NetworkError: If the connection to the payment provider fails

        Example:
            >>> processor.charge(amount=49.99, currency='USD', card_token='tok_visa')
            PaymentResult(id='ch_123', status='succeeded', amount=49.99)

        Note:
            All amounts are processed with two-decimal precision.
            Idempotency keys expire after 24 hours.
        """
        pass
```
Sphinx configuration (`docs/conf.py`):
```python
# docs/conf.py
import os
import sys

sys.path.insert(0, os.path.abspath('../src'))

project = 'E-commerce Platform'
copyright = '2024, Engineering Team'
author = 'Engineering Team'
version = '2.0'
release = '2.0.0'

extensions = [
    'sphinx.ext.autodoc',       # Auto-generate docs from docstrings
    'sphinx.ext.napoleon',      # Support Google/NumPy docstring styles
    'sphinx.ext.viewcode',      # Add links to highlighted source code
    'sphinx.ext.intersphinx',   # Link to other projects' documentation
    'sphinx.ext.todo',          # Support for TODO items
    'sphinx.ext.coverage',      # Check documentation coverage
    'sphinx_rtd_theme',         # Read the Docs theme
]

autodoc_default_options = {
    'members': True,
    'member-order': 'bysource',
    'special-members': '__init__',
    'undoc-members': True,
    'exclude-members': '__weakref__',
}

html_theme = 'sphinx_rtd_theme'
```
### Docusaurus for Multi-Language Documentation Sites

Docusaurus creates versioned, searchable documentation websites:
```javascript
// docusaurus.config.js
module.exports = {
  title: 'E-commerce Platform Docs',
  tagline: 'Comprehensive API and integration guide',
  url: 'https://docs.example.com',
  baseUrl: '/',
  plugins: [
    [
      'docusaurus-plugin-openapi-docs',
      {
        id: 'openapi',
        docsPluginId: 'classic',
        config: {
          ecommerce: {
            specPath: 'openapi.yaml',
            outputDir: 'docs/api',
            sidebarOptions: {
              groupPathsBy: 'tag',
            },
          },
        },
      },
    ],
  ],
  presets: [
    [
      '@docusaurus/preset-classic',
      {
        docs: {
          sidebarPath: require.resolve('./sidebars.js'),
          editUrl: 'https://github.com/example/docs/edit/main/',
          showLastUpdateTime: true,
          showLastUpdateAuthor: true,
        },
        theme: {
          customCss: require.resolve('./src/css/custom.css'),
        },
      },
    ],
  ],
};
```
## Best Practices for Living Documentation

### 1. Start Small, Iterate

Don’t try to automate all documentation at once:

- **Phase 1**: API documentation from code annotations
- **Phase 2**: Test reports in CI/CD
- **Phase 3**: BDD scenarios as requirements docs
- **Phase 4**: Architecture diagrams as code (PlantUML, Mermaid)
### 2. Maintain What Can’t Be Generated

Some documentation requires human authorship:

- **User guides and tutorials**: Step-by-step instructions
- **Architecture overviews**: High-level system design
- **Troubleshooting guides**: Common issues and solutions
- **Migration guides**: Breaking changes and upgrade paths
Keep these docs close to code (in repo) and include them in review processes.
### 3. Documentation in Pull Requests

Make documentation updates part of code review:

```markdown
## Pull Request Checklist

- [ ] Code changes implemented
- [ ] Unit tests added/updated
- [ ] Integration tests pass
- [ ] Docstrings updated
- [ ] OpenAPI spec reflects changes
- [ ] BDD scenarios updated (if behavior changed)
- [ ] Migration guide updated (if breaking change)
- [ ] Changelog entry added
```
### 4. Monitor Documentation Health

Track metrics to ensure quality:

- **Coverage**: Percentage of public APIs with documentation
- **Freshness**: Time since the last update
- **Accuracy**: Automated validation test pass rate
- **Usability**: Analytics on the doc site (search queries, popular pages)
### 5. Educate the Team

Ensure everyone understands:
- How to write good docstrings/annotations
- The value of executable specifications
- Documentation-as-code workflows
- Tools and automation in place
## Conclusion
Living Documentation transforms documentation from a maintenance burden into an automatic byproduct of development. By extracting documentation from code, tests, and executable specifications, teams ensure accuracy, reduce duplication, and build trust in their documentation.
The key is integration: make documentation generation part of your standard development workflow, treat it as code (version-controlled, tested, reviewed), and automate everything possible. With the right tools and practices, your documentation will always reflect the current state of your system—no manual updates required.