SEO Testing for QA

Learn SEO testing fundamentals for QA engineers: meta tags, canonical URLs, structured data, robots.txt, sitemaps, and crawlability verification techniques.

Why QA Engineers Need to Test SEO

SEO (Search Engine Optimization) directly impacts how many users find a website through search engines. A single misconfigured meta tag, a broken canonical URL, or an accidental noindex directive can cause pages to disappear from search results, potentially costing thousands of visitors.

QA engineers are in a unique position to catch SEO issues because they already test the HTML output, verify page behavior, and check edge cases that developers might miss. Technical SEO testing fits naturally into the web testing workflow.

Essential SEO Elements to Test

Title Tags

The <title> tag appears in search results and browser tabs. Test that:

Every page has a title tag
Title length is roughly 50-60 characters (search engines truncate longer titles, typically by pixel width rather than an exact character count)
Title contains the primary keyword naturally
Title is unique across the site (no two pages share the same title)
Dynamic pages generate correct titles (product name, category)

<!-- Good -->
<title>Cypress Tutorial for Beginners: Complete Guide | YourSite</title>

<!-- Bad: Too long, will be truncated -->
<title>The Complete and Comprehensive Cypress Tutorial for Beginners Who Want to Learn Test Automation from Scratch in 2025</title>

<!-- Bad: Generic, not unique -->
<title>Page</title>
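
The length and uniqueness checks above could be scripted roughly as follows. This is a minimal sketch, not a definitive implementation: checkTitle and findDuplicateTitles are hypothetical helper names, the 50-60 range is taken from the checklist, and the list of pages is assumed to have been collected elsewhere (for example by a crawler).

```javascript
// Hypothetical helper: returns a list of problems found in one title.
// The default range mirrors the 50-60 character guideline above.
function checkTitle(title, { min = 50, max = 60 } = {}) {
  const issues = [];
  const text = (title || '').trim();
  if (text.length === 0) {
    issues.push('missing');
  } else {
    if (text.length < min) issues.push(`too short (${text.length} chars)`);
    if (text.length > max) issues.push(`too long (${text.length} chars)`);
  }
  return issues;
}

// Hypothetical helper: groups URLs by title (case-insensitive) and
// returns only the titles that are shared by more than one URL.
function findDuplicateTitles(pages) {
  const urlsByTitle = new Map();
  for (const { url, title } of pages) {
    const key = title.trim().toLowerCase();
    if (!urlsByTitle.has(key)) urlsByTitle.set(key, []);
    urlsByTitle.get(key).push(url);
  }
  return [...urlsByTitle.entries()].filter(([, urls]) => urls.length > 1);
}

console.log(checkTitle('Page')); // [ 'too short (4 chars)' ]
```

Returning a list of issues instead of throwing lets one run report every problem on a page at once.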

Meta Description

The meta description usually appears as the snippet below the title in search results. Test that it is:

Present on every page (150-160 characters)
Unique per page (no duplicates)
Contains a call to action or value proposition
Includes the target keyword naturally
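
One way to check presence and length without a browser is to read the description straight from the raw HTML. The sketch below is regex-based and therefore only an approximation of real HTML parsing; parseAttributes, getMetaDescriptions and checkMetaDescription are hypothetical names, and the 150-160 range comes from the checklist above.

```javascript
// Parses the attributes of a single HTML tag into a plain object.
// Regex-based: good enough for a smoke check, not a full HTML parser.
function parseAttributes(tag) {
  const attrs = {};
  const re = /([a-zA-Z_:][-\w:.]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
  for (const m of tag.matchAll(re)) {
    attrs[m[1].toLowerCase()] = m[2] ?? m[3] ?? m[4];
  }
  return attrs;
}

// Returns the content of every <meta name="description"> tag in the HTML.
function getMetaDescriptions(html) {
  const tags = html.match(/<meta\b[^>]*>/gi) || [];
  return tags
    .map(parseAttributes)
    .filter(attrs => (attrs.name || '').toLowerCase() === 'description')
    .map(attrs => (attrs.content || '').trim());
}

// Flags a missing description, multiple descriptions, or a length
// outside the expected range.
function checkMetaDescription(html, { min = 150, max = 160 } = {}) {
  const found = getMetaDescriptions(html);
  if (found.length === 0) return ['missing'];
  const issues = [];
  if (found.length > 1) issues.push(`${found.length} description tags found`);
  const length = found[0].length;
  if (length < min || length > max) {
    issues.push(`length ${length} outside ${min}-${max}`);
  }
  return issues;
}
```

Uniqueness across pages can then be checked by collecting the extracted descriptions per URL and looking for repeated values.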

Canonical Tags

The canonical tag tells search engines which URL is the preferred version of a page, which prevents duplicate content issues:

<link rel="canonical" href="https://example.com/blog/seo-testing" />

Test that:

Every page has a canonical tag
The canonical URL points to the correct page and does not itself redirect (no redirect chains)
Paginated pages have correct canonicals
HTTP pages canonicalize to their HTTPS versions
Trailing slashes are used consistently (choose one pattern and apply it everywhere)
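
Several of these rules can be checked from the href value alone. The following is a minimal sketch under stated assumptions: checkCanonical is a hypothetical helper, the trailing-slash policy is passed in by the caller, and redirect behavior is not covered because that needs a real HTTP request.

```javascript
// Hypothetical helper: validates the shape of a canonical href.
// expectTrailingSlash encodes whichever pattern the site has chosen.
function checkCanonical(href, { expectTrailingSlash = false } = {}) {
  let url;
  try {
    url = new URL(href); // throws for relative or malformed URLs
  } catch {
    return ['not an absolute URL'];
  }
  const issues = [];
  if (url.protocol !== 'https:') issues.push('not HTTPS');
  if (url.hash) issues.push('contains a fragment');
  // The root path "/" always ends with a slash, so skip it.
  if (url.pathname !== '/') {
    const hasSlash = url.pathname.endsWith('/');
    if (hasSlash && !expectTrailingSlash) issues.push('unexpected trailing slash');
    if (!hasSlash && expectTrailingSlash) issues.push('missing trailing slash');
  }
  return issues;
}

console.log(checkCanonical('https://example.com/blog/seo-testing')); // []
```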

Hreflang Tags (Multilingual Sites)

For sites with multiple language versions:

<link rel="alternate" hreflang="en" href="https://example.com/page" />
<link rel="alternate" hreflang="es" href="https://example.com/es/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/page" />

Test that:

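Every language version references itself and all other versions
x-default points to the fallback page (usually the primary language version)
URLs are absolute, not relative
Language codes are valid ISO 639-1, optionally followed by an ISO 3166-1 alpha-2 region code (for example en-GB)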
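
Missing return links are easy to overlook by eye, so a small script can help. In this sketch, findMissingReturnLinks and looksLikeHreflang are hypothetical helpers, and the input (a map from page URL to the hreflang entries found on that page) is assumed to have been gathered by a crawl. The code check validates only the shape of the value, not whether the language or region actually exists.

```javascript
// pages: { [pageUrl]: [{ hreflang, href }, ...] }
// Reports every case where page A lists page B as an alternate
// but page B does not list page A in return.
function findMissingReturnLinks(pages) {
  const missing = [];
  for (const [url, alternates] of Object.entries(pages)) {
    for (const { href } of alternates) {
      if (href === url) continue; // self-reference, nothing to verify
      const back = pages[href];
      if (!back || !back.some(alt => alt.href === url)) {
        missing.push({ from: href, to: url });
      }
    }
  }
  return missing;
}

// Shape check only: "x-default", or a two-letter language code with an
// optional four-letter script and an optional two-letter region.
function looksLikeHreflang(code) {
  return /^(?:x-default|[a-z]{2}(?:-[a-z]{4})?(?:-[a-z]{2})?)$/i.test(code);
}
```

Each entry in the result reads as "the page at from should link back to the page at to".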

Open Graph and Twitter Meta Tags

Control how pages appear when shared on social media:

<meta property="og:title" content="Page Title" />
<meta property="og:description" content="Description" />
<meta property="og:image" content="https://example.com/image.jpg" />
<meta property="og:url" content="https://example.com/page" />
<meta name="twitter:card" content="summary_large_image" />

Test that OG images exist, have correct dimensions (1200x630px recommended), and URLs are absolute.
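
Presence and absolute-URL checks could look like the sketch below. It assumes the og: values have already been read from the page into a plain object, and checkOpenGraph is a hypothetical helper name. Image dimensions are not checked here because that requires downloading the image.

```javascript
// The four Open Graph properties shown in the example above.
const CHECKED_OG_PROPERTIES = ['og:title', 'og:description', 'og:image', 'og:url'];

// True only for absolute http(s) URLs.
function isAbsoluteHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // relative or malformed
  }
}

// tags: { 'og:title': '...', 'og:image': '...', ... }
function checkOpenGraph(tags) {
  const issues = [];
  for (const prop of CHECKED_OG_PROPERTIES) {
    if (!tags[prop]) issues.push(`${prop} missing`);
  }
  for (const prop of ['og:image', 'og:url']) {
    if (tags[prop] && !isAbsoluteHttpUrl(tags[prop])) {
      issues.push(`${prop} is not an absolute URL`);
    }
  }
  return issues;
}
```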

Crawlability Testing

robots.txt

Located at /robots.txt, this file tells search engines what to crawl:

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/
Sitemap: https://example.com/sitemap.xml

Critical tests:

Production robots.txt does not contain Disallow: / (blocks entire site)
Staging/dev robots.txt DOES block crawling (prevent indexing test environments)
Important pages are not accidentally disallowed
Sitemap URL is correct and accessible
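
The first of these checks can be automated against the file's text. The parser below is a simplified sketch, not a complete robots.txt implementation: parseRobotsTxt and blocksWholeSite are hypothetical names, only the wildcard user agent is considered, and precedence between conflicting Allow and Disallow rules is deliberately ignored.

```javascript
// Splits robots.txt into user-agent groups and collects Sitemap lines.
function parseRobotsTxt(text) {
  const groups = [];
  const sitemaps = [];
  let current = null;
  let previousWasAgent = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group of rules.
      if (!previousWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      previousWasAgent = true;
    } else if (field === 'allow' || field === 'disallow') {
      if (current) current.rules.push({ type: field, path: value });
      previousWasAgent = false;
    } else if (field === 'sitemap') {
      sitemaps.push(value);
    }
  }
  return { groups, sitemaps };
}

// True when the wildcard group contains a bare "Disallow: /".
function blocksWholeSite(text) {
  return parseRobotsTxt(text).groups
    .filter(group => group.agents.includes('*'))
    .some(group => group.rules.some(r => r.type === 'disallow' && r.path === '/'));
}
```

In a test suite, blocksWholeSite would be expected to return false for production and true for staging.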

XML Sitemap

Usually located at /sitemap.xml. Test that:

All important pages are included
No 404 or redirect URLs in the sitemap
lastmod dates are accurate
Sitemap is valid XML (use a validator)
Sitemap is referenced in robots.txt
For large sites: sitemap index links to sub-sitemaps correctly
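
Some sitemap problems can be caught offline from the XML text. In this sketch, extractSitemapUrls and findSitemapIssues are hypothetical helpers, the loc extraction is regex-based instead of using a real XML parser, and the expected origin is supplied by the caller. Detecting 404s and redirects would additionally require requesting each URL.

```javascript
// Pulls every <loc> value out of a sitemap and decodes "&amp;".
function extractSitemapUrls(xml) {
  return [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)]
    .map(match => match[1].replace(/&amp;/g, '&'));
}

// Flags duplicates, non-absolute URLs, and URLs on the wrong origin
// (for example http instead of https, or a staging host).
function findSitemapIssues(urls, expectedOrigin) {
  const issues = [];
  const seen = new Set();
  for (const url of urls) {
    if (seen.has(url)) issues.push(`duplicate: ${url}`);
    seen.add(url);
    let parsed;
    try {
      parsed = new URL(url);
    } catch {
      issues.push(`not absolute: ${url}`);
      continue;
    }
    if (parsed.origin !== expectedOrigin) issues.push(`unexpected origin: ${url}`);
  }
  return issues;
}
```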

Noindex/Nofollow

<meta name="robots" content="noindex, nofollow" />

Test that:

Production pages do NOT have accidental noindex tags
Pages that should be excluded (admin, thank-you pages) DO have noindex
The X-Robots-Tag HTTP header is not set to noindex on public pages
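
Because a noindex can arrive through either the meta tag or the response header, it helps to evaluate both together. A minimal sketch, assuming the two values have already been read from the page and the response; isNoindexed is a hypothetical helper. It treats a directive scoped to any single crawler (such as "googlebot: noindex") as a noindex, which is a deliberately strict choice.

```javascript
// Splits a robots value such as "noindex, nofollow" or
// "googlebot: noindex" into lowercase tokens.
function robotsDirectives(value) {
  return (value || '').toLowerCase().split(/[\s,:]+/).filter(Boolean);
}

// True when either source asks search engines not to index the page.
// "none" is shorthand for "noindex, nofollow".
function isNoindexed({ metaRobots = '', xRobotsTag = '' }) {
  const all = [...robotsDirectives(metaRobots), ...robotsDirectives(xRobotsTag)];
  return all.includes('noindex') || all.includes('none');
}

console.log(isNoindexed({ metaRobots: 'noindex, nofollow' })); // true
console.log(isNoindexed({ metaRobots: 'index, follow' }));     // false
```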

Structured Data Testing

What to Test

Structured data uses Schema.org vocabulary to describe page content:

Page Type	Schema Type	Key Properties
Article	Article / BlogPosting	headline, author, datePublished, image
Product	Product	name, offers (price, availability), review
FAQ	FAQPage	question, answer pairs
Breadcrumbs	BreadcrumbList	itemListElement chain
Organization	Organization	name, logo, contactPoint

Validation Tools

Google Rich Results Test (search.google.com/test/rich-results)
Schema.org Validator (validator.schema.org)
View source and search for application/ld+json or itemscope
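
For a quick automated pass before using the validators, JSON-LD blocks can be extracted and compared against the key properties from the table. This is a rough sketch under stated assumptions: extractJsonLd and missingProperties are hypothetical helpers, the property lists are copied from the table above and are not an official requirement list, and @graph containers or array-valued @type are not handled.

```javascript
// Key properties per schema type, taken from the table above.
const EXPECTED_PROPERTIES = {
  Article: ['headline', 'author', 'datePublished', 'image'],
  BlogPosting: ['headline', 'author', 'datePublished', 'image'],
  Organization: ['name', 'logo', 'contactPoint'],
};

// Finds every application/ld+json script and parses its contents.
// Invalid JSON is reported instead of aborting the whole check.
function extractJsonLd(html) {
  const re = /<script\b[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const items = [];
  const parseErrors = [];
  for (const match of html.matchAll(re)) {
    try {
      const data = JSON.parse(match[1]);
      items.push(...(Array.isArray(data) ? data : [data]));
    } catch (err) {
      parseErrors.push(err.message);
    }
  }
  return { items, parseErrors };
}

// Lists the expected properties that a structured data item lacks.
function missingProperties(item) {
  const expected = EXPECTED_PROPERTIES[item['@type']] || [];
  return expected.filter(prop => !(prop in item));
}
```

A non-empty parseErrors list is worth reporting on its own, since a single syntax error makes the whole block unreadable to search engines.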

Exercise: SEO Audit of a Web Page

Perform a technical SEO audit on a page from your project or any public website.

Step 1: Meta Tags Audit

Open the page and inspect the <head> section in DevTools. Document:

Element	Present?	Value	Issues
Title			Length? Unique?
Meta description			Length?
Canonical			Correct URL?
OG title			Matches page?
OG description
OG image			Valid URL? Dimensions?
Hreflang (if multilingual)			All versions?

Step 2: Crawlability Check

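# Check robots.txt
curl -s https://example.com/robots.txt

# Check sitemap (first 50 lines)
curl -s https://example.com/sitemap.xml | head -50

# Check for noindex in the HTML
curl -s https://example.com/page | grep -i "noindex"

# Check for noindex in the X-Robots-Tag response header
curl -sI https://example.com/page | grep -i "x-robots-tag"

# Check canonical
curl -s https://example.com/page | grep -i "canonical"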

Step 3: Structured Data Validation

Copy the page URL
Open Google Rich Results Test
Paste the URL and run the test
Document: What schema types are detected? Any errors or warnings?

Step 4: Link Audit

Check internal and external links on the page:

Are there any broken links (404)?
Do all links have descriptive anchor text (not “click here”)?
Are external links using rel="noopener" or rel="nofollow" where appropriate?
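
These three questions can be turned into a small audit function. The sketch below assumes the links have already been collected from the page as plain objects (href, text, rel, and optionally an HTTP status gathered by a separate link checker); auditLinks and the list of generic phrases are illustrative, not a standard.

```javascript
// Anchor texts that say nothing about the link target (illustrative list).
const GENERIC_ANCHORS = new Set(['click here', 'here', 'read more', 'more', 'link']);

// links: [{ href, text, rel, status }] where rel and status are optional.
function auditLinks(links, pageUrl) {
  const pageOrigin = new URL(pageUrl).origin;
  const findings = [];
  for (const { href, text = '', rel = '', status } of links) {
    let url;
    try {
      url = new URL(href, pageUrl); // resolves relative links
    } catch {
      findings.push({ href, issue: 'invalid URL' });
      continue;
    }
    if (url.protocol !== 'http:' && url.protocol !== 'https:') continue;
    if (status !== undefined && status >= 400) {
      findings.push({ href, issue: `broken link (${status})` });
    }
    if (GENERIC_ANCHORS.has(text.trim().toLowerCase())) {
      findings.push({ href, issue: 'generic anchor text' });
    }
    const relTokens = rel.toLowerCase().split(/\s+/).filter(Boolean);
    if (url.origin !== pageOrigin && !relTokens.includes('noopener')) {
      findings.push({ href, issue: 'external link without rel="noopener"' });
    }
  }
  return findings;
}
```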

Solution: SEO Audit Checklist Template

Page: https://example.com/blog/cypress-tutorial

Meta Tags:

Title: “Cypress Tutorial for Beginners” (30 chars) - WARN: Could be longer
Description: “Learn Cypress testing…” (145 chars) - OK
Canonical: https://example.com/blog/cypress-tutorial - OK
OG image: Present, 1200x630 - OK
Hreflang: EN, ES, RU - OK, x-default points to EN

Crawlability:

robots.txt: Does not block /blog/ - OK
Sitemap: Page included with correct lastmod - OK
No noindex tag - OK
HTTP redirects to HTTPS - OK

Structured Data:

BlogPosting schema detected - OK
Missing dateModified - WARN
Author schema present - OK
BreadcrumbList present - OK

Links:

2 broken internal links found - BUG
1 external link without rel="noopener" - WARN
All anchor text is descriptive - OK

Priority Fixes:

Fix 2 broken internal links (high impact)
Add dateModified to structured data (medium impact)
Extend title to 50-60 characters (low impact)
Add rel="noopener" to external link (low impact)

Automating SEO Checks

Integrate SEO validation into your test suite:

// Example: Playwright SEO checks
// Assumes baseURL is set in the Playwright config so relative paths resolve.
import { test, expect } from '@playwright/test';

test('page has valid SEO meta tags', async ({ page }) => {
  await page.goto('/blog/my-article');

  // Title exists and has proper length
  const title = await page.title();
  expect(title.length).toBeGreaterThan(30);
  expect(title.length).toBeLessThan(65);

  // Meta description exists and has a reasonable length
  const description = await page
    .locator('meta[name="description"]')
    .getAttribute('content');
  expect(description).not.toBeNull();
  expect(description.length).toBeGreaterThan(100);
  expect(description.length).toBeLessThan(165);

  // Canonical tag exists and matches current URL
  const canonical = await page
    .locator('link[rel="canonical"]')
    .getAttribute('href');
  expect(canonical).toContain('/blog/my-article');

  // No noindex on production pages
  await expect(
    page.locator('meta[name="robots"][content*="noindex"]')
  ).toHaveCount(0);
});

Common SEO Bugs Found by QA

Staging noindex leaked to production — The most dangerous bug. Always verify robots meta after deployment.
Canonical pointing to wrong URL — Especially after URL migrations or redesigns.
Missing hreflang reciprocal links — Language A links to B, but B does not link back to A.
Duplicate title tags — Template defaults not overridden on individual pages.
Sitemap including 301/404 URLs — Sitemap not regenerated after URL changes.

Key Takeaways

QA engineers should test SEO elements as part of routine web testing
The most critical checks are: title tags, canonical URLs, noindex directives, and robots.txt
Structured data validation ensures rich snippets display correctly in search results
Always verify that staging/dev configurations do not leak to production
Automate SEO checks in your test suite to catch regressions early
Use Google Rich Results Test and the Schema.org Validator as validation tools
