Why QA Engineers Need to Test SEO
SEO (Search Engine Optimization) directly impacts how many users find a website through search engines. A single misconfigured meta tag, a broken canonical URL, or an accidental noindex directive can cause pages to disappear from search results, potentially costing thousands of visitors.
QA engineers are in a unique position to catch SEO issues because they already test the HTML output, verify page behavior, and check edge cases that developers might miss. Technical SEO testing fits naturally into the web testing workflow.
Essential SEO Elements to Test
Title Tags
The <title> tag appears in search results and browser tabs. Test:
- Every page has exactly one non-empty title tag
- Title length is roughly 50-60 characters (search engines truncate by display width, so longer titles usually get cut off)
- Title contains the primary keyword naturally
- Title does not duplicate other pages
- Dynamic pages generate correct titles (product name, category)
<!-- Good -->
<title>Cypress Tutorial for Beginners: Complete Guide 2025 | YourSite</title>
<!-- Bad: Too long, will be truncated -->
<title>The Complete and Comprehensive Cypress Tutorial for Beginners Who Want to Learn Test Automation from Scratch in 2025</title>
<!-- Bad: Generic, not unique -->
<title>Page</title>
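The checks above can be sketched as a small helper that runs over titles collected from a crawl. This is a minimal sketch, not a definitive implementation: `findTitleIssues` and the `{ url, title }` input shape are hypothetical, and the 50-60 range is the guideline from the list above rather than a hard limit.

```javascript
// Hypothetical helper: flags missing, too short, too long, and duplicate titles.
// `pages` is an assumed input shape: [{ url, title }, ...] gathered by your crawler or tests.
function findTitleIssues(pages, { min = 50, max = 60 } = {}) {
  const issues = [];
  const seen = new Map(); // lowercased title -> first URL that used it

  for (const { url, title } of pages) {
    const text = (title ?? '').trim();
    if (text === '') {
      issues.push({ url, issue: 'missing title' });
      continue;
    }
    if (text.length < min) issues.push({ url, issue: `title too short (${text.length})` });
    if (text.length > max) issues.push({ url, issue: `title too long (${text.length})` });

    const key = text.toLowerCase();
    if (seen.has(key)) {
      issues.push({ url, issue: `duplicate of ${seen.get(key)}` });
    } else {
      seen.set(key, url);
    }
  }
  return issues;
}
```

Duplicates are compared case-insensitively here; whether that is the right rule for your site is a judgment call.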
Meta Description
The meta description usually appears as the snippet below the title in search results. Check that each description:
- Is present on every page (roughly 150-160 characters)
- Is unique per page (no duplicates)
- Contains a call to action or value proposition
- Includes the target keyword naturally
Canonical Tags
A canonical tag tells search engines which URL is the preferred version of a page, which prevents duplicate-content issues:
<link rel="canonical" href="https://example.com/blog/seo-testing" />
Test that:
- Every page has a canonical tag
- The canonical URL points directly to the final page URL (not to a URL that redirects)
- Paginated pages have correct canonicals
- HTTP pages declare the HTTPS version as canonical
- Trailing slashes are used consistently (choose one pattern)
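Some of these canonical checks are purely about the shape of the URL and can be expressed without a browser. The sketch below assumes a hypothetical `checkCanonical` helper and a `trailingSlash` option describing the pattern your site has chosen; it relies only on the standard `URL` class.

```javascript
// Hypothetical helper: validates the raw href value of a canonical tag.
// `trailingSlash` states the site-wide pattern you decided on (assumption: no trailing slash).
function checkCanonical(canonicalHref, { trailingSlash = false } = {}) {
  let canonical;
  try {
    canonical = new URL(canonicalHref); // throws for relative or malformed URLs
  } catch {
    return ['canonical is not an absolute URL'];
  }

  const problems = [];
  if (canonical.protocol !== 'https:') {
    problems.push('canonical is not HTTPS');
  }

  const path = canonical.pathname;
  // The root path "/" always ends with a slash, so it is exempt from the pattern check.
  if (path !== '/' && path.endsWith('/') !== trailingSlash) {
    problems.push('trailing slash does not match site pattern');
  }
  return problems;
}
```

When feeding this from a browser test, read the attribute with `getAttribute('href')` rather than the `href` property, because the property is always resolved to an absolute URL. Redirect chains still have to be verified with a real HTTP request.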
Hreflang Tags (Multilingual Sites)
For sites with multiple language versions:
<link rel="alternate" hreflang="en" href="https://example.com/page" />
<link rel="alternate" hreflang="es" href="https://example.com/es/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/page" />
Test that:
- Every language version references all other versions
- x-default points to the primary language
- URLs are absolute, not relative
- Language codes are valid ISO 639-1
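The first check, that every language version references the others, is easy to get wrong by eye. One way to sketch it, assuming you have already collected each page's hreflang entries into a plain object (the `hreflangMap` shape and the `findMissingReturnLinks` name are hypothetical):

```javascript
// Hypothetical helper: finds hreflang links that are not reciprocated.
// hreflangMap is an assumed shape: { pageUrl: { langCode: alternateUrl, ... }, ... }
function findMissingReturnLinks(hreflangMap) {
  const missing = [];
  for (const [pageUrl, alternates] of Object.entries(hreflangMap)) {
    for (const [lang, altUrl] of Object.entries(alternates)) {
      if (altUrl === pageUrl) continue; // self-references need no return link

      const back = hreflangMap[altUrl];
      const linksBack = back !== undefined && Object.values(back).includes(pageUrl);
      if (!linksBack) missing.push({ from: pageUrl, to: altUrl, lang });
    }
  }
  return missing;
}

// Example: the English page lists Spanish, but the Spanish page only lists itself.
const report = findMissingReturnLinks({
  'https://example.com/page': {
    en: 'https://example.com/page',
    es: 'https://example.com/es/page',
    'x-default': 'https://example.com/page',
  },
  'https://example.com/es/page': {
    es: 'https://example.com/es/page',
  },
});
console.log(report);
```

A page that was never crawled counts as not linking back, so an incomplete crawl will produce false positives.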
Open Graph and Twitter Meta Tags
Control how pages appear when shared on social media:
<meta property="og:title" content="Page Title" />
<meta property="og:description" content="Description" />
<meta property="og:image" content="https://example.com/image.jpg" />
<meta property="og:url" content="https://example.com/page" />
<meta name="twitter:card" content="summary_large_image" />
Test that OG images exist and have the correct dimensions (1200x630px recommended), and that all URLs are absolute.
Crawlability Testing
robots.txt
Located at /robots.txt, this file tells search engines what to crawl:
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/
Sitemap: https://example.com/sitemap.xml
Critical tests:
- Production robots.txt does not contain Disallow: / (blocks entire site)
- Staging/dev robots.txt DOES block crawling (prevent indexing test environments)
- Important pages are not accidentally disallowed
- Sitemap URL is correct and accessible
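The first two tests are mirror images of each other, so a single check can serve both environments. The sketch below is a deliberately simplified robots.txt reader, not a full parser: `blocksEntireSite` is a hypothetical name, and it only looks for a literal `Disallow: /` rule in the group that applies to the given user agent, ignoring Allow overrides and wildcard patterns.

```javascript
// Hypothetical helper: does the group for `userAgent` contain a bare "Disallow: /"?
// Simplification: Allow rules and path wildcards are not evaluated.
function blocksEntireSite(robotsTxt, userAgent = '*') {
  const wanted = userAgent.toLowerCase();
  let agents = [];          // user agents of the group currently being read
  let readingRules = false; // true once the current group has at least one rule
  let blocked = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split('#')[0].trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;

    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === 'user-agent') {
      // A User-agent line after rules starts a new group.
      if (readingRules) {
        agents = [];
        readingRules = false;
      }
      agents.push(value.toLowerCase());
    } else if (field === 'allow' || field === 'disallow') {
      readingRules = true;
      if (field === 'disallow' && value === '/' && agents.includes(wanted)) {
        blocked = true;
      }
    }
  }
  return blocked;
}
```

In a test suite you would expect this to return `false` for the production file and `true` for staging.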
XML Sitemap
Typically located at /sitemap.xml. Check the following:
- All important pages are included
- No 404 or redirect URLs in the sitemap
- lastmod dates are accurate
- Sitemap is valid XML (use a validator)
- Sitemap is referenced in robots.txt
- For large sites: sitemap index links to sub-sitemaps correctly
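The "no 404 or redirect URLs" check lends itself to automation. The following is a rough sketch under a few assumptions: `extractSitemapUrls` and `findBadSitemapEntries` are hypothetical names, the regex-based extraction ignores CDATA sections and sitemap index files, and the status check assumes Node 18+ where a global `fetch` with `redirect: 'manual'` exposes the 3xx status.

```javascript
// Hypothetical helper: pulls every <loc> value out of a sitemap document.
// A regex is enough for a sketch; a real XML parser is safer for production use.
function extractSitemapUrls(xml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  for (const match of xml.matchAll(locPattern)) {
    urls.push(match[1].replace(/&amp;/g, '&')); // undo the XML escaping of "&"
  }
  return urls;
}

// Hypothetical helper: reports sitemap URLs that do not answer with a plain 200.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function findBadSitemapEntries(urls, fetchImpl = fetch) {
  const bad = [];
  for (const url of urls) {
    const res = await fetchImpl(url, { method: 'HEAD', redirect: 'manual' });
    if (res.status !== 200) bad.push({ url, status: res.status });
  }
  return bad;
}
```

Requests are sent one at a time to stay gentle on the server; some servers also reject HEAD, in which case GET is the fallback.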
Noindex/Nofollow
<meta name="robots" content="noindex, nofollow" />
Test that:
- Production pages do NOT have accidental noindex tags
- Pages that should be excluded (admin, thank-you pages) DO have noindex
- The X-Robots-Tag HTTP header is not set to noindex on public pages
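Because noindex can arrive through either the meta tag or the HTTP header, it helps to evaluate both in one place. A minimal sketch, assuming a hypothetical `isNoindexed` helper that receives the two raw values (either may be missing); it also treats `none` as noindex, since that value is shorthand for `noindex, nofollow`. Header values scoped to a specific crawler, such as `googlebot: noindex`, are not handled here.

```javascript
// Hypothetical helper: true if either source carries a noindex directive.
// metaRobots  = content of <meta name="robots">, or null/undefined if the tag is absent
// xRobotsTag  = value of the X-Robots-Tag response header, or null/undefined if absent
function isNoindexed(metaRobots, xRobotsTag) {
  const directives = [metaRobots, xRobotsTag]
    .filter(Boolean)               // drop missing sources
    .join(',')
    .toLowerCase()
    .split(',')
    .map((directive) => directive.trim());

  return directives.includes('noindex') || directives.includes('none');
}
```

The same function covers both expectations: public pages should yield `false`, excluded pages such as admin or thank-you pages should yield `true`.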
Structured Data Testing
What to Test
Structured data uses Schema.org vocabulary to describe page content:
| Page Type | Schema Type | Key Properties |
|---|---|---|
| Article | Article / BlogPosting | headline, author, datePublished, image |
| Product | Product | name, offers (price, availability), review |
| FAQ | FAQPage | mainEntity (Question and acceptedAnswer pairs) |
| Breadcrumbs | BreadcrumbList | itemListElement chain |
| Organization | Organization | name, logo, contactPoint |
Validation Tools
- Google Rich Results Test (search.google.com/test/rich-results)
- Schema.org Validator (validator.schema.org)
- View source and search for application/ld+json or itemscope
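The view-source step can also be scripted as a quick pre-check before using the online validators. The sketch below assumes hypothetical helpers `extractJsonLd` and `findMissingProps`; the expected properties are taken from the "Key Properties" column above and are this article's checklist, not an official requirement list. Nested `@graph` structures and array-valued `@type` are not handled.

```javascript
// Expected properties per schema type, copied from the table above (an assumption, not a spec).
const EXPECTED_PROPS = {
  Article: ['headline', 'author', 'datePublished', 'image'],
  BlogPosting: ['headline', 'author', 'datePublished', 'image'],
  Organization: ['name', 'logo', 'contactPoint'],
};

// Hypothetical helper: parses every JSON-LD script block found in an HTML string.
function extractJsonLd(html) {
  const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const items = [];
  for (const match of html.matchAll(pattern)) {
    try {
      const data = JSON.parse(match[1]);
      items.push(...(Array.isArray(data) ? data : [data]));
    } catch {
      items.push({ '@type': 'INVALID_JSON' }); // surface broken blocks instead of hiding them
    }
  }
  return items;
}

// Hypothetical helper: lists expected properties that the item does not define.
function findMissingProps(item, expected = EXPECTED_PROPS) {
  const props = expected[item['@type']] ?? [];
  return props.filter((prop) => !(prop in item));
}
```

A clean result here does not guarantee rich results eligibility; the Rich Results Test remains the authority on that.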
Exercise: SEO Audit of a Web Page
Perform a technical SEO audit on a page from your project or any public website.
Step 1: Meta Tags Audit
Open the page and inspect the <head> section in DevTools. Document:
| Element | Present? | Value | Issues |
|---|---|---|---|
| Title | | | Length? Unique? |
| Meta description | | | Length? |
| Canonical | | | Correct URL? |
| OG title | | | Matches page? |
| OG description | | | |
| OG image | | | Valid URL? Dimensions? |
| Hreflang (if multilingual) | | | All versions? |
Step 2: Crawlability Check
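# Check robots.txt
curl -s https://example.com/robots.txt
# Check sitemap (first 50 lines)
curl -s https://example.com/sitemap.xml | head -50
# Check for noindex in the HTML
curl -s https://example.com/page | grep -i "noindex"
# Check for noindex in the X-Robots-Tag response header
curl -sI https://example.com/page | grep -i "x-robots-tag"
# Check canonical
curl -s https://example.com/page | grep -i "canonical"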
Step 3: Structured Data Validation
- Copy the page URL
- Open Google Rich Results Test
- Paste the URL and run the test
- Document: What schema types are detected? Any errors or warnings?
Step 4: Link Audit
Check internal and external links on the page:
- Are there any broken links (404)?
- Do all links have descriptive anchor text (not “click here”)?
- Are external links using rel="noopener" or rel="nofollow" where appropriate?
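The anchor-text and rel checks can be approximated in code once the links have been collected, for example with a browser test that reads every `a` element. This is only a sketch: `auditLinks`, the `{ href, text, rel, target }` input shape, and the list of vague phrases are all assumptions to adapt. Broken-link detection is left out because it needs real HTTP requests.

```javascript
// Assumed list of anchor texts that say nothing about the destination.
const VAGUE_ANCHORS = new Set(['click here', 'here', 'read more', 'more', 'link']);

// Hypothetical helper: flags vague anchor text and new-tab links without noopener.
// `links` is an assumed shape: [{ href, text, rel, target }, ...]
function auditLinks(links) {
  const findings = [];
  for (const { href, text, rel, target } of links) {
    const label = (text ?? '').trim().toLowerCase();
    if (label === '' || VAGUE_ANCHORS.has(label)) {
      findings.push({ href, issue: 'non-descriptive anchor text' });
    }

    const relTokens = (rel ?? '').toLowerCase().split(/\s+/);
    // noreferrer implies noopener, so either token is accepted.
    const isProtected = relTokens.includes('noopener') || relTokens.includes('noreferrer');
    if (target === '_blank' && !isProtected) {
      findings.push({ href, issue: 'opens new tab without rel="noopener"' });
    }
  }
  return findings;
}
```

Image-only links will be reported as having empty anchor text; checking their alt text would be a reasonable extension.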
Solution: SEO Audit Checklist Template
Page: https://example.com/blog/cypress-tutorial
Meta Tags:
- Title: “Cypress Tutorial for Beginners” (30 chars) - WARN: Could be longer
- Description: “Learn Cypress testing…” (145 chars) - OK
- Canonical: https://example.com/blog/cypress-tutorial - OK
- OG image: Present, 1200x630 - OK
- Hreflang: EN, ES, RU - OK, x-default points to EN
Crawlability:
- robots.txt: Does not block /blog/ - OK
- Sitemap: Page included with correct lastmod - OK
- No noindex tag - OK
- HTTP redirects to HTTPS - OK
Structured Data:
- BlogPosting schema detected - OK
- Missing dateModified - WARN
- Author schema present - OK
- BreadcrumbList present - OK
Links:
- 2 broken internal links found - BUG
- 1 external link without rel="noopener" - WARN
- All anchor text is descriptive - OK
Priority Fixes:
- Fix 2 broken internal links (high impact)
- Add dateModified to structured data (medium impact)
- Extend title to 50-60 characters (low impact)
- Add rel="noopener" to external link (low impact)
Automating SEO Checks
Integrate SEO validation into your test suite:
// Example: Playwright SEO checks
// Assumes baseURL is set in the Playwright config so relative paths resolve.
import { test, expect } from '@playwright/test';

test('page has valid SEO meta tags', async ({ page }) => {
  await page.goto('/blog/my-article');

  // Title exists and has proper length
  const title = await page.title();
  expect(title.length).toBeGreaterThan(30);
  expect(title.length).toBeLessThan(65);

  // Meta description exists and has proper length
  const description = await page.$eval(
    'meta[name="description"]',
    el => el.content
  );
  expect(description.length).toBeGreaterThan(100);
  expect(description.length).toBeLessThan(165);

  // Canonical tag exists and matches current URL
  const canonical = await page.$eval(
    'link[rel="canonical"]',
    el => el.href
  );
  expect(canonical).toContain('/blog/my-article');

  // No noindex on production pages
  const robots = await page.$('meta[name="robots"][content*="noindex"]');
  expect(robots).toBeNull();
});
Common SEO Bugs Found by QA
- Staging noindex leaked to production — The most dangerous bug. Always verify robots meta after deployment.
- Canonical pointing to wrong URL — Especially after URL migrations or redesigns.
- Missing hreflang reciprocal links — Language A links to B, but B does not link back to A.
- Duplicate title tags — Template defaults not overridden on individual pages.
- Sitemap including 301/404 URLs — Sitemap not regenerated after URL changes.
Key Takeaways
- QA engineers should test SEO elements as part of routine web testing
- The most critical checks are: title tags, canonical URLs, noindex directives, and robots.txt
- Structured data validation ensures rich snippets display correctly in search results
- Always verify that staging/dev configurations do not leak to production
- Automate SEO checks in your test suite to catch regressions early
- Use Google Rich Results Test and the Schema.org Validator as validation tools