Improve SEO #133

@Musilah

Severity Levels

Each SEO checklist item below is assigned one of these severity levels:

  • P0 — Critical: Must be correct before production launch or major content release.
  • P1 — High: Should be part of every SEO rollout and backlog for all active sites.
  • P2 — Recommended: Strong improvements that raise quality, trust, and long-term discoverability.

P0 — Required On Every Indexable Page

1. Every page must have a unique title, description, canonical URL, and H1

Checklist:

  • Unique <title> for every indexable page.
  • Unique meta description for every indexable page.
  • Canonical URL pointing to the final production URL, always using HTTPS.
  • One clear H1 that includes the page's real topic, product, or target keyword.
  • Homepage H1 must contain actual brand/product terms, not only slogan language.

Rules:

  • Never point canonicals to ngrok, localhost, preview or dev domains.
  • Never reuse the homepage title/description on product pages or docs index pages.
  • Never let /docs, /about, /product, and / all share the same H1 structure.
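
The page-level checks above can be automated. The following is a minimal sketch, assuming rendered HTML is available as a string; the names HeadAudit, audit_page, and NON_PRODUCTION_HINTS are hypothetical, and the host hints are placeholders to adapt per site. It does not check uniqueness across pages, which requires comparing the collected titles and descriptions for the whole site.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical substrings that mark a non-production host; adjust per site.
NON_PRODUCTION_HINTS = ("localhost", "ngrok", "preview", "staging")


class HeadAudit(HTMLParser):
    """Collects the title, meta description, canonical link, and H1 count."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content")
        elif tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "h1":
            self.h1_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def audit_page(html):
    """Return a list of problems found in one page's HTML."""
    audit = HeadAudit()
    audit.feed(html)
    problems = []
    if not audit.title.strip():
        problems.append("missing <title>")
    if not audit.description:
        problems.append("missing meta description")
    if not audit.canonical:
        problems.append("missing canonical")
    elif not audit.canonical.startswith("https://"):
        problems.append("canonical is not HTTPS")
    else:
        host = urlparse(audit.canonical).hostname or ""
        if any(hint in host for hint in NON_PRODUCTION_HINTS):
            problems.append("canonical points to a non-production host")
    if audit.h1_count != 1:
        problems.append(f"expected one H1, found {audit.h1_count}")
    return problems
```

An empty list means the page passed these checks; anything else can fail a CI step before merge.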

2. Every page must ship production-safe social metadata

Checklist:

  • og:title, og:description, og:url, og:image
  • twitter:card, twitter:title, twitter:description, twitter:image
  • Image should be stable, production-hosted, and ideally 1200x630.
  • og:url must match the canonical URL.
  • Use the correct og:type for the page where practical:
    • website for generic pages
    • article for articles/docs posts if you maintain article metadata
    • product for product pages

Rules:

  • Never use preview-host image URLs in social metadata.
  • Never inherit homepage OG values on all child pages.
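
As an illustration of the social metadata checklist, the sketch below collects meta tags from a page and reports any required tag that is missing, plus any mismatch between og:url and the canonical URL. SocialMetaParser and check_social_meta are hypothetical names, and the check assumes Open Graph tags use the property attribute and Twitter tags use name, which is the common convention.

```python
from html.parser import HTMLParser

REQUIRED_TAGS = (
    "og:title", "og:description", "og:url", "og:image",
    "twitter:card", "twitter:title", "twitter:description", "twitter:image",
)


class SocialMetaParser(HTMLParser):
    """Collects meta tag values keyed by property/name, plus the canonical URL."""

    def __init__(self):
        super().__init__()
        self.tags = {}
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key and "content" in attrs:
                self.tags[key] = attrs["content"]
        elif tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")


def check_social_meta(html):
    """Return a list of missing or inconsistent social metadata."""
    parser = SocialMetaParser()
    parser.feed(html)
    problems = [f"missing {name}" for name in REQUIRED_TAGS if not parser.tags.get(name)]
    og_url = parser.tags.get("og:url")
    if og_url and parser.canonical and og_url != parser.canonical:
        problems.append("og:url does not match canonical")
    return problems
```

The sketch does not verify image dimensions or hosting; those would need a separate fetch of the og:image URL.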

3. Every page must include valid, page-appropriate structured data

Why it matters:

  • JSON-LD is one of the clearest machine-readable trust signals for Google and AI systems.
  • Broken or misleading schema is worse than no schema.

Minimum schema expectations:

  • Homepage:
    • Organization
    • WebSite
    • SoftwareApplication or Product where applicable
  • Product pages:
    • Product
    • BreadcrumbList
  • Documentation/article pages:
    • TechArticle or Article
    • BreadcrumbList
  • FAQ-heavy pages:
    • FAQPage

Rules:

  • Validate that JSON-LD actually renders as JSON, not a literal string template.
  • Use real production URLs in schema.
  • Only claim implemented features and supported protocols.
  • Only emit datePublished, dateModified, and similar fields if they will be kept accurate.
  • Only keep SearchAction if the target is a real, crawlable search experience.

Practical standard:

  • Every indexable page should have at least one relevant schema block.
  • Sitewide entity schema should connect with @id consistently.
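
The rule that JSON-LD must render as JSON rather than a literal string template can be tested mechanically. The sketch below is one possible check, assuming rendered HTML as input; JsonLdExtractor and check_json_ld are hypothetical names. It only confirms that each block parses and that each item declares an @type; it does not validate the schema against schema.org vocabulary.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def check_json_ld(html):
    """Return a list of problems with the page's JSON-LD blocks."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    if not extractor.blocks:
        return ["no JSON-LD block found"]
    problems = []
    for raw in extractor.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            # Typical cause: an unrendered template expression in the script tag.
            problems.append("JSON-LD block is not valid JSON")
            continue
        if isinstance(data, dict) and isinstance(data.get("@graph"), list):
            items = data["@graph"]
        elif isinstance(data, list):
            items = data
        else:
            items = [data]
        for item in items:
            if not isinstance(item, dict) or "@type" not in item:
                problems.append("JSON-LD item has no @type")
    return problems
```

A double-encoded block (a JSON string containing JSON) parses successfully but is reported as having no @type, which catches a second common rendering mistake.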

4. Every page must be internally connected and free of dead references

Why it matters:

  • Removed pages that remain in nav, sitemap, llms files, or internal links waste crawl budget and hurt trust.
  • Docs and product ecosystems need deliberate internal linking for discoverability.

Checklist:

  • No internal links to removed pages.
  • No sitemap entries for removed pages.
  • No llms.txt or llms-full.txt references to dead URLs.
  • Every page should link to at least one related page:
    • product to docs
    • docs to product
    • homepage to strongest commercial or technical asset

Rules:

  • If a page is removed, remove or redirect every reference to it.
  • Redirect old high-value URLs if they have backlinks or are cited externally.
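
One way to catch dead references is to compare every URL listed in sitemap.xml and llms.txt against the set of pages that still exist. The sketch below assumes the live URL list comes from the build or router (that source is not specified here), and that llms.txt uses Markdown-style links; sitemap_urls, llms_urls, and dead_references are hypothetical helper names.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MARKDOWN_LINK = re.compile(r"\]\((https?://[^)\s]+)\)")


def sitemap_urls(xml_text):
    """Return every <loc> value from a standard sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def llms_urls(text):
    """Return every absolute URL used as a Markdown link target."""
    return MARKDOWN_LINK.findall(text)


def dead_references(live_urls, sitemap_xml, llms_text):
    """Map each source file to the URLs it lists that are no longer live."""
    live = set(live_urls)
    sources = (
        ("sitemap.xml", sitemap_urls(sitemap_xml)),
        ("llms.txt", llms_urls(llms_text)),
    )
    report = {}
    for source, urls in sources:
        dead = sorted(set(urls) - live)
        if dead:
            report[source] = dead
    return report
```

Navigation and in-page links can be added as further sources in the same way once they are extracted from the rendered HTML.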

P1 — Required On Every Site

5. Every site must have a clean crawl-control package

Why it matters:

  • Search engines and AI crawlers need a consistent discovery layer.
  • The most common org-wide failures are conflicting robots rules, missing sitemap entries, and missing AI-readable summaries.

Checklist:

  • robots.txt
  • sitemap.xml
  • llms.txt
  • llms-full.txt where useful for richer AI ingestion
  • Correct sitemap declaration in robots.txt
  • Preview/non-prod environments set to noindex or blocked appropriately

AI crawler policy:

  • If llms.txt exists, the robots strategy should not contradict it.
  • Do not advertise AI-readable content and then block GPTBot, ClaudeBot, PerplexityBot, or Google-Extended by mistake.
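
The crawler policy above can be checked against a robots.txt file with Python's standard urllib.robotparser module. This is a sketch under the assumption that the site wants all four listed AI crawlers allowed; audit_robots is a hypothetical name, and the standard-library parser may not interpret every rule exactly as each crawler does.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def audit_robots(robots_txt, page_url):
    """Return problems found in robots.txt for one representative page URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    problems = []
    # site_maps() returns None when no Sitemap line is declared.
    if not parser.site_maps():
        problems.append("no Sitemap declaration")
    for agent in AI_CRAWLERS:
        if not parser.can_fetch(agent, page_url):
            problems.append(f"{agent} is blocked from {page_url}")
    return problems
```

Running it against a page that llms.txt links to makes a contradiction between the two files visible before launch.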

6. Every site must have a reliable sitemap strategy

Why it matters:

  • The sitemap is not just a list of URLs; it is a canonical crawl plan.

Checklist:

  • Include all important indexable pages.
  • Exclude test pages, preview pages, dead docs pages, and temporary routes.
  • Match trailing slash behavior exactly with canonical behavior.
  • Include lastModified only if you can maintain it accurately.
  • If lastModified is noisy or wrong, omit it rather than publish inaccurate dates.

Rules:

  • Omit low-value fields such as changefreq and priority unless they serve a purpose in your setup.
  • Do not keep URLs in the sitemap that redirect unnecessarily.
  • Do not let sitemap URLs conflict with canonical URLs.
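
The sitemap rules reduce to three comparisons per URL: production host over HTTPS, consistent trailing slash, and agreement with the page's canonical. A minimal sketch follows, assuming the caller has already paired each sitemap URL with the canonical its page declares; audit_sitemap and its parameters are hypothetical.

```python
from urllib.parse import urlparse


def audit_sitemap(entries, production_host, trailing_slash=True):
    """Check sitemap URLs against site conventions.

    entries maps each sitemap URL to the canonical URL its page declares.
    trailing_slash states whether the site's convention is to end paths with "/".
    """
    problems = []
    for url, canonical in entries.items():
        parsed = urlparse(url)
        if parsed.scheme != "https" or parsed.hostname != production_host:
            problems.append(f"{url}: not an HTTPS production URL")
        if parsed.path.endswith("/") != trailing_slash:
            problems.append(f"{url}: trailing slash does not follow site convention")
        if url != canonical:
            problems.append(f"{url}: canonical is {canonical}")
    return problems
```

A sitemap URL whose canonical differs usually means the entry redirects or points at a duplicate, so it should be replaced with the canonical URL itself.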

7. Every site must enforce a production-only deployment baseline

Why it matters:

  • Several audit failures were caused by development builds, preview URLs, or dev-mode assets being exposed publicly.
  • Search systems treat these as quality and trust regressions.

Checklist:

  • Production domain serves a real production build.
  • No HMR/devtool/Turbopack dev assets on public URLs.
  • No preview-domain canonicals.
  • No preview-domain OG images.
  • No staging or ngrok URLs in metadata, schema, RSS, or structured content.

Rules:

  • Public production domains must never run next dev.
  • Preview URLs should be noindex.
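
A simple smoke test for this baseline is to scan the HTML served by the production domain for strings that should only appear in development or preview builds. The patterns below are illustrative assumptions, not an authoritative list of dev assets for any framework; find_dev_artifacts and FORBIDDEN_PATTERNS are hypothetical names to adapt to the stack in use.

```python
import re

# Illustrative markers only; adjust to the hosts and dev assets your stack emits.
FORBIDDEN_PATTERNS = {
    "localhost URL": re.compile(r"https?://(localhost|127\.0\.0\.1)(:\d+)?", re.I),
    "ngrok URL": re.compile(r"https?://[\w.-]*ngrok[\w.-]*", re.I),
    "HMR asset": re.compile(r"webpack-hmr|hot-update", re.I),
}


def find_dev_artifacts(html):
    """Return the sorted labels of every forbidden pattern found in the HTML."""
    return sorted(
        label for label, pattern in FORBIDDEN_PATTERNS.items() if pattern.search(html)
    )
```

Because the scan covers the whole document, it also catches preview hosts inside metadata, JSON-LD, and OG image URLs.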

8. Every site must have a trust and legal baseline

Why it matters:

  • E-E-A-T depends on visible ownership, legitimacy, and contactability.
  • This matters especially for B2B infra, open-source platforms, and hardware products.

Checklist:

  • Footer links to legal pages or org-approved legal destinations.
  • Visible organization name and contact path.
  • Company address or contact email where appropriate.
  • About/team/founder evidence on at least one trust page.
  • Product claims tied back to named company or project.

Rules:

  • If legal pages live on the parent brand domain, keep that policy consistent and obvious.
  • Do not ship anonymous company/product sites with no ownership signals.

P2 — Recommended On Most Sites

9. Every important page should have a dedicated citation target

Why it matters:

  • AI systems and technical buyers cite specific pages, not whole sites.
  • The best citation targets are comparison pages, product pages, architecture pages, benchmarks, and FAQ-rich explainers.

Checklist:

  • Have at least one primary citable page per product or product line.
  • Put strong factual claims, specs, comparison tables, and supporting evidence on that page.
  • Make sure the URL is stable and linked from the homepage or docs hub.

Rules:

  • Do not bury your strongest differentiator in a PDF, a removed page, or a hidden doc.

10. Every site should include an AI-readable brand and product summary

Why it matters:

  • llms.txt and llms-full.txt are useful when they are concise, factual, and internally consistent.

Checklist:

  • State what the company is.
  • State what each product is.
  • Link to the best docs/product URLs.
  • Include trust facts: legal entity, grant, standards, notable integrations, certifications, or ecosystem standing.
  • Add disambiguation where product names are ambiguous.

Rules:

  • Do not let llms.txt contradict the main site copy or robots rules.

What Every New Page Should Pass Before Merge

Use this as the page-level gate:

  • Unique title
  • Unique meta description
  • Canonical URL set to production
  • One clear H1 with real topic/product keywords
  • Correct OG/Twitter tags
  • At least one relevant JSON-LD block
  • Main content visible in initial HTML
  • No dead links
  • Linked from at least one existing page
  • No unsupported technical claims

What Every New Site Or Major Relaunch Should Pass Before Launch

  • robots.txt present and intentional
  • sitemap.xml present and aligned with canonicals
  • llms.txt present if AI discoverability is desired
  • Production build only, no dev artifacts
  • Preview environments blocked or noindex
  • Security headers configured at the real serving layer
  • Legal/trust footer in place
  • Homepage, docs hub, and product pages all have distinct metadata and schema

Working Rule For The Org

From now on, no page should be merged unless it is:

  • identifiable
  • canonicalized
  • machine-readable
  • crawlable
  • trustable
  • internally linked
  • free of unsupported claims

That is the baseline standard for SEO across all org websites.
