    Sitemap

    A sitemap lists a site's URLs so search engines can discover them. XML is the most common format; large sites split their URLs across several files and tie them together with a sitemap index.

    Definition

    A sitemap is a file that lists a site's URLs so that search engines can discover, crawl, and refresh its content more efficiently. The most common form is an XML sitemap (sitemap.xml). Large or programmatically generated sites often split their URLs across several sitemap files and list those files in a sitemap index.

    Why it matters

    • Speeds up discovery and crawling of new pages
    • Points crawlers at the URLs you want indexed, so less crawl effort goes to unimportant ones
    • Helps manage coverage for multilingual and large-scale sites
    • Helps crawlers discover orphan pages (pages without internal links)
    • Provides lastmod to signal content freshness to search engines
    • Serves as a key data source for index monitoring in Search Console
    • Supports extensions for video, image, and news content

    How to implement

    • Include only canonical, indexable URLs (exclude noindex pages, redirects, and 404s)
    • Split large sitemaps (each file is limited to 50,000 URLs and 50 MB uncompressed) using a sitemap index
    • Reference the sitemap in robots.txt with a Sitemap: directive
    • Submit and monitor in Search Console / Bing Webmaster Tools
    • Use lastmod only when content actually changes (avoid auto-updating dates)
    • Consider separate sitemaps for different content types (pages, images, videos)
    • Automate sitemap generation in your build/deploy pipeline
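
    The splitting and automation steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a definitive implementation: the function names (build_urlset, build_sitemaps), the output file names (sitemap-1.xml, sitemap-index.xml), and the assumption that the caller already supplies only canonical, indexable URLs with honest lastmod dates are all hypothetical.

```python
from datetime import date
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def build_urlset(entries):
    """Serialize (loc, lastmod) pairs as a <urlset> document; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # ElementTree escapes &, <, > on output
        if lastmod is not None:
            ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def build_sitemaps(entries, base_url, max_urls=MAX_URLS_PER_FILE):
    """Split entries into files of at most max_urls and add a sitemap index.

    Returns {filename: bytes}. File names are hypothetical; adapt them to your site.
    """
    entries = list(entries)
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n, start in enumerate(range(0, len(entries), max_urls), start=1):
        chunk = entries[start:start + max_urls]
        name = f"sitemap-{n}.xml"
        files[name] = build_urlset(chunk)
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = f"{base_url}/{name}"
        # Use the newest lastmod in the chunk as the file's own lastmod.
        dates = [d for _, d in chunk if d is not None]
        if dates:
            ET.SubElement(sitemap, "lastmod").text = max(dates).isoformat()
    files["sitemap-index.xml"] = ET.tostring(
        index, encoding="utf-8", xml_declaration=True
    )
    return files


if __name__ == "__main__":
    pages = [
        ("https://example.com/", date(2025, 1, 1)),
        ("https://example.com/blog/seo-guide", date(2025, 1, 15)),
    ]
    for filename, payload in build_sitemaps(pages, "https://example.com").items():
        print(filename, len(payload), "bytes")
```

    A build or deploy step could write each returned file to the web root and point the Sitemap: line in robots.txt at the index. The sketch does not enforce the 50 MB size limit; a real pipeline would also need to check the serialized size of each file.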

    Examples

    xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/</loc>
        <lastmod>2025-01-01</lastmod>
        <changefreq>weekly</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>https://example.com/blog/seo-guide</loc>
        <lastmod>2025-01-15</lastmod>
      </url>
    </urlset>
    xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Sitemap index for large sites -->
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>https://example.com/sitemap-pages.xml</loc>
        <lastmod>2025-01-01</lastmod>
      </sitemap>
      <sitemap>
        <loc>https://example.com/sitemap-blog.xml</loc>
        <lastmod>2025-01-15</lastmod>
      </sitemap>
    </sitemapindex>
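
    Before submitting a sitemap like the ones above, it can help to run a quick sanity check on the generated file. The following is a rough sketch under stated assumptions, not a full validator: check_urlset is a hypothetical helper, it inspects only a single urlset file, and its lastmod check accepts only ISO dates and datetimes, which is a subset of what the protocol allows.

```python
from datetime import datetime
from urllib.parse import urlsplit
from xml.etree import ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def _is_iso_date(value):
    """True for YYYY-MM-DD or a full ISO datetime (simplified check)."""
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def check_urlset(xml_bytes, max_urls=50_000):
    """Return a list of human-readable problems found in a <urlset> document."""
    problems = []
    root = ET.fromstring(xml_bytes)  # raises ParseError if the XML is not well-formed
    urls = root.findall("sm:url", NS)
    if len(urls) > max_urls:
        problems.append(f"{len(urls)} URLs exceeds the limit of {max_urls}")
    seen = set()
    for url in urls:
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        parts = urlsplit(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute http(s) URL: {loc!r}")
        if loc in seen:
            problems.append(f"duplicate URL: {loc}")
        seen.add(loc)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is not None and not _is_iso_date(lastmod.strip()):
            problems.append(f"unparseable lastmod {lastmod!r} for {loc}")
    return problems
```

    An empty list means none of these particular checks failed; it does not confirm that the listed URLs are canonical or indexable, which still has to be verified against the live site.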
