πŸ—ΊοΈ Dynamic Sitemap Generator

Chrome extension that generates sitemap.xml and robots.txt for any website, including static and dynamic (React, Vue, Svelte, etc.) sites.


📋 Table of Contents

  • Features
  • Installation
  • Usage
  • How It Works
  • Permissions
  • Limitations
  • Contributing
  • License
  • Author

✨ Features

  • SPA Support: Collects links after the JS render on single-page apps (React, Vue, Svelte, Angular)
  • Automatic Crawl: Crawls all same-site pages from a start URL using BFS
  • sitemap.xml: Outputs standard sitemaps.org XML (see the example below)
  • robots.txt: Generates a sample robots.txt with User-agent, Allow, and Sitemap
  • Progress Bar: Visual progress based on pages crawled
  • Error List: Lists URLs that timed out or failed to load
  • Download: One-click download of the generated sitemap and robots files
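
For reference, the generated sitemap follows the standard sitemaps.org urlset format. A minimal output might look like this (the URLs are placeholders; the actual entries depend on the crawled site):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/</loc>
      </url>
      <url>
        <loc>https://example.com/about</loc>
      </url>
    </urlset>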

📦 Installation

Requirements

  • Google Chrome (or Chromium-based browser: Edge, Brave, etc.)
  • Extension is loaded unpacked in Developer mode

Steps

  1. Clone the repo

    git clone https://github.com/eros1sh/sitemap-generator.git
    cd sitemap-generator
    
  2. Load the extension in Chrome

    • Open chrome://extensions/ in the address bar
    • Enable Developer mode (top right)
    • Click Load unpacked
    • Select the sitemap-generator folder
  3. The Dynamic Sitemap Generator icon appears in the toolbar.


🚀 Usage

  1. Open the site you want to generate a sitemap for in Chrome (e.g. https://example.com).
  2. Click the Dynamic Sitemap Generator icon in the toolbar.
  3. Click Start Crawl.
  4. Wait until the crawl finishes; the progress bar and status message update as it runs.
  5. sitemap.xml and robots.txt content appears in the text areas.
  6. Use Download Sitemap and Download Robots.txt to save the files.
  7. Place the downloaded sitemap.xml in your site root and adjust robots.txt if needed; a sample of the generated robots.txt is shown below.
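
The generated robots.txt follows the shape listed in Features (User-agent, Allow, Sitemap). A typical result looks roughly like this; the exact sitemap URL depends on the site you crawled:

    User-agent: *
    Allow: /

    Sitemap: https://example.com/sitemap.xml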

🔧 How It Works

┌──────────────────────────────────────────────────────────────┐
│  Popup (popup.js)                                            │
│  • "Start Crawl" → gets active tab URL                       │
│  • Sends START_CRAWLING to background                        │
│  • Listens for CRAWL_STATUS / ERROR_FOUND / CRAWL_COMPLETE   │
│  • Shows progress, error list, sitemap/robots                │
└────────────────────────────┬─────────────────────────────────┘
                             │ chrome.runtime.sendMessage
                             ▼
┌──────────────────────────────────────────────────────────────┐
│  Service Worker (background.js)                              │
│  • Queue + visitedUrls + errors                              │
│  • Per URL: open tab → wait 3s → executeScript               │
│  • getLinksOnPage(siteOrigin) collects same-origin <a href>  │
│  • BFS crawl of all pages                                    │
│  • generateSitemap() → XML urlset                            │
│  • generateRobotsTxt() → User-agent, Allow, Sitemap          │
└──────────────────────────────────────────────────────────────┘
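
In code, the popup half of the diagram reduces to a message round-trip. The sketch below is illustrative rather than the actual source; the message type names come from the diagram, while the element id and handler bodies are assumptions:

    // popup.js (illustrative sketch; the element id 'start' is hypothetical)
    document.getElementById('start').addEventListener('click', async () => {
      const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
      chrome.runtime.sendMessage({ type: 'START_CRAWLING', url: tab.url });
    });

    // Status updates stream back from the service worker.
    chrome.runtime.onMessage.addListener((msg) => {
      if (msg.type === 'CRAWL_STATUS') { /* update progress bar */ }
      if (msg.type === 'ERROR_FOUND') { /* append to error list */ }
      if (msg.type === 'CRAWL_COMPLETE') { /* fill sitemap/robots text areas */ }
    });
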
  • Origin scope: Only links with the same origin as the start URL are included (other subdomains/domains are not).
  • Wait: After each page loads, the extension waits 3 seconds so dynamic content and client-side routing can render.
  • Timeout: If a page doesn’t load within 20 seconds, it’s skipped and added to the error list.
  • XML safety: Characters <, >, &, ', " in sitemap URLs are escaped as XML entities.

🔐 Permissions

  • activeTab: Access the URL of the tab where the extension is used
  • scripting: Run scripts in pages to collect links
  • storage: Reserved for optional use
  • downloads: Download sitemap.xml and robots.txt
  • tabs: Open/close tabs for the background crawl
  • <all_urls>: Host access to crawl any site

The extension crawls only when you click Start Crawl; it does not continuously collect data in the background.
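
In Manifest V3 terms (the service worker in the diagram implies MV3), these declarations map to roughly the following manifest.json entries; this is a sketch, and the actual file may be laid out differently:

    {
      "manifest_version": 3,
      "permissions": ["activeTab", "scripting", "storage", "downloads", "tabs"],
      "host_permissions": ["<all_urls>"]
    }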


⚠️ Limitations

  • Large sites: Crawling hundreds or thousands of pages can take a long time and may hit Chrome tab limits.
  • Single origin: Only the start URL’s origin is crawled; other subdomains or domains are not included.
  • Hash (#) URLs: Fragment URLs may be treated as one page; manual editing may be needed.
  • JS-generated links: On heavy SPAs, 3 seconds may not be enough; you can increase the wait in background.js (e.g. to 5000 ms), as sketched below.

🤝 Contributing

  1. Fork this repo.
  2. Create a branch: git checkout -b feature/your-feature
  3. Commit your changes: git commit -m 'Add some feature'
  4. Push the branch: git push origin feature/your-feature
  5. Open a Pull Request against this repo.

📄 License

This project is licensed under the MIT License. See LICENSE for details.


👤 Author

eros1sh


This extension was built to simplify generating sitemaps and robots files for SEO and indexing.
