The Problem
How to protect your crawl budget by guiding search bots away from low-value pages
You are looking for a way to protect your crawl budget by guiding search bots away from low-value pages. Most people would tell you to buy a SaaS subscription for this.
We say: Build it yourself for free.
The Solution
The Automation Blueprint
Copy the logic below into a tool like Gemini CLI or Claude Code. It includes the role, constraints, and multi-step workflow needed to protect your crawl budget by guiding search bots away from low-value pages.
# Agent Configuration: The robots.txt Rules Architect

## Role
You are a **Technical SEO Specialist**. Your job is to manage bot access via robots.txt. You generate a standard robots.txt file based on the site structure, specifically blocking common high-crawl/low-value directories like /search, /tags, and /temp.

## Objective
Protect the crawl budget by guiding search bots away from low-value pages.

## Workflow

### Phase 1: Initialization & Seeding
1. **Check:** Does `site_structure.txt` exist?
2. **If Missing:** Create `site_structure.txt` using the `sampleData` provided in this blueprint.
3. **If Present:** Load the data for processing.

### Phase 2: Analysis
1. Read `site_structure.txt`.

### Phase 3: Rule Generation
Generate a standard `robots.txt` file following these best practices:
1. **User-agent: \*** — apply the rules to all bots.
2. **Disallow:** every directory listed in the `Directories` section of the input.
3. **Specific Disallows:** always include standard CMS junk if the platform is recognized (e.g., for WordPress, block `/wp-admin/` but allow `/wp-admin/admin-ajax.php`).
4. **Sitemap:** include the `Sitemap` URL at the very bottom.

### Phase 4: Output
Save the final text to `robots.txt`.

Start now.
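If you'd rather not run an agent at all, the rule-generation phase is simple enough to script directly. Here is a minimal sketch in Python, assuming a hypothetical `site_structure.txt` layout with `Platform:`, `Directories:`, and `Sitemap:` lines (the blueprint does not pin down an exact format, so adjust the parsing to whatever your file actually looks like):

```python
def generate_robots(structure_text: str) -> str:
    """Build robots.txt text from a simple site-structure description.

    Assumed (hypothetical) input format:
        Platform: WordPress
        Directories:
        /search
        /tags
        Sitemap: https://example.com/sitemap.xml
    """
    platform = ""
    directories = []
    sitemap = ""
    in_dirs = False
    for raw in structure_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.lower().startswith("platform:"):
            platform = line.split(":", 1)[1].strip()
        elif line.lower().startswith("sitemap:"):
            sitemap = line.split(":", 1)[1].strip()
        elif line.lower().startswith("directories"):
            in_dirs = True
        elif in_dirs and line.startswith("/"):
            directories.append(line)

    # Rule 1: apply to all bots.
    out = ["User-agent: *"]
    # Rule 2: disallow every listed directory.
    for d in directories:
        out.append(f"Disallow: {d}")
    # Rule 3: standard CMS junk — WordPress blocks /wp-admin/
    # but keeps admin-ajax.php crawlable.
    if platform.lower() == "wordpress":
        out.append("Disallow: /wp-admin/")
        out.append("Allow: /wp-admin/admin-ajax.php")
    # Rule 4: sitemap URL at the very bottom.
    if sitemap:
        out.append("")
        out.append(f"Sitemap: {sitemap}")
    return "\n".join(out) + "\n"
```

Write the returned string to `robots.txt` at your site root (e.g. `open("robots.txt", "w").write(generate_robots(text))`), and spot-check the result against your CMS before deploying.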
Related SEO Automations
Want the Full Library?
I have over 500 blueprints just like this one for every part of your Sales & Marketing stack.
Browse All 500 Blueprints