# ===================================================================
# robots.txt for bsouq.com
# 📦 The Apex Edition v4.0
#
# 🎯 PHILOSOPHY: Precision, Strategy, and Clarity. This file is the
# definitive guide for all web crawlers, maximizing visibility where
# it matters and protecting resources where it's needed.
# ===================================================================

# [1] GLOBAL DIRECTIVES (Default for All Bots)
# -------------------------------------------------------------------
# These rules are the baseline for any crawler that has no group of its
# own further down. They are intentionally strict to protect the site.
User-agent: *
# --- Core System & Security Exclusions ---
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /xmlrpc.php
Disallow: /readme.html
Disallow: /license.txt
# --- E-commerce & User-Specific Paths ---
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /wishlist/
Disallow: /compare/
Disallow: /coupon/
# --- Parameters & Duplicate Content ---
Disallow: /*?add-to-cart=
Disallow: /*?filter_*
Disallow: /*?orderby=
Disallow: /*?ref=
Disallow: /*?gclid=
Disallow: /*?fbclid=
Disallow: /*?utm_*
# --- Low-Value Pages (Search, Feeds, Pagination) ---
Disallow: /?s=
Disallow: /search/
Disallow: /feed/
Disallow: /*/feed/
Disallow: /page/
Disallow: /*/page/
# --- Tag Pages ---
Disallow: /product-tag/
# --- Development & Temporary Paths ---
Disallow: /staging/
Disallow: /private/
Disallow: /tmp/

# [2] CRITICAL RENDERING FOR TOP SEARCH ENGINES
# -------------------------------------------------------------------
# A crawler follows only the most specific group that names it, so this
# group replaces the global rules for Google and Bing rather than adding
# to them. It gives them explicit permission to fetch the .css and .js
# files under /wp-includes/, /wp-content/themes/, and /wp-content/plugins/
# that they need to render pages.
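User-agent: Googlebot
User-agent: Bingbot
Allow: /wp-includes/*.js
Allow: /wp-includes/*.css
Allow: /wp-content/themes/*.js
Allow: /wp-content/themes/*.css
Allow: /wp-content/plugins/*.js
Allow: /wp-content/plugins/*.css
# The section [1] exclusions are repeated here because a bot-specific
# group replaces the "User-agent: *" group for these crawlers. Where an
# Allow and a Disallow both match, the longer path pattern wins.
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /xmlrpc.php
Disallow: /readme.html
Disallow: /license.txt
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /wishlist/
Disallow: /compare/
Disallow: /coupon/
Disallow: /*?add-to-cart=
Disallow: /*?filter_*
Disallow: /*?orderby=
Disallow: /*?ref=
Disallow: /*?gclid=
Disallow: /*?fbclid=
Disallow: /*?utm_*
Disallow: /?s=
Disallow: /search/
Disallow: /feed/
Disallow: /*/feed/
Disallow: /page/
Disallow: /*/page/
Disallow: /product-tag/
Disallow: /staging/
Disallow: /private/
Disallow: /tmp/

# [3] AI & SOCIAL MEDIA POLICY
# -------------------------------------------------------------------
# Social preview bots and search-integrated AI crawlers are allowed;
# AI data-collection crawlers are blocked.

# --- Social Media Bots (ALLOWED for Rich Previews) ---
User-agent: FacebookExternalHit
User-agent: Twitterbot
User-agent: Pinterestbot
Allow: /

# --- Search-Integrated AI (ALLOWED for Future Visibility) ---
# The explicit Allow is required: with no rule of their own, these
# User-agent lines would be read as part of the next group by parsers
# that follow RFC 9309 and would inherit its "Disallow: /".
User-agent: Google-Extended
User-agent: PerplexityBot
Allow: /

# --- Data-Scraping AI (BLOCKED to Protect Content) ---
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Anthropic-ai
Disallow: /

User-agent: CCBot
Disallow: /

# [4] SEO & ANALYSIS TOOLS
# -------------------------------------------------------------------
# Polite crawl-delay (in seconds) to manage server load from third-party
# tools. These groups contain no Disallow rules, so the section [1]
# exclusions do not apply to these bots.
User-agent: AhrefsBot
Crawl-delay: 5

User-agent: SemrushBot
Crawl-delay: 5

User-agent: MJ12bot
Crawl-delay: 10

User-agent: DotBot
Crawl-delay: 10

# [5] SITEMAP DECLARATION
# -------------------------------------------------------------------
# Only the sitemap index is listed; it references the individual sitemaps.
Sitemap: https://bsouq.com/sitemap_index.xml

# === End of File: The Apex Edition ===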