# =====================================================================
# robots.txt - ligadoscampeoes.com
# Strategy:
#   - Traditional search crawlers: full access to public pages,
#     sensible crawl-delay.
#   - AI scrapers (training + RAG): blocked by default. Editorial
#     policy: the content is ours, not fodder for LLMs.
#   - Exception: AI search engines that link back (Perplexity, Applebot)
#     get access for real-time indexing, but not for training.
# =====================================================================

# --- Default: everything open except the paths below ---
User-agent: *
Allow: /
Disallow: /api/
Disallow: /og/
Disallow: /apostas
Disallow: /notificacoes
Disallow: /favoritos
Disallow: /pesquisa
Disallow: /comparador
Disallow: /confronto-directo/

# --- Sane crawl rate for the main search crawlers ---
# A crawler obeys only the most specific group that matches it, so the
# default Disallow rules are repeated here; a group holding nothing but
# Crawl-delay would leave these bots free to crawl every path.
# Googlebot ignores Crawl-delay.
User-agent: Googlebot
User-agent: Bingbot
User-agent: DuckDuckBot
Allow: /
Disallow: /api/
Disallow: /og/
Disallow: /apostas
Disallow: /notificacoes
Disallow: /favoritos
Disallow: /pesquisa
Disallow: /comparador
Disallow: /confronto-directo/
Crawl-delay: 1

User-agent: YandexBot
Allow: /
Disallow: /api/
Disallow: /og/
Disallow: /apostas
Disallow: /notificacoes
Disallow: /favoritos
Disallow: /pesquisa
Disallow: /comparador
Disallow: /confronto-directo/
Crawl-delay: 2

# --- AI scrapers (training + RAG): blocked ---
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: meta-externalfetcher
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: Timpibot
Disallow: /

User-agent: Bardeen
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: Webzio-Extended
Disallow: /

# --- AI search engines that link back: ALLOWED (real-time search) ---
# Perplexity runs an "answer engine" with citations. Worth being indexed.
User-agent: PerplexityBot
Allow: /
Disallow: /api/
Disallow: /og/
Crawl-delay: 2

# Apple Search & Spotlight (separate from the training-only "Extended"
# agent, which stays blocked)
User-agent: Applebot
Allow: /
Disallow: /api/
Disallow: /og/

# --- Sitemaps ---
# /sitemap.xml is the master index (aggregates static + articles + matches/SSR)
Sitemap: https://ligadoscampeoes.com/sitemap.xml

# Direct sub-sitemaps (defensive: for crawlers that do not follow the index)
Sitemap: https://ligadoscampeoes.com/sitemap-0.xml
Sitemap: https://ligadoscampeoes.com/sitemap-articles.xml
Sitemap: https://ligadoscampeoes.com/sitemap-matches.xml