# National Albanian Registry: robots.txt
# Posture: open to all reputable search and AI crawlers. We want to be cited.

User-agent: *
Allow: /
Disallow: /api/
# PostHog reverse-proxy base: only /ph/* is proxied; not a content path
Disallow: /ph
Disallow: /admin
Disallow: /register/step-
Disallow: /thank-you
Disallow: /unsubscribe
Disallow: /regional-leads/kit/
Disallow: /pitch
# Magic-link short redirects: single-use tokens, not for indexing
Disallow: /v/
Disallow: /vo/
# Member portal: auth-gated; bare /account 302s to /account/login
Disallow: /account
Disallow: /sq/account
# Org portal: auth-gated; bare /org 302s to /org/login.
# MUST stay end-anchored ($) + subtree (/org/). A bare "Disallow: /org" is a
# prefix match under Google/RFC 9309 longest-match semantics and blocks the
# entire public Organizations Hub (/organizations, /sq/organizations and every
# org detail page) from crawling.
Disallow: /org$
Disallow: /org/
Disallow: /sq/org$
Disallow: /sq/org/
# Legacy redirect aliases (301 to canonical paths): block to keep them out of GSC
Disallow: /home
Disallow: /index
Disallow: /handbook
Disallow: /press-kit
Disallow: /sq/press-kit
# Non-HTML resources: GSC was crawling these (correctly never indexing them);
# block to save crawl budget. Browsers/clients ignore robots.txt, so the
# blog client-search fetch and subtitles keep working.
Disallow: /blog/search-index.json
Disallow: /docs/
Disallow: /press/videos/

# AI crawlers: explicitly allowed for citation visibility (GEO/AIO strategy)
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Bytespider
Allow: /

User-agent: CCBot
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: Diffbot
Allow: /

User-agent: FacebookBot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: Amazonbot
Allow: /

# Sitemaps
Sitemap: https://albanianregistry.org/sitemap-index.xml