Apple has formally documented a new use for its web crawler: crawled content can now serve as real-time grounding context when Apple Intelligence and Siri generate answers to broad world-knowledge queries. The documentation, published June 8, 2026, marks Apple’s first explicit statement that Applebot feeds generative AI output, not just classic search results in Spotlight and Safari.
According to Apple’s own About Applebot documentation, crawled data may provide additional context when AI models generate output across Apple products and services. The specific example Apple names is Siri answering broad knowledge questions, where responses may include links to source pages. That is retrieval-augmented generation running at platform scale, delivered to hundreds of millions of iOS and macOS users.
Two separate opt-out controls apply to two distinct uses, and conflating them is an easy way to misconfigure a site.
The first is Applebot-Extended, a secondary user-agent token that publishers can target in robots.txt. Disallowing Applebot-Extended tells Apple not to use that content for training its foundation models, including the models behind Apple Intelligence. It does not prevent Applebot from crawling the page, and it does not remove the page from Spotlight or Siri search results. It affects only model training, not live answer grounding.
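As a rough illustration of that scoping, the sketch below uses Python's standard urllib.robotparser to evaluate a hypothetical robots.txt that blocks only Applebot-Extended. Python's parser stands in for Apple's own matching rules, which may differ in detail; the file contents and the example.com URL are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of model training only,
# while leaving ordinary crawling open to every agent.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "https://example.com/research/report"  # placeholder URL
    for agent in ("Applebot", "Applebot-Extended"):
        print(agent, can_fetch(ROBOTS_TXT, agent, page))
```

Under this file, the plain Applebot agent is still allowed to crawl, which matches the article's point that the training opt-out leaves search crawling untouched.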
The second control is the nosnippet meta tag. Apple’s documentation states explicitly: pages tagged nosnippet will not be used as additional context when AI models generate output for display in Apple products and services. This is the opt-out for live answer grounding, the retrieval layer that powers Siri’s real-time responses. It also means those pages will not receive snippet-based citation links in AI-generated answers.
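One way to audit this is to scan rendered HTML for the directive. The following is a minimal sketch using Python's standard html.parser; it assumes nosnippet is delivered through a meta tag named "robots" with a comma-separated content value, and it does not look at HTTP headers or any crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def has_nosnippet(html: str) -> bool:
    """Return True if the page declares nosnippet in a robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "nosnippet" in parser.directives
```

A page for which has_nosnippet returns False would, by the article's reading of Apple's documentation, remain eligible as grounding context.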
The distinction matters for publishers trying to protect paywalled or proprietary content. Apple also confirms that isAccessibleForFree: false in schema.org JSON-LD structured data blocks a page from the AI grounding layer, even while keeping it eligible for standard search results. Section-level hasPart markup is not supported; the signal applies at the page level only.
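For illustration, a page-level paywall signal might be generated as below. This is a sketch only: the NewsArticle type, the headline and url fields, and the helper name paywall_jsonld are assumptions chosen for the example, and the only property taken from the article is isAccessibleForFree set to false. The returned string is meant to be placed inside a script element of type application/ld+json.

```python
import json


def paywall_jsonld(headline: str, url: str) -> str:
    """Build a page-level JSON-LD body that marks the page as not free to access."""
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",  # assumed type; use whatever fits the page
        "headline": headline,
        "url": url,
        # Page-level signal; per the article, section-level hasPart is not supported.
        "isAccessibleForFree": False,
    }
    # json.dumps serializes Python False as the JSON literal false.
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    print(paywall_jsonld("Quarterly market report", "https://example.com/reports/q2"))
```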
One important boundary: applying both opt-outs (disallowing Applebot-Extended and tagging pages nosnippet) does not remove a page from Apple’s standard search index. The documentation is clear that content remains discoverable through Spotlight, Siri suggestions, and Safari regardless of either opt-out signal. Publishers who want full exclusion from Apple’s ecosystem still need to disallow Applebot itself in robots.txt.
A separate crawl-behavior note in the documentation is worth logging: Applebot does not honor the crawl-delay directive. Site operators who rely on crawl-delay for rate-limiting across bots will need to use server-side rate controls or IP-based rules to manage Applebot traffic. The crawler adjusts its own rate based on server error signals, but publisher-set delays are ignored.
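A server-side control could take many forms. The sketch below shows one, a token bucket applied to requests whose User-Agent contains "Applebot". The class and function names are hypothetical, the 503 response is an assumption about which error signal to send, and matching on the User-Agent string alone can be spoofed, so a production setup would likely pair it with IP-based checks.

```python
import time


class TokenBucket:
    """Allow `rate_per_sec` requests on average, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock  # injectable so the limiter can be tested
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def status_for(user_agent: str, bucket: TokenBucket) -> int:
    """Return the HTTP status to send: 200 normally, 503 when Applebot is over budget."""
    if "applebot" not in user_agent.lower():
        return 200  # other clients are not limited by this bucket
    return 200 if bucket.allow() else 503
```

Because the article notes that the crawler adjusts its rate in response to server error signals, returning an error status when the bucket is empty is one plausible way to slow it down without relying on crawl-delay.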
For practitioners of GEO (generative engine optimization, the practice of optimizing content for LLM-driven answer surfaces), the Apple grounding update opens a new citation channel. A page cited inside a Siri answer on an iPhone represents referral traffic that does not currently appear in most analytics pipelines as a distinct source. Publishers who have not already checked their server logs for Applebot traffic should do so now, both to confirm crawl frequency and to check that the crawler is fetching the script and style assets that JavaScript-dependent pages need in order to render.
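A first pass over the logs can be as simple as counting Applebot requests per path. The sketch below assumes access logs in the common "combined" format; the regular expression, the sample lines, and the user-agent strings in them are illustrative and would need adjusting to a real server's log layout.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer, and user-agent fields
# of a combined-format access log line.
LOG_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Illustrative lines only; addresses come from the documentation range 192.0.2.0/24.
SAMPLE_LINES = [
    '192.0.2.10 - - [08/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Applebot/0.1)"',
    '192.0.2.10 - - [08/Jun/2026:10:00:05 +0000] "GET /static/app.js HTTP/1.1" 200 20480 "-" "Mozilla/5.0 (compatible; Applebot/0.1)"',
    '192.0.2.77 - - [08/Jun/2026:10:00:09 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '192.0.2.10 - - [08/Jun/2026:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Applebot/0.1)"',
]


def applebot_hits(lines) -> Counter:
    """Count requests per path where the user-agent mentions Applebot."""
    hits = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match and "applebot" in match.group("ua").lower():
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    for path, count in applebot_hits(SAMPLE_LINES).most_common():
        print(count, path)
```

Seeing script and stylesheet paths in the result suggests the crawler is requesting render-critical assets, though the logs alone cannot prove the page rendered correctly.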
Teams managing content that includes proprietary research, subscription-gated material, or data they do not want surfaced in AI answers should audit their nosnippet tag deployment against their current Applebot crawl access. The two settings are independent, and allowing crawl access while omitting nosnippet means that content is eligible for live grounding today.
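To make that independence concrete, the controls described in this article can be written as a small decision function. This is one reading of the article's summary, not Apple's actual logic; the PageSignals fields and the apple_uses helper are hypothetical names introduced for the audit.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    applebot_allowed: bool            # robots.txt permits Applebot
    extended_allowed: bool            # robots.txt permits Applebot-Extended
    nosnippet: bool                   # page carries the nosnippet directive
    accessible_for_free: bool = True  # JSON-LD isAccessibleForFree value


def apple_uses(page: PageSignals) -> dict:
    """Map a page's signals to the uses the article says remain open."""
    crawlable = page.applebot_allowed
    return {
        # Standard search depends only on Applebot crawl access.
        "search": crawlable,
        # Live grounding is blocked by nosnippet or isAccessibleForFree: false.
        "grounding": crawlable and not page.nosnippet and page.accessible_for_free,
        # Training is blocked by disallowing Applebot-Extended.
        "training": crawlable and page.extended_allowed,
    }


if __name__ == "__main__":
    # Training opt-out in place, but no nosnippet: still eligible for grounding.
    print(apple_uses(PageSignals(applebot_allowed=True, extended_allowed=False, nosnippet=False)))
```

Running a site's page inventory through a function like this highlights pages where the training opt-out is present but the grounding opt-out is missing.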
Source: Apple’s About Applebot documentation (support.apple.com/en-us/119829), updated June 8, 2026.