Google: Markdown Pages for AI Search Strip the Signals That Help You Rank

John Mueller and Martin Splitt say HTML already works fine for LLMs, and building machine-only Markdown pages removes the very elements Google uses to understand and rank content.

Alessandro Benigni

PUBLISHED JUN 17, 2026

3 MIN READ

Google’s John Mueller and Martin Splitt used a new “Search Off the Record” episode to push back on one of the most popular tactics in generative engine optimization (GEO): serving stripped-down Markdown versions of pages so that AI systems can read the content more easily.

Their core position is direct. HTML is the standard for Search. It is what Google’s systems are built to process, and serving Markdown in its place provides no ranking benefit while actively removing signals the system relies on.

The episode lands at a moment when a growing number of SEO and GEO practitioners have been building or recommending parallel Markdown renditions of pages, whether as alternate URL paths, conditional delivery based on user-agent, or new content pipelines. The implicit assumption behind those setups is that LLMs struggle with HTML’s noise and prefer clean text. Mueller and Splitt rejected that assumption directly.

Mueller noted that large language models already parse HTML well. The idea that a crawler or language model needs a simpler format to understand a page’s content does not reflect how Google’s systems actually work. Parsing and crawling HTML, however structurally complex it appears to human eyes, is trivial for Google’s infrastructure.

The more pointed warning addresses what gets stripped in the process. When publishers produce Markdown-only or content-only versions of pages, they remove the structural and semantic elements that Google uses for ranking and topical understanding. Exactly which signals are at stake was not enumerated in detail, but the framing is clear: the elements AI-SEO practitioners are treating as noise are, in Google’s view, useful data.

Mueller also raised a governance concern that any technical team should take seriously. Building pages “that no user sees” sits in uncomfortable territory. A machine-only Markdown rendition, delivered only when a bot is detected, could reasonably be treated as cloaking depending on implementation. Mueller’s phrasing was a caution, not an enforcement announcement, but the risk is real and worth flagging before any such system goes to production.
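One rough way to see whether a setup falls into that territory is to request the same URL with a browser-like and a crawler-like user-agent and compare what comes back. The sketch below is only an illustration: the function names, the browser user-agent string, and the returned fields are assumptions made for this example, and a spoofed user-agent does not reproduce detection based on IP address or reverse DNS. A difference is a prompt for review, not proof of cloaking.

```python
import urllib.request

# Illustrative user-agent strings (assumptions for this sketch).
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0 Safari/537.36"
)
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch(url, user_agent):
    """Return (content_type, body) for a URL requested with the given user-agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        content_type = response.headers.get_content_type()
        body = response.read().decode("utf-8", errors="replace")
    return content_type, body


def compare_variants(url, fetcher=fetch):
    """Compare the response a browser-like client gets with the one a bot-like client gets."""
    browser_type, browser_body = fetcher(url, BROWSER_UA)
    bot_type, bot_body = fetcher(url, BOT_UA)
    return {
        # True when the bot is handed a different media type, e.g. text/markdown.
        "content_type_differs": browser_type != bot_type,
        "bot_gets_html": bot_type == "text/html",
        # Values far below 1.0 suggest the bot receives a stripped-down rendition.
        "length_ratio": len(bot_body) / max(len(browser_body), 1),
    }
```

Calling `compare_variants("https://example.com/some-page")` on a page that serves identical HTML to both clients should report `content_type_differs` as `False` and a `length_ratio` close to 1.0. The `fetcher` parameter exists so the comparison can be exercised without network access.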

This episode fits a consistent pattern in Google’s recent guidance. The company has repeatedly stated that Search does not use llms.txt, the proposed file that some publishers have adopted to point AI crawlers at a curated, Markdown-formatted summary of their content. The message across these statements is the same: there is no separate optimized format for AI search visibility, and building infrastructure around the assumption that there is wastes engineering resources and can remove helpful signals.

For any team currently running a Markdown delivery setup or evaluating one, the practical decision tree simplifies considerably. If the goal is Google Search visibility, including AI Overviews and AI Mode surfaces, the investment should go into well-structured HTML with strong semantic markup, not a parallel Markdown pipeline. If the goal is feeding content to third-party LLM products or retrieval-augmented generation systems outside Google’s index, a separate Markdown export may still serve that narrower use case, but it has no documented benefit for Google.

The distinction matters because the two goals are often conflated in GEO discussions. “Optimizing for AI” gets treated as a single problem requiring a single solution. Mueller and Splitt’s position separates those concerns: Google’s AI products read the same content Google’s crawler indexes, through the same HTML pipeline, using the same signals that have always driven ranking.

Teams that have already built Markdown delivery should audit what signals those pages drop relative to their canonical HTML. If structural elements, heading hierarchies, schema markup, internal link context, or semantic HTML tags are absent from the Markdown version, and if that version is what some crawlers receive, those teams are serving a degraded signal set to a system that did not need a simpler format in the first place.
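Such an audit can start as a simple count of structural elements in each version. The following is a minimal sketch using only Python’s standard library. The function names, the set of tags treated as “semantic,” and the regular expressions for Markdown are assumptions chosen for illustration, not a list of signals Google has confirmed, and the Markdown side assumes plain Markdown with no embedded raw HTML.

```python
import re
from html.parser import HTMLParser

# Hypothetical choice of tags to treat as "semantic"; adjust to your templates.
SEMANTIC_TAGS = {
    "main", "article", "section", "nav", "aside",
    "header", "footer", "figure", "time",
}


class HtmlSignalCounter(HTMLParser):
    """Counts a few structural elements in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {
            "headings": 0,
            "links": 0,
            "images_with_alt": 0,
            "semantic_tags": 0,
            "json_ld_blocks": 0,
        }

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if re.fullmatch(r"h[1-6]", tag):
            self.counts["headings"] += 1
        elif tag == "a" and attributes.get("href"):
            self.counts["links"] += 1
        elif tag == "img" and attributes.get("alt"):
            self.counts["images_with_alt"] += 1
        elif tag in SEMANTIC_TAGS:
            self.counts["semantic_tags"] += 1
        elif tag == "script" and attributes.get("type") == "application/ld+json":
            self.counts["json_ld_blocks"] += 1


def count_html_signals(html_text):
    counter = HtmlSignalCounter()
    counter.feed(html_text)
    counter.close()
    return counter.counts


def count_markdown_signals(markdown_text):
    return {
        "headings": len(re.findall(r"^#{1,6}[ \t]+\S", markdown_text, flags=re.MULTILINE)),
        # Inline links, excluding image syntax that starts with "!".
        "links": len(re.findall(r"(?<!!)\[[^\]]*\]\([^)]+\)", markdown_text)),
        "images_with_alt": len(re.findall(r"!\[[^\]]+\]\([^)]+\)", markdown_text)),
        # Plain Markdown has no counterpart for these, so they count as zero.
        "semantic_tags": 0,
        "json_ld_blocks": 0,
    }


def dropped_signals(html_text, markdown_text):
    """Return {signal: (html_count, markdown_count)} where Markdown has fewer."""
    html_counts = count_html_signals(html_text)
    md_counts = count_markdown_signals(markdown_text)
    return {
        name: (html_counts[name], md_counts[name])
        for name in html_counts
        if md_counts[name] < html_counts[name]
    }
```

Running `dropped_signals(canonical_html, markdown_rendition)` on a page pair returns only the categories where the Markdown version carries fewer elements than the HTML. Counts like these say nothing about how any ranking system weighs each element; they only show what the Markdown rendition no longer contains.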

No special format unlocks AI search visibility. The 90-day action is to redirect the engineering capacity now spent on Markdown pipelines toward canonical HTML quality.

Source: Google’s “Search Off the Record” podcast, episode titled “Markdown vs HTML,” with John Mueller and Martin Splitt, as reported by Search Engine Journal on June 16, 2026.
