Articles classified as mostly AI-generated reached 49.9 percent of a large sample of English-language web content in the first quarter of 2026, according to research from content platform Graphite. The finding, reported by Search Engine Land on May 20, marks the point where machine-written and human-written publishing reached rough parity online. For content teams, the number matters less than the shape of the curve behind it.

Graphite analyzed 55,400 English-language articles drawn from Common Crawl and published between January 2020 and March 2026. Each article was scored by three detection systems, Pangram, Copyleaks, and GPTZero, with results averaged across all three. Graphite reported false-positive and false-negative rates below 2 percent in validation testing, which is a stronger accuracy claim than most single-detector studies offer.
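
The averaging step can be illustrated with a minimal sketch. This is not Graphite's pipeline: the 0.5 cutoff, the 0-to-1 score scale, the function names, and the sample scores below are all assumptions made for illustration, since the report as summarized here does not state the threshold or the detectors' output formats.

```python
from statistics import mean

# Assumed cutoff: the threshold Graphite used for "mostly AI-generated"
# is not stated in the source, so 0.5 is a placeholder.
AI_THRESHOLD = 0.5


def classify_article(scores: dict[str, float], threshold: float = AI_THRESHOLD) -> str:
    """Average the per-detector AI scores (assumed 0..1) and label the article."""
    avg = mean(list(scores.values()))
    return "ai" if avg >= threshold else "human"


def ai_share(articles: list[dict[str, float]], threshold: float = AI_THRESHOLD) -> float:
    """Fraction of articles labeled 'ai' after averaging the detectors."""
    labels = [classify_article(s, threshold) for s in articles]
    return labels.count("ai") / len(labels)


# Hypothetical scores, invented for this example only.
sample = [
    {"pangram": 0.92, "copyleaks": 0.88, "gptzero": 0.95},
    {"pangram": 0.05, "copyleaks": 0.10, "gptzero": 0.02},
    {"pangram": 0.70, "copyleaks": 0.40, "gptzero": 0.55},  # detectors disagree
    {"pangram": 0.20, "copyleaks": 0.45, "gptzero": 0.30},
]

print(f"{ai_share(sample):.1%}")  # prints 50.0%
```

The third article shows why averaging matters: one detector alone would label it human, but the combined score tips it over the assumed cutoff.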

The growth pattern is the part worth reading closely. AI-classified articles climbed fast after ChatGPT’s late-2022 release, hitting 35.9 percent within 12 months. By late 2024 the figure reached roughly 48 percent. Then it stopped climbing. The proportion has held near 50 percent since the first quarter of 2025, with AI articles briefly edging past human ones in late 2025 before settling back.

A plateau after a steep rise is not what a pure-volume story predicts. If AI publishing were purely a cost arbitrage, the share would keep climbing toward saturation. It did not. Graphite’s own reading is that publishers “are learning that heavily AI-generated articles don’t always perform well in search.” That is a market correcting, not a market collapsing.

The study’s most honest line concerns what it did not measure. Graphite did not track search rankings, organic traffic, or visibility for AI versus human content. The 50 percent figure describes what gets published, not what gets found. A page can exist and still earn nothing, and Common Crawl archives the published web without weighting for performance.

That distinction matters for how SEO teams should act on the number. The headline reads as a flood. The operational reality is probably closer to a saturated low-value tier that ranks poorly and a smaller body of content that still does the work of earning citations and links, though the study itself cannot confirm that split. Google’s helpful-content guidance never banned AI assistance. It penalized content created primarily to game search rather than to help a reader, and the plateau suggests that distinction is holding.

The competitive consequence is specific. When half of published articles are commodity AI output, the differentiator is no longer production speed, because everyone has that. The differentiator is what a model cannot generate from training data alone: original research, proprietary data, named expert review, firsthand testing, and the genuine experience signals inside Google’s E-E-A-T framework. Those are the inputs that separate a page from the other 49.9 percent.

There is also a generative engine optimization (GEO) angle. Generative engines synthesize answers from the same web, and a corpus that is half machine-written raises the risk of models citing other models. Content with verifiable primary sourcing becomes more valuable to an AI answer engine precisely because it breaks that loop. Distinctive, sourced content is now a citation asset, not only a ranking asset.

For the next quarter, content teams should stop benchmarking against publishing volume and start benchmarking against citation share. Audit your library for pages that carry original data or named expertise, and treat anything that a competitor’s model could reproduce in a single prompt as a candidate for consolidation rather than expansion.