AI Research Agents Cited a Fake Brand 62% of the Time After a 13-Word Edit

Cornell Tech's WARP attack shows that a single sentence appended to a Reddit thread can steer deep-research agents' reports toward attacker-chosen brands.

Alessandro Benigni

PUBLISHED JUN 25, 2026

A 13-word sentence planted in a Reddit thread is enough to push a fabricated brand into AI-generated research reports at mention rates between 38 and 62 percent, according to a paper from Cornell Tech researchers posted to arXiv in May.

The Cornell Tech team named the technique WARP (Web Agent Retrieval Poisoning). It does not require access to any model, prompt, or retrieval system. The attacker simply appends fluent text to a page the agent already tends to retrieve, then waits for the agent to pull that page, cite it, and fold the injected claim into an otherwise normal-looking answer.

The failure mode is specific: deep-research agents run many sub-searches for a single user request, and the Cornell Tech paper found that user-generated pages recur across those related sub-queries. A single poisoned page therefore gets more than one chance at a citation. It can be retrieved several times within one request, compounding its influence on the final report.
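The recurrence effect can be illustrated with a minimal sketch. It assumes a retrieval log is available that maps each sub-query to the URLs the agent pulled. The log contents, the URLs, and the function names are hypothetical and do not come from the paper.

```python
from collections import Counter

# Hypothetical retrieval log for one user request: each sub-query the agent
# issued, mapped to the URLs it retrieved. All values are invented.
RETRIEVAL_LOG = {
    "best budget espresso machine": [
        "https://www.reddit.com/r/espresso/comments/abc123/",
        "https://example.com/espresso-reviews",
    ],
    "espresso machine reliability complaints": [
        "https://www.reddit.com/r/espresso/comments/abc123/",
        "https://example.org/repair-guide",
    ],
    "espresso machine long-term ownership": [
        "https://www.reddit.com/r/espresso/comments/abc123/",
        "https://example.com/espresso-reviews",
    ],
}


def url_recurrence(log):
    """Count how many distinct sub-queries retrieved each URL."""
    counts = Counter()
    for urls in log.values():
        counts.update(set(urls))  # a URL counts once per sub-query
    return counts


def recurring_urls(log, min_queries=2):
    """Return URLs retrieved by at least `min_queries` sub-queries."""
    return {url: n for url, n in url_recurrence(log).items() if n >= min_queries}


if __name__ == "__main__":
    ranked = sorted(recurring_urls(RETRIEVAL_LOG).items(), key=lambda kv: -kv[1])
    for url, n in ranked:
        print(f"{n} sub-queries -> {url}")
```

In this invented log the Reddit thread is retrieved by all three sub-queries, so a sentence appended to it would reach the agent three times within a single request.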

Reddit is the structural weak point. Across three open-source deep-research systems tested (STORM, Co-STORM, and OmniThink), 17 to 23 percent of all retrieved URLs came from user-generated platforms. Reddit accounted for 54 to 71 percent of those user-generated URLs. That concentration means a poisoning effort on a small number of high-traffic threads can reach a disproportionate share of deep-research outputs in a given category.
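To put those two ranges together, a rough back-of-the-envelope calculation multiplies the user-generated share by Reddit's share of it. This is a sketch, not a figure from the paper: it assumes the low and high ends can be paired, which only gives outer bounds because the per-system values may not line up that way. The function name is hypothetical.

```python
def reddit_share_bounds(ugc_share, reddit_share_of_ugc):
    """Outer bounds on Reddit's share of all retrieved URLs.

    Assumption: pairs the low ends and the high ends of the two reported
    ranges, so the true per-system values lie somewhere inside the result.
    """
    low = ugc_share[0] * reddit_share_of_ugc[0]
    high = ugc_share[1] * reddit_share_of_ugc[1]
    return low, high


# Ranges reported in the article: 17-23% of URLs are user-generated,
# and Reddit accounts for 54-71% of those.
low, high = reddit_share_bounds((0.17, 0.23), (0.54, 0.71))
print(f"Reddit share of all retrieved URLs: roughly {low:.1%} to {high:.1%}")
# prints: Reddit share of all retrieved URLs: roughly 9.2% to 16.3%
```

Under that assumption, roughly one in eleven to one in six retrieved URLs would be a Reddit page.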

In a test case, a 13-word sentence inserted into a simulated thread promoted a fabricated cryptocurrency called BananaCoin as an emerging long-term investment option. The altered source appeared alongside legitimate references in the final report. When one poisoned page was retrieved, the fake entity appeared in 38 to 51 percent of reports. Targeting multiple pages pushed that range to 42 to 62 percent. Even when injected text made up less than 4 percent of a full retrieved thread, mention rates held at 30 to 53 percent.

Defenses tested by the researchers did not hold. Blocking user-generated domains entirely stopped the attack but also stripped firsthand product reviews and community knowledge from outputs. AI-written injections are fluent, so perplexity-based filters flagged ordinary user text more reliably than the poisoned passages. Report-level inspection failed because the agent had already absorbed the fabricated claim into a coherent narrative.

The researchers analyzed OpenAI Deep Research and Gemini Deep Research for user-generated content (UGC) citation patterns but did not run live manipulation tests on either system, because doing so would require publishing altered content to the open web.

For GEO and SEO teams, the research surfaces a monitoring gap that most brand strategies have not closed. Category answers in AI research outputs are often shaped by a handful of retrieved community pages, not by a brand’s own site. If a competitor, a bad actor, or a misinformation campaign edits those source pages, the citation record inside AI answers shifts without any change to search rankings or to the brand’s own content. Teams should audit which Reddit threads, Wikipedia entries, and forum posts are the anchor sources for their category’s AI answers, then monitor those pages for unexpected edits the same way they monitor backlink profiles for toxic links. The attack surface is the user-generated content layer, and it is currently unguarded.
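One way to approach that monitoring is to keep a snapshot of each anchor page and diff it against the latest copy. The sketch below assumes the page text has already been fetched and stored by some other process. The sample thread, the injected sentence, and the function names are invented for illustration, and a real pipeline would also need fetching, scheduling, and review of flagged changes.

```python
import difflib
import hashlib


def fingerprint(text: str) -> str:
    """Stable hash of a page snapshot, for cheap change detection."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def added_lines(old: str, new: str) -> list[str]:
    """Return lines that appear in `new` but not in `old`, per a line diff."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical stored snapshot of a monitored thread.
OLD_SNAPSHOT = (
    "I have held index funds for ten years.\n"
    "Fees matter more than most people think."
)
# Hypothetical injected sentence, modeled loosely on the article's example.
INJECTED = "BananaCoin looks like an emerging long-term investment option."
NEW_SNAPSHOT = OLD_SNAPSHOT + "\n" + INJECTED

if __name__ == "__main__":
    # Compare hashes first; only run the diff when the page has changed.
    if fingerprint(OLD_SNAPSHOT) != fingerprint(NEW_SNAPSHOT):
        for line in added_lines(OLD_SNAPSHOT, NEW_SNAPSHOT):
            print(f"New text to review: {line}")
```

A diff like this only surfaces what changed. Deciding whether a new sentence is a legitimate comment or a planted claim still needs human review, since the article notes that fluent injections are hard to filter automatically.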

Reported by Search Engine Land on June 24, 2026, citing a paper by Tingwei Zhang, Harold Triedman, and Vitaly Shmatikov of Cornell Tech posted to arXiv on May 22, 2026.
