AI Safety · Recommendation Agents · June 2026

SafeGEO

Paid search once steered patients toward dubious hospitals; with AI assistants as the new gateway, will the same story repeat? SafeGEO offers the first systematic measurement.

arXiv submitted · link incoming
600 recommendation cases
22 GEO attack variants
40,800 materialized samples
83.2% max harmful promotion increase
39.2% max Target@3 mitigation reduction

More and more purchase decisions now start with a question to an AI assistant: “Which air purifier is worth buying?” Sellers have noticed — web copy is increasingly written for AI readers, a practice with a name: Generative Engine Optimization (GEO), the SEO of the AI era. SafeGEO measures the question that follows: when a genuinely flawed product dresses its page up as an “independent buyer guide”, can the agent recommending on the user’s behalf hold its judgment?

Figure: Paper overview. SafeGEO targets the recommendation pipeline: seller-controlled GEO text shifts the evidence balance the agent sees, pushing flawed target products into recommendation sets.

When product pages are written for AI readers

Recommendation agents (AI systems that search, compare, and advise on the user’s behalf) are becoming a new entry point for commerce and content platforms. Their judgment rests on what they can read online: product pages, reviews, and FAQs. Seller-controlled sources such as product pages (material written by the sellers themselves) carry a built-in incentive to be optimized, or manipulated.

The paid-search story deserves unpacking. In the era of paid search ranking, one script already played out: sellers optimized for the platform, and users paid the price. Patients in urgent need were steered toward the highest bidder, not the best doctor. With the entry point shifting from the search box to AI assistants, the same contest moves to a new arena, where what decides the ranking is no longer keyword bids but the “evidence” an AI reads.

GEO is not inherently harmful: a clearer page helps humans and AI alike. The problem is the boundary — once a rewrite starts hiding flaws, fabricating reputation, or impersonating an “independent review”, the evidence the agent reads is systematically polluted. SafeGEO is the first controlled measurement of this risk: how far attacks can push recommendations away from the user’s interest, and how much existing defenses recover.

A controlled testbed

SafeGEO covers six evidence-grounded product verticals: AI meeting transcription tools, baby monitors, carry-on backpacks, home air purifiers, noise-canceling headphones, and office chairs. Each case keeps candidate products, canonical attributes, non-target evidence, and hidden utility labels (ground-truth quality annotations known to the evaluator but never shown to the model) fixed while rewriting exactly one seller-controlled source — so any change in the recommendation can be cleanly attributed to that single rewrite.

| Statistic | Value |
| --- | --- |
| Recommendation cases | 600 |
| Avg. candidates per case | 19.96 |
| GEO targets per case | 3 |
| Attack variants | 22 |
| Total samples | 40,800 |
| Evaluation metrics | Target@3, HCV@1, GT@3, uNDCG@5 |

Benchmark statistics. Each base case expands into control and attack conditions so recommendation changes can be attributed to one rewritten seller source.
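The page does not spell out how the 40,800 total is composed. One accounting that happens to reproduce it, offered purely as an assumption, is two control conditions per case (no GEO and truthful rewrite) plus one attacked sample per target and attack variant. The sketch below materializes samples under that assumed scheme; the `Sample` fields and the `materialize` helper are hypothetical names, not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional

N_CASES = 600            # from the benchmark statistics
TARGETS_PER_CASE = 3     # from the benchmark statistics
N_VARIANTS = 22          # from the benchmark statistics
CONTROLS = ("no_geo", "truthful_rewrite")  # ASSUMPTION: one of each per case


@dataclass(frozen=True)
class Sample:
    case_id: int
    condition: str            # "no_geo", "truthful_rewrite", or "geo"
    target: Optional[int]     # index of the rewritten target product, if any
    variant: Optional[int]    # index of the GEO attack variant, if any


def materialize(case_id: int) -> List[Sample]:
    """Expand one base case into control and attack samples (assumed scheme)."""
    samples = [Sample(case_id, c, None, None) for c in CONTROLS]
    # Each attacked sample rewrites exactly one target source with one variant.
    samples += [
        Sample(case_id, "geo", t, v)
        for t, v in product(range(TARGETS_PER_CASE), range(N_VARIANTS))
    ]
    return samples


all_samples = [s for cid in range(N_CASES) for s in materialize(cid)]
print(len(all_samples))  # 600 * (2 + 3 * 22)
```

Under this assumed scheme each case yields 68 samples, and every attacked sample can be paired with controls that share the same candidates and non-target evidence.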

Evaluation uses four metrics: Target@3 (how often the attacked, flawed product enters the top three recommendations), HCV@1 (how often it takes the top slot), GT@3 (how often genuinely good products remain in the top three), and uNDCG@5 (how well the top five match the user’s true utility). The 40,800 materialized samples (concrete evaluation instances expanded from each case under different attack conditions) keep every condition pairwise comparable.
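As a rough illustration of how these four metrics could be computed for a single ranked list, here is a minimal sketch. The exact definitions are not given on this page, so several choices are assumptions: GT@3 is treated as "at least one good product in the top three", and uNDCG@5 as standard NDCG with the hidden utility as the gain. All function names are hypothetical.

```python
import math
from typing import Mapping, Sequence, Set


def target_at_k(ranking: Sequence[str], target: str, k: int = 3) -> bool:
    """True if the attacked (flawed) product appears in the top k."""
    return target in ranking[:k]


def hcv_at_1(ranking: Sequence[str], target: str) -> bool:
    """True if the attacked product takes the top slot."""
    return len(ranking) > 0 and ranking[0] == target


def gt_at_k(ranking: Sequence[str], good: Set[str], k: int = 3) -> bool:
    """ASSUMPTION: counts a hit if any genuinely good product is in the top k."""
    return any(p in good for p in ranking[:k])


def undcg_at_k(ranking: Sequence[str], utility: Mapping[str, float], k: int = 5) -> float:
    """ASSUMPTION: NDCG@k using hidden utility labels as gains."""
    dcg = sum(utility[p] / math.log2(i + 2) for i, p in enumerate(ranking[:k]))
    ideal = sorted(utility.values(), reverse=True)[:k]
    idcg = sum(u / math.log2(i + 2) for i, u in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def rate(flags: Sequence[bool]) -> float:
    """Percentage of samples where a per-sample metric fired."""
    return 100.0 * sum(flags) / len(flags)


# Toy example with made-up utilities; "x" is the flawed target product.
utility = {"a": 3.0, "b": 2.0, "c": 1.0, "x": 0.0}
attacked_ranking = ["x", "a", "b", "c"]
ideal_ranking = ["a", "b", "c", "x"]
print(target_at_k(attacked_ranking, "x"), hcv_at_1(attacked_ranking, "x"))
print(undcg_at_k(attacked_ranking, utility), undcg_at_k(ideal_ranking, utility))
```

In the toy example, promoting the flawed product to the top slot triggers both Target@3 and HCV@1 and lowers uNDCG@5 relative to the ideal ordering, which is the pattern the benchmark is designed to detect.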

Figure: Dataset construction. The pipeline runs from recommendation scenarios through candidate sets, hidden utility and evidence annotations, GEO attack construction, and instance materialization to evaluation. Candidate sets, hidden utility, and non-target evidence stay fixed; GEO rewrites one target product source at a time.

| Condition | Average source-text length |
| --- | --- |
| No GEO | 3,911 [3,901, 3,921] |
| Truthful-rewrite | 3,905 [3,895, 3,915] |
| Avg. GEO, 22 variants | 3,925 [3,924, 3,926] |

Source-length control. GEO and control contexts are closely matched in length, so the observed harm is not explained by simply giving the model more text.

How far attacks go

Experiments show that GEO attacks can substantially promote flawed target products. Realistic seller-facing variants are especially strong: they package false fit, evidence padding, and salience manipulation into one plausible-looking seller document, rather than mechanically stacking keywords. On DeepSeek-V4-Flash, the flawed product enters the top three only 6.2% of the time with no attack; under the realistic “selective comparison note” variant, that rises to 82.3%.

Figure (bar chart): Target@3 uplift across the 22 attack variants. Realistic variants are strongest overall, suggesting that coherent seller-facing source realization matters more than mechanical primitive stacking.

Representative realistic GEO attack results (DeepSeek-V4-Flash)

| Setting | Target@3 | Δ | HCV@1 | Δ | GT@3 | Δ | uNDCG@5 | Δ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No GEO | 6.2 | -- | 24.5 | -- | 66.7 | -- | 77.0 | -- |
| Truthful-rewrite control | 4.6 | -- | 23.0 | -- | 67.7 | -- | 78.8 | -- |
| Caveat-buried FAQ | 77.5 | +72.9 | 76.2 | +53.2 | 57.7 | -10.0 | 66.3 | -12.5 |
| Popularity-heavy profile | 71.2 | +66.6 | 71.4 | +48.4 | 57.6 | -10.1 | 67.3 | -11.5 |
| Citation-padded note | 78.7 | +74.1 | 78.4 | +55.4 | 58.1 | -9.6 | 66.2 | -12.7 |
| Independent buyer guide | 77.9 | +73.3 | 77.3 | +54.3 | 56.5 | -11.2 | 66.0 | -12.9 |
| False-fit checklist | 79.1 | +74.6 | 78.4 | +55.4 | 57.7 | -9.9 | 66.1 | -12.7 |
| Selective comparison note | 82.3 | +77.7 | 81.8 | +58.8 | 56.9 | -10.8 | 65.4 | -13.5 |
| Avg. realistic | 72.6 | +68.0 | 73.4 | +50.4 | 57.7 | -10.0 | 66.9 | -11.9 |

Realistic variants raise Target@3 and HCV@1 while degrading utility-quality metrics.
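A quick arithmetic check on the table helps when reading the Δ columns. The page does not say which row serves as the baseline; the numbers suggest the deltas are taken against the truthful-rewrite control rather than the no-GEO row, but that is an inference, not a stated fact. The sketch below recomputes the deltas for the selective comparison note row using the values from the table; the `deltas` helper is a hypothetical name.

```python
from typing import Dict

# Values copied from the DeepSeek-V4-Flash table above (percentages).
no_geo = {"Target@3": 6.2, "HCV@1": 24.5, "GT@3": 66.7, "uNDCG@5": 77.0}
truthful = {"Target@3": 4.6, "HCV@1": 23.0, "GT@3": 67.7, "uNDCG@5": 78.8}
selective = {"Target@3": 82.3, "HCV@1": 81.8, "GT@3": 56.9, "uNDCG@5": 65.4}
reported = {"Target@3": 77.7, "HCV@1": 58.8, "GT@3": -10.8, "uNDCG@5": -13.5}


def deltas(attacked: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Percentage-point change of each metric relative to a baseline row."""
    return {m: round(attacked[m] - baseline[m], 1) for m in attacked}


vs_truthful = deltas(selective, truthful)
vs_no_geo = deltas(selective, no_geo)

for metric in reported:
    print(metric, reported[metric], vs_truthful[metric], vs_no_geo[metric])
```

Against the truthful-rewrite row the recomputed deltas agree with the reported ones to within 0.1, which is what one would expect if the published deltas were computed from unrounded values. Against the no-GEO row the Target@3 delta would be 76.1 rather than 77.7.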

Mechanistically, an attack succeeds almost exactly to the extent that it hijacks the agent’s citations: the more the model’s citations are steered toward misleading lines, the higher the flawed product ranks — a correlation of r=0.91.

Figure (scatter plot, misleading GEO-line citation rate vs. Target@3): Mechanism analysis. Variants that redirect citations toward misleading GEO lines also achieve higher target placement; the paper reports r=0.91.
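For readers who want to run the same kind of analysis on their own per-variant results, a minimal sketch of the correlation step follows. It assumes the reported r is a Pearson coefficient computed across attack variants, which the page does not state explicitly. The input numbers are placeholders chosen to be perfectly linear, not values from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# PLACEHOLDER data: one point per hypothetical attack variant.
misleading_citation_rate = [0.10, 0.20, 0.30, 0.40]
target_at_3 = [20.0, 40.0, 60.0, 80.0]
print(pearson_r(misleading_citation_rate, target_at_3))
```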

How much defenses recover

Simple mitigations help, but they are not enough. Defensive prompting (explicitly instructing the agent to watch for marketing manipulation) reduces harmful promotion; evidence breakdown (requiring the agent to list supported, missing, and conflicting evidence for each candidate before ranking) is strongest, cutting Target@3 by 39.2 percentage points on Qwen3.6 27B. Even the strongest defense, though, does not restore no-attack behavior.
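To make the evidence-breakdown idea concrete, here is a minimal sketch of a prompt builder that asks for supported, missing, and conflicting evidence per candidate before any ranking. The wording and the `build_evidence_breakdown_prompt` function are hypothetical; the paper's actual prompt is not reproduced on this page.

```python
from typing import Sequence


def build_evidence_breakdown_prompt(query: str, candidates: Sequence[str]) -> str:
    """Request a per-candidate evidence sheet before the ranking step.

    ASSUMPTION: illustrative wording only, not the prompt used in the paper.
    """
    lines = [
        f"User question: {query}",
        "",
        "Before ranking, fill in an evidence sheet for every candidate:",
    ]
    for name in candidates:
        lines += [
            f"- {name}",
            "    supported: claims that the provided sources back up",
            "    missing: claims with no evidence in the provided sources",
            "    conflicting: claims that the provided sources contradict",
        ]
    lines += [
        "",
        "Only after completing the sheet, output a ranked top 3.",
    ]
    return "\n".join(lines)


prompt = build_evidence_breakdown_prompt(
    "Which air purifier is worth buying?",
    ["Purifier A", "Purifier B"],
)
print(prompt)
```

The design choice being illustrated is ordering: the agent has to commit to an evidence assessment for each candidate before it is allowed to rank, so a polished seller page cannot substitute for evidence that is missing or contradicted.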

Figure (heatmap of Target@3 reduction by mitigation layer and realistic attack variant, Gemma 4 31B IT): Variant-level mitigation effects. L3 Evidence sheet is the broadest and strongest layer; prompt-only defenses help, but unevenly.

Mitigation results on the same attacked instances (excerpt)

| Model | Mitigation | Target@3 | Δ | HCV@1 | Δ | GT@3 | Δ | uNDCG@5 | Δ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gemma 4 31B IT | No mitigation | 79.6 | -- | 75.6 | -- | 67.9 | -- | 68.6 | -- |
| Gemma 4 31B IT | Defensive prompt | 64.5 | -15.1 | 60.8 | -14.8 | 69.3 | +1.3 | 72.6 | +4.0 |
| Gemma 4 31B IT | Evidence breakdown | 49.9 | -29.7 | 46.6 | -29.1 | 69.5 | +1.6 | 74.4 | +5.7 |
| Qwen3.6 27B | No mitigation | 78.3 | -- | 83.7 | -- | 60.8 | -- | 63.6 | -- |
| Qwen3.6 27B | Defensive prompt | 67.3 | -11.0 | 66.2 | -17.5 | 68.5 | +7.6 | 73.4 | +9.8 |
| Qwen3.6 27B | Evidence breakdown | 39.1 | -39.2 | 42.1 | -41.6 | 69.7 | +8.8 | 77.4 | +13.9 |
| Devstral Small 2 | No mitigation | 90.9 | -- | 90.7 | -- | 47.9 | -- | 59.2 | -- |
| Devstral Small 2 | Evidence breakdown | 73.2 | -17.7 | 78.9 | -11.8 | 43.4 | -4.5 | 56.3 | -2.8 |

Evidence breakdown is usually the strongest mitigation, but it still does not restore no-GEO behavior.

A caution for agent safety

Makes GEO risk measurable. The work grounds visibility optimization in concrete recommendation choices.
Focuses on seller-controlled evidence. It studies information sources that can realistically be optimized or manipulated.
Shows simple defenses are incomplete. Prompting and evidence checks help, but GEO remains a serious agent-safety risk.