What AI Share of Voice actually is (and why it beats a mention count)
AI Share of Voice is your brand's share of AI-answer mentions relative to your competitors, not a raw count of your own mentions. That distinction is the whole game. A mention count is absolute and contextless: "we appeared 5 times" could be brilliant or catastrophic depending entirely on what everyone else got. Share of Voice reframes it as a competitive ratio, so the number answers the real question buyers raise every time they open ChatGPT: when they ask, who does the AI recommend, and how often is it you versus them?
The reframe in one line
Stop asking "did the AI mention me?" Start asking "of all the times the AI named a brand like mine, what slice was me?" The first is a vanity check. The second is a market-share read on the surface where buying decisions now start.
Because it is a share, every brand in a category competes for one pie. Counted mention by mention, the shares sum to 100%, so if you gain, someone loses. That is what makes it strategically useful: it tracks competitive position over time, not just whether you exist. It is the AI-search version of the share-of-voice metric marketers have used for decades (see the Share of Voice definition for the lineage), pointed at ChatGPT, Perplexity, Gemini and Google AI Overviews instead of TV ad spend.
The formula, with a worked example
The simplest, most widely agreed formula — practitioners and Search Engine Land converge on it — is this:
AI Share of Voice formula
SOV = (your brand mentions ÷ total mentions of your brand and its competitors across your prompt set) × 100. In Search Engine Land's words: "answers mentioning your brand divided by answers mentioning your brand or competitors."
The denominator is the part people skip — and skipping it is what turns a real metric into a vanity one. You only count answers that name you OR a competitor, because those are the answers where the recommendation actually happened. An answer that names nobody in your category is not part of the contest.
Worked example
1. You run a stable set of buyer prompts and collect **1,500 answer outputs** that mention you or a competitor.
2. Across those, your brand is named **300 times**.
3. **300 ÷ 1,500 = 0.20 → 20% Share of Voice.**
Now read that against the alternative: reporting "300 mentions" with no denominator. Is 300 good? You cannot know. If the market leader pulls 750 of those 1,500 (50% SOV) and you sit at 20%, you are a distant second and the chart is a wake-up call. The denominator is not a technicality — it is the entire signal.
**20%**: Share of Voice from 300 brand mentions across 1,500 competitive answer outputs. The denominator is what makes the number mean anything.
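The worked example can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `share_of_voice` and its parameter names are assumptions made up for this sketch, and the inputs are the figures from the example above.

```python
def share_of_voice(brand_mentions: int, competitive_total: int) -> float:
    """Return Share of Voice as a percentage.

    competitive_total is the denominator: the number of answer outputs
    that name your brand OR a competitor (hypothetical parameter name).
    """
    if competitive_total <= 0:
        raise ValueError("competitive_total must be positive")
    if not 0 <= brand_mentions <= competitive_total:
        raise ValueError("brand_mentions must be between 0 and competitive_total")
    return 100.0 * brand_mentions / competitive_total


# Figures from the worked example: 300 of 1,500 for you, 750 of 1,500 for the leader.
you = share_of_voice(300, 1500)
leader = share_of_voice(750, 1500)
print(f"You: {you:.0f}%  Leader: {leader:.0f}%")
```

The point of forcing `competitive_total` into the signature is that the function cannot be called with a bare mention count; the denominator has to be supplied every time.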
Want to see which engines name your brand at all before you build a prompt set? Run a free AI SEO audit — it checks whether ChatGPT, Perplexity, Gemini and Claude can read and recognize your site in about 15 seconds.
Position-weighting: why being #1 is not the same as being #5
A flat count, even a competitive one, treats every mention as equal. They are not equal. In an AI answer, the brand named first is the recommendation; the brand named fifth is a footnote a reader may never reach. If you only ever appear last, a flat SOV flatters you. Position-weighting fixes that by scoring prominence, not just presence.
The simple version: 1 ÷ position
A common, easy-to-explain weight gives each mention a value of 1 ÷ its position in the answer: 1st place = 1.00, 2nd = 0.50, 3rd = 0.33, and so on. A brand that appears five times but always in fifth place scores 5 × 0.20 = 1.00 total weight, the same as a single first-place mention. If the weights of every brand in the set add up to about 20, that is roughly 5% weighted SOV, despite "5 mentions" on a flat count. Same five mentions, very different reality.
| Position in answer | Flat count value | Weighted value (1 ÷ position) |
|---|---|---|
| 1st (the recommendation) | 1 | 1.00 |
| 2nd | 1 | 0.50 |
| 3rd | 1 | 0.33 |
| 4th | 1 | 0.25 |
| 5th (a footnote) | 1 | 0.20 |
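To see how the 1 ÷ position weights in the table change the picture, here is a small sketch that computes flat and weighted shares side by side. The function names and the sample brand names are hypothetical; the input format (one ordered list of brand names per answer) is an assumption about how you store results.

```python
from collections import defaultdict


def flat_sov(answers: list[list[str]]) -> dict[str, float]:
    """Share of mentions per brand, every mention counted as 1."""
    counts: dict[str, float] = defaultdict(float)
    for brands in answers:
        for brand in brands:
            counts[brand] += 1.0
    total = sum(counts.values())
    return {b: 100.0 * c / total for b, c in counts.items()} if total else {}


def weighted_sov(answers: list[list[str]]) -> dict[str, float]:
    """Share of total weight per brand, each mention weighted 1 / position."""
    weights: dict[str, float] = defaultdict(float)
    for brands in answers:
        # brands are listed in the order the answer names them; 1st place is position 1
        for position, brand in enumerate(brands, start=1):
            weights[brand] += 1.0 / position
    total = sum(weights.values())
    return {b: 100.0 * w / total for b, w in weights.items()} if total else {}


# Five made-up answers in which "You" is always named fifth.
answers = [["A", "B", "C", "D", "You"] for _ in range(5)]
print(f"flat:     {flat_sov(answers)['You']:.1f}%")
print(f"weighted: {weighted_sov(answers)['You']:.1f}%")
```

In this toy data the flat share is 20% while the weighted share is under 9%, because every one of the five mentions sits in last place. The exact weighted figure depends on how many brands each answer names.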
The research-backed version: prominence, not presence
This is not a marketer's heuristic — it has peer-reviewed precedent. The GEO paper (Aggarwal et al., a Princeton-led study published at KDD 2024) uses **Position-Adjusted Word Count** as its primary visibility metric: each citing sentence's word count is weighted by an exponential decay based on where it sits in the answer. In plain terms, the model rewards both how early you appear *and* how much of the answer's text you occupy. Their work makes the same point this section does — visibility means prominence, not bare presence. They also defined a richer seven-dimension "subjective impression" metric (relevance, influence, uniqueness, subjective position and count, click likelihood, and diversity) for cases where word count alone is too blunt.
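A rough sketch of the idea behind a position-adjusted word count follows. It is an illustration of the description above, not the paper's code: the decay form `exp(-position / number_of_sentences)`, the function name, and the `(text, cited_source)` input format are all assumptions made for this example, so check the GEO paper for the exact definition before relying on it.

```python
import math


def position_adjusted_word_count(
    sentences: list[tuple[str, str]], source: str
) -> float:
    """Score one source's visibility in a single answer.

    sentences: (sentence_text, cited_source) pairs in answer order.
    Each sentence citing `source` contributes its word count, discounted
    by an exponential decay in its position (assumed decay form).
    """
    n = len(sentences)
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return 0.0
    score = 0.0
    for position, (text, cited) in enumerate(sentences):
        if cited == source:
            score += len(text.split()) * math.exp(-position / n)
    return score / total_words


# Two made-up sentences of equal length, citing different sources.
answer = [
    ("alpha beta gamma", "site-a"),
    ("delta epsilon zeta", "site-b"),
]
print(position_adjusted_word_count(answer, "site-a"))
print(position_adjusted_word_count(answer, "site-b"))
```

Both sources occupy the same amount of text here, but the one cited first scores higher, which is the prominence-over-presence behavior the metric is meant to capture.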
Why this matters for your dashboard
If your tool reports flat mention counts only, you can be "winning" on volume while losing every actual recommendation. Pick a weighting — 1 ÷ position is fine to start — and apply it consistently. The exact curve matters less than being honest that first place and fifth place are not equal.
The same GEO study built a 10,000-query benchmark (GEO-bench) and showed its content methods lifted visibility by up to 30–40% — Quotation Addition alone moved the top metric ~41%. The takeaway for measurement: the things that lift a brand into earlier, more prominent positions are exactly what a position-weighted SOV will reward you for, and a flat count will miss. The tactics behind those lifts are the subject of generative engine optimization.
Measure per engine — blending is a trap
The single biggest measurement mistake after ignoring the denominator is mashing every engine into one number. Engines source their answers from wildly different places, so the same brand can dominate one and be invisible on another. A blended SOV hides exactly the gap you need to fix.
A Profound study of **680 million citations** (August 2024–June 2025) makes the divergence concrete. ChatGPT leaned heavily on Wikipedia — 47.9% of its top-10 sources. Perplexity leaned on Reddit — 46.7% of its top-10. Google AI Overviews was more balanced (Reddit 21.0%, YouTube 18.8%). If your brand has a strong Wikipedia presence but no community footprint, you may post a healthy ChatGPT SOV and near-zero Perplexity SOV from the same prompt set. Blend them and you would never see it.
| Engine | Where it pulls answers from (top source signal) | What that means for your SOV |
|---|---|---|
| ChatGPT | Wikipedia-heavy (47.9% of top-10 sources) | Encyclopedic authority and a clean entity win |
| Perplexity | Reddit-heavy (46.7% of top-10 sources) | Community footprint and discussion win |
| Google AI Overviews | Balanced (Reddit 21.0%, YouTube 18.8%) | Mixed signals; classic fundamentals still carry |
So track at least the major engines separately: ChatGPT, Gemini, Perplexity, Claude and Google AI Overviews. Report a per-engine SOV for each, then a blended view only as a rough headline. The per-engine split is what tells you where you are losing and which source to earn next — a Wikipedia entry, a stronger Reddit presence, fresher pages, or schema-marked content. Engine-by-engine playbooks live in how to rank in ChatGPT and how to rank in Perplexity.
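Here is a minimal sketch of a per-engine split next to the blended headline. The record format (an engine name plus the list of tracked brands named in that answer) and the function name are assumptions for illustration, and the sample data is invented to show the masking effect.

```python
from collections import Counter


def sov_by_engine(records: list[tuple[str, list[str]]], brand: str) -> dict[str, float]:
    """Per-engine SOV: answers naming `brand` / answers naming any tracked brand."""
    competitive: Counter = Counter()
    hits: Counter = Counter()
    for engine, brands in records:
        if not brands:
            continue  # names nobody in the category, so it is not part of the contest
        competitive[engine] += 1
        if brand in brands:
            hits[engine] += 1
    return {engine: 100.0 * hits[engine] / competitive[engine] for engine in competitive}


# Invented data: strong on one engine, invisible on the other.
records = [
    ("chatgpt", ["You", "Rival"]),
    ("chatgpt", ["You"]),
    ("chatgpt", ["You", "Rival"]),
    ("chatgpt", ["Rival"]),
    ("perplexity", ["Rival"]),
    ("perplexity", ["Rival"]),
    ("perplexity", ["Rival"]),
    ("perplexity", ["Rival"]),
]

per_engine = sov_by_engine(records, "You")
# Blended view: relabel every record with the same engine name.
blended = sov_by_engine([("all", brands) for _, brands in records], "You")["all"]
print(per_engine, blended)
```

In this sample the blended number reads as a middling 37.5%, which hides that the brand holds 75% on one engine and 0% on the other.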
Per-engine SOV is exactly what SourceWatch measures: it tracks whether ChatGPT, Perplexity, Gemini and Claude cite your brand and how your share compares to competitors on each engine — on a schedule — alongside the real AI-crawler and AI-referral traffic actually hitting your pages.
Pair SOV with citation rate and sentiment
Share of Voice tells you how often you show up relative to rivals. It does not tell you whether the engine cited you as a source, recommended you as the answer, or named you while trashing you. Two companion metrics close those gaps — Search Engine Land recommends pairing all three, and AI citation tracking is how you capture the cite-level detail.
- **Citation Rate** — the percentage of AI answers that cite your brand. SOV is relative (you vs competitors); citation rate is absolute (how often you appear at all). Read together, they separate "small market, winning it" from "big market, losing it."
- **Sentiment** — positive, neutral, or negative. You can hold a high SOV and still be losing if the engine keeps mentioning you as the cautionary example. Tone is invisible to a mention count.
- **Cite vs. recommend** — being named as a *source* the engine read is different from being recommended as the *solution*. Entity recommendation ("the best option is X") is worth far more than a passing citation, and SOV alone will not distinguish them.
The "mentioned but trashed" trap
A pure SOV dashboard can show you climbing while your reputation in the answers erodes. Always read SOV next to sentiment. A rising share of *negative* mentions is a problem disguised as progress.
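The three readings can be computed from the same set of answers. The sketch below assumes each answer is stored as a dict with a `brands` list and a per-brand `sentiment` label; those field names, the function name, and the sample data are hypothetical, and for simplicity it treats any mention as a citation.

```python
def companion_metrics(answers: list[dict], brand: str) -> dict[str, float]:
    """Return SOV, citation rate, and negative share for one brand."""

    def pct(part: int, whole: int) -> float:
        return 100.0 * part / whole if whole else 0.0

    competitive = [a for a in answers if a["brands"]]          # names you or a rival
    mine = [a for a in competitive if brand in a["brands"]]    # names you
    negative = [a for a in mine if a["sentiment"].get(brand) == "negative"]
    return {
        "sov": pct(len(mine), len(competitive)),       # relative: you vs competitors
        "citation_rate": pct(len(mine), len(answers)), # absolute: share of all answers
        "negative_share": pct(len(negative), len(mine)),
    }


# Invented sample: 10 answers, 4 of which name nobody in the category.
answers = (
    [{"brands": [], "sentiment": {}}] * 4
    + [{"brands": ["Rival"], "sentiment": {"Rival": "positive"}}] * 3
    + [{"brands": ["You", "Rival"],
        "sentiment": {"You": "negative", "Rival": "positive"}}] * 2
    + [{"brands": ["You"], "sentiment": {"You": "positive"}}]
)
metrics = companion_metrics(answers, "You")
print(metrics)
```

The sample shows the trap in numbers: a 50% SOV looks healthy, yet two of the three answers naming the brand are negative.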
There is a real business reason to get this right beyond vanity: AI referral traffic now converts. Adobe found AI-referred traffic converted 42% better than non-AI traffic in March 2025, a striking reversal from being 43% *worse* in July 2024, and Semrush reported AI-referred visitors converting at roughly 4.4x organic for informational queries. Treat those vendor figures as directional, not gospel, but the direction is clear: showing up well in AI answers is no longer a soft branding win.
The prompt set IS the measurement
Here is the uncomfortable truth most teams learn late: the formula is the easy part. The prompt set — the list of questions you run through each engine — is what determines whether your SOV reflects reality or noise. Get the prompts wrong and a perfectly correct formula produces a perfectly useless number.
Build it in three layers
1. Brand positioning prompts
Start from how you describe yourself and the category you compete in. These are the "best [category] tool" and "alternatives to [competitor]" commercial-intent queries, the ones where a recommendation actually changes a purchase. Vague informational prompts ("what is a CRM") do not test share of voice; buying-intent prompts do.
2. Voice-of-customer prompts
Mine how real buyers phrase things: G2 and Capterra review language, Reddit and community threads, and your own sales-call recordings. Buyers do not ask the way you write marketing copy. The closer your prompts match real phrasing, the closer your SOV matches reality.
3. Search-data prompts
Pull from "People Also Ask," autocomplete and your real search queries. This grounds the set in demand that actually exists rather than questions you wish people asked.
Then organize the set by category × funnel stage × persona so you can slice SOV by segment — your share in "enterprise buyers comparing options" may look nothing like your share in "small-business first-time buyers." Start at around 50 high-intent prompts (Search Engine Land's suggested floor) and scale toward 100–200 as you mature. More prompts means a more stable number, not just a bigger one.
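One way to make the category × funnel stage × persona slicing concrete is to tag every prompt when you add it to the set. The sketch below is only one possible layout: the `Prompt` fields, the segment labels, and the sample prompts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    text: str
    category: str
    funnel_stage: str
    persona: str


def group_by_segment(prompts: list[Prompt]) -> dict[tuple[str, str, str], list[Prompt]]:
    """Bucket prompts by (category, funnel_stage, persona) so SOV can be sliced later."""
    segments: dict[tuple[str, str, str], list[Prompt]] = defaultdict(list)
    for prompt in prompts:
        segments[(prompt.category, prompt.funnel_stage, prompt.persona)].append(prompt)
    return dict(segments)


# Hypothetical prompts for illustration only.
prompt_set = [
    Prompt("best CRM for a 10-person team", "crm", "comparison", "small-business"),
    Prompt("CRM alternatives for a small sales team", "crm", "comparison", "small-business"),
    Prompt("best enterprise CRM with audit logs", "crm", "comparison", "enterprise"),
]
segments = group_by_segment(prompt_set)
for key, items in segments.items():
    print(key, len(items))
```

Freezing the dataclass is a small design choice that fits the measurement goal: prompts are meant to stay constant between runs, so they should not be editable in place.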
Run it on a cadence, not once
LLM answers are non-deterministic — ask the same question twice and you can get two different brand lists. A single snapshot is noise. Re-run the same fixed prompt set on a schedule (weekly or monthly), hold the prompts constant, and trend the line. The movement over time is the signal; any one reading is not.
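A small sketch of what "trend the line" can look like in practice: run the fixed prompt set several times per period, then report the mean and the spread. The function name, the ISO-date keys, and the sample readings are invented for illustration.

```python
from statistics import mean, pstdev


def trend(runs_by_date: dict[str, list[float]]) -> list[tuple[str, float, float]]:
    """Return (date, mean SOV, spread) per period, oldest first.

    runs_by_date maps an ISO date string to the SOV readings from repeated
    runs of the same fixed prompt set on that date.
    """
    return [
        (date, mean(readings), pstdev(readings))
        for date, readings in sorted(runs_by_date.items())
        if readings
    ]


# Made-up readings: three runs per week of the same prompt set.
history = {
    "2025-01-13": [24.0, 26.0, 25.0],
    "2025-01-06": [18.0, 22.0, 20.0],
}
for date, avg, spread in trend(history):
    print(f"{date}: {avg:.1f}% (+/- {spread:.1f})")
```

Reporting the spread next to the mean helps separate a real move from run-to-run variation: a week-over-week change smaller than the spread is probably noise.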
For the broader practice of monitoring these answers over time, see the pillar on how to track AI mentions.
Common mistakes that make AI SOV meaningless
Almost every "our AI SOV looks weird" case is one of these. Check them before you trust any number on a dashboard.
- **No competitive denominator** — counting your own mentions in isolation. "We got 5 mentions" is a vanity metric without knowing competitors got 50. SOV requires the you-or-a-rival denominator.
- **Flat counts that ignore position** — treating a #1 recommendation and a #5 footnote as equal. Weight by position so prominence beats mere presence.
- **One-shot queries instead of a stable set** — asking once and trusting it. Answers vary run to run; you need a fixed prompt set re-run on a cadence and trended.
- **Blending engines into one number** — sourcing behavior differs wildly (Wikipedia-heavy ChatGPT vs Reddit-heavy Perplexity). A blended SOV hides where you are actually losing.
- **Measuring presence but ignoring sentiment and cite-vs-recommend** — a high but negative or merely-cited SOV is not the win the number implies.
- **Prompts that do not map to buyer intent** — testing "what is a CRM" instead of "best CRM for a 10-person team." Commercial-intent prompts are where recommendations — and share — actually happen.
Fix the prompt set and the denominator first; they cause more bad SOV numbers than any tooling gap. Everything else — weighting, per-engine splits, sentiment — refines a measurement that is already pointed at the right question.
Not sure where your brand stands on any engine yet? Start with a free check of whether ChatGPT, Perplexity, Gemini and Claude can even read and recognize your site — then build your prompt set from there.