Share of Voice in AI Search: How to Measure It

When a buyer asks ChatGPT "what is the best CRM," the engine names a handful of brands. AI Share of Voice answers the only question that matters next: across all the prompts your buyers actually ask, how often is one of those brands you — versus your competitors? It is a competitive, zero-sum metric. Every brand in a market shares one pie that adds up to 100%, which is exactly why it beats counting your own mentions in isolation. "We got 5 mentions" tells you nothing if your rival got 50. This guide gives you the formula, a worked example, the position-weighting upgrade that the peer-reviewed research backs, and the prompt-set discipline that makes the number trustworthy. For the deeper AI Share of Voice breakdown, see the main guide.

TL;DR

  • **AI Share of Voice = your brand mentions ÷ (your brand mentions + competitor mentions) across your prompt set × 100.** It is a *competitive* metric: counted by mentions, all brands in the market sum to 100%.
  • **Worked example:** 300 of your mentions across 1,500 prompt outputs that name you or a rival = **20% SOV**. A raw "300 mentions" alone is a vanity number; the denominator is the whole point.
  • **Position-weight it.** Being recommended #1 is not the same as being named #5. A common weight is **1 ÷ position** (P1 = 1.00, P5 = 0.20). The peer-reviewed GEO paper weights visibility by an *exponential* decay on position — presence is not prominence.
  • **Measure per engine, never blended.** Across **680M citations**, ChatGPT leaned Wikipedia (47.9% of top sources) while Perplexity leaned Reddit (46.7%). The same brand can be loud on one engine and invisible on another.
  • **Pair SOV with citation rate and sentiment.** SOV alone misses tone and the cite-vs-recommend distinction — you can be mentioned but trashed, or named without being recommended.
  • **The prompt set IS the measurement.** Most teams get the formula right and the prompt set wrong. Start at 50 high-intent prompts; trend it on a fixed cadence because answers vary run to run.

What AI Share of Voice actually is (and why it beats a mention count)

AI Share of Voice is your brand's share of AI-answer mentions relative to your competitors, not a raw count of your own mentions. That distinction is the whole game. A mention count is absolute and contextless: "we appeared 5 times" could be brilliant or catastrophic depending entirely on what everyone else got. Share of Voice reframes it as a competitive ratio, so the number answers the question that matters every time a buyer opens ChatGPT: who does the AI recommend, and how often is it you versus them?

The reframe in one line

Stop asking "did the AI mention me?" Start asking "of all the times the AI named a brand like mine, what slice was me?" The first is a vanity check. The second is a market-share read on the surface where buying decisions now start.

Because it is a share, every brand in a category competes for one pie that sums to 100%. If you gain, someone loses. That is what makes it strategically useful: it tracks competitive position over time, not just whether you exist. It is the AI-search version of the share-of-voice metric marketers have used for decades — see the Share of Voice definition for the lineage — pointed at ChatGPT, Perplexity, Gemini and Google AI Overviews instead of TV ad spend.

The formula, with a worked example

The simplest, most widely agreed formula — practitioners and Search Engine Land converge on it — is this:

AI Share of Voice formula

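SOV = your brand mentions ÷ (your brand mentions + competitor mentions) across your prompt set × 100. Search Engine Land phrases it at the answer level: "answers mentioning your brand divided by answers mentioning your brand or competitors." The two counts differ when one answer names several brands: mention-level shares sum to 100% across the market, while answer-level shares can overlap. Pick one counting unit and hold it constant.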

The denominator is the part people skip — and skipping it is what turns a real metric into a vanity one. You only count answers that name you OR a competitor, because those are the answers where the recommendation actually happened. An answer that names nobody in your category is not part of the contest.

Worked example

  1. You run a stable set of buyer prompts and collect **1,500 answer outputs** that mention you or a competitor.
  2. Across those, your brand is named **300 times**.
  3. **300 ÷ 1,500 = 0.20 → 20% Share of Voice.**

Now read that against the alternative: reporting "300 mentions" with no denominator. Is 300 good? You cannot know. If the market leader pulls 750 of those 1,500 (50% SOV) and you sit at 20%, you are a distant second and the chart is a wake-up call. The denominator is not a technicality — it is the entire signal.
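
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the brand names and the split of the remaining 450 mentions are hypothetical, and it assumes the 1,500 figure is the total pool of counted brand mentions, so that shares sum to 100%.

```python
def share_of_voice(mentions: dict[str, int]) -> dict[str, float]:
    """Return each brand's share, in percent, of all counted brand mentions."""
    total = sum(mentions.values())
    if total == 0:
        return {brand: 0.0 for brand in mentions}
    return {brand: 100 * count / total for brand, count in mentions.items()}

# Hypothetical tallies matching the worked example: 1,500 mentions in total.
mentions = {"YourBrand": 300, "Leader": 750, "OtherRivals": 450}
sov = share_of_voice(mentions)

print(sov["YourBrand"])  # 20.0
print(sov["Leader"])     # 50.0
```

Reporting only `mentions["YourBrand"]` would hide the fact that the leader holds two and a half times your share.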

20% Share of Voice: 300 brand mentions across 1,500 competitive answer outputs. The denominator is what makes the number mean anything.

Want to see which engines name your brand at all before you build a prompt set? Run a free AI SEO audit — it checks whether ChatGPT, Perplexity, Gemini and Claude can read and recognize your site in about 15 seconds.

Position-weighting: why being #1 is not the same as being #5

A flat count — even a competitive one — treats every mention as equal. It is not. In an AI answer, the brand named first is the recommendation; the brand named fifth is a footnote a reader may never reach. If you only ever appear last, a flat SOV flatters you. Position-weighting fixes that by scoring prominence, not just presence.

The simple version: 1 ÷ position

A common, easy-to-explain weight gives each mention a value of 1 ÷ its position in the answer: 1st place = 1.00, 2nd = 0.50, 3rd = 0.33, and so on. A brand that appears five times but always in fifth place scores 5 × 0.20 = 1.00 total weight, the same as a single first-place mention. If each of those five answers names five brands, that works out to roughly 9% weighted SOV against 20% on a flat count. Same five mentions, very different reality.

| Position in answer | Flat count value | Weighted value (1 ÷ position) |
| --- | --- | --- |
| 1st (the recommendation) | 1 | 1.00 |
| 2nd | 1 | 0.50 |
| 3rd | 1 | 0.33 |
| 4th | 1 | 0.25 |
| 5th (a footnote) | 1 | 0.20 |
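
One way to apply the 1 ÷ position weight is sketched below. It assumes you have already extracted, for each answer, the brands in the order the engine named them. That extraction step, the function name, and the sample data are all hypothetical.

```python
from collections import defaultdict

def weighted_sov(answers: list[list[str]]) -> dict[str, float]:
    """Position-weighted share of voice, in percent.

    Each answer is the list of brands in the order the engine named them.
    A mention at position p (1-based) earns a weight of 1 / p.
    """
    weights: dict[str, float] = defaultdict(float)
    for ranked_brands in answers:
        for position, brand in enumerate(ranked_brands, start=1):
            weights[brand] += 1 / position
    total = sum(weights.values())
    if total == 0:
        return {}
    return {brand: 100 * w / total for brand, w in weights.items()}

# Hypothetical data: five answers, five brands each, "You" always named last.
answers = [["A", "B", "C", "D", "You"]] * 5
result = weighted_sov(answers)

print(round(result["You"], 1))  # 8.8, versus 20.0 on a flat count
```

The weights are normalized across all brands, so the weighted shares still sum to 100% and stay comparable with the flat figure.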

The research-backed version: prominence, not presence

This is not a marketer's heuristic — it has peer-reviewed precedent. The GEO paper (Aggarwal et al., a Princeton-led study published at KDD 2024) uses **Position-Adjusted Word Count** as its primary visibility metric: each citing sentence's word count is weighted by an exponential decay based on where it sits in the answer. In plain terms, the model rewards both how early you appear *and* how much of the answer's text you occupy. Their work makes the same point this section does — visibility means prominence, not bare presence. They also defined a richer seven-dimension "subjective impression" metric (relevance, influence, uniqueness, subjective position and count, click likelihood, and diversity) for cases where word count alone is too blunt.
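
The idea of weighting word count by an exponential decay on position can be sketched as follows. This is an illustration in the spirit of that metric, not the paper's exact definition: the decay rate, the normalization, and the example domains are assumptions made for this sketch.

```python
import math
from collections import defaultdict

def position_adjusted_share(sentences: list[tuple[str, str]]) -> dict[str, float]:
    """Share of decayed word count per cited source, in percent.

    `sentences` lists (source, sentence_text) pairs in answer order. Each
    sentence contributes its word count times exp(-index / n), so earlier
    and longer sentences count for more. The decay rate is an assumption.
    """
    n = len(sentences)
    scores: dict[str, float] = defaultdict(float)
    for index, (source, text) in enumerate(sentences):
        scores[source] += len(text.split()) * math.exp(-index / n)
    total = sum(scores.values())
    if total == 0:
        return {}
    return {source: 100 * s / total for source, s in scores.items()}

# Hypothetical answer: two sentences of equal length citing different sources.
answer = [
    ("brand-a.example", "Brand A is the strongest pick for small teams."),
    ("brand-b.example", "Brand B is a solid alternative for larger teams."),
]
shares = position_adjusted_share(answer)

# Equal word counts, but the earlier sentence earns the larger share.
print(shares["brand-a.example"] > shares["brand-b.example"])  # True
```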

Why this matters for your dashboard

If your tool reports flat mention counts only, you can be "winning" on volume while losing every actual recommendation. Pick a weighting — 1 ÷ position is fine to start — and apply it consistently. The exact curve matters less than being honest that first place and fifth place are not equal.

The same GEO study built a 10,000-query benchmark (GEO-bench) and showed its content methods lifted visibility by up to 30–40% — Quotation Addition alone moved the top metric ~41%. The takeaway for measurement: the things that lift a brand into earlier, more prominent positions are exactly what a position-weighted SOV will reward you for, and a flat count will miss. The tactics behind those lifts are the subject of generative engine optimization.

Measure per engine — blending is a trap

The single biggest measurement mistake after ignoring the denominator is mashing every engine into one number. Engines source their answers from wildly different places, so the same brand can dominate one and be invisible on another. A blended SOV hides exactly the gap you need to fix.

A Profound study of **680 million citations** (August 2024–June 2025) makes the divergence concrete. ChatGPT leaned heavily on Wikipedia — 47.9% of its top-10 sources. Perplexity leaned on Reddit — 46.7% of its top-10. Google AI Overviews was more balanced (Reddit 21.0%, YouTube 18.8%). If your brand has a strong Wikipedia presence but no community footprint, you may post a healthy ChatGPT SOV and near-zero Perplexity SOV from the same prompt set. Blend them and you would never see it.

| Engine | Where it pulls answers from (top source signal) | What that means for your SOV |
| --- | --- | --- |
| ChatGPT | Wikipedia-heavy (47.9% of top sources) | Encyclopedic authority and a clean entity wins |
| Perplexity | Reddit-heavy (46.7% of top sources) | Community footprint and discussion wins |
| Google AI Overviews | Balanced (Reddit 21.0%, YouTube 18.8%) | Mixed signals; classic fundamentals still carry |

So track at least the major engines separately: ChatGPT, Gemini, Perplexity, Claude and Google AI Overviews. Report a per-engine SOV for each, then a blended view only as a rough headline. The per-engine split is what tells you where you are losing and which source to earn next — a Wikipedia entry, a stronger Reddit presence, fresher pages, or schema-marked content. Engine-by-engine playbooks live in how to rank in ChatGPT and how to rank in Perplexity.
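
To see how a blended figure can hide a per-engine gap, consider the sketch below. The mention tallies are invented for illustration and are not taken from the Profound study.

```python
def sov_percent(brand: str, mentions: dict[str, int]) -> float:
    """Share of all counted mentions, in percent, held by `brand`."""
    total = sum(mentions.values())
    return 100 * mentions.get(brand, 0) / total if total else 0.0

# Hypothetical mention tallies per engine, from the same prompt set.
by_engine = {
    "ChatGPT": {"You": 120, "RivalA": 60, "RivalB": 20},
    "Perplexity": {"You": 5, "RivalA": 110, "RivalB": 85},
}

per_engine = {engine: sov_percent("You", m) for engine, m in by_engine.items()}

# Blend by pooling every engine's counts into one tally.
blended_counts: dict[str, int] = {}
for engine_mentions in by_engine.values():
    for brand, count in engine_mentions.items():
        blended_counts[brand] = blended_counts.get(brand, 0) + count
blended = sov_percent("You", blended_counts)

print(per_engine)  # {'ChatGPT': 60.0, 'Perplexity': 2.5}
print(blended)     # 31.25
```

A blended 31.25% looks respectable, while the per-engine view shows the brand is nearly absent from one engine.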

Per-engine SOV is exactly what SourceWatch measures: it tracks whether ChatGPT, Perplexity, Gemini and Claude cite your brand and how your share compares to competitors on each engine — on a schedule — alongside the real AI-crawler and AI-referral traffic actually hitting your pages.

Pair SOV with citation rate and sentiment

Share of Voice tells you how often you show up relative to rivals. It does not tell you whether the engine cited you as a source, recommended you as the answer, or named you while trashing you. Two companion metrics close those gaps — Search Engine Land recommends pairing all three, and AI citation tracking is how you capture the cite-level detail.

  • **Citation Rate** — the percentage of AI answers that cite your brand. SOV is relative (you vs competitors); citation rate is absolute (how often you appear at all). Read together, they separate "small market, winning it" from "big market, losing it."
  • **Sentiment** — positive, neutral, or negative. You can hold a high SOV and still be losing if the engine keeps mentioning you as the cautionary example. Tone is invisible to a mention count.
  • **Cite vs. recommend** — being named as a *source* the engine read is different from being recommended as the *solution*. Entity recommendation ("the best option is X") is worth far more than a passing citation, and SOV alone will not distinguish them.

The "mentioned but trashed" trap

A pure SOV dashboard can show you climbing while your reputation in the answers erodes. Always read SOV next to sentiment. A rising share of *negative* mentions is a problem disguised as progress.
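
A minimal sketch of the companion metrics follows, assuming each answer has already been labeled for citation, recommendation and tone. The labels and field names are hypothetical; in practice they would come from manual review or a classifier, which is outside this sketch.

```python
from collections import Counter

# Hypothetical labeled answers for one brand across a prompt set.
answers = [
    {"cited": True,  "recommended": True,  "sentiment": "positive"},
    {"cited": True,  "recommended": False, "sentiment": "negative"},
    {"cited": False, "recommended": False, "sentiment": None},
    {"cited": True,  "recommended": False, "sentiment": "negative"},
]

def companion_metrics(answers: list[dict]) -> dict[str, float]:
    """Citation rate, recommendation rate and negative share, in percent."""
    total = len(answers)
    cited = [a for a in answers if a["cited"]]
    tone = Counter(a["sentiment"] for a in cited)
    recommended = sum(1 for a in answers if a["recommended"])
    return {
        "citation_rate": 100 * len(cited) / total if total else 0.0,
        "recommend_rate": 100 * recommended / total if total else 0.0,
        "negative_share": 100 * tone["negative"] / len(cited) if cited else 0.0,
    }

metrics = companion_metrics(answers)
print(metrics["citation_rate"])             # 75.0
print(metrics["recommend_rate"])            # 25.0
print(round(metrics["negative_share"], 1))  # 66.7
```

Here the brand is cited in three of four answers, yet recommended in only one and described negatively in two of the three citations, which is the trap described above.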

There is a real business reason to get this right beyond vanity: AI referral traffic now converts. Adobe found AI-referred traffic converted 42% better than non-AI traffic in March 2025 — a striking reversal from being 43% *worse* in July 2024 — and Semrush reported AI-referred visitors converting at roughly 4.4x organic for informational queries. Frame those vendor figures as directional, not gospel, but the direction is clear: showing up well in AI answers is no longer a soft branding win.

The prompt set IS the measurement

Here is the uncomfortable truth most teams learn late: the formula is the easy part. The prompt set — the list of questions you run through each engine — is what determines whether your SOV reflects reality or noise. Get the prompts wrong and a perfectly correct formula produces a perfectly useless number.

Build it in three layers

  1. **Brand positioning prompts.** Start from how you describe yourself and the category you compete in. These are the "best [category] tool" and "alternatives to [competitor]" commercial-intent queries, the ones where a recommendation actually changes a purchase. Vague informational prompts ("what is a CRM") do not test share of voice; buying-intent prompts do.

  2. **Voice-of-customer prompts.** Mine how real buyers phrase things: G2 and Capterra review language, Reddit and community threads, and your own sales-call recordings. Buyers do not ask the way you write marketing copy. The closer your prompts match real phrasing, the closer your SOV matches reality.

  3. **Search-data prompts.** Pull from "People Also Ask," autocomplete and your real search queries. This grounds the set in demand that actually exists rather than questions you wish people asked.

Then organize the set by category × funnel stage × persona so you can slice SOV by segment — your share in "enterprise buyers comparing options" may look nothing like your share in "small-business first-time buyers." Start at around 50 high-intent prompts (Search Engine Land's suggested floor) and scale toward 100–200 as you mature. More prompts means a more stable number, not just a bigger one.
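
Tagging each prompt with its segment makes the slicing straightforward. The sketch below is one possible layout: the `Prompt` fields, the sample prompts and the brand names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Prompt:
    text: str
    category: str
    funnel_stage: str
    persona: str

# Hypothetical results: each prompt paired with the brands named in its answer.
results = [
    (Prompt("best CRM for a 10-person team", "CRM", "consideration", "small business"),
     ["RivalA", "You"]),
    (Prompt("alternatives to RivalA for startups", "CRM", "consideration", "small business"),
     ["You", "RivalB"]),
    (Prompt("best enterprise CRM for compliance", "CRM", "decision", "enterprise"),
     ["RivalA", "RivalB"]),
]

def sov_by_segment(results, brand: str, key: str) -> dict[str, float]:
    """Flat mention-level SOV, in percent, grouped by a Prompt attribute."""
    mine: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for prompt, brands in results:
        segment = getattr(prompt, key)
        total[segment] += len(brands)
        mine[segment] += brands.count(brand)
    return {seg: 100 * mine[seg] / total[seg] for seg in total if total[seg]}

by_persona = sov_by_segment(results, "You", "persona")
print(by_persona)  # {'small business': 50.0, 'enterprise': 0.0}
```

The same function can group by `funnel_stage` or `category` without any change to the stored data.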

Run it on a cadence, not once

LLM answers are non-deterministic — ask the same question twice and you can get two different brand lists. A single snapshot is noise. Re-run the same fixed prompt set on a schedule (weekly or monthly), hold the prompts constant, and trend the line. The movement over time is the signal; any one reading is not.
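
A simple way to separate trend from run-to-run noise is to summarize several readings per period. The sketch below assumes you run the fixed prompt set more than once per week, which goes beyond what the text above requires, and the SOV readings are invented.

```python
from statistics import mean, pstdev

# Hypothetical SOV readings (percent): repeated runs of the same prompt set.
weekly_runs = {
    "2025-W01": [18.0, 22.0, 20.0],
    "2025-W02": [21.0, 19.0, 23.0],
    "2025-W03": [24.0, 26.0, 25.0],
}

def trend(weekly_runs: dict[str, list[float]]) -> list[tuple[str, float, float]]:
    """Return (week, mean SOV, spread across runs), ordered by week label."""
    return [
        (week, mean(runs), pstdev(runs))
        for week, runs in sorted(weekly_runs.items())
    ]

for week, avg, spread in trend(weekly_runs):
    print(f"{week}: mean {avg:.1f}%, spread ±{spread:.1f}")
```

A change between weeks that is smaller than the spread within a week is more likely noise than movement.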

For the broader practice of monitoring these answers over time, see the pillar on how to track AI mentions.

Common mistakes that make AI SOV meaningless

Almost every "our AI SOV looks weird" case is one of these. Check them before you trust any number on a dashboard.

  • **No competitive denominator** — counting your own mentions in isolation. "We got 5 mentions" is a vanity metric without knowing competitors got 50. SOV requires the you-or-a-rival denominator.
  • **Flat counts that ignore position** — treating a #1 recommendation and a #5 footnote as equal. Weight by position so prominence beats mere presence.
  • **One-shot queries instead of a stable set** — asking once and trusting it. Answers vary run to run; you need a fixed prompt set re-run on a cadence and trended.
  • **Blending engines into one number** — sourcing behavior differs wildly (Wikipedia-heavy ChatGPT vs Reddit-heavy Perplexity). A blended SOV hides where you are actually losing.
  • **Measuring presence but ignoring sentiment and cite-vs-recommend** — a high but negative or merely-cited SOV is not the win the number implies.
  • **Prompts that do not map to buyer intent** — testing "what is a CRM" instead of "best CRM for a 10-person team." Commercial-intent prompts are where recommendations — and share — actually happen.

Fix the prompt set and the denominator first; they cause more bad SOV numbers than any tooling gap. Everything else — weighting, per-engine splits, sentiment — refines a measurement that is already pointed at the right question.

Not sure where your brand stands on any engine yet? Start with a free check of whether ChatGPT, Perplexity, Gemini and Claude can even read and recognize your site — then build your prompt set from there.

Frequently asked questions

What is AI Share of Voice?

AI Share of Voice is your brand's share of AI-answer mentions relative to your competitors — not a raw count of your own mentions. It is a competitive, zero-sum metric: every brand in a category shares one pie that sums to 100%. It answers "when buyers ask an AI engine, how often is the recommended brand me versus them?" rather than the weaker "did the AI mention me at all?"

How do you calculate AI Share of Voice?

SOV = (your brand mentions ÷ total brand + competitor mentions across your prompt set) × 100. You only count answers that name you or a competitor, because those are the answers where a recommendation actually happened. Example: 300 of your mentions across 1,500 competitive answer outputs = 20% Share of Voice. The denominator is the whole point — "300 mentions" alone is meaningless without knowing what competitors got.

Source: Search Engine Land, "Measure brand visibility in AI search"

Why is Share of Voice better than just counting my own mentions?

A raw mention count is absolute and contextless — "we got 5 mentions" could be excellent or terrible depending entirely on what everyone else got. Share of Voice adds the competitive denominator, turning the number into a market-share read: your slice of all the recommendations in your category. If a rival pulls 50 mentions to your 5, the count looks fine but the SOV exposes that you are being buried.

Should I weight mentions by their position in the AI answer?

Yes. The brand named first is the recommendation; the brand named fifth is a footnote, and a flat count treats them as equal. A simple weight of 1 ÷ position (1st = 1.00, 5th = 0.20) is enough to start. The peer-reviewed GEO study goes further, weighting visibility by an exponential decay on position and by how much of the answer's text you occupy — its point is that visibility means prominence, not bare presence.

Source: Aggarwal et al., "GEO: Generative Engine Optimization" (arXiv)

Should I measure AI Share of Voice per engine or blend it into one number?

Per engine, always. Engines source answers from very different places — across 680M citations, ChatGPT leaned Wikipedia (47.9% of top sources) while Perplexity leaned Reddit (46.7%) — so the same brand can be loud on one engine and invisible on another. A blended SOV hides exactly the gap you need to fix. Track ChatGPT, Gemini, Perplexity, Claude and Google AI Overviews separately, and use a blended figure only as a rough headline.

Source: Profound, "AI Platform Citation Patterns" (680M citations)

What should I track alongside Share of Voice?

Pair SOV with citation rate (the percentage of AI answers that cite your brand at all) and sentiment (positive, neutral, negative). SOV alone misses tone and the cite-vs-recommend distinction — you can hold a high share while being mentioned negatively, or be named as a source without ever being recommended as the solution. Reading all three together is what separates real progress from a flattering chart.

Source: Search Engine Land, "Measure brand visibility in AI search"

How many prompts do I need to measure AI Share of Voice reliably?

Start at around 50 high-intent prompts and scale toward 100–200 as you mature. Build them in three layers: brand-positioning and "best [category]" commercial queries, voice-of-customer phrasing (G2 reviews, Reddit, sales calls), and real search data (People Also Ask, autocomplete). Then re-run the same fixed set on a schedule, because LLM answers vary run to run — the trend over time is the signal, not any single snapshot.

Source: Alex Birkett, "AI Share of Voice" (formula + prompt-set method)

Does showing up in AI answers actually drive results, or is it just a vanity metric?

It increasingly drives results. Adobe found AI-referred traffic converted 42% better than non-AI traffic in March 2025 (a reversal from 43% worse in July 2024), and Semrush reported AI-referred visitors converting at roughly 4.4x organic for informational queries. Treat those vendor figures as directional rather than guarantees, but the direction is unmistakable: a strong, well-weighted AI Share of Voice maps to real downstream value, not just exposure.

Source: Adobe, "The explosive rise of generative AI referral traffic"
