What AI share of voice actually measures
Share of voice is an old advertising idea — your share of the total noise in a market — adapted to generative engines. In AI search, the "noise" is brand mentions inside answers. When someone asks Perplexity "what's the best payroll software for a 20-person team," the engine names a short list. AI share of voice is the percentage of those named slots that belong to you, across all the prompts your category is searched with.
The formula
AI Share of Voice = (your brand mentions ÷ total brand mentions across all tracked brands) × 100. Example: across your tracked prompts, AI answers produce 100 brand mentions and 25 are yours → 25% share of voice. Read the full method on the share of voice definition page.
Two things about that number trip up almost every tool and every spreadsheet:
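- **Mention rates don't sum to 100%.** The formula above splits total mentions, so those shares do add up to 100%. The companion figure, mention rate (the share of answers that name you), does not: AI answers routinely name several brands in one response, so competitors' rates overlap. If three brands each appear in 60% of answers, that's real, not a rounding error. Treat mention rate as "how often you're in the room," not a slice of a fixed pie.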
- **The benchmark is relative, not absolute.** In a two-horse market, ~50% is parity. In a fragmented category with ten credible alternatives, 15% can be outright leadership. Most B2B brands appear in under 30% of their relevant queries — so context, not a magic number, tells you whether you're winning.
Presence alone also undersells the metric. Being named last in a list of five is not the same as being named first. A position-weighted view (a simple Weight = 1 ÷ position) captures whether you're the lead recommendation or the afterthought — and the lead slot carries the real trust signal. SourceWatch tracks both: how often you appear, and where in the answer you land.
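The weighting can be sketched as follows. The `1 / position` weight comes from the text above; averaging it across answers, and scoring an answer that omits the brand as 0, are assumptions made for this example. The brand names are hypothetical.

```python
def position_weighted_score(answers, brand):
    """Average of 1/position for `brand`; an answer that omits it scores 0."""
    if not answers:
        return 0.0
    total = 0.0
    for ranked_brands in answers:
        if brand in ranked_brands:
            position = ranked_brands.index(brand) + 1  # 1-based slot in the answer
            total += 1 / position
    return total / len(answers)


# Hypothetical answers, each listing brands in the order they were named.
answers = [
    ["Acme", "Bolt", "Cedar"],          # Acme first  -> weight 1.0
    ["Bolt", "Acme"],                   # Acme second -> weight 0.5
    ["Bolt", "Cedar"],                  # Acme absent -> weight 0.0
    ["Cedar", "Bolt", "Dune", "Acme"],  # Acme fourth -> weight 0.25
]

score = position_weighted_score(answers, "Acme")
bolt_score = position_weighted_score(answers, "Bolt")
print(score, bolt_score)
```

Here Acme appears in three of four answers but scores lower than Bolt, which is named in all four and leads two of them.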
Why a blended number lies to you
This is the single most important thing to get right, and the place most dashboards quietly mislead you. The same brand, on the same prompt set, scores very differently from one engine to the next. Each engine has its own training data, its own live-search layer, and its own taste in sources. A brand can dominate one and be nearly invisible in another.
| Engine | Illustrative SOV range | What it tells you |
|---|---|---|
| Perplexity | ~28–38% | Source-heavy and citation-driven — often the easiest to influence with quotable content |
| Gemini | ~12–20% | Tied to Google's index and signals; rewards established web presence |
| ChatGPT | ~10–16% | Huge reach; a brand can lead here and trail elsewhere |
| Claude | ~3–7% | Conservative on recommendations; mentions are harder-won and high-signal |
Illustrative only — not a benchmark, not your numbers
These ranges are directional, drawn from the field to make a point. They are not SourceWatch's own research, not a published benchmark, and not what your account will show. Your real per-engine spread depends entirely on your category and prompt set. The pattern is what matters: a brand can hold ~35% in one engine and ~15% in another for the very same queries. Blend them into one average and you erase exactly the signal you need: which engine to fix first.
So SourceWatch reports share of voice **separately per engine**, every time. You see that you're strong in Perplexity but losing in Gemini, instead of a single comfortable-looking average that hides the gap. That's the difference between a number you can act on and a number you can only nod at.
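To see how a blended figure hides the gap, here is a small sketch with made-up tallies chosen to sit inside the illustrative ranges in the table above. The data layout and function names are assumptions for this example only.

```python
# Hypothetical tallies per engine: (your mentions, total brand mentions).
tallies = {
    "Perplexity": (33, 100),
    "Gemini": (16, 100),
    "ChatGPT": (13, 100),
    "Claude": (5, 100),
}


def per_engine_sov(tallies):
    """Share of voice computed separately for each engine."""
    return {engine: 100 * yours / total for engine, (yours, total) in tallies.items()}


def blended_sov(tallies):
    """One pooled figure across all engines (the number that hides the spread)."""
    yours = sum(y for y, _ in tallies.values())
    total = sum(t for _, t in tallies.values())
    return 100 * yours / total


by_engine = per_engine_sov(tallies)
blended = blended_sov(tallies)
weakest = min(by_engine, key=by_engine.get)  # the engine to look at first
print(by_engine, blended, weakest)
```

The pooled figure lands near the middle of the pack, while the per-engine view shows both the strongest engine and the one that needs attention.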
Why a screenshot isn't a measurement
Ask an AI engine the same question twice and you'll often get two different answers — different brands, different order, different wording. This is non-determinism, and it's baked into how these models generate text; it persists even at temperature zero. It's why "I asked ChatGPT and we weren't there" is an anecdote, not data. The next run might name you.
Credible measurement controls for it the same way any noisy signal is measured — by sampling. SourceWatch runs each prompt multiple times (typically 3–5+) on a regular cadence and averages the result, so your share of voice reflects the engine's actual tendency rather than one lucky or unlucky roll. A single observation is statistically unreliable by design.
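As a rough illustration of why one run is not enough, the sketch below averages a brand's presence over repeated runs of a single prompt. The five runs and the brand names are invented, and the helper is a simplified stand-in rather than the product's actual sampling logic.

```python
from statistics import mean


def sampled_mention_rate(runs, brand):
    """Fraction of repeated runs of one prompt whose answer names `brand`."""
    return mean([1.0 if brand in answer else 0.0 for answer in runs])


# Hypothetical: the same prompt run five times against one engine.
runs = [
    ["Bolt", "Cedar"],          # a single screenshot of this run says "absent"
    ["Acme", "Bolt"],
    ["Acme", "Cedar", "Bolt"],
    ["Bolt", "Dune"],
    ["Acme", "Bolt", "Cedar"],
]

snapshot_says_present = "Acme" in runs[0]      # what one observation reports
rate = sampled_mention_rate(runs, "Acme")      # what the sample reports
print(snapshot_says_present, rate)
```

The first run alone would report the brand as missing, while the sample shows it is named in most runs.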
1. **Fix the inputs.** Lock a consistent prompt set and a consistent competitor list. Share of voice is only comparable over time if the denominator holds still.
2. **Sample, don't snapshot.** Run each prompt several times per cadence so non-determinism averages out instead of masquerading as a trend.
3. **Run on a schedule.** Weekly by default. Answers drift as models update and the web changes, so a one-time reading is stale almost immediately.
4. **Watch the trend, per engine.** The signal is the direction (is your share of voice climbing or sliding in each engine?), not any single week's figure.
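The last step can be sketched as a per-engine trend over weekly readings. The weekly values are invented, and using the average week-over-week change as the trend measure is an assumption for this example; a real report might fit a line or smooth the series instead.

```python
def weekly_trend(history):
    """Average week-over-week change in share of voice, per engine."""
    trends = {}
    for engine, readings in history.items():
        if len(readings) < 2:
            trends[engine] = 0.0  # not enough points to call a direction
        else:
            trends[engine] = (readings[-1] - readings[0]) / (len(readings) - 1)
    return trends


def direction(change):
    """Label a trend value as climbing, sliding, or flat."""
    if change > 0:
        return "climbing"
    if change < 0:
        return "sliding"
    return "flat"


# Hypothetical weekly share-of-voice readings (percent), oldest first.
history = {
    "Perplexity": [30.0, 31.5, 33.0, 34.5],
    "Gemini": [18.0, 17.0, 15.5, 15.0],
}

trends = weekly_trend(history)
labels = {engine: direction(change) for engine, change in trends.items()}
print(trends, labels)
```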
**900M+** weekly active ChatGPT users (OpenAI via TechCrunch, Feb 2026), and that's one engine. Every session is a buyer who might be handed a short list of brands.
The competitor set is the whole product
A share-of-voice number is only as good as the denominator behind it. Count only your own mentions and you get a vanity metric that always looks fine. Pick the wrong rivals and you're benchmarking against companies the engines never mention alongside you. The defensible version of this metric measures the brands that *actually surface* in answers next to you — the real competitive set, not an aspirational one.
That's where SourceWatch starts. It reads the answers, surfaces who else is being named for your prompts, and lets you lock that roster as your benchmark. From there every figure is competitive by construction: your share, their share, who's gaining, who's slipping, and on which engine. You're not measuring whether AI likes you in a vacuum — you're measuring whether AI picks you over the people you're actually up against.
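One way to picture "surfacing who else is being named" is a simple tally of the other brands that show up across the answers to your prompts. This is a sketch under the assumption that brand names have already been extracted from each answer; the function name, the `top_n` cutoff, and the brands are hypothetical.

```python
from collections import Counter


def surfaced_competitors(answers, own_brand, top_n=5):
    """Rank the other brands named across the answers to your tracked prompts."""
    counts = Counter()
    for brands in answers:
        # Count each brand once per answer, skipping your own.
        for brand in dict.fromkeys(brands):
            if brand != own_brand:
                counts[brand] += 1
    return [brand for brand, _ in counts.most_common(top_n)]


# Hypothetical parsed answers for one prompt set.
answers = [
    ["Acme", "Bolt", "Cedar"],
    ["Bolt", "Cedar"],
    ["Acme", "Bolt"],
    ["Bolt", "Dune"],
]

roster = surfaced_competitors(answers, own_brand="Acme", top_n=2)
print(roster)
```

Answers that leave your brand out still count here, since the brands named in your place are exactly the ones worth benchmarking against.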
See who AI names instead of you. A free single-page AI SEO audit shows whether engines can read and recognize your site — in about 15 seconds, no card required.
Run a free AI SEO audit
How SourceWatch measures it
Most tools in this category infer your visibility one way: they fire synthetic prompts at the engines and read the answers. SourceWatch does that well — per engine, multi-run, against your locked competitor set — and then adds a second, harder-to-fake source of truth that almost no competitor has.
Moat 1 — first-party AI traffic, not just inferred mentions
When an AI engine reads your site, its crawler hits your pages. When its answer sends someone to you, that's a real referral click. SourceWatch captures both from your own traffic with a drop-in Cloudflare Worker or in-site snippet, and verifies them against the AI vendors' published IP ranges — so a bot pretending to be GPTBot doesn't pollute your data. That's ground truth: the actual AI crawlers reading you and the actual visitors arriving from AI, alongside the synthetic share-of-voice scores. Prompt-sampling alone can badly undercount what the engines really do; measuring both sides is how you trust the number.
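The IP check described above can be sketched with Python's standard `ipaddress` module. The CIDR blocks below are placeholders from the reserved documentation ranges, not any vendor's real ranges, and `ExampleBot` and the function name are hypothetical; a real deployment would load each vendor's published list.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real vendor ranges.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ExampleBot": ["198.51.100.0/24"],  # hypothetical crawler name
}


def is_verified_crawler(user_agent, remote_ip, published=PUBLISHED_RANGES):
    """True only if the UA claims a known bot AND the IP sits in that bot's ranges."""
    try:
        ip = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # unparseable address: treat as unverified
    ua = user_agent.lower()
    for bot, cidrs in published.items():
        if bot.lower() in ua:
            return any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)
    return False  # UA does not claim to be a tracked crawler


genuine = is_verified_crawler("Mozilla/5.0 (compatible; GPTBot/1.0)", "192.0.2.44")
spoofed = is_verified_crawler("Mozilla/5.0 (compatible; GPTBot/1.0)", "203.0.113.9")
print(genuine, spoofed)
```

A request that carries the GPTBot user agent from an address outside the listed ranges is rejected, which is the "bot pretending to be GPTBot" case.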
Moat 2 — MCP-native, so Claude Code can act on it
The same data is exposed through an MCP server for Claude Code. Claude can read your share of voice, pull the exact queries the engines ran, see your citation gaps, and help you act — in the same loop, without you copy-pasting out of a dashboard. It's a self-serve capability that, among competitors, is otherwise effectively enterprise-only. (A public REST API is on the roadmap; the MCP server is the integration path today.)
What you see, in plain terms
Per-engine share of voice vs your tracked competitors · mention rate and mention position · sentiment of each mention · the real search queries the models ran · most-cited domains and your citation gaps · and the first-party AI-crawler + AI-referral traffic actually landing on your site.
To keep this honest: SourceWatch measures and reports — it does not write your content for you (no AI content generation), the public REST API is on the roadmap (MCP is the integration today), and the instant audit covers a single page; tracking your full site and prompt set is what the trial unlocks.
Share of voice is a leading indicator, not a vanity score
In traditional media, share of voice has long been treated as a forward signal: brands that out-share their market tend to grow their market share next. The same logic is now playing out in AI search — winning AI mentions today is an early read on tomorrow's consideration, because the answer is increasingly where the buyer's shortlist gets formed in the first place.
There's hard evidence the work pays off. The peer-reviewed GEO research (KDD 2024), tested across 10,000 queries spanning 25 domains, found that the highest-impact tactics (citing sources, adding quotations, and adding statistics) lifted a source's visibility in generative answers by up to roughly 40%. Notably, lower-ranked sites gained the most (rank-5 pages jumped ~115%, while top-ranked sites lost ~30%), which levels the field for challenger brands. And the demand is shifting under everyone's feet: Gartner projects traditional search volume will fall 25% by 2026 as generative engines absorb the queries.
| Signal | Figure | Source |
|---|---|---|
| GEO tactics lift visibility | up to ~40% (cite sources · add quotations · add statistics) | GEO paper, arXiv 2311.09735 (KDD 2024) |
| Tested at scale | 10,000 queries across 25 domains | GEO paper |
| Challenger upside | rank-5 sites gained ~115%; top sites lost ~30% | GEO paper |
| Search demand shift | traditional search volume −25% by 2026 | Gartner press release, Feb 2024 |
| Scale of the channel | ChatGPT 900M weekly active users | TechCrunch, Feb 2026 |
The academic foundation under "share of voice" is worth knowing, because it's exactly what SourceWatch operationalizes. The GEO paper formally measures a source's presence two ways: a **position-adjusted word count** (the share of an answer's words attributed to a source, weighted toward earlier mentions — literally share of voice computed inside one response) and a **subjective impression** score (an LLM-judged read of how influential the source was, capturing quality of mention, not just count). Mention rate, mention position, and sentiment in SourceWatch are the practical, trackable version of those same ideas.
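A rough sketch of the position-adjusted word count idea follows. It assumes each sentence of an answer is attributed to at most one source and that earlier sentences are up-weighted with an exponential decay on a 0-based position; the exact weighting and normalization in the GEO paper may differ, so treat this as an illustration rather than a reproduction. Source names are hypothetical.

```python
import math


def position_adjusted_word_share(sentences, source):
    """Share of an answer's words credited to `source`, weighted toward early sentences.

    sentences: list of (text, cited_source_or_None) in answer order.
    """
    n = len(sentences)
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return 0.0
    weighted = sum(
        len(text.split()) * math.exp(-pos / n)  # assumed decay; pos is 0-based
        for pos, (text, cited) in enumerate(sentences)
        if cited == source
    )
    return weighted / total_words


# Hypothetical answer: two sentences of equal length citing different sources.
answer = [
    ("Acme handles payroll for small teams", "acme.example"),
    ("Bolt offers similar features at scale", "bolt.example"),
]

first_cited = position_adjusted_word_share(answer, "acme.example")
second_cited = position_adjusted_word_share(answer, "bolt.example")
print(first_cited, second_cited)
```

Both sources contribute the same number of words, but the one cited first scores higher, which is the "lead recommendation versus afterthought" distinction in numeric form.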
How SourceWatch compares
An honest look at where SourceWatch fits. The category is full of capable tools; the differences that matter for *measuring share of voice accurately* are per-engine reporting, multi-run sampling, first-party traffic capture, and an agent-native (MCP) workflow. Here's the straight version, gaps included.
| Capability | SourceWatch | Typical prompt-sampling tool | Enterprise platform |
|---|---|---|---|
| Per-engine share of voice (not blended) | Yes | Sometimes | Yes |
| Multi-run sampling for non-determinism | Yes (3–5+ per cadence) | Often single-run | Yes |
| Locked competitor set as the benchmark | Yes | Varies | Yes |
| First-party AI-crawler capture | Yes (IP-verified) | Rare | Some |
| First-party AI-referral click capture | Yes (drop-in Worker/snippet) | Rare | Rare |
| MCP server for Claude Code | Yes (self-serve) | No | Enterprise-only, if at all |
| AI content generation | No — measures, doesn't write | Often yes | Often yes |
| Public REST API | Coming soon (MCP today) | Often yes | Yes |
| Starting price | From $99/mo · 14-day trial | $29–$99/mo | $2,000/mo+ or annual |
Where SourceWatch deliberately doesn't compete: it won't write your articles, and its public API is still on the way (the MCP server is the integration path today). Where it wins: it measures share of voice the way the metric actually behaves — per engine, multi-run, against the real competitor set — and backs the synthetic scores with first-party AI traffic almost nobody else captures, at a self-serve price. If you want a deeper head-to-head, see the Profound alternative and Conductor alternative pages.
Track your AI share of voice across ChatGPT, Perplexity, Gemini and Claude — per engine, on a schedule. 14-day free trial, card optional.
Start tracking your share of voice