
What this page is
This is the canonical methodology page every AuraScope case study links back to. If a claim sounds aggressive, this is where you check the maths.
What AuraScope measures
AuraScope tracks how three answer engines (ChatGPT, Gemini, Perplexity) respond to a controlled prompt set on a scheduled audit cadence. For each prompt we record whether the brand was named, where, in what context, and which sources the engine cited. Those observations aggregate into pillar scores.
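To make that record shape concrete, here is a minimal sketch in Python. The field names are illustrative, not our production schema:

```python
from dataclasses import dataclass, field
from datetime import date

# Illustrative record for a single prompt observation.
# Field names are hypothetical; the production schema may differ.
@dataclass
class Observation:
    engine: str                 # "chatgpt" | "gemini" | "perplexity"
    prompt_id: str              # stable ID within the frozen prompt set
    audit_date: date
    brand_named: bool           # was the brand mentioned at all?
    position: int | None        # rank within the answer; None if absent
    context: str                # e.g. "recommended", "compared", "passing mention"
    cited_sources: list[str] = field(default_factory=list)
```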
The four levers we score
Every score on the AuraScope dashboard maps to one of four levers an LLM uses when deciding what to recommend:
- External trust. Earned media, third-party citations, reviews. What other authoritative pages say about you.
- Crawlability. Semantic HTML, valid heading hierarchy, schema markup, internal linking. How parseable your page is to an agent.
- Freshness. Visible created and last-edited timestamps. Whether the agent can tell your content is current.
- Coverage. Content cadence that answers the questions buyers ask. Whether you own the question space in your niche.
These four are the locked vocabulary across every case study on the site. Older phrasing (off-site signals, third-party trust signals, external verification) maps onto one of these four.
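As a sketch, the locked vocabulary and the legacy aliases as a lookup. The alias mapping shown is our reading of the definitions above, stated as an assumption:

```python
from enum import Enum

class Lever(Enum):
    EXTERNAL_TRUST = "external trust"
    CRAWLABILITY = "crawlability"
    FRESHNESS = "freshness"
    COVERAGE = "coverage"

# Assumption: the legacy phrases named above all describe what other
# authoritative pages say about you, i.e. External trust.
LEGACY_ALIASES = {
    "off-site signals": Lever.EXTERNAL_TRUST,
    "third-party trust signals": Lever.EXTERNAL_TRUST,
    "external verification": Lever.EXTERNAL_TRUST,
}
```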
Audit cadence
Audits run every other day across all three engines. After any intervention (a new citation, a schema rollout, a content refresh), the standard re-audit cadence is +3, +7 and +14 days from the change going live, mapped to the nearest scheduled audit day. Where a case study uses a non-standard cadence, it is called out in that case's methodology notes.
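A sketch of how the standard checkpoints snap onto the every-other-day audit calendar. The calendar anchor date and the forward tie-break are assumptions; the production scheduler may differ:

```python
from datetime import date, timedelta

# Hypothetical calendar anchor: audits fall on even day-offsets from this date.
AUDIT_EPOCH = date(2024, 1, 1)

def nearest_audit_day(d: date) -> date:
    """Snap a date to the every-other-day audit calendar."""
    if (d - AUDIT_EPOCH).days % 2 == 0:
        return d
    # An off day sits exactly one day from two audit days; we break the
    # tie forward as a convention (an assumption, not a published rule).
    return d + timedelta(days=1)

def reaudit_schedule(intervention_live: date) -> list[date]:
    """Standard +3, +7, +14 day checkpoints, snapped to audit days."""
    return [nearest_audit_day(intervention_live + timedelta(days=n))
            for n in (3, 7, 14)]
```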
ChatGPT attribution mode
This is the one most readers want to push back on. When a case study says "ChatGPT picked the brand up within seven days," it does not mean ChatGPT retrained on the brand. Model training data is months out of date by construction.
What we measure is the engine's answer-time retrieval behaviour. With browsing or grounding enabled, ChatGPT, Gemini and Perplexity all fetch and rank current web content per query. A new citation on an authoritative site enters that retrieval surface the moment it is indexed.
So "within seven days" means: from intervention live to consistent answer-set inclusion under the standard prompt set, measured in retrieval mode, across every-other-day audits. It does not mean retraining.
Prompt-set stability
Every engagement starts with a frozen prompt set, typically 20 to 60 prompts the buyer ICP (ideal customer profile) would realistically ask the engine. The prompt set is locked for the duration of an experiment. If it has to change mid-stream (rare), we flag that as a non-comparability point for share-of-answers in the case study's caveats and do not report a delta on that pillar.
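A minimal sketch of how a freeze can be enforced in practice, assuming a simple content hash; the canonicalisation shown is illustrative:

```python
import hashlib

def prompt_set_fingerprint(prompts: list[str]) -> str:
    """Order-sensitive SHA-256 over the frozen prompt set."""
    canonical = "\n".join(p.strip().lower() for p in prompts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# At experiment start:
frozen = prompt_set_fingerprint(["best crm for solo founders",
                                 "top ai visibility audit tools"])
# Before every audit (any mismatch is a non-comparability flag):
# assert prompt_set_fingerprint(current_prompts) == frozen
```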
Calibration and ongoing research
AI answer engines change underneath us. Models retrain, retrieval stacks evolve, browsing and grounding behaviour shifts. To keep pillar scores honest we maintain a control prompt set: a frozen group of prompts and brands we evaluate alongside every client audit. If the control prompts shift without any underlying brand or web change, that tells us the engine has moved, not the brand. We then recalibrate scoring before reading client deltas as real signal.
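A sketch of the drift gate this implies, with an illustrative tolerance; the real recalibration is a judgement on the whole control set, not a single number:

```python
# Illustrative tolerance for control movement; an assumption, not a
# published threshold.
DRIFT_TOLERANCE = 0.05  # absolute pillar-score movement on controls

def window_is_readable(control_delta: float) -> bool:
    """True if client deltas in this audit window can be read as brand
    signal rather than engine drift."""
    return abs(control_delta) <= DRIFT_TOLERANCE
```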
We also run ongoing experiments on engine behaviour: prompt phrasing, citation indexing latency, retrieval bias by vertical, and the effect of agentic-engineering improvements (longer reasoning, better tool use) on what the engines actually surface. Methodology here is not finished work; it is current best practice, and we publish updates as the engines change.
The attribution problem (in progress)
The hardest open problem in AI search is attribution. When a brand sees a new lead and the prospect says "ChatGPT sent me," we can correlate the message with audit deltas in the same window, but we cannot yet causally decompose the outcome to a specific lever pulled on a specific date. We are actively working on an attribution model for AI search and will publish methodology updates as it firms up.
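For clarity, the correlational join described above as a sketch. The 14-day window is an assumption, and nothing here establishes causation:

```python
from datetime import date, timedelta

def deltas_near_lead(lead_date: date,
                     deltas: list[tuple[date, str, float]],
                     window_days: int = 14) -> list[tuple[date, str, float]]:
    """Pillar deltas (audit_date, pillar, delta) landing in the window
    before a lead arrived. Co-occurrence only; never causation."""
    earliest = lead_date - timedelta(days=window_days)
    return [d for d in deltas if earliest <= d[0] <= lead_date]
```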
These are early days. The four-lever model gives us directional signal. The control prompt set lets us tell engine drift from brand lift. The attribution model is the missing piece, and it is in active development. Read every causal claim on this site with that caveat in mind, and read this section as the part of the methodology we are still building.
What we can prove. What we cannot.
What we can prove:
- Whether a brand appeared in an engine's answer for a given prompt on a given date.
- Whether pillar scores moved during the same window as an intervention.
- Which citations the engines surfaced, and from where.
- Whether the control prompt set moved in the same window (engine-drift signal).
What we cannot prove:
- That a specific lever was the sole driver of a downstream business outcome (a lead, a sale, a meeting). LLM behaviour is stochastic; attribution is correlational, not causal. The attribution model we are building is the work to close this gap.
- That a future query will return the same answer. Retrieval is probabilistic.
- That a result will reproduce on a different brand or vertical. Each case is a case, not a universal.
Every AuraScope case study is written with these limits in mind. Read the cases with them in mind too.
Re-audit transparency
Every score moves with a dated audit. If a case study quotes a pillar delta, the baseline date and the re-audit date are in its methodology notes. If they are missing, that is a bug, and we want to know.
Anonymisation
Some cases are published with the client (and sometimes the destination publication) anonymised at the client's request. Anonymisation caps the third-party verification a sceptical reader can do. Where it applies, we say so up front in the case background. The mechanism, pillar evidence and methodology remain readable; the named-brand verification step does not.
