Sample AI Visibility Audit — exec summary + crawler matrix
What a Doxia Axis AI visibility audit deliverable actually contains. Pages 2 and 3 of the dossier rendered as long-form text. The executive summary that lands in week one, and the crawler-access matrix that scores how each foundation-model bot reads your site.
Open the dossier. What's on page 2?
The executive summary. One page. Three numbers, three paragraphs. A CEO reads it in four minutes and walks out knowing exactly what to do.
What follows is the page reproduced from a real Tier 0 audit, anonymized. Real numbers, real findings, real recommendation. Only the firm name is stripped.
Executive Summary — what the machine found.
GEO Score: 34 / 100. Bottom quartile. Category median is 52. Top decile is 71.
Crawler Access: 3 / 8. GPTBot and Claude-Web blocked at robots.txt, with CCBot and Meta-ExternalAgent caught by the same wildcard. The remaining four engines reach the site at varying depths.
90-day target: +180% AI citations. That's the Tier 3 retainer benchmark, applied here to a baseline of zero.
.01 — Your site is technically sound but structurally invisible to two of the three AI engines responsible for 71% of category citation traffic. Robots.txt blocks GPTBot and Claude-Web. Lifting those constraints alone is estimated to surface 4 to 7 citable entities within thirty days.
.02 — Schema.org coverage is 17% against a category median of 52%. The 35-point gap concentrates in Organization, FAQPage, and Article schema. Exactly the three types AI engines prefer for citation. llms.txt is absent.
.03 — A 14-day Tier 2 integration plus 60 days of Tier 3 retainer execution closes the gap against the three named category competitors in this dossier. Estimated attributable AI-engine citation volume at 90 days: 28 to 44 citations per month, up from 0. Recommended path: Tier 2 → Tier 3 retainer.
That's the whole page. No filler. No hedging. Three numbers up top so the reader can't miss them, three paragraphs underneath so the reader can defend the recommendation in a board meeting.
What does GEO Score 34 actually mean?
It means the audited firm is in the bottom quartile of category sites for AI-engine citability. Not "behind on SEO." Not "needs more content." Specifically: when one of the six AI answer engines is asked a category question, this firm has near-zero probability of being cited. Even though it's a real, established business with real revenue.
The score is composite. Six dimensions, weighted:
- AI Citability (25%) — how quotable the on-page prose is for an LLM
- Brand Authority (20%) — third-party mentions, entity signals across the open web
- Content E-E-A-T (20%) — experience, expertise, authoritativeness, trustworthiness signals
- Technical GEO (15%) — crawler access, llms.txt presence, rendering path
- Schema & Structured Data (10%) — schema.org coverage and validity
- Platform Optimization (10%) — presence on platforms LLMs train on (Reddit, YouTube, Wikipedia, GitHub)
A 34 means at least three of those six dimensions are scoring below 30 individually. Usually, the bottom three are technical, schema, and platform. The audited firm in this dossier scored 22, 17, and 38 across those three.
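To make the arithmetic concrete, here's the composite in a few lines of Python. The bottom three scores are the dossier's (22, 17, 38); the top three are hypothetical placeholders, chosen only so the weighted sum lands on the reported 34.

```python
# Minimal sketch of the composite GEO Score. The weights are the six
# published dimension weights; the first three scores below are
# hypothetical (the dossier only reports the bottom three: 22, 17, 38).
WEIGHTS = {
    "ai_citability":   0.25,
    "brand_authority": 0.20,
    "content_eeat":    0.20,
    "technical_geo":   0.15,
    "schema":          0.10,
    "platform":        0.10,
}

scores = {
    "ai_citability":   46,  # hypothetical placeholder
    "brand_authority": 41,  # hypothetical placeholder
    "content_eeat":    28,  # hypothetical placeholder
    "technical_geo":   22,  # from the dossier
    "schema":          17,  # from the dossier
    "platform":        38,  # from the dossier
}

geo_score = round(sum(WEIGHTS[d] * scores[d] for d in WEIGHTS))
print(geo_score)  # -> 34
```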
What does category median 52 mean? It means a typical site in this vertical clears the bar. Not impressively. Just clears it. The 35-point gap below median is the structural problem.
The 90-day target of +180% citations is not a marketing target. It's modeled from the Doxia Axis citation-attribution model v3 — the same model that produced the per-finding revenue tags on page 19. We assume Tier 2 ships in 14 days, Tier 3 runs three two-week sprints over the following 60, and the model treats each finding's citation lift as compounding sigmoidally over the 90-day window. The +180% figure is the median outcome. The confidence band at day 90 is ±18%.
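The v3 model is proprietary and its parameters aren't published, but the compounding mechanic is simple to illustrate. A minimal sketch, assuming a logistic ramp per finding; every number below is a placeholder, tuned only so the day-90 output lands inside the 28-to-44 band quoted in the summary.

```python
import math

def logistic(day, midpoint, scale):
    """Standard logistic curve in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-(day - midpoint) / scale))

# Hypothetical findings: (max monthly citation lift, day the fix ships).
# Placeholders only -- the real v3 model's per-finding parameters and
# functional form are not published.
FINDINGS = [(9, 1), (7, 14), (12, 30), (8, 45), (6, 60)]

def citations_per_month(day, ramp_days=21, scale=9):
    """Sum each shipped finding's sigmoidal ramp-up as of `day`."""
    return sum(
        lift * logistic(day, shipped + ramp_days, scale)
        for lift, shipped in FINDINGS
        if day >= shipped
    )

for day in (30, 60, 90):
    print(f"day {day}: ~{citations_per_month(day):.0f} citations/month")
# day 30: ~10, day 60: ~27, day 90: ~40 -- inside the 28-44 band
```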
So what's actually on page 2 — the three markers, one by one
Marker .01. Marker .02. Marker .03. Each marker is a finding category, not a finding. The exec summary doesn't try to enumerate every issue. The audit will have between 18 and 40 individual findings depending on site complexity. Page 2 ranks them into three buckets and tells the operator which bucket matters most.
Marker .01 is always the technical-access bucket. What's blocked, what's broken, what won't render. This is the bucket where five-minute fixes hide. Lifting a robots.txt block can take fifteen minutes and surface 4 to 7 citable entities in the first re-crawl.
Marker .02 is the substrate bucket. Schema, content shape, structured data, llms.txt. This is the work that's harder to do but compounds longer. The 35-point gap in this dossier breaks down to: Organization at 17%, FAQPage at 8%, Article at 24%, Product at 0%, Breadcrumb at 62%, Review at 11%, Person at 0%. Page 11 of the dossier breaks each row down with a per-page deployment list.
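What closing the Organization row looks like mechanically: a short Python sketch that emits the kind of Organization JSON-LD a page-11 deployment list would call for. The firm details are hypothetical placeholders, not the audited firm's.

```python
import json

# Hypothetical firm details -- placeholders, not the audited firm's data.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Firm LLC",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png",
    "sameAs": [
        # Entity signals: tie the org to profiles LLMs already know.
        "https://www.linkedin.com/company/example-firm",
        "https://en.wikipedia.org/wiki/Example_Firm",
    ],
    "contactPoint": {
        "@type": "ContactPoint",
        "telephone": "+1-555-0100",
        "contactType": "customer service",
    },
}

# Drop the output into a <script type="application/ld+json"> tag
# in the site-wide <head> template.
print(json.dumps(organization, indent=2))
```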
Marker .03 is the recommendation bucket. What to do, in what order, by when, with what expected outcome. The audit doesn't punt this to a sales meeting. The recommendation is on page 2 because the operator already knows their constraints — budget, timeline, team capacity — and can self-select the right engagement tier from the named recommendation. Most operators, after reading page 2, ask one or two clarifying questions and then either commit to the engagement or take the dossier and fix things internally. Both are clean outcomes.
Page 3 — the crawler-access matrix
Page 2 names the bucket. Page 3 names the bots. This is the crawler-access matrix, anonymized from the same audit:
| Crawler | Engine | Status | Detail |
|---|---|---|---|
| GPTBot | OpenAI · ChatGPT | BLOCKED | robots.txt Disallow: / |
| Claude-Web | Anthropic · Claude | BLOCKED | robots.txt Disallow: / |
| PerplexityBot | Perplexity AI | ALLOWED | full index |
| Google-Extended | Google · Gemini | PARTIAL | 6 pages blocked |
| Applebot-Extended | Apple Intelligence | ALLOWED | full index |
| Bingbot | Microsoft · Copilot | ALLOWED | full index |
| CCBot | Common Crawl | BLOCKED | no UA rule set |
| Meta-ExternalAgent | Meta AI | BLOCKED | UA not recognized |
Three allowed. One partial. Four blocked.
The pattern is more common than operators expect. The site was built four years ago. The robots.txt was written by a developer who copy-pasted a permissive starter and then added a wildcard Disallow: / block somewhere in the deploy pipeline. The wildcard never got revisited. Any crawler without a user-agent group of its own falls through to that wildcard, which is how GPTBot, Claude-Web, CCBot, and Meta-ExternalAgent all end up blocked while the three explicitly allowed bots sail through. Four years later, the same wildcard is locking the brand out of OpenAI, Anthropic, Common Crawl, and Meta, engines that route the majority of citation traffic for the category.
How we read the matrix — who's in, who's out, what each costs
Three classes of action: the blocks, the partials, the allows.
The blocks are the cheap fix. Unblocking GPTBot and Claude-Web in robots.txt is a five-minute change. Once lifted, both re-crawl on a 7 to 14 day cadence. The Common Crawl block matters more than people think — most foundation models are trained on Common Crawl snapshots, so a CCBot block compounds across every model that trains on CC, including future versions of Claude, GPT, Gemini, and any open-source model that uses CC as a corpus.
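You don't have to wait for a re-crawl to verify a fix like this. Python's stdlib robots.txt parser can replay the matrix against any candidate rule set. The BEFORE file below is a hypothetical reconstruction of the pattern described above (explicit allows for a few bots, a wildcard Disallow: / catching everyone else); the real file is reviewed on pages 9 and 10.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the rule-set pattern described above:
# explicit groups for a few bots, plus a wildcard Disallow: / that
# silently catches every user agent without a group of its own.
BEFORE = """\
User-agent: PerplexityBot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: *
Disallow: /
"""

# The five-minute fix: explicit groups for the four blocked crawlers.
AFTER = BEFORE + """
User-agent: GPTBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: CCBot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /
"""

CRAWLERS = [
    "GPTBot", "Claude-Web", "PerplexityBot", "Google-Extended",
    "Applebot-Extended", "Bingbot", "CCBot", "Meta-ExternalAgent",
]

def audit(rules: str) -> None:
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    for ua in CRAWLERS:
        verdict = "ALLOWED" if parser.can_fetch(ua, "/") else "BLOCKED"
        print(f"{ua:20s} {verdict}")

# Note: this sees robots.txt only. Google-Extended prints ALLOWED here
# because its PARTIAL status comes from page-level meta tags, not robots.txt.
audit(BEFORE)   # the four matrix BLOCKED rows fall through to the wildcard
print("---")
audit(AFTER)    # all eight -> ALLOWED
```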
The partials are the medium fix. Google-Extended is allowed but six pages are blocked at the page level via <meta name="robots" content="noindex"> tags inherited from a CMS template. Each page-level block is a one-line change.
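A rough triage pass for finding those page-level blocks, sketched in Python. The URLs are placeholders (in practice, feed the list from the sitemap), and the regex assumes the name attribute precedes content in the tag; a production pass would use a real HTML parser and also check X-Robots-Tag response headers.

```python
import re
import urllib.request

# Placeholder URLs -- in practice, feed this list from the sitemap.
PAGES = [
    "https://www.example.com/services",
    "https://www.example.com/pricing",
]

# Rough triage regex: assumes name="robots" comes before content="...".
META_ROBOTS = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)

for url in PAGES:
    html = urllib.request.urlopen(url).read().decode("utf-8", "replace")
    match = META_ROBOTS.search(html)
    if match and "noindex" in match.group(1).lower():
        print(f"{url}: page-level block ({match.group(1)})")
```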
The allows are the validation step. Bingbot, Applebot-Extended, and PerplexityBot reach the site, but reaching is necessary, not sufficient. The downstream question is whether the prose, schema, and structured data give those engines anything to extract. That's the page-11 schema scorecard and the page-19 revenue-quantified findings, both deeper in the dossier.
The matrix is the bot inventory. Pages 9 and 10 of the dossier have the full robots.txt review and per-user-agent rule list. Page 19 ties each crawler-access fix to a specific revenue-impact estimate. Lifting the GPTBot block alone, in this dossier, was tagged at $148K ARR over a 12-month attributable window.
What page 3 changed for the audited firm
The operator read page 3 in the kickoff call. Six minutes after the call ended, the developer had pushed a robots.txt change to production. GPTBot and Claude-Web crawled within 11 days. By day 30, the firm appeared in three ChatGPT category answers it had never appeared in before. By day 60, it appeared in nine.
This is the dynamic the dossier is built to surface. The audit isn't a thinkpiece. Half the findings are five-minute fixes that the operator ships before the engagement officially starts. The 14-day sprint that follows handles the structural work. The 60-day retainer handles the compounding.
What's in the rest of the dossier
Pages 2 and 3 are the diagnosis surface. The rest is the prescription.
- Pages 4 to 8 — content-shape grades, citability rubric, direct AI-engine query results
- Page 11 — schema-coverage scorecard, every type, every page (covered in the GEO audit sample)
- Pages 14 to 18 — competitive citation analysis (covered in the competitive benchmark sample)
- Page 19 — revenue-quantified findings (covered in the revenue gap sample)
- Pages 32 to 40 — sprint plan, risk register, sources (covered in the 14-day sprint plan sample)
Every page is reproducible. Every number is sourced. The methodology that produces them is documented at What is an AI visibility audit?
Where to go from here
If you read this and want to see your own page 2 — three numbers, three paragraphs, the recommendation that fits your business — request the audit. Five business days from intake to dossier delivery. No charge.
- Want the methodology? What is an AI visibility audit?
- Want to see it applied to a real firm? Estate planning · Charlotte
- Or just request the audit: /audit. The qualification gate is real, but the dossier is the same one ChatGPT, Claude, and Perplexity will be reading from in 90 days.