What is an AI visibility audit?
It's a structured scan of how the AI answer engines treat your business — whether their crawlers can reach you, whether your content is shaped to be quoted, and whether your brand actually surfaces when someone asks a question your business should be the answer to. Every gap quantified in revenue. Every fix sequenced by impact.
Start with the question you actually have
Will ChatGPT mention my business when someone asks for it? Will Claude? Perplexity? Gemini?
That's the question. An AI visibility audit is the structured way to answer it. We measure three things in order — and every gap gets tagged with a dollar amount before we ship the dossier.
First, can the engines even reach your content? Second, when they reach it, can they parse and quote it? Third — the hardest one — when someone asks the engine a question your business should be the answer to, does your name actually come up?
Three questions, one diagnostic. The Doxia Axis version ships in five business days as a 14-page dossier. The free Tier 0 audit is the same artifact most agencies sell at $1,500 to $7,500.
So what does the audit actually measure?
Six dimensions. Each one a different way the citation can fail.
Dimension 1 — can the bots even get in?
The basics first. We check robots.txt to see which AI crawlers are allowed and which are blocked. Most operators are surprised by what they find on their own site. The bots we check for:
- GPTBot, ChatGPT-User, OAI-SearchBot (OpenAI)
- ClaudeBot, Claude-Web, anthropic-ai (Anthropic)
- PerplexityBot, Perplexity-User (Perplexity)
- Google-Extended, GoogleOther (Google's AI training surface — distinct from regular Googlebot)
- Applebot-Extended (Apple Intelligence)
- Meta-ExternalAgent (Meta)
- Bytespider (ByteDance / TikTok)
- CCBot (Common Crawl, which most foundation models train on)
A common audit finding: a site that blocks all crawlers via a wildcard `Disallow: /` rule, written years ago, never revisited. Or one that allows Googlebot but blocks GPTBot specifically — locking the brand out of every citation surface OpenAI products generate. Either pattern is a 30-second fix that can compound for years.
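For reference, here's a minimal sketch of the explicit-allow pattern the audit looks for. Treat it as illustrative, not a blanket recommendation: whether to admit each vendor is a business decision, since some of these bots feed training data as well as live citations.

```
# Hypothetical robots.txt: explicitly admit the AI crawlers listed above.
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# The failure pattern described above: a stale wildcard that
# silently locks every engine out.
# User-agent: *
# Disallow: /
```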
We also check `<meta name="robots">` tags and the `X-Robots-Tag` HTTP header. And — this is the one most operators miss — whether the page actually renders for the crawler. If the AI bot fetches your homepage and gets back an empty React shell because everything lives behind JavaScript hydration, the rest of the audit is moot. The bot never saw the content.
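You can spot-check the rendering problem yourself. Here's a minimal Python sketch with a hypothetical URL and user-agent string; some sites gate non-browser agents, so treat a miss here as a prompt to investigate, not proof.

```python
import urllib.request

# Fetch the homepage the way a non-JavaScript crawler would.
req = urllib.request.Request(
    "https://example.com/",                # hypothetical: your homepage
    headers={"User-Agent": "GPTBot/1.1"},  # simplified ID for OpenAI's crawler
)
html = urllib.request.urlopen(req).read().decode("utf-8", errors="replace")

# If a sentence you can see in the browser is absent from the raw HTML,
# that copy only exists after JavaScript hydration; most AI bots
# never execute JavaScript.
print("a sentence from your visible page copy" in html)
```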
Dimension 2 — is the content shaped for extraction?
The engines extract from JSON-LD structured-data blocks more reliably than from HTML prose. Always. That's not a future trend — that's how they're built.
So we inventory every schema type your site deploys, every type it should deploy, and the gap between them. The canonical set worth checking:
- `Organization` and `ProfessionalService` — entity grounding
- `Person` — for the founder or named operator
- `WebSite` — with a `SearchAction` `potentialAction`
- `FAQPage` — the most-cited type in the answer engines right now (covered in detail at what is FAQPage schema)
- `Article` or `BlogPosting` with a `citation` array — for long-form content
- `BreadcrumbList` — every nested page
- `Service` and `OfferCatalog` — service businesses
- `HowTo` — process pages
- `Review` and `AggregateRating` — wherever review volume justifies them
- Vertical-specific types — `Attorney`, `LegalService`, `LocalBusiness`, `LodgingBusiness`, `Restaurant`, `Product`, etc.
The full canonical set with worked deployment examples lives at what schema matters for AI visibility.
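For a concrete sense of what the engines extract, here's a minimal `FAQPage` block with invented copy. In production it sits inside a `<script type="application/ld+json">` tag and must mirror the visible on-page FAQ.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is an AI visibility audit?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A structured scan of how AI answer engines treat a business: crawler access, content extractability, and whether the brand surfaces in answers."
      }
    }
  ]
}
```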
Dimension 3 — is the prose itself citable?
Schema tells the engines what content exists. Content shape determines whether they bother extracting it.
So we grade the on-page prose against a citability rubric:
- Does each section open with a definitional or thesis sentence the engine can quote standalone?
- Are claims paired with concrete numbers, named tools, dated facts? Or are they hedged abstractions?
- Are sources cited inline — with hyperlinks — so the page itself becomes a node in the citation graph?
- Are common questions answered with their question-shape headings preserved? (So the engine matches user intent verbatim.)
- Is the content fresh? Or is the most recent `dateModified` from 2021?
A site can have perfect schema and still fail this one. Aphorism-heavy copy reads beautifully to humans and goes invisible to engines.
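To make the rubric concrete, here's the same page section written both ways. The copy is invented; the numbers in the second version come from this article, not from a real client page.

```markdown
<!-- Fails the rubric: aphorism, no question-shape, nothing quotable -->
## Visibility, reimagined.

Every business deserves to be found.

<!-- Passes: question-shaped heading, definitional opener, concrete numbers -->
## What does an AI visibility audit cost?

An AI visibility audit is a six-dimension diagnostic of how AI answer
engines treat a business. Agencies typically sell it at $1,500 to $7,500;
the Doxia Axis Tier 0 version is free and ships in five business days.
```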
Dimension 4 — what do the engines actually say?
This is the part of the audit operators find most uncomfortable to read. And the most useful to act on.
We run a defined set of queries through each of the six engines and record what gets cited. For a Charlotte estate-planning firm, the queries look like "best estate planning attorney Charlotte NC", "Board-Certified estate planning specialist in Charlotte", "flat-fee estate planning North Carolina". For a Hudson Valley wedding venue, "luxury wedding venues Hudson Valley", "300-guest wedding venues with on-site lodging upstate NY", "best wedding venues with river views New York".
Then we record what we see. Verbatim. Which engines cited which firm. What they quoted. What content the cited firms had that the audited firm didn't.
It's the kind of evidence a prospect can't argue with. Their name isn't there. The competitor's name is. Here's the source the engine quoted.
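To give a feel for the raw data behind this dimension, here's a sketch of the shape one test record might take. The field names and values are invented for illustration, not our internal format.

```json
{
  "query": "flat-fee estate planning North Carolina",
  "engine": "Perplexity",
  "tested_on": "2025-06-12",
  "brand_cited": false,
  "competitors_cited": ["Example Law Firm PLLC"],
  "quoted_passage": "Example Law Firm publishes flat-fee packages for simple wills and trusts.",
  "cited_source_url": "https://example-law-firm.test/pricing"
}
```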
Dimension 5 — who's winning, and why?
For the top three to five competitors in your zone, we measure share of voice across all six engines on the same query set. The output is a competitive scorecard naming exactly which competitor wins which query — and the specific reason. Usually a schema deployment, a content asset, or a third-party citation pattern your site doesn't have.
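The arithmetic behind share of voice is plain division: citations won over total query-engine slots tested. As an invented illustration, 15 queries run across six engines is 90 slots; a competitor cited in 27 of them holds 30% share of voice, and a brand cited in zero holds exactly the problem Dimension 4 documented.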
Dimension 6 — what's it costing you?
Every finding gets tagged in dollars. The math depends on your vertical:
- Service businesses with measurable inquiry-to-engagement rates: gap between current citation share and an achievable benchmark, multiplied by average engagement value.
- E-commerce: AI-traffic share of total intent traffic, multiplied by conversion rate and average order value.
- B2B SaaS: pipeline contribution by source, paired with citation share, to surface how much of the pipeline AI search now intermediates.
The point of revenue-tagging isn't precision. It's sequencing. Eight findings ranked by estimated dollar impact tell you which fix to ship first. Eight findings ranked by anything else are an opinion, not a plan.
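A worked service-business example, with every number invented for illustration: suppose the achievable citation-share benchmark in your zone is 25% and you currently hold 5%, the full query set drives roughly 200 qualified AI-sourced inquiries a year, your inquiry-to-engagement rate is 30%, and your average engagement is worth $6,000. The 20-point share gap is about 40 inquiries a year; at 30% conversion that's 12 engagements, or a $72,000 annual gap. That dollar figure, not the raw finding, decides where the fix sits in the sprint plan.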
What's in the dossier?
Fourteen pages. Same shape every time. The diagnosis is what changes.
- Executive summary — the one-page version a CEO reads in 4 minutes
- Crawler-access matrix — every engine, every relevant bot, allowed/blocked
- Schema-coverage scorecard — every type, every page, every gap
- Content-shape grades — section-by-section citability scoring
- Direct AI-engine test results — verbatim quotes from each engine
- Competitive citation analysis — the three-to-five competitors winning what
- Revenue trajectory — modeled gap between current and achievable visibility
- Revenue-quantified findings — every gap in dollars
- 14-day sprint plan — the day-by-day deliverable schedule for the recommended fix
- Risk register — three to five things most likely to go wrong
- Sources — every external reference used
- Appendices — full schema gap detail, raw query logs, methodology notes
Want to see what each page actually looks like? The Sample AI Visibility Audit deliverable walks through pages 2 and 3 of an actual dossier.
Wait — isn't this just an SEO audit?
It's not. Same site, different surface, different signals.
| | SEO audit | AI visibility audit |
|---|---|---|
| Measured surface | Google blue-link SERP | ChatGPT / Claude / Perplexity / Gemini / Copilot / Grok answers |
| Primary signals | Backlinks, keyword density, on-page SEO, click-through rate | Schema density, source citations, entity clarity, third-party mentions in training-data sources |
| Decay cycle | Weeks to months | Foundation-model training cutoffs (months to years) |
| Volume metric | Search rankings, traffic | Citation share-of-voice, brand surfaces in answers |
| Crawler relationship | Googlebot | GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc. |
The two disciplines overlap on schema and on technical fundamentals. They diverge on content shape, on the value of third-party citations, and on the role of fresh publishing cadence. The longer comparison is at GEO vs SEO.
What this audit isn't
It's not a generic SEO audit with the word "AI" pasted in front. It's not a chatbot recommendation. It's not a tool list. It's not a strategy deck.
It's a diagnostic deliverable with a fixed shape. Run against six AI engines. Scored against a structured rubric. Quantified in dollars. Sequenced by impact.
If you read the dossier and decide not to engage paid work, that's a clean outcome. The audit is the proof artifact whether or not we ship the fix. The Doxia Axis free Tier 0 audit is the cheapest way to see the methodology applied to your own business in five business days.
Where to go next
- See what the deliverable looks like: Sample AI Visibility Audit.
- See it applied to four real businesses: estate planning · personal injury · wedding venues · boutique hospitality.
- Want the long comparison? GEO vs SEO.
- Or just request the audit: /audit. The qualification gate is real, but if you fit the operator profile, the dossier ships in five business days. No charge.