DECISION GUIDE · 29 Apr 2026 · 7 min read

What schema matters for AI visibility?

The canonical schema set for AI-cited businesses — Organization, Person, Service, FAQPage, Article, BreadcrumbList, OfferCatalog, HowTo, DefinedTerm, QAPage, Review, AggregateRating. Which schemas the answer engines actually weigh, which are nice-to-have, and which are wasted markup.

Want the short list?

In order of leverage — Organization + ProfessionalService, Person, WebSite, FAQPage, Article / BlogPosting with citation arrays, Service and OfferCatalog, BreadcrumbList, HowTo, DefinedTerm and QAPage. Plus Review / AggregateRating where review volume justifies it. Vertical-specific types (Attorney, LegalService, LodgingBusiness, LocalBusiness, Product) layer on top for industry queries.

That's the canonical set. Deploy all of these on a site that wants to be cited and you've closed the schema dimension of the AI-visibility-audit rubric. Skip them, and the engines fall back to extracting from prose — which is structurally lower-fidelity than extracting from JSON-LD.

What follows is each one, with worked deployment examples and the reason it matters.

Tier 1 — entity-graph foundations

These are non-optional. Every site that wants to be cited needs all four.

Organization (paired with ProfessionalService, LocalBusiness, or LegalService)

The brand-entity anchor. Names the business. Gives it a logo. Lists the sameAs array (LinkedIn, Crunchbase, GitHub, Wikipedia where applicable). Declares foundingDate. Names the founder with @id linkage to a separate Person block.

{
  "@context": "https://schema.org",
  "@type": ["Organization", "ProfessionalService"],
  "@id": "https://doxiaaxis.com/#organization",
  "name": "Doxia Axis",
  "url": "https://doxiaaxis.com",
  "logo": "https://doxiaaxis.com/brand/03-mark-icon.svg",
  "sameAs": ["https://www.linkedin.com/company/doxia-axis", "https://x.com/doxiaaxis"],
  "foundingDate": "2026-01-01",
  "founder": { "@id": "https://doxiaaxis.com/#founder" },
  "areaServed": ["India", "United States", "United Kingdom", "European Union", "Worldwide"],
  "knowsAbout": ["AI consulting", "AI strategy", "AI implementation", "AI agents", "RAG systems"]
}

Person for the founder or named operator

The authority anchor. Establishes who runs the business. What they know. What credentials they hold (hasCredential). What awards (award). What alumni affiliations (alumniOf). Which third-party profiles confirm identity (sameAs).

{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://doxiaaxis.com/#founder",
  "name": "Dhruva Kumar",
  "jobTitle": "Founder · Operator",
  "worksFor": { "@id": "https://doxiaaxis.com/#organization" },
  "url": "https://doxiaaxis.com/about",
  "knowsAbout": ["Artificial Intelligence", "AI agents", "Retrieval-augmented generation"]
}

WebSite with SearchAction

Sitewide identity. Tells engines this is the canonical site for the brand. Supports the search-box rich result.
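A minimal sketch of the WebSite block, following the @id and URL conventions from the Organization example above — the /search endpoint and the #website fragment identifier are assumptions, not confirmed site features:

{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": "https://doxiaaxis.com/#website",
  "name": "Doxia Axis",
  "url": "https://doxiaaxis.com",
  "publisher": { "@id": "https://doxiaaxis.com/#organization" },
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://doxiaaxis.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}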

BreadcrumbList

On every nested page. Restates the path through the site so the engines understand information architecture without inferring it.
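A sketch of what that looks like on a nested page — the /answers path is illustrative, and the final ListItem deliberately omits item because it is the current page:

{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://doxiaaxis.com" },
    { "@type": "ListItem", "position": 2, "name": "Answers", "item": "https://doxiaaxis.com/answers" },
    { "@type": "ListItem", "position": 3, "name": "What schema matters for AI visibility" }
  ]
}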

Tier 2 — answer-shaped content surfaces

This is where citation actually happens.

FAQPage

The single most-cited schema type in the answer engines. Wraps question-and-answer blocks so the engines extract Q&A pairs verbatim. Detail in what is FAQPage schema. Belongs on pricing pages, service pages, practice-area pages, product pages, property pages.
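The shape the engines extract from is mainEntity, an array of Question blocks each wrapping one acceptedAnswer. A minimal sketch, with question and answer text drawn from this guide:

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What schema matters most for AI visibility?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Organization, Person, WebSite, and FAQPage form the foundation. Article, Service, and OfferCatalog layer on top."
      }
    }
  ]
}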

Article or BlogPosting with citation arrays

For long-form thought-leadership content. The citation array is the load-bearing field — it tells the engines this article is grounded in primary sources and lists exactly which.

{
  "@type": "BlogPosting",
  "headline": "...",
  "datePublished": "2026-04-29",
  "dateModified": "2026-04-29",
  "author": { "@id": "https://doxiaaxis.com/#founder" },
  "citation": [
    {
      "@type": "ScholarlyArticle",
      "name": "GEO: Generative Engine Optimization",
      "url": "https://arxiv.org/abs/2311.09735",
      "author": "Aggarwal, Murahari, et al."
    }
  ]
}

HowTo

For procedural content. Step-by-step processes. Methodologies. Sprint plans. We deploy this on /how-we-work for both engagement tracks.
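A sketch of the shape — the step names and text here are placeholders, not the actual /how-we-work content:

// Illustrative only — step names and descriptions are hypothetical.
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How we work",
  "step": [
    { "@type": "HowToStep", "position": 1, "name": "Audit", "text": "Baseline the site's current AI visibility." },
    { "@type": "HowToStep", "position": 2, "name": "Deploy", "text": "Ship the Tier 1 and Tier 2 schema set." }
  ]
}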

Service and OfferCatalog

For service businesses. Names every service. Every tier. Every price band. Every Offer's properties. Deployed on /services (ItemList of Services) and /pricing (OfferCatalog with PriceSpecifications per tier).
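A sketch of a Service carrying an OfferCatalog, with provider resolved by @id to the Organization block above — the service name, tier name, and price are illustrative, not actual Doxia Axis pricing:

// Illustrative only — names and prices are hypothetical.
{
  "@context": "https://schema.org",
  "@type": "Service",
  "name": "AI visibility audit",
  "provider": { "@id": "https://doxiaaxis.com/#organization" },
  "hasOfferCatalog": {
    "@type": "OfferCatalog",
    "name": "Engagement tiers",
    "itemListElement": [
      {
        "@type": "Offer",
        "name": "Audit tier",
        "priceSpecification": {
          "@type": "PriceSpecification",
          "price": "2500",
          "priceCurrency": "USD"
        }
      }
    ]
  }
}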

DefinedTerm and QAPage

For answer-page surfaces. DefinedTerm for "What is X" definitional pages. QAPage for comparative or how-to answer pages. We deploy both across the /answers cluster.
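A sketch of a DefinedTerm for a definitional page — the URL path and the definition wording are illustrative:

// Illustrative only — path and description text are hypothetical.
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "@id": "https://doxiaaxis.com/answers/what-is-geo#term",
  "name": "Generative Engine Optimization",
  "description": "Structuring content so generative answer engines can extract and cite it.",
  "inDefinedTermSet": "https://doxiaaxis.com/answers"
}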

Tier 3 — vertical-specific types

These layer on top of Tier 1 and Tier 2 for industry-specific citation.

  • Attorney / LegalService — for law firms. Detail in how law firms appear in ChatGPT and Perplexity.
  • LodgingBusiness / Hotel / BedAndBreakfast — for hospitality. Pairs with amenityFeature, petsAllowed, priceRange, aggregateRating.
  • LocalBusiness — for any geography-bound service business. Critical for "near me"-class queries.
  • MedicalBusiness / Physician / Dentist — healthcare. Pairs with medicalSpecialty, acceptsInsurance.
  • Product with Offer, aggregateRating, review — e-commerce.
  • Restaurant — pairs with servesCuisine, menu, acceptsReservations.
  • SoftwareApplication — for SaaS, with applicationCategory, operatingSystem, offers.

Choose the most specific type that fits. Deploy Attorney instead of just LegalService if the page is about a specific lawyer. Deploy Hotel instead of just LodgingBusiness if the property fits cleanly. Specificity tightens the entity-graph match.
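A sketch of what specificity looks like in practice, declaring the narrow type alongside its parent so engines that only resolve the broader type still match — every value below is hypothetical:

// Illustrative only — not a real firm.
{
  "@context": "https://schema.org",
  "@type": ["Attorney", "LegalService"],
  "name": "Doe Estate Law",
  "areaServed": "Charlotte, NC",
  "knowsAbout": ["Estate planning", "Probate"]
}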

Tier 4 — social-proof types

Where review volume justifies them.

  • Review — individual reviews, with author, reviewRating, datePublished.
  • AggregateRating — averaged across all reviews, with ratingValue, reviewCount, bestRating, worstRating.

These produce the star-rating rich result in Google SERPs and signal social-proof depth to AI engines. The Charlotte estate-law audit and the Savannah PI audit both surfaced this as a high-leverage gap — the firms had real review volume (179 and 80+ reviews respectively, both at 4.9-star averages) and zero Review/AggregateRating schema deployed.

Don't fabricate review counts. The engines validate against indexed review-platform data. A mismatch is detectable and gets treated as a quality signal against the site.
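A sketch of AggregateRating nested on the business type — the firm name is hypothetical, and the figures mirror the Charlotte audit example above (they must match the indexed review-platform counts):

// Illustrative only — firm name is hypothetical; figures mirror the Charlotte example.
{
  "@context": "https://schema.org",
  "@type": "LegalService",
  "name": "Doe Estate Law",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.9",
    "reviewCount": "179",
    "bestRating": "5",
    "worstRating": "1"
  }
}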

What about schemas that get over-deployed?

Three types we see in audits more often than they earn their keep.

  • SiteNavigationElement — nice for technical hygiene. No measurable citation impact.
  • WebPage standalone — over-specified. The page already exists. The schema doesn't add information the engines couldn't already infer.
  • SpeakableSpecification standalone — only useful inside an FAQPage or Article speakable block. Standalone deployment is wasted markup.

We don't skip them where they add hygiene. We don't prioritize them over the Tier 1-2 set when sequencing.

What's the single most underrated technique?

@id linkage between schema blocks.

Every Tier 1 and Tier 2 schema should declare an @id URI, and every cross-reference should use the @id rather than re-declaring the entity inline.

// In Organization schema:
"founder": { "@id": "https://doxiaaxis.com/#founder" }

// In Article schema:
"author": { "@id": "https://doxiaaxis.com/#founder" }
"publisher": { "@id": "https://doxiaaxis.com/#organization" }

// In Service schema:
"provider": { "@id": "https://doxiaaxis.com/#organization" }

This builds an internal entity graph the engines can resolve with confidence. When ChatGPT or Perplexity is deciding whether to cite the article, the @id linkage from author to the Person schema tells it the author is the founder of the publisher organization, which has a thick sameAs array, which resolves the whole chain to a recognized entity.

The link graph is doing structural authority work no amount of prose can replicate.

So where does each schema actually go?

A practical map:

| Page type | Tier 1 | Tier 2 | Tier 3 |
|---|---|---|---|
| Homepage | Org, Person, WebSite | FAQPage | (per vertical) |
| About | Org, Person | — | — |
| Service / pricing | Org | Service, OfferCatalog, FAQPage | (per vertical) |
| Practice / property page | Org | FAQPage | LegalService, LodgingBusiness, etc. |
| Long-form article | Org, Person | BlogPosting + citation | — |
| Process / methodology page | Org | HowTo | — |
| Definitional answer page | Org | DefinedTerm or QAPage | — |
| Case study | Org, Person | Article (caseStudy) | — |
| Review hub | — | — | Review, AggregateRating |

BreadcrumbList belongs on every page except the homepage.

How do you validate it?

Three tools.

These three give green lights at the structural level. They don't predict citation rates — those depend on content shape and entity-graph thickness, which aren't validator-checkable. The validator is necessary. It is not sufficient.

Where to go next