
Schema markup for AI search: Organization, FAQPage, Article, LocalBusiness

Six Schema.org types materially affect AI citation: Organization with sameAs, FAQPage, Article/BlogPosting with declared Person author, LocalBusiness/Service, BreadcrumbList, AggregateRating. Complete reference with copy-paste JSON-LD patterns.

Data for AI Search Editorial Team · 15 min read

Schema markup, structured data declared in JSON-LD format and embedded in HTML, is one of the highest-leverage AI search optimizations available, and also one of the most consistently underused. As of mid-2026, six Schema.org types materially affect AI assistant citation: Organization with a sameAs array, FAQPage, Article or BlogPosting with a declared Person author entity, LocalBusiness or Service for local brands, BreadcrumbList, and AggregateRating or Review. These six types make up Check 4 in our 10-Point AI Citation Audit, scored 0-10 with sub-points per schema type. A site-wide schema rollout (typically a one-day engineering project) lifts Check 4 scores by 4-7 points and produces measurable AI citation lift within 30-45 days across ChatGPT, Perplexity, Claude, and especially Gemini, which weights schema validation heavily because it operates atop Google's structured-data parsers. This guide is the practical reference: what each schema type does, the exact JSON-LD pattern to ship, what AI assistants extract from each, and the common mistakes that silently break schema validity.

What is schema markup?

Schema markup is structured data — machine-readable metadata about page content — declared using vocabulary from Schema.org and embedded in the page's HTML. The dominant serialization in 2026 is JSON-LD, which lives inside a <script type="application/ld+json"> tag in the page head or body.

A simple example:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "url": "https://acme.example.com",
  "logo": "https://acme.example.com/logo.png"
}
</script>

The JSON declares the page is about an Organization called Acme Corp with a specific URL and logo. AI parsers, search engines, and other automated systems read this structure to understand the page semantically — versus relying on text patterns, which are noisier.
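To make "machine-readable" concrete, here is a minimal sketch of how an automated system might pull JSON-LD out of a page, using only the Python standard library. The class and function names (`JsonLdExtractor`, `extract_jsonld`) are our own illustration, not how any particular AI assistant or search engine actually parses pages.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every JSON-LD script tag in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        # Only <script type="application/ld+json"> carries JSON-LD.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            # json.loads raises ValueError if the block is malformed.
            self.blocks.append(json.loads("".join(self._buffer)))


def extract_jsonld(html):
    """Return a list of parsed JSON-LD objects found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Feeding the Acme Corp page above into `extract_jsonld` would return a one-element list whose first item has `"@type": "Organization"`. A page with no JSON-LD returns an empty list.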

Schema markup helps AI citation through three mechanisms:

Entity confirmation. A declared schema with consistent attributes across pages confirms the entity to AI assistants. Brands with complete Organization schema + matching LocalBusiness schema + declared author entity get cited more confidently than brands without the structural confirmation.

Content type recognition. FAQPage schema tells AI assistants "this page contains a FAQ structure" with explicit question-and-answer pairs. ChatGPT and Perplexity lift FAQ answers verbatim from FAQPage-marked content at materially higher rates than from prose Q&A without markup.

Authority signal. Article schema with declared Person author whose sameAs array points to LinkedIn, Wikipedia, and verified profiles provides credibility signal that Claude and Perplexity weight heavily. Without declared authorship, the same content is treated as anonymous and gets cited at lower rates.

Which schema types matter for AI search?

Six types account for 90% of the schema-related citation lift we see across audits. Listed in order of weight in our 10-Point AI Citation Audit Check 4:

  1. Organization schema with sameAs array (2 pts) — universal foundational
  2. FAQPage schema (2 pts) — content pages with Q&A
  3. Article or BlogPosting schema with declared Person author (2 pts) — every blog post
  4. LocalBusiness or Service schema (2 pts) — local businesses
  5. BreadcrumbList schema (1 pt) — navigation context
  6. AggregateRating or Review schema (1 pt) — reviewed entities

Below the six, other schema types matter less universally — Event, Product, JobPosting, Recipe, HowTo — and apply only to brands with content in those categories.

Organization schema with sameAs array

The foundational schema for any brand. Declares the entity, its URL, logo, and crucially its sameAs array linking to verified external profiles.

The canonical pattern:

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "url": "https://acme.example.com",
  "logo": "https://acme.example.com/logo.png",
  "description": "Acme Corp builds widgets for the global market.",
  "sameAs": [
    "https://www.linkedin.com/company/acme-corp",
    "https://www.crunchbase.com/organization/acme-corp",
    "https://en.wikipedia.org/wiki/Acme_Corp",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://twitter.com/acmecorp"
  ],
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "100 Main Street",
    "addressLocality": "San Francisco",
    "addressRegion": "CA",
    "postalCode": "94102",
    "addressCountry": "US"
  }
}

The sameAs array is the most important element. AI assistants (especially Gemini and Perplexity) use it to cross-validate entity claims. A brand declaring Organization schema with sameAs links to LinkedIn, Wikipedia, and Wikidata is positioning itself as an entity that AI assistants can verify externally — and that verification correlates with citation confidence.

Declare the Organization schema once, as a single canonical block, typically in the homepage <head>. If it lives in a shared layout component that renders on every page, make sure every page emits the identical block; individual pages should never carry their own variant of it.
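One way to keep that declaration canonical is to generate it from a single function that the shared layout calls. The sketch below is illustrative only: `render_organization_jsonld` and its parameters are hypothetical names, and the URL check is a basic sanity test, not proof that a profile exists.

```python
import json
from urllib.parse import urlparse


def render_organization_jsonld(name, url, logo, same_as):
    """Build the site-wide Organization block as a ready-to-embed script tag."""
    # Reject sameAs entries that are not absolute http(s) URLs.
    for link in same_as:
        parsed = urlparse(link)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"sameAs entry is not an absolute URL: {link!r}")

    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "sameAs": list(same_as),
    }
    # Escape "</" so a stray "</script>" in a value cannot close the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"
```

Because every page gets its markup from the same call, two pages cannot drift into declaring different sameAs arrays.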

FAQPage schema deep-dive

FAQPage schema marks up explicit question-and-answer pairs in machine-readable form. AI assistants extract FAQ answers verbatim at materially higher rates from FAQPage-marked content than from prose Q&A without markup.

The canonical pattern:

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Answer Engine Optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Answer Engine Optimization (AEO) is the practice of engineering content so AI assistants cite your brand by name when answering buyer-intent questions."
      }
    },
    {
      "@type": "Question",
      "name": "Is AEO different from SEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. AEO targets AI-generated answers; SEO targets ranked search results. The signals overlap roughly 40%, with the remaining 60% diverging on content structure, authority, and entity signals."
      }
    }
  ]
}

FAQPage schema should be deployed on every page with substantive Q&A content. Every Cluster 1, Cluster 2, and Cluster 3 article on dataforaisearch.com ships with FAQPage schema. We have not audited a brand where adding FAQPage schema to top content pages failed to lift FAQ-shaped query citation rates within 30 days.

Critical: the FAQPage schema must mirror the visible FAQ content exactly. Schema declaring questions and answers that don't appear in the visible page content can be flagged as deceptive by search engines and AI assistants. Validate via Google's Rich Results Test that the schema matches the rendered page.
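The mirroring requirement can also be checked automatically before a validator run. The following is a minimal sketch, assuming you already have the page's visible text as a plain string (how you extract it is up to your stack). The function name `faq_mismatches` is hypothetical, and the comparison is a simple whitespace- and case-insensitive substring match, which is stricter than what any search engine necessarily applies.

```python
import re


def _normalize(text):
    """Collapse whitespace and lowercase so layout differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def faq_mismatches(faq_schema, visible_text):
    """Return (kind, question) pairs for schema entries missing from the page."""
    page = _normalize(visible_text)
    missing = []
    for entry in faq_schema.get("mainEntity", []):
        question = entry.get("name", "")
        answer = entry.get("acceptedAnswer", {}).get("text", "")
        if _normalize(question) not in page:
            missing.append(("question", question))
        if _normalize(answer) not in page:
            missing.append(("answer", question))
    return missing
```

An empty result means every declared question and answer was found in the visible text. Any returned pair names the question whose text or answer needs to be reconciled with the page.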

Article and BlogPosting schema with declared Person author

Article schema (or its sub-type BlogPosting) marks up editorial content with metadata: headline, publication date, modification date, author, image, publisher. The most important sub-element is the author field, which should be a Person entity with sameAs array linking to verified profiles.

The canonical pattern:

{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "How to get cited by ChatGPT",
  "datePublished": "2026-06-23",
  "dateModified": "2026-06-23",
  "mainEntityOfPage": "https://yourdomain.com/learn/how-to-get-cited-by-chatgpt",
  "image": [
    "https://yourdomain.com/images/article-hero.jpg"
  ],
  "author": {
    "@type": "Person",
    "name": "Author Name",
    "url": "https://yourdomain.com/about",
    "sameAs": [
      "https://www.linkedin.com/in/authorname/",
      "https://www.wikidata.org/wiki/QXXXXXX",
      "https://twitter.com/authorhandle"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Your Brand",
    "logo": {
      "@type": "ImageObject",
      "url": "https://yourdomain.com/logo.png"
    }
  }
}

The declared Person author entity is the single most important element. Brands that ship Article schema without a Person author — or with only a generic "Editorial Team" string — leave citation rates on the table. Claude especially weights declared authorship; brands with verified author entities get cited at 15-25% higher rates than identical content with anonymous authorship per our internal audits.

Site-wide Article schema deployment is a one-day engineering project — inject the schema in the head template for every blog post, populating author/date/headline from CMS data. The lift compounds across the entire content surface.
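As a rough illustration of that template step, the sketch below maps a CMS record onto the BlogPosting pattern above. The input shapes are assumptions: `post`, `author`, and `publisher` are plain dictionaries with hypothetical keys (`title`, `published`, `profile_url`, and so on), so adapt them to whatever your CMS actually exposes.

```python
from datetime import date


def build_blogposting(post, author, publisher):
    """Map hypothetical CMS records onto a BlogPosting JSON-LD dictionary."""
    # Refuse to emit anonymous authorship; a named Person is the point.
    if not author.get("name"):
        raise ValueError("a named Person author is required")

    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": post["title"],
        # date.isoformat() yields YYYY-MM-DD, the format used in the pattern above.
        "datePublished": post["published"].isoformat(),
        "dateModified": post["modified"].isoformat(),
        "mainEntityOfPage": post["url"],
        "author": {
            "@type": "Person",
            "name": author["name"],
            "url": author["profile_url"],
            "sameAs": list(author.get("same_as", [])),
        },
        "publisher": {
            "@type": "Organization",
            "name": publisher["name"],
            "logo": {"@type": "ImageObject", "url": publisher["logo_url"]},
        },
    }
```

Serialize the result with `json.dumps` and place it in the post template's JSON-LD script tag. Populating `dateModified` from the CMS's real last-edit timestamp keeps the date honest without manual upkeep.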

LocalBusiness and Service schema

Local businesses (anyone with a physical service area or storefront) should ship LocalBusiness schema. National service brands without physical locations should ship Service schema for each service offered.

LocalBusiness pattern:

{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Acme Painting CA",
  "image": "https://acmepainting.example.com/storefront.jpg",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1234 Main St",
    "addressLocality": "San Diego",
    "addressRegion": "CA",
    "postalCode": "92101",
    "addressCountry": "US"
  },
  "telephone": "+1-619-555-0123",
  "openingHours": "Mo-Fr 07:30-17:00",
  "priceRange": "$$",
  "areaServed": [
    "San Diego, CA",
    "La Jolla, CA",
    "Carlsbad, CA"
  ],
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": "32.7157",
    "longitude": "-117.1611"
  }
}

LocalBusiness schema cross-validates with Google Business Profile, contributing to Gemini's Knowledge Graph entity confirmation per the Two-Track Law. Brands with both LocalBusiness schema AND complete GBP get cited more confidently on Gemini than brands with only one or the other.
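The pattern above covers LocalBusiness only. For the Service case, a minimal sketch might generate one Service block per offering, as below. The property names (`serviceType`, `provider`, `areaServed`) are standard Schema.org Service properties; the function name and the input dictionary keys are hypothetical.

```python
def build_service_schemas(provider, services):
    """Return one Service JSON-LD dictionary per service offered."""
    blocks = []
    for service in services:
        blocks.append({
            "@context": "https://schema.org",
            "@type": "Service",
            "name": service["name"],
            "serviceType": service["type"],
            # The provider ties each service back to the brand entity.
            "provider": {
                "@type": "Organization",
                "name": provider["name"],
                "url": provider["url"],
            },
            "areaServed": list(service.get("areas", [])),
        })
    return blocks
```

Each returned dictionary would be serialized into its own JSON-LD script tag on the page that describes that service.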

BreadcrumbList and AggregateRating

BreadcrumbList schema declares the navigation hierarchy of a page — useful for AI assistants reasoning about content context.

{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://yourdomain.com"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Learn",
      "item": "https://yourdomain.com/learn"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Schema for AI Search",
      "item": "https://yourdomain.com/learn/schema-markup-for-ai-search"
    }
  ]
}
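Breadcrumb markup is mechanical enough to generate from the page URL. The sketch below assumes each path segment corresponds to one breadcrumb level and that you supply a human-readable label per level; `build_breadcrumbs` is a hypothetical helper, and sites whose navigation does not follow their URL structure would need a different source of truth.

```python
from urllib.parse import urlsplit


def build_breadcrumbs(page_url, labels):
    """Build a BreadcrumbList from a URL and one label per level (home first)."""
    parts = urlsplit(page_url)
    origin = f"{parts.scheme}://{parts.netloc}"
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(labels) != len(segments) + 1:
        raise ValueError("need one label for the home page plus one per path segment")

    items = []
    for position, label in enumerate(labels, start=1):
        # Position 1 is the site root; each later position adds one segment.
        path = "/".join(segments[: position - 1])
        item_url = f"{origin}/{path}" if path else origin
        items.append({
            "@type": "ListItem",
            "position": position,
            "name": label,
            "item": item_url,
        })
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }
```

Calling it with `https://yourdomain.com/learn/schema-markup-for-ai-search` and the labels `["Home", "Learn", "Schema for AI Search"]` reproduces the three-item structure shown above.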

AggregateRating schema declares aggregated review and rating data for an entity. Use it only when the brand has real review data backing the aggregate; fabricated aggregate ratings can be penalized by search engines.

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.7",
    "reviewCount": "247"
  }
}
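To keep the aggregate tied to real data, one option is to compute it from the underlying reviews at build time instead of hand-typing the numbers. This is a minimal sketch under stated assumptions: `ratings` is a list of numeric scores you already hold, the 1-5 scale is a default you can change, and `build_aggregate_rating` is a hypothetical helper.

```python
def build_aggregate_rating(ratings, best=5, worst=1):
    """Derive an AggregateRating block from actual review scores.

    Returns None when there are no reviews, so no rating is ever invented.
    """
    if not ratings:
        return None
    for score in ratings:
        if not worst <= score <= best:
            raise ValueError(f"rating {score} is outside the {worst}-{best} scale")

    mean = sum(ratings) / len(ratings)
    return {
        "@type": "AggregateRating",
        "ratingValue": round(mean, 1),
        "reviewCount": len(ratings),
        "bestRating": best,
        "worstRating": worst,
    }
```

The returned dictionary slots into the `aggregateRating` property of the entity block above. When the function returns None, omit the property entirely.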

How to validate your schema

Three validators to run quarterly:

Google Rich Results Test (https://search.google.com/test/rich-results). Tests whether schema is eligible for Google's rich result features. Critical for Gemini since Gemini uses Google's parsers.

Schema.org Validator (https://validator.schema.org/). Tests pure Schema.org compliance independent of any consumer system. Catches issues Google's tool may not flag.

Bing Webmaster Tools Markup Validator. Covers Bing's interpretation. Less critical than Google's but useful for Microsoft Copilot optimization.

Run all three quarterly. Schema rot is real — a deploy can introduce errors that silently break validity for months.
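Between validator runs, a lightweight check in your deploy pipeline can catch the most obvious rot early. The sketch below is not a substitute for the three validators: the `REQUIRED` table is our own house checklist drawn from the patterns in this guide, not an official Schema.org or Google requirement list, and `lint_block` is a hypothetical name.

```python
# House checklist (an assumption, not an official spec): properties we
# expect on each top-level type covered in this guide.
REQUIRED = {
    "Organization": ["name", "url", "sameAs"],
    "FAQPage": ["mainEntity"],
    "Article": ["headline", "datePublished", "dateModified", "author"],
    "BlogPosting": ["headline", "datePublished", "dateModified", "author"],
    "LocalBusiness": ["name", "address", "telephone"],
    "BreadcrumbList": ["itemListElement"],
}


def lint_block(block):
    """Return a list of human-readable problems for one parsed JSON-LD block."""
    schema_type = block.get("@type")
    if not isinstance(schema_type, str):
        return []  # Multi-typed or untyped blocks are out of scope here.

    problems = []
    for prop in REQUIRED.get(schema_type, []):
        if not block.get(prop):
            problems.append(f"{schema_type}: missing {prop}")

    # Flag the anonymous-author pattern: a bare string instead of a Person.
    if schema_type in ("Article", "BlogPosting"):
        author = block.get("author")
        if author and not (isinstance(author, dict) and author.get("@type") == "Person"):
            problems.append(f"{schema_type}: author should be a Person entity")
    return problems
```

Failing the build when `lint_block` returns anything turns a silent months-long breakage into a same-day fix.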

Common schema mistakes that hurt AI citation

Schema doesn't match visible content. FAQPage declaring questions not visible on the page; Article schema with a different headline than the visible H1; LocalBusiness schema with a different address than the visible footer. All can be flagged as deceptive.

Missing or malformed sameAs array. Organization schema without sameAs provides no cross-validation signal. With broken URLs, it provides negative signal.

Anonymous Article author. BlogPosting schema with no author field, or with author as a generic string ("Editorial Team") instead of a Person entity. Cuts Claude citation rate materially.

Stale dates. dateModified that doesn't actually reflect when the page was modified. Perplexity weights dateModified strongly; misrepresented dates harm credibility.

Multiple conflicting schemas on the same page. Two Organization schemas with different sameAs arrays. Two Article schemas with different headlines. Validators flag conflicts; AI assistants may default to citing neither.

JSON-LD inside SPA components without server-side rendering. Single-page apps that inject schema via JavaScript after initial page load may have schema invisible to crawlers that don't execute JS. Server-render or pre-render schema for crawler compatibility.
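Of the mistakes above, conflicting schemas are among the easiest to detect mechanically. The sketch below takes the parsed JSON-LD blocks from one page and reports any type that is declared more than once with differing content. It inspects top-level blocks only (a publisher Organization nested inside a BlogPosting is not compared), and `find_conflicts` is a hypothetical name.

```python
import json
from collections import defaultdict


def find_conflicts(blocks):
    """Return the sorted @type names declared more than once with different content."""
    by_type = defaultdict(list)
    for block in blocks:
        schema_type = block.get("@type")
        if isinstance(schema_type, str):
            by_type[schema_type].append(block)

    conflicts = []
    for schema_type, group in by_type.items():
        if len(group) < 2:
            continue
        # Serialize with sorted keys so key order alone never counts as a conflict.
        distinct = {json.dumps(block, sort_keys=True) for block in group}
        if len(distinct) > 1:
            conflicts.append(schema_type)
    return sorted(conflicts)
```

Run this against the raw server response, not a browser-rendered DOM. That way the same check also reveals the SPA problem: if the list of blocks comes back empty, crawlers that skip JavaScript see no schema at all.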

Frequently asked questions

Does schema markup directly improve Google rankings?

Not directly. Schema markup makes pages eligible for rich result features and improves entity understanding, both of which can indirectly influence ranking. The direct citation benefit on AI assistants is the bigger story in 2026.

Should every page have every type of schema?

No. Pages should have schema appropriate to their content. Article schema on a blog post; LocalBusiness schema on the homepage and contact page; FAQPage schema on FAQ-bearing pages. Over-schemaing dilutes signal.

Does schema markup affect Perplexity differently from ChatGPT?

Yes, meaningfully. Perplexity weights schema validation heavily because of entity confirmation requirements. ChatGPT weights schema less heavily because it relies more on content geometry and brand mention frequency. Per the Two-Track Law, schema is a Track 2 signal that benefits Perplexity and Gemini disproportionately.

Can AI-generated content with schema markup still rank well?

Yes. AI assistants don't directly detect or penalize AI-generated content (yet). What they do detect is shallow content, missing source attribution, anonymous authorship, and factually unverifiable claims. AI-assisted content with declared authorship, sourced statistics, and meaningful depth ranks fine.

What about Speakable schema for voice assistants?

SpeakableSpecification is an experimental Schema.org type for content optimized for voice assistant reading. As of mid-2026 it is still adopted unevenly across AI assistants. It is worth implementing on FAQ pages where voice query traffic matters; defer it on sites where voice is not a primary buyer journey.


Companion guides: AI bot robots.txt complete guide · The Cloudflare GPTBot trap · llms.txt complete guide for 2026 · The 10-Point AI Citation Framework.