
White paper 02 · SEO · GEO · Marketing

Generative-Engine Optimization (GEO). The practical playbook for getting cited by ChatGPT, Perplexity, and Google AI Mode.

A practitioner guide for B2B and ecommerce brands competing for AI-search visibility. Schema foundations, citation-friendly content patterns, and a 30-day GEO audit checklist.

Version: 1.0 · May 2026
Length: ~24 pages · 4,800 words
Lead author: Shaili Gupta
Audience: Marketing leaders, founders
Reading time: ~22 minutes

Executive summary

What this paper covers, in plain English.

AI search has split the funnel. Half of consumers now use ChatGPT, Perplexity, Claude, or Google's AI Mode for everyday questions (McKinsey, 2025). When those engines answer a question, they cite a small set of source pages. If your brand is in that set, you get qualified, intent-rich traffic. If it is not, your competitor does.

This paper is a working playbook, not a literature review. It documents the technical foundations that make a page citable, the content patterns that AI engines reach for, the brand-mention monitoring tools that prove the work is paying back, and a 30-day audit checklist any team can run today.

The takeaway is simple: GEO is mostly accessibility, structured data, and clear writing, with three new layers on top. If you are already serious about SEO and accessibility, you are 70% of the way there.

01 · The shift from search to answers

Search has split. Roughly half of all consumer search queries now go to AI answer engines (ChatGPT, Perplexity, Claude, Google's AI Mode, Bing Copilot) instead of the classic ten blue links. The behaviour is different, the funnel is different, and the optimization moves are different.

When a user types a question into a generative engine, the engine does three things in sequence:

  • Decomposes the question into sub-queries
  • Retrieves a small set of candidate documents per sub-query
  • Synthesises an answer, citing the documents it leaned on
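The three steps above can be sketched in miniature. This is an illustrative toy, not any engine's real pipeline: real engines decompose queries with an LLM and retrieve with neural rankers, whereas here simple keyword overlap stands in for both, and all names and URLs are made up.

```python
def decompose(question):
    # Illustrative stand-in: a real engine generates sub-queries with an LLM.
    return [question, question + " examples"]

def retrieve(sub_query, corpus, k=3):
    # Score each document by word overlap with the sub-query, keep the top k.
    q_words = set(sub_query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def answer(question, corpus):
    # Synthesise step omitted; what matters here is the cited set.
    cited = {}
    for sq in decompose(question):
        for doc in retrieve(sq, corpus):
            cited[doc["url"]] = doc
    return sorted(cited)

corpus = [
    {"url": "https://example.com/geo-guide",
     "text": "what is generative engine optimization geo"},
    {"url": "https://example.com/pricing",
     "text": "plans and pricing"},
]
print(answer("what is generative engine optimization", corpus))
```

The point of the sketch: the cited set is small and winner-take-all per sub-query, which is why the rest of this paper optimises for extractability rather than rank.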

The traffic prize is no longer rank #1. The prize is being one of the 3-7 documents the engine cites. That is a binary outcome: cited or not cited. There is no second page.

This paper is about how to engineer your site so that you are in the cited set, repeatedly, for the question types your buyers ask.

02 · How AI engines decide what to cite

AI engines do not read websites the way a human reader does. They do not bounce, they do not scroll, they do not follow side links. They see a structured representation of a page — typically a markdown or plain-text conversion of the raw HTML the server returns — and they look for three things:

1. Direct answers to the sub-query. A passage that answers the exact question the engine is trying to ground. The closer your sentence is to the way the user asked the question, the higher the citation probability.

2. Trust signals. Author bylines, organisation schema, citations to primary sources, plain dates. Engines have learned to down-weight pages with no author, no date, no source list.

3. Structural clarity. H1 / H2 hierarchy that matches the page's argument. Lists where lists make sense. Tables where tables make sense. The structure tells the model what to extract.

A blog post that hides the answer in paragraph six, with no byline and no schema, will lose the citation slot to a competitor's clearly structured page, even if the buried post is better written.

03 · Technical foundations

GEO is mostly accessibility, structured data, and clear writing. If you are already doing those well, you are 70% of the way there.

Server-side rendering or pre-rendering

Most engines do not execute JavaScript when fetching pages. SPAs that render content client-side are essentially invisible. Use SSR (Next.js, Astro, Nuxt) or pre-rendering. Verify by viewing the raw HTML of your top pages — the answer copy must be in the source, not painted in by JS.
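That verification step can be scripted. A minimal sketch, assuming you pass in the raw response body (from curl or view-source), not the browser DOM after hydration; the page strings below are illustrative:

```python
def answer_in_raw_html(raw_html: str, answer_snippet: str) -> bool:
    """True if the answer copy is present in the server-delivered HTML.

    Compare against the raw response body, never the rendered DOM:
    a client-side SPA passes the DOM check but fails this one.
    """
    # Normalise whitespace so line-wrapped copy still matches.
    collapse = lambda s: " ".join(s.split()).lower()
    return collapse(answer_snippet) in collapse(raw_html)

ssr_page = ("<html><body><h1>GEO guide</h1><p>GEO is mostly accessibility,\n"
            " structured data, and clear writing.</p></body></html>")
spa_page = ('<html><body><div id="root"></div>'
            '<script src="/app.js"></script></body></html>')

snippet = "GEO is mostly accessibility, structured data, and clear writing."
print(answer_in_raw_html(ssr_page, snippet))  # True
print(answer_in_raw_html(spa_page, snippet))  # False
```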

Schema.org structured data

Minimum useful set:

  • Organization on the homepage
  • Article or BlogPosting on every editorial page
  • Product on every commerce page
  • FAQPage on Q&A blocks
  • BreadcrumbList site-wide
  • Service on every service page
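For concreteness, here is a minimal Article block from that set, built as a Python dict and serialised to JSON-LD. The @type and property names are schema.org's; the values are placeholders drawn from this paper:

```python
import json

# Minimal Article JSON-LD with author and dates filled in.
# Property names follow schema.org; the values are placeholders.
article_schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Generative-Engine Optimization (GEO): the practical playbook",
    "author": {"@type": "Person", "name": "Shaili Gupta"},
    "publisher": {"@type": "Organization", "name": "OpenSource Technologies"},
    "datePublished": "2026-05-01",
    "dateModified": "2026-05-01",
}

# Embed the output in the page head inside
# <script type="application/ld+json"> ... </script>
print(json.dumps(article_schema, indent=2))
```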

Headings, lists, and tables

One H1 per page. H2s for major sections. H3s for sub-sections. Lists for parallel items. Tables for comparison. Engines extract on these boundaries.
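The one-H1 rule is easy to audit in bulk. A small sketch using Python's standard-library HTML parser (the sample page string is illustrative):

```python
from html.parser import HTMLParser

class HeadingAudit(HTMLParser):
    """Collects h1-h3 tags so the one-H1 rule can be checked."""
    def __init__(self):
        super().__init__()
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.headings.append(tag)

def h1_count(html: str) -> int:
    audit = HeadingAudit()
    audit.feed(html)
    return audit.headings.count("h1")

page = "<h1>GEO</h1><h2>Schema</h2><h2>Headings</h2><h3>Lists</h3>"
print(h1_count(page))  # 1
```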

Author and date metadata

Visible byline, visible publish date, visible last-updated date. Behind the scenes: author and datePublished in schema. Pages with neither are 30-40% less likely to be cited (BrightEdge, 2025).

04 · Content patterns that work

The five content patterns AI engines lean on most heavily:

Direct-answer paragraphs. A 30-60 word answer placed within the first 200 words of the page, followed by a longer explanation. The first paragraph does the citation work.

Numbered or bulleted lists. When the answer is a sequence (steps, options, criteria), give the engine a list. Do not bury the list inside prose.

Comparison tables. When the user is comparing two or more things, a table is far easier for the engine to extract than a paragraph that says "X is Y but Z is W".

Definitions with examples. "X is Y. For example, A and B." The engine cites the definition; the example is the proof.

Original numbers and case data. Numbers from your own data — "In our 2025 audit of 47 K-12 sites, 71% had broken H1 hierarchy" — are a strong citation magnet. Engines prefer original sources to summaries of summaries.

What to avoid: walls of text, sales-y copy with no factual claim, content that requires an account or paywall to read, stock-photo-heavy pages with sparse copy.
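The direct-answer pattern is mechanical enough to lint. A rough heuristic check, assuming you have already split the page copy into paragraphs; the 30-60 word and 200-word thresholds come from the pattern described above:

```python
def direct_answer_check(paragraphs, max_lead_words=200, answer_range=(30, 60)):
    """Flag whether some early paragraph looks like a direct answer:
    30-60 words long, starting within the first 200 words of copy."""
    words_seen = 0
    for p in paragraphs:
        n = len(p.split())
        if words_seen < max_lead_words and answer_range[0] <= n <= answer_range[1]:
            return True
        words_seen += n
    return False

good_page = [
    "GEO is the practice of structuring pages so AI engines can cite them. "
    "It combines accessibility, schema markup, and direct-answer writing, "
    "and it rewards pages that answer the question in the opening paragraph.",
    "The rest of this page explains each technique in depth.",
]
bad_page = ["Welcome to our award-winning platform."]

print(direct_answer_check(good_page))  # True
print(direct_answer_check(bad_page))   # False
```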

05 · Measurement & monitoring

GEO has poor analytics. None of the major engines pass referral traffic the way Google does. You will not see "Perplexity" as a source in GA4 today. The proxy metrics that work:

  • Direct-traffic uplift to the pages you GEO-optimised, year over year. Most AI-driven traffic lands as direct.
  • Brand-mention monitoring. Tools like Profound, AthenaHQ, and Otterly run prompt suites against the engines and flag whether your brand appears in answers.
  • Server-side log analysis. AI engines crawl with identifiable user agents. Track GPTBot, ClaudeBot, PerplexityBot in your access logs.
  • Manual prompt testing. Once a month, run your top 20 buyer questions through ChatGPT, Perplexity, and Google AI Mode. Note when you appear and when a competitor does instead.
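The server-log check above reduces to matching known crawler user agents. A minimal sketch (the sample log lines are fabricated; match on your own access-log format):

```python
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

def count_ai_crawls(log_lines):
    """Tally hits per AI crawler by matching known bot names
    in the user-agent portion of each access-log line."""
    hits = Counter()
    for line in log_lines:
        for bot in AI_BOTS:
            if bot in line:
                hits[bot] += 1
    return hits

sample = [
    '1.2.3.4 - - [01/May/2026] "GET /geo-guide HTTP/1.1" 200 "-" '
    '"Mozilla/5.0; compatible; GPTBot/1.2"',
    '5.6.7.8 - - [01/May/2026] "GET /pricing HTTP/1.1" 200 "-" '
    '"PerplexityBot/1.0"',
    '9.9.9.9 - - [01/May/2026] "GET / HTTP/1.1" 200 "-" "Mozilla/5.0"',
]
print(count_ai_crawls(sample))
```

Substring matching on the raw line is deliberately crude but robust across log formats; swap in proper combined-log parsing if you need per-path breakdowns.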

Do not chase a single dashboard number. Treat it the way SEO teams treated rank tracking pre-2010: directional, not absolute.

06 · The 30-day audit checklist

Run this against your top 20 commercial pages. Score each item Yes / No / Partial.

Week 1 — technical foundations

  • Page renders on the server (view-source contains the answer copy)
  • One H1 per page, sensible H2 / H3 hierarchy
  • Organization schema on homepage
  • Article or Service schema on the page
  • BreadcrumbList schema site-wide
  • Visible author byline and publish date
  • Last-updated date when content changed

Week 2 — content patterns

  • Direct-answer paragraph in the first 200 words
  • At least one bulleted or numbered list when the answer is a sequence
  • A comparison table when the user is comparing options
  • Original numbers, citations, or case-data
  • No content gated behind a form for top-of-funnel queries

Week 3 — trust and authority

  • Author has a real bio page on your site
  • Author bio links to LinkedIn or other public profile
  • The page cites primary sources (not just other blog posts)
  • The page has internal links to related deeper content

Week 4 — monitoring setup

  • Manual prompt sweep on 20 buyer questions, monthly
  • Server logs filtered for GPTBot / ClaudeBot / PerplexityBot
  • Direct-traffic baseline set in GA4 for the optimised pages
  • One brand-mention tool (Profound, AthenaHQ, Otterly) running
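To turn the Yes / No / Partial answers into a comparable per-page score, one simple weighting (Yes = 1, Partial = 0.5, No = 0, expressed as a percentage; the weights are our convention, not a standard):

```python
def audit_score(answers):
    """Convert a page's checklist answers into a 0-100 score.
    Yes = 1 point, Partial = 0.5, No = 0."""
    weights = {"yes": 1.0, "partial": 0.5, "no": 0.0}
    points = sum(weights[a.lower()] for a in answers)
    return round(100 * points / len(answers))

page_answers = ["Yes", "Yes", "Partial", "No", "Yes"]
print(audit_score(page_answers))  # 70
```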

07 · Common mistakes

Patterns we see repeatedly in audits.

Treating GEO as a separate channel. It is not. The same page serves Google, ChatGPT, Perplexity, and Bing. You are not building a different page for each engine; you are building one well-structured page they all agree on.

Schema soup. Adding every schema type without thinking. Engines down-weight pages with mismatched or contradictory schema (e.g. Product schema on a blog post). Pick the right one and fill it in completely.

Hiding answers behind forms. A page that asks for an email before it shows the answer is invisible to AI engines. Run the gated content separately, or move the answer above the form and gate the deeper material.

Ignoring accessibility. WCAG 2.2 AA is largely the same checklist as GEO foundations: alt text, headings, contrast, semantic HTML. Teams that have shipped accessibility already have most of the work done.

Setting up dashboards before content. A monitoring dashboard with no optimised pages to monitor is a waste. Optimise first, then measure.

08 · About the authors

Shaili Gupta is President at OpenSource Technologies. Shaili has run the GEO programme for OST clients in K-12, government, and ecommerce since 2024.

Manish Mittal is CEO at OpenSource Technologies and a Forbes Technology Council member. Manish has been working on accessibility-driven SEO since 2011 and led the technical foundation work behind this playbook.

This paper is published under a permissive license — quote it, cite it, share it. If you find a mistake or want to add a case, email shaili@ost.agency.
