Content Strategy for AI-Driven Search: Getting Cited by ChatGPT
How ChatGPT and Perplexity actually find your page
The phrase "AI search" suggests something exotic. The mechanic is more pedestrian. When someone asks ChatGPT or Perplexity a question with web access enabled, the system rewrites the question into one or more search queries, runs those queries against a real search backend (Bing for ChatGPT, a mix of indexes for Perplexity), fetches the top results, and feeds them into a language model with a prompt that says, in effect, "answer the user's question using only these sources, and cite them."
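That loop can be sketched in a few lines. Everything below is illustrative: the search backend and the model call are stubs, not any vendor's actual API, and the real systems add query rewriting, reranking, and truncation on top.

```python
# Illustrative sketch of the retrieve-then-synthesise loop described above.
# `search_backend` and `llm` are stand-ins, not a real vendor API.

def search_backend(query: str) -> list[dict]:
    # Stand-in for Bing / a search index: returns {url, text} results.
    return [{"url": "https://example.com/lcp",
             "text": "Google's LCP target is 2.5 seconds."}]

def llm(prompt: str) -> str:
    # Stand-in for the language-model call.
    return "Google's target is 2.5 seconds [1]."

def answer_with_citations(user_question: str) -> str:
    # 1. Rewrite the question into a search query (in practice the model
    #    often does this rewriting itself).
    query = user_question.rstrip("?")
    # 2. Run the query and fetch the top results.
    sources = search_backend(query)
    # 3. Build a prompt that pins the model to the fetched sources.
    numbered = "\n".join(
        f"[{i}] {s['url']}\n{s['text']}" for i, s in enumerate(sources, 1)
    )
    prompt = (
        "Answer the question using only these sources, and cite them.\n\n"
        f"{numbered}\n\nQuestion: {user_question}"
    )
    # 4. The model synthesises an answer from whatever the sources contain.
    return llm(prompt)
```

The point of the sketch is step 4: by the time the model writes the answer, the only material it has is the text of the fetched pages.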
The page either gets fetched, or it doesn't. If it gets fetched, it either gets cited, or it doesn't. The first half is classical SEO. The second half, where the LLM decides what to extract and quote, is where this post is useful.
The synthesis step is what you optimise for
A page that ranks fifth on Bing has done most of the SEO work. Whether it ends up cited depends on whether the model can use it.
Three properties decide that. First, the answer needs to be in the page, in plain text, near the top. Models read context windows from the beginning, and they have token budgets. A page that buries the answer in section seven will lose to a page that opens with it. Second, the claim needs to look extractable. A sentence that says "the threshold is 2.5 seconds" is easy to lift; a paragraph that hedges around the same number with "depending on context, somewhere between two and three seconds is generally considered" is harder, and the model is less confident pulling from it. Third, the claim needs to look authoritative: it cites its source, names a date, attributes the figure. Models trust pages that look like they were written by people who knew where their numbers came from, and they're suspicious of pages that don't.
Put differently: you're writing for a reader whose job is to verify your claims and quote them faithfully. That's the synthesis step.
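The three properties can be approximated mechanically. Here is a rough lint, under the assumption that "near the top" means within the first 500 characters, "extractable" means the claim's sentence carries a concrete figure, and "authoritative" means a visible parenthetical attribution. The thresholds and patterns are illustrative, not a standard.

```python
import re

def citability_report(page_text: str, answer: str) -> dict:
    """Rough heuristics for the three properties above. Illustrative only."""
    # 1. Is the answer near the top? (Assumption: first 500 characters.)
    position = page_text.find(answer)
    near_top = 0 <= position < 500
    # 2. Is the claim extractable? A sentence with a concrete figure lifts cleanly.
    sentences = re.split(r"(?<=[.!?])\s+", page_text)
    sentence = next((s for s in sentences if answer in s), "")
    extractable = bool(re.search(r"\d", sentence))
    # 3. Does it look attributed? A parenthetical source like
    #    "(Akamai retail study, 2017)" or "(web.dev)" in the same sentence.
    attributed = bool(
        re.search(r"\((?:[A-Z][\w.]*[^)]*\d{4}|[\w.]+\.\w{2,})\)", sentence)
    )
    return {"near_top": near_top, "extractable": extractable,
            "attributed": attributed}
```

Run against your own opening paragraph, it tells you which of the three properties you are missing; it won't tell you whether the claim is true.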
Specifics that move the needle
Lead with the answer. If the question is "how fast should a website load?", the first paragraph should say so. The discussion goes underneath. This is the single most useful change you can make to existing content. We've watched pages move from "fetched but not cited" to "cited" with no other change.
Question-and-answer structure. Not as schema, as actual page structure. A heading that asks the question. A paragraph that answers it directly. A paragraph that explains the answer. Move on. This is also good for human readers, and it's how the strongest documentation reads.
Claim, evidence, citation. Every load-bearing assertion should be followed by the evidence and a named source. "A 100-millisecond delay reduces conversions by 7 percent (Akamai retail study, 2017)" is a sentence the model will lift cleanly. "Studies show small delays hurt conversions" is a sentence the model will skip in favour of someone else's. The discipline of writing this way also keeps you honest, because you can't make up an attribution as easily as you can make up a vague claim.
External links to your sources. They serve two purposes. They give the model a way to verify your claim, which makes it more confident citing you. And they give the user a way to do the same, which builds trust over time. The "no outbound links so we don't lose the user" instinct from 2010-era SEO is counterproductive here.
Don't bury the answer below an interstitial. A model fetching your page sees only what its user agent is served. If your page makes the user accept cookies before reading anything, the model gets the cookie wall instead of the article. Same for paywalls, sign-up gates, and aggressive interstitials. The fix is to serve the content; the interstitial is a tax on your discoverability that you've been getting away with until now.
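One way to check is to look at the raw HTML the way a fetcher does: no JavaScript, no cookies, no scrolling. A minimal sketch, using tag stripping as a crude stand-in for a real text extractor (the phrase and HTML in the usage are made up for illustration):

```python
import re

def content_visible(raw_html: str, key_phrase: str) -> bool:
    """Return True if key_phrase survives in the server-rendered HTML,
    i.e. a non-JS fetcher (like a model's retriever) would see it."""
    # Drop script/style bodies, then strip remaining tags, as a crude
    # approximation of what a text extractor keeps from the response.
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ",
                  raw_html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)
    return key_phrase in re.sub(r"\s+", " ", text)

# Usage (hypothetical URL and phrase):
#   raw = urllib.request.urlopen("https://example.com/post").read().decode()
#   content_visible(raw, "2.5 seconds")
```

If the function returns False on your own key claim, the model never saw the claim, no matter how well the page ranks.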
A worked example
The question: "What's a reasonable LCP target in 2026?"
Two pages compete to be cited.
Page A opens with a 400-word personal story about how the author got into web performance, walks through the history of speed metrics from PageSpeed Index in 2009 to LCP today, and somewhere around the eleventh paragraph mentions that Google's "good" threshold is 2.5 seconds.
Page B opens: "Google's published target for Largest Contentful Paint, measured at the 75th percentile of real users in the Chrome User Experience Report, is 2.5 seconds. Below that is 'good'; between 2.5 and 4 seconds is 'needs improvement'; above 4 seconds is 'poor'. The thresholds are documented on web.dev under Core Web Vitals."
Both pages are correct. The question is which one the model can use without three rounds of summarisation. Page B gets cited. Page A might get fetched, but the content the model extracts will be terse and unattributed, and the citation goes to whichever competing page is in the same shape as Page B.
Notice what Page B is doing. The answer is in the first sentence. The answer carries a number. The number carries an attribution (web.dev). The supporting context is one sentence. There's no rhetorical setup. This isn't dry; it's just respectful of the reader's time, including the reader who happens to be a language model.
Some things that look like they should help but don't, much
Stuffing FAQPage schema everywhere. Useful for Google rich results. Largely irrelevant to LLM retrieval, because the model reads the rendered page. If your visible content is in the right shape, the schema is icing.
Writing 5,000-word "ultimate guides" on every topic. The page that wins on a long query is sometimes the long guide and sometimes a 600-word direct answer. What matters is the density of useful content per scroll, not the total length. A short page with high density beats a long page that pads.
Stuffing "according to ChatGPT" or competitor brand names into your copy. This is the new keyword stuffing. It doesn't help, and at scale it'll start triggering quality filters. Write for humans. Cite real sources.
Measuring what's working
The honest answer: measurement is still rough. Some practical signals worth checking:
Ask ChatGPT, Perplexity, and Google's AI Overviews the questions you care about, and search your brand name directly. See whether you're cited and what the citation looks like. Do this monthly, for your target queries as well as your brand.
Watch your referrer logs for chat.openai.com, perplexity.ai, and the various Bing chat referrers. The traffic is small but it's growing, and the bounce rate on it is usually lower than your search bounce rate, because the user came pre-qualified by the model's answer.
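Pulling those referrers out of a standard access log takes a few lines. A sketch assuming combined log format, where the referrer is the second-to-last quoted field; the hostnames listed are the ones named above and may change:

```python
import re
from collections import Counter

# Referrer hosts worth tallying; this list will drift as products rename.
AI_REFERRERS = ("chat.openai.com", "perplexity.ai", "bing.com")

def count_ai_referrals(log_lines):
    """Count hits per AI referrer host in combined-format access log lines."""
    counts = Counter()
    for line in log_lines:
        # Combined log format: ... "request" status bytes "referrer" "user-agent"
        quoted = re.findall(r'"([^"]*)"', line)
        if len(quoted) < 3:
            continue
        referrer = quoted[-2]
        for host in AI_REFERRERS:
            if host in referrer:
                counts[host] += 1
                break
    return counts
```

Compare the bounce rate on those hits to your organic-search baseline and you can test the pre-qualification claim on your own data.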
Track branded search volume. Citations drive curiosity. If your brand is showing up in answers, your direct and branded organic searches go up over the following weeks.
The summary
Treat the LLM as a reader whose attention is shorter than yours and whose job is to quote you accurately. Lead with the answer, structure as question and direct response, attribute every number, link to your sources, and don't put a cookie wall in front of your content. The same edits make your page better for human readers. That's the strategy. The rest of the AI-search conversation is mostly noise on top.