
How to Get Your Site Cited by ChatGPT & Perplexity

When you query an AI assistant for "the best tool to X" or "how to do Y," only a handful of resources get cited and hyperlinked, while all others essentially vanish from that user's reality. There’s no magic formula that guarantees you’ll be cited, but there are definite prerequisites that will put your URL in the running. We’ll go over all the prerequisites, and how you can quickly verify where your website stands with them.

Understand what gets you quoted

It helps to know how a citation actually happens when an AI answer engine responds to a question. The engine gathers a pool of candidate pages, scans their text, builds a response from the passages it judges most relevant and trustworthy, and attributes those pages as its sources. Working backward from that process shows what the prerequisites are: the page needs to be accessible to the engine, readable by the engine, contain the answer to the question, and appear authoritative. Everything that follows fleshes out these requirements in greater detail.

Let crawlers in

This is the step that gets overlooked the most and causes the most pain. Each engine is addressed in robots.txt by its own token: GPTBot and OAI-SearchBot for OpenAI, ClaudeBot for Anthropic, PerplexityBot for Perplexity, and Google-Extended, the token Google uses to control whether your content is used for Gemini. A single Disallow rule can block any one of them, or all of them at once. A significant number of sites disallow these bots, sometimes through an outdated template they've been running for years and sometimes on purpose, and then wonder why they never come up in answers. At the top of your checklist is making sure none of the bots you care about are blocked. The AI readiness checker tells you this for each bot, and the guide to controlling AI crawlers shows you where to look for these disallow rules.
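As a rough illustration of that check, here is a minimal Python sketch that uses the standard library's robots.txt parser to list which of the bots named above are disallowed for a given URL. The sample robots.txt and the example.com URL are hypothetical, and Python's parser matches user-agent names in its own simple way, so treat the result as a first pass rather than a statement of how each crawler actually interprets your file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article; extend the list as needed.
AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_bots(robots_txt: str, url: str, bots=AI_BOTS) -> list[str]:
    """Return the bots that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt, e.g. left over from an old template.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    # Prints the names of the bots that cannot fetch the example page.
    print(blocked_bots(SAMPLE_ROBOTS, "https://example.com/blog/post"))
```

To run it against a real site, you would pass in the text of your own robots.txt file and one of your own page URLs in place of the sample values.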

Ensure your page is static

Letting the bot crawl your site doesn't mean a thing if there is no content for it to read on arrival. Most AI crawlers don't render JavaScript, so if your page arrives as an empty shell and the content is filled in by JavaScript, the bot has very little to go on. The answer is to serve your page's core text in the initial HTML, through either server-side rendering or pre-rendering, so that a crawler that can't run any code can still see the content. Read all about this process (and how you can test it) at do AI crawlers render JavaScript.
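One way to approximate what a non-rendering crawler sees is to take the raw HTML and extract its text without executing any scripts. The sketch below does that with Python's built-in HTML parser. The two sample pages and the pricing sentence in them are made up for illustration, and real crawlers may extract text differently.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that appears outside <script> and <style> tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_without_js(html: str) -> str:
    """Return the text a client sees if it never runs JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical pages: one server-rendered, one filled in by JavaScript.
SERVER_RENDERED = (
    "<html><body><h1>Pricing</h1>"
    "<p>Plans start at $10 per month.</p></body></html>"
)
CLIENT_RENDERED = (
    '<html><body><div id="root"></div>'
    '<script>renderApp("Plans start at $10 per month.")</script></body></html>'
)

if __name__ == "__main__":
    print(repr(text_without_js(SERVER_RENDERED)))
    print(repr(text_without_js(CLIENT_RENDERED)))
```

If the sentence you want quoted does not show up in the extracted text of your own page's raw HTML, a crawler that skips JavaScript is unlikely to see it either.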

Address the query immediately

This is where you start to diverge a little from standard web copy. A model is searching for a passage it can extract and rely upon, and it favors content that delivers its core answer immediately instead of hiding it behind a wall of introductory fluff. Present your direct response up front, then add your supporting details. Organize your article into distinct, standalone sections with accurate headings, so that any single paragraph can serve as a self-contained quote without relying on context from the rest of the page. That means establishing the context inside the passage itself. When you make a statement, be precise: a factual figure, an illustrative case study, or a direct comparison is more quotable and authoritative than a broad, imprecise claim. What makes a page actually useful for humans is also what makes it easy for a model to summarize and restate.

Provide search engines with the cues they need

In addition to the writing itself, the machine-readable cues that surround it are important. An accurate title, a clear hierarchy of headers, and Schema.org markup help engines understand the general nature and specific focus of a page. Mark up your content wherever appropriate: blog posts, catalog items, Q&A sections, and information about your company. Nothing here is cutting edge or unusual; it's simply standard technical SEO, but today it's performing double duty, helping both traditional search algorithms and AI answer engines digest the content of your page. (For a high-level explanation of how all these different pieces fit together into a single framework, read our generative engine optimization overview.)
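For a concrete picture of what Schema.org markup looks like, here is a minimal Python sketch that builds a JSON-LD block for an article. The function name, the choice of properties, and the example values are illustrative assumptions, not a complete or required set. Check the Schema.org type that fits your own content before using anything like it.

```python
import json


def article_json_ld(headline: str, author: str, date_published: str, url: str) -> str:
    """Build a <script> tag holding Schema.org Article markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }
    # Escape "</" so a value can never close the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical article details, for illustration only.
    print(
        article_json_ld(
            headline="How to Get Your Site Cited by ChatGPT & Perplexity",
            author="Jane Doe",
            date_published="2024-01-15",
            url="https://example.com/get-cited",
        )
    )
```

The resulting tag would normally be placed in the page's HTML, so it is present in the initial response rather than injected later by JavaScript.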

Build the kind of authority that influences the outcome

If several web pages all provide good enough responses to a given question, the engine will tend to defer to the most authoritative option among them. This is the one that is being spoken about, cited and alluded to across the internet, the one with established name recognition in the subject matter. This is the one area where you can't set a value in a configuration field and call it done. Authority is earned by producing work that others genuinely want to cite, such as primary research, explanatory articles and functional tools, and by the attention those things attract. It's the longer process, to be sure, but also the more sustainable. Over time, the strength of their authority is what distinguishes the websites and publications that get mentioned over and over from those that only occasionally surface.

Stay up to date and review regularly

Answer engines are more inclined to give their top spots to the freshest, best-maintained content, which means a page that gets regular attention is more likely to stay relevant than one that is left to gather dust. And since there are no guarantees in this game, you'll want to repeat these steps on a recurring basis rather than performing them just once: put the groundwork in place, then search your key questions on ChatGPT, Perplexity and the like to see whether your website makes it into the answer. If it doesn't, the list above is where you'll want to start troubleshooting. First, use the AI readiness checker to verify that the AI engines can access and read your page properly, since that is most frequently the problem.

Frequently asked questions

How do AI engines decide which sites to cite?

They retrieve a set of candidate pages for a question, read them, and quote the ones that answer it most clearly and credibly. So three things matter: the engine must be able to fetch your page, it must be able to read the content without running JavaScript, and the content has to answer the question directly and trustworthily.

Can I guarantee that ChatGPT will cite my site?

No, and anyone promising that is overselling. You can only stack the odds: make sure the crawlers are allowed in, the content is readable and well structured, the answers are direct and specific, and your brand is mentioned and trusted across the web. Citations follow from those conditions; they cannot be forced.

Does blocking AI crawlers hurt my chances?

Yes, completely. If GPTBot, OAI-SearchBot, ClaudeBot or PerplexityBot is disallowed in your robots.txt, that engine cannot fetch your pages, so it cannot cite you. It is the single most common reason a site is absent from AI answers.

Is this different from ranking on Google?

It shares the same foundation but rewards different things. Ranking is about being the best result to click; citation is about being the cleanest passage to quote. Direct answers, clear structure and factual specificity matter even more when a model is lifting a sentence to put in front of a user.
