What We Learned Analyzing 40,000+ AI Citations

Last updated: May 22, 2026

Originally published on LinkedIn by Tofu Team. Cross-posted with additional context.

Here is something we did not expect: the single biggest predictor of whether an AI assistant will cite your content has almost nothing to do with your domain authority.

We found that out the hard way. Our team at Tofu, an AI-native B2B marketing platform, spent the last quarter obsessing over a question that keeps coming up in every conversation we have with marketing leaders: How do you actually get cited by AI?

Not theoretically. Not based on a blog post someone wrote after testing three queries. We wanted real data.

So we partnered with Profound, an AEO analytics platform that tracks AI citation patterns at scale, and went deep. What we found challenged most of the assumptions we had been operating on. And honestly, it changed how we think about content strategy entirely.

The Experiment

Here is what we actually did, because the methodology matters.

Using Profound's analytics engine, we analyzed over 40,000 citations generated by four major AI assistants: ChatGPT, Claude, Perplexity, and Gemini. We focused specifically on B2B marketing categories because that is our world. These were citations surfaced in response to real user queries about topics like account-based marketing, demand generation, marketing automation, content personalization, and sales enablement.

We were not cherry-picking. We tracked citations across hundreds of queries, spanning informational searches ("what is account-based marketing"), comparison queries ("best tools for B2B personalization"), and how-to questions ("how to scale ABM content"). We logged which domains got cited, what type of content was referenced, how old the content was, and whether the cited page had specific structural elements.
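To make the logging step concrete, here is a minimal sketch of how citation records like these could be tallied. The `Citation` fields and the `citation_share` helper are our own illustrative assumptions, not Profound's schema or API, and the sample rows are invented toy data rather than study results. Note that it computes each group's share of logged citations; a true citation rate would also need a denominator, such as the number of eligible pages per content type.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One logged citation. Field names are illustrative assumptions."""
    assistant: str      # e.g. "ChatGPT", "Claude", "Perplexity", "Gemini"
    query_type: str     # "informational", "comparison", or "how-to"
    domain: str         # domain of the cited page
    content_type: str   # e.g. "definition", "comparison", "how-to"


def citation_share(citations, key):
    """Return each group's share of all logged citations, grouped by `key`."""
    counts = Counter(getattr(c, key) for c in citations)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}


# Toy records, invented purely to exercise the function (not study data).
sample = [
    Citation("ChatGPT", "informational", "example.com", "definition"),
    Citation("Claude", "informational", "example.org", "definition"),
    Citation("Perplexity", "comparison", "example.com", "comparison"),
    Citation("Gemini", "how-to", "example.net", "how-to"),
]

shares = citation_share(sample, "content_type")
print(shares)
```

The same helper can group by `assistant`, `domain`, or `query_type`, which is roughly the set of cuts described above.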

The goal was simple: find the patterns that separate content that AI assistants cite from content they ignore. Here is what jumped out.

5 Things That Surprised Us

1. Definition-style content gets cited at 3x the rate of everything else

This was the most striking finding. Content that clearly defines a concept, explains what something is, or provides a structured overview gets cited at roughly three times the rate of other content types in B2B marketing categories.

We are talking about pages that answer "what is X" questions directly and thoroughly. Not thin glossary pages. We mean substantive pieces that define a concept, explain why it matters, describe how it works, and provide concrete examples. The ones that performed best were typically 1,500 to 3,000 words long and followed a clear definitional structure.

Comparison content ("X vs Y") and how-to guides also performed well, but nothing touched pure definitional content for raw citation volume. If you are only going to create one type of content for AI visibility, make it the definitive explanation of the thing your company does.

The implication here is big. Most B2B content teams are focused on bottom-of-funnel case studies and feature comparisons. But AI assistants are disproportionately pulling from top-of-funnel educational content when building their responses. You need to own the definition layer.

2. Smaller brands are winning citations against much bigger competitors

This one caught us off guard. In traditional SEO, domain authority is a massive competitive moat. A site with a domain rating (DR) of 90 will almost always outrank a DR 40 site for competitive keywords. But in AI citations, we found that the correlation between domain authority and citation frequency is much weaker than anyone assumed.

In our dataset, we saw brands with domain ratings below 50 earning citation rates comparable to — and in some cases higher than — category incumbents with domain ratings above 80. The difference was not brand recognition. It was content specificity.
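One way to check a relationship like this yourself is a rank correlation between domain rating and citation count. The sketch below uses Spearman's rank correlation, which is our choice for illustration; the article does not state which statistic was used. The helper names and the numbers are hypothetical toy values, not figures from the dataset.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Invented toy values, one entry per domain (not study data).
domain_rating = [35, 48, 62, 81, 90]
citation_count = [40, 12, 55, 20, 38]

rho = spearman(domain_rating, citation_count)
print(f"Spearman rho: {rho:.2f}")
```

A value near zero would be consistent with the weak relationship described above, while a value near 1 would indicate that higher-rated domains reliably earn more citations.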

The smaller brands that punched above their weight all had one thing in common: they went deep on narrow topics. Instead of writing broad overview content on "marketing automation," they wrote detailed, expert-level pieces on specific subtopics like "how to set up multi-touch attribution for ABM campaigns" or "email personalization strategies for enterprise accounts."

AI assistants seem to value expertise density over raw authority. When a page goes deeper on a specific topic than anything else on the web, it gets cited — regardless of who published it. This is genuinely leveling the playing field in ways that traditional search never did.

3. Structured content dramatically outperforms unstructured content

We looked at whether content structure correlated with citation rates, and the answer was a clear yes. Pages with explicit structural elements — clear H2/H3 headings, numbered lists, comparison tables, definition-style formatting, and bullet-point summaries — were cited at significantly higher rates than narrative-only content covering the same topics.

This makes intuitive sense when you think about how these AI systems work. They are extracting information from pages and synthesizing it into responses. The easier you make it for a model to identify, extract, and attribute a specific claim or piece of information, the more likely it is to cite your page.

A few specific patterns stood out:

  • Pages with comparison tables were heavily cited for "X vs Y" queries. If your content compares tools or approaches in a structured table format, AI assistants love pulling from it.
  • Step-by-step formats dominated how-to citation responses. Numbered steps with clear action verbs made the cut far more often than prose-heavy guides.
  • Pages with a clear summary or TL;DR section near the top were cited more frequently, likely because that section contained the extractable claims the models were looking for.
  • Schema.org markup appeared on a disproportionate share of cited pages. We cannot prove causation, but the correlation was strong enough that we now treat structured data as table stakes.

The takeaway: write for extraction. Every page should have at least one section that could be cleanly pulled into an AI-generated response and attributed to your brand.
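For the Schema.org point in the list above, a minimal sketch of generating JSON-LD markup might look like this. It uses the Schema.org `Article` type with a handful of its standard properties; the helper names and every value passed in are placeholders we made up for illustration, and which type and properties suit a given page is a judgment call the analysis does not settle.

```python
import json


def article_jsonld(headline, description, published, modified, publisher_name):
    """Build a Schema.org Article object as a JSON-LD-ready dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "description": description,
        "datePublished": published,   # ISO 8601 date string
        "dateModified": modified,     # ISO 8601 date string
        "publisher": {"@type": "Organization", "name": publisher_name},
    }


def as_script_tag(data):
    """Wrap the JSON-LD in the script tag used to embed it in an HTML page."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values for illustration only.
markup = article_jsonld(
    headline="What Is Account-Based Marketing?",
    description="A definition of account-based marketing and how it works.",
    published="2025-01-15",
    modified="2026-05-01",
    publisher_name="Example Co",
)
print(as_script_tag(markup))
```

The printed tag would go in the page's HTML, and keeping `dateModified` current is one simple way to expose refreshes to crawlers.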

4. Recency matters, but not the way you think

We expected to find that newer content gets cited more. That turned out to be only partially true — and the nuance matters a lot for how you should think about your content calendar.

For time-sensitive queries (think "best AI marketing tools in 2026" or "latest trends in B2B personalization"), recency was a strong citation signal. AI assistants clearly preferred content published or substantially updated within the last six months.

But for evergreen conceptual queries ("what is demand generation" or "how account-based marketing works"), we saw content from 2023 and 2024 still getting heavily cited in 2026 — as long as it was genuinely comprehensive and well-structured. Some of the most-cited pages in our dataset were over two years old.

The pattern we identified: AI assistants care about recency for factual claims and tool comparisons, but they care about depth and authority for conceptual explanations. The most effective strategy is not to constantly churn out new content. It is to build a library of genuinely authoritative evergreen pieces and then keep them updated with fresh data and examples.

This is actually great news for lean marketing teams. You do not need to publish three blog posts a week. You need to publish the best possible piece on each of your core topics and then refresh it regularly.
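A refresh schedule along these lines can be sketched as a simple staleness check. The six-month threshold mirrors the window we observed for time-sensitive queries; the one-year evergreen threshold, the `needs_refresh` helper, and the page inventory are assumptions invented for this example.

```python
from datetime import date, timedelta

# Assumed thresholds: six months mirrors the window described for
# time-sensitive queries; the evergreen interval is an arbitrary placeholder.
TIME_SENSITIVE_MAX_AGE = timedelta(days=183)
EVERGREEN_MAX_AGE = timedelta(days=365)


def needs_refresh(last_updated, time_sensitive, today):
    """Return True when a page is older than the threshold for its kind."""
    max_age = TIME_SENSITIVE_MAX_AGE if time_sensitive else EVERGREEN_MAX_AGE
    return today - last_updated > max_age


# Hypothetical page inventory: (title, last updated, time-sensitive?)
pages = [
    ("Best AI marketing tools", date(2025, 9, 1), True),
    ("What is demand generation", date(2025, 9, 1), False),
]

today = date(2026, 5, 22)
for title, last_updated, time_sensitive in pages:
    if needs_refresh(last_updated, time_sensitive, today):
        print(f"Refresh: {title}")
```

Both toy pages share the same last-updated date, but only the time-sensitive one is flagged, which reflects the split between comparison-style and conceptual content described above.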

5. Integration and ecosystem mentions are a hidden citation driver

This was the finding we almost missed. When we looked at the content that got cited in recommendation and comparison queries ("best tools for X," "alternatives to Y"), we noticed that pages mentioning integrations with well-known platforms were cited at higher rates.

Specifically, content that mentioned integrations with HubSpot, Salesforce, Marketo, and other major platforms appeared in AI responses more frequently than comparable content that did not include integration details. The effect was strongest on Perplexity and ChatGPT.

Our theory: AI assistants are attempting to give practically useful recommendations. When they evaluate whether to cite a particular tool or platform, the presence of integration mentions serves as a signal that the product fits into existing tech stacks. It is a proxy for real-world utility that the models have learned to pick up on.

The implication for B2B marketers: your integration pages, partner ecosystem content, and "works with" documentation are not just sales enablement collateral. They are AEO assets. Make sure they are publicly accessible, well-structured, and detailed enough for AI assistants to parse.

What This Means for B2B Marketers

When we stepped back and looked at these five findings together, a clear picture emerged. AI citation is not a black box. It follows patterns that are surprisingly consistent across platforms, and those patterns reward a specific type of content strategy.

Here is what we think every B2B marketing team should be thinking about:

Invest in definitive category content. If your company operates in a named category (or is creating one), the single highest-ROI content investment you can make is to publish the most comprehensive, well-structured explanation of that category on the web. This is not a 500-word blog post. It is a 2,000+ word piece that becomes the reference document AI assistants reach for when users ask about your space.

Structure everything for extraction. Every piece of content you publish should include at least one section that reads like a self-contained answer. Use clear headings, structured data, comparison tables, and summary sections. Think of each page as having both a human reading experience and an AI extraction layer.

Go deep rather than broad. Stop trying to rank for every keyword in your category. Instead, identify the ten to fifteen subtopics where you have genuine expertise and create content that goes deeper than anything else available. AI assistants are not fooled by thin content that covers a topic superficially. They reward depth.

Refresh rather than replace. Build your core content library once, then update it regularly. Add new data points, fresher examples, and updated comparisons. This is more effective than publishing new standalone posts on the same topics.

Make your ecosystem visible. Your integration pages, partner content, and technical documentation are AEO assets. Treat them with the same strategic care you give your blog content. Ensure they are crawlable, well-structured, and detailed.

What We Changed

We would be hypocrites if we shared all of this and did not tell you how it changed our own approach. So here is what we did at Tofu after running this analysis.

We rebuilt our content architecture around definitions. We identified every core concept in our space — AI-native marketing, generative marketing, campaign orchestration, content personalization at scale — and either created or rewrote our definitive page for each one. These are not blog posts anymore. They are pillar pages designed specifically to be the reference AI assistants pull from.

We added structured extraction layers to every page. Every piece of content we publish now includes summary sections, comparison tables where relevant, and clear H2/H3 hierarchies. We added Schema.org markup across the board. We started thinking of each page as having two audiences: humans who read it and AI systems that extract from it.

We cut publishing volume and increased depth. We went from publishing four to five posts per week to two to three, but each one is substantially more detailed and authoritative. Our average word count went up by about 60%, and every post now targets a specific citation-worthy subtopic rather than chasing keyword volume.

We made our integration content a first-class citizen. Our HubSpot integration page, our Salesforce documentation, our API docs — all of it got a structural overhaul. These pages now read like standalone guides rather than feature lists. And we have already seen them start appearing in AI-generated recommendations.

We started tracking AI citations as a core metric. Using Profound, we now monitor our citation share across ChatGPT, Claude, Perplexity, and Gemini the same way we used to track organic search rankings. The dashboard shows us which pages are being cited, for which queries, and on which platforms. It has become one of the most watched metrics on our marketing team.

The Bigger Picture

If there is one thing this analysis hammered home for us, it is this: AI citation is becoming a primary discovery channel for B2B buyers. Not a secondary one. Not a "nice to have." A primary one.

When a VP of Marketing asks ChatGPT "what is the best platform for personalizing ABM content at scale," the answer that model gives carries enormous weight. It is not just a search result that might get clicked. It is a direct recommendation delivered with the authority of the AI assistant itself. Being cited in that response is arguably more valuable than ranking first on Google for the equivalent search query.

The good news is that unlike traditional SEO, the playing field is more open. You do not need a massive backlink profile or decades of domain history. You need great content, structured well, going deep on topics you genuinely understand.

We are publishing the full research with detailed breakdowns, platform-by-platform analysis, and more granular findings on the Tofu blog. If you are serious about understanding AI citation dynamics, the full dataset is worth your time.

And if you are thinking about how to actually operationalize this — how to create structured, citation-optimized content at scale across dozens of accounts and campaigns — that is exactly the problem we built Tofu to solve. We would love to show you how it works.

