Analysis & Opinion

Why Your GEO Score Is Wrong (And What Block-Level Scoring Fixes)

Published: 26 March 2026 | Author: Cited By AI® | Reading time: 8 min
Originally published on LinkedIn Pulse. This is the canonical version.
Version 1.0 | Last verified: 26 March 2026 | Source: citedbyai.info AI Visibility Intelligence

Most GEO dashboards give you a number. Some give you a trend line. A few give you a sentiment breakdown. None of them tell you which paragraph got cited - or why the one next to it didn't.

That's the gap. And it's not a minor oversight in how AI visibility tools are built. It's a fundamental misunderstanding of how AI retrieval actually works.

The unit of AI retrieval is not your page

When a large language model generates an answer that cites your content, it doesn't read your page the way a human does. It doesn't absorb the narrative arc, pick up the thesis in paragraph one, and carry it through to your conclusion.

It retrieves chunks.

Specifically: RAG (Retrieval-Augmented Generation) systems - the architecture behind Perplexity, ChatGPT's web search mode, Gemini, and most AI search surfaces you care about - break web content into discrete text segments before they ever generate a response. Each segment gets embedded, scored for relevance, and either pulled into the answer or discarded.

The operative chunk length, based on how most production RAG pipelines are tuned and validated against real citation behaviour, sits in the 134–167 word range. That's one strong paragraph, three tight ones, or a complete standalone answer with a fact and a conclusion.

Your page might have twelve of these chunks. Four might be strong. Eight might be invisible to AI retrieval - not because your content is bad, but because those sections aren't structured to survive the chunking process.
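To make the chunking step concrete, here is a deliberately naive sketch in Python. Real pipelines split on document structure, overlap their windows, and embed each segment; the fixed 150-word window below is an illustrative assumption, not any vendor's actual splitter.

```python
def chunk_words(text: str, max_words: int = 150) -> list[str]:
    """Naive illustration: slice text into segments of up to ~150 words.
    Production RAG splitters are structure-aware and overlap chunks;
    this only shows that retrieval operates on segments, not pages."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

page = ("word " * 1200).strip()  # stand-in for a 1,200-word service page
chunks = chunk_words(page)
print(len(chunks))  # 8 retrieval units, each scored on its own merits
```

A 1,200-word page becomes eight retrieval units. Each one is embedded and scored independently, which is why a single strong thesis paragraph cannot carry the blocks around it.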

The core problem: A page with a Citation Probability Score® (CPS®) of 68 might contain three blocks scoring above 80 and five blocks scoring below 30. The high-scoring ones get cited. The low-scoring ones get ignored. Your dashboard shows you a 68 and tells you things are fine.
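To see how an aggregate hides that split, take hypothetical per-block scores matching the scenario above (three blocks above 80, five below 30). The exact blend a dashboard uses is unknown; it presumably mixes in page-level signals. But any single summary number flattens the distribution that retrieval actually acts on:

```python
# Hypothetical per-block CPS® values: three strong blocks, five weak ones.
block_scores = [84, 82, 81, 29, 27, 26, 24, 22]

mean = sum(block_scores) / len(block_scores)
cited = [s for s in block_scores if s >= 80]   # what retrieval pulls
ignored = [s for s in block_scores if s < 30]  # what retrieval discards

print(round(mean, 1))            # 46.9 — one number, no hint of the split
print(len(cited), len(ignored))  # 3 5
```

Retrieval behaves like a per-block selection, not an average: the three strong blocks get cited whether the page summary reads 47 or 68.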

What page-level scoring actually measures

To be fair to the tools that report page-level scores: they're measuring something real. Domain authority signals, structured data presence, crawlability, freshness - these matter. Getting the technical foundations right is a precondition for citation, not an afterthought.

But page-level scoring is a blunt instrument when the question you're trying to answer is: which parts of my content is AI actually using?

Here's a concrete example. Say you're a B2B SaaS company. You have a 1,200-word service page covering what you do, who you serve, your pricing model, and a case study. A user asks Perplexity: "Which B2B SaaS companies offer transparent usage-based pricing?"

Your page is crawlable. It has schema. It's indexed. Your GEO score is respectable. But the section about pricing is 90 words buried between a header and a CTA. It doesn't open with a direct answer. It references a table that's image-rendered and therefore invisible to the retrieval system. The fact density - named figures, specific percentages - is low.

Perplexity retrieves the chunk. It scores low for the query. Your competitor's shorter, denser, more direct answer gets pulled instead. Your GEO dashboard doesn't show you this. It shows you the page score. You don't know the gap is there.

What the five block-level pillars actually capture

The Citation Probability Score® evaluates each content block - each retrievable chunk - across five pillars.

These aren't abstract criteria. Each one maps to a specific, observable behaviour in how RAG pipelines score and select content. And each one only makes sense measured at the chunk level — because that's where the retrieval decision actually happens.

The specificity gap in practice

The Fact Density pillar is worth dwelling on because the gap between high-scoring and low-scoring content is most visible here.

Low CPS® (vague): "Pricing varies by usage and can be customised for enterprise clients." No verifiable signal. AI retrieval skips it.

High CPS® (specific): "Pricing starts at $0.008 per API call, with volume discounts applied above 500,000 requests monthly." Two verifiable signals. Same information. Cited.

Both may rank on Google. Only the second gets cited by ChatGPT. The difference isn't quality — it's specificity. Adding one named statistic per paragraph is the fastest Fact Density improvement available to most content teams.
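A rough version of that specificity check can be automated. The regex below is a hypothetical heuristic that counts prices, percentages, and bare figures; the real Fact Density pillar is not public, so treat this as an illustration of the idea only:

```python
import re

# Hypothetical heuristic: count concrete, verifiable signals in a block
# (currency amounts, percentages, named figures). Not the real CPS® formula.
SIGNAL = re.compile(r"[$£€]\s?\d[\d,.]*|\d[\d,.]*\s?%|\b\d[\d,.]*\b")

def fact_signals(block: str) -> int:
    return len(SIGNAL.findall(block))

vague = "Pricing varies by usage and can be customised for enterprise clients."
dense = ("Pricing starts at $0.008 per API call, with volume discounts "
         "applied above 500,000 requests monthly.")

print(fact_signals(vague))  # 0
print(fact_signals(dense))  # 2
```

Counting named figures per paragraph gives a content team a sortable backlog in seconds, without any model calls.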

The practical consequence

If you're using a GEO tool that scores your pages, you're optimising for the wrong unit. You might be improving overall page authority while the specific blocks matched to high-value queries stay broken.

This creates a pattern that's hard to diagnose without block-level data: your AI visibility metrics look reasonable, but your citation rate in competitive queries stays flat. You're not missing because you're invisible. You're missing because the wrong paragraphs are doing the work.

The fix isn't a complete content rewrite. Often the lift comes from restructuring three or four underperforming blocks per page: leading with a direct answer, tightening the word count into the optimal range, adding one specific statistic, cutting the cross-references that make the block context-dependent. Those changes don't move your page-level GEO score much. They move your actual citation rate significantly.

Why this matters more as AI search matures

Right now, most brands are optimising for presence — appearing in AI-generated responses at all. That's the right first step. But the category is maturing fast.

Perplexity already shows named citations with source attribution. ChatGPT's web search mode selects sources at the passage level. Google's AI Mode pulls specific excerpts, not whole pages. The precision of retrieval is increasing, which means the margin between a block that gets cited and one that doesn't is narrowing.

Page-level optimisation gets you into the game. Block-level optimisation determines whether you win the specific query that matters — the one a buyer is asking at the moment they're deciding between you and a competitor.

The brands building block-level granularity into their content now are the ones who'll own specific query clusters in AI responses twelve months from now. The ones relying on page-level scores will know their overall GEO health and not much else.

What to do with this

You don't need to rebuild your entire content library. Start with your highest-value pages — the ones that should be cited when someone asks a purchase-intent query in your category. Run each one through a block-level audit. Find the chunks that are underperforming. Fix the structure, density, and self-containment issues in those specific blocks.

Then measure citation rate on those queries, not page authority.

That's the feedback loop that actually tells you whether your GEO work is doing anything. Not a dashboard number. Not a trend line. A specific query, a specific block, a specific citation.

The test: Pick your three most important purchase-intent queries. Ask Perplexity each one. Note which paragraph on your site it cites — if it cites you at all. That paragraph is your highest-performing block. The ones it skips are your audit backlog.
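The audit loop described above can be scripted as a first pass. The 134–167 word window comes from earlier in the article; the single-digit check for a named statistic, and everything else here, is a simplifying assumption rather than the real CPS® evaluation:

```python
import re

NUMBER = re.compile(r"\d")

def audit_blocks(blocks: list[str],
                 lo: int = 134, hi: int = 167) -> list[dict]:
    """Flag blocks that miss the word-count window from the article
    or carry no named figure. Rough triage, not a real CPS® score."""
    report = []
    for i, block in enumerate(blocks):
        words = len(block.split())
        report.append({
            "block": i,
            "words": words,
            "in_window": lo <= words <= hi,
            "has_statistic": bool(NUMBER.search(block)),
        })
    return report

blocks = [
    "Pricing starts at $0.008 per API call. " * 4,   # short but specific
    "We help teams move faster and do more. " * 20,  # on-length but vague
]
for row in audit_blocks(blocks):
    print(row)
```

Blocks that fail either check are candidates for the restructuring described above: lead with the answer, bring the length into range, add one statistic.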

Get a block-level CPS® audit

Free instant check at citedbyai.info. Full audits from £49.

Get Your Free Audit →