When RAG fails in production — and what to fix first
Common retrieval failure modes in enterprise settings: stale corpora, citation theater, chunking mismatches, and permission leaks — plus practical fixes.
April 12, 2026 · 9 min read

Retrieval-Augmented Generation is easy to demo and hard to operate. The failure isn’t always “bad embeddings.” Often it’s workflow: documents update constantly, ownership is messy, and users ask questions that don’t map cleanly to chunks.
Below are patterns we see repeatedly — generalized, not tied to any single client.
Stale corpora do more damage than bad models
If your source documents change weekly and your index updates monthly, you’ll get confident wrong answers. The model will sound authoritative because the citation exists — but the answer is outdated.
Fix retrieval freshness first: define what “authoritative” means per source, automate ingest, and surface “last updated” context to users when helpful.
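One way to make freshness checkable is to give each source an explicit lag budget and flag anything that has changed upstream but has waited in the queue longer than that budget. The sketch below is illustrative only: `SourceStatus`, `max_lag`, and `stale_sources` are hypothetical names, and the timestamps are assumed to come from your own ingest pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SourceStatus:
    name: str
    source_updated_at: datetime  # when the source document last changed
    indexed_at: datetime         # when the index last ingested this source
    max_lag: timedelta           # hypothetical per-source freshness budget


def stale_sources(sources, now):
    """Return names of sources that changed after indexing and have
    been waiting for re-ingest longer than their lag budget."""
    stale = []
    for s in sources:
        changed_since_index = s.source_updated_at > s.indexed_at
        waiting = now - s.source_updated_at
        if changed_since_index and waiting > s.max_lag:
            stale.append(s.name)
    return stale


if __name__ == "__main__":
    utc = timezone.utc
    now = datetime(2026, 4, 12, tzinfo=utc)
    sources = [
        # Changed 11 days ago, never re-indexed, budget is 7 days.
        SourceStatus("hr-policies", datetime(2026, 4, 1, tzinfo=utc),
                     datetime(2026, 3, 15, tzinfo=utc), timedelta(days=7)),
        # Changed 2 days ago, still within budget.
        SourceStatus("eng-runbooks", datetime(2026, 4, 10, tzinfo=utc),
                     datetime(2026, 4, 1, tzinfo=utc), timedelta(days=7)),
    ]
    print(stale_sources(sources, now))
```

A check like this can run on a schedule and feed an alert, and the same `source_updated_at` value is what you would surface to users as "last updated" context.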
Citation without verification is theater
Citations reduce anxiety, but they’re not proof. Users need confidence that the cited passage actually supports the claim — especially in regulated contexts.
Invest in evaluation: sample production questions, verify answer-to-source alignment, and track citation accuracy as a metric, not as a vibe.
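As a rough illustration of treating citation accuracy as a number, the sketch below aggregates reviewer judgments over a sample of production answers. It assumes a human (or a separate checking step) has already decided, per citation, whether the cited passage supports the claim; `ReviewedAnswer` and `citation_accuracy` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class ReviewedAnswer:
    question_id: str
    citations_checked: int    # citations a reviewer inspected
    citations_supported: int  # of those, how many support the claim


def citation_accuracy(samples):
    """Fraction of inspected citations that support their claim.
    Returns None when nothing was inspected, so an empty sample is
    not mistaken for a perfect or a zero score."""
    checked = sum(s.citations_checked for s in samples)
    if checked == 0:
        return None
    supported = sum(s.citations_supported for s in samples)
    return supported / checked


if __name__ == "__main__":
    sample = [
        ReviewedAnswer("q1", citations_checked=2, citations_supported=2),
        ReviewedAnswer("q2", citations_checked=3, citations_supported=1),
    ]
    print(citation_accuracy(sample))
```

Tracked per release or per week, a figure like this shows whether a change to ingest, chunking, or prompts moved answer-to-source alignment in the right direction.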
Chunking mismatches dominate “vector quality” debates
If chunk boundaries cut across semantic units, retrieval returns fragments that lack the context needed to answer. Tables, policies, and versioned PDFs are frequent offenders.
Sometimes the right fix is structure: section-aware chunking, metadata filters, hybrid search, or retrieving at the document level first and at the chunk level second.
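To make section-aware chunking concrete, here is a minimal sketch that splits text at heading lines rather than at a fixed character count, and keeps the heading as metadata on each chunk. It assumes headings can be recognized by a simple prefix (`"# "` here, purely as an example); real documents usually need a format-specific parser, and `chunk_by_section` is a hypothetical helper.

```python
def chunk_by_section(text, heading_prefix="# "):
    """Split text into one chunk per section, keeping the section
    heading as metadata. Text before the first heading gets
    section=None. Empty sections are skipped."""
    chunks = []
    heading, body = None, []

    def flush():
        content = "\n".join(body).strip()
        if content:
            chunks.append({"section": heading, "text": content})

    for line in text.splitlines():
        if line.startswith(heading_prefix):
            flush()  # close the previous section before starting a new one
            heading = line[len(heading_prefix):].strip()
            body = []
        else:
            body.append(line)
    flush()  # close the final section
    return chunks


if __name__ == "__main__":
    doc = (
        "# Leave policy\n"
        "Employees get 20 days.\n"
        "Carryover is capped.\n"
        "# Expenses\n"
        "Submit within 30 days.\n"
    )
    for chunk in chunk_by_section(doc):
        print(chunk)
```

Because each chunk carries its section name, the same field can double as a metadata filter at query time.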
Permissions are a retrieval problem
The risk isn’t only “wrong answer” — it’s “right-looking answer sourced from something the user shouldn’t see.” Enforcement must happen in the retrieval path, not as an afterthought in the prompt.
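A minimal sketch of enforcement in the retrieval path: chunks the user is not entitled to see are dropped before ranking, so they can never reach the prompt. The group-based access model, the `Chunk` fields, and `retrieve_for_user` are all assumptions made for illustration; your access control source of truth will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read the source document
    score: float               # similarity score from the retriever


def retrieve_for_user(candidates, user_groups, top_k=3):
    """Keep only chunks the user may read, then return the top_k by score."""
    user_groups = set(user_groups)
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    visible.sort(key=lambda c: c.score, reverse=True)
    return visible[:top_k]


if __name__ == "__main__":
    candidates = [
        Chunk("comp-bands", "Salary bands ...", frozenset({"hr"}), 0.95),
        Chunk("handbook", "Leave policy ...", frozenset({"all-staff"}), 0.80),
        Chunk("runbook", "Deploy steps ...", frozenset({"engineering"}), 0.60),
    ]
    for c in retrieve_for_user(candidates, {"engineering", "all-staff"}):
        print(c.doc_id, c.score)
```

In many setups the same condition would be passed to the search backend as a metadata filter, so restricted chunks never leave the store and the result list is not shortened by post-filtering. Whether that is possible depends on the store you use and is not shown here.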
What to fix first
If you’re triaging a struggling RAG system, work in this order: freshness, permissioning, chunking and search architecture, evaluation, and only then model swaps.
Swapping embeddings or models without fixing retrieval mechanics is expensive churn with capped upside.