Enterprise RAG — Reference Architectures

Full production RAG, wired end-to-end — across verticals

Most RAG tutorials show you three boxes and call it a day. These pages show the full pipeline — user to response, auth to audit log, embedding to reranker — wired for a specific industry, with the domain-specific gotchas most candidates only learn by shipping to production. Start with the Anatomy page to understand the generic pattern, then drill into a vertical.

🛤️ RAG Learning Path: read in order to build a production RAG system
Not building a RAG system? The Model Committee deep-dive is a parallel track covering the eight specialized model families and routing patterns — read it after Foundations instead of RAG Anatomy if model composition is what you're after.

🧭Start here — the generic anatomy

Before the vertical case studies, there's one page that explains how the generic production RAG pipeline works — with two diagrams (architectural and sequence), a 15-step walkthrough, the naive-vs-agentic delta, and a misconceptions FAQ. Read this first and the verticals will read like specializations of the same pattern.

🧭
Required reading

Anatomy: When you press Enter, who decides what gets retrieved? →

A patient, rigorous walkthrough of the full production pipeline. Two diagrams (architectural + sequence), 15 numbered steps, agentic RAG delta, and a misconceptions FAQ. This is the page I wish existed when I was learning — and it's the page every vertical below is built on.

Architecture + Sequence · 15 steps · Naive vs Agentic · 10 misconceptions cleared

🏗️Vertical case studies

Each vertical reuses the same generic pipeline from the Anatomy page, but specialized for the data sources, compliance requirements, and failure modes of a specific industry. New verticals are added as new domains become relevant to the work.

👥
Live

HR Knowledge Base

Employees asking policy questions (parental leave, PTO, benefits, compliance). The most universal enterprise RAG use case.

Distinguishing wrinkle
Access control. Every retrieved chunk must be filtered by the asker’s jurisdiction, role, and reporting chain — before the vector search runs, not after.
Read the architecture →
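A minimal sketch of that pre-retrieval filter. The field names (`jurisdiction`, `allowed_roles`) and the `"GLOBAL"` sentinel are assumptions for illustration; the point is that the ACL check runs over chunk metadata before any similarity scoring, so restricted text never enters the candidate set.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    jurisdiction: str          # e.g. "US-CA", or "GLOBAL" for everyone
    allowed_roles: frozenset   # roles permitted to see this chunk

def acl_prefilter(chunks: list, jurisdiction: str, role: str) -> list:
    """Drop chunks the asker may never see -- BEFORE vector search runs."""
    return [
        c for c in chunks
        if c.jurisdiction in (jurisdiction, "GLOBAL")
        and role in c.allowed_roles
    ]

corpus = [
    Chunk("CA parental leave: up to 8 weeks paid.", "US-CA",
          frozenset({"employee", "manager"})),
    Chunk("Executive compensation bands.", "GLOBAL",
          frozenset({"hr_admin"})),
    Chunk("NY PTO accrual schedule.", "US-NY",
          frozenset({"employee", "manager"})),
]

# A California employee sees only the CA chunk; similarity search then
# runs over this filtered set, never over the full corpus.
visible = acl_prefilter(corpus, jurisdiction="US-CA", role="employee")
```

Filtering after retrieval is the classic mistake: a top-k search over the full corpus can silently return zero visible chunks, and worse, restricted text has already touched the ranking path.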
💳
Coming soon

Customer Transactions

Customers asking about their own orders, refunds, subscriptions, and account history. A structured + unstructured hybrid.

Distinguishing wrinkle
Freshness. A customer asking ‘did my refund process?’ needs the live transaction DB (via text-to-SQL), not embeddings of yesterday’s snapshot. A router agent picks structured vs unstructured retrieval per query.
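A toy version of that router, with keyword heuristics standing in for what would more plausibly be an LLM classifier in production (the keyword list is an assumption):

```python
import re

# Queries about account state must hit the live transaction DB (text-to-SQL);
# general policy/help questions can go to the embedding index.
TRANSACTIONAL = re.compile(
    r"\b(refund|order|charge|subscription|invoice|payment)\b",
    re.IGNORECASE,
)

def route(query: str) -> str:
    """Return 'sql' for structured lookups, 'vector' for unstructured RAG."""
    return "sql" if TRANSACTIONAL.search(query) else "vector"
```

So `route("Did my refund process?")` sends the query to text-to-SQL, while `route("How do returns work?")` sends it to the embedding index. The design point survives even when the classifier gets smarter: freshness-sensitive questions must never be answered from a stale vector snapshot.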
🎓
Coming soon

Educational Content

Students and teachers querying textbook content, lessons, and learning materials across grade levels.

Distinguishing wrinkle
Pedagogical scaffolding. A 6th grader and a grad student asking ‘what is photosynthesis?’ need different retrieval scopes and different prompt personas.
🏥
Coming soon

Healthcare

Clinicians and patients querying medical records, drug interactions, and clinical guidelines.

Distinguishing wrinkle
HIPAA compliance and clinical accuracy. A wrong answer isn’t just bad UX; it’s a patient safety issue. Requires citation enforcement and refusal on uncertain ground.
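A sketch of what that enforcement gate can look like. The score threshold, the `score` field, and the inline `[n]` citation format are all assumptions, not a fixed recipe:

```python
import re

REFUSAL = ("I can't answer that confidently from the available clinical "
           "sources. Please consult a clinician or pharmacist.")

def enforce(draft: str, sources: list, min_score: float = 0.75) -> str:
    """Refuse on weak retrieval; refuse drafts with no inline citations."""
    if not sources or max(s["score"] for s in sources) < min_score:
        return REFUSAL                      # uncertain ground: do not answer
    if not re.search(r"\[\d+\]", draft):
        return REFUSAL                      # uncited claims are not allowed
    return draft
```

So a well-grounded draft like `"Warfarin plus NSAIDs raises bleeding risk [1]."` with a high-scoring source passes through unchanged, while an empty or low-scoring retrieval set, or an uncited draft, yields a refusal instead of a guess.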
🛡️
Coming soon

Insurance

Agents and claimants querying policy terms, claim histories, and coverage rules.

Distinguishing wrinkle
Jurisdictional regulation. Every answer must be filtered by the relevant state’s insurance code, and the system must never offer coverage advice that hasn’t been verified against the actual policy.
💹
Coming soon

Financial Services

Advisors and clients querying research reports, account positions, and regulatory filings.

Distinguishing wrinkle
SEC/FINRA compliance plus adversarial queries. The system must refuse to give investment advice and flag anything that resembles market manipulation.

🤔Why build an ‘Enterprise RAG’ section at all?

Most AI/ML portfolios show you one of two things: a single toy RAG notebook (“LangChain + Chroma + OpenAI, works on a 10-document corpus”) or a thin buzzword list (“experienced with vector databases”). Neither tells a senior hiring manager whether the candidate can actually wire this up for a real enterprise.

💡What senior engineers actually care about in RAG systems
The questions that matter in a production RAG system aren't “what is an embedding?” They're: who owns the metadata filter? How do you handle permission scoping across tenants? What happens when the reranker times out? How do you evaluate retrieval precision against a golden set? How do you roll out a new embedding model without re-embedding everything at once? This hub is where I walk through those answers — in architectural detail, for real verticals.

Each vertical page walks through the full pipeline for a specific industry with the gotchas that matter for that industry. It's the artifact every senior RAG engineer should have — a concrete, architecture-level answer to “can this be wired up for our domain, and what does it take?”