RAG vs fine-tuning for a customer support agent — when does each win?
Background
Building a Tier-1 support agent for a SaaS product. Knowledge base has ~3,000 docs (product guides, API reference, known issues). Docs update weekly.
What I know
- Fine-tuning: bakes knowledge into weights, fast inference, expensive to re-train on updates
- RAG: keeps knowledge external, easy to update, adds retrieval latency
The real question
Beyond the obvious trade-offs, are there concrete signals (doc update frequency, query distribution, required accuracy) I can use to pick one? Or should I be doing both (fine-tune on style/tone, RAG for facts)?
Asked by @dave-park
2 Answers
✓ ACCEPTED
The framework I use:
Pick RAG when…
- Knowledge changes more often than monthly
- You need attribution ("based on article X")
- Domain has exact lookup queries (API reference, SKU lookups)
- You want to add/remove knowledge without retraining
Pick fine-tuning when…
- You need a specific style or persona that's hard to prompt
- Your queries are highly templated and repetitive
- Latency matters and you can afford the training cost
- Knowledge is stable (legal boilerplate, brand voice)
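If it helps to make the two checklists mechanical, here is a minimal sketch of them as a function. The `Signals` fields and the `recommend` helper are hypothetical names invented for illustration, and treating any single signal as sufficient is my simplification, not a rule. The "add/remove knowledge without retraining" and "knowledge is stable" bullets are folded into the update-frequency field.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical inputs; field names are illustrative, not from any library."""
    updates_per_month: float   # how often the knowledge base changes
    needs_attribution: bool    # must answers cite a source article?
    has_exact_lookups: bool    # API reference or SKU-style queries
    needs_custom_style: bool   # persona or tone that is hard to prompt
    templated_queries: bool    # traffic is highly repetitive
    latency_sensitive: bool    # latency matters and training cost is affordable


def recommend(s: Signals) -> str:
    """Return 'rag', 'fine-tuning', 'both', or 'either' from the checklists."""
    # Assumption: any one signal on a side is enough to count for that side.
    rag = (
        s.updates_per_month > 1  # changes more often than monthly
        or s.needs_attribution
        or s.has_exact_lookups
    )
    fine_tune = s.needs_custom_style or s.templated_queries or s.latency_sensitive
    if rag and fine_tune:
        return "both"
    if rag:
        return "rag"
    if fine_tune:
        return "fine-tuning"
    return "either"


# The question's case: weekly updates (about 4 per month), API reference
# lookups, and a support tone worth tuning for.
case = Signals(4, needs_attribution=False, has_exact_lookups=True,
               needs_custom_style=True, templated_queries=False,
               latency_sensitive=False)
print(recommend(case))  # prints: both
```

In practice you would weight the signals rather than OR them together, but even this crude version forces you to write down which ones actually apply.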
Your specific case → RAG + light fine-tuning
With weekly updates and 3k docs, RAG is non-negotiable for the facts. But fine-tune on ~200 curated support conversations to get the right tone and de-escalation style. The two aren't mutually exclusive — RAG handles the what, fine-tuning handles the how.
Practical tip: instrument your RAG retrieval and build a labelled eval set from real tickets before you touch fine-tuning. You'll almost certainly find the retrieval is your bottleneck, not the generation.
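A rough sketch of what that instrumentation could look like, assuming each labelled ticket maps a query to the set of doc IDs that should have been retrieved. The `retrieve` callable, the doc IDs, and the choice of recall@k and mean reciprocal rank as metrics are my assumptions; swap in whatever retriever and metrics you actually use.

```python
from typing import Callable


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant docs that appear in the top-k retrieved docs."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant doc, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(eval_set: list[tuple[str, set[str]]],
             retrieve: Callable[[str], list[str]],
             k: int = 5) -> dict[str, float]:
    """Average recall@k and reciprocal rank over a labelled eval set."""
    recalls, ranks = [], []
    for query, relevant in eval_set:
        retrieved = retrieve(query)
        recalls.append(recall_at_k(retrieved, relevant, k))
        ranks.append(reciprocal_rank(retrieved, relevant))
    n = len(eval_set)
    return {"recall@k": sum(recalls) / n, "mrr": sum(ranks) / n}


# Toy stand-in for a real retriever: canned results keyed by query.
canned = {
    "reset password": ["kb-101", "kb-200"],
    "api rate limit": ["kb-200", "kb-300"],
    "export csv": ["kb-101", "kb-200"],
}
eval_set = [
    ("reset password", {"kb-101"}),
    ("api rate limit", {"kb-300"}),
    ("export csv", {"kb-400"}),
]
result = evaluate(eval_set, lambda q: canned[q], k=5)
print(result)
```

If recall@k is low on real tickets, no amount of fine-tuning on the generation side will fix the answers, because the right document never reaches the model.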
Answered by @alice-chen · 2d ago
Concrete signal I've found useful: query distribution analysis.
Before deciding, sample 500 real user queries and cluster them. If the top 20 clusters cover 80% of traffic and those clusters are mostly templated ("how do I reset my password", "what's the pricing for X"), fine-tuning wins — you can bake those answers in.
If the clusters are diverse and knowledge-intensive, RAG wins — you can't fine-tune your way to reliable factual recall.
For a typical B2B SaaS support bot, the split usually comes out around 60/40 in favour of RAG.
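Once the sampled queries have cluster labels, the coverage check is a few lines. This is a minimal sketch that assumes the clustering is already done by whatever method you prefer; the function name and the label strings are made up for illustration.

```python
from collections import Counter


def top_cluster_coverage(cluster_labels: list[str], top_n: int = 20) -> float:
    """Share of queries that fall into the top_n largest clusters."""
    if not cluster_labels:
        return 0.0
    counts = Counter(cluster_labels)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(cluster_labels)


# Toy sample: one cluster label per query (hypothetical labels).
labels = (["password_reset"] * 50 + ["pricing"] * 30
          + ["api_error"] * 15 + ["other"] * 5)
coverage = top_cluster_coverage(labels, top_n=2)
print(f"top-2 clusters cover {coverage:.0%} of queries")  # prints: top-2 clusters cover 80% of queries
```

With 500 real queries you would call it with `top_n=20` and compare the result against the 80% mark, then read through the top clusters by hand to judge whether they are actually templated.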
Answered by @carol-johnson · 2d ago