
RAG vs fine-tuning for a customer support agent — when does each win?

llm · agents · rag · fine-tuning · answered 3d ago

Background

Building a Tier-1 support agent for a SaaS product. Knowledge base has ~3,000 docs (product guides, API reference, known issues). Docs update weekly.

What I know

  • Fine-tuning: bakes knowledge into weights, fast inference, expensive to re-train on updates
  • RAG: keeps knowledge external, easy to update, adds retrieval latency

The real question

Beyond the obvious trade-offs, are there concrete signals (doc update frequency, query distribution, required accuracy) I can use to pick one? Or should I be doing both (fine-tune on style/tone, RAG for facts)?

Asked by @dave-park

2 Answers


ACCEPTED

The framework I use:

Pick RAG when…

  • Knowledge changes more than monthly
  • You need attribution ("based on article X")
  • Domain has exact lookup queries (API reference, SKU lookups)
  • You want to add/remove knowledge without retraining

Pick fine-tuning when…

  • You need a specific style or persona that's hard to prompt
  • Your queries are highly templated and repetitive
  • Latency matters and you can afford the training cost
  • Knowledge is stable (legal boilerplate, brand voice)

Your specific case → RAG + light fine-tuning

With weekly updates and 3k docs, RAG is non-negotiable for the facts. But fine-tune on ~200 curated support conversations to get the right tone and de-escalation style. The two aren't mutually exclusive — RAG handles the what, fine-tuning handles the how.
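The split can be sketched in a few lines. This is a toy illustration, not a production retriever: `retrieve` and `build_prompt` are made-up names, and the keyword-overlap scoring stands in for a real embedding search. The point is the division of labour — retrieval supplies cited facts, and the (fine-tuned) model that consumes the prompt supplies tone.

```python
def retrieve(query: str, docs: dict[str, str], k: int = 3) -> list[str]:
    """Toy retriever: rank doc ids by keyword overlap with the query.
    A real system would use embeddings, but the interface is the same."""
    q_terms = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda doc_id: len(q_terms & set(docs[doc_id].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, docs: dict[str, str], doc_ids: list[str]) -> str:
    """Assemble a grounded prompt with citable source ids.
    The fine-tuned model handles the 'how' (tone, de-escalation)."""
    context = "\n\n".join(f"[{d}] {docs[d]}" for d in doc_ids)
    return (
        "Answer using only the sources below, citing them by id.\n\n"
        f"{context}\n\nQuestion: {query}"
    )
```

Because the knowledge lives in `docs`, your weekly doc updates only touch the index — the fine-tuned weights never need retraining for a content change.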

Practical tip: instrument your RAG retrieval and build a labelled eval set from real tickets before you touch fine-tuning. You'll almost certainly find the retrieval is your bottleneck, not the generation.
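The labelled eval loop above reduces to one metric to start with: recall@k over (query, gold doc id) pairs mined from real tickets. A minimal sketch, assuming your retriever returns a ranked list of doc ids:

```python
def recall_at_k(eval_set: list[tuple[str, str]], retrieve, k: int = 3) -> float:
    """Fraction of labelled queries whose gold doc id appears in the
    retriever's top-k results. eval_set is [(query, gold_doc_id), ...]."""
    hits = sum(gold in retrieve(query)[:k] for query, gold in eval_set)
    return hits / len(eval_set)
```

If recall@3 is low, no amount of generation-side work (prompting or fine-tuning) will fix the answers — which is why measuring this first usually reveals retrieval as the bottleneck.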

Answered by @alice-chen · 2d ago


Concrete signal I've found useful: query distribution analysis.

Before deciding, sample 500 real user queries and cluster them. If the top 20 clusters cover 80% of traffic and those clusters are mostly templated ("how do I reset my password", "what's the pricing for X"), fine-tuning wins — you can bake those answers in.

If the clusters are diverse and knowledge-intensive, RAG wins — you can't fine-tune your way to reliable factual recall.
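The coverage check described above is cheap to compute once you have cluster labels (from any clustering of query embeddings — k-means is a common choice). A sketch, assuming `labels` assigns each sampled query a cluster id:

```python
from collections import Counter


def top_cluster_coverage(labels: list[int], n: int = 20) -> float:
    """Share of sampled queries that fall in the n largest clusters.
    A high value (e.g. >= 0.8 for n=20) suggests templated traffic."""
    counts = Counter(labels)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(labels)
```

Run this on the 500-query sample: if the number comes back near 0.8 for n=20, you are in the templated regime where fine-tuning pays off; a flat, long-tailed distribution points to RAG.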

For a typical B2B SaaS support bot it's usually 60-40 RAG/fine-tuning territory.

Answered by @carol-johnson · 2d ago
