LLM analytics assistants and chatbots typically start with retrieval-augmented generation (RAG) and a database connection. That works until real users ask a mix of KPI questions, definition lookups, lineage questions, and repeated dashboard-style requests. If everything goes through one retrieval path to the data, you will see three predictable failure modes:
- Wrong answers: metrics computed at the wrong grain, incorrect joins, missing filters
- Slow answers: long prompts and repeated retries that inflate latency
- Higher cost: more tokens, more queries, more wasted warehouse scans
Analytics questions are not all alike. The backend best suited to one question (e.g., "What does 'active users' mean?") may not be best for another (e.g., "Which dashboards depend on the product type field?").
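To make that mismatch concrete, here is a minimal routing sketch in Python. It assumes a simple keyword heuristic standing in for a real intent classifier (an LLM call or a trained model in practice); the `Intent` names, cue lists, and handler backends are hypothetical placeholders, not any specific product's API.

```python
from enum import Enum

class Intent(Enum):
    METRIC = "metric"          # KPI questions -> text-to-SQL over the warehouse
    DEFINITION = "definition"  # "what does X mean?" -> glossary lookup
    LINEAGE = "lineage"        # dependency questions -> metadata catalog
    DASHBOARD = "dashboard"    # dashboard-style requests -> cached results

# Keyword heuristics stand in for an LLM or trained intent classifier.
# Order matters: the first matching rule wins.
RULES = [
    (Intent.DEFINITION, ("what does", "definition of", "mean")),
    (Intent.LINEAGE, ("depend on", "depends on", "lineage", "upstream", "downstream")),
    (Intent.DASHBOARD, ("dashboard", "report", "chart")),
]

def classify(question: str) -> Intent:
    q = question.lower()
    for intent, cues in RULES:
        if any(cue in q for cue in cues):
            return intent
    return Intent.METRIC  # default: treat as a KPI/metric query

def route(question: str) -> str:
    # Hypothetical backends; swap in your glossary, catalog, cache, and warehouse.
    handlers = {
        Intent.DEFINITION: lambda q: f"glossary lookup for: {q}",
        Intent.LINEAGE: lambda q: f"metadata catalog query for: {q}",
        Intent.DASHBOARD: lambda q: f"cached dashboard result for: {q}",
        Intent.METRIC: lambda q: f"text-to-SQL over warehouse for: {q}",
    }
    return handlers[classify(question)](question)

if __name__ == "__main__":
    print(route("What does 'active users' mean?"))            # -> glossary
    print(route("Which dashboards depend on product type?"))  # -> lineage
```

In production you would likely replace the keyword rules with an LLM classifier or embedding similarity, but the design point stands: each intent maps to a cheaper, more reliable backend than forcing every question through a single RAG-over-everything path.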