Retail · enterprise

Sales-quote assistant with RAG over internal catalog

How we cut quote-generation time for an enterprise retailer without touching their ERP.

Size: Enterprise · 1000+ employees
Stack: Llama 3.1 70B · pgvector · LangGraph
Published: April 12, 2026
Client: Technical demo · implementation example

The problem

The sales team took hours to build quotes for B2B clients. Catalog data lived in a legacy ERP, pricing in a separate module, and per-client discounts in local spreadsheets. Each quote required opening three different systems and manually verifying availability.

“What killed us wasn’t building the quote — it was the back-and-forth on stock and pricing.” — Sales lead, first technical session

Why the previous attempt failed

A prior pilot tried a chatbot wired to SAP via API. It failed for two reasons: answers were slow because every query hit the ERP directly, and the model could not reason about contextual discounts without access to order history.

What we built

An agent with RAG over three sources:

  1. Product catalog vectorized with pgvector and incremental ingestion every 15 minutes.
  2. Order history indexed per client to infer discounts and recurring volumes.
  3. Pricing policy authored in MDX and embedded with a dedicated loader.
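
As a rough illustration of the incremental ingestion in item 1, the sketch below selects only the catalog rows that changed since the previous sync, so that each 15-minute run re-embeds a small delta instead of the whole catalog. The `CatalogRow` type, the `updated_at` column, and both function names are hypothetical; the source does not describe how changes are actually detected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class CatalogRow:
    # Hypothetical shape of a catalog record; the real ERP schema is not given.
    sku: str
    description: str
    updated_at: datetime


def rows_to_reembed(rows: Iterable[CatalogRow], last_sync: datetime) -> List[CatalogRow]:
    """Return the rows modified after the last sync, oldest first."""
    changed = [row for row in rows if row.updated_at > last_sync]
    return sorted(changed, key=lambda row: row.updated_at)


def next_watermark(changed: List[CatalogRow], last_sync: datetime) -> datetime:
    """Advance the watermark to the newest change seen, or keep it if nothing changed."""
    return max((row.updated_at for row in changed), default=last_sync)


if __name__ == "__main__":
    last_sync = datetime(2026, 4, 12, 9, 0, tzinfo=timezone.utc)
    catalog = [
        CatalogRow("SKU-1", "unchanged item", last_sync - timedelta(minutes=20)),
        CatalogRow("SKU-2", "edited item", last_sync + timedelta(minutes=5)),
    ]
    delta = rows_to_reembed(catalog, last_sync)
    print([row.sku for row in delta], next_watermark(delta, last_sync))
```

Only the rows in the delta would then be embedded and upserted into pgvector; storing the watermark after a successful run keeps a failed run from silently skipping changes.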

The agent drafts a quote in seconds. A human always validates it before it is sent: human-in-the-loop review is mandatory on critical transactions, not optional.
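
One way to make that validation step non-optional is to enforce it in the quote's state rather than in the UI. The sketch below is a minimal example under that assumption; `QuoteDraft`, `Status`, `approve`, and `send` are hypothetical names, not part of LangGraph or of the system described here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    SENT = "sent"


@dataclass
class QuoteDraft:
    # Hypothetical minimal quote record produced by the agent.
    client_id: str
    total: float
    status: Status = Status.DRAFT
    approved_by: Optional[str] = None


def approve(draft: QuoteDraft, reviewer: str) -> QuoteDraft:
    """Record the human reviewer who validated the draft."""
    if not reviewer:
        raise ValueError("an approval needs a named reviewer")
    draft.status = Status.APPROVED
    draft.approved_by = reviewer
    return draft


def send(draft: QuoteDraft) -> QuoteDraft:
    """Refuse to send anything that a human has not approved."""
    if draft.status is not Status.APPROVED:
        raise PermissionError("quote must be approved by a human before sending")
    draft.status = Status.SENT
    return draft
```

Calling `send` on a fresh draft raises `PermissionError`; only the sequence `approve(draft, "rep-17")` followed by `send(draft)` succeeds, so the agent has no code path that bypasses the reviewer.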

Stack

  • Model: Llama 3.1 70B on on-prem GPU
  • Vector DB: Postgres + pgvector (same cluster as the ERP)
  • Orchestration: LangGraph with guardrails on numeric figures
  • Observability: OpenTelemetry → Grafana
  • Privacy: data never leaves the client’s datacenter
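
The guardrails on figures in the orchestration layer could look something like the check below: every number the model writes into a quote line is compared against the retrieved source data before the draft reaches a reviewer. This is a sketch under assumptions; the `QuoteLine` fields, the `check_figures` function, and the three rules shown are illustrative and are not taken from the actual implementation.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List


@dataclass(frozen=True)
class QuoteLine:
    # Hypothetical structure of one line in a drafted quote.
    sku: str
    unit_price: Decimal
    discount_pct: Decimal
    quantity: int


def check_figures(
    line: QuoteLine,
    catalog_prices: Dict[str, Decimal],
    max_discount_pct: Decimal,
    stock: Dict[str, int],
) -> List[str]:
    """Return a list of problems; an empty list means the figures pass."""
    problems: List[str] = []
    expected = catalog_prices.get(line.sku)
    if expected is None:
        problems.append(f"{line.sku}: SKU not found in catalog")
    elif line.unit_price != expected:
        problems.append(f"{line.sku}: price {line.unit_price} != catalog {expected}")
    if line.discount_pct > max_discount_pct:
        problems.append(f"{line.sku}: discount {line.discount_pct}% exceeds {max_discount_pct}%")
    if line.quantity > stock.get(line.sku, 0):
        problems.append(f"{line.sku}: quantity {line.quantity} exceeds available stock")
    return problems
```

Using `Decimal` instead of `float` keeps price comparisons exact, which matters when the check is an equality test against the catalog.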

How we shipped it

  • Weeks 1-2: ingestion and quality QA against 500 historical quotes.
  • Weeks 3-4: pilot with 5 reps; prompt and confidence-threshold tuning.
  • Weeks 5-8: gradual rollout to the full sales force (~80 people) with adoption tracking.
  • Week 9+: operation under a 99.5% SLA with a bi-weekly release cadence.
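
To put the 99.5% figure in concrete terms, the snippet below converts an SLA percentage into a downtime budget. It assumes the SLA refers to availability and is measured over a 30-day window; neither detail is stated in the source, and `downtime_budget_hours` is a hypothetical helper.

```python
from decimal import Decimal


def downtime_budget_hours(sla_pct: str, window_days: int = 30) -> Decimal:
    """Hours of allowed downtime in the window for a given availability SLA."""
    unavailable_fraction = (Decimal(100) - Decimal(sla_pct)) / Decimal(100)
    return unavailable_fraction * Decimal(window_days) * Decimal(24)


if __name__ == "__main__":
    # Assumed 30-day measurement window.
    print(downtime_budget_hours("99.5"))
```

Under those assumptions, 99.5% leaves 3.6 hours of downtime per 30 days, which is the budget a bi-weekly release cadence has to fit inside.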

Outcomes

Illustrative figures based on comparable projects: quote time from hours to ~4 minutes, 97% accuracy versus the human benchmark, and 80% adoption across the sales force by the end of rollout.

Technical demo · capability showcase, not an NDA project