

πŸ€– RAG Chatbot System

Production document Q&A using retrieval-augmented generation

πŸš€ Live Demo | GitHub Repository

πŸ“ Overview

A production-ready RAG (Retrieval-Augmented Generation) system that enables natural language Q&A over any PDF document. Upload a document, ask questions, and get answers grounded in the source content rather than hallucinated.

Key Innovation: Combines semantic search with LLM generation to provide context-aware, verifiable answers.


🎯 Key Features

βœ… Document Processing Pipeline

βœ… Semantic Search

βœ… LLM Integration

βœ… Production Features


πŸ—οΈ Architecture

User Query
    ↓
[1] Embedding Generation (Sentence-Transformers)
    ↓
[2] Vector Search (FAISS - finds 3 similar chunks)
    ↓
[3] Context Assembly (combines chunks)
    ↓
[4] Prompt Template (injects context + question)
    ↓
[5] LLM Generation (FLAN-T5)
    ↓
Answer (grounded in document)
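The five stages above can be sketched end to end in plain Python, with toy stand-ins for the embedding model and the LLM (the real system uses Sentence-Transformers, FAISS, and FLAN-T5; the vocabulary, chunks, and query below are made up for illustration):

```python
import math

# [1] Toy embedding: bag-of-words over a tiny fixed vocabulary.
# The real system uses a 384-dim Sentence-Transformers model.
VOCAB = ["refund", "policy", "days", "shipping", "warranty"]

def embed(text):
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

# [2] Vector search: rank chunks by cosine similarity, keep the top 3.
chunks = [
    "A refund is issued within 30 days of purchase.",
    "Shipping takes 5-7 business days.",
    "The warranty covers manufacturing defects.",
    "Our office is closed on public holidays.",
]
query = "What is the refund policy?"
q_vec = embed(query)
top = sorted(chunks, key=lambda c: cosine(embed(c), q_vec), reverse=True)[:3]

# [3] Context assembly: concatenate the retrieved chunks.
context = "\n".join(top)

# [4] Prompt template: inject context + question.
prompt = f"Use ONLY the context below to answer.\n\nContext: {context}\nQuestion: {query}\nAnswer:"

# [5] LLM generation would run here (FLAN-T5 in the real system).
print(top[0])
```

The refund chunk ranks first because it is the only one sharing vocabulary with the query; everything downstream of retrieval is string assembly.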

πŸ’» Technical Implementation

Document Processing

```python
# Recursive text splitting preserves natural boundaries
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,        # characters per chunk (~100 words)
    chunk_overlap=50,      # repeated characters that preserve context across chunks
    separators=["\n\n", "\n", " ", ""],  # try paragraphs → lines → words → characters
)
```
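A simplified sliding-window version of this chunking makes the overlap arithmetic concrete (the real splitter additionally recurses through the separator list to break at natural boundaries; this sketch only shows the size/overlap behavior):

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Fixed-size chunks where each chunk begins with the last
    `chunk_overlap` characters of the previous one."""
    stride = chunk_size - chunk_overlap  # advance 450 chars per chunk
    return [
        text[i:i + chunk_size]
        for i in range(0, max(len(text) - chunk_overlap, 1), stride)
    ]

doc = "x" * 1200
chunks = sliding_chunks(doc)
print(len(chunks))     # 3 chunks for a 1200-char document
print(len(chunks[0]))  # 500
```

The 50-character overlap means a sentence cut at one chunk boundary still appears intact at the start of the next chunk, so retrieval never loses it entirely.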

Vector Database

```python
# FAISS vector store for fast semantic search
from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores.utils import DistanceStrategy

db = FAISS.from_documents(
    chunks,
    embeddings,  # 384-dim sentence-transformers embeddings
    distance_strategy=DistanceStrategy.COSINE,  # similarity metric
)
```
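Cosine similarity reduces to a plain dot product once vectors are L2-normalized, which is why a common FAISS setup stores normalized embeddings in an inner-product index. A tiny sketch of that equivalence:

```python
import math

def normalize(v):
    """Scale a vector to unit length (L2 norm = 1)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

# For unit vectors, cosine(a, b) == dot(a, b)
cos = dot(a, b)
print(round(cos, 4))  # 0.96
```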

LLM Prompt Engineering

```python
template = """Use ONLY the context below to answer.
If you don't know, say you don't know - don't make up answers.

Context: {context}
Question: {question}
Answer:"""
```
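Filling this template is plain string formatting; a quick sketch with hypothetical retrieved chunks:

```python
template = """Use ONLY the context below to answer.
If you don't know, say you don't know - don't make up answers.

Context: {context}
Question: {question}
Answer:"""

# Hypothetical chunks returned by the vector search step.
retrieved = [
    "The warranty period is 24 months.",
    "Claims require the original receipt.",
]

prompt = template.format(
    context="\n".join(retrieved),
    question="How long is the warranty?",
)
print(prompt.splitlines()[0])  # Use ONLY the context below to answer.
```

The "say you don't know" instruction is what keeps the model from inventing answers when the retrieved context is off-topic.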

πŸ› οΈ Tech Stack


πŸ“Š Performance Metrics


πŸŽ“ Key Learnings

1. Chunking Strategy Matters

2. Prompt Engineering is Critical

3. Model Caching = Production Essential

4. LCEL > Old LangChain Patterns
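Learning 3 above comes down to loading the embedding model and LLM once per process rather than once per request. A web app would typically use its framework's resource cache for this (e.g. Streamlit's `st.cache_resource`, an assumption about this deployment); the same idea in stdlib form:

```python
from functools import lru_cache

load_count = 0

@lru_cache(maxsize=None)  # cache the loaded model for the process lifetime
def get_model(name):
    global load_count
    load_count += 1        # stands in for an expensive model download/load
    return f"<model {name}>"

get_model("flan-t5")
get_model("flan-t5")       # served from cache, no second load
print(load_count)          # 1
```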

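Learning 4 refers to LCEL's pipe syntax (`retriever | prompt | llm`) replacing older chain classes such as `RetrievalQA`. The composition mechanism is just operator overloading; a minimal stand-alone illustration of the idea (not the LangChain implementation, and the lambdas are toy stand-ins):

```python
class Runnable:
    """Tiny stand-in for LCEL's composable units."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # `a | b` returns a new Runnable that applies a, then b
        return Runnable(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

retrieve = Runnable(lambda q: {"context": "FLAN-T5 is a seq2seq model.", "question": q})
prompt = Runnable(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}\nAnswer:")
llm = Runnable(lambda p: "stub answer")  # stand-in for the real model call

chain = retrieve | prompt | llm
print(chain.invoke("What is FLAN-T5?"))  # stub answer
```

Because every stage shares one interface, swapping the retriever or the model is a one-line change, which is the practical advantage over the old monolithic chain classes.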

πŸš€ Future Enhancements


πŸ“Έ Screenshots

Document Upload Interface (screenshot)

Query & Answer (screenshot)


