Open to Remote Opportunities

ML Engineer
Building AI that ships.

3 years at PwC building production AI/ML systems for Fortune 500 clients. Specializing in RAG pipelines, multi-agent systems, and LLM applications — from architecture to HuggingFace deployment.

3+ Years at PwC
4 Live ML Apps
50TB+ Data Processed
3 Fortune 500 Clients

Live ML Applications

● Live RAG · LLM
RAG Agent System — Production Document AI

End-to-end multi-PDF RAG pipeline with FAISS vector search across 10K+ chunks. Features a 3-tier LLM fallback chain (Groq → OpenAI → FLAN-T5), source attribution, chat history, and local hosting with no API key required. 10× faster than baseline.

10K+ chunks indexed
~1s response time
10× faster than baseline
LangChain · LCEL · LangGraph · FAISS · Groq API · Sentence-Transformers · Gradio · FastAPI
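A minimal sketch of how a 3-tier fallback chain like the one described above could be wired. It is an illustration under stated assumptions, not the app's actual code: `generate_with_fallback` and the three `*_stub` functions are hypothetical names, and the stubs stand in for real Groq, OpenAI, and local FLAN-T5 calls.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is any callable that maps a prompt to a completion string.
Provider = Callable[[str], str]


def generate_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, answer) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for the three tiers (assumptions for illustration only).
def groq_stub(prompt: str) -> str:
    raise TimeoutError("rate limited")


def openai_stub(prompt: str) -> str:
    raise ConnectionError("no API key configured")


def flan_t5_stub(prompt: str) -> str:
    # A local model needs no API key, so it serves as the last resort.
    return f"[local answer] {prompt}"


CHAIN = [("groq", groq_stub), ("openai", openai_stub), ("flan-t5", flan_t5_stub)]

if __name__ == "__main__":
    print(generate_with_fallback("What does section 2 say?", CHAIN))
```

Ordering the chain from fastest hosted model to local model means the user still gets an answer when no key is configured, at the cost of answer quality in the last tier.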
● Live Analytics
LLM Comparison Dashboard

Benchmarked 3 GPT-2 variants across 5 NLP metrics with 40% faster inference via ThreadPoolExecutor parallelism. DistilBERT achieves 89% accuracy. All 100+ benchmark sessions tracked persistently in SQLite with interactive Plotly visualizations.

40% faster inference
89% DistilBERT accuracy
100+ sessions tracked
Streamlit · HuggingFace Transformers · GPT-2 · DistilBERT · Plotly · SQLite
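The parallel benchmarking and SQLite tracking described above might look roughly like the sketch below. Everything named here is an assumption for illustration: `benchmark_models`, `save_session`, the `runs` table, and the `FAKE_MODELS` stand-ins (which replace real model inference) are hypothetical.

```python
import sqlite3
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, float, str]  # (model name, seconds taken, output text)


def benchmark_models(models: Dict[str, Callable[[str], str]], prompt: str) -> List[Row]:
    """Run every model on the same prompt in parallel threads."""

    def run_one(item: Tuple[str, Callable[[str], str]]) -> Row:
        name, generate = item
        start = time.perf_counter()
        output = generate(prompt)
        return name, time.perf_counter() - start, output

    # pool.map returns results in input order, even if threads finish out of order.
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        return list(pool.map(run_one, models.items()))


def save_session(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """Persist one benchmark session so later runs can be compared."""
    conn.execute("CREATE TABLE IF NOT EXISTS runs (model TEXT, seconds REAL, output TEXT)")
    conn.executemany("INSERT INTO runs VALUES (?, ?, ?)", rows)
    conn.commit()


# Hypothetical stand-ins for three model variants (not real inference).
FAKE_MODELS: Dict[str, Callable[[str], str]] = {
    "variant_a": lambda p: p.upper(),
    "variant_b": lambda p: p[::-1],
    "variant_c": lambda p: p + "!",
}

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a file path would make sessions persistent
    save_session(conn, benchmark_models(FAKE_MODELS, "hello"))
    print(conn.execute("SELECT model, output FROM runs").fetchall())
```

Threads help mainly when the model call waits on I/O or releases the GIL during native computation; for pure-Python workloads a process pool would be the more likely choice.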
🔒 Internal Tool Code Analysis · PwC
SAP ABAP Code Analyzer

AI classification pipeline using LangChain + GPT-4o with a deterministic override layer to assess legacy ABAP codebases for SAP S/4HANA migration. Engineered a 4-level token-aware fallback system — reducing consultant assessment time from weeks to minutes.

Weeks → mins assessment time reduction
4-level token-aware fallback
Zero pipeline failures
Python · LLM · SAP ABAP · Streamlit · Gradio
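One way a token-aware fallback with four levels could choose its level is sketched below. How the internal tool actually decides is not described on this page, so the selection rule and the `4x` / `16x` thresholds are made up for illustration. `approx_token_count` is a crude hypothetical stand-in for a real tokenizer count such as one from tiktoken.

```python
# The four level names follow the order given in the work history on this page.
LEVELS = ("full context", "RAG retrieval", "chunked summarization", "LLM merge")


def approx_token_count(text: str) -> int:
    """Rough estimate (about 4 characters per token); an assumption, not a real tokenizer."""
    return max(1, len(text) // 4)


def choose_level(token_count: int, context_limit: int) -> str:
    """Pick the cheapest level whose capacity covers the input size.

    The multipliers below are hypothetical thresholds chosen only for this sketch.
    """
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    if token_count <= context_limit:
        return LEVELS[0]  # the whole codebase fits in one prompt
    if token_count <= 4 * context_limit:
        return LEVELS[1]  # retrieve only the relevant chunks
    if token_count <= 16 * context_limit:
        return LEVELS[2]  # summarize chunk by chunk
    return LEVELS[3]  # summarize, then merge the summaries with another LLM pass


if __name__ == "__main__":
    source = "REPORT zdemo.\n" * 5000
    print(choose_level(approx_token_count(source), context_limit=8000))
```

Because every input size maps to some level, the selector itself never rejects a codebase for being too large, which is the property a "zero pipeline failures" claim would depend on.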

Technical Stack

LLM & RAG
LangChain · LangGraph · RAG Pipelines · FAISS · Groq · Ollama · Prompt Engineering
ML & Deep Learning
PyTorch · Scikit-learn · Random Forest · HuggingFace Transformers · CNNs · Transfer Learning
Deployment
HuggingFace Spaces · Streamlit · Gradio · FastAPI · Docker · GitHub Pages
Data Engineering
SAP Migration · Syniti · BODS · Alteryx · SQL · ETL Pipelines · LTMC

Where I've Worked

Data & ML Engineer
@ PwC India
Mar 2023 — Present
  • Built an AI classification pipeline using LangChain + GPT-4o with a deterministic override layer to assess legacy ABAP codebases for SAP S/4HANA migration, reducing consultant assessment time from weeks to minutes
  • Engineered a 4-level token-aware fallback system (full context → RAG retrieval → chunked summarization → LLM merge) using tiktoken + LangChain — ensuring zero pipeline failures on codebases of any size
  • Designed a dual RAG architecture using FAISS + OpenAI Embeddings (a global store for cross-project retrieval plus scoped per-module stores) that auto-generates technical and functional spec docs as Word/Excel outputs
  • Diagnosed and resolved silent production upload failures caused by three compounding issues (middleware conflict, CORS preflight mishandling, read-only container FS) — restoring full functionality with zero data loss
  • Reduced LLM API costs via input-hash-based call caching, model tiering, and batched file classification — eliminating redundant API calls on repeat runs
BioMarin · Kimberly-Clark · Rockwell Automation
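The input-hash-based call caching mentioned in the last bullet can be sketched as follows. This is a simplified illustration, not the production code: `HashedCallCache` and `fake_model` are hypothetical names, and the in-memory dict stands in for whatever persistent store would be needed to survive between runs.

```python
import hashlib
import json
from typing import Callable, Dict


class HashedCallCache:
    """Wrap an LLM call so identical (model, prompt) inputs are only sent once."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._store: Dict[str, str] = {}  # assumption: a real tool would persist this
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash a canonical JSON form so the key is stable and fixed-length.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key not in self._store:
            self.misses += 1  # only cache misses reach the paid API
            self._store[key] = self._call_model(model, prompt)
        return self._store[key]


def fake_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a paid API call."""
    return f"{model}:{prompt.upper()}"


if __name__ == "__main__":
    cache = HashedCallCache(fake_model)
    cache.complete("small-model", "classify this file")
    cache.complete("small-model", "classify this file")  # served from cache
    print(cache.misses)
```

Including the model name in the hash keeps model tiering safe: the same prompt sent to a cheaper and a more capable model is cached as two separate entries.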

Let's Build Something