Multi-agent orchestration, tool use & conditional routing — drafting, profiling and automation that runs end-to-end.
LoRA / QLoRA & distributed training for low-resource translation and domain adaptation — cheaper inference, sharper output.
Real-time detection, multilingual OCR and semantic retrieval — pipelines that read, parse and index the messy real world.
Parse corpora, run OCR and structure the output into DB/CSV-grounded sources.
LoRA / QLoRA adaptation, distributed runs & eval loops.
Wire agents, tools & routing into a reliable workflow.
Deploy on FastAPI / Docker / AWS — measure, cut cost, repeat.
An AI-powered legal-technology product for Indian law firms. I cut NER extraction cost by 25% on an 80-lakh (8 million) judgment corpus by replacing a 7-stage LLM pipeline with a 2-call Gemini 2.5 Flash-Lite architecture, and reached a 98% document-level zero-failure rate across 40 lakh (4 million) test rows.
Multi-agent orchestration routing each request to the right specialist — 42 agents, config-driven.
Compliance agent with hybrid RAG, grounding guardrails and LLM-as-judge evaluation.
Swap LLM providers inside Claude Code, via a published npm package installed globally.