AI Engineer shipping production RAG systems, multi-modal document intelligence, and domain-specific LLM applications — with live demos to prove it.
Architecting production RAG systems and domain-specific LLM applications. Designed and shipped the bilingual Glass Industry Expert System — a fine-tuned Qwen2.5-14B with hybrid retrieval over 247K+ chunks — sustaining ~85% hit rate at 1–5 s latency. Also building autonomous retrieval pipelines and n8n workflow automations integrated with Gemini Vision.
Architected n8n automated workflows to streamline sustainability operations, and delivered a RAG + LLM chatbot prototype with LangChain and Groq API to automate customer inquiries, provide intelligent product advice, and optimize support efficiency.
Deployed real-world AI/ML solutions including an intelligent cheque processing system powered by neural networks and Python — automating data extraction, validation, and financial workflows for measurable productivity gains.
Completed intensive remote training across ML algorithms and deep learning frameworks. Strengthened production ML workflows, model deployment, and optimization strategies through hands-on Python projects.
Strengthened Python programming skills through hands-on internship work on data structures, algorithms, and clean coding practices for scalable software.
Drop any PDF and chat with it. Hybrid retrieval (sentence-transformers MiniLM + BM25 fused via reciprocal rank fusion), Groq Llama-3.3-70B for cited answers, structured JSON extraction with one-click CSV export. FastAPI backend on Render + Next.js 14 frontend on Vercel.
Full-stack RAG application: drop any PDF, chat with it, and get answers grounded in the source with clickable page citations. Hybrid retrieval (sentence-transformers MiniLM dense vectors fused with BM25 lexical search via reciprocal rank fusion), one-shot structured extraction with downloadable CSVs, and a polished Next.js 14 + Tailwind workspace with embedded react-pdf viewer. FastAPI backend on Render + Next.js frontend on Vercel + Groq Llama-3.3-70B for inference.
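The fusion step named above can be sketched in a few lines. This is a minimal illustration of reciprocal rank fusion, not the app's actual code: the function name, the page IDs, and the constant `k = 60` (a commonly used default) are assumptions for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k = 60 is an assumed default, not a value
    taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the dense (MiniLM) and lexical (BM25) retrievers.
dense_hits = ["p3", "p1", "p7"]
bm25_hits = ["p1", "p9", "p3"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Because only ranks are used, the dense and BM25 scores never need to be normalized against each other, which is the usual reason to pick this fusion method for hybrid retrieval.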
End-to-end Malayalam → English speech and text translator deployed on HuggingFace Spaces. Accepts mic recording or any audio format (OGG, MP3, WAV) via Gradio, transcribes with Google Speech API, translates with deep-translator. Includes an offline IndicTrans2 + ctranslate2 NMT pipeline as an alternative backend.
End-to-end ML pipeline for automating loan approvals — data preprocessing, feature engineering, model selection (Random Forest / Logistic Regression), and a Streamlit web interface deployed on Streamlit Cloud. Tackles real-world challenges: missing data, outliers, and multicollinearity.
Production RAG chatbot over Khansaheb's sustainability documents — LangChain + Groq for fast inference, vector retrieval over PDF corpus, and a Node.js front-end. Automates ESG inquiries with grounded, citation-backed answers. Built for Khansaheb Sustainability, Dubai; source code lives in a private client repository.
Production-grade bilingual (English + Farsi) AI assistant for glass manufacturing engineers. Fine-tuned Qwen2.5-14B (vLLM, 4-bit LoRA), hybrid retrieval (pgvector HNSW + BM25 + Reciprocal Rank Fusion) over 247K+ chunks, cross-encoder reranking, Redis semantic caching, React 18 + TypeScript chat UI, deployed via Docker Compose — ~85% retrieval hit rate, 1–5s end-to-end latency. Built for MTA Investment LLC, Dubai; source code lives in a private company repository.
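The semantic-caching idea can be illustrated with a small in-memory stand-in. This is a hedged sketch only: the production system uses Redis and real embeddings, whereas the `SemanticCache` class, the 0.92 similarity threshold, and the two-dimensional toy vectors below are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """In-memory stand-in for a Redis-backed semantic cache (hypothetical)."""

    def __init__(self, threshold=0.92):
        # Assumed threshold: queries at least this similar reuse an answer.
        self.threshold = threshold
        self.entries = []  # list of (query_embedding, answer) pairs

    def put(self, query_vec, answer):
        self.entries.append((list(query_vec), answer))

    def get(self, query_vec):
        """Return the answer of the most similar cached query, or None."""
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(query_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


cache = SemanticCache()
cache.put([1.0, 0.0], "cached answer")

hit = cache.get([0.99, 0.05])   # near-duplicate query, served from cache
miss = cache.get([0.0, 1.0])    # unrelated query, falls through to the LLM
```

A hit skips retrieval and generation entirely, which is how a cache of this kind helps keep latency at the low end of the range.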
Take-home assessment built in Flask + OpenAI. Full mini app with user authentication, campaign creation (product, keywords, targeting, banner upload), AI-generated ad copy via GPT-3.5-turbo with a template fallback, and simulated performance analytics. Demonstrates end-to-end web-app + LLM integration in a single self-contained codebase.
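The template fallback can be pictured roughly as follows. This is a sketch under stated assumptions, not the assessment's code: `generate_ad_copy`, `template_copy`, the template wording, and the injected `llm_call` parameter are hypothetical names, and the OpenAI request is replaced by a plain callable so the example runs without network access.

```python
def template_copy(product, keywords):
    """Deterministic fallback copy built from a fixed template (assumed wording)."""
    return f"Discover {product}: {', '.join(keywords)}."


def generate_ad_copy(product, keywords, llm_call=None):
    """Try the LLM first; fall back to the template on failure or empty output.

    llm_call is a hypothetical callable standing in for the GPT-3.5-turbo request.
    """
    if llm_call is not None:
        try:
            text = llm_call(product, keywords)
            if text and text.strip():
                return text.strip()
        except Exception:
            pass  # API error, timeout, missing key: use the template instead
    return template_copy(product, keywords)


def failing_llm(product, keywords):
    """Simulates an API outage for the demo."""
    raise RuntimeError("simulated API outage")


fallback = generate_ad_copy("Trail Shoes", ["light", "durable"], failing_llm)
generated = generate_ad_copy("Trail Shoes", ["light"], lambda p, k: "  Run further.  ")
```

With this shape, campaign creation keeps working even when the model call fails, at the cost of less varied copy.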
Interactive OpenCV + Streamlit demo covering all the foundational image-processing techniques: contrast enhancement (global HE, CLAHE), morphology (erosion, dilation, opening, closing), edge detection (Canny, Sobel, Laplacian, morphological gradient), and segmentation (thresholding, K-means, contours). Upload any image, drag the parameter sliders, download the result.
Three-part practical guide to autoencoders: fundamentals, image denoising, and anomaly detection. Demonstrates dimensionality reduction, feature extraction, and reconstruction-loss analysis with TensorFlow/Keras.
End-to-end ML pipeline classifying primary students into VARK learning styles (Visual / Auditory / Reading-Writing / Kinesthetic) from a 15-question survey plus study-habit features. Realistic synthetic dataset (5,000 students), feature engineering, and a Streamlit demo. Lifted accuracy from a roughly 59% LogReg baseline to 84.5% with a Random Forest on engineered features (+25.3 pp).
Contributed to a neural network-based system for automated cheque extraction and validation at Direct Axis Technologies, Dubai. Streamlined financial workflows with high-accuracy OCR. Source code lives in a private team repository.
Shoot me a message on LinkedIn, or email me at zainrafeeque@gmail.com.