@ruthuraraj-ml

Ruthuraraj ML & AI Lab

A structured learning journey exploring Machine Learning, Deep Learning pipelines, Multimodal RAG systems, and Agentic AI workflows.



LinkedIn GitHub Portfolio LeetCode Exercism


💡 Philosophy

These projects are not about replicating state-of-the-art benchmarks or shipping polished end products.

The goal of every repository here is to build from scratch after genuinely understanding the underlying concepts — and then to be honest about where things break, what the model cannot do, and why. Limitations are documented as carefully as results, because understanding failure modes is how real learning happens.

This matters especially coming from a Mechanical Engineering background: the instinct here is not to chase accuracy numbers, but to ask three questions. Why does this architecture work? Where does it fail? What does that tell us about the problem? Those questions drive every project in this portfolio.


⭐ Dream Project

R-B.A.T (RAG-Based Academic Tutor & Assessment Console) is my long-term effort to build an institution-ready academic AI system that goes beyond chatbot-style interactions.

The project explores how Retrieval-Augmented Generation, local LLMs, structured workflows, and educational guardrails can be combined to support the complete teaching-learning cycle—from content delivery and student tutoring to assessment creation and academic resource generation.

Designed around real institutional constraints such as data privacy, limited hardware, and zero cloud dependency, the entire system runs locally using open-source models through Ollama.

Core Capabilities

🎓 Tutor Mode Grounded question answering over prescribed textbooks with syllabus-aware retrieval, diagram support, and multiple explanation styles.

📝 Assessment Mode Automatic generation of Internal and Semester Examination question papers with Bloom's Taxonomy alignment, CO mapping, and institutional formatting.

📖 Evaluation Mode Generation of textbook-grounded model answers and marking schemes for faculty reference and student preparation.

📊 Presentation Mode Automated lecture presentation generation using RAG-derived content, diagrams, flowcharts, and customizable academic themes.

Why It Matters

Most educational AI tools focus on answering questions.

R-B.A.T focuses on generating academically useful artifacts:

  • Question Papers
  • Model Answers
  • Lecture Presentations
  • Tutor Responses
  • Diagram-Based Explanations

all grounded in institution-approved learning resources rather than general web knowledge.

Architecture & Technical Details

High-level architecture of R-B.A.T, illustrating the interaction between academic workflows (Tutor, Assessment, Evaluation, and Presentation), the Academic Knowledge Layer, Retrieval Engine, Local LLM Runtime, and structured output generation.

Gemma Mistral Ollama FAISS SentenceTransformers Streamlit ReportLab python-pptx

Current focus: evolving R-B.A.T from a textbook-grounded tutor into a comprehensive academic co-pilot for teaching, assessment, and classroom content generation.


🗺️ Learning Progression

Classical Machine Learning  →  Neural Networks & Deep Learning
        ↓                               ↓
Representation Learning     →  Generative & Multimodal AI
                                       ↓
              Agentic AI Systems & LLM Applications
                            ↓
              AI for Engineering Applications

Each project is packaged with a report, notebook, README, requirements, and reproducible workflow.


Classical ML · Deep Learning · Representation Learning — the ground every later project is built on.


A concept-driven demonstration of why linear models fail and why hidden layers are necessary. It walks through OR and AND (both linearly separable), shows logistic regression failing on XOR, and then solves XOR with a single-hidden-layer MLP. The focus is entirely on decision boundaries and architectural necessity, not accuracy.

Logistic Regression MLP Decision Boundaries
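
The XOR argument can be checked in a few lines. The sketch below is not the project's code: it swaps logistic regression and a trained MLP for threshold units with hand-set weights, so the result is deterministic. All function names are hypothetical. A coarse grid search finds no single linear boundary that fits XOR, while a 2-2-1 network (OR and NAND hidden units feeding an AND output) fits it exactly.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def step(z):
    """Threshold activation: fires when the weighted sum is positive."""
    return 1 if z > 0 else 0

def linear_unit(x, w1, w2, b):
    """One linear decision boundary: step(w1*x1 + w2*x2 + b)."""
    return step(w1 * x[0] + w2 * x[1] + b)

def linear_model_fits_xor():
    """Search a coarse weight grid for a single line that separates XOR."""
    grid = [i / 2 for i in range(-8, 9)]  # -4.0 ... 4.0 in steps of 0.5
    for w1, w2, b in product(grid, repeat=3):
        if all(linear_unit(x, w1, w2, b) == y for x, y in XOR.items()):
            return True
    return False

def mlp_xor(x):
    """Hand-set 2-2-1 network: hidden units compute OR and NAND, output ANDs them."""
    h_or = linear_unit(x, 1, 1, -0.5)
    h_nand = linear_unit(x, -1, -1, 1.5)
    return linear_unit((h_or, h_nand), 1, 1, -1.5)

print(linear_model_fits_xor())       # no single line works
print([mlp_xor(x) for x in XOR])     # hidden layer reproduces the XOR column
```

The grid search only illustrates the failure; the general impossibility follows from XOR not being linearly separable.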


📐 Classical Machine Learning

Projects

View all 9 projects
| # | Project | Type | Models |
|---|---------|------|--------|
| 1 | Advertising Sales Prediction | Regression | Linear Regression |
| 2 | Bike Sharing Demand Prediction | Time-Pattern Regression | Linear Regression |
| 3 | Diabetes Prediction | Medical Classification | Logistic Regression |
| 4 | Titanic Survival Prediction | Binary Classification | Logistic Regression |
| 5 | Wine Quality Prediction | Multiclass Classification | Random Forest |
| 6 | Health Risk Classification for Insurance Premium Optimization | Medical Risk Classification | LR · DT · RF |
| 7 | Online Payment Fraud Detection | Imbalanced Classification | LR · RF · XGBoost |
| 8 | NYC Taxi Trip Duration Prediction | Geospatial Regression | LR · Ridge · Lasso · DT · RF · GB |
| 9 | Deep Learning for Groundwater Quality Assessment | Regression + Multiclass | ANN · BatchNorm · Dropout · Optuna |

🧠 Deep Learning & Neural Networks

Projects

View all 3 projects
| # | Project | Focus Area | Techniques |
|---|---------|------------|------------|
| 1 | Twitter Sentiment Analysis | NLP & Sequence Modeling | RNN · LSTM · GRU · BERT · Transfer Learning |
| 2 | Groundwater Quality Assessment | Applied Deep Learning | ANN · Optimizer Comparison · BatchNorm · Dropout · Optuna |
| 3 | Neural Networks — From Basics to Stabilization | Deep Learning Fundamentals | BatchNorm · Dropout · Optimizers · Training Dynamics |

Tokenization · Embeddings · Transformers · LoRA · QLoRA


End-to-end Word2Vec (Skip-Gram + Negative Sampling) built from scratch in PyTorch, extended into an interactive browser-based embedding explorer. Exports intermediate checkpoints across epochs and visualises how semantic structure gradually emerges from random vectors — nearest neighbours, similarity scoring, analogy solving, geometric clustering.

Concepts covered

Distributional hypothesis · negative sampling · cosine similarity · semantic clustering · vector arithmetic · geometry of learned representations · effect of training progression on embedding quality

Word2Vec Skip-Gram FAISS Embedding Visualisation WikiText-2
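
Similarity scoring and analogy solving both reduce to cosine similarity over the learned vectors. The sketch below is a minimal illustration, assuming a hand-written 3-dimensional embedding table (`EMB`) instead of trained Word2Vec vectors; the words, values, and helper names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

# Hypothetical embeddings (axes loosely: royalty, male, female); not trained vectors.
EMB = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [-1.0, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(vec, exclude=(), k=1):
    """Return the k words whose embeddings are most similar to vec."""
    scored = [(cosine(vec, e), w) for w, e in EMB.items() if w not in exclude]
    return [w for _, w in sorted(scored, reverse=True)[:k]]

def analogy(a, b, c):
    """Solve a : b :: c : ? with the vector arithmetic b - a + c."""
    target = [y - x + z for x, y, z in zip(EMB[a], EMB[b], EMB[c])]
    return nearest(target, exclude={a, b, c})[0]

print(analogy("man", "king", "woman"))
```

With trained embeddings the same functions apply unchanged; only the table is replaced by the model's weight matrix, and a vector index would typically replace the linear scan.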


Built a custom Byte Pair Encoding (BPE) tokenizer from scratch using the WikiText-2 corpus and extended it into an interactive browser-based tokenizer visualizer. The project explores corpus cleaning, vocabulary learning, subword formation, rare-word decomposition, compression efficiency, and tokenizer evaluation while demonstrating how modern NLP systems represent language through reusable subword units.

Concepts covered

Byte Pair Encoding (BPE) · subword tokenization · vocabulary construction · corpus preprocessing · tokenization consistency · compression ratio · out-of-vocabulary handling · rare-word decomposition · tokenizer evaluation · Hugging Face Tokenizers

BPE Tokenization Subword Learning Vocabulary Analysis WikiText-2 HuggingFace Tokenizers
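
The core of BPE vocabulary learning is a loop that repeatedly merges the most frequent adjacent symbol pair. The following is a minimal pure-Python sketch of that loop, not the repository's implementation: it assumes a tiny word list in place of WikiText-2, an end-of-word marker `</w>`, and an arbitrary tie-breaking rule; `learn_bpe`, `merge_pair`, and `encode` are hypothetical names.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def learn_bpe(words, num_merges):
    """Learn an ordered list of merges from a list of words (a toy corpus)."""
    # Each word starts as its characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself (an arbitrary choice).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged = Counter()
        for symbols, freq in vocab.items():
            merged[merge_pair(symbols, best)] += freq
        vocab = merged
    return merges

def encode(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

corpus = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges = learn_bpe(corpus, 10)
print(encode("lowest", merges))  # a word absent from the corpus, split into learned subwords
```

Because unseen words are always decomposable into known symbols, this scheme has no out-of-vocabulary failures for text drawn from the same character set.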


A hands-on exploration of Parameter-Efficient Fine-Tuning (PEFT), progressing from LoRA-based adaptation of BERT to QLoRA-based instruction tuning of Gemma 2B. The project investigates how large language models can be adapted by training only a tiny fraction of their parameters while significantly reducing memory requirements through 4-bit quantization.

What makes this different: the repository documents the complete engineering journey — including an attempted QLoRA implementation on BERT, debugging of bitsandbytes compatibility issues, architectural analysis of encoder vs decoder models, and a successful migration to Gemma 2B. Rather than hiding failed experiments, the project preserves them as learning artifacts.

Architecture, results & concepts covered

BERT + LoRA

  • Accuracy: 90.41%
  • Trainable Parameters: 591K (0.5372%)

Gemma 2B + QLoRA

  • Accuracy: 97.04%*
  • Trainable Parameters: 6.39M (0.2438%)
  • 4-bit NF4 Quantization
  • Fine-tuned on a Tesla T4 GPU

Concepts Covered

Parameter-Efficient Fine-Tuning · LoRA · QLoRA · 4-bit Quantization · Instruction Tuning · Transformer Architectures · Encoder vs Decoder Models · Hugging Face PEFT · BitsAndBytes · Memory-Efficient LLM Adaptation

BERT Gemma 2B LoRA QLoRA PEFT BitsAndBytes Transformers PyTorch Hugging Face

* Metrics are computed only on valid generated predictions that could be confidently mapped to sentiment labels.
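
The small trainable fraction follows from simple arithmetic: a rank-r adapter on a weight matrix with d_in inputs and d_out outputs adds r × (d_in + d_out) parameters instead of d_in × d_out. A minimal sketch of that count follows. The hidden size, rank, layer count, and number of adapted projections are illustrative assumptions, not the configuration used in this repository, and `lora_params` is a hypothetical helper.

```python
def lora_params(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

# Illustrative settings only (assumed, not taken from the repository).
hidden = 768            # width of each adapted projection
rank = 8                # LoRA rank r
layers = 12             # transformer layers
targets_per_layer = 2   # e.g. two attention projections per layer

per_adapter = lora_params(hidden, hidden, rank)
full_matrix = hidden * hidden
total_adapters = per_adapter * layers * targets_per_layer

print(per_adapter, full_matrix)   # adapter size vs. the frozen matrix it modifies
print(total_adapters)             # all adapter parameters across the model
```

Under these assumed settings each adapter is roughly 2% of the matrix it adapts, which is why the overall trainable share stays well under 1% once the frozen base model is counted.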

Multi-agent orchestration · LangGraph · CrewAI Flows · Reflection loops · Persistent memory


A memory-driven multi-agent research system that transforms a user query into an evidence-backed research report through planning, retrieval, critique, reflection, and persistent learning. Built with LangGraph, D.A.R.I.A. combines domain-specific RAG, web research, evidence curation, a dedicated Research Critic, and structured summarization to move beyond traditional one-shot question answering.

What makes this different: retrieved evidence is independently evaluated by a Research Critic that identifies missing concepts, generates improvement feedback, and triggers replanning when research quality is insufficient. Research plans, information gaps, critic feedback, and final reports are stored as persistent memory and reused in future investigations.

Architecture, agent workflow & concepts covered

Multi-agent workflow:

Memory Agent → Information Needs Analyst → RAG/Web Retrieval → Evidence Curator → Research Critic → Reflection Loop → Summarizer → Memory Update

LangGraph workflow showing conditional routing, parallel hybrid retrieval, reflection-driven replanning, and persistent memory integration.

Core Capabilities

  • Memory-Augmented Research
  • Information Gap Analysis
  • Dynamic Route Selection
  • Domain-Specific RAG
  • Web Research
  • Parallel Hybrid Retrieval
  • Evidence Curation
  • Critique-Based Evaluation
  • Reflection Loops
  • Persistent Learning
  • Structured Report Generation
  • DOCX Export

Concepts Covered

LangGraph StateGraph · shared state management · conditional routing · parallel execution · hybrid retrieval · persistent memory · research planning · information gap analysis · evidence curation · reflection loops · critique-driven evaluation · structured summarization · explainable AI workflows

LangGraph LiteLLM Gemini 2.5 Flash ChromaDB SentenceTransformers Docling Tavily Pydantic Streamlit python-docx


A CrewAI Flow-powered logistics decision intelligence platform that transforms supply chain metrics into executive-level optimization playbooks. Parallel analytical branches (inventory + logistics) synchronize through Flow barriers and feed a memory-aware strategist. Every strategy is then reviewed by an independent Critic Agent, which triggers a revision cycle when necessary.

What makes this different: historical optimization playbooks are actively retrieved and injected into future strategy generation — delta reasoning where recommendations evolve from previous decisions rather than restarting from scratch.

Architecture, multi-LLM design & concepts covered

Multi-LLM cognitive architecture: Llama 3.3 70B for inventory interpretation · Gemma 4 26B for logistics analysis · Gemma 4 31B for strategic synthesis · Gemini Flash Lite for independent validation.

Concepts covered: CrewAI Flows · multi-agent orchestration · multi-LLM specialization · reflection-driven strategy revision · persistent SQLite memory · delta reasoning · parallel execution branches · synchronization barriers · executive decision support · logistics optimization · geospatial analytics

CrewAI CrewAI Flows Llama 3.3 70B Gemma 4 Gemini Flash Lite SQLite Gradio Pandas Pydantic
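
The persistent-memory and delta-reasoning idea can be sketched with the standard-library `sqlite3` module. This is an assumption-laden illustration, not the project's schema: the `playbooks` table, its columns, the `region` key, and all function names are hypothetical. Past recommendations are stored, recalled newest first, and injected into the next strategy prompt so the model is asked for changes instead of a fresh plan.

```python
import sqlite3

def open_memory(path=":memory:"):
    """Open (or create) the playbook store; ':memory:' keeps the demo self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS playbooks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, region TEXT, recommendation TEXT)"
    )
    return conn

def save_playbook(conn, region, recommendation):
    """Persist one finished recommendation."""
    conn.execute(
        "INSERT INTO playbooks (region, recommendation) VALUES (?, ?)",
        (region, recommendation),
    )
    conn.commit()

def recall(conn, region, limit=3):
    """Return the most recent recommendations for a region, newest first."""
    rows = conn.execute(
        "SELECT recommendation FROM playbooks WHERE region = ? "
        "ORDER BY id DESC LIMIT ?",
        (region, limit),
    ).fetchall()
    return [row[0] for row in rows]

def build_prompt(conn, region, metrics):
    """Inject recalled playbooks so the strategist reasons about what should change."""
    history = recall(conn, region)
    past = "\n".join(f"- {h}" for h in history) or "- (no prior playbooks)"
    return (
        f"Previous recommendations for {region}:\n{past}\n"
        f"Current metrics: {metrics}\n"
        "Describe only what should change relative to the previous recommendations."
    )

conn = open_memory()
save_playbook(conn, "south", "Raise safety stock by 10%")
save_playbook(conn, "south", "Shift two routes to rail")
print(build_prompt(conn, "south", {"fill_rate": 0.93}))
```

Passing a file path instead of `":memory:"` makes the store survive across runs, which is what allows later sessions to build on earlier decisions.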


A multi-agent financial intelligence platform that transforms bill images into actionable financial insights through deterministic analytics, expense forecasting, and collaborative AI reasoning. Built with AutoGen AgentChat, the system combines OCR, spending analytics, recurring expense detection, and a dedicated Advisor–Reviewer collaboration workflow to produce explainable, verified financial recommendations.

What makes this different: deterministic financial analytics are deliberately separated from LLM reasoning. Structured financial reports are generated first, then independently reviewed by a Financial Reviewer before the Financial Advisor produces the final verified consultation. Every recommendation is supported by structured analytics and accompanied by an explainable consultation trace.

Architecture, multi-agent workflow & concepts covered

Multi-agent workflow:

Bill Processing → Expense Analytics → Recurring Expense Detection → Spending Forecast → Financial Advisor → Financial Reviewer → Verified Consultation → Response Synthesis

Production-style architecture combining deterministic financial analytics with AutoGen-based collaborative reasoning and structured response synthesis.

Core Capabilities

  • OCR Bill Processing
  • Expense Categorization
  • Spending Analytics
  • Recurring Expense Detection
  • Monthly Spending Forecasting
  • Advisor–Reviewer Collaboration
  • Structured Financial Consultation
  • Reflection-Based Explainability
  • Interactive Financial Dashboard

Concepts Covered

AutoGen AgentChat · planner–executor architecture · group chat collaboration · deterministic analytics · structured financial reasoning · advisor–reviewer verification · Pydantic validation · modular coordinator design · response serialization · explainable AI workflows

AutoGen AgentChat Gemini Groq Streamlit EasyOCR Pandas Plotly Pydantic
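
Keeping analytics deterministic means steps such as recurring expense detection can be plain code with no LLM involved. The rule below is one possible heuristic, assumed for illustration and not necessarily the one the project uses: a merchant counts as recurring if it is billed in at least `min_months` distinct months and each monthly total stays within a tolerance of the average. The record fields and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def find_recurring(transactions, min_months=3, tolerance=0.10):
    """Flag merchants billed in >= min_months distinct months at a stable amount."""
    monthly_totals = defaultdict(lambda: defaultdict(float))
    for t in transactions:
        month = t["date"][:7]  # "YYYY-MM" prefix of an ISO date
        monthly_totals[t["merchant"]][month] += t["amount"]

    recurring = {}
    for merchant, months in monthly_totals.items():
        amounts = list(months.values())
        if len(amounts) < min_months:
            continue
        avg = mean(amounts)
        # Stable means every month is within `tolerance` (as a fraction) of the average.
        if all(abs(a - avg) <= tolerance * avg for a in amounts):
            recurring[merchant] = round(avg, 2)
    return recurring

bills = [
    {"date": "2026-01-05", "merchant": "StreamCo", "amount": 499.0},
    {"date": "2026-02-05", "merchant": "StreamCo", "amount": 499.0},
    {"date": "2026-03-05", "merchant": "StreamCo", "amount": 499.0},
    {"date": "2026-01-12", "merchant": "Grocer", "amount": 1200.0},
    {"date": "2026-02-20", "merchant": "Grocer", "amount": 3400.0},
    {"date": "2026-03-02", "merchant": "Grocer", "amount": 800.0},
    {"date": "2026-03-15", "merchant": "Cafe", "amount": 250.0},
]
print(find_recurring(bills))
```

Because the output is a plain dictionary, it can be handed to the reasoning agents as structured evidence and checked independently of whatever the LLM says about it.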


A LangGraph ReAct competitor intelligence platform for clothing stores, built in 2 days. It discovers nearby competitors via Apify, enriches them with BestTime traffic data (with an inference fallback for missing coverage), runs a reflection-driven validation loop, and stores every completed analysis in ChromaDB so that past analyses can be queried through a RAG assistant.

What makes this different: the dual-layer traffic strategy (empirical API → inference fallback) guarantees 100% traffic coverage even when primary data is unavailable.

Concepts covered & stack

LangGraph StateGraph design · ReAct cycle · reflection-driven loop control · dual-layer data resilience · weighted competitive scoring · persistent vector memory · RAG over longitudinal market data · PDF/Excel report generation

LangGraph LangChain Gemini ChromaDB Apify BestTime Plotly Streamlit ReportLab OpenPyXL


A confidence-scored reflection loop for travel planning: after initial data collection, the agent evaluates its own information completeness (0–100%), identifies knowledge gaps, and re-searches with targeted queries before generating the final guide. Re-search is capped at 2 cycles to prevent runaway API consumption. Every cycle's verdict and gaps are surfaced in an Agent Insights tab, so the internal reasoning is fully auditable.

Concepts covered & stack

LangGraph StateGraph · confidence-scored self-evaluation · conditional re-search · multi-tool orchestration (weather, search, images) · session memory · deduplicated research merging · transparent AI trace

LangGraph LangChain Gemini Groq LLaMA Tavily WeatherAPI Pexels Streamlit
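
The control flow of a confidence-scored loop with a hard cap on re-search cycles can be sketched without any framework. In the sketch below, only the cap of 2 cycles and the 0 to 100 confidence scale come from the description above; the threshold of 80, the stubbed `fake_collect` and `fake_evaluate` functions, and all other names are assumptions standing in for real tool calls and an LLM-based evaluator.

```python
MAX_RESEARCH_CYCLES = 2  # cap on re-search cycles, as described above

def plan_with_reflection(collect, evaluate, threshold=80):
    """Collect notes, self-evaluate, and re-search reported gaps at most twice."""
    notes = collect(["overview"])
    trace = []  # one entry per evaluation, suitable for an insights view
    for cycle in range(MAX_RESEARCH_CYCLES + 1):
        confidence, gaps = evaluate(notes)
        trace.append({"cycle": cycle, "confidence": confidence, "gaps": gaps})
        if confidence >= threshold or not gaps or cycle == MAX_RESEARCH_CYCLES:
            break
        new_notes = collect(gaps)  # targeted queries built from the gaps
        notes = notes + [n for n in new_notes if n not in notes]  # deduplicated merge
    return notes, trace

def fake_collect(queries):
    """Stub for search/weather/image tools: one note per query."""
    return [f"note about {q}" for q in queries]

def fake_evaluate(notes):
    """Stub evaluator: confidence is the share of wanted topics already covered."""
    wanted = ["overview", "weather", "transport"]
    gaps = [w for w in wanted if f"note about {w}" not in notes]
    confidence = round(100 * (len(wanted) - len(gaps)) / len(wanted))
    return confidence, gaps

notes, trace = plan_with_reflection(fake_collect, fake_evaluate)
for entry in trace:
    print(entry)
```

Keeping the per-cycle verdicts in `trace` is what makes the loop auditable: the same records can be rendered directly in a UI tab.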


Autonomous research agent built on the ReAct paradigm from scratch — no LangChain, no LangGraph, no framework. Full Thought → Action → Observation → Summary loop per research question using a multi-LLM split: Groq LLaMA 3.3 70B for reasoning steps, Gemini Flash Lite for planning and synthesis.

Honest implementation: limitations (single-pass loop, in-session memory only, no reflection) are documented as the natural next improvements, not hidden.

Gemini Groq LLaMA 3.3 70B Tavily ReAct Pattern Streamlit


Four specialised agents — Research (RAG) → Image (diffusion) → Reviewer → Manager — orchestrated to transform a topic prompt into structured educational content. Failure cases (handoff breakdowns, irrelevant RAG retrievals, divergent diffusion diagrams) documented alongside the working pipeline.

CrewAI Gemini Groq FLUX RAG Streamlit


First agentic implementation. Built as a live demo for the final session of a 3-day AI workshop (SNS College of Technology, Jan 2026). Replaces LLM routing with a FAISS L2 distance threshold, which saves one API call per query, is fully deterministic, and makes the agent's decision logic transparent to a non-CS audience.

Gemini API SentenceTransformers FAISS pypdf Streamlit
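
Threshold-based routing amounts to one distance comparison. The sketch below uses plain Python lists in place of a FAISS index and SentenceTransformers embeddings so it runs without dependencies; the threshold value, the route labels `"rag"` and `"fallback"`, and the function names are assumptions for illustration. It uses squared L2 distance, which is also what a flat FAISS L2 index reports, so a threshold tuned on one should transfer to the other.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def route(query_vec, chunk_vecs, threshold=0.5):
    """Pick the document route when the closest chunk is near enough.

    Returns ("rag", index_of_best_chunk) or ("fallback", None).
    """
    distances = [squared_l2(query_vec, c) for c in chunk_vecs]
    best = min(range(len(distances)), key=distances.__getitem__)
    if distances[best] <= threshold:
        return "rag", best
    return "fallback", None

# Toy 2-d "embeddings" standing in for encoded PDF chunks.
chunks = [[1.0, 0.0], [0.0, 1.0]]
print(route([0.9, 0.1], chunks))    # close to chunk 0
print(route([-1.0, -1.0], chunks))  # far from every chunk
```

The decision depends only on the vectors and the threshold, so the same query always takes the same route and the rule can be explained with a single number.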


RAG pipelines · Multimodal systems · Generative models · Vision-Language


🔹 Paperwise RAG — Multimodal Research Paper Q&A (Repository Temporarily Offline)

A multimodal RAG pipeline handling text, tables, and figures from academic PDFs in a single unified system. Docling preserves structure across all three modalities; BGE embeds into FAISS; Gemini answers with vision-enabled summarisation for figures and tables. Queries each modality separately and merges before generation.

Engineering details & concepts

Addresses a genuine engineering challenge: most RAG pipelines treat PDFs as plain text, losing tables and figures. OOM crashes on high-resolution page images were fixed by tuning Docling's image scale and disabling full-page rasterisation.

Concepts: multimodal document parsing · three-modality retrieval (text / table / figure) · BGE embeddings · FAISS vector search · vision-language generation · memory-efficient PDF processing

Docling BGE Embeddings FAISS Gemini Flash Lite Gradio Python


×4 image reconstruction (SRGAN and RRDB-based ESRGAN) built from scratch with patch-based training, warm-up stability phases, and adversarial fine-tuning on DIV2K. Results under constrained compute are documented honestly: mode collapse, discriminator instability, and the gap between perceptual loss and PSNR are all explained rather than hidden behind cherry-picked results.

SRGAN ESRGAN RRDB VGG Perceptual Loss DIV2K
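
PSNR is a pure function of mean squared error, which is one reason it can disagree with perceptual quality: a smooth, averaged reconstruction minimises pixel error even when it looks blurry. A minimal sketch of the metric follows, assuming 8-bit pixel values passed as flat sequences; the function name and sample values are hypothetical.

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

reference = [100, 150, 200, 250]
reconstructed = [105, 145, 205, 245]  # every pixel off by 5, so MSE = 25
print(round(psnr(reference, reconstructed), 2))
```

A perceptual loss compares feature activations instead of raw pixels, so a model tuned for it can produce sharper textures while scoring lower on this metric.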


Caption generation progressing from CNN–LSTM baseline (InceptionV3 + LSTM) through Transformer decoder to pretrained vision–language transformers. Validates learning via controlled overfitting before scaling up. Documents why pretrained models outperform naive fine-tuning on limited data — the analysis of what breaks is as central as the results.

InceptionV3 LSTM Transformer Beam Search HuggingFace
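
Beam search keeps several partial captions alive at each step instead of committing to the single most likely next word. The sketch below is a generic illustration, not the project's decoder: a hand-written probability table (`TOY`) stands in for the captioning model, scores are summed log-probabilities with no length normalisation, and all names are hypothetical.

```python
import math

def beam_search(next_log_probs, start, end, beam_width=2, max_len=5):
    """Keep the `beam_width` best partial sequences per step; return the best overall."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            # Sequences that emitted the end token stop growing.
            (finished if seq[-1] == end else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # include unfinished beams if max_len was reached
    return max(finished, key=lambda c: c[1])[0]

# Toy next-token distribution keyed on the previous token (assumed values).
TOY = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.5},
    "the": {"cat": 0.95, "dog": 0.05},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
}

def toy_model(seq):
    """Return log-probabilities for the next token given the sequence so far."""
    return {tok: math.log(p) for tok, p in TOY[seq[-1]].items()}

print(beam_search(toy_model, "<s>", "</s>", beam_width=1))  # greedy decoding
print(beam_search(toy_model, "<s>", "</s>", beam_width=2))  # beam of two
```

In this toy table, greedy decoding commits to "a" because it is the likeliest first word, while a beam of two recovers the sequence through "the", whose total probability is higher.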


🛠️ Technology Stack

💻 Languages

Python

📊 Machine Learning & Data Science

Scikit-Learn XGBoost NumPy Pandas

🧠 Deep Learning & NLP

PyTorch TensorFlow Transformers HuggingFace Tokenizers

🚀 LLMs & Parameter-Efficient Fine-Tuning

LoRA QLoRA PEFT BitsAndBytes

🤖 Agentic AI & Orchestration

LangChain CrewAI LangGraph

🔎 RAG & Vector Databases

FAISS ChromaDB RAG GraphRAG

⚡ LLM Providers & Local AI

Gemini Groq Ollama OpenAI

🌐 Applications & Deployment

Streamlit Gradio GitHub Pages Jupyter

🛠️ Developer Tools

Git GitHub VS Code


🎯 Current Focus

  • Building LLM foundations and applied NLP systems, from tokenization and embeddings to parameter-efficient fine-tuning
  • Developing RAG, GraphRAG, and agentic AI applications for education, research, and decision support
  • Applying AI to engineering, manufacturing, logistics, and supply chain optimization
  • Creating interactive AI learning experiences, workshops, and educational tools for students and faculty
  • Exploring local and hybrid LLM deployments (Gemma, Ollama, Groq, Gemini) for scalable and cost-efficient AI systems
  • Designing multi-agent and multi-LLM architectures that combine specialized models, tools, and retrieval workflows

🔬 R&D Roadmap

Agentic AI, LLM Systems & Engineering AI

🤖 Agentic AI & Knowledge Systems

  • Enterprise AI Agents — Multi-agent systems for logistics, decision support, and operational intelligence
  • GraphRAG & Memory-Aware Agents — Knowledge graphs, long-term memory, and reasoning-enhanced retrieval
  • Educational AI Systems — Next-generation RAG tutors, assessment assistants, and learning copilots

🧠 LLMs & Generative AI

  • LLM Fine-Tuning & PEFT — LoRA, QLoRA, instruction tuning, and domain adaptation
  • Multimodal AI Applications — Text, image, and document understanding workflows
  • Local AI Infrastructure — Hybrid deployments using Ollama, Gemma, and open-source LLMs

⚙️ AI for Engineering & Manufacturing

  • Predictive Manufacturing — Quality, tool wear, and process optimization using ML/DL
  • Engineering Knowledge Systems — AI assistants and retrieval systems for technical education
  • Decision Intelligence for Supply Chains — Optimization, forecasting, and autonomous planning

Research Direction: Building interpretable, retrieval-augmented, and memory-aware AI systems that bridge modern LLMs with real-world engineering and educational applications.


👨‍🏫 About

R. Ruthuraraj · Assistant Professor · Mechanical Engineering · SNS College of Technology
AICTE QIP Programme: AI to Generative AI, IIIT Allahabad

This portfolio documents a self-directed learning journey from classical machine learning and statistical modelling to deep learning, generative AI, retrieval-augmented generation (RAG), and modern agentic AI systems.

What began as an effort to learn Python for teaching and engineering applications gradually evolved into a deeper exploration of how intelligent systems reason, retrieve information, collaborate, critique their own outputs, and learn from past decisions. Every project in this portfolio represents not only a completed system, but also the questions, experiments, debugging sessions, architectural redesigns, and lessons learned along the way.

My long-term goal is to bridge Artificial Intelligence and Engineering, applying machine learning, NLP, generative AI, and agentic systems to manufacturing, engineering education, and real-world decision support. This portfolio showcases that journey through practical AI systems, continuous learning and experimentation.


🙏 Acknowledgements

SNS College of Technology · AICTE QIP Programme · IIIT Allahabad · NPTEL Course Instructors · Kaggle · UCI · Hugging Face · CrewAI · LangGraph · LangChain · PyTorch · TensorFlow · Open-Source AI Communities


If you find these projects useful for learning, teaching, or exploring AI systems, consider starring the repositories.
