Mansi Dhanania MansiDhanania

Hi there, I'm Mansi 👋

AI Systems Engineer · Applied ML · Multimodal AI · Montreal, Canada

I'm a grad student at McGill finishing my M.Sc. in ECSE. I spend my time building AI systems that solve real-world problems.

My thesis is a multimodal AI assistant for blind users. Developing a system that genuinely cannot afford to fail, hallucinate or crash mid-session has taught me more about building reliable LLM pipelines than anything else could. I worked on the backend architecture: multi-model orchestration, Redis session memory, Traefik infrastructure, MCP servers, n8n workflows and routing logic. One backend, three frontends (web, iOS/Android, smart glasses), with no backend changes needed between them.

Apart from my thesis, I've been chasing two questions:

🤔 Can LLMs actually be creative? Not "generate interesting text" but capable of genuinely novel ideas in a measurable sense. I spent a semester at Mila building a 4-agent RL-LLM loop to investigate this.
⚙️ How do you build agentic systems that don't fall apart? Multi-model routing, memory that helps rather than bloats context, fallback cascades that fail loudly. The engineering here is underrated.
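
A minimal sketch of what "fallback cascades that fail loudly" can look like, not the code from any of my projects. The names `call_with_fallback`, `AllModelsFailed`, `flaky_primary` and `stable_backup` are hypothetical, and each model is assumed to be a plain function from prompt to reply:

```python
from __future__ import annotations

from typing import Callable, Sequence


class AllModelsFailed(RuntimeError):
    """Raised when every model in the cascade has failed."""


def call_with_fallback(
    prompt: str,
    models: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in models:
        try:
            reply = call(prompt)
        except Exception as exc:
            # Record the failure and move on to the next model in the cascade.
            errors.append(f"{name}: {exc!r}")
            continue
        if reply.strip():
            return name, reply
        errors.append(f"{name}: empty reply")
    # Fail loudly: surface every upstream error instead of returning a silent default.
    raise AllModelsFailed("; ".join(errors))


# Hypothetical stand-ins for real model clients.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def stable_backup(prompt: str) -> str:
    return f"backup answer to: {prompt}"


used_model, answer = call_with_fallback(
    "What is on this shelf?",
    [("primary", flaky_primary), ("backup", stable_backup)],
)
```

The point of the design is that an empty or failed reply never reaches the user as if it were an answer: either a later model succeeds, or the caller gets one exception listing every failure.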

🔨 Things I've built

ShelfScout (M.Sc. thesis, backend architecture, live here) Real-time AI assistant for blind and visually impaired users. The backend combines n8n orchestration across Claude, Gemini, LLaMA-4, Qwen3-VL and GPT-OSS; Redis session memory; Traefik + Docker; and MCP servers. Frontend-agnostic by design: the same backend serves web and iOS/Android, and will talk to smart glasses. Benchmarked against Be My AI, Meta Ray-Ban glasses, and Gemini Live.

Novelty in LLM-Guided RL 4-agent RL-LLM loop where agents propose physics hypotheses, write their own reward functions, and critique each other. Cosine-similarity rejection sampling forces genuine novelty over paraphrasing. My novelty-seeker agent hit 3.6× higher embedding distance than the baseline DQN. Whether that counts as creativity is still an open question.
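
A rough sketch of the cosine-similarity rejection idea, under stated assumptions: hypotheses have already been embedded as vectors, and a candidate is rejected when it is too similar to anything accepted so far. The function names and the `max_similarity=0.8` threshold are illustrative, not the values used in the project:

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def is_novel(candidate: np.ndarray, accepted: list, max_similarity: float) -> bool:
    """True if the candidate is not too close to any already-accepted embedding."""
    return all(cosine_similarity(candidate, prev) <= max_similarity for prev in accepted)


def filter_novel(embeddings: np.ndarray, max_similarity: float = 0.8) -> list:
    """Return indices of embeddings kept after rejecting near-paraphrases."""
    accepted = []
    kept_indices = []
    for i, emb in enumerate(embeddings):
        if is_novel(emb, accepted, max_similarity):
            accepted.append(emb)
            kept_indices.append(i)
    return kept_indices


# Toy 2-D "embeddings": the second is nearly parallel to the first, the third is orthogonal.
toy = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
kept = filter_novel(toy)
```

In this toy input the second vector is rejected as a near-duplicate of the first, while the orthogonal third vector is kept.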

OpenUBA Open-source insider threat detection over 32M+ behavioural logs. Five algorithms compared, AUC 0.9923, SHAP/LIME explainability.

📄 Published

How do Transformer Embeddings Represent Compositions? A Functional Analysis Findings of ACL 2025, Vienna — with Nagar, Rawal & Tan

TL;DR: we tested whether transformer models are actually compositional. Ridge regression wins, but plain vector addition is surprisingly competitive. BERT is bad at this. Mistral is not.
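
To make the comparison concrete, here is a toy sketch of the two composition functions on synthetic vectors. This is not the paper's data, models or evaluation: the "compound" embeddings are generated as addition plus a small linear distortion (an assumption made purely for the demo), and ridge regression is fitted in closed form with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 16

# Synthetic constituent embeddings (stand-ins for real word embeddings).
first = rng.normal(size=(n, d))
second = rng.normal(size=(n, d))
X = np.hstack([first, second])

# Assumed ground truth: mostly additive, with a small linear distortion and noise.
additive = np.vstack([np.eye(d), np.eye(d)])
mix = additive + 0.1 * rng.normal(size=(2 * d, d))
compound = X @ mix + 0.1 * rng.normal(size=(n, d))


def mean_cosine(pred: np.ndarray, target: np.ndarray) -> float:
    """Average row-wise cosine similarity between predictions and targets."""
    num = np.sum(pred * target, axis=1)
    den = np.linalg.norm(pred, axis=1) * np.linalg.norm(target, axis=1)
    return float(np.mean(num / den))


def fit_ridge(features: np.ndarray, targets: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: solve (X^T X + alpha I) W = X^T Y."""
    gram = features.T @ features + alpha * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)


train, test = slice(0, 150), slice(150, n)
W = fit_ridge(X[train], compound[train])

ridge_score = mean_cosine(X[test] @ W, compound[test])
addition_score = mean_cosine(first[test] + second[test], compound[test])
print(f"ridge: {ridge_score:.3f}  addition: {addition_score:.3f}")
```

On data built this way the learned map scores higher, but parameter-free addition stays close, which is the flavour of the result summarised above.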

🧰 Tech Stack

🤖 LLMs & Agents

🧠 ML / DL

⚙️ Infra & DevOps

💻 Languages

🏅 Recognition

🏆 BLUE Fellowship — Mila & McGill Building 21 (2026)
🎓 McGill Graduate Excellence Award (2025)
🌐 McCall MacBain Regional Scholarship (2024)
🔬 Mitacs Accelerate Research Scholarship (2024)
🌏 A*Star Singapore International Pre-Graduate Award (2023–24)
🔬 Mitacs Globalink Research Internship Scholarship (2023)

🔭 Currently

Wrapping up M.Sc. at McGill (May 2026)
Building a RAG + agentic AI project — watch this space 👀
Looking for AI Engineer and Applied ML Research roles

Always down for coffee chats, collaborations, and solving interesting problems
mansidhanania@gmail.com
