mostly backend systems and applied ai & ml infrastructure. the kind of stuff that runs in the background and either works or takes everything down with it.
| project | what it does |
|---|---|
| callmind | voice agent platform that handles phone calls end-to-end, routes across llms, transcribes in real time |
| autonomous-ml | mlops system that watches your model in prod, catches drift, and retrains itself without going down |
| prompt-compression | shrinks llm prompts by up to 70% using textrank + bart, tracks token delta per model |
| speculative-decoding | speculative decoding with gpt-2 as the target model and distilgpt-2 as the draft model |
| model-distillation | resnet18 teacher → 43x smaller student, 4.3x faster inference |
| kv-cache | key-value cache from scratch |
| kernel-suite | custom cuda kernels |
| search-ranking | learning-to-rank search engine with a pairwise ranknet model, served over http in go |
| streaming-token | streaming token generation over websockets |
| neural-ode | neural ordinary differential equations in jax for modeling continuous-time dynamics |
| scale-churn | churn prediction pipeline on hadoop |
| researcher | multi-agent system with analyst, writer, and reviewer agents generating full reports |
- ml: pytorch, tensorflow, jax, cuda, huggingface, scikit-learn, opencv, langchain, crewai
- backend: python, go, fastapi, spring boot, websockets
- data: postgresql, mongodb, redis, mysql
- infra: docker, kubernetes, gcp, aws, linux



