Open for freelance & contract projects. DM me.

I build and ship production AI. Real-time computer vision, LLM agents, and the infrastructure that keeps them reliable.

I'm Arush, an AI/ML engineer. Most of my work lives where a model meets a real system — vision pipelines running under 200ms, RAG agents over thousands of documents, all deployed and monitored on GCP and AWS.


6×: hackathon wins
10K+: images/day in prod
<200ms: live vision latency

Selected work

TraceRAG

2026

Enterprise GraphRAG observability

→ ~90% lower latency (4.5s → <500ms)

Live demo

The problem. Root-cause analysis in distributed systems is slow. When something breaks, engineers end up tracing service dependencies by hand to work out what a given bug affected.

What I built. An observability platform built on a custom GraphRAG engine that maps service dependencies and automates root-cause analysis. A hybrid LLM router classifies intent using Groq Llama-3.1-8B and GLiNER for zero-shot NER, backed by LadybugDB, a custom multi-hop graph database that runs batched Cypher traversals. The React Flow frontend visualizes the blast radius of a bug across GitHub PRs and Jira tickets, with plain-English summaries.

Batched Cypher queries in LadybugDB eliminate N+1 bottlenecks
Sustained 13 req/s with zero errors under a 25-way concurrent load test
Interactive blast-radius view spanning GitHub PRs and Jira tickets
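A minimal sketch of why batching per hop removes the N+1 pattern. This is not TraceRAG's actual code: `FakeGraph`, its two methods, and the `EDGES` data are hypothetical stand-ins for the graph database, used only to count round trips.

```python
# Hypothetical dependency data: each service maps to the services that depend on it.
EDGES = {
    "auth": ["api", "billing"],
    "api": ["web", "mobile"],
    "billing": ["web"],
    "web": [],
    "mobile": [],
}


class FakeGraph:
    """In-memory stand-in for a graph store that counts round trips."""

    def __init__(self, edges):
        self.edges = edges
        self.round_trips = 0

    def dependents_of(self, node):
        # One round trip per node: this is the N+1 pattern.
        self.round_trips += 1
        return list(self.edges.get(node, []))

    def dependents_of_many(self, nodes):
        # One round trip for a whole frontier, like a single query
        # that takes a list parameter.
        self.round_trips += 1
        return {n: list(self.edges.get(n, [])) for n in nodes}


def blast_radius_naive(graph, start):
    """Traverse node by node, issuing one lookup per visited node."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for dep in graph.dependents_of(node):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def blast_radius_batched(graph, start):
    """Traverse hop by hop, issuing one lookup per hop."""
    seen, frontier = set(), [start]
    while frontier:
        next_frontier = []
        for deps in graph.dependents_of_many(frontier).values():
            for dep in deps:
                if dep not in seen:
                    seen.add(dep)
                    next_frontier.append(dep)
        frontier = next_frontier
    return seen
```

On this toy graph both functions return the same set of affected services, but the batched version needs one round trip per hop rather than one per node.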

Python · FastAPI · React Flow · Groq Llama-3 · GLiNER

Bodhi

2026

AI mock-interview platform

→ Sub-millisecond session latency at scale

Live demo

The problem. Interview prep tends to be either expensive coaching or a static question bank. Neither feels realistic, offers meaningful feedback, or scales.

What I built. A voice-first interview platform with a 6-phase adaptive pipeline, 4-dimensional scoring, and live proctoring via FaceNet512, MediaPipe, and YOLOv8. Session state moves through a 3-tier store: in-memory first, then Redis, then PostgreSQL.

RAG over 10K+ document chunks keeps follow-up questions grounded
Sarvam AI handles STT/TTS, and a gamification engine adds XP and 7 rank tiers
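A rough sketch of how a 3-tier session store can behave, not Bodhi's implementation. Plain dicts stand in for Redis and PostgreSQL, and the write-through and promote-on-read policies are assumptions made for illustration.

```python
class TieredSessionStore:
    """Three tiers checked fastest-first; dicts stand in for real backends."""

    def __init__(self):
        self.memory = {}    # tier 1: process-local cache
        self.redis = {}     # tier 2: stand-in for Redis
        self.postgres = {}  # tier 3: stand-in for PostgreSQL (durable)

    def save(self, session_id, state):
        # Assumed policy: write through to every tier so the durable
        # copy is never behind the caches.
        for tier in (self.memory, self.redis, self.postgres):
            tier[session_id] = dict(state)

    def load(self, session_id):
        tiers = [
            ("memory", self.memory),
            ("redis", self.redis),
            ("postgres", self.postgres),
        ]
        for index, (name, tier) in enumerate(tiers):
            if session_id in tier:
                state = tier[session_id]
                # Promote the hit into every faster tier for next time.
                for _, faster in tiers[:index]:
                    faster[session_id] = dict(state)
                return state, name
        return None, None
```

After a process restart the in-memory tier is empty, so the first `load` is served from the Redis stand-in and later ones come from memory again.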

LangGraph · FastAPI · PostgreSQL · Redis · YOLOv8 · MediaPipe

AltEgo

2026

Parallel-universe AI for X profiles

→ 4 validated personas from one LLM call

Live demo · Source

The problem. A viral AI toy is easy to demo and hard to keep working — external APIs rate-limit, keys go missing, and results have to unfurl cleanly when someone shares them.

What I built. AltEgo turns any public X/Twitter handle into four exaggerated 'parallel-universe' versions of the account. It fetches recent tweets and runs them through a single Groq call (Kimi K2, with an automatic Llama 3.3 70B fallback) that returns validated JSON for all four personas — each a different archetype from a pool of 20, with a parody handle, tweets that mimic the person's real writing style, vibe stats, and an absurd engagement projection. Every external dependency degrades gracefully: with no API keys it runs a fully functional demo mode on deterministic sample data, so local dev and CI need zero setup. Results are cached per handle in Upstash Redis, and each one gets a short URL with a dynamically generated Open Graph image so it previews cleanly on X.

Graceful degradation: full demo mode with zero API keys, real APIs swap in when configured
Per-result Open Graph images via next/og, plus downloadable 4-in-1 share cards
Custom hand-drawn 'sketchbook' design system — SVG doodles, marker type, Framer Motion
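The fallback-and-validate flow described above can be sketched roughly as follows. AltEgo itself is built with Next.js and TypeScript, so this Python version is illustrative only: `REQUIRED_KEYS`, the provider callables, and the choice to serve demo data when every provider fails are all assumptions.

```python
import json

# Hypothetical minimal schema; the real persona objects carry more fields.
REQUIRED_KEYS = {"archetype", "handle", "tweets"}


def validate_personas(raw):
    """Parse model output and insist on exactly four well-formed personas."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, list) or len(data) != 4:
        raise ValueError("expected a list of four personas")
    for persona in data:
        if not isinstance(persona, dict) or REQUIRED_KEYS - set(persona):
            raise ValueError("persona is missing required fields")
    return data


def generate_personas(tweets, providers, demo_personas):
    """Try each (name, callable) provider in order; fall back to demo data.

    An empty provider list models the no-API-keys case, so the loop is
    skipped and the deterministic demo data is returned.
    """
    for name, call_model in providers:
        try:
            return validate_personas(call_model(tweets)), name
        except (ValueError, RuntimeError):
            # Invalid output or a provider error: try the next one.
            continue
    return demo_personas, "demo"
```

Validating before returning means a malformed response from the primary model is treated the same way as an outage, so the fallback model gets a chance to answer.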

Next.js 15 · TypeScript · Groq · Upstash Redis · next/og · Framer Motion

FlowState

2026

Realtime collaborative whiteboard

→ In active development

Source

The problem. Multiplayer whiteboards like Figma and Excalidraw depend on conflict-free realtime sync and clean recovery after a dropped connection. The hard part is the backend, not the canvas.

What I built. A backend-first collaborative whiteboard built around an immutable operation log, snapshot-based replay, per-room version ordering, and websocket broadcast — the same primitives behind Figma and Excalidraw multiplayer. I designed the system before implementing it, with 18+ diagrams covering architecture, schema, sequence flows, reconnect recovery, and a Redis pub/sub scaling path.

Cursors and presence are kept off the persisted write path to keep the op-log clean
Scaling path designed from single-instance to Redis pub/sub
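A small sketch of the op-log primitives named above: append-only operations, per-room version ordering, snapshot-based replay, and reconnect catch-up. FlowState is still in development, so the `Op` and `Room` shapes here are hypothetical and not taken from its schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Op:
    """One immutable log entry; props=None means the shape was deleted."""
    version: int
    shape_id: str
    props: Optional[dict]


@dataclass
class Room:
    ops: list = field(default_factory=list)
    snapshot_version: int = 0
    snapshot_state: dict = field(default_factory=dict)

    def append(self, shape_id, props):
        # Versions are assigned per room, in arrival order, starting at 1.
        op = Op(len(self.ops) + 1, shape_id, props)
        self.ops.append(op)
        return op

    def replay(self):
        # Start from the latest snapshot and apply only the ops after it.
        state = dict(self.snapshot_state)
        for op in self.ops[self.snapshot_version:]:
            if op.props is None:
                state.pop(op.shape_id, None)
            else:
                state[op.shape_id] = op.props
        return state

    def take_snapshot(self):
        self.snapshot_state = self.replay()
        self.snapshot_version = len(self.ops)

    def ops_since(self, version):
        # Reconnect recovery: the client sends its last seen version
        # and receives everything it missed, in order.
        return self.ops[version:]
```

Presence and cursor updates would never call `append` in this model, which is what keeps them off the persisted write path.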

FastAPI · WebSockets · PostgreSQL · React · Konva

SmartFlow

2025

RL traffic optimization

→ 30% lower average wait time

Live demo

The problem. Fixed-timer traffic signals don't respond to real congestion, so an intersection can sit red over an empty lane while traffic backs up beside it.

What I built. A real-time vision pipeline tracks vehicles across 4-lane intersections, and a reinforcement-learning agent uses the live counts to adjust signal timing dynamically.

94% vehicle detection accuracy at 30 FPS
Analyzes 500+ traffic patterns an hour, improving flow efficiency by 40%

YOLOv8 · DeepSORT · Reinforcement Learning

Applyzer

2025

AI job-application automation

→ 100+ applications tracked end-to-end

The problem. Applying to jobs manually is slow and repetitive, and the applications are rarely tailored to each posting.

What I built. Applyzer generates tailored LaTeX resumes, cover letters, and cold emails, with an async scheduler that polls Gmail and sends context-aware follow-ups. A human stays in the loop throughout.

Hybrid scoring: TF-IDF 40%, keyword 35%, tech-match 25%
Redis-cached sub-second recommendations
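The hybrid score is a weighted sum with the weights listed above. A minimal sketch, assuming each component has already been normalized to the range 0 to 1; the component names and the normalization are assumptions, not details from Applyzer's code.

```python
# Weights from the project description; they sum to 1.0.
WEIGHTS = {"tfidf": 0.40, "keyword": 0.35, "tech_match": 0.25}


def hybrid_score(components):
    """Combine per-signal scores (each assumed to be in [0, 1]) into one score."""
    for name in WEIGHTS:
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

For example, component scores of 0.8 (TF-IDF), 0.6 (keyword), and 1.0 (tech match) give 0.32 + 0.21 + 0.25 = 0.78.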

FastAPI · PostgreSQL · Redis · LangGraph · Groq LLaMA 3

ATS Scorer

2025

Semantic resume matching

→ Deployed full-stack on AWS

The problem. Manual resume screening is slow and prone to bias, and strong candidates often get filtered out over keyword mismatches.

What I built. A full-stack tool that scores resumes against a job description using SentenceTransformers for semantic matching, with LLaMA 3 generating feedback on what's missing.

Semantic matching rather than simple keyword overlap
Groq generates the feedback; MongoDB tracks applications
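Semantic matching of this kind usually comes down to comparing embedding vectors. The sketch below shows cosine similarity in plain Python, assuming the resume and job-description embeddings have already been produced by an embedding model; the choice of cosine similarity is an assumption, since the project description does not name the metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat it as having no similarity.
    return dot / norm if norm else 0.0


def rank_resumes(job_vector, resume_vectors):
    """Return (resume_id, score) pairs, best match first."""
    scored = [
        (resume_id, cosine_similarity(job_vector, vector))
        for resume_id, vector in resume_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the comparison happens in embedding space, two texts can score highly without sharing exact keywords, which is the point of moving past keyword overlap.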

FastAPI · React · Groq LLaMA 3 · SentenceTransformers · AWS · MongoDB

AeroMix

2024

Gesture-based audio control

→ 96% recognition across 12 commands

Source

The problem. Hands-free audio control usually means wearables or extra hardware. The goal here was to do it with just a webcam.

What I built. A computer-vision system that maps hand gestures to audio commands in real time, using MediaPipe landmarks and lightweight ML classifiers.

Real-time at 30 FPS with <100ms latency

MediaPipe · OpenCV · ML classifiers

About

What interests me most is the engineering around a model, not just the model itself. Reaching 94% accuracy is the straightforward part. Making it run at 30 frames per second on a live video feed, behind an API, under 200 milliseconds, and reliably enough for real users is where the real work is.

Most of what I've built sits at the boundary between a model and the system around it: voice interview pipelines, real-time proctoring, RAG agents that route across multiple LLM providers, and the Redis and Postgres infrastructure that keeps them fast. I've also spent time teaching, including an 8-week NLP course and workshops for a few hundred people.

Agentic systems & RAG

LLM workflows that retrieve their own context and act on it, returning sub-second answers over 10K+ document chunks and routing across multiple providers when one model isn't enough.

LangGraph · LangChain · Groq LLaMA 3 · pgvector · Redis

Real-time computer vision

Vision models that run live on video rather than recorded clips — 94% vehicle detection at 30 FPS, with gaze tracking and proctoring under 200ms.

YOLOv8 · DeepSORT · MediaPipe · OpenCV · WebRTC

ML in production

Containerized services on GCP and AWS handling 10K+ images a day, with monitoring that surfaces problems before users run into them.

GCP Cloud Run · AWS · Docker · FastAPI

Backend & data plumbing

The infrastructure that makes AI reliable rather than a demo — websockets, vector stores, pub/sub, and 3-tier caching.

FastAPI · PostgreSQL · Redis Pub/Sub · MongoDB

Winner

Built an RL traffic system that cut wait time by 30%, beating 200+ teams at Hackers Playground.

1st place

First place at the IIIT-Delhi Ideathon for an AI-driven system architecture.

Finalist ×5

Reached the finals of 5 national hackathons, placing in the top 1% of 500+ teams across India.

Production

Shipped a pipeline processing 10K+ aerial images a day at sub-200ms latency.

Teaching

Designed and taught an 8-week NLP course with a 90% completion rate.

Community

Ran AI/ML hackathons and workshops, mentoring 100+ participants.

Experience

May 2026 – Present

Applied AI Intern · Gravity AI

I prototype and ship features across the AI stack — prompting, retrieval, evaluation, and lightweight model integrations, along with backend work where it's needed. Much of the role is experimental: building and labeling datasets, defining success metrics, and turning the results into production code. The throughline is making the agents more reliable for real users, working closely with product and engineering to ship on schedule.

LLMs · RAG · Evaluation · Python · FastAPI

Feb 2026 – Apr 2026

AI Engineer · HiTouchCX (REBOO8)

Built four production AI systems, all deployed on GCP: live proctoring, a RAG support agent, a voice interviewer, and a LangGraph assessment engine.

YOLOv8 · DeepSORT · MediaPipe · LangGraph · FastAPI · Redis

Jun 2025 – Present

ML DevOps Trainee · Kanchan Drones India

Built 5+ ML pipelines processing 10,000+ aerial images a day, plus a Unity desktop app that gives 15+ drone operators real-time imaging. The pipelines cut manual intervention by 65%.

OpenCV · WebSockets · Unity · ML Pipelines

Jun 2025 – Sep 2025

AI/ML Intern · Circus Theory · Dubai

Built edge ML pipelines across 8+ devices, reaching 92% accuracy on 50GB+ datasets with 45% lower latency.

YOLOv8 · TensorFlow · PyTorch · ESP32

Nov 2025 – Dec 2025

AI/ML Instructor · Apna College

Designed and taught an 8-week NLP curriculum covering everything from tokenization to transformers, with a 90% completion rate across 20+ students.

NLP · Transformers · Curriculum

Sep 2024 – Oct 2024

Credit Risk Modeling Intern · Peak2Tails

Built Basel-compliant credit scorecards for 5,000+ users, improving prediction accuracy by 22% and cutting prep time by 40%.

Python · Statistical Modeling · Pandas

Writing

The Village Doesn't Scale

What a growing Indian city can teach you about system design — how settlements evolve through the same architectural stages as software.

Jul 2, 2026 · 18 min read

Contact

Working on something that needs to ship?

Open for freelance & contract projects — DM me. Email or any of the links below works, and I read everything that comes in.

arushkarnatak1881@gmail.com