Open for freelance & contract projects — DM me

I build and ship production AI. Real-time computer vision, LLM agents, and the infrastructure that keeps them reliable.

I'm Arush, an AI/ML engineer. Most of my work lives where a model meets a real system — vision pipelines running under 200ms, RAG agents over thousands of documents, all deployed and monitored on GCP and AWS.

hackathon wins
10K+ images/day in prod
<200ms live vision latency

01

Selected work

TraceRAG

2026

Enterprise GraphRAG observability

~90% lower latency (4.5s → <500ms)

The problem. Root-cause analysis in distributed systems is slow. When something breaks, engineers end up tracing service dependencies by hand to work out what a given bug affected.

What I built. An observability platform built on a custom GraphRAG engine that maps service dependencies and automates root-cause analysis. A hybrid LLM router classifies intent using Groq Llama-3.1-8B and GLiNER for zero-shot NER, backed by LadybugDB, a custom multi-hop graph database that runs batched Cypher traversals. The React Flow frontend visualizes the blast radius of a bug across GitHub PRs and Jira tickets, with plain-English summaries.

  • Batched Cypher queries in LadybugDB eliminate N+1 bottlenecks
  • Sustained 13 req/s with zero errors under a 25-way concurrent load test
  • Interactive blast-radius view spanning GitHub PRs and Jira tickets
Python · FastAPI · React Flow · Groq Llama-3 · GLiNER
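
To make the N+1 point concrete, here is a minimal sketch of a batched multi-hop traversal: one round trip per hop instead of one per service. It is an illustration under stated assumptions, not the TraceRAG code. `FakeGraph`, `blast_radius`, the `Service` label, and the `DEPENDS_ON` relationship are all hypothetical names, and the in-memory graph stands in for the real database.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical Cypher for one batched hop. The label, relationship name,
# and property are assumptions for illustration, not the real schema.
BATCHED_HOP_QUERY = """
UNWIND $ids AS id
MATCH (s:Service {id: id})<-[:DEPENDS_ON]-(d:Service)
RETURN id AS source, d.id AS dependent
"""


class FakeGraph:
    """In-memory stand-in for a graph database that counts round trips."""

    def __init__(self, dependents: Dict[str, List[str]]) -> None:
        self.dependents = dependents
        self.round_trips = 0

    def dependents_of(self, ids: Iterable[str]) -> Set[str]:
        # One call models one batched query covering every id in the frontier.
        self.round_trips += 1
        found: Set[str] = set()
        for node in ids:
            found.update(self.dependents.get(node, []))
        return found


def blast_radius(graph: FakeGraph, start: str, max_hops: int = 3) -> Set[str]:
    """Collect every service reachable from `start` within `max_hops` hops."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        if not frontier:
            break
        # Expand the whole frontier in a single batched lookup.
        frontier = graph.dependents_of(frontier) - seen
        seen |= frontier
    return seen - {start}


graph = FakeGraph({
    "db": ["api", "worker"],
    "api": ["web", "mobile"],
    "worker": ["web"],
})
affected = blast_radius(graph, "db")
```

In this toy graph the traversal touches five services but issues only three lookups, one per hop. A per-node loop would issue one lookup for every service visited.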

Bodhi

2026

AI mock-interview platform

Sub-millisecond session latency at scale

The problem. Interview prep tends to be either expensive coaching or a static question bank. Neither feels realistic, offers meaningful feedback, or scales.

What I built. A voice-first interview platform with a 6-phase adaptive pipeline and 4-dimensional scoring, with live proctoring via FaceNet512, MediaPipe, and YOLOv8. Session state moves through a 3-tier store, from in-memory to Redis to PostgreSQL.

  • RAG over 10K+ document chunks keeps follow-up questions grounded
  • Sarvam AI for STT/TTS, with a gamification engine spanning XP and 7 rank tiers
LangGraph · FastAPI · PostgreSQL · Redis · YOLOv8 · MediaPipe
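
A 3-tier session store can be sketched as a read-through cache: check the fastest tier first, fall back to slower ones, and copy a hit back up. The version below is a sketch under assumptions rather than the Bodhi implementation. `TieredSessionStore` is a hypothetical name, plain dicts stand in for Redis and PostgreSQL, and the write-through policy in `put` is an assumption.

```python
from typing import Any, List, MutableMapping, Optional


class TieredSessionStore:
    """Read-through store over tiers ordered fastest to slowest."""

    def __init__(self, tiers: List[MutableMapping[str, Any]]) -> None:
        if not tiers:
            raise ValueError("at least one tier is required")
        self.tiers = tiers

    def get(self, session_id: str) -> Optional[Any]:
        for depth, tier in enumerate(self.tiers):
            if session_id in tier:
                value = tier[session_id]
                # Promote the hit into every faster tier for the next read.
                for faster in self.tiers[:depth]:
                    faster[session_id] = value
                return value
        return None

    def put(self, session_id: str, value: Any) -> None:
        # Write-through: every tier sees the update (an assumed policy).
        for tier in self.tiers:
            tier[session_id] = value


# Dicts stand in for the in-memory, Redis, and PostgreSQL tiers.
memory, redis_tier, postgres_tier = {}, {}, {"s1": {"phase": 3}}
store = TieredSessionStore([memory, redis_tier, postgres_tier])
session = store.get("s1")
```

After the first `get`, the session sits in all three tiers, so later reads are served from memory without touching the slower stores.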

FlowState

2026

Realtime collaborative whiteboard

In active development

The problem. Multiplayer whiteboards like Figma and Excalidraw depend on conflict-free realtime sync and clean recovery after a dropped connection. The hard part is the backend, not the canvas.

What I built. A backend-first collaborative whiteboard built around an immutable operation log, snapshot-based replay, per-room version ordering, and websocket broadcast — the same primitives behind Figma and Excalidraw multiplayer. I designed the system before implementing it, with 18+ diagrams covering architecture, schema, sequence flows, reconnect recovery, and a Redis pub/sub scaling path.

  • Cursors and presence are kept off the persisted write path to keep the op-log clean
  • Scaling path designed from single-instance to Redis pub/sub
FastAPI · WebSockets · PostgreSQL · React · Konva
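
The op-log primitives can be shown in a few lines. This is a minimal in-memory sketch, assuming a last-write-wins model per shape. It is not the FlowState schema: `Op`, `RoomLog`, `snapshot_every`, and `ops_since` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Op:
    """One immutable entry in a room's operation log."""
    version: int                     # per-room, assigned in append order
    shape_id: str
    props: Optional[Dict[str, Any]]  # None means the shape was deleted


class RoomLog:
    """Append-only log for one room, with periodic snapshots."""

    def __init__(self, snapshot_every: int = 100) -> None:
        self.ops: List[Op] = []
        self.snapshot_every = snapshot_every
        self.snapshot_version = 0
        self.snapshot_state: Dict[str, Dict[str, Any]] = {}

    def append(self, shape_id: str, props: Optional[Dict[str, Any]]) -> Op:
        # Versions are dense and strictly increasing within the room.
        op = Op(version=len(self.ops) + 1, shape_id=shape_id, props=props)
        self.ops.append(op)
        if op.version % self.snapshot_every == 0:
            # Build the new snapshot from the old one plus the ops since it.
            self.snapshot_state = self.replay()
            self.snapshot_version = op.version
        return op

    def replay(self) -> Dict[str, Dict[str, Any]]:
        """Rebuild current state from the latest snapshot and later ops."""
        state = dict(self.snapshot_state)
        for op in self.ops[self.snapshot_version:]:
            if op.props is None:
                state.pop(op.shape_id, None)
            else:
                state[op.shape_id] = op.props
        return state

    def ops_since(self, version: int) -> List[Op]:
        """Ops a reconnecting client is missing after its last seen version."""
        return self.ops[version:]


log = RoomLog(snapshot_every=2)
log.append("a", {"x": 1})
log.append("b", {"x": 2})
log.append("a", None)
current = log.replay()
missed = log.ops_since(1)
```

A reconnecting client would send its last seen version and receive only the ops after it, which is why cursors and presence staying out of the log keeps replay cheap.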

SmartFlow

2025

RL traffic optimization

30% lower average wait time

The problem. Fixed-timer traffic signals don't respond to real congestion, so an intersection can sit red over an empty lane while traffic backs up beside it.

What I built. A real-time vision pipeline tracks vehicles across 4-lane intersections, and a reinforcement-learning agent uses the live counts to adjust signal timing dynamically.

  • 94% vehicle detection accuracy at 30 FPS
  • Analyzes 500+ traffic patterns an hour, improving flow efficiency by 40%
YOLOv8 · DeepSORT · Reinforcement Learning

Applyzer

2025

AI job-application automation

100+ applications tracked end-to-end

The problem. Applying to jobs manually is slow, repetitive, and rarely tailored to each specific posting.

What I built. A system that generates tailored LaTeX resumes, cover letters, and cold emails, with an async scheduler that polls Gmail and sends context-aware follow-ups. A human stays in the loop throughout.

  • Hybrid scoring: TF-IDF 40%, keyword 35%, tech-match 25%
  • Redis-cached sub-second recommendations
FastAPI · PostgreSQL · Redis · LangGraph · Groq LLaMA 3
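
The hybrid score is a weighted sum of the three components. The sketch below uses the 40/35/25 weights from the bullet above. Everything else is an assumption for illustration, including the function name `hybrid_score` and the premise that each component is already normalized to the range 0 to 1.

```python
# Weights from the hybrid-scoring bullet; the names are hypothetical.
WEIGHTS = {"tfidf": 0.40, "keyword": 0.35, "tech_match": 0.25}


def hybrid_score(tfidf: float, keyword: float, tech_match: float) -> float:
    """Combine three component scores, each in [0, 1], into one score."""
    components = {"tfidf": tfidf, "keyword": keyword, "tech_match": tech_match}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


# A posting with a moderate text match, few shared keywords, full stack match.
score = hybrid_score(tfidf=0.5, keyword=0.2, tech_match=1.0)
```

Because the weights sum to 1, the combined score stays in the same 0 to 1 range as its inputs, which keeps scores comparable across postings.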

ATS Scorer

2025

Semantic resume matching

Deployed full-stack on AWS

The problem. Manual resume screening is slow and prone to bias, and strong candidates often get filtered out over keyword mismatches.

What I built. A scorer that rates resumes against a job description using SentenceTransformers for semantic matching, with LLaMA 3 generating feedback on what's missing.

  • Semantic matching rather than simple keyword overlap
  • Groq generates the feedback; MongoDB tracks applications
FastAPI · React · Groq LLaMA 3 · SentenceTransformers · AWS · MongoDB

AeroMix

2024

Gesture-based audio control

96% recognition across 12 commands

The problem. Getting hands-free audio control from just a webcam, without wearables or extra hardware.

What I built. A computer-vision system that maps hand gestures to audio commands in real time, using MediaPipe landmarks and lightweight ML classifiers.

  • Real-time at 30 FPS with <100ms latency
MediaPipe · OpenCV · ML classifiers

02

About

What interests me most is the engineering around a model, not just the model itself. Reaching 94% accuracy is the straightforward part. Making it run at 30 frames per second on a live video feed, behind an API, under 200 milliseconds, and reliably enough for real users is where the real work is.

Most of what I've built sits at the boundary between a model and the system around it: voice interview pipelines, real-time proctoring, RAG agents that route across multiple LLM providers, and the Redis and Postgres infrastructure that keeps them fast. I've also spent time teaching this material, including an 8-week NLP course and workshops for a few hundred people.

Agentic systems & RAG

LLM workflows that retrieve their own context and act on it, returning sub-second answers over 10K+ document chunks and routing across multiple providers when one model isn't enough.

LangGraph · LangChain · Groq LLaMA 3 · pgvector · Redis

Real-time computer vision

Vision models that run live on video rather than recorded clips — 94% vehicle detection at 30 FPS, with gaze tracking and proctoring under 200ms.

YOLOv8 · DeepSORT · MediaPipe · OpenCV · WebRTC

ML in production

Containerized services on GCP and AWS handling 10K+ images a day, with monitoring that surfaces problems before users run into them.

GCP Cloud Run · AWS · Docker · FastAPI

Backend & data plumbing

The infrastructure that makes AI reliable rather than a demo — websockets, vector stores, pub/sub, and 3-tier caching.

FastAPI · PostgreSQL · Redis Pub/Sub · MongoDB

Winner

Built an RL traffic system that cut wait time by 30%, beating 200+ teams at Hackers Playground.

1st place

First place at the IIIT-Delhi Ideathon for an AI-driven system architecture.

Finalist ×5

Reached the finals of 5 national hackathons, placing in the top 1% of 500+ teams across India.

Production

Shipped a pipeline processing 10K+ aerial images a day at sub-200ms latency.

Teaching

Designed and taught an 8-week NLP course with a 90% completion rate.

Community

Ran AI/ML hackathons and workshops, mentoring 100+ participants.

03

Experience

May 2026 – Present

Applied AI Intern · Gravity AI

I prototype and ship features across the AI stack — prompting, retrieval, evaluation, and lightweight model integrations, along with backend work where it's needed. Much of the role is experimental: building and labeling datasets, defining success metrics, and turning the results into production code. The throughline is making the agents more reliable for real users, working closely with product and engineering to ship on schedule.

LLMs · RAG · Evaluation · Python · FastAPI

Feb 2026 – Apr 2026

AI Engineer · HiTouchCX (REBOO8)

Built four production AI systems, all deployed on GCP: live proctoring, a RAG support agent, a voice interviewer, and a LangGraph assessment engine.

YOLOv8 · DeepSORT · MediaPipe · LangGraph · FastAPI · Redis

Jun 2025 – Present

ML DevOps Trainee · Kanchan Drones India

Built 5+ ML pipelines processing 10,000+ aerial images a day, plus a Unity desktop app that gives 15+ drone operators real-time imaging. The pipelines cut manual intervention by 65%.

OpenCV · WebSockets · Unity · ML Pipelines

Jun 2025 – Sep 2025

AI/ML Intern · Circus Theory · Dubai

Built edge ML pipelines across 8+ devices, reaching 92% accuracy on 50GB+ datasets with 45% lower latency.

YOLOv8 · TensorFlow · PyTorch · ESP32

Nov 2025 – Dec 2025

AI/ML Instructor · Apna College

Designed and taught an 8-week NLP curriculum covering everything from tokenization to transformers, with a 90% completion rate across 20+ students.

NLP · Transformers · Curriculum

Sep 2024 – Oct 2024

Credit Risk Modeling Intern · Peak2Tails

Built Basel-compliant credit scorecards for 5,000+ users, improving prediction accuracy by 22% and cutting prep time by 40%.

Python · Statistical Modeling · Pandas

04

Contact

Working on something that needs to ship?

Open for freelance & contract projects — DM me. Email or any of the links below works, and I read everything that comes in.

arushkarnatak1881@gmail.com