TraceRAG
2026 · Enterprise GraphRAG observability
→ ~90% lower latency (4.5s → <500ms)
The problem. Root-cause analysis in distributed systems is slow. When something breaks, engineers end up tracing service dependencies by hand to work out what a given bug affected.
What I built. An observability platform on a custom GraphRAG engine that maps service dependencies and automates root-cause analysis. A hybrid LLM router classifies intent with Llama-3.1-8B served on Groq and extracts entities with GLiNER (zero-shot NER). Queries run against LadybugDB, a custom multi-hop graph database that executes batched Cypher traversals. A React Flow frontend visualizes a bug's blast radius across GitHub PRs and Jira tickets, with plain-English summaries.
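To make "blast radius" concrete, here is a minimal sketch of a bounded multi-hop traversal over a dependency graph. It is illustrative only: the `blast_radius` function, the in-memory adjacency dict, and the service names are assumptions, not the project's actual engine, which runs its traversals in the graph database.

```python
from collections import deque


def blast_radius(dependents, start, max_hops):
    """Breadth-first walk from `start` over "is depended on by" edges.

    `dependents` maps a service name to the services that depend on it
    (hypothetical in-memory stand-in for the graph database).
    Returns {service: hop_count} for everything reachable within `max_hops`.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # hop budget reached, do not expand further
        for nxt in dependents.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[start]  # the failing service itself is not part of its radius
    return hops


# Hypothetical services: a bug in "auth" affects everything downstream of it.
DEPENDENTS = {
    "auth": ["checkout", "profile"],
    "checkout": ["orders"],
    "orders": ["billing"],
}
affected = blast_radius(DEPENDENTS, "auth", max_hops=2)
```

With a two-hop budget, `billing` (three hops from `auth`) stays outside the reported radius; raising `max_hops` widens the view.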
- Batched Cypher queries in LadybugDB eliminate N+1 bottlenecks
- Sustained 13 req/s with zero errors under a 25-way concurrent load test
- Interactive blast-radius view spanning GitHub PRs and Jira tickets
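The N+1 point in the first bullet can be sketched as follows. This is a hedged illustration, not LadybugDB's API: the queries are generic openCypher against an assumed schema (`Service` nodes, `DEPENDS_ON` edges, a `name` property), and `FakeGraph` is a hypothetical in-memory stand-in that only counts round trips instead of executing Cypher.

```python
# Generic openCypher; the schema (Service, DEPENDS_ON, name) is assumed.
PER_NODE = (
    "MATCH (s:Service {name: $name})<-[:DEPENDS_ON]-(d:Service) "
    "RETURN d.name"
)
BATCHED = (
    "UNWIND $names AS name "
    "OPTIONAL MATCH (s:Service {name: name})<-[:DEPENDS_ON]-(d:Service) "
    "RETURN name, collect(d.name) AS dependents"
)


class FakeGraph:
    """Hypothetical client: answers from a dict and counts round trips."""

    def __init__(self, dependents):
        self.dependents = dependents
        self.round_trips = 0

    def run(self, query, params):
        # The query text is ignored here; a real client would execute it.
        self.round_trips += 1
        names = params["names"] if "names" in params else [params["name"]]
        return {n: list(self.dependents.get(n, [])) for n in names}


def expand_n_plus_one(graph, frontier):
    """One query per frontier node: N round trips for N nodes."""
    result = {}
    for name in frontier:
        result.update(graph.run(PER_NODE, {"name": name}))
    return result


def expand_batched(graph, frontier):
    """One UNWIND query for the whole frontier: a single round trip."""
    return graph.run(BATCHED, {"names": list(frontier)})


DATA = {"auth": ["checkout", "profile"], "checkout": ["orders"]}
FRONTIER = ["auth", "checkout", "orders"]

slow, fast = FakeGraph(DATA), FakeGraph(DATA)
same_answer = expand_n_plus_one(slow, FRONTIER) == expand_batched(fast, FRONTIER)
```

Both paths return the same dependents, but the per-node version costs one round trip per frontier node while the batched version costs one per hop, which is where the savings come from as the frontier grows.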