AI infrastructure, safety systems, and applied research — from KV-cache management for long-context inference, to runtime safety layers for agent tool invocations, to state-of-the-art models for vision and retrieval.
A KV-cache management layer that clusters semantically related tokens and integrates with vLLM's PagedAttention — enabling 128K-context inference on a single 24GB RTX 4090 with <3% accuracy loss.
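The clustering idea can be sketched as a greedy cosine-similarity pass over per-token key vectors. This is an illustrative toy (the function names and the 0.8 threshold are hypothetical); the production system operates on vLLM's paged KV blocks rather than raw Python lists:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_tokens(keys, threshold=0.8):
    """Greedy single-pass clustering: each token key joins the first
    existing cluster whose running centroid is within `threshold`
    cosine similarity, otherwise it starts a new cluster.
    Returns one cluster id per token."""
    centroids, counts, labels = [], [], []
    for k in keys:
        best, best_sim = None, threshold
        for i, c in enumerate(centroids):
            sim = cosine(k, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(k))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], k)]
            counts[best] += 1
            labels.append(best)
    return labels
```

Tokens whose keys cluster together can then be paged in and out as a unit, which is what makes semantic grouping compatible with block-granular paging.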
An autoscaler for vLLM fleets that estimates each request's token budget and dispatches to appropriately sized pools — eliminating OOM preemptions and cutting GPU cost by 61%.
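The routing logic can be sketched as follows. The 4-characters-per-token heuristic, pool names, and capacities are illustrative assumptions, not the gateway's actual estimator:

```python
def estimate_budget(prompt: str, max_new_tokens: int) -> int:
    # Rough token estimate: ~4 characters per token for English text,
    # plus the generation budget the caller requested.
    return len(prompt) // 4 + max_new_tokens

def route(prompt: str, max_new_tokens: int, pools: dict) -> str:
    """Dispatch to the smallest pool whose capacity covers the estimate.
    `pools` maps pool name -> max tokens it can serve without OOM."""
    need = estimate_budget(prompt, max_new_tokens)
    for name, capacity in sorted(pools.items(), key=lambda kv: kv[1]):
        if capacity >= need:
            return name
    return "reject"  # no pool is large enough; shed rather than preempt

# Example: a short request lands on the small pool.
pools = {"small": 4096, "large": 32768}
print(route("hi there", 256, pools))  # -> "small"
```

Rejecting up front is what eliminates mid-generation OOM preemptions: a request that would overflow every pool never starts generating.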
A low-latency MCP middleware that scores every tool invocation against user request + runtime context, blocking policy-violating calls at p99 <8ms.
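A toy rule-based version of the scoring step, purely to show the shape of the check. The blocked-tool list, argument markers, and 0.5 cutoff are hypothetical; the real middleware scores each call against the full user request and runtime context:

```python
# Illustrative deny-list and sensitive-path markers (hypothetical).
BLOCKED_TOOLS = {"shell_exec", "delete_file"}
SENSITIVE_ARG_MARKERS = ("~/.ssh", "/etc/passwd", "api_key")

def score_call(tool: str, args: dict) -> float:
    """Return a risk score in [0, 1]; scores >= 0.5 are blocked."""
    if tool in BLOCKED_TOOLS:
        return 1.0
    risk = 0.0
    for v in args.values():
        if any(m in str(v) for m in SENSITIVE_ARG_MARKERS):
            risk = max(risk, 0.8)
    return risk

def allow(tool: str, args: dict) -> bool:
    # The firewall sits between the agent and the MCP server:
    # only calls that clear the threshold are forwarded.
    return score_call(tool, args) < 0.5
```

Keeping the hot path to set lookups and substring checks like these is what makes a sub-10ms p99 plausible; heavier semantic scoring can run on the slow path.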
Building tools to inspect how language models construct internal world representations. Current research at Elemental Research Lab investigating mechanistic interpretability of world models.
State-of-the-art 99.21% accuracy on Kennedy Space Center hyperspectral data, outperforming 10+ baselines. 12+ citations. Top 3% on the HSI leaderboard.
End-to-end RAG pipeline achieving sub-second query response at 99.1% availability with 500+ concurrent queries. Reduced hallucination by 35%.
Hi, I'm Yugandhar! I'm a Master's student in Computer Science at the University of Southern California, and a research member at Elemental Research Lab where I'm building World Model Lens — tools for inspecting how language models construct internal world representations.
I see the main goal of my work as understanding how neural networks represent and process information internally, and using that understanding to make AI systems safer and more reliable. My core research interests are mechanistic interpretability, world models, and AI safety.
I also build high-performance AI infrastructure — from KV-cache management systems that enable long-context inference on consumer GPUs, to token-budget-aware routing gateways for production LLM fleets, to runtime safety layers for agent tool invocations.
Before USC, I graduated in the top 2% of my class from SRM Institute of Science & Technology with a B.Tech in Computer Science (AI & ML specialization) and a perfect 4.0 GPA. I've published research across NLP, computer vision, and document retrieval — 78 citations, h-index of 5. You can see my papers here.
Previously, I worked as an AI Research Intern at the University of St Andrews (Transformer architectures for historical image classification) and as a Computer Vision Research Intern at Trinity College Dublin (maritime detection systems processing 10TB+ satellite imagery). I also worked as a Software Engineer at HydroMind, building developer tools and scaling testing infrastructure.
I'm currently focused on mechanistic interpretability research — specifically on understanding how language models build internal world representations and what tools we need to inspect them. I'm also building open-source AI infrastructure: IceCache (semantic KV-cache paging for vLLM), a token-budget-aware routing gateway, and STARS (a runtime safety firewall for MCP tool invocations).
I'm targeting AI Engineer and AI Researcher roles. If you're working on interpretability, AI safety, or high-performance ML systems, I'd love to connect — reach out at gogiredd@usc.edu.
Mechanistic interpretability of language models and parameter-efficient deep learning across NLP and computer vision.
Investigating how language models build internal world representations. Building inspection tools for mechanistic interpretability of world models — understanding what concepts LLMs encode, how they compose them, and how to make this process transparent.
Novel Transformer architecture for hyperspectral image classification achieving state-of-the-art 99.21% accuracy. Reduced parameters by 30% (2.1M to 1.47M) while maintaining accuracy, enabling edge deployment.
Combined graph-based extractive methods with Transformer fusion for text summarization, demonstrating synergistic improvements over standalone approaches.
Systematic exploration of parameter-efficient methods for deploying NLP models in resource-constrained settings — reducing compute requirements while preserving performance.
Seven peer-reviewed papers and preprints across NLP, computer vision, and document retrieval — including parameter-efficient deep learning, hyperspectral image classification, and long-document language modeling.
A hybrid CNN-LSTM architecture for solving variable-length character CAPTCHAs — the CNN extracts local visual features, the LSTM models sequential character dependencies.
A hybrid framework pairing graph-based extractive sentence selection with Transformer-based abstractive fusion. Synergistic gains over either approach standalone.
A retrieval pipeline using Sentence Transformers and distributed FAISS over large PDF corpora, with a chunking strategy designed to preserve semantic coherence.
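The coherence-preserving chunking can be illustrated with a minimal sentence-boundary packer with one-sentence overlap. The character budget and overlap count here are illustrative defaults, not the pipeline's tuned values:

```python
import re

def chunk_text(text: str, max_chars: int = 500, overlap_sents: int = 1):
    """Split on sentence boundaries, pack sentences into chunks of
    roughly `max_chars`, and carry the last sentence(s) of each chunk
    into the next one so no claim is cut mid-sentence and local
    context survives the split."""
    sents = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, cur = [], []
    for s in sents:
        if cur and sum(len(x) for x in cur) + len(s) > max_chars:
            chunks.append(" ".join(cur))
            cur = cur[-overlap_sents:]  # overlap preserves local context
        cur.append(s)
    if cur:
        chunks.append(" ".join(cur))
    return chunks
```

Each chunk is then embedded with a Sentence Transformer and indexed in FAISS; because chunks never split a sentence, each embedding describes a coherent span of meaning.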
A systematic study of LoRA, adapters, prefix tuning, distillation, and quantization for deploying NLP models under tight compute budgets.
A two-stage pipeline combining Super-Resolution CNNs with multi-scale Retinex defogging for underwater image enhancement — clearer imagery for marine vision tasks.
A reproducibility study comparing extractive, abstractive, and hybrid dialogue summarization methods on SAMSum and DialogSum, with proposed methodological refinements.
Empirically diagnoses and mitigates the "lost-in-the-middle" failure mode — LLMs underweighting information located in the center of long documents.
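One standard mitigation, shown here as a sketch rather than the paper's exact method, is to reorder retrieved passages so the strongest evidence sits at the edges of the context window, where models attend most:

```python
def edge_reorder(docs_with_scores):
    """Given (doc, relevance_score) pairs, place the highest-scoring
    documents at the start and end of the context and push the weakest
    into the middle, where models attend least (the
    'lost-in-the-middle' effect)."""
    ranked = sorted(docs_with_scores, key=lambda d: d[1], reverse=True)
    front, back = [], []
    for i, d in enumerate(ranked):
        (front if i % 2 == 0 else back).append(d)
    return front + back[::-1]
```

For four passages ranked a > b > c > d, this yields the order a, c, d, b: the two strongest end up at the two positions the model weights most heavily.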
Read MoreResearch, internships, industry, and education.