Rahul Varma

Applied Scientist | ML/AI Engineer

Specializing in Computer Vision, Edge AI, and Production ML Systems. Building intelligent systems that process millions of images and serve real-time predictions at scale.

About Me

I'm a Staff Applied Scientist at RadiusAI, where I build production-ready AI systems that solve real-world problems. With expertise in computer vision, model optimization, and MLOps, I've deployed systems that run at 30+ fps in real-time retail environments and handle 500K+ images monthly.

I hold a Master's degree in Computer Science from Arizona State University (GPA: 4.0) and have contributed to multiple patents in computer vision and assisted checkout systems. My work spans research through production deployment, including model compression, knowledge distillation, and edge computing on resource-constrained devices.

Beyond my work at RadiusAI, I'm passionate about using AI for social good. I'm currently involved with the Microlab Aicacia project at Collaborative Earth, where we're leveraging AI and RAG-based systems to make reforestation knowledge more accessible to practitioners worldwide.

Professional Experience

RadiusAI
Staff Applied Scientist
February 2022 – Present | Bellevue, WA
  • Shopassist Platform: Deployed YOLOv8 and Transformer-based object detection running at 30 fps, reducing customer wait time from 5s to 1s
  • Edge Optimization: Achieved <50ms inference latency on NVIDIA Jetson devices with TensorRT optimization
  • Model Compression: Implemented teacher-student distillation (SAM → YOLOv8-seg), achieving 4% accuracy improvement while reducing model size by 8x
  • Active Learning: Designed model-assisted annotation system reducing manual labeling from 50 to 10 hours per 10K-image batch (80% reduction)
  • Infrastructure: Built Rust-based video streaming pipeline with TensorRT, processing 4+ concurrent HD streams with 99.9% uptime

Amazon
Software Development Engineer Intern
May 2021 – August 2021 | Palo Alto, CA
  • Designed Prometheus target allocation service for OpenTelemetry Operator
  • Improved metric scraping latency by 30% and enabled horizontal scaling across 1000+ service endpoints

Decision Theater at ASU
Data Scientist
May 2020 – May 2021 | Tempe, AZ
  • Analyzed Arizona high school dataset (50K+ students) to identify early dropout risk indicators with 78% accuracy
  • Developed LSTM-based sentiment classifier for product launches, analyzing 100K+ social media posts with 82% accuracy

Featured Projects

QU-Net: Lightweight Region Proposal System

Computer Vision | Edge AI | Binary Segmentation | Master's Thesis

A lightweight U-Net-based binary segmentation model that identifies salient regions in images, enabling efficient deep learning on resource-constrained edge devices by reducing unnecessary computation.

Technical Approach: QU-Net generates binary masks highlighting regions of interest, allowing downstream models to focus computation only on relevant areas instead of processing entire images equally.
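The idea of spending computation only where the mask is active can be sketched in a few lines. This is a minimal illustration, not the thesis code: it assumes the image is split into square tiles and that a tile is processed whenever it contains at least one salient pixel. The names `salient_tiles`, `compute_saved`, and `EXAMPLE_MASK` are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = salient pixel, 0 = background


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive integers."""
    return (a + b - 1) // b


def salient_tiles(mask: Mask, tile: int) -> List[Tuple[int, int]]:
    """Return (top, left) origins of tiles holding at least one salient pixel."""
    height, width = len(mask), len(mask[0])
    selected = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = mask[top:top + tile]
            if any(any(row[left:left + tile]) for row in rows):
                selected.append((top, left))
    return selected


def compute_saved(mask: Mask, tile: int) -> float:
    """Fraction of tiles a downstream model could skip entirely."""
    total = ceil_div(len(mask), tile) * ceil_div(len(mask[0]), tile)
    return 1.0 - len(salient_tiles(mask, tile)) / total


# Hypothetical 4x4 mask with one salient region in the top-left corner.
EXAMPLE_MASK = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(salient_tiles(EXAMPLE_MASK, tile=2))
    print(compute_saved(EXAMPLE_MASK, tile=2))
```

In this toy setup the downstream model would only see the tiles returned by `salient_tiles`; the actual savings depend on how the real mask is consumed, which this sketch does not model.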

Key Features:

  • Binary segmentation for region proposals on edge devices
  • 25% computational reduction on MS COCO dataset
  • 57% computational reduction on Cityscapes dataset
  • Outperforms Dynamic Convolutions baseline

Impact: Enables deployment of sophisticated computer vision models on edge devices with severe resource constraints, making AI more accessible for IoT, mobile, and embedded systems.

Microlab Aicacia

AI for Good | Reforestation | RAG | NLP

Part of Collaborative Earth's initiative to support global reforestation efforts by making domain knowledge more accessible to practitioners worldwide.

Technical Approach: Developed a RAG-based search and retrieval system using vector embeddings to enable practitioners to find relevant reforestation information from curated knowledge sources.
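The retrieval step of such a system can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's code: `embed` is a toy stand-in (normalised word counts) for a real embedding model, the in-memory list stands in for a vector database, and `DOCS` is invented sample text.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for an embedding model: L2-normalised word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    if norm == 0:
        return {}
    return {word: v / norm for word, v in counts.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def retrieve(query: str, docs: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Assemble retrieved passages into a prompt for a generator model."""
    context = "\n".join(f"- {doc}" for _, doc in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge snippets, for illustration only.
DOCS = [
    "native tree seedlings need shade in dry season",
    "soil erosion control with cover crops",
    "budget planning for community nurseries",
]

if __name__ == "__main__":
    print(build_prompt("shade for tree seedlings", DOCS, k=1))
```

The prompt returned by `build_prompt` would then be passed to a language model; that generation step is omitted here.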

Key Features:

  • Scraped and processed data from open-source journals and publicly accessible resources
  • Built vector database with embedding models for semantic search
  • Implemented retrieval augmented generation (RAG) for Q&A functionality
  • Developed automated data pipelines for continuous knowledge curation

Impact: Makes reforestation knowledge accessible to global practitioners, addressing barriers in finding actionable information for specific projects and supporting global ecological regeneration efforts.

IRaaS: Image Recognition as a Service

Cloud Computing | AWS | Auto-Scaling | Real-time Detection

An auto-scalable cloud application providing real-time object detection as a service using AWS infrastructure and a model built on the Darknet neural network framework.

Technical Approach: The system processes real-time video streams from Raspberry Pi devices, automatically scaling cloud resources based on demand to handle concurrent requests efficiently.
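Demand-based scaling of this kind often reduces to a small decision rule. The sketch below assumes the backlog of pending requests (for example, a queue depth) drives the target instance count; the per-instance capacity and the limits are hypothetical parameters, and no AWS calls are made.

```python
import math


def desired_instances(queue_depth: int, per_instance: int = 5,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Target instance count for a given backlog, clamped to fixed limits."""
    if per_instance <= 0:
        raise ValueError("per_instance must be positive")
    needed = math.ceil(queue_depth / per_instance)
    return max(min_instances, min(max_instances, needed))


def scaling_action(current: int, queue_depth: int) -> str:
    """Describe the scale-out or scale-in step needed to reach the target."""
    target = desired_instances(queue_depth)
    if target > current:
        return f"scale-out by {target - current}"
    if target < current:
        return f"scale-in by {current - target}"
    return "hold"


if __name__ == "__main__":
    for depth in (0, 23, 1000):
        print(depth, desired_instances(depth), scaling_action(2, depth))
```

A real controller would typically add a cooldown between actions so that short bursts do not cause instances to be launched and terminated repeatedly.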

Key Features:

  • Real-time object detection on video streams using a Darknet-based model
  • Automatic horizontal scaling (scale-out/scale-in) based on request volume
  • Concurrent request handling with intelligent resource allocation
  • Integration with AWS EC2, S3, and SQS for distributed processing

Impact: Demonstrates scalable cloud architecture for ML inference, enabling cost-effective deployment of object detection services that automatically adapt to varying workloads and demand patterns.

PruneAway: Neural Network Pruning Framework

Model Optimization | PyTorch | Structured Pruning

A neural network pruning framework that reduces model size and computational requirements while maintaining accuracy through intelligent structured pruning techniques.

Technical Approach: Implements L1-norm-based structured pruning on ResNet architectures, systematically removing less important filters and channels to compress models for efficient deployment.
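The core of L1-norm filter pruning can be shown on plain lists. This is a minimal sketch, not the framework's implementation: each filter is a flat list of weights, filters with the smallest L1 norm are dropped, and the matching input channels of the next layer are removed so the shapes stay consistent. The function names and the `ratio` parameter are hypothetical.

```python
from typing import List, Tuple

Filter = List[float]


def l1_norm(weights: Filter) -> float:
    """Sum of absolute weights, used as the importance score of a filter."""
    return sum(abs(w) for w in weights)


def filters_to_prune(filters: List[Filter], ratio: float) -> List[int]:
    """Indices of the lowest-L1-norm filters, covering `ratio` of the layer."""
    n_prune = int(len(filters) * ratio)
    ranked = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]))
    return sorted(ranked[:n_prune])


def prune_layer(filters: List[Filter], next_layer: List[List[float]],
                ratio: float) -> Tuple[List[Filter], List[List[float]]]:
    """Drop weak filters and the input channels that consumed their outputs.

    `next_layer[j][i]` is the weight that filter j of the next layer applies
    to output channel i of this layer (kernels collapsed to one number).
    """
    drop = set(filters_to_prune(filters, ratio))
    kept = [f for i, f in enumerate(filters) if i not in drop]
    next_kept = [[w for i, w in enumerate(row) if i not in drop]
                 for row in next_layer]
    return kept, next_kept


if __name__ == "__main__":
    layer = [[0.5, -0.5], [0.01, 0.02], [1.0, 2.0], [-0.1, 0.1]]
    following = [[10.0, 11.0, 12.0, 13.0], [20.0, 21.0, 22.0, 23.0]]
    print(prune_layer(layer, following, ratio=0.5))
```

In practice the pruned network would then be fine-tuned to recover accuracy, and the prune and fine-tune steps repeated.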

Key Features:

  • Structured pruning using L1Strategy on convolutional layers
  • Dependency-aware pruning that maintains model integrity
  • Filter ranking based on activation outputs and gradients
  • Iterative pruning with fine-tuning for accuracy preservation

Impact: Reduces model size and inference time while preserving accuracy, enabling deployment of deep learning models on resource-constrained devices and lowering computational costs in production environments.

Patents & Publications

My research and development work has resulted in multiple granted patents in computer vision and AI-assisted retail systems, focusing on real-time object detection, edge computing optimization, and intelligent checkout solutions.

📜 US12272217B1 (Granted)
Automatic item identification during assisted checkout based on visual features
A system that automatically identifies retail items at checkout using computer vision and deep learning, enabling fast and accurate product recognition without manual barcode scanning.

🏪 Point of sale station for assisted checkout system
An intelligent point-of-sale architecture that integrates AI-powered item detection with traditional retail systems, streamlining the checkout process and improving customer experience.

🔍 Adaptive region-scale proposing for object recognition
A novel approach to dynamically adjust region proposal scales in object detection models, improving accuracy for objects of varying sizes while maintaining computational efficiency.

📚 A simple, efficient and innovative biometric human identification using weighted thresholding and KNN
Early research in biometric identification systems using machine learning techniques, exploring efficient algorithms for accurate human recognition.

Technical Skills

🤖 ML/AI Frameworks

PyTorch | TensorFlow | Scikit-learn | Hugging Face | ONNX | TensorRT

👁️ Computer Vision

YOLOv5/v8 | Mask R-CNN | SAM | OpenCV | Detectron2

⚙️ MLOps & Infrastructure

Docker | Kubernetes | AWS | Apache Airflow | MLflow

💻 Programming Languages

Python | Rust | Go | SQL | C++

📊 Data & Databases

PostgreSQL | MongoDB | Snowflake | Pandas | NumPy | Apache Spark

🔧 Specializations

Edge AI | Model Compression | RAG Systems | Real-time Systems | Active Learning | Knowledge Distillation

Blog & Writing

I share insights on machine learning, computer vision, and production AI systems on Medium. My writing explores practical techniques, lessons learned from deploying ML at scale, and emerging trends in AI research and engineering.

Binary Neural Networks — Future of low-cost neural networks?

Published in Towards Data Science

Exploring how binary neural networks can bridge the gap between research and production ML by reducing memory and computation costs. Learn about straight-through estimators, XNOR operations, and how 1-bit networks can run efficiently on edge devices.
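The XNOR trick mentioned above can be illustrated without any libraries. This is a small sketch under simple assumptions, not code from the article: values are binarised to +1/-1 by sign, +1 is stored as bit 1 and -1 as bit 0, and the dot product is recovered as twice the number of agreeing bits minus the vector length. Only inference-time arithmetic is shown; training with a straight-through estimator is not covered.

```python
from typing import List


def binarize(values: List[float]) -> List[int]:
    """Map real values to +1 / -1 by sign (zero is treated as +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack(signs: List[int]) -> int:
    """Pack a +1/-1 vector into an integer, one bit per element."""
    bits = 0
    for i, s in enumerate(signs):
        if s > 0:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed +1/-1 vectors of length n via XNOR + popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # bit is 1 where the signs match
    return 2 * bin(agree).count("1") - n


if __name__ == "__main__":
    a = binarize([0.3, -1.2, 0.0, 2.5])
    b = [1, 1, -1, 1]
    print(xnor_dot(pack(a), pack(b), len(a)))
    print(sum(x * y for x, y in zip(a, b)))  # reference dot product
```

Both printed values should agree, since each matching sign contributes +1 and each mismatch contributes -1 to the dot product.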

Statistical analysis on a dataset you don't understand

Published in Towards Data Science

A practical guide to analyzing unknown datasets using statistical techniques. Covers QQ plots, normality tests, correlation analysis, and feature engineering, reaching 99.6% accuracy on a Wells Fargo competition dataset with zero domain knowledge.

Variable-sized Video Mini-batching

Published in Towards Data Science

A novel batching algorithm for training models on videos with unequal frame counts. Solves the challenge of processing variable-length videos without trimming, essential for action recognition and autonomous driving applications.

Let's Connect

I'm always interested in discussing new opportunities, collaborations, or just chatting about AI and machine learning.