Pallavi Miriyala
Building intelligent systems for production-scale AI

AI/ML Engineer

Designing intelligent systems that combine AI, infrastructure, and production-scale engineering — from multimodal inference to autonomous agents.

pallavi@ai-systems ~ zsh
CUDA connected
Qwen-VL initialized
Vector DB online
API gateway active
AI Pipeline · Live (running)
Data Input → GPU Inference → Embedding → Vector Search → LLM Reasoning → API Response
Scenes Processed: Movie Intelligence Pipeline
Embeddings Indexed: FAISS vector search
Sub-second Retrieval: Vector similarity search
GPU-Accelerated: ONNX Runtime · CUDA

Systems I Build

Production-focused AI engineering.

01

AI Pipelines

Production-grade ML workflows, multimodal inference, embedding systems, and automated intelligence pipelines at scale.

02

Computer Vision

GPU-accelerated inference with ONNX Runtime, YOLO, RetinaFace, InsightFace, and large-scale scene analysis systems.

03

Agentic AI

Autonomous reasoning systems using ReAct workflows, LLM orchestration, tool-calling, and intelligent multi-step automation.
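
As a rough illustration of the ReAct-style tool-calling loop mentioned above, here is a minimal sketch. It assumes a scripted stand-in for the LLM; the names `scripted_policy`, `word_count`, `TOOLS`, and `react` are hypothetical and do not come from any of the projects on this page.

```python
from typing import Callable, Dict, List, Tuple

# (kind, tool_name, argument); kind is "call" or "finish".
Step = Tuple[str, str, str]


def word_count(text: str) -> str:
    """Toy tool: count whitespace-separated words."""
    return str(len(text.split()))


# Hypothetical tool registry; a real agent would expose richer tools.
TOOLS: Dict[str, Callable[[str], str]] = {"word_count": word_count}


def scripted_policy(question: str, observations: List[str]) -> Step:
    """Stand-in for an LLM: call one tool, then finish using its result."""
    if not observations:
        return ("call", "word_count", question)
    return ("finish", "", f"The question has {observations[-1]} words.")


def react(question: str,
          policy: Callable[[str, List[str]], Step] = scripted_policy,
          tools: Dict[str, Callable[[str], str]] = TOOLS,
          max_steps: int = 4) -> str:
    """Alternate between deciding (policy) and acting (tool call) until done."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, name, argument = policy(question, observations)
        if kind == "finish":
            return argument
        if name not in tools:
            # Feed the error back so the policy can recover on the next step.
            observations.append(f"unknown tool: {name}")
            continue
        observations.append(tools[name](argument))
    raise RuntimeError("step budget exhausted before the policy finished")
```

Swapping `scripted_policy` for a function that prompts a real model keeps the loop unchanged, and the step budget guards against a policy that never finishes.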

04

Cloud Infrastructure

AWS deployments, Dockerized services, PostgreSQL at scale, Cloudflare networking, CI/CD, and MLflow tracking.

Engineering Case Studies

Systems designed for real-world scale.

Production-grade AI — built, deployed, and running.

01 / pipeline
Video Ingestion
Face Detection (RetinaFace)
ONNX Runtime GPU Inference
Embedding → FAISS Index
Qwen VL Scene Description
FastAPI Production Layer
Engineering Challenge

Resolved production-critical CUDA/TensorRT runtime conflicts (nvinfer_10.dll) — diagnosed provider incompatibilities and migrated to CUDAExecutionProvider, restoring full GPU inference at 360K+ scene scale.
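
One way to pin ONNX Runtime to the CUDA provider, in the spirit of the migration described above, is to pass an explicit provider list when creating the session. The sketch below is an assumption about how that could look, not the pipeline's actual code; `pick_providers`, `create_session`, and the preference order are hypothetical.

```python
from typing import Iterable, List

# Assumed preference order: CUDA first, CPU as fallback.
# TensorrtExecutionProvider is deliberately left out of the list.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(available: Iterable[str],
                   preferred: Iterable[str] = PREFERRED) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    available_set = set(available)
    chosen = [name for name in preferred if name in available_set]
    if not chosen:
        raise RuntimeError("no usable ONNX Runtime execution provider found")
    return chosen


def create_session(model_path: str):
    """Build an InferenceSession with an explicit provider list.

    Needs the onnxruntime (or onnxruntime-gpu) package; it is imported
    lazily so pick_providers can be used and tested without it.
    """
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Listing providers explicitly means a missing or mismatched TensorRT library is never loaded, while the CPU entry keeps the service running if CUDA is unavailable.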

Production ML System · Mango Mass Media

Movie Scene Intelligence Pipeline

End-to-end AI pipeline processing 360,000+ movie scenes with automated face detection, recognition, and scene classification. Powers dam-studio-master — a React/Node.js digital asset management platform serving processed media to production.

Qwen VL · FAISS · ONNX Runtime · RetinaFace · PostgreSQL · AWS RDS · FastAPI
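
To make the embedding retrieval step concrete, here is a brute-force cosine-similarity search in plain Python. It is a stand-in for what a FAISS index does far more efficiently, not FAISS itself; the function names and the toy vectors in the usage note are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query: Sequence[float],
          index: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k entries most similar to the query, best match first."""
    q = normalize(query)
    scored = []
    for key, vec in index.items():
        v = normalize(vec)
        scored.append((key, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

For example, `top_k([1.0, 0.1], {"a": [1, 0], "b": [0, 1], "c": [1, 1]}, k=2)` ranks `"a"` first and `"c"` second. This linear scan costs one comparison per stored vector, which is the cost a dedicated vector index is meant to avoid at large scale.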
02 / pipeline
Production Data Stream
Statistical Drift Detection
LLM Decision Agent
Root Cause Analysis
Adaptive Retraining
MLflow Version Tracking
Engineering Challenge

Designed modular architecture compatible with TensorFlow, PyTorch, and Scikit-learn — ensuring the agent deploys into any existing MLOps stack without changes to the pipeline.

Autonomous MLOps Agent

DriftGuard AI

Intelligent monitoring system that detects data drift and concept drift, automatically triggers adaptive retraining, and uses an LLM-powered Decision Agent (GPT / Claude) to reason about root causes and explain every remediation action in plain language.

Python · Scikit-learn · GPT-4 · Claude · MLflow · Statistical Drift Detection
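
The page does not say which statistical test DriftGuard AI uses. As one common option, the sketch below computes a two-sample Kolmogorov-Smirnov statistic between a reference window and a current window; the `threshold` value and the function names are assumptions chosen for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of the two samples."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # those points is enough to find the maximum difference.
    for x in ref + cur:
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(f_ref - f_cur))
    return max_gap


def has_drifted(reference: Sequence[float],
                current: Sequence[float],
                threshold: float = 0.3) -> bool:
    """Flag drift when the KS statistic exceeds an assumed threshold."""
    return ks_statistic(reference, current) > threshold
```

A production monitor would normally derive the cutoff from a p-value or from historical variation rather than a fixed constant.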
Multi-Agent MLOps

auto-ml-pipeline-agent

Autonomous 4-agent system (Monitor → Diagnose → Strategize → Execute) for self-healing ML pipelines with real-time metrics collection and AI-driven optimization.

Python · Multi-Agent · LLM · MLOps
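
The Monitor → Diagnose → Strategize → Execute flow can be pictured as four stages passing a shared state object along. This is a simplified sketch with rule-based stages rather than LLM agents; the thresholds, finding labels, and action names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ACCURACY_FLOOR = 0.90   # hypothetical alert threshold
DRIFT_CUTOFF = 0.30     # hypothetical drift-score cutoff


@dataclass
class PipelineState:
    metrics: Dict[str, float]
    findings: List[str] = field(default_factory=list)
    plan: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def monitor(state: PipelineState) -> PipelineState:
    """Flag metrics that fall below the floor."""
    if state.metrics.get("accuracy", 1.0) < ACCURACY_FLOOR:
        state.findings.append("accuracy_below_floor")
    return state


def diagnose(state: PipelineState) -> PipelineState:
    """Attach a likely cause to a flagged metric."""
    if ("accuracy_below_floor" in state.findings
            and state.metrics.get("drift_score", 0.0) > DRIFT_CUTOFF):
        state.findings.append("likely_data_drift")
    return state


def strategize(state: PipelineState) -> PipelineState:
    """Turn findings into a plan of actions."""
    if "likely_data_drift" in state.findings:
        state.plan.append("retrain_on_recent_data")
    elif "accuracy_below_floor" in state.findings:
        state.plan.append("escalate_to_human")
    return state


def execute(state: PipelineState) -> PipelineState:
    """Record each planned action; a real system would perform it here."""
    for action in state.plan:
        state.log.append(f"executed:{action}")
    return state


STAGES = (monitor, diagnose, strategize, execute)


def run_pipeline(metrics: Dict[str, float]) -> PipelineState:
    state = PipelineState(metrics=dict(metrics))
    for stage in STAGES:
        state = stage(state)
    return state
```

Keeping each stage as a function over one state object makes it straightforward to replace any single stage with an LLM-backed agent without touching the others.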
MLOps Automation

auto-ml-guardian

Proactive ML model health management with human-in-the-loop controls that route critical changes for approval; framework-agnostic across scikit-learn, PyTorch, and TensorFlow.

Python · PyTorch · TensorFlow · Scikit-learn
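
A human-in-the-loop gate of the kind described above could be as small as the following sketch, which assumes each proposed change carries a `critical` flag and that approval is supplied as a callback. The `Change` type and `apply_changes` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Change:
    description: str
    critical: bool


def apply_changes(changes: Iterable[Change],
                  approve: Callable[[Change], bool]) -> List[str]:
    """Apply routine changes directly; critical ones only if approved."""
    applied: List[str] = []
    for change in changes:
        if change.critical and not approve(change):
            continue  # rejected by the reviewer, so skip it
        applied.append(change.description)
    return applied
```

In practice the `approve` callback would block on a review queue or chat prompt; passing it in as a function keeps the gate independent of the ML framework underneath.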
Real-Time ML Pipeline

stream-anomaly-guardian

Industrial IoT anomaly detection with Apache Kafka streaming, ADWIN concept drift detection, and self-healing model retraining for high-velocity sensor data.

Python · Apache Kafka · River/ADWIN · Scikit-learn · Docker
Vision AI Agent

sentient-desktop-agent

AI agent that proactively assists by understanding your screen. Uses LLaVA locally for visual perception and GPT-4o / Claude 3.5 for reasoning in a continuous Perception-Reasoning-Action cycle.

Python · LLaVA · Ollama · GPT-4o · Claude
DevOps AI System

devops-maestro-agent

Multi-agent LLM system for autonomous DevOps — Planner, Knowledge Retrieval, Diagnosis, and Solution agents collaborate to diagnose incidents and troubleshoot infrastructure.

Python · Multi-Agent · RAG · LLM
Adaptive AI Agent

adapti-persona-agent

Context-aware agent that dynamically adopts expert personas (Engineer, Strategist, Analyst) with ChromaDB vector memory, tool orchestration, and RAG-powered adaptive assistance.

Python · GPT-4o · ChromaDB · RAG · LangChain
Agentic Workflows

agentic-canvas

Framework for orchestrating complex human-in-the-loop agentic workflows with state persistence and dynamic tool integration across multi-step tasks.

Python · Multi-Agent · State Persistence
ML Classification

Multiple Disease Prediction

Multi-class classifier predicting diseases from user-reported symptoms with 92%+ accuracy — Grid Search hyperparameter tuning, deployed as an interactive Streamlit app.

Scikit-learn · Pandas · Streamlit
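
Grid search simply scores every combination in a parameter grid and keeps the best one. The sketch below shows the idea in plain Python with a caller-supplied scoring function; it is not the project's code, and scikit-learn's `GridSearchCV` adds cross-validation on top of the same loop. The parameter names in the usage note are examples only.

```python
from itertools import product
from typing import Any, Callable, Dict, Iterable, Tuple


def grid_search(param_grid: Dict[str, Iterable[Any]],
                score: Callable[[Dict[str, Any]], float]
                ) -> Tuple[Dict[str, Any], float]:
    """Try every parameter combination and return the best one with its score."""
    names = list(param_grid)
    best_params = None
    best_score = float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    if best_params is None:
        raise ValueError("parameter grid produced no combinations")
    return best_params, best_score
```

A call such as `grid_search({"max_depth": [3, 5, 7], "min_samples_leaf": [1, 2, 4]}, score_fn)` evaluates nine combinations, where `score_fn` would train and validate a model for each setting.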
Browser Extension

Text Summarizer Extension

Chrome extension using Google Gemini AI for real-time text summarization and Q&A on selected web content, with secure API key storage and customizable output.

JavaScript · Gemini API · Chrome Extension

Infrastructure & Scaling

Building AI systems end-to-end.

I care about everything below the model — GPU inference optimization, vector retrieval at scale, deployment pipelines, networking, and cloud-native architecture that holds up in production.

GPU-accelerated ONNX Runtime inference — CUDA / TensorRT
PostgreSQL at scale — GIN indexes, materialized views, AWS RDS
Cloud-native deployments — AWS, Docker, Cloudflare CDN
Agentic AI orchestration — ReAct, tool-calling, LLM pipelines
Production Stack
User Interface Layer
Cloudflare CDN + Edge Routing
React + FastAPI Services
AI Inference & Vector Search (active)
PostgreSQL + AWS RDS

Engineering Philosophy

Building intelligent systems that scale beyond experimentation.

I'm deeply interested in how modern AI systems work end-to-end — from model behavior and inference optimization to deployment architecture and production reliability.

Communication Channel

Let's build something intelligent.

Interested in AI systems, infrastructure, automation, or production-scale engineering collaborations?