Ponsubash Raj R

Software Engineer (Agentic AI, System Design & Cloud Architecture)

I build AI systems that are useful beyond the demo: agentic workflows, retrieval systems, durable background jobs, and cloud architectures that can survive real constraints.

What I Do

🧠

Agentic AI Systems

Designing tool-using AI workflows with retrieval, structured outputs, orchestration, and feedback loops that stay grounded in product needs.

🤖

System Design & Cloud

Building systems with queues, durable state, observability, storage boundaries, and deployment paths that make AI applications reliable.

🔬

Applied AI Engineering

Turning research ideas into working products through evaluation, graceful degradation, and careful engineering trade-offs.

  • 6 Project Case Studies
  • 1 Blog Draft
  • 3 Published Papers
  • 2 Conferences Presented
  • 250+ LeetCode Problems

Project case study

CourseFlow

Problem Statement

YouTube has excellent courses, but playlists are passive, hard to search, difficult to review, and fragile when AI processing hits rate limits. CourseFlow turns a playlist into a durable learning system with transcripts, notes, quizzes, search, review cards, exports, and optional diagrams.

Why it matters

The project demonstrates production-oriented AI architecture: asynchronous queues, resumable work, quota-aware scheduling, private object storage, vector search, state machines, and cloud deployment. It also fits the real learner workflow: watch, search, quiz, review, and export without restarting from scratch after failures.

System Architecture

CourseFlow architecture

Features

Durable playlist processing

Ingests playlists or videos, tracks every video state, retries failures, and checkpoints transcript and note chunks.
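
Per-video state tracking can be sketched as a small state machine. This is an illustration only: the state names and the transition table are hypothetical and are not taken from CourseFlow's schema.

```python
from enum import Enum


class VideoState(str, Enum):
    # Hypothetical states; the real pipeline may use different names.
    QUEUED = "queued"
    TRANSCRIBING = "transcribing"
    GENERATING_NOTES = "generating_notes"
    DONE = "done"
    FAILED = "failed"


# Allowed transitions. FAILED can go back to QUEUED so a retry resumes work.
TRANSITIONS = {
    VideoState.QUEUED: {VideoState.TRANSCRIBING, VideoState.FAILED},
    VideoState.TRANSCRIBING: {VideoState.GENERATING_NOTES, VideoState.FAILED},
    VideoState.GENERATING_NOTES: {VideoState.DONE, VideoState.FAILED},
    VideoState.FAILED: {VideoState.QUEUED},
    VideoState.DONE: set(),
}


def advance(current: VideoState, target: VideoState) -> VideoState:
    """Return the target state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Rejecting illegal transitions in one place means a crashed or duplicated worker cannot silently move a finished video back into processing.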

Active learning layer

Generates structured notes, semantic search, adaptive LangGraph quizzes, SM-2 spaced repetition cards, and exam planning.
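
For readers unfamiliar with SM-2, the scheduling rule can be sketched as below. This follows one common formulation of the published algorithm; the `Card` type and `review` function are hypothetical names, and CourseFlow's own variant may differ in details such as rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "E-Factor", never allowed below 1.3


def review(card: Card, quality: int) -> Card:
    """Apply one review with a quality grade from 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease factor.
        return Card(repetitions=0, interval_days=1, ease=card.ease)

    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)

    # Ease rises for easy recalls and falls for hard ones, with a floor of 1.3.
    ease = card.ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return Card(card.repetitions + 1, interval, max(1.3, ease))
```

Keeping the card immutable makes each review a pure function, which is convenient when review history has to be replayed or audited.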

Diagram and export pipeline

Supports Mermaid and optional illustrative diagrams, then exports lessons or complete courses as Markdown, PDF, or Anki decks.

Claude Desktop MCP integration

Exposes user-scoped course listing, search, and grounded Q&A through a local read-only MCP server.

System Design Decisions

Queues over blocking requests

Transcription, notes, embeddings, PDFs, and diagrams run through Celery so HTTP requests stay responsive.

Redis plus PostgreSQL

Redis handles fast counters and active state; PostgreSQL stores durable usage events and recoverable processing state.

Hybrid edge/cloud boundary

A local fetcher handles YouTube calls that are sensitive to datacenter blocking, while AWS owns durable state and user-facing services.

Quota-aware scheduling

Groq response headers drive backpressure and retry timing, reducing repeated failures under rate limits.
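
A minimal sketch of header-driven retry timing, assuming the provider sends the standard HTTP `retry-after` header in seconds. Which headers CourseFlow actually reads is not stated here, and the function name, `base`, and `cap` are hypothetical.

```python
from typing import Mapping


def retry_delay_seconds(
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Pick how long to wait before retrying a rate-limited call."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    hint = normalized.get("retry-after")
    if hint is not None:
        try:
            # Trust the server's hint when it is a plain number of seconds.
            return max(0.0, float(hint))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall through to backoff
    # No usable hint: capped exponential backoff based on the attempt count.
    return min(cap, base * (2 ** attempt))
```

In a Celery task, a value like this could be passed as the `countdown` argument of `self.retry`, so the worker slot is released while the quota window resets.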

Complete Tech Stack

  • React 18, TypeScript, Vite, React Router
  • TanStack Query, Tailwind CSS, React Markdown, Recharts
  • Python 3.12, FastAPI, Pydantic, Async SQLAlchemy
  • PostgreSQL 16, pgvector, Alembic
  • Redis, Celery, Celery Beat
  • MinIO locally, Amazon S3 in production
  • Groq LLM and Whisper APIs
  • LangGraph, LangChain Groq, sentence-transformers
  • yt-dlp, youtube-transcript-api, pydub, ffmpeg
  • Mermaid CLI, Puppeteer, Sharp, Cloudflare Workers AI
  • Docker Compose, AWS EC2, IAM, CloudWatch, GitHub Actions OIDC
  • Local read-only MCP server for Claude Desktop

Takeaway

CourseFlow is the strongest expression of my current interests: agentic workflows, cloud architecture, durable distributed jobs, retrieval, and real product-grade system design around AI constraints.

Link to GitHub

Project case study

Docflow

Problem Statement

Users need a private way to upload documents, organize them, and ask grounded questions across PDFs and images. Docflow solves this with multi-user authentication, async processing, OCR, hybrid retrieval, chat history, and source-grounded answers.

Why it matters

Document Q&A is only useful when retrieval is accurate and user isolation is strict. Docflow combines parent-child chunking, vector search, BM25, and Reciprocal Rank Fusion while keeping every file, chat, object key, and vector scoped to the authenticated user.

System Architecture

Docflow architecture

Features

Document ingestion

Accepts PDFs and images, stores raw files, queues work, extracts text, performs OCR for scanned PDFs, and tracks processing status.

Hybrid retrieval

Combines Qdrant vector search with BM25 keyword search and Reciprocal Rank Fusion for robust document lookup.
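
Reciprocal Rank Fusion itself is compact enough to show. The sketch below fuses ranked lists of document IDs; the constant `k = 60` is the value commonly used in the literature and is an assumption here, not a documented Docflow setting.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # e.g. from vector search
keyword_hits = ["b", "d", "a"]  # e.g. from BM25
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because RRF uses only ranks, it avoids having to normalize cosine similarities against BM25 scores, which live on different scales.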

Multi-user chat

Supports separate files, chats, messages, and search results for every user with bearer-token sessions.

AWS demo deployment

Runs on EC2 with Nginx, containers, private S3 storage, PostgreSQL, Redis, Qdrant, CloudWatch logs, and IAM role credentials.

System Design Decisions

Async processing boundary

Celery and Redis keep upload requests fast while the worker handles extraction, OCR, chunking, embeddings, and vector writes.

Parent-child chunking

Small child chunks improve retrieval precision; larger parent chunks give the LLM enough context to answer coherently.
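
The parent-child idea can be sketched with word-based splitting. The chunk sizes and the `Child` type are hypothetical; a real pipeline would more likely split on tokens or document structure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Child:
    text: str
    parent_id: int  # index of the parent chunk this child was cut from


def build_chunks(
    text: str, parent_size: int = 200, child_size: int = 50
) -> Tuple[List[str], List[Child]]:
    """Split text into large parent chunks and small child chunks (sizes in words)."""
    words = text.split()
    parents = [
        " ".join(words[i : i + parent_size])
        for i in range(0, len(words), parent_size)
    ]
    children: List[Child] = []
    for parent_id, parent in enumerate(parents):
        parent_words = parent.split()
        for j in range(0, len(parent_words), child_size):
            children.append(Child(" ".join(parent_words[j : j + child_size]), parent_id))
    return parents, children
```

Only the children would be embedded and searched; at answer time each matching child is swapped for `parents[child.parent_id]` before the prompt is built.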

User-scoped storage

Database rows, object keys, and Qdrant payloads all carry ownership metadata to prevent cross-user leakage.
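
One way to carry ownership in object keys is a per-user prefix, sketched below. The key layout and both helper names are hypothetical and are not Docflow's actual scheme.

```python
def object_key(user_id: str, file_id: str, filename: str) -> str:
    """Build a storage key that always sits under the owner's prefix."""
    # Keep only the final path component so a crafted name cannot escape the prefix.
    safe_name = filename.replace("\\", "/").rsplit("/", 1)[-1]
    return f"users/{user_id}/files/{file_id}/{safe_name}"


def owned_by(key: str, user_id: str) -> bool:
    """Check ownership from the key alone; the trailing slash avoids prefix collisions."""
    return key.startswith(f"users/{user_id}/")
```

The trailing slash in the prefix check matters: without it, user `u1` would appear to own keys that belong to user `u10`.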

Grounded fallback

The prompt instructs the model to say when evidence is missing instead of inventing unsupported document facts.

Complete Tech Stack

  • React, Vite, Lucide
  • FastAPI, Pydantic
  • Celery, Redis, LangGraph
  • SQLite locally, PostgreSQL for AWS
  • MinIO locally, Amazon S3 in AWS
  • Qdrant vector database
  • BM25, vector search, Reciprocal Rank Fusion
  • sentence-transformers embeddings
  • Tesseract OCR, PyMuPDF
  • Groq through LangChain
  • structlog, LangSmith, CloudWatch
  • Docker, Nginx, AWS EC2

Takeaway

Docflow shows practical RAG beyond a demo: background jobs, hybrid retrieval, cloud storage, tenant isolation, operational logging, and grounded generation.

Link to GitHub

Project case study

Pulse

Problem Statement

Engineers and researchers follow too many sources: RSS feeds, GitHub, arXiv, newsletters, and saved articles. Pulse collects this information, normalizes it, enriches it with AI, and serves a personalized mobile feed.

Why it matters

A useful technical feed reader needs more than summaries. It needs ingestion isolation, deduplication, ranking, semantic search, learning modes, digests, trends, and a deployment path that can run cheaply for a single owner.

System Architecture

Pulse architecture

Features

Multi-source ingestion

Collects AI and software content from RSS, GitHub, arXiv, and Gmail newsletters with normalization and deduplication.

LLM enrichment

Uses structured Groq calls for summaries, categories, entities, and scoring.

Personalized mobile feed

Ranks content using reading, bookmarking, and hiding behavior, with offline cache and network states in the Expo app.

Search and learning modes

Includes semantic and hybrid search, LangGraph Socratic quizzes, corpus-grounded Ask mode, digests, and trends.

System Design Decisions

PostgreSQL as knowledge store

Combines relational content metadata with pgvector semantic retrieval and full-text search.

Failure isolation

Ingestion normalizes and deduplicates sources independently so one failing source does not poison the entire feed.
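
Per-source isolation can be sketched as a loop that catches failures at the source boundary. The function names, the item shape (a dict with a `url` key), and the URL-based fingerprint are assumptions for illustration; lowercasing the whole URL is a deliberate simplification.

```python
import hashlib
import logging
from typing import Callable, Dict, Iterable, List

logger = logging.getLogger("ingest")


def fingerprint(item: dict) -> str:
    """Hash a normalized URL so the same article from two sources matches."""
    normalized = item["url"].strip().lower().rstrip("/")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def ingest_all(sources: Dict[str, Callable[[], Iterable[dict]]]) -> List[dict]:
    """Fetch every source; a failing source is logged and skipped."""
    seen = set()
    items: List[dict] = []
    for name, fetch in sources.items():
        try:
            fetched = list(fetch())
        except Exception:
            # Isolation boundary: one broken source must not stop the others.
            logger.exception("source %s failed; continuing", name)
            continue
        for item in fetched:
            key = fingerprint(item)
            if key not in seen:
                seen.add(key)
                items.append(item)
    return items
```

Materializing each source with `list(...)` inside the `try` means a generator that fails halfway contributes nothing, rather than a partial batch.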

Single-owner security model

Uses a static API key as a lightweight access gate, suitable for personal deployment but explicitly not multi-user auth.

Cheap deployment path

Supports local Docker and a zero-cost Render + Supabase route for running the system as a personal tool.

Complete Tech Stack

  • Expo SDK 56, React Native, Expo Router
  • TanStack Query, Zustand
  • FastAPI, SQLAlchemy async, Pydantic
  • PostgreSQL, pgvector, Alembic
  • Groq
  • LangGraph
  • local MiniLM or Supabase gte-small embeddings
  • APScheduler or Supabase Cron
  • Docker Compose
  • Render, Supabase
  • GitHub Actions, EAS

Takeaway

Pulse is a personal intelligence system: ingestion, enrichment, retrieval, ranking, and mobile UX tied together into a daily engineering workflow.

Link to GitHub

Project case study

Smart Notes Generator

Problem Statement

Lecture PDFs and slides often contain important diagrams that disappear when converted into plain text prompts. Smart Notes Generator creates structured notes while preserving extracted figures in the right context.

Why it matters

Students need trustworthy notes from private course material without sending images to an external model. This project uses a local-first workflow where files and images stay on the machine while text prompts preserve diagram placeholders.

System Architecture

Smart Notes Generator architecture

Features

Image-aware extraction

Processes PDFs and PowerPoints, extracts raster/vector figures, filters low-quality images, deduplicates repeated assets, and preserves meaningful diagrams.

Zero-token image strategy

Sends lightweight placeholders such as IMG tokens to Claude, then restores the original local images after generation.
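
The placeholder round trip can be sketched as two regex passes. The `[IMG_n]` token format and the Markdown image syntax are assumptions for this example; the project's actual token format is not specified here.

```python
import re
from typing import List, Tuple

IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)]+)\)")
PLACEHOLDER_PATTERN = re.compile(r"\[IMG_(\d+)\]")


def strip_images(markdown: str) -> Tuple[str, List[str]]:
    """Swap each image reference for a short token and remember its local path."""
    paths: List[str] = []

    def to_token(match: re.Match) -> str:
        paths.append(match.group(1))
        return f"[IMG_{len(paths) - 1}]"

    return IMAGE_PATTERN.sub(to_token, markdown), paths


def restore_images(text: str, paths: List[str]) -> str:
    """Put local images back wherever the model kept a token."""
    def to_image(match: re.Match) -> str:
        index = int(match.group(1))
        if index >= len(paths):
            return match.group(0)  # unknown token: leave it untouched
        return f"![figure]({paths[index]})"

    return PLACEHOLDER_PATTERN.sub(to_image, text)
```

Unknown tokens are left in place rather than dropped, so a hallucinated placeholder stays visible in review instead of vanishing.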

Local evaluation and RAG

Scores notes for coverage, structure, density, length, and faithfulness, then supports grounded Q&A over saved notes.

Agent refinement

Allows targeted edits on specific sections while preserving diagrams and minimizing prompt size.

System Design Decisions

Local-first privacy boundary

Source files and extracted images remain local; the model receives text plus placeholders instead of image payloads.

Session state vs saved state

Temporary generation state lives in memory and OS temp files, while saved notes, evaluations, chunks, and chats live in SQLite.

Fallback retrieval

Semantic embeddings are used when available, with TF-IDF and Jaccard fallbacks for dependency-light environments.
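
The lightest tier of such a fallback, Jaccard overlap on word sets, needs only the standard library. The tokenizer and function names below are hypothetical.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: str, chunks: List[str], top_k: int = 3) -> List[str]:
    """Rank chunks by Jaccard similarity to the query and keep the best top_k."""
    ranked = sorted(chunks, key=lambda chunk: jaccard(query, chunk), reverse=True)
    return ranked[:top_k]
```

Jaccard ignores word frequency and meaning, so it is a floor rather than a substitute for embeddings, but it keeps Q&A working when no model can be loaded.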

Export fallback ladder

PDF export tries WeasyPrint, falls back to xhtml2pdf, and finally returns browser-printable HTML.
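
The ladder pattern can be sketched independently of any PDF library. In the sketch below, renderers are plain callables supplied by the caller; the function name and return shape are hypothetical, and no WeasyPrint or xhtml2pdf calls are shown.

```python
from typing import Callable, List, Tuple

Renderer = Callable[[str], bytes]


def export_with_fallback(
    html: str, renderers: List[Tuple[str, Renderer]]
) -> Tuple[str, bytes]:
    """Try each renderer in order; fall back to the raw HTML if all of them fail."""
    for name, render in renderers:
        try:
            return name, render(html)
        except Exception:
            # A missing dependency or render error moves us down the ladder.
            continue
    return "html", html.encode("utf-8")
```

Returning the name of the rung that succeeded lets the caller set the right content type and tell the user when they received printable HTML instead of a PDF.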

Complete Tech Stack

  • React 18, TypeScript, Vite
  • Tailwind CSS, Radix UI, lucide-react
  • FastAPI, uvicorn, Pydantic v2
  • PyMuPDF, python-pptx, Pillow
  • Anthropic Python SDK
  • SQLite
  • markdown, WeasyPrint, xhtml2pdf
  • scikit-learn, NumPy
  • sentence-transformers all-MiniLM-L6-v2
  • RAG with heading chunks and semantic/TF-IDF/Jaccard retrieval

Takeaway

Smart Notes Generator shows careful AI product design around privacy, cost, graceful degradation, evaluation, and diagram-preserving postprocessing.

Link to GitHub

Project case study

AI Learning Coach IDE

Problem Statement

Most coding assistants optimize for the final answer. AI Learning Coach focuses on how the learner thinks while solving DSA and competitive programming problems.

Why it matters

Learners often repeat hidden process mistakes: skipping planning, rewriting the same region, mishandling boundaries, or abandoning approaches after pauses. The extension turns the IDE into an event-driven learning observatory and gives reflective feedback without leaking solutions.

System Architecture

AI Learning Coach architecture

Features

Problem context parsing

Captures pasted LeetCode, Codeforces, or plain-text statements and extracts title, difficulty, constraints, examples, tags, and expected complexity when available.

Session signal capture

Records edits, saves, diagnostics, cursor movement, idle time, undo/redo-like behavior, and AST snapshots.

Pedagogical feedback

Returns structured observations, issues, process feedback, learning suggestions, and reflection questions in a VS Code webview.

System Design Decisions

Signal compression before LLM calls

Raw editor events are noisy, so the extension derives compact metrics such as edit churn, planning time ratio, boundary error density, and abandoned attempts.
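
A rough sketch of this kind of compression is shown below. The event shape and the metric definitions (churn as deleted over inserted characters, planning time as the share of the session before the first edit) are assumptions for illustration, not the extension's actual formulas.

```python
from typing import Dict, List


def summarize_session(events: List[dict], start: float, end: float) -> Dict[str, float]:
    """Reduce raw editor events to a few compact metrics for the prompt."""
    edits = [event for event in events if event["type"] == "edit"]
    inserted = sum(event.get("inserted", 0) for event in edits)
    deleted = sum(event.get("deleted", 0) for event in edits)
    duration = end - start
    # With no edits at all, the whole session counts as planning time.
    first_edit = min((event["t"] for event in edits), default=end)
    return {
        "edit_count": len(edits),
        "edit_churn": round(deleted / inserted, 3) if inserted else 0.0,
        "planning_time_ratio": round((first_edit - start) / duration, 3)
        if duration > 0
        else 0.0,
    }
```

A handful of numbers like these costs far fewer tokens than an event log and gives the model stable, comparable inputs across sessions.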

Local Python agent boundary

The extension talks to a Python agent over line-delimited JSON, separating editor concerns from feedback generation.
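
Line-delimited JSON framing is simple to sketch on the Python side. The message fields (`type`, `code`) are hypothetical; an in-memory buffer stands in for the stdio pipe.

```python
import io
import json
from typing import Iterator, TextIO


def write_message(stream: TextIO, message: dict) -> None:
    """Write one JSON object per line; json.dumps escapes embedded newlines."""
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()


def read_messages(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Round trip through an in-memory buffer standing in for the pipe.
pipe = io.StringIO()
write_message(pipe, {"type": "analyze", "code": "a\nb"})
write_message(pipe, {"type": "ping"})
pipe.seek(0)
received = list(read_messages(pipe))
```

Flushing after every write matters over a real pipe, since a buffered message would leave the other process waiting.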

Schema-constrained output

Pydantic schemas keep LLM feedback predictable and prevent the webview from receiving unstructured assistant output.

Pedagogical guardrails

The prompt emphasizes process feedback and avoids direct code fixes or full solutions.

Complete Tech Stack

  • VS Code Extension API
  • TypeScript compiler API for AST fingerprints
  • TypeScript command and event layers
  • VS Code webview UI
  • JSON-over-stdio bridge
  • Python embedded agent
  • Pydantic schemas
  • Groq chat completion
  • In-memory learner profile and session history
  • Optional FastAPI backend entrypoint

Takeaway

AI Learning Coach is an agentic developer-tool project focused on behavior modeling, structured context, and useful feedback rather than shortcut generation.

Link to GitHub

Project case study

Interactive Reinforcement Learning Pong Trainer

Problem Statement

RL training loops are often opaque. This project creates a visual Pong environment where a Double DQN agent trains online while exposing reward, loss, Q-values, actions, and state dynamics.

Why it matters

Reinforcement learning becomes easier to understand when reward shaping, exploration, target networks, and policy behavior can be inspected live. The trainer turns a classic control problem into an interactive laboratory.

System Architecture

RL Pong system architecture

Features

Live reward shaping

Lets users tune terminal rewards, miss penalties, alignment rewards, shaping alpha, and action-change penalties at runtime.

Step-by-step debugging

Manual stepping mode shows recent reward, loss, state information, and training behavior frame by frame.

Battle mode

Runs a greedy survival challenge with escalating difficulty and a persistent CSV leaderboard.

Model persistence

Saves model weights, target network, optimizer state, episode counters, reward/loss history, and hit statistics.

System Design Decisions

Dense reward shaping

Distance-to-intercept shaping provides feedback while the ball is still moving, reducing the sparseness of Pong rewards.
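
One possible form of this shaping is sketched below: predict where the ball will cross the paddle's column, then reward any reduction in the paddle's distance to that point. The geometry (walls at `y = 0` and `y = height`), the function names, and the default `alpha` are assumptions, not the trainer's actual code.

```python
def predict_intercept_y(
    x: float, y: float, vx: float, vy: float, paddle_x: float, height: float
) -> float:
    """Predict the ball's y when it reaches paddle_x, with wall bounces."""
    if vx <= 0:
        return y  # ball moving away: use its current height as a fallback
    time_to_paddle = (paddle_x - x) / vx
    unfolded = y + vy * time_to_paddle
    # Reflect the straight-line path back into [0, height].
    folded = unfolded % (2 * height)
    return 2 * height - folded if folded > height else folded


def shaped_reward(
    base_reward: float, prev_distance: float, curr_distance: float, alpha: float = 0.1
) -> float:
    """Add a bonus proportional to how much closer the paddle moved to the intercept."""
    return base_reward + alpha * (prev_distance - curr_distance)
```

Rewarding the change in distance rather than the distance itself means a paddle that sits still earns nothing extra, so the bonus tracks progress instead of position.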

Double DQN

Uses a target network and Double DQN targets for more stable online learning.
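
The Double DQN target for a single transition can be written with plain Python lists. The project itself uses PyTorch, so this is a framework-free illustration with hypothetical names, not the trainer's code.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Compute the Double DQN bootstrap target for one transition."""
    if done:
        return reward  # terminal step: nothing to bootstrap from
    # The online network selects the action...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...and the target network evaluates it.
    return reward + gamma * next_q_target[best_action]


target = double_dqn_target(
    reward=1.0,
    done=False,
    next_q_online=[0.2, 0.9, 0.1],
    next_q_target=[0.5, 0.3, 0.8],
    gamma=0.9,
)
```

In this example a vanilla DQN target would bootstrap from the target network's own maximum (0.8), while Double DQN uses the value of the action the online network prefers (0.3). Separating selection from evaluation is what reduces overestimation.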

Multiple opponent modes

Supports perfect, heuristic, and human opponents to test different training and evaluation conditions.

UI-first observability

Graphs and diagnostics are rendered directly into the game loop so the learning process stays visible.

Complete Tech Stack

  • PyTorch
  • Pygame
  • Double Deep Q-Network
  • Experience replay buffer
  • Target network updates
  • Adam optimizer, MSE loss, ReLU MLP
  • Matplotlib charts inside Pygame
  • CSV leaderboard persistence
  • Online training and greedy evaluation modes

Takeaway

The RL trainer reflects my interest in making complex learning systems inspectable, tunable, and grounded in actual behavior rather than hidden training logs.

Link to GitHub

Blogs

Notes on agentic AI, system design, cloud architecture, and lessons from building real AI systems.

Research & Publications

Academic contributions through publications

Automatic Generation of Medical Imaging Diagnostic Report using BLIP Model

9th International Conference on Computational Intelligence in Data Science (ICCIDS) • January 2026

The generation of radiology reports forms a very important aspect of medical imaging, which furnishes physicians with timely and accurate diagnostic information. However, interpreting medical images and compiling their findings remains cumbersome for radiologists. To address this challenge, we propose an automatic report generation system based on the BLIP (Bootstrapping Language-Image Pre-training) model. By fine-tuning a pre-trained BLIP image captioning framework using paired chest X-ray images and their corresponding diagnostic reports, the model learns to describe visual findings in clear and clinically relevant text. Experimental results show that our model achieves strong performance with BLEU, METEOR, ROUGE, and RadGraph-F1 scores of 0.5859, 0.4021, 0.6780, and 0.6424 respectively, demonstrating its ability to generate meaningful and clinically grounded descriptions.

My Contribution: Lead researcher, conducted experiments, wrote parts of the manuscript.

A Study of ML and DL approaches for Sentiment Analysis in Code-Mixed Tamil and Tulu Texts

Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages • May 2025

Explored sentiment classification in Tamil-English and Tulu-English code-mixed datasets using both machine learning (ML) and deep learning (DL) approaches. DL models, while theoretically capable of capturing richer contextual and semantic relationships, underperform with limited data availability. ML models are better suited for sentiment analysis of code-mixed texts, particularly in low-resource settings, as they effectively leverage n-gram-based features without requiring extensive labeled data.

My Contribution: Conducted experiments for DL, wrote manuscript.

Word-Level Language Identification in Dravidian Code-Mixed Text Using Machine Learning: A Comparative Analysis of Models and Vectorization Techniques

Forum for Information Retrieval Evaluation • December 2024

Language Identification (LI) is a major component for various applications such as Sentiment Analysis, Machine Translation, Information Retrieval, and Natural Language Processing. In multilingual countries like India, especially among the younger generation, social media often contains code-mixed texts where local languages are combined with English. We analyze the performances of 5 models using 2 different vectorizers for the Word-Level Language Identification of Dravidian Languages.

My Contribution: Lead researcher, conducted experiments, wrote manuscript.

Conference Presentations

  • Forum for Information Retrieval Evaluation 2024, DAIICT, Gandhinagar, India (Online)
  • 9th International Conference on Computational Intelligence in Data Science 2026, SSN College of Engineering, Chennai, India

About

I'm a Software Engineer focused on agentic AI, system design, and cloud-native AI applications. My background combines computer science fundamentals from SSN College of Engineering with hands-on experience building retrieval systems, learning platforms, backend workflows, and deployed cloud architectures.

My work is shaped by one recurring question: how do we make AI systems reliable when they move from a notebook to a real product? I care about queues, state, rate limits, isolation, observability, retrieval quality, and the engineering decisions that make intelligent systems maintainable.

Currently, I'm focused on agentic workflows, RAG architectures, MCP integrations, asynchronous processing, PostgreSQL/pgvector systems, and cloud deployments on AWS and adjacent platforms.

I'm seeking opportunities in AI engineering and backend/platform engineering where I can build systems that combine strong AI capability with production-grade architecture.

Agentic AI

  • RAG Systems
  • Tool-calling & Agents
  • MCP Integrations
  • Structured LLM Workflows

System Design

  • Queues and Background Workers
  • PostgreSQL, Redis, Vector Search
  • Object Storage and User Isolation
  • Rate Limits, Retries, Observability

Cloud & Engineering

  • FastAPI and REST APIs
  • Docker and GitHub Actions
  • AWS EC2, S3, IAM, CloudWatch
  • Product Engineering

Beyond Work

When I'm not coding or reading papers, I enjoy playing chess, solving the Rubik's Cube, and working through competitive programming problems.

Resume

Download or view my complete professional resume

Last updated: June 2026

Quick Summary

  • Education: M.Tech Computer Science & Engineering (5-Year Integrated), SSN College of Engineering
  • Specialization: Agentic AI, System Design, Cloud Architectures
  • Skills: FastAPI, PostgreSQL, Redis, RAG, AWS, Docker, LangGraph
  • Interests: Agentic workflows, cloud-native AI systems, durable backend architecture

Contact

Let's connect and discuss opportunities

Get in Touch

I'm open to discussing AI engineering, backend/platform roles, cloud architecture work, and research collaborations around agentic systems.

Based in Chennai, India

Available for remote opportunities worldwide and open to relocation for the right role.