Computer Vision Engineer
Started in a difficult place with a laptop and one question — how do machines actually process images and video? That question changed everything.
The Real Story
I didn't rush to publish half-finished projects. 2023-2024 was pure learning — mastering fundamentals through practice. Then in 2025-2026, I built production-grade projects from scratch with proper architecture and deployment pipelines. This is the professional approach.
The Beginning · Jun 2023
College wasn't financially possible. After 12th I was in a difficult spot, figuring out what to do next. Self-study wasn't a statement. It was simply what I could do.
Then in the middle of all that, something completely hooked me — the idea that a computer could look at a video frame and actually understand what it's seeing.
"How does a machine actually process images and video? How does it see?"
That question wouldn't leave me. I opened YouTube, found courses, and started digging.
"I wasn't escaping anything. I was genuinely fascinated. The idea that pixels could become understanding — I needed to know how that worked."
"Before touching any model I wanted to understand what was actually happening underneath — so I started with math. Linear algebra, probability, calculus. I wanted to understand gradient descent, not just call a function that does it."
Foundations · 2023
Before any model: linear algebra, probability, calculus. Why does backprop work? What is gradient descent actually doing? Real understanding, not just working code.
Then classical ML: linear regression, logistic regression, SVM. Each concept from math to implementation. Built practice projects to solidify understanding — not for publishing, for learning.
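The gradient-descent idea above fits in a few lines. A toy sketch (minimising f(w) = (w − 3)², not any particular course's exercise): the whole algorithm is one update rule applied repeatedly.

```python
# Gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.
# The entire algorithm is the update rule: w <- w - lr * f'(w).

def gradient_descent(grad, w0, lr=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # step against the gradient
    return w

# f'(w) = 2 * (w - 3)
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

With lr=0.1 the error shrinks by a factor of 0.8 per step, so w_star lands essentially on 3.0.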
Deep Learning Foundations · 2023-2024
Classical ML was great — but the original pull was always visual. I moved into deep learning: neural networks, backprop by hand, activation functions. Then CNNs. Then transfer learning.
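"Backprop by hand" at its smallest is one sigmoid neuron with squared loss, gradients chained manually instead of calling an autograd framework. A toy sketch, not the actual exercises:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train_step(w, b, x, y, lr=0.5):
    # forward pass
    a = sigmoid(w * x + b)
    # backward pass: d(0.5*(a-y)^2)/dz = (a - y) * sigmoid'(z),
    # where sigmoid'(z) = a * (1 - a)
    dz = (a - y) * a * (1 - a)
    # chain rule through z = w*x + b gives dw = dz*x, db = dz
    return w - lr * dz * x, b - lr * dz

w, b = 0.0, 0.0
for _ in range(2000):
    w, b = train_step(w, b, x=1.0, y=1.0)
```

After training, sigmoid(w·x + b) is pushed close to the target 1.0, which is the entire point: the hand-derived gradients actually move the weights the right way.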
Practiced with cats vs dogs classification (77.6% accuracy, and I was thrilled at the time). Experimented with industrial defect detection. But these were learning exercises, not portfolio pieces.
"I could've rushed to publish those early experiments. But I knew the difference between 'it works on my laptop' and 'it's production-ready.' I chose to keep learning."
"Classification tells you what. Detection tells you where — in real-time. That shift felt completely different. Like I was building something that could actually watch."
Object Detection · 2024
Learned YOLO deeply: anchor boxes, non-maximum suppression (NMS), intersection over union (IoU), custom dataset training. Practiced with activity detection, PPE compliance, and fall-detection concepts. Understood real-time constraints and edge-deployment challenges.
These concepts would later become the foundation for production projects.
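IoU and NMS, the two concepts named above, both fit in a handful of lines. A minimal sketch over plain (x1, y1, x2, y2) boxes, not any particular YOLO implementation:

```python
def iou(a, b):
    # intersection over union of two (x1, y1, x2, y2) boxes
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    # greedy non-maximum suppression: keep the highest-scoring box,
    # drop every remaining box that overlaps it too much, repeat
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

Two unit boxes offset by one pixel give IoU = 1/7 (intersection 1, union 4 + 4 − 1), which is the kind of sanity check that makes the definition stick.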
Two-Stage Detection · 2024
After YOLO, I went backwards intentionally: R-CNN → Fast R-CNN → Faster R-CNN. Understanding the evolution, why each architecture exists and what it fixed, made everything click.
Region proposal networks, ROI pooling, two-stage vs one-stage tradeoffs. Practiced on mask detection concepts.
"I didn't want to just use architectures. I wanted to understand why each one exists — what it fixed that the previous one couldn't. That thinking changed how I approach everything."
"MediaPipe gave me 33 3D points describing a person's whole body in real-time. I kept thinking: that's 99 numbers per frame, every single frame. What can I build with 99 numbers?"
Pose & Face · 2024-2025
Discovered MediaPipe and face_recognition — understanding the body through 3D landmarks, tracking identity, real-time pose estimation. Learned how to work with sequential landmark data and spatial relationships.
Practiced yoga pose classification concepts and face filter logic — building understanding for future production systems.
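One thing pose landmarks make easy is joint angles: three points give the angle at the middle one, which is the core feature behind pose classification. A sketch with hypothetical 2-D coordinates, not MediaPipe's actual API:

```python
import math

def joint_angle(a, b, c):
    # angle (in degrees) at point b, formed by segments b->a and b->c
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(dot / norm))

# hypothetical hip-knee-ankle coordinates: a leg bent at a right angle
knee = joint_angle((0.5, 0.2), (0.5, 0.5), (0.8, 0.5))
```

Thresholding a few such angles per frame is enough to tell one yoga pose from another, which is why 99 numbers go a surprisingly long way.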
Tracking & OCR · 2025
Objects that disappear behind things and reappear — DeepSORT for maintaining unique IDs across frames and occlusions. Then OCR: LSTM, Transformers, CTC loss.
Learned to design and train a custom CRNN from scratch, then export models to ONNX for framework-agnostic inference and deployment-ready pipelines. Built an end-to-end system: detect → track → verify → read → collect evidence.
Understood how multiple models communicate in production: YOLO for detection, DeepSORT for identity consistency, CRNN for OCR, and ONNX Runtime to decouple training from inference.
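The detect → track → read flow reduces to a small orchestration loop. The lambdas below are stand-in stubs for YOLO, DeepSORT, and the CRNN (the strings and boxes are made up); this sketches the data flow between models, not the actual system:

```python
def run_pipeline(frames, detect, track, read):
    # evidence maps a persistent track ID to everything read about it
    evidence = {}
    for frame_id, frame in enumerate(frames):
        detections = detect(frame)                          # model 1: boxes
        for track_id, box in track(frame_id, detections):   # model 2: stable IDs
            text = read(frame, box)                         # model 3: OCR the crop
            evidence.setdefault(track_id, []).append((frame_id, text))
    return evidence

# stand-in stubs: one static object, one fixed ID, one fixed reading
evidence = run_pipeline(
    frames=["f0", "f1"],
    detect=lambda frame: [(10, 10, 50, 50)],
    track=lambda fid, dets: [(1, d) for d in dets],
    read=lambda frame, box: "SAMPLE-TEXT",
)
```

The key design point is that each stage only sees the previous stage's output, so any model can be swapped (or run through ONNX Runtime) without touching the loop.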
"Hackathons forced me to learn deployment the hard way. No tutorials, no time — just 'the demo is in 6 hours and it needs to work.' That's where I learned Docker, GCP, and RAG."
Deployment Under Pressure · Sep 2025 – Feb 2026
From September 2025 to now: 7+ hackathons across healthcare, safety, conservation, education, social impact. No wins yet — but each one forced me to build and deploy in 24-48 hours.
Learned RAG, GCP, Docker, Datadog, ElevenLabs — not from tutorials, because the demo was due and I had no choice. This is where theory became deployment experience.
Production Portfolio · 2025-2026
Now I had the foundation (2023-2024) and deployment experience (hackathons). Time to build production-grade projects from scratch with proper architecture, documentation, and deployment pipelines.
Each project built end-to-end: problem → architecture → implementation → ONNX export → documentation → GitHub publication.
"The difference between my 2024 practice projects and my 2026 published work? Architecture. Documentation. ONNX deployment. Proper Git history. That's what makes it production-ready."
"Detection draws a box around something. Segmentation traces every single pixel that belongs to it. It's a different kind of precision — and in medical imaging, that precision is the difference between useful and useless."
Semantic Segmentation · Feb 2026
After hackathons pushed me into deployment and production projects solidified my pipeline skills, the next frontier was clear: not just detecting objects — understanding them at the pixel level.
Studied encoder-decoder architectures, skip connections, and DiceBCE loss. The jump from bounding boxes to masks is a completely different class of problem — and I built a production-grade medical imaging system to prove it.
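Dice and BCE combine into one loss in a few lines. A NumPy sketch of the idea (framework-agnostic; real training code would use a deep-learning framework's tensor ops for gradients):

```python
import numpy as np

def dice_bce_loss(pred, target, eps=1e-7):
    # pred: predicted probabilities in [0, 1]; target: binary ground-truth mask
    pred, target = pred.ravel(), target.ravel()
    inter = (pred * target).sum()
    # Dice term: 1 - overlap ratio; directly punishes poor mask overlap
    dice = 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    # BCE term: per-pixel log loss; keeps gradients well-behaved early in training
    bce = -(target * np.log(pred + eps)
            + (1.0 - target) * np.log(1.0 - pred + eps)).mean()
    return dice + bce

mask = np.array([[1.0, 0.0], [1.0, 1.0]])
perfect = dice_bce_loss(mask, mask)      # near zero
poor = dice_bce_loss(1.0 - mask, mask)   # large
```

A perfect prediction scores near zero while an inverted mask scores badly on both terms, which is exactly the behaviour that makes the combination popular for imbalanced medical masks.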
"The original question was 'how do machines see?' Three years in, I realise it doesn't have a final answer. Every architecture is just a better, deeper answer. That's what keeps me going."
Technical Arsenal
Deep Learning
Computer Vision
Deployment & Production
AI & LLMs
Competition Record
Hackathons participated Sep 2025 - Feb 2026
No wins yet, but learned deployment under pressure
Build, deploy, demo — every single time
Production CV projects published to GitHub
Full Credentials
Let's Connect
I build production systems — from training scripts to deployed products with ONNX optimization. If you're working on something where visual intelligence matters, I'd love to talk.