You don't just use pretrained weights; you understand why each architecture was designed.
Simple CNN — Image Classification
Your first convolutional network built from scratch in PyTorch. Every layer placed intentionally: why 3×3 kernels, why max-pool, why ReLU over sigmoid in hidden layers.
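As a rough illustration of those layer choices, here is a minimal sketch, not the course's actual network. It assumes CIFAR-10-sized 32×32 RGB inputs and 10 classes; the class name `SimpleCNN` and the channel widths are hypothetical.

```python
import torch
from torch import nn


class SimpleCNN(nn.Module):
    """Two conv stages followed by a linear classifier (illustrative only)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            # 3x3 kernels with padding=1 keep the spatial size unchanged
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),   # ReLU does not saturate for positive inputs
            nn.MaxPool2d(2),         # halves height and width: 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),         # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        # Flatten everything except the batch dimension
        return self.classifier(torch.flatten(x, 1))


model = SimpleCNN()
logits = model(torch.randn(4, 3, 32, 32))  # batch of 4 random "images"
print(logits.shape)
```

Stacking two 3×3 convolutions covers the same receptive field as one 5×5 convolution with fewer parameters, which is one common argument for the small kernel size.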
ResNet-50 with Transfer Learning
Fine-tune a pretrained ResNet-50 on a custom image dataset (plant disease, product defects, or your own choice). Understand skip connections, how they ease the vanishing-gradient problem, and how to replace the classifier head.
YOLOv5 — Real-Time Object Detection
Run and fine-tune YOLOv5 on a custom object detection dataset. Understand anchor boxes, IoU, NMS, and how to label your own data with Roboflow. Build a live webcam demo.
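To make IoU and NMS concrete, here is a dependency-free sketch. It is not YOLOv5's own implementation; the box format `(x1, y1, x2, y2)`, the function names, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes to keep."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring box still in play
        keep.append(best)
        # Drop every remaining box that overlaps the kept one too much
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]: the second box overlaps the first and is suppressed
```

For real workloads, `torchvision.ops.nms` provides a vectorised version of the same idea.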
U-Net — Image Segmentation
Build U-Net from scratch for semantic segmentation. Applied to a medical imaging dataset (chest X-ray lung segmentation). Understand encoder-decoder architecture and skip connections for pixel-wise prediction.
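The encoder-decoder idea with a skip connection can be sketched at a single resolution level. This is a deliberately tiny illustration, not the full U-Net: the name `TinyUNet`, the channel widths, and the use of padded convolutions (so that output and input sizes match) are all assumptions.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU; padding=1 preserves spatial size."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """One down step, one up step, one skip connection (illustrative only)."""

    def __init__(self, in_ch: int = 1, out_ch: int = 1, base: int = 16):
        super().__init__()
        self.enc = conv_block(in_ch, base)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base, base * 2)
        self.up = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec = conv_block(base * 2, base)  # input: upsampled + skip channels
        self.head = nn.Conv2d(base, out_ch, kernel_size=1)  # per-pixel logits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skip = self.enc(x)                    # high-resolution features
        x = self.bottleneck(self.pool(skip))  # coarse, semantically rich features
        x = self.up(x)                        # back to the input resolution
        x = torch.cat([x, skip], dim=1)       # skip connection restores detail
        return self.head(self.dec(x))


mask_logits = TinyUNet()(torch.randn(2, 1, 64, 64))
print(mask_logits.shape)
```

The concatenation is what lets the decoder recover fine boundaries that pooling discarded, which matters when every pixel needs a label.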
We teach PyTorch exclusively. This is a deliberate choice based on where industry and the research community have moved. Here's the full picture:
| Dimension | PyTorch ✓ (taught here) | TensorFlow / Keras |
|---|---|---|
| Research adoption | ~80% of ML papers use PyTorch ✓ | Declining in research |
| Industry (AI startups) | Default choice at most AI startups ✓ | Used in legacy systems |
| Debugging experience | Python-native, easy to step through ✓ | Graph mode harder to debug |
| Dynamic computation | Dynamic by default; intuitive ✓ | Eager by default since TF 2.x; graph mode via tf.function |
| HuggingFace / LLMs | Near-universal PyTorch backend ✓ | Limited support |
| Interview signal | Expected at most AI interviews ✓ | Acceptable but secondary |
Computer Vision Engineer
CV engineers are in high demand at autonomous vehicle, surveillance, medtech, and manufacturing companies.
₹12–24 LPA fresher range
Deep Learning Engineer
Roles at AI labs, product startups, and R&D divisions of large tech firms building neural-network-powered products.
₹14–28 LPA fresher range
AI in Healthcare / MedTech
Medical imaging AI is one of the fastest-growing CV application areas. U-Net and segmentation skills are directly applicable.
High growth sector
Research Engineer / Intern
Top research internships (IISc, IITB, CMU remote) require PyTorch fluency and an understanding of CNN architectures.
₹60–150k/month stipend
// This course is for
- Perceptron → MLP: forward pass, weights, biases
- Activation functions: ReLU, sigmoid, tanh, GELU — when to use each
- Backpropagation: chain rule, gradient flow, vanishing gradients
- NumPy MLP from scratch on MNIST
- PyTorch: tensors, autograd, nn.Module, optim
- Training loop: forward → loss → backward → step
- GPU setup on Google Colab — move tensors to CUDA
- Convolution: kernels, stride, padding, receptive field
- Pooling: max vs average, global average pooling
- Batch Normalisation: why it works, where to place it
- Dropout: regularisation for deep networks
- Build LeNet → AlexNet → VGG in PyTorch step by step
- DataLoader, transforms, augmentation with torchvision
- Train on CIFAR-10: reach 93%+ accuracy
- ResNet: skip connections, residual blocks, why depth works now
- EfficientNet: compound scaling, MobileNet-style efficiency
- Learning rate schedulers: StepLR, CosineAnnealing, OneCycleLR
- Mixed precision training (torch.cuda.amp): often close to 2x faster on Colab GPUs
- Data augmentation: RandomCrop, Cutout, MixUp
- Grad-CAM: visualise what the network attends to
- TensorBoard logging for training diagnostics
- Transfer learning: feature extraction vs fine-tuning
- Load pretrained ResNet-50 via torchvision.models
- Replace classifier head for custom number of classes
- Freeze/unfreeze layers — progressive unfreezing strategy
- Custom ImageFolder dataset: organise, load, augment
- Project: plant disease classification (38 classes, 87k images)
- Confusion matrix, per-class accuracy, misclassification analysis
- Object detection fundamentals: bounding boxes, IoU, NMS
- Two-stage (R-CNN) vs one-stage (YOLO) detectors
- Anchor boxes: intuition and design
- YOLOv5 architecture walkthrough and configuration
- Label custom data with Roboflow (free tier)
- Fine-tune YOLOv5 on custom dataset
- mAP@0.5 evaluation, precision-recall curves
- Live inference: webcam demo in OpenCV
- Semantic vs instance vs panoptic segmentation
- U-Net architecture: encoder-decoder, skip connections
- Dice loss, IoU loss for segmentation tasks
- Project: chest X-ray lung segmentation
- ONNX export: framework-agnostic model format
- TorchScript: serialise for production
- FastAPI endpoint for image inference
- Capstone: end-to-end CV system of your choice
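Of the segmentation topics listed above, the Dice loss is easy to show in a few lines. The version below is one common soft-Dice formulation for binary masks, offered as a sketch; the function name, the `eps` smoothing term, and the `(N, 1, H, W)` tensor layout are assumptions.

```python
import torch


def dice_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss for binary segmentation.

    logits:  raw model outputs, shape (N, 1, H, W)
    targets: ground-truth masks with values in {0, 1}, same shape
    """
    probs = torch.sigmoid(logits)
    dims = (1, 2, 3)  # sum over channel and spatial axes, keep the batch axis
    intersection = (probs * targets).sum(dim=dims)
    denom = probs.sum(dim=dims) + targets.sum(dim=dims)
    dice = (2 * intersection + eps) / (denom + eps)  # eps avoids division by zero
    return 1 - dice.mean()


targets = torch.ones(1, 1, 4, 4)
good = dice_loss(torch.full((1, 1, 4, 4), 20.0), targets)   # confident and correct
bad = dice_loss(torch.full((1, 1, 4, 4), -20.0), targets)   # confident and wrong
print(float(good), float(bad))
```

Because Dice measures overlap rather than per-pixel accuracy, it tends to behave better than plain cross-entropy when the foreground occupies only a small part of the image.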
CIFAR-10 CNN Classifier
Build a custom CNN from scratch in PyTorch. Beat 90% accuracy, visualise filters, and interpret predictions with Grad-CAM heatmaps.
Plant Disease Detection
Fine-tune ResNet-50 on 87k plant disease images across 38 categories. Deploy as a FastAPI endpoint that accepts image uploads.
Custom Object Detector
Label your own dataset (any object: vehicles, products, defects), train YOLOv5, and run real-time inference on a video stream. Fully deployable demo.
End-to-End CV System
Pick your domain: medical imaging, retail, agriculture, or satellite imagery. Build a full pipeline from data collection → model training → Grad-CAM interpretability → FastAPI serving.
Newton JEE Silver Badge
ML Practitioner — Deep Learning & Computer Vision
Silver Badge — The Deep Learning Signal
The Silver badge in Deep Learning & CV tells recruiters something specific: you can build, train, and deploy vision models end-to-end with PyTorch. It's a verifiable credential that sits alongside your ML Mastery Silver badge as a two-badge cluster that opens CV/DL interview doors.
The YOLOv5 session was insane. We labelled our own data using Roboflow and had a working object detector running on a webcam by end of session. I showed the demo in my interview at a robotics startup and got the offer on the spot.
Divya is the best instructor I've had — in any medium. She explains skip connections in ResNet using a metaphor that made me understand in 30 seconds what three YouTube videos couldn't explain in three hours. The Grad-CAM session changed how I think about model debugging.
I was worried about GPU access but the pre-configured Colab notebooks removed every barrier. From session one I was training on GPUs. The plant disease project alone took my portfolio from zero to something I'm actually proud to show.
The medical imaging capstone was genuinely hard — and I mean that as a compliment. Building a chest X-ray segmentation model and presenting it to Divya for review pushed me to a level I wouldn't have reached alone. It's now my top portfolio piece.