Assignment 1: Performance & Hardware Analysis
A deep dive into ResNet architectures (18, 32, 50) and SVM benchmarks. Features full analysis of FLOPs and hardware speedup factors.
Bridging the gap between Deep Learning research and Production Engineering
Analysis of model depth vs. efficiency, FLOPs tracking, and hardware-aware training (CPU vs. GPU).
Automating model lifecycles, reproducibility, version control for data/models, and scalable deployment.
Focusing on Dockerizing ML models and setting up automated GitHub Action pipelines for model testing.
| Backend | Model | Accuracy | Time (ms) | FLOPs |
|---|---|---|---|---|
| GPU (CUDA) | ResNet-18 | 85.59% | 59,971 | 5.51e+08 |
| CPU (Intel) | ResNet-18 | 85.24% | 2,375,155 | 5.51e+08 |
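Note: The GPU ran ~40x faster than the CPU (59,971 ms vs. 2,375,155 ms). FLOPs are identical on both backends because they depend on the model, not the hardware, and accuracy differs by only 0.35 percentage points.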
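
As a quick check on the speedup figure, the ratio can be recomputed directly from the two timings in the table. This is a minimal sketch: the variable and function names are illustrative and are not taken from the assignment code.

```python
# Wall-clock times in milliseconds, copied from the benchmark table above.
gpu_time_ms = 59_971
cpu_time_ms = 2_375_155


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_ms <= 0:
        raise ValueError("accelerated_ms must be positive")
    return baseline_ms / accelerated_ms


factor = speedup(cpu_time_ms, gpu_time_ms)
print(f"GPU speedup over CPU: {factor:.1f}x")  # prints "GPU speedup over CPU: 39.6x"
```

The result rounds to the ~40x quoted in the note.
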
| Architecture | Params | GFLOPs |
|---|---|---|
| ResNet-18 | 11.2 M | 0.55 |
| ResNet-32 | 21.3 M | 1.14 |
| ResNet-50 | 23.5 M | 1.28 |
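
To make the depth vs. efficiency trade-off easier to read, the table values can be normalized against ResNet-18. The sketch below only re-expresses the numbers already listed. The dictionary layout and the `relative_cost` helper are hypothetical names introduced for illustration.

```python
# Parameter counts (millions) and GFLOPs, copied from the architecture table above.
models = {
    "ResNet-18": {"params_m": 11.2, "gflops": 0.55},
    "ResNet-32": {"params_m": 21.3, "gflops": 1.14},
    "ResNet-50": {"params_m": 23.5, "gflops": 1.28},
}

baseline = models["ResNet-18"]


def relative_cost(name: str) -> tuple[float, float]:
    """Return (parameter ratio, GFLOPs ratio) of a model relative to ResNet-18."""
    m = models[name]
    return (
        m["params_m"] / baseline["params_m"],
        m["gflops"] / baseline["gflops"],
    )


for name in models:
    params_ratio, flops_ratio = relative_cost(name)
    print(f"{name}: {params_ratio:.2f}x params, {flops_ratio:.2f}x GFLOPs")
```

By these figures, ResNet-50 carries about 2.1x the parameters and about 2.3x the compute of ResNet-18.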