Mini-Vision-V1: CIFAR-10 CNN Classifier
Welcome to Mini-Vision-V1, the first model in the Mini-Vision series. This project implements a Convolutional Neural Network (CNN) for image classification on the CIFAR-10 dataset. It is designed to be lightweight, efficient, and easy to understand, which makes it a good starting point for beginners learning PyTorch.
Model Description
Mini-Vision-V1 is a custom CNN with four convolutional blocks followed by a two-layer classifier head. It uses Batch Normalization to stabilize training and Dropout to reduce overfitting. With only 1.34M parameters, it reaches 78% accuracy on the CIFAR-10 test set.
- Dataset: CIFAR-10 (32x32 color images, 10 classes)
- Framework: PyTorch
- Total Parameters: 1.34M
Model Architecture
The network consists of 4 convolutional blocks followed by a classifier head.
| Layer | Input Channels | Output Channels | Kernel Size | Stride | Padding | Activation | Other |
|---|---|---|---|---|---|---|---|
| Conv Block 1 | 3 | 32 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| Conv Block 2 | 32 | 64 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| Conv Block 3 | 64 | 128 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| Conv Block 4 | 128 | 256 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| Flatten | - | - | - | - | - | - | Output: 1024 |
| Linear 1 | 1024 | 256 | - | - | - | ReLU | Dropout(0.5) |
| Linear 2 | 256 | 10 | - | - | - | - | - |
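The table can be read as the following PyTorch module. This is a minimal sketch rather than the contents of model.py: the class name `MiniVisionSketch`, the helper `conv_block`, and the order of ReLU, MaxPool, and BatchNorm inside each block are assumptions, since the table lists the components but not their order.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """One convolutional block: 5x5 conv (stride 1, padding 2), ReLU, 2x2 max pool, batch norm.

    The order of the last three layers is an assumption, not taken from model.py.
    """
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=1, padding=2),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),          # halves height and width
        nn.BatchNorm2d(out_ch),
    )


class MiniVisionSketch(nn.Module):  # hypothetical name
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 32),     # 32x32 -> 16x16
            conv_block(32, 64),    # 16x16 -> 8x8
            conv_block(64, 128),   # 8x8   -> 4x4
            conv_block(128, 256),  # 4x4   -> 2x2
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),          # 256 channels * 2 * 2 = 1024 features
            nn.Linear(1024, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def count_parameters(model: nn.Module) -> int:
    """Number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```

With PyTorch's default bias terms and affine BatchNorm, this sketch has 1,344,010 trainable parameters, which is consistent with the 1.34M figure reported here.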
Training Strategy
The model was trained using standard practices for CIFAR-10 to maximize performance on a small footprint.
- Optimizer: SGD (Momentum=0.9)
- Initial Learning Rate: 0.007
- Scheduler: StepLR (Step size=5, Gamma=0.5)
- Loss Function: CrossEntropyLoss
- Batch Size: 256
- Epochs: 100 in total; best accuracy reached at epoch 31
- Data Augmentation:
  - Random Crop (32x32 with padding=4)
  - Random Horizontal Flip
Performance
The model achieved the following results on the CIFAR-10 test set:
| Metric | Value |
|---|---|
| Test Accuracy | 78% |
| Parameters | 1.34M |
Training Visualization (TensorBoard)
Below are the training and testing curves visualized via TensorBoard.
1. Training Loss (see assets/train_loss.png)
2. Test Loss (see assets/test_loss.png)
Quick Start
Dependencies
- Python 3.x
- PyTorch
- Torchvision
- requirements.txt
Inference
You can easily load the model and perform inference on a single image using the test.py file.
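For readers who want to see what single-image inference involves, here is a minimal sketch. It is not the contents of test.py: the function name `predict` is hypothetical, the input is assumed to be a float tensor of shape (3, 32, 32) that has already been preprocessed the same way as during training, and the class names follow the standard CIFAR-10 label order.

```python
import torch
from torch import nn

# Standard CIFAR-10 label order (index 0 to 9)
CIFAR10_CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)


@torch.no_grad()
def predict(model: nn.Module, image: torch.Tensor):
    """Classify one image tensor of shape (3, 32, 32).

    Returns the predicted class name and its softmax probability.
    """
    model.eval()                          # disable dropout, use BatchNorm running stats
    logits = model(image.unsqueeze(0))    # add a batch dimension -> (1, 3, 32, 32)
    probs = torch.softmax(logits, dim=1)[0]
    idx = int(probs.argmax())
    return CIFAR10_CLASSES[idx], float(probs[idx])


# Loading the released weights might look like this, ASSUMING the .pth file
# stores a state_dict and `model` is an instance of the class in model.py:
#   state = torch.load("Mini-Vision-V1.pth", map_location="cpu")
#   model.load_state_dict(state)
```

Calling `predict(model, image)` returns a `(label, confidence)` pair. If the checkpoint stores a full pickled model rather than a state_dict, the loading step would differ.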
File Structure
.
├── model.py # Model architecture definition
├── train.py # Training script
├── test.py # Inference script
├── Mini-Vision-V1.pth # Trained model weights
├── config.json
├── README.md
└── assets
    ├── train_loss.png # Visualized train loss graph
    └── test_loss.png # Visualized test loss graph
License
This project is licensed under the MIT License.

