Mini-Vision-V1: CIFAR-10 CNN Classifier

Welcome to Mini-Vision-V1, the first model in the Mini-Vision series. This project implements a small Convolutional Neural Network (CNN) for image classification on the CIFAR-10 dataset. It is designed to be lightweight and easy to understand, which makes it a good starting point for beginners learning PyTorch.

Model Description

Mini-Vision-V1 is a custom CNN with four convolutional blocks followed by a two-layer classifier head. It uses Batch Normalization to stabilize training and Dropout to reduce overfitting. With about 1.34M parameters, it reaches 78% accuracy on the CIFAR-10 test set.

Dataset: CIFAR-10 (32x32 color images, 10 classes)
Framework: PyTorch
Total Parameters: 1.34M
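
The 1.34M figure can be reproduced by hand from the layer shapes listed in the Model Architecture table. The sketch below is a plain-Python calculation, not code from this repository. It assumes that every convolution and linear layer has a bias term and that each BatchNorm layer has a learnable scale and shift (2 parameters per channel); the helper names are hypothetical.

```python
# Parameter count derived from the layer shapes in the architecture table.
# Assumptions: conv and linear layers have biases; BatchNorm is affine.

def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights (c_in * c_out * k * k) plus one bias per output channel."""
    return c_in * c_out * k * k + c_out

def bn_params(channels: int) -> int:
    """Learnable scale and shift per channel (running stats are not parameters)."""
    return 2 * channels

def linear_params(n_in: int, n_out: int) -> int:
    """Weight matrix plus one bias per output feature."""
    return n_in * n_out + n_out

channels = [3, 32, 64, 128, 256]
conv_total = sum(
    conv_params(c_in, c_out, 5) + bn_params(c_out)
    for c_in, c_out in zip(channels, channels[1:])
)

# Each block halves the spatial size with MaxPool(2): 32 -> 16 -> 8 -> 4 -> 2.
side = 32
for _ in range(4):
    side //= 2
flat_features = channels[-1] * side * side

total = conv_total + linear_params(flat_features, 256) + linear_params(256, 10)
print(flat_features)  # 1024
print(total)          # 1344010
```

Under these assumptions the total is 1,344,010, which rounds to the stated 1.34M. About 61% of it (819,456 parameters) sits in the fourth convolutional layer alone.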

Model Architecture

The network consists of 4 convolutional blocks followed by a classifier head.

Layer	Input Channels	Output Channels	Kernel Size	Stride	Padding	Activation	Other
Conv Block 1	3	32	5	1	2	ReLU	MaxPool(2), BatchNorm
Conv Block 2	32	64	5	1	2	ReLU	MaxPool(2), BatchNorm
Conv Block 3	64	128	5	1	2	ReLU	MaxPool(2), BatchNorm
Conv Block 4	128	256	5	1	2	ReLU	MaxPool(2), BatchNorm
Flatten	-	-	-	-	-	-	Output: 1024
Linear 1	1024	256	-	-	-	ReLU	Dropout(0.5)
Linear 2	256	10	-	-	-	-	-
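
The table above can be written out in PyTorch roughly as follows. This is a minimal sketch and not the contents of model.py: the class name `MiniVisionSketch`, the helper `conv_block`, and the order of operations inside each block (Conv, ReLU, MaxPool, BatchNorm) and in the head (Linear, ReLU, Dropout) are assumptions, since the table lists the components but not their order.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """One block from the table. The op order here is an assumption."""
    return nn.Sequential(
        # 5x5 kernel with padding 2 and stride 1 keeps the spatial size.
        nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=1, padding=2),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),  # halves height and width
        nn.BatchNorm2d(out_ch),
    )


class MiniVisionSketch(nn.Module):  # hypothetical name, not from model.py
    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 32),     # 32x32 -> 16x16
            conv_block(32, 64),    # 16x16 -> 8x8
            conv_block(64, 128),   # 8x8   -> 4x4
            conv_block(128, 256),  # 4x4   -> 2x2
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),          # 256 channels * 2 * 2 = 1024 features
            nn.Linear(1024, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


model = MiniVisionSketch().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 32, 32))  # batch of 2 random "images"

n_params = sum(p.numel() for p in model.parameters())
print(tuple(logits.shape), n_params)
```

A batch of 32x32 RGB inputs yields one row of 10 class logits per image. Because the layer names and ordering are guesses, a state dict saved from the real model.py may not load into this class without renaming keys.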

Training Strategy

The model was trained using standard practices for CIFAR-10 to maximize performance on a small footprint.

Optimizer: SGD (Momentum=0.9)
Initial Learning Rate: 0.007
Scheduler: StepLR (Step size=5, Gamma=0.5)
Loss Function: CrossEntropyLoss
Batch Size: 256
Epochs: 100 in total; best accuracy reached at epoch 31
Data Augmentation:
- Random Crop (32x32 with padding=4)
- Random Horizontal Flip
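
The optimizer, scheduler, and loss settings listed above could be wired together as in the sketch below. It is not train.py: the stand-in `nn.Linear` model, the random dummy batch, and the choice to step the scheduler once per epoch are assumptions made so the example runs on its own. Data loading and augmentation are left out.

```python
import torch
from torch import nn

# Stand-in model so the sketch is self-contained (assumption, not the real CNN).
model = nn.Linear(8, 10)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.007, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)

lr_per_epoch = []
for epoch in range(100):
    # Learning rate in effect during this epoch.
    lr_per_epoch.append(scheduler.get_last_lr()[0])

    # One tiny dummy batch stands in for a pass over the training set
    # (the real run used batch size 256).
    inputs = torch.randn(4, 8)
    targets = torch.randint(0, 10, (4,))

    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # Assumed to be called once per epoch: halves the LR every 5 epochs.
    scheduler.step()

print(lr_per_epoch[0], lr_per_epoch[5], lr_per_epoch[30], lr_per_epoch[99])
```

With a per-epoch step, the learning rate in epoch index e is 0.007 × 0.5^(e // 5). Around the epoch of best accuracy that is 0.007 × 0.5^6 ≈ 1.1e-4, and in the last five epochs it is 0.007 × 0.5^19 ≈ 1.3e-8, so later epochs change the weights very little.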

Performance

The model achieved the following results on the CIFAR-10 test set:

Metric	Value
Test Accuracy	78%
Parameters	1.34M

Training Visualization (TensorBoard)

Below are the training and testing curves visualized via TensorBoard.

1. Training Loss (recorded every step): assets/train_loss.png

2. Test Loss (recorded every epoch): assets/test_loss.png

Quick Start

Dependencies

Python 3.x
PyTorch
Torchvision
requirements.txt

Inference

You can easily load the model and perform inference on a single image using the test.py file.
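
For orientation, single-image inference in PyTorch generally looks like the sketch below. It does not reproduce test.py, whose contents are not shown here. The function `predict`, the stand-in model, and the random input tensor are hypothetical; the class order is the standard CIFAR-10 label order, and whether the released weights follow it is an assumption. Loading Mini-Vision-V1.pth is omitted because the checkpoint format (state dict or full model) is not documented.

```python
import torch
from torch import nn

# Standard CIFAR-10 label order (index 0..9).
CIFAR10_CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)


def predict(model: nn.Module, image: torch.Tensor) -> tuple[str, float]:
    """Classify one image tensor of shape (3, 32, 32).

    Returns the predicted class name and its softmax probability.
    """
    if tuple(image.shape) != (3, 32, 32):
        raise ValueError(f"expected shape (3, 32, 32), got {tuple(image.shape)}")

    model.eval()  # disable Dropout, use BatchNorm running statistics
    with torch.no_grad():
        logits = model(image.unsqueeze(0))      # add batch dimension
        probs = torch.softmax(logits, dim=1)[0]  # probabilities for 10 classes

    idx = int(probs.argmax())
    return CIFAR10_CLASSES[idx], float(probs[idx])


# Stand-in model and random input so the sketch runs without the real weights.
stand_in = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
label, confidence = predict(stand_in, torch.rand(3, 32, 32))
print(label, round(confidence, 3))
```

With the real model, the input image would need to be resized to 32x32 and preprocessed the same way as during training before being passed to a function like this.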

File Structure

.
├── model.py               # Model architecture definition
├── train.py               # Training script
├── test.py                # Inference script
├── Mini-Vision-V1.pth     # Trained model weights
├── config.json
├── README.md
└── assets
      ├── train_loss.png   # Visualized train loss graph
      └── test_loss.png    # Visualized test loss graph

License

This project is licensed under the MIT License.
