Mini-Vision-V1: CIFAR-10 CNN Classifier

Mini-Vision-V1 is the first model in the Mini-Vision series. It is a compact Convolutional Neural Network (CNN) for image classification on the CIFAR-10 dataset, built to be lightweight and easy to read, which makes it a good starting point for beginners learning PyTorch.

Model Description

Mini-Vision-V1 is a custom CNN with four convolutional blocks and a two-layer classifier head. It uses Batch Normalization to stabilize training and Dropout to reduce overfitting. With only 1.34M parameters, it reaches 78% accuracy on the CIFAR-10 test set.

  • Dataset: CIFAR-10 (32x32 color images, 10 classes)
  • Framework: PyTorch
  • Total Parameters: 1.34M

Model Architecture

The network consists of 4 convolutional blocks followed by a classifier head.

Layer         Input Channels  Output Channels  Kernel Size  Stride  Padding  Activation  Other
Conv Block 1  3               32               5            1       2        ReLU        MaxPool(2), BatchNorm
Conv Block 2  32              64               5            1       2        ReLU        MaxPool(2), BatchNorm
Conv Block 3  64              128              5            1       2        ReLU        MaxPool(2), BatchNorm
Conv Block 4  128             256              5            1       2        ReLU        MaxPool(2), BatchNorm
Flatten       -               -                -            -       -        -           Output: 1024
Linear 1      1024            256              -            -       -        ReLU        Dropout(0.5)
Linear 2      256             10               -            -       -        -           -
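
The table above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the contents of model.py: the class name MiniVisionSketch and the helper conv_block are hypothetical, and the order of operations inside each block (Conv, BatchNorm, ReLU, MaxPool) is an assumption, since the table lists the components but not their order. Convolution biases and BatchNorm affine parameters are left at PyTorch defaults.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """One block from the table. The op order here is an assumption."""
    return nn.Sequential(
        # 5x5 kernel, stride 1, padding 2 keeps the spatial size unchanged.
        nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=1, padding=2),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),  # halves height and width
    )


class MiniVisionSketch(nn.Module):  # hypothetical name, not taken from model.py
    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 32),     # 32x32 -> 16x16
            conv_block(32, 64),    # 16x16 -> 8x8
            conv_block(64, 128),   # 8x8   -> 4x4
            conv_block(128, 256),  # 4x4   -> 2x2
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),          # 256 channels * 2 * 2 = 1024 features
            nn.Linear(1024, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


model = MiniVisionSketch().eval()
with torch.no_grad():
    logits = model(torch.zeros(1, 3, 32, 32))  # one dummy CIFAR-10-sized image
n_params = sum(p.numel() for p in model.parameters())
print(tuple(logits.shape), n_params)
```

Under these assumptions the sketch has 1,344,010 trainable parameters, which is consistent with the 1.34M figure, and a 3x32x32 input produces 10 logits. The four pooling steps reduce 32x32 to 2x2, which is where the Flatten output of 1024 comes from.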

Training Strategy

The model was trained using standard practices for CIFAR-10 to maximize performance on a small footprint.

  • Optimizer: SGD (Momentum=0.9)
  • Initial Learning Rate: 0.007
  • Scheduler: StepLR (Step size=5, Gamma=0.5)
  • Loss Function: CrossEntropyLoss
  • Batch Size: 256
  • Epochs: 100 in total; best test accuracy reached at epoch 31
  • Data Augmentation:
    • Random Crop (32x32 with padding=4)
    • Random Horizontal Flip
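
To make the schedule concrete, the step decay listed above can be worked out in plain Python. The sketch assumes the scheduler is advanced once per epoch (the usual way StepLR is used), that epochs are counted from 0, and that the standard 50,000-image CIFAR-10 training split is used with the last partial batch kept. The function name lr_at_epoch is illustrative only.

```python
import math

BASE_LR = 0.007        # initial learning rate
STEP_SIZE = 5          # epochs between decays
GAMMA = 0.5            # multiplicative decay factor
BATCH_SIZE = 256
TRAIN_IMAGES = 50_000  # standard CIFAR-10 training split (assumption)


def lr_at_epoch(epoch: int) -> float:
    """Learning rate during 0-indexed `epoch` under step decay."""
    return BASE_LR * GAMMA ** (epoch // STEP_SIZE)


# Number of optimizer steps per epoch if the last partial batch is kept.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)

for epoch in (0, 5, 10, 30, 99):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.3e}")
print("optimizer steps per epoch:", steps_per_epoch)
```

With these settings the learning rate halves every 5 epochs, so by epoch 30 it is 0.007 / 64 (about 1.09e-4), and each epoch takes 196 optimizer steps.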

Performance

The model achieved the following results on the CIFAR-10 test set:

Metric         Value
Test Accuracy  78%
Parameters     1.34M

Training Visualization (TensorBoard)

Below are the training and testing curves visualized via TensorBoard.

1. Training Loss

[Figure: training loss, recorded every step (assets/train_loss.png)]

2. Test Loss

[Figure: test loss, recorded every epoch (assets/test_loss.png)]

Quick Start

Dependencies

  • Python 3.x
  • PyTorch
  • Torchvision
  • requirements.txt

Inference

Use test.py to load the trained weights and run inference on a single image.
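
For orientation, single-image inference generally looks like the sketch below. It is not a copy of test.py: the predict helper is hypothetical, the stand-in model exists only so the snippet runs without the checkpoint, and the preprocessing is an assumption (a float tensor of shape 3x32x32 with no normalization). Whether Mini-Vision-V1.pth stores a state dict or a whole model is not stated here, so the loading step is left as a comment.

```python
import torch
from torch import nn

# Label order used by torchvision.datasets.CIFAR10.
CIFAR10_CLASSES = ("airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck")


def predict(model: nn.Module, image: torch.Tensor) -> tuple[str, float]:
    """Classify one 3x32x32 float tensor; returns (label, probability)."""
    model.eval()  # disables Dropout and uses BatchNorm running statistics
    with torch.no_grad():
        logits = model(image.unsqueeze(0))       # add a batch dimension
        probs = torch.softmax(logits, dim=1)[0]  # probabilities over 10 classes
    idx = int(probs.argmax())
    return CIFAR10_CLASSES[idx], float(probs[idx])


# Stand-in model so the snippet is runnable on its own. In practice, build the
# network defined in model.py and load the weights from Mini-Vision-V1.pth.
stand_in = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
label, prob = predict(stand_in, torch.rand(3, 32, 32))
print(label, round(prob, 3))
```

Calling model.eval() matters for this architecture because both Dropout and BatchNorm behave differently in training mode.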

File Structure

.
├── model.py               # Model architecture definition
├── train.py               # Training script
├── test.py                # Inference script
├── Mini-Vision-V1.pth     # Trained model weights
├── config.json
├── README.md
└── assets
    ├── train_loss.png     # Visualized train loss graph
    └── test_loss.png      # Visualized test loss graph

License

This project is licensed under the MIT License.
