
Image Quality Assessment for Machines: Paradigm, Large-scale Database, and Models


δΈ­ζ–‡ | English | Model | Dataset

🎯 Project Overview
  • πŸ€– Machine-Centric: We bypass human perception to evaluate images from the perspective of the deep learning models that use them.
  • πŸ“ˆ Task-Driven Metrics: Directly measure how degradations like blur, noise, or compression artifacts impact the performance of downstream vision tasks.
  • πŸ’‘ A New Paradigm: MIQA offers a new lens for optimizing image processing pipelines where machines make the final decision.

✨ Does MIQA Work?

Performance improvement across tasks when filtering low-quality images using MIQA scores

πŸ—οΈ Key Results Our results provide clear evidence of MIQA's effectiveness across three representative computer vision tasks: classification, detection, and segmentation. The framework consistently identifies images that degrade model performance. By filtering these detrimental samples, MIQA directly leads to improved outcomes and demonstrates the universal utility of a machine-centric approach. This transforms quality assessment from a passive metric into a proactive tool, safeguarding downstream models against the unpredictable image quality of real-world conditions and ensuring robust performance when it matters most.

πŸ› οΈ Installation Guide

Step 1: Install Dependencies

To get started, you'll need to install two essential libraries: mmcv and mmsegmentation.

Install mmcv and mmsegmentation
  • For the latest version of mmsegmentation, follow the installation guide here: MMSegmentation Installation Guide

  • Alternatively, you can install a specific version of mmcv based on your CUDA and PyTorch versions. You can find the version compatibility details here: MMCV Installation Guide. A typical install flow is sketched below.
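
If you want a concrete starting point, here is a minimal sketch of the usual OpenMMLab install flow via mim; the version pins are examples only and should be matched to your CUDA/PyTorch setup:

# Minimal sketch: install OpenMMLab components with mim (version pins are examples)
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
mim install mmsegmentation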

Step 2: Handle CUDA Version Compatibility

If your CUDA version is relatively new (e.g., a recent 12.x release), you might encounter a version mismatch with mmcv. In this case, you may need to install a compatible version of mmcv.
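
Before pinning an mmcv version, it helps to confirm which CUDA build your PyTorch installation actually uses; a quick sanity check:

# Print the PyTorch version and the CUDA version it was built against
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Show the driver-side CUDA version (if an NVIDIA driver is installed)
nvidia-smi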

Install a compatible version of mmcv

For example, if you need a specific version of mmcv, you can uninstall the existing versions and install a compatible one as follows:

pip uninstall mmcv mmcv-full -y
mim install "mmcv>=2.0.0rc4,<2.2.0"  # The version specified here is just an example; choose one compatible with your CUDA and PyTorch setup.

Step 3: Install Required Libraries

pip install -r requirements.txt

πŸ“¦ Model Weights & Performance

Where things live

| Role | Location |
|---|---|
| Application code (training, inference, evaluation) | GitHub repository: github.com/XiaoqiWang/MIQA |
| Published RA-MIQA checkpoints (9 files) | Hugging Face model repo: xiaoqi-wang/miqa |
| MIQD-2.5M database | Hugging Face dataset: xiaoqi-wang/miqd-2.5m |

Naming & cache

Checkpoint pattern on the Hub (3 tasks × 3 metric types = 9 files): miqa_ra_miqa_{cls|det|ins}_{composite|consistency|accuracy}_metric.pth.tar

Examples:

  • miqa_ra_miqa_cls_composite_metric.pth.tar
  • miqa_ra_miqa_det_consistency_metric.pth.tar
  • miqa_ra_miqa_ins_accuracy_metric.pth.tar

On first run, huggingface_hub downloads into models/checkpoints/{composite|consistency|accuracy}_metric/.
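
The automatic download makes manual fetching optional, but if you want to pull a single checkpoint yourself, here is a sketch using huggingface-cli; the target directory mirrors the cache layout above:

# Manually fetch one checkpoint into the expected cache location
huggingface-cli download xiaoqi-wang/miqa miqa_ra_miqa_cls_composite_metric.pth.tar --local-dir models/checkpoints/composite_metric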

πŸš€ Quick Start

Assess a Single Image

Run MIQA inference on a single image using the command-line interface:

# Evaluate a single image for classification-oriented MIQA

python img_inference.py --input path/to/image.jpg --task cls --model ra_miqa

Evaluate a Directory of Images

Process all images within a directory:

# Assess all images in a directory (e.g., detection-oriented MIQA)

python img_inference.py --input ./assets/demo_images/coco_demo --task det --model ra_miqa

Save Results and Visualizations

To save outputs and generate visualized results:

# Save the predicted scores and visualization for a single image
python img_inference.py --input path/to/image.jpg --task cls --model ra_miqa --save-results --visualize

# Save batch results and generate visualization for a directory
python img_inference.py --input ./assets/demo_images/imagenet_demo --task ins --save-results --visualize
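
Because quality is task-dependent, the same image can score differently under each task-oriented model. A small shell sketch, using only the flags documented above, that scores one image under all three:

# Score the same image with the classification-, detection-, and segmentation-oriented models
for task in cls det ins; do
    python img_inference.py --input path/to/image.jpg --task $task --model ra_miqa
done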

🎬 Video Assessment

Video quality assessment offers two workflows:

  1. Frame-by-Frame Annotation: generates fully annotated videos for detailed visual inspection. Suitable for demos and qualitative analysis, but computationally intensive.
  2. Selective Sampling & Aggregation: samples frames to produce plots and structured data (.json) for efficient, quantitative analysis. Ideal for batch processing and reporting.

Analyze a Single Video (Frame-by-Frame Annotation)

Run MIQA video inference for one video and save the annotated output.

# Evaluate a single video using RA-MIQA (classification-oriented MIQA)
python video_annotator_inference.py --input assets/demo_video/brightness_distorted.mp4 --task cls --model ra_miqa

Evaluate a Directory of Videos (Frame-by-Frame Annotation)

Process all videos within a given folder:

# Assess all videos in a directory for object detection-oriented MIQA
python video_annotator_inference.py --input assets/demo_video/ --task det --model ra_miqa

The primary output is a new .mp4 video file. This video shows the original footage playing alongside a dynamic side panel that displays the real-time quality score and a line chart that grows as the video progresses.

πŸŽ₯ Example: frame-wise MIQA predictions on videos (panels: Brightness Variation, Compression Artifacts, Minimal Perceptual Distortion)

Analyze a Single Video (Selective Sampling & Aggregation)

For efficient, quantitative analysis, this script samples frames from the video instead of processing all of them. It is significantly faster and designed for generating analytical reports.

# Analyze a video, sample frames, and create a dual-granularity plot
python video_analytics_inference.py --input assets/demo_video/gaussian_distorted.mp4 --task ins --visualize --viz-granularity both

Evaluate a Directory of Videos (Selective Sampling & Aggregation)

This workflow is highly optimized for batch processing.

# Analyze all videos in a directory, sampling 120 frames from each
python video_analytics_inference.py --input assets/demo_video/ --task det --video-frames 120 --visualize

python video_analytics_inference.py --input assets/demo_video/jpeg_distorted.mp4 --task det --visualize --viz-granularity both
# --viz-granularity both: generate a composite, side-by-side plot showing
# (1) the raw frame-level quality scores and (2) the smoothed per-second average scores.

This process does not create a new video. Instead, it generates two key outputs for each video analyzed:

  1. A .png image: A detailed time-series plot showing the quality score fluctuation over the video's duration.
  2. A .json file: A structured data file containing per-second aggregated scores, overall statistics (average, min, max, std. dev), and video metadata.
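
To inspect the structure of the .json report, a pretty-print is often enough; the filename below is hypothetical, so substitute the path the script actually writes:

# Pretty-print the analytics JSON (hypothetical output path; use the one the script reports)
python -m json.tool path/to/video_analysis.json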

πŸƒ Training and Evaluation

Training

CUDA_VISIBLE_DEVICES=0,1 python train.py \
      --dataset 'miqa_cls' \
      --path_miqa_cls 'path/to/datasets_miqa_cls' \
      --train_split_file '../data/dataset_splitting/miqa_cls_train.csv' \
      --val_split_file '../data/dataset_splitting/miqa_cls_val.csv' \
      --metric_type 'composite' --loss_name 'mse' --is_two_transform \
      -a 'RA-MIQA' --pretrained --transform_type 'simple_transform' \
      -b 256 --epochs 5 --warmup_epochs 1 --validate_num 2 --lr 1e-4 \
      --image_size 288 --crop_size 224 --workers 8 -p 100 \
      --multiprocessing-distributed --world-size 1 --rank 0

More training scripts are available in the "scripts" directory.

Evaluation on Standard Benchmarks

# Evaluate on miqa_cls val set
python evaluate.py --model_name ra_miqa  --train_dataset cls  --test_dataset cls  --metric_type composite

# Cross-dataset evaluation: RA-MIQA trained on miqa_cls, tested on miqa_det
python evaluate.py --model_name ra_miqa  --train_dataset cls  --test_dataset det  --metric_type composite
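
To reproduce the full cross-dataset picture, you can sweep every train/test task pair; a sketch built from the evaluate.py flags shown above:

# Sweep all 3x3 train/test task combinations for the composite metric
for tr in cls det ins; do
    for te in cls det ins; do
        python evaluate.py --model_name ra_miqa --train_dataset $tr --test_dataset $te --metric_type composite
    done
done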

πŸ“ˆ Benchmarks

Table 1: Performance Benchmark on Composite Score. Cls/Det/Seg = Image Classification / Object Detection / Instance Segmentation.

| Category | Method | Cls SRCC ↑ | Cls PLCC ↑ | Cls KRCC ↑ | Cls RMSE ↓ | Det SRCC ↑ | Det PLCC ↑ | Det KRCC ↑ | Det RMSE ↓ | Seg SRCC ↑ | Seg PLCC ↑ | Seg KRCC ↑ | Seg RMSE ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HVS-based | PSNR | 0.2388 | 0.2292 | 0.1661 | 0.2928 | 0.3176 | 0.3456 | 0.2148 | 0.2660 | 0.3242 | 0.3530 | 0.2196 | 0.2553 |
| HVS-based | SSIM | 0.3027 | 0.2956 | 0.2119 | 0.2874 | 0.4390 | 0.4505 | 0.3011 | 0.2531 | 0.4391 | 0.4512 | 0.3011 | 0.2435 |
| HVS-based | VSI | 0.3592 | 0.3520 | 0.2520 | 0.2816 | 0.4874 | 0.4940 | 0.3355 | 0.2465 | 0.4919 | 0.4985 | 0.3392 | 0.2365 |
| HVS-based | LPIPS | 0.3214 | 0.3280 | 0.2258 | 0.2842 | 0.5264 | 0.5376 | 0.3697 | 0.2390 | 0.5342 | 0.5453 | 0.3754 | 0.2287 |
| HVS-based | DISTS | 0.3878 | 0.3804 | 0.2724 | 0.2782 | 0.5266 | 0.5352 | 0.3659 | 0.2395 | 0.5363 | 0.5450 | 0.3738 | 0.2288 |
| HVS-based | HyperIQA | 0.2496 | 0.2279 | 0.1741 | 0.2929 | 0.4462 | 0.4463 | 0.3031 | 0.2537 | 0.4456 | 0.4518 | 0.3031 | 0.2434 |
| HVS-based | MANIQA | 0.3403 | 0.3255 | 0.2387 | 0.2844 | 0.4574 | 0.4617 | 0.3124 | 0.2515 | 0.4636 | 0.4680 | 0.3176 | 0.2411 |
| Machine-based | ResNet-18 | 0.5131 | 0.5427 | 0.3715 | 0.2527 | 0.7541 | 0.7734 | 0.5625 | 0.1797 | 0.7582 | 0.7790 | 0.5674 | 0.1711 |
| Machine-based | ResNet-50 | 0.5581 | 0.5797 | 0.4062 | 0.2451 | 0.7743 | 0.7925 | 0.5824 | 0.1729 | 0.7729 | 0.7933 | 0.5826 | 0.1661 |
| Machine-based | EfficientNet-b1 | 0.5901 | 0.6130 | 0.4320 | 0.2377 | 0.7766 | 0.7950 | 0.5859 | 0.1720 | 0.7808 | 0.7999 | 0.5918 | 0.1637 |
| Machine-based | EfficientNet-b5 | 0.6330 | 0.6440 | 0.4680 | 0.2301 | 0.7866 | 0.8041 | 0.5971 | 0.1685 | 0.7899 | 0.8074 | 0.6013 | 0.1610 |
| Machine-based | ViT-small | 0.5998 | 0.6161 | 0.4407 | 0.2370 | 0.7992 | 0.8142 | 0.6099 | 0.1646 | 0.7968 | 0.8139 | 0.6083 | 0.1585 |
| Machine-based | RA-MIQA | 0.7003 | 0.6989 | 0.5255 | 0.2152 | 0.8125 | 0.8264 | 0.6263 | 0.1596 | 0.8188 | 0.8340 | 0.6333 | 0.1505 |
Table 2: Consistency & Accuracy Score Benchmark. Acc = Accuracy Score, Con = Consistency Score; Cls/Det/Seg as in Table 1.

| Category | Method | Cls Acc SRCC ↑ | Cls Acc PLCC ↑ | Cls Acc RMSE ↓ | Cls Con SRCC ↑ | Cls Con PLCC ↑ | Cls Con RMSE ↓ | Det Acc SRCC ↑ | Det Acc PLCC ↑ | Det Acc RMSE ↓ | Det Con SRCC ↑ | Det Con PLCC ↑ | Det Con RMSE ↓ | Seg Acc SRCC ↑ | Seg Acc PLCC ↑ | Seg Acc RMSE ↓ | Seg Con SRCC ↑ | Seg Con PLCC ↑ | Seg Con RMSE ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HVS-based | PSNR | 0.2034 | 0.1620 | 0.3541 | 0.2927 | 0.2812 | 0.2692 | 0.2234 | 0.2449 | 0.2747 | 0.3712 | 0.3933 | 0.2839 | 0.2182 | 0.2398 | 0.2616 | 0.3796 | 0.4061 | 0.2770 |
| HVS-based | SSIM | 0.2529 | 0.2101 | 0.3509 | 0.3740 | 0.3663 | 0.2610 | 0.3434 | 0.3419 | 0.2662 | 0.5128 | 0.5130 | 0.2651 | 0.3271 | 0.3284 | 0.2545 | 0.5174 | 0.5204 | 0.2589 |
| HVS-based | VSI | 0.3020 | 0.2515 | 0.3473 | 0.4392 | 0.4336 | 0.2528 | 0.3799 | 0.3685 | 0.2634 | 0.5700 | 0.5571 | 0.2565 | 0.3703 | 0.3645 | 0.2509 | 0.5757 | 0.5749 | 0.2481 |
| HVS-based | LPIPS | 0.2680 | 0.2355 | 0.3488 | 0.3927 | 0.4032 | 0.2567 | 0.4064 | 0.3987 | 0.2598 | 0.6196 | 0.6232 | 0.2415 | 0.3972 | 0.3941 | 0.2476 | 0.6300 | 0.6344 | 0.2344 |
| HVS-based | DISTS | 0.3291 | 0.2768 | 0.3448 | 0.4683 | 0.4628 | 0.2487 | 0.4089 | 0.3999 | 0.2597 | 0.6174 | 0.6178 | 0.2429 | 0.4069 | 0.4012 | 0.2468 | 0.6255 | 0.6270 | 0.2362 |
| HVS-based | HyperIQA | 0.2100 | 0.1649 | 0.3540 | 0.2966 | 0.2777 | 0.2695 | 0.3646 | 0.3545 | 0.2649 | 0.5009 | 0.4943 | 0.2684 | 0.3486 | 0.3442 | 0.2530 | 0.5056 | 0.4995 | 0.2626 |
| HVS-based | MANIQA | 0.2924 | 0.2435 | 0.3481 | 0.3963 | 0.3870 | 0.2587 | 0.3839 | 0.3823 | 0.2618 | 0.4991 | 0.4975 | 0.2679 | 0.3755 | 0.3749 | 0.2498 | 0.5096 | 0.5098 | 0.2608 |
| Machine-based | ResNet-50 | 0.4734 | 0.4411 | 0.3221 | 0.5989 | 0.6551 | 0.2119 | 0.6955 | 0.6898 | 0.2051 | 0.8252 | 0.8457 | 0.1648 | 0.6863 | 0.6847 | 0.1964 | 0.8320 | 0.8480 | 0.1607 |
| Machine-based | EfficientNet-b5 | 0.5586 | 0.5149 | 0.3076 | 0.6774 | 0.7168 | 0.1956 | 0.7042 | 0.6991 | 0.2026 | 0.8353 | 0.8530 | 0.1612 | 0.6933 | 0.6949 | 0.1938 | 0.8419 | 0.8564 | 0.1565 |
| Machine-based | ViT-small | 0.5788 | 0.5197 | 0.3066 | 0.6798 | 0.7189 | 0.1950 | 0.7121 | 0.7052 | 0.2008 | 0.8459 | 0.8620 | 0.1566 | 0.7168 | 0.7146 | 0.1885 | 0.8487 | 0.8616 | 0.1539 |
| Machine-based | RA-MIQA | 0.6573 | 0.5823 | 0.2917 | 0.7707 | 0.7866 | 0.1732 | 0.7448 | 0.7370 | 0.1915 | 0.8526 | 0.8692 | 0.1527 | 0.7363 | 0.7327 | 0.1834 | 0.8632 | 0.8756 | 0.1464 |

πŸ“š Citation

If you find this work useful in your research, please consider citing:

@article{wang2025miqa,
  title={Image Quality Assessment for Machines: Paradigm, Large-scale Database, and Models},
  author={Wang, Xiaoqi and Zhang, Yun and Lin, Weisi},
  journal={arXiv preprint arXiv:2508.19850},
  year={2025}
}

πŸ“§ Contact
