Why LoRA Resists Label Noise: A Theoretical Framework for Noise-Robust Parameter-Efficient Fine-Tuning
Abstract
Theoretical analysis of memorization limits, bias-variance tradeoffs, and temporal learning dynamics shows that Low-Rank Adaptation (LoRA) is inherently robust to label noise, and the proposed RACT method exploits this property for effective noise detection.
Parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA) have become the dominant paradigm for adapting large pretrained models. We present a theoretical framework explaining an underexplored property: LoRA's inherent resistance to label noise. Our analysis yields three key insights. First, we prove that rank-r LoRA cannot memorize all possible label assignments once the sample size exceeds O(r(d+k-r)), limiting its capacity to fit arbitrary noise. Second, we derive an optimal rank that balances approximation bias against noise-induced variance, and show that this rank decreases as the noise rate increases. Third, we establish a temporal separation: clean patterns are learned early in training, while noise memorization occurs later. Building on these results, we propose RACT (Rank-Aware Curriculum Training), which leverages rank discrepancy for noise detection. Experiments validate our predictions: RACT achieves 91.1% F1 for noise detection on AG News while maintaining 91.46% accuracy, competitive with baselines that lack any noise detection capability.
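To make the capacity argument concrete, here is a minimal sketch (not the paper's implementation) of the parameter counting behind the O(r(d+k-r)) bound: a rank-r update ΔW = BA to a d×k weight matrix has r(d+k) trainable entries, but r² of them are redundant under B → BG, A → G⁻¹A for invertible G, leaving r(d+k-r) effective degrees of freedom. The dimensions, rank, and sample sizes below are hypothetical.

```python
# Sketch of LoRA's memorization-capacity argument.
# Assumptions: a single rank-r adapter on a d x k weight matrix;
# each training example imposes roughly one scalar constraint.

def lora_effective_dof(d: int, k: int, r: int) -> int:
    """Effective degrees of freedom of a rank-r update Delta_W = B @ A,
    with B of shape (d, r) and A of shape (r, k). The factorization has
    r*(d + k) entries, but r*r are redundant under B -> B @ G,
    A -> inv(G) @ A, leaving the dimension of the rank-r matrix manifold."""
    return r * (d + k - r)

def can_fit_arbitrary_labels(n_samples: int, d: int, k: int, r: int) -> bool:
    """Heuristic reading of the bound: fitting n arbitrary label
    assignments requires at least n effective degrees of freedom."""
    return n_samples <= lora_effective_dof(d, k, r)

# Hypothetical sizes: a 4096 x 4096 projection with a rank-8 adapter.
d, k, r = 4096, 4096, 8
print(lora_effective_dof(d, k, r))                # 65472 effective parameters
print(can_fit_arbitrary_labels(50_000, d, k, r))  # True: below capacity
print(can_fit_arbitrary_labels(500_000, d, k, r)) # False: above capacity, noise cannot be fully memorized
```

Under this reading, once the dataset outgrows r(d+k-r), the adapter must prioritize patterns shared across examples over per-example noise, which is the intuition behind the paper's first result.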