This project investigates the performance of EfficientNet-B0 on the long-tailed CIFAR-10-LT dataset and explores techniques to mitigate class imbalance. The baseline model achieves an overall accuracy of 84.23% but shows a significant imbalance between head and tail classes, with high recall but low precision for head classes and low recall but high precision for tail classes. To address these issues, six strategies are implemented: Reweight, and decoupled training with five advanced loss functions (Modified Cross-Entropy, Balanced Softmax, Focal Loss, Class-Balanced Loss, and Class-Balanced Focal Loss). Decoupled training separates feature representation learning from classifier training, allowing targeted adjustments at the classifier level. Performance is evaluated using accuracy, F1-score, precision, and recall. Results show that Reweight improves tail-class recall but lowers head-class recall, whereas decoupling combined with advanced loss functions significantly improves both precision and recall across head and tail classes, with Modified Cross-Entropy achieving the highest overall accuracy of 93.78%. This project demonstrates that decoupled training with advanced loss functions effectively addresses the class imbalance problem in the CIFAR-10-LT dataset and improves the classification accuracy of EfficientNet-B0.
The project focuses on reproducing the EfficientNet-B0 model on the CIFAR-10-LT dataset, addressing its class imbalance problem, and improving the classification accuracy of the EfficientNet-B0 baseline model. The baseline model source code was provided, with fixed hyperparameters for training:
Learning rate: 1e-4
Weight decay: 0.9999
Number of epochs: 30
Batch size: 128
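For reference, the fixed settings above can be collected into a single configuration object. The sketch below is purely illustrative; the key names are assumptions, since the source only lists the values:

```python
# Fixed training hyperparameters of the provided EfficientNet-B0 baseline.
# Key names are illustrative; only the values come from the project setup.
config = {
    "learning_rate": 1e-4,
    "weight_decay": 0.9999,
    "num_epochs": 30,
    "batch_size": 128,
}
```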
The EfficientNet-B0 was reproduced using the CIFAR10-LT dataset, and the baseline model achieved an overall accuracy of 84.23%. While this looks reasonably strong at first, the per-class metrics reveal substantial class imbalance and uneven per-class performance. Figure 1 shows the confusion matrix for the baseline model: squares with a darker blue shade indicate higher recall (around 90%), while those with a lighter shade indicate lower recall (around 70%).
The first technique implemented is Reweight. This ensures that during training, each class is sampled with equal probability regardless of how many samples it actually has, preventing the model from being biased toward head classes. To achieve this, the modified model extracted all class labels from the dataset and used a Counter to calculate how many samples belonged to each class.
It then assigned each class a weight inversely proportional to its frequency, meaning that tail classes received higher weights.
Every sample in the dataset was assigned its corresponding class weight, and the model used torch.utils.data.WeightedRandomSampler to sample data points based on these weights, thereby improving the representation of tail classes during training.
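The steps above can be sketched as follows. The label list here is a hypothetical stand-in for the CIFAR-10-LT training labels, but the mechanism (Counter for class frequencies, inverse-frequency weights, `WeightedRandomSampler`) follows the description:

```python
from collections import Counter

import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical labels; in the project these are extracted from CIFAR-10-LT.
labels = [0] * 500 + [1] * 50 + [2] * 5  # head, mid, and tail classes

# Count samples per class and assign each class an inverse-frequency weight.
class_counts = Counter(labels)
class_weights = {c: 1.0 / n for c, n in class_counts.items()}

# Give every sample the weight of its class; tail samples get larger weights.
sample_weights = torch.tensor([class_weights[y] for y in labels], dtype=torch.double)

# Sampling with replacement draws each class with roughly equal probability.
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
# loader = DataLoader(train_dataset, batch_size=128, sampler=sampler)
```

In a `DataLoader`, passing `sampler=` is mutually exclusive with `shuffle=True`, so the usual shuffle flag is dropped when the sampler is used.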
The EfficientNet-B0 with the reweight technique achieved an overall accuracy of 85.82%, an increase of 1.59% over the baseline model, and a weighted-average F1-score of 85.84%, an increase of 1.66% over the baseline. The confusion matrix with the reweight technique is shown in Figure 2, and the corresponding performance metrics are shown in Table 2. A comparison of Figures 1 and 2 shows that the reweight technique mitigates the class imbalance by increasing the recall of some of the tail classes, visible as darker shading in the last few squares.
Improved Technique: Decoupling with Modified Cross-Entropy
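The decoupling stage can be sketched as follows. Two parts of this sketch are assumptions: a tiny stand-in network replaces the stage-1 EfficientNet-B0, and an inverse-frequency weighted cross-entropy stands in for the project's "Modified Cross-Entropy", whose exact form is not specified here. The decoupling itself (freeze the feature extractor, retrain only the classifier) follows the description in the abstract:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the stage-1 model: a feature extractor plus a linear classifier.
# In the project this would be the EfficientNet-B0 trained on CIFAR-10-LT.
features = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
classifier = nn.Linear(64, 10)

# Stage 2 (decoupling): freeze representation learning, retrain the classifier.
for p in features.parameters():
    p.requires_grad = False

# Hypothetical long-tailed class counts; the weighted cross-entropy below is
# one plausible "modified" cross-entropy, not necessarily the project's.
class_counts = torch.tensor([5000., 2500., 1250., 625., 312., 156., 78., 39., 19., 10.])
weights = class_counts.sum() / (len(class_counts) * class_counts)

optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-4)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
logits = classifier(features(x))           # frozen features, trainable head
loss = F.cross_entropy(logits, y, weight=weights)
loss.backward()
optimizer.step()
```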
Improved Technique: Decoupling with Balanced Softmax
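In the standard Balanced Softmax formulation (Ren et al., 2020), the log of each class's training-set frequency is added to its logit, so the softmax is corrected for the long-tailed label distribution. A minimal sketch, assuming this standard form is what the project uses:

```python
import torch
import torch.nn.functional as F

def balanced_softmax_loss(logits, targets, class_counts):
    # Shift each logit by log(n_j), the log of its class size, before the
    # softmax; head classes then no longer dominate the normalization.
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    return F.cross_entropy(logits + torch.log(counts), targets)
```

With uniform logits, a tail-class target yields a larger loss than a head-class target, which is exactly the corrective pressure intended during classifier training.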
Improved Technique: Decoupling with Focal Loss
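Focal loss (Lin et al., 2017) down-weights already well-classified samples by a factor (1 − p_t)^γ, concentrating the gradient on hard examples. A minimal sketch of the standard formulation, assuming the project follows it:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # p_t is the predicted probability of the true class; the modulating
    # factor (1 - p_t)^gamma shrinks the loss on easy, confident examples.
    log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()
```

Setting `gamma=0` recovers plain cross-entropy, which makes the function easy to sanity-check.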
Improved Technique: Decoupling with Class-Balanced Loss
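The Class-Balanced loss (Cui et al., 2019) weights each class by the inverse of its "effective number" of samples, (1 − β^n)/(1 − β), rather than by raw inverse frequency. A minimal sketch of the standard formulation, assuming the project follows it (β = 0.9999 is the value commonly used in the original paper):

```python
import torch
import torch.nn.functional as F

def class_balanced_weights(class_counts, beta=0.9999):
    # Effective number E_n = (1 - beta^n) / (1 - beta); weight = 1 / E_n,
    # normalized so the weights sum to the number of classes.
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(counts) / weights.sum()

def class_balanced_ce(logits, targets, class_counts, beta=0.9999):
    weights = class_balanced_weights(class_counts, beta)
    return F.cross_entropy(logits, targets, weight=weights)
```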
Improved Technique: Decoupling with Class-Balanced Focal Loss
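The Class-Balanced Focal loss combines the two previous ideas: effective-number class weights applied on top of the focal modulation (Cui et al., 2019). A minimal sketch, assuming the project follows the standard combination:

```python
import torch
import torch.nn.functional as F

def cb_focal_loss(logits, targets, class_counts, beta=0.9999, gamma=2.0):
    # Per-class effective-number weights, normalized to sum to the class count.
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    weights = (1.0 - beta) / (1.0 - beta ** counts)
    weights = weights * len(counts) / weights.sum()
    # Standard focal term on the true-class probability p_t.
    log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    focal = -(1.0 - pt) ** gamma * log_pt
    # Scale each sample's focal loss by its class weight.
    return (weights[targets] * focal).mean()
```

At uniform logits the focal term is identical across classes, so the class weights alone determine that a tail-class target incurs the larger loss.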