Backdoors in deep learning models rely on underfitting, not overfitting.
The article introduces a new method for measuring overfitting in machine learning models that requires only a small set of unlabeled test data. The measure can quantify how well a model fits its training and test data, track how that fit evolves during training, and reveal differences between individual classes. Surprisingly, the study finds that backdoors in deep learning models are associated with underfitting, not overfitting. Moreover, even neural networks trained without backdoors contain natural backdoor-like patterns: inputs that are consistently classified as a single class.
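The article does not spell out how such backdoor-like patterns are identified, but the defining property it names, a pattern that is consistently classified as one class regardless of the rest of the input, can be sketched directly. The snippet below is a minimal, hypothetical illustration (not the article's method): it uses a toy random linear classifier, overlays a candidate trigger pattern onto random inputs, and measures how often the predictions collapse to a single class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: a random linear model over
# flattened 8x8 inputs with 10 classes. (Hypothetical; the article
# does not specify a model or architecture.)
W = rng.normal(size=(64, 10))

def predict(x):
    """Return the predicted class index for a flattened input x."""
    return int(np.argmax(x @ W))

def class_consistency(trigger, mask, n_samples=200):
    """Overlay `trigger` (where mask == 1) onto random inputs and
    return the fraction of samples assigned to the most common class.
    A fraction near 1.0 means the pattern behaves like a backdoor:
    it dominates the prediction regardless of the underlying input."""
    preds = []
    for _ in range(n_samples):
        x = rng.normal(size=64)
        x = np.where(mask == 1, trigger, x)
        preds.append(predict(x))
    counts = np.bincount(preds, minlength=10)
    return counts.max() / n_samples

# A strong pattern overwriting most of the input is highly consistent.
strong_mask = np.ones(64)
strong_mask[:4] = 0          # only 4 dimensions stay random
trigger = rng.normal(size=64) * 5
strong = class_consistency(trigger, strong_mask)

# The same pattern restricted to a tiny patch usually is not.
weak_mask = np.zeros(64)
weak_mask[:4] = 1            # only 4 dimensions carry the trigger
weak = class_consistency(trigger, weak_mask)
```

Under this sketch, `strong` should be close to 1.0 while `weak` stays near chance level, which is the kind of gap one would look for when hunting for patterns that a network maps to one class.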