The impact of debiasing language models for social biases on downstream tasks is underestimated.
Pre-trained language models encode social biases, and debiasing methods have been proposed to remove these biases while preserving useful information. Prior work has shown that debiasing degrades downstream task performance, but it remains unclear whether the evaluation datasets used in those studies contain enough bias-related instances for the effect to be measured. This study compares the impact of debiasing across downstream tasks using datasets filtered to instances that contain biased words. The results show that the effect of debiasing is consistently underestimated across all tasks when full datasets are used; restricting evaluation to instances containing biased words allows the effect of debiasing to be assessed accurately.
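To make the evaluation protocol concrete, the following is a minimal Python sketch of the idea: filter a test set to instances containing biased words, then compare the performance drop caused by debiasing on the full set versus the filtered subset. The bias word list, the toy models, and the `debiasing_impact` helper are hypothetical illustrations, not the study's actual resources or word lists.

```python
# Minimal sketch, assuming a simple word-overlap filter for "biased" instances.
# BIAS_WORDS, the models, and the example data are hypothetical stand-ins.

BIAS_WORDS = {"he", "she", "nurse", "doctor"}  # hypothetical bias word list

def contains_bias_word(text: str) -> bool:
    """Return True if any token of the text appears in the bias word list."""
    return any(token in BIAS_WORDS for token in text.lower().split())

def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(predictions, labels)) / len(labels)

def debiasing_impact(dataset, original_model, debiased_model):
    """Compare the accuracy drop from debiasing on the full test set
    versus only on instances that contain bias-related words."""
    biased_subset = [ex for ex in dataset if contains_bias_word(ex["text"])]

    def drop(subset):
        texts = [ex["text"] for ex in subset]
        labels = [ex["label"] for ex in subset]
        return (accuracy(original_model(texts), labels)
                - accuracy(debiased_model(texts), labels))

    return {"full_set_drop": drop(dataset),
            "biased_subset_drop": drop(biased_subset)}

# Hypothetical usage with toy models that label every instance 1 or 0.
data = [{"text": "the nurse smiled", "label": 1},
        {"text": "it rained today", "label": 0}]
print(debiasing_impact(data, lambda t: [1] * len(t), lambda t: [0] * len(t)))
```

In this toy run the drop is invisible on the full set but large on the biased subset, which illustrates the underestimation the study describes: averaging over many instances that contain no biased words dilutes the measured effect of debiasing.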