CALIBRATED POPULARITY RE-RANKING WITH ALTERNATIVE DIVERGENCE MEASURES FOR POPULARITY BIAS MITIGATION



Yalçın E.

Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering, vol. 26, no. 3, pp. 260-278, 2025 (Peer-Reviewed Journal)

Abstract

Popularity bias significantly limits the effectiveness of recommender systems by disproportionately favoring popular items and reducing exposure to diverse, less-known content. This bias negatively impacts personalization and marginalizes niche users and item providers. To address this challenge, calibrated recommendation methods have gained attention, notably the Calibrated Popularity (CP) approach, due to its simplicity, effectiveness, and model-agnostic nature. In its original formulation, CP employs Jensen–Shannon divergence (JSD) to align the popularity distribution of recommended items with users’ historical interaction patterns. However, the choice of divergence measure substantially impacts calibration effectiveness and recommendation diversity. In this study, we systematically explore several alternative divergence measures, including Chi-Square, Wasserstein, Kullback–Leibler, Hellinger, Total Variation, Bhattacharyya, Cosine, and Rényi divergences, within the CP framework. Additionally, we propose a novel divergence-independent evaluation metric, namely Overall Similarity Error, enabling consistent assessment of calibration quality across divergence measures. Experimental results on two benchmark datasets using two collaborative filtering algorithms yielded several critical insights. More aggressive divergences, particularly Chi-Square, significantly enhanced calibration quality, reduced popularity bias, and increased recommendation diversity, albeit with a modest reduction in accuracy. In contrast, smoother divergences, such as JSD, maintained higher accuracy but provided limited improvements in reducing popularity bias. Moreover, a group-based analysis categorizing users into mainstream, balanced, and niche segments based on their historical popularity preferences revealed distinct patterns: balanced users typically achieved higher accuracy due to their evenly distributed preferences; mainstream users showed superior calibration results, benefiting from the robust signals of popular items; and niche users obtained more diverse and personalized recommendations, clearly benefiting from aggressive divergence measures. These results underscore the complexity of addressing popularity bias and highlight the importance of adopting adaptive, user-aware calibration strategies to effectively balance accuracy, diversity, and fairness in recommender systems.
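To make the mechanism concrete, the sketch below shows a minimal CP-style greedy re-ranking loop with a pluggable divergence measure, contrasting JSD with the more aggressive Chi-Square divergence discussed above. It is not the paper's implementation: the three-bucket (head/mid/tail) popularity scheme, the lambda trade-off weighting, and all function and parameter names are illustrative assumptions.

    # Minimal sketch of CP-style greedy re-ranking with a swappable
    # divergence measure. Bucketing scheme, lambda, and names are
    # illustrative assumptions, not the paper's code.
    import numpy as np

    EPS = 1e-12  # guard against log(0) and division by zero

    def jsd(p, q):
        # Jensen-Shannon divergence between two discrete distributions
        # (the smoother measure used in the original CP formulation).
        m = 0.5 * (p + q)
        kl = lambda a, b: np.sum(a * np.log((a + EPS) / (b + EPS)))
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def chi_square(p, q):
        # Pearson Chi-Square divergence; penalizes calibration
        # mismatches more aggressively than JSD.
        return np.sum((p - q) ** 2 / (q + EPS))

    def cp_rerank(candidates, scores, item_bucket, target_dist,
                  k=10, lam=0.5, divergence=jsd, n_buckets=3):
        # Greedily builds a top-k list trading base-model relevance
        # against divergence from the user's historical popularity
        # distribution (target_dist, over n_buckets popularity bins).
        # Assumes len(candidates) >= k.
        selected, counts = [], np.zeros(n_buckets)
        for _ in range(k):
            best_item, best_val = None, -np.inf
            for item in candidates:
                if item in selected:
                    continue
                # Tentatively add the item and evaluate the marginal
                # objective: (1 - lam) * relevance - lam * divergence.
                counts[item_bucket[item]] += 1
                q = counts / counts.sum()
                val = (1 - lam) * scores[item] \
                      - lam * divergence(target_dist, q)
                counts[item_bucket[item]] -= 1
                if val > best_val:
                    best_item, best_val = item, val
            selected.append(best_item)
            counts[item_bucket[best_item]] += 1
        return selected

Under these assumptions, swapping divergence=chi_square into the same loop reproduces the study's central manipulation: the re-ranker is unchanged, and only the divergence measure determines how strongly calibration pulls against raw relevance.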
