Post-Calibration Techniques: Balancing Calibration and Score Distribution Alignment

Authors:
Arthur Charpentier, Université du Québec à Montréal
Agathe Fernandes Machado, Université du Québec à Montréal
Emmanuel Flachaire, Aix-Marseille School of Economics, Aix-Marseille Univ.
Ewen Gallic, Aix-Marseille School of Economics, Aix-Marseille Univ.
François Hu, Milliman France

Published: December 14, 2024

1 Introduction

This notebook is the online appendix of the article titled “Post-Calibration Techniques: Balancing Calibration and Score Distribution Alignment.” It provides supplementary materials to the main paper.

The poster associated with the paper was presented at the NeurIPS Workshop on Bayesian Decision-making and Uncertainty in Vancouver, British Columbia, Canada, on December 14, 2024: https://neurips.cc/virtual/2024/98915

1.1 Abstract of the Paper

A binary scoring classifier can appear well-calibrated according to standard calibration metrics, even when the distribution of its scores does not align with the distribution of the true events. In this paper, we investigate the impact of post-processing calibration (sometimes referred to as ‘recalibration’) on the score distribution. Using simulated data, where the true probability is known, and then real-world datasets with prior knowledge of event distributions, we compare the performance of an XGBoost model before and after applying calibration techniques. The results show that while methods such as Platt scaling, Beta calibration, or isotonic regression can improve the model’s calibration, they may also increase the divergence between the score distribution and the underlying event probability distribution.
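
To make this setup concrete, the following minimal sketch (in Python, assuming numpy, scikit-learn, and xgboost; this is not the code used in the paper) illustrates the kind of experiment described in the abstract. Data are simulated so the true probability is known by construction, an XGBoost classifier is fitted, its scores are recalibrated on a held-out set with Platt scaling and isotonic regression (Beta calibration is omitted for brevity), and both a calibration metric (Brier score) and a rough divergence proxy (a binned KL divergence, via the illustrative helper kl_to_true) are reported.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from xgboost import XGBClassifier

rng = np.random.default_rng(42)

# Simulated data: the true probability p_true is known by construction.
n = 20_000
X = rng.normal(size=(n, 4))
logit = X @ np.array([1.0, -0.5, 0.8, 0.0])
p_true = 1.0 / (1.0 + np.exp(-logit))
y = rng.binomial(1, p_true)

# Train / calibration / test split.
idx = rng.permutation(n)
tr, cal, te = idx[:10_000], idx[10_000:15_000], idx[15_000:]

clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
clf.fit(X[tr], y[tr])
s_cal = clf.predict_proba(X[cal])[:, 1]
s_te = clf.predict_proba(X[te])[:, 1]

# Platt scaling: a logistic regression fitted to the raw scores
# on the calibration set.
platt = LogisticRegression().fit(s_cal.reshape(-1, 1), y[cal])
s_platt = platt.predict_proba(s_te.reshape(-1, 1))[:, 1]

# Isotonic regression: a monotone, piecewise-constant recalibration map.
iso = IsotonicRegression(out_of_bounds="clip").fit(s_cal, y[cal])
s_iso = iso.predict(s_te)

def kl_to_true(scores, p, bins=20):
    """Crude divergence proxy: KL divergence between binned histograms
    of the scores and of the true probabilities (illustrative only)."""
    grid = np.linspace(0.0, 1.0, bins + 1)
    q = np.histogram(scores, bins=grid)[0] + 1e-12
    r = np.histogram(p, bins=grid)[0] + 1e-12
    q, r = q / q.sum(), r / r.sum()
    return float(np.sum(q * np.log(q / r)))

# Lower Brier = better calibration/accuracy; a larger KL indicates the
# score distribution has drifted away from the true probabilities.
for name, s in [("raw", s_te), ("Platt", s_platt), ("isotonic", s_iso)]:
    print(f"{name:9s}  Brier = {brier_score_loss(y[te], s):.4f}  "
          f"KL(scores || p_true) = {kl_to_true(s, p_true[te]):.4f}")
```

The contrast between the two reported quantities is exactly the trade-off studied in the paper: a recalibrated model can score better on a calibration metric while its score distribution moves further from the distribution of the true event probabilities.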