From Uncertainty to Precision: Enhancing Binary Classifier Performance through Calibration

Authors and Affiliations

Arthur Charpentier, Université du Québec à Montréal
Agathe Fernandes Machado, Université du Québec à Montréal
Emmanuel Flachaire, Aix-Marseille School of Economics, Aix-Marseille Univ.
Ewen Gallic, Aix-Marseille School of Economics, Aix-Marseille Univ.
François Hu, Université de Montréal

Published: February 9, 2024

Introduction

This ebook contains the online supplementary materials for the article titled “From Uncertainty to Precision: Enhancing Binary Classifier Performance through Calibration”. The preprint is available on arXiv: https://arxiv.org/abs/2402.07790.

In this ebook, we are interested in the calibration of a binary classifier, that is, in whether the scores returned by the model can be interpreted as probabilities of the event of interest.

The code is written in R, and this ebook walks through it. You can download the scripts from the following GitHub repository:

Structure

This ebook is structured into three main parts:

Part 1: Synthetic Data and Calibration Metrics

In the first part, we provide an overview of the synthetic data used in our study and introduce various calibration metrics and visualization techniques; this foundational chapter lays the groundwork for the subsequent analyses (Chapter 1). We then present the recalibration techniques and apply them to our synthetic data (Chapter 2).

Part 2: Calibration of Random Forests

In the second part, we look into the calibration of random forests. We first estimate single random forest regressors and classifiers, detailing the calibration process step by step, and examine the forest scores both before and after recalibration (Chapter 3). We then employ bootstrap simulations to further explore the calibration of random forests on synthetic data, analyzing the calibration metrics before (Chapter 4) and after recalibration (Chapter 5).

Part 3: Real-World Data Analysis

In the final part, we turn to real-world data on default prediction. We begin with a comprehensive grid search to identify the optimal set of hyperparameters for both random forest regressors and classifiers (Chapter 6). We then run bootstrap simulations to evaluate the calibration of random forests on these data and to examine the relationship between performance and calibration metrics; the simulations are run in Chapter 7 and the results are presented in Chapter 8.