We load the dataset where the sensitive attribute (\(S\)) is the race, obtained in Chapter 4.3:
load("../data/df_race.rda")
We also load the dataset where the sensitive attribute is the race, but where the target variable (\(Y\), ZFYA) is binary (1 if the student obtained a standardized first-year average above the median, 0 otherwise). This dataset was saved in Chapter 5.5:
load("../data/df_race_c.rda")
We also need the predictions made by the classifier (see Chapter 5):
# Predictions on train/test sets
load("../data/pred_aware.rda")
load("../data/pred_unaware.rda")
# Predictions on the factuals, on the whole dataset
load("../data/pred_aware_all.rda")
load("../data/pred_unaware_all.rda")
We load the adjacency matrix that encodes the assumed causal structure, obtained in Chapter 4.3:
load("../data/adj.rda")
6.2 Counterfactuals with fairadapt
We adapt the code from Plečko, Bennett, and Meinshausen (2021) to handle the test set. This avoids estimating cumulative distribution and quantile functions on the test set, which would otherwise necessitate recalculating quantile regression functions for each new sample.
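As a toy illustration of the quantile-preservation idea of Plečko and Meinshausen (2020), and not of the adapted code itself, a test-set value observed in one group can be mapped to the value with the same rank in the other group, with both distributions estimated on the training data only; x_train_A, x_train_B, and x_test_A below are hypothetical numeric vectors:
# Toy illustration of quantile preservation (not the fairadapt code):
# a test-set value from group A is mapped to the value with the same rank
# in group B, both distributions being estimated on the training data only.
F_A <- ecdf(x_train_A)                    # empirical CDF of X in group A (train)
q_B <- function(p) quantile(x_train_B, p) # empirical quantile function in group B (train)
x_cf_test <- q_B(F_A(x_test_A))           # adapted values for the test observations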
We do not need to adapt \(Y\) here, so we remove it from the adjacency matrix:
adj_wo_Y <- adj[-4, -4]
adj_wo_Y
   S X1 X2
S  0  1  1
X1 0  0  1
X2 0  0  0
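Equivalently, and more robustly to any reordering of the matrix, the row and column of \(Y\) could be dropped by name instead of by position (assuming they are labelled "Y"; adjust the name otherwise):
# Drop the row and column of Y by name rather than by position
keep <- setdiff(rownames(adj), "Y")
adj_wo_Y <- adj[keep, keep]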
We create a dataset with the sensitive attribute and the two other predictors:
df_race_fpt <- df_race_c |>
  select(S, X1, X2)
Let us have a look at the levels of our sensitive variable:
levels(df_race_fpt$S)
[1] "Black" "White"
The reference class here consists of Black individuals.
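Since the reference class appears to follow the ordering of the factor levels (an assumption on our part), switching to White individuals as the reference could be done by releveling, as in this sketch:
# A sketch (assumption: the first factor level acts as the reference class).
# To make White the reference instead, the factor could be releveled:
S_white_ref <- relevel(df_race_fpt$S, ref = "White")
levels(S_white_ref)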
Two configurations will be considered in turn:
The reference class consists of Black individuals, and FairAdapt will be used to obtain the counterfactual UGPA and LSAT scores for White individuals as if they had been Black.
The reference class consists of White individuals, and FairAdapt will be used to obtain the counterfactual UGPA and LSAT scores for Black individuals as if they had been White.
We have two predictive models for ZFYA (above the median = 1, below the median = 0):
unaware (without S)
aware (with S)
We have the counterfactual characteristics obtained with fairadapt in two situations, depending on the reference class:
Black individuals as reference
White individuals as reference.
The predictive models will be used to compare predictions made using:
Raw characteristics (initial characteristics).
Characteristics possibly altered through FairAdapt for individuals who were not in the reference group (i.e., using counterfactuals).
6.2.1 Unaware Model
The predicted values obtained with the unaware model using the initial characteristics (the factuals) are stored in the object pred_unaware_all. We gather in a table the initial characteristics (factuals) and the predictions made by the unaware model:
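A minimal sketch of such a table, assuming pred_unaware_all is the vector of predicted probabilities on the factuals, aligned with the rows of df_race_c (the name factuals_unaware is ours):
library(dplyr)
# Factual characteristics together with the unaware model's predictions
factuals_unaware <- df_race_c |>
  select(S, X1, X2) |>
  mutate(pred = pred_unaware_all, type = "factual")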
Let us build a dataset containing only counterfactual characteristics (obtained with fairadapt): values for \(X_1\) and \(X_2\) of White individuals as if they had been Black, and values for \(X_1\) and \(X_2\) of Black individuals as if they had been White.
Recall we created an object called df_counterfactuals_fpt which contains the counterfactual characteristics of all students, obtained with fairadapt:
df_counterfactuals_fpt
# A tibble: 19,567 × 3
S X1 X2
<fct> <dbl> <dbl>
1 Black 2.7 31.3
2 Black 2.6 28
3 Black 2.7 21
4 Black 3.1 28.1
5 Black 3.3 21.0
6 Black 3.3 26.9
7 Black 2.4 29.6
8 Black 2.3 29.8
9 Black 3.3 21
10 Black 2.85 33.5
# ℹ 19,557 more rows
We make predictions with the aware model on these counterfactuals:
model_aware <- pred_aware$model
pred_aware_fpt <- predict(
  model_aware, newdata = df_counterfactuals_fpt, type = "response"
)
Then, we create a table with the counterfactuals and the values predicted by the aware model:
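A minimal sketch of such a table, reusing pred_aware_fpt computed above (the name counterfactuals_aware and the "counterfactual" label are ours):
library(dplyr)
# Counterfactual characteristics together with the aware model's predictions
counterfactuals_aware <- df_counterfactuals_fpt |>
  mutate(pred = pred_aware_fpt, type = "counterfactual")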
Lastly, we can visualize the distribution of the values predicted by the aware model once the characteristics of the individuals who are not in the reference group have been modified using fairadapt.
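For instance, a density plot of the predictions by group could be drawn along these lines (a sketch, reusing the counterfactuals_aware table from the previous sketch):
library(ggplot2)
# Distribution of the aware model's predictions on the counterfactual characteristics
ggplot(counterfactuals_aware, aes(x = pred, fill = S)) +
  geom_density(alpha = 0.5) +
  labs(
    x = "Predicted probability (aware model, counterfactual characteristics)",
    y = "Density"
  )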
Consider, for illustration, two individuals, one Black and one White, with the following factual characteristics and predictions:
# A tibble: 2 × 5
S X1 X2 pred type
<fct> <dbl> <dbl> <dbl> <chr>
1 Black 2.8 29 0.300 factual
2 White 2.8 34 0.382 factual
According to what was estimated with fairadapt, had the reference group been the one to which they do not belong, the characteristics of these two individuals would be:
Let us assume here that the reference group is “White individuals” (i.e., the group with the most individuals in the dataset). We focus on the minority, i.e., Black individuals. We consider here that the model is fair towards the minority class if: \[
P(\hat{Y}_{S \leftarrow \text{White}} = 1 | S = \text{Black}, X_1, X_2) = P(\hat{Y} = 1 | S = \text{White}, X_1, X_2)
\] If the model is fair with respect to this criterion, the proportion of Black individuals predicted to have grades above the median should be the same as if they had been White.
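As a rough aggregate check of this criterion (marginally rather than conditionally on \(X_1, X_2\)), one could compare the two proportions along the following lines. This is only a sketch under assumptions: pred_aware_all is taken to be the vector of the aware model's predictions on the factuals, aligned row-wise with df_race_c, as is counterfactuals_aware (the table sketched above), and the 0.5 threshold is illustrative.
# Aggregate check of the fairness criterion (sketch, see assumptions above)
is_black <- df_race_c$S == "Black"
# Black individuals, predictions on their "as if White" counterfactual characteristics
mean(counterfactuals_aware$pred[is_black] > 0.5)
# White individuals, predictions on their factual characteristics
mean(pred_aware_all[!is_black] > 0.5)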
Plečko, Drago, Nicolas Bennett, and Nicolai Meinshausen. 2021. "Fairadapt: Causal Reasoning for Fair Data Pre-Processing." arXiv Preprint arXiv:2110.10200.
Plečko, Drago, and Nicolai Meinshausen. 2020. "Fair Data Adaptation with Quantile Preservation." Journal of Machine Learning Research 21 (242): 1–44.