7  Objectives


The chapters of this part of the ebook present the following methods to build counterfactuals for a categorical variable:

  1. Using matching (Chapter 11),
  2. Using optimal transport on label-encoded variables (Chapter 9),
  3. Using optimal transport on the simplex (Chapter 10).

\[ \definecolor{wongBlack}{RGB}{0,0,0} \definecolor{wongGold}{RGB}{230, 159, 0} \definecolor{wongLightBlue}{RGB}{86, 180, 233} \definecolor{wongGreen}{RGB}{0, 158, 115} \definecolor{wongYellow}{RGB}{240, 228, 66} \definecolor{wongBlue}{RGB}{0, 114, 178} \definecolor{wongOrange}{RGB}{213, 94, 0} \definecolor{wongPurple}{RGB}{204, 121, 167} %\definecolor{colA}{RGB}{255, 221, 85} %\definecolor{colB}{RGB}{148, 78, 223} %\definecolor{colC}{RGB}{63, 179, 178} \definecolor{colA}{RGB}{0, 114, 178} \definecolor{colB}{RGB}{213, 94, 0} \definecolor{colC}{RGB}{204, 121, 167} \definecolor{colGpeZero}{RGB}{0,160,138} \definecolor{colGpeUn}{RGB}{242, 173, 0} \]

Consider a categorical variable \(x \in \{{\color{colA}A}, {\color{colB}B}, {\color{colC}C}\}\) with group-specific distributions. The categorical variable could be, for example, the treatment administered for a disease: A = surgery, B = medication, C = no treatment. We consider two groups (for example, Black and non-Black individuals). We want to build, for each individual in one group, a counterfactual category in the other group.

Let Group 0 and Group 1 represent two subpopulations in which the distribution of \(x\) differs. For example, Group 0 could be Black individuals, whereas Group 1 could be individuals who are not Black.
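To make the idea of group-specific distributions concrete, the following minimal sketch simulates the categorical variable in both groups and compares the empirical category frequencies. The probabilities `p_group0` and `p_group1`, the sample size `n`, and the helper `empirical_frequencies` are hypothetical choices made for illustration only; they are not taken from the ebook or from any dataset.

```python
import numpy as np

# Hypothetical category probabilities for each group (illustrative values only,
# not taken from the ebook or from real data).
categories = np.array(["A", "B", "C"])
p_group0 = np.array([0.2, 0.3, 0.5])  # assumed distribution of x in Group 0
p_group1 = np.array([0.5, 0.3, 0.2])  # assumed distribution of x in Group 1

rng = np.random.default_rng(seed=0)
n = 1000  # assumed number of individuals per group

# Draw one category per individual in each group.
x_group0 = rng.choice(categories, size=n, p=p_group0)
x_group1 = rng.choice(categories, size=n, p=p_group1)


def empirical_frequencies(x, categories):
    """Return the share of observations falling in each category."""
    return np.array([(x == c).mean() for c in categories])


freq_group0 = empirical_frequencies(x_group0, categories)
freq_group1 = empirical_frequencies(x_group1, categories)

for c, f0, f1 in zip(categories, freq_group0, freq_group1):
    print(f"{c}: group 0 = {f0:.3f}, group 1 = {f1:.3f}")
```

Under these assumed probabilities, category C is the most frequent in Group 0 and category A the most frequent in Group 1, which is the kind of distributional difference a counterfactual has to account for.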

The objective is to define, for each individual in Group 0, a counterfactual category that reflects the distributional characteristics of Group 1. That is, we want to know what medical treatment (surgery, medication, no treatment) a Black individual who received a given treatment (e.g., C = no treatment) would have received had they been non-Black.

To do so, the remainder of this part of the ebook revisits three approaches:

  1. Using matching (Chapter 11),
  2. Using optimal transport on label-encoded variables (Chapter 9),
  3. Using optimal transport on the simplex (Chapter 10) (the one used in our paper).