The Confusion Matrix and ROC

Business Use Case

A company, Topo’s Tacos, sells DIY taco kits through a monthly subscription program. Subscribers pay $20.00/month to receive kits complete with taco shells, packaged Mexican cheese, and a variety of meats or veggies. Topo’s Tacos was cash flow positive from fiscal year 2021 through the first two months of 2022. In recent months, however, the company’s CFO has become concerned about the downward trajectory of Topo’s Tacos’ revenue. He suspects it has something to do with customer retention, so he tasks his sharpest analysts with building a binary classification model to predict churn. A random sample of n=200 observations is drawn.

The analysts quickly put scientific decision making into practice by setting up the hypothesis testing framework as follows.

\[\begin{eqnarray} H_0 &:& \text{the customer will not churn (encoded as 0)} \\ H_1 &:& \text{the customer will churn (encoded as 1)} \end{eqnarray}\]

Taking the Confusion out of The Confusion Matrix

The confusion matrix may sound confusing by name alone, but it is simply a table of predicted vs. actual values that forms part of the holistic assessment of a classification model.

Sweeping through the confusion matrix below in a clockwise fashion produces the following results.

When we have an observation that is predicted as a “NO” and the actual observation is also a “NO,” we have what is called a True Negative (TN):

 
n = 200        PREDICTED: NO    PREDICTED: YES
ACTUAL: NO     TN
ACTUAL: YES

Next, when we have an observation that is predicted as a “YES” and the actual observation is a “NO,” we have what is called a False Positive (FP). Think of it this way: we predicted a “YES,” but got it wrong.
 
n = 200        PREDICTED: NO    PREDICTED: YES
ACTUAL: NO     TN               FP
ACTUAL: YES


A predicted “YES” that yields an actual “YES” produces a True Positive (TP). No explanation necessary on this front.

n = 200        PREDICTED: NO    PREDICTED: YES
ACTUAL: NO     TN               FP
ACTUAL: YES                     TP


Conversely, a predicted “NO” that yields an actual “YES” produces a False Negative (FN). False expectations yet again…

n = 200        PREDICTED: NO    PREDICTED: YES
ACTUAL: NO     TN               FP
ACTUAL: YES    FN               TP


And now for a quick recap of what each value on the confusion matrix represents.

Terminology in Context

  • True Negative (TN): a prediction for no churn (“NO”) was made. No churn (“NO”) was actually observed.
  • False Positive (FP): a prediction for churn (“YES”) was made. No churn (“NO”) was actually observed.
  • True Positive (TP): a prediction for churn (“YES”) was made. Churn (“YES”) was actually observed.
  • False Negative (FN): a prediction for no churn (“NO”) was made. Churn (“YES”) was actually observed.
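With those four definitions in hand, the tallying itself is mechanical. Below is a minimal sketch in Python, assuming the actual and predicted labels have already been encoded as 0/1; the two lists are hypothetical illustrations, not drawn from the Topo’s Tacos data.

```python
# Minimal sketch: tally the four confusion-matrix cells from 0/1 labels.
# `actual` and `predicted` are hypothetical illustration lists, not the real data.
actual    = [0, 0, 1, 1, 1, 0, 1, 1]
predicted = [0, 1, 1, 0, 1, 0, 1, 1]

tn = fp = fn = tp = 0
for a, p in zip(actual, predicted):
    if a == 0 and p == 0:
        tn += 1   # True Negative: predicted NO, actually NO
    elif a == 0 and p == 1:
        fp += 1   # False Positive: predicted YES, actually NO
    elif a == 1 and p == 0:
        fn += 1   # False Negative: predicted NO, actually YES
    else:
        tp += 1   # True Positive: predicted YES, actually YES

print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")
```

For larger datasets, the same four counts can be obtained with scikit-learn’s `confusion_matrix(actual, predicted).ravel()`, which returns them in the order TN, FP, FN, TP for 0/1 labels.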

Measuring the Values

Remember the sample size of n=200 observations from the Business Use Case way up top? Let’s take a smaller sample of n=30 to illustrate the mechanics of calculating confusion matrix values at different thresholds.

n = 30         PREDICTED: NO    PREDICTED: YES
ACTUAL: NO     TN               FP
ACTUAL: YES    FN               TP

Example 1.

Let us calculate the values of the confusion matrix when the threshold is set to 0.4. In the following table, actual churn is recorded as a binary label (Churn) and the model’s estimated churn likelihood as a probability (Probability). Note: this is mock data created for instructional purposes. The last (Predicted) column is calculated from the threshold of 0.4: if the Probability in column 3 is greater than 0.4, the prediction is churn (1); if it is 0.4 or less, the prediction is no churn (0).

\[P > 0.4 \rightarrow H_1 = 1\] \[P \le 0.4 \rightarrow H_0 = 0\]
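In code, this decision rule is a single comparison. The short sketch below applies the 0.4 cutoff to the first five probabilities from the mock table below; the variable names are illustrative only.

```python
# Minimal sketch: binarize churn probabilities at a 0.4 threshold.
threshold = 0.4
probabilities = [0.26, 0.99, 0.27, 0.24, 0.67]  # first five rows of the mock table below

# P > threshold -> predict churn (1); P <= threshold -> predict no churn (0)
predicted = [1 if p > threshold else 0 for p in probabilities]
print(predicted)  # [0, 1, 0, 0, 1]
```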

#  Churn  Probability  Predicted
1 1 0.26 0
2 1 0.99 1
3 1 0.27 0
4 0 0.24 0
5 0 0.67 1
6 0 0.15 0
7 1 0.67 1
8 1 0.82 1
9 1 0.11 0
10 0 0.08 0
11 0 0.48 1
12 1 0.47 1
13 1 0.75 1
14 1 0.33 0
15 1 0.95 1
16 0 0.11 0
17 1 0.50 1
18 1 0.73 1
19 1 0.52 1
20 1 0.59 1
21 1 0.65 1
22 1 0.09 0
23 1 0.52 1
24 1 0.57 1
25 1 0.19 0
26 1 0.34 0
27 1 0.07 0
28 0 0.69 1
29 1 0.56 1
30 1 0.75 1


To fill in the 4 quadrants of the confusion matrix (TN, FP, FN, and TP), we compare the “Predicted” column with the actual (“Churn”) column. For example, counting all the rows that have a value of 0 in both of these columns gives the True Negatives; there are 4 instances of TN. Similarly, all rows that have a value of 1 in both columns yield True Positives; there are 15 such rows. In the same manner, we can tally up the FP and FN.

\[\sum_{i=1}^{30}{TN_i} = 4\] \[\sum_{i=1}^{30}{TP_i} = 15\]

#  Churn  Probability  Predicted  Outcome
1 1 0.26 0 FN
2 1 0.99 1 TP
3 1 0.27 0 FN
4 0 0.24 0 TN
5 0 0.67 1 FP
6 0 0.15 0 TN
7 1 0.67 1 TP
8 1 0.82 1 TP
9 1 0.11 0 FN
10 0 0.08 0 TN
11 0 0.48 1 FP
12 1 0.47 1 TP
13 1 0.75 1 TP
14 1 0.33 0 FN
15 1 0.95 1 TP
16 0 0.11 0 TN
17 1 0.5 1 TP
18 1 0.73 1 TP
19 1 0.52 1 TP
20 1 0.59 1 TP
21 1 0.65 1 TP
22 1 0.09 0 FN
23 1 0.52 1 TP
24 1 0.57 1 TP
25 1 0.19 0 FN
26 1 0.34 0 FN
27 1 0.07 0 FN
28 0 0.69 1 FP
29 1 0.56 1 TP
30 1 0.75 1 TP


So, for this threshold of 0.4, the derived confusion matrix is as follows:

n = 30         PREDICTED: NO    PREDICTED: YES
ACTUAL: NO     TN = 4           FP = 3
ACTUAL: YES    FN = 8           TP = 15
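These counts can also be reproduced with a short script. The sketch below re-enters the 30 mock (Churn, Probability) pairs from the table above and tallies the four cells at the 0.4 threshold.

```python
# Mock data from the table above: actual churn labels and model probabilities.
churn = [1,1,1,0,0,0,1,1,1,0,0,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1]
prob  = [0.26,0.99,0.27,0.24,0.67,0.15,0.67,0.82,0.11,0.08,
         0.48,0.47,0.75,0.33,0.95,0.11,0.50,0.73,0.52,0.59,
         0.65,0.09,0.52,0.57,0.19,0.34,0.07,0.69,0.56,0.75]

threshold = 0.4
pred = [1 if p > threshold else 0 for p in prob]  # binarize at the threshold

# Tally each quadrant by comparing the actual label with the prediction.
tn = sum(1 for a, y in zip(churn, pred) if a == 0 and y == 0)
fp = sum(1 for a, y in zip(churn, pred) if a == 0 and y == 1)
fn = sum(1 for a, y in zip(churn, pred) if a == 1 and y == 0)
tp = sum(1 for a, y in zip(churn, pred) if a == 1 and y == 1)

print(tn, fp, fn, tp)  # 4 3 8 15
```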


Now, from this confusion matrix, let’s calculate the True Positive Rate (TPR) and False Positive Rate (FPR). They are defined and calculated as follows:

\[\begin{eqnarray} \text{True Positive Rate = TPR} &=& \frac{\text{# of True Positives}}{\text{# of Actual YES}} = \frac{TP}{TP + FN} = \frac{15}{15+8} = 0.65\\ \\ \text{False Positive Rate = FPR} &=& \frac{\text{# of False Positives}}{\text{# of Actual NO}} = \frac{FP}{FP + TN} = \frac{3}{3+4} = 0.43\\ \end{eqnarray}\]
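As a quick check, both rates fall directly out of the four counts derived above:

```python
# TPR and FPR from the confusion-matrix counts at the 0.4 threshold.
tn, fp, fn, tp = 4, 3, 8, 15

tpr = tp / (tp + fn)  # share of actual churners the model caught
fpr = fp / (fp + tn)  # share of actual non-churners incorrectly flagged

print(round(tpr, 2), round(fpr, 2))  # 0.65 0.43
```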

Receiver Operating Characteristic (ROC)

At a high level, the ROC plots the relationship between the True Positive Rate (TPR) on the y-axis and the False Positive Rate (FPR) on the x-axis. Thus far, we have only calculated the TPR and FPR at one single threshold of 0.4. To trace out a proper curve, additional (FPR, TPR) coordinates must be calculated and plotted for more thresholds.

False Positive Rate    True Positive Rate    Threshold
0                      0
0.428571429            0.652173913           0.4
0.571428571            0.826086957           0.2
0.857142857            0.913043478           0.1
1                      1
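A minimal sketch of that sweep is shown below. It re-enters the mock data from Example 1 and reproduces the three interior rows of the table above; the endpoints (0, 0) and (1, 1) correspond to thresholds so extreme that every observation is predicted “NO” or “YES,” respectively.

```python
# Actual churn labels and probabilities (the 30 mock rows from Example 1).
churn = [1,1,1,0,0,0,1,1,1,0,0,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1]
prob  = [0.26,0.99,0.27,0.24,0.67,0.15,0.67,0.82,0.11,0.08,
         0.48,0.47,0.75,0.33,0.95,0.11,0.50,0.73,0.52,0.59,
         0.65,0.09,0.52,0.57,0.19,0.34,0.07,0.69,0.56,0.75]

def rates(actual, scores, threshold):
    """Return (FPR, TPR) at a given probability threshold."""
    pred = [1 if p > threshold else 0 for p in scores]
    tp = sum(1 for a, y in zip(actual, pred) if a == 1 and y == 1)
    fn = sum(1 for a, y in zip(actual, pred) if a == 1 and y == 0)
    fp = sum(1 for a, y in zip(actual, pred) if a == 0 and y == 1)
    tn = sum(1 for a, y in zip(actual, pred) if a == 0 and y == 0)
    return fp / (fp + tn), tp / (tp + fn)

# Sweep the thresholds from the table above and collect ROC points.
for t in [0.4, 0.2, 0.1]:
    fpr, tpr = rates(churn, prob, t)
    print(f"threshold={t}: FPR={fpr:.3f}  TPR={tpr:.3f}")
# threshold=0.4: FPR=0.429  TPR=0.652
# threshold=0.2: FPR=0.571  TPR=0.826
# threshold=0.1: FPR=0.857  TPR=0.913
```

Plotting these (FPR, TPR) points, with FPR on the x-axis and TPR on the y-axis, traces out the ROC curve for the model.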