# Area Under the Receiver Operating Characteristic Curve (AUC)

AUC is a common metric for evaluating binary classifiers. Two main reasons are:

• It’s largely insensitive to class imbalance, which is common in problems such as fraud detection and conversion prediction.
• It does not require the predictions to be thresholded (i.e. assigned to one class or the other); it operates directly on the classification scores.

An AUC of 0.5 corresponds to random performance, while an AUC of 1 corresponds to perfect classification. Personally, I transform this to GINI = 2 × AUC - 1, mainly because a scale where 0 is random and 1 is perfect feels more intuitive to me.
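As a quick illustration of that rescaling, here is a minimal sketch. The function name `gini_from_auc` is my own choice for this example and does not come from any library.

```python
def gini_from_auc(auc: float) -> float:
    """Rescale AUC so that 0 means random and 1 means perfect."""
    return 2.0 * auc - 1.0


# A few reference points on both scales.
for auc in (0.5, 0.75, 1.0):
    print(f"AUC = {auc:.2f} -> GINI = {gini_from_auc(auc):.2f}")
```

A classifier that does worse than random (AUC below 0.5) gets a negative GINI under this transformation.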

# Understanding AUC

First let us look at a confusion matrix:

Imagine we have a set of scores from a classification model (these could be probabilities, for example) along with their true labels. The numbers in the confusion matrix assume we have picked a threshold $$t$$: scores $$\geq t$$ are assigned the positive label and scores $$< t$$ are assigned the negative label.
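To make the thresholding concrete, here is a small sketch that counts the four cells of the confusion matrix for a given threshold. The scores and labels are made-up toy values, labels are assumed to be encoded as 1 (positive) and 0 (negative), and `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(scores, labels, t):
    """Return (TP, FP, TN, FN) when scores >= t are predicted positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= t
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Toy data: four scores with their true labels (1 = positive, 0 = negative).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]

print(confusion_counts(scores, labels, t=0.5))  # → (1, 0, 2, 1)
```

Changing `t` moves examples between the predicted-positive and predicted-negative columns, which is why every number in the matrix depends on the threshold.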

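If we have $$M$$ unique scores from the classification model, there are $$M + 1$$ distinct choices of $$t$$: one at each unique score, plus one above the maximum score so that nothing is predicted positive. If we plot the True Positive Rate (TPR = TP / (TP + FN)) against the False Positive Rate (FPR = FP / (FP + TN)) over all these values of $$t$$, we get our Receiver Operating Characteristic (ROC) curve.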
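The threshold sweep can be sketched as follows. This is a deliberately simple, unoptimized version on made-up toy data. `roc_points` is a hypothetical helper name, labels are assumed to be 1/0, and the extra threshold is taken to be infinity so that the curve starts at (0, 0).

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # M unique scores plus one threshold above the maximum gives M + 1 thresholds.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]

for fpr, tpr in roc_points(scores, labels):
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

With four unique scores this produces five points, running from (0, 0) at the strictest threshold to (1, 1) at the loosest.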

Figure 1 shows an example ROC curve. The blue line is our classifier's performance; the grey dotted line is an example of what we would see from random classification.

The AUC for this example is the area under the blue line. The area under the grey line is 0.5, which is why an AUC of 0.5 corresponds to random performance and an AUC of 1 corresponds to perfect classification.
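One way to compute this area numerically is the trapezoidal rule over the (FPR, TPR) points of the curve. The sketch below assumes the points are already available as a list of pairs; the three example curves are hand-written toy values rather than the curve in Figure 1, and `auc_trapezoid` is a hypothetical helper name.

```python
def auc_trapezoid(points):
    """Area under a curve given as (FPR, TPR) pairs, via the trapezoidal rule."""
    pts = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


random_curve = [(0.0, 0.0), (1.0, 1.0)]               # the diagonal
perfect_curve = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hugs the top-left corner
toy_curve = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]

print(auc_trapezoid(random_curve))   # → 0.5
print(auc_trapezoid(perfect_curve))  # → 1.0
print(auc_trapezoid(toy_curve))      # → 0.75
```

The diagonal and the perfect curve recover the two reference values of 0.5 and 1, and any curve between them lands somewhere in between.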

# Score Distributions

In the top-left of Figure 2 we have a hypothetical distribution of the scores output by a classifier. The two “clumps” of scores are the negative (blue) and positive (red) examples. As we move the threshold line from left to right, the TP and FP counts change, and with them the TPR and FPR.

How much the score distributions of the negatives and positives overlap determines our AUC. Figure 3 shows some different score distributions; the red distribution is the positive class and the blue is the negative class.
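The effect of overlap can also be checked with a small simulation. The sketch below makes several assumptions of its own: both classes are drawn from unit-variance Gaussians (not necessarily the shapes in the figures), the sample size and seed are arbitrary, and the helper names are hypothetical. It relies on the fact that AUC equals the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half.

```python
import random


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def simulate_auc(pos_mean, neg_mean=0.0, n=500, seed=0):
    """Draw n scores per class from unit-variance Gaussians and return the AUC."""
    rng = random.Random(seed)
    neg = [rng.gauss(neg_mean, 1.0) for _ in range(n)]
    pos = [rng.gauss(pos_mean, 1.0) for _ in range(n)]
    return pairwise_auc(pos, neg)


auc_full_overlap = simulate_auc(pos_mean=0.0)     # identical distributions
auc_partial_overlap = simulate_auc(pos_mean=1.0)  # positives shifted by one sd
auc_well_separated = simulate_auc(pos_mean=4.0)   # almost no overlap

print(f"full overlap:    {auc_full_overlap:.3f}")
print(f"partial overlap: {auc_partial_overlap:.3f}")
print(f"well separated:  {auc_well_separated:.3f}")
```

When the two distributions coincide, the AUC should come out close to 0.5, and it should approach 1 as the positive distribution moves away from the negative one.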