Topic | AUC ROC | AU PR Curve |
---|---|---|
Method | Captures the trade-off between the true positive rate and the false positive rate at different probability thresholds | Captures the trade-off between precision and recall as the probability threshold varies |
Random classifier | A random classifier will have AUC = 0.5, and a diagonal \(y=x\) (45-degree) line. | A random classifier will have a horizontal line (depending on how it is being plotted) at the positive rate ratio, and an average precision (the area under the PR curve) equal to the positive rate ratio (see the example below) |
Perfect classifier | A perfect classifier will have AUC = 1, with a curve hugging the top-left corner (a horizontal line at TPR = 1). | A perfect classifier will have a point at (1, 1) (see below) |
Skewed data | AUC ROC is more appropriate for relatively balanced datasets, as it is overly optimistic on highly imbalanced datasets. This happens because the false positive rate can be very low even when the classifier has very low precision. | PR curves shine on highly imbalanced datasets; they are more informative when the focus is on correctly identifying positive cases while keeping the number of false positives low. |
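For reference, the quantities compared in the table are defined in terms of true/false positives (TP, FP) and true/false negatives (TN, FN):
\[
\text{TPR} = \text{Recall} = \frac{TP}{TP + FN}, \qquad
\text{FPR} = \frac{FP}{FP + TN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}
\]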
Let’s see some examples.
import pandas as pd
d = pd.DataFrame({'prob' : [0.9,0.8,0.8,0.7,0.7,0.5,0.4,0.3,0.2,0.1], 'label': [1,1,1,1,1,0,0,0,0,0]})
d
## prob label
## 0 0.9 1
## 1 0.8 1
## 2 0.8 1
## 3 0.7 1
## 4 0.7 1
## 5 0.5 0
## 6 0.4 0
## 7 0.3 0
## 8 0.2 0
## 9 0.1 0
The above can be thought of as a perfect classifier, since every positive instance is scored higher than every negative one (any threshold between 0.5 and 0.7 separates them perfectly). The perfect AU PR curve is as follows:
import matplotlib.pyplot as plt
from sklearn.metrics import PrecisionRecallDisplay, roc_curve, RocCurveDisplay, auc
display = PrecisionRecallDisplay.from_predictions(d.label, d.prob, name="Perfect classifier, AUPR")
Similarly, the perfect AUC ROC curve:
def get_auc_plot(d, t):
    # Compute the ROC points and the AUC, then plot the curve labeled with the title t
    fpr, tpr, thresholds = roc_curve(d.label, d.prob)
    roc_auc = auc(fpr, tpr)
    display = RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=roc_auc,
                              estimator_name=t)
    display.plot()

get_auc_plot(d, "Perfect classifier, AUC ROC")
Now let’s generate a random classifier (i.e., all instances have the same probability of being positive):
# "Random" classifier: give every instance the same score, so no threshold
# can separate positives from negatives (a constant keeps the output reproducible)
r = d.copy()
r['prob'] = [0.2 for _ in range(10)]
display = PrecisionRecallDisplay.from_predictions(r.label, r.prob, name="Random classifier, AUPR")
By default, the above function adds two points: (0, 1) and (0, the positive-class ratio of the dataset). If we don't want the second point, we can draw the AU PR curve as follows:
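A minimal sketch (the original code isn't reproduced here): plot the raw precision/recall pairs with matplotlib's default straight-line style, so that the corner at (0, positive ratio) is never drawn.
from sklearn.metrics import precision_recall_curve

# Raw (recall, precision) pairs for the random classifier
precision, recall, _ = precision_recall_curve(r.label, r.prob)

# Straight-line drawing instead of the step style used by PrecisionRecallDisplay,
# so no corner point at (0, positive ratio) appears
plt.plot(recall, precision, marker="o", label="Random classifier, AUPR")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.show()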
For the same random classifier, the AUC ROC is:
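A minimal sketch (the original call isn't reproduced here), reusing the get_auc_plot helper defined earlier:
# ROC curve for the constant-score (random) classifier
get_auc_plot(r, "Random classifier, AUC ROC")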
For more details on how sklearn computes these curves, you can check out the discussion here. You can also call precision, recall, thresholds = precision_recall_curve(r.label, r.prob), which will show you the data points that sklearn uses to draw these graphs.
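For example, a minimal sketch (the array values in the comments are what a recent sklearn version should return for this data):
from sklearn.metrics import precision_recall_curve

# Inspect the raw points behind the random classifier's PR curve
precision, recall, thresholds = precision_recall_curve(r.label, r.prob)
print(precision)   # [0.5 1. ]  -- 0.5 is the positive-class ratio
print(recall)      # [1. 0.]
print(thresholds)  # [0.2]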
An explanation of why AUC ROC is overly optimistic
Consider a dataset that has 10 positives and 100K negatives, and compare two models that both recover the positives, but where Model A produces far more false positives than Model B. Obviously Model B is better. However:
Because the denominator of the FPR is so large, the two models will end up with very similar ROC curves (both FPRs will be close to zero). Simply put, a large change in the number of false positives results in only a tiny change in the FPR, so the ROC analysis is unable to reflect the superior performance of Model B in a setting where true negatives are not of interest.
In contrast, the Precision-Recall (PR) curve is specifically tailored for the detection of rare events and is the metric that should be used when the positive class is of more interest than the negative one. Because precision and recall don’t consider true negatives, the PR curve is not affected by the data imbalance. Back to the example above:
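The exact false-positive counts of the two models aren't reproduced here, so the sketch below uses purely illustrative numbers: both hypothetical models recover all 10 positives, Model A with 2,000 false positives and Model B with only 100. The two FPRs are both close to zero, while the precisions differ by more than an order of magnitude:
# Illustrative counts only (assumptions, not the original post's numbers):
# both hypothetical models recover all 10 positives out of 100,000 negatives.
tp, n_negatives = 10, 100_000
models = {"Model A": 2_000, "Model B": 100}   # number of false positives

for name, fp in models.items():
    fpr = fp / n_negatives        # FPR = FP / (FP + TN) = FP / total negatives
    precision = tp / (tp + fp)    # Precision = TP / (TP + FP)
    print(f"{name}: FPR = {fpr:.3f}, precision = {precision:.3f}")

# Model A: FPR = 0.020, precision = 0.005
# Model B: FPR = 0.001, precision = 0.091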
Clearly, the PR analysis is more informative than the ROC analysis above.