We can evaluate such classifiers with:
In addition, depending on the type of problem, we can use top-n accuracy (a prediction counts as correct if the true class is among the n highest-scored classes) or other ranking metrics.
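As a rough illustration of top-n accuracy, here is a minimal pure-Python sketch. The function name `top_n_accuracy` and the toy labels and scores are made up for this example; they are not taken from any particular library.

```python
def top_n_accuracy(y_true, scores, n=1):
    """Fraction of samples whose true label is among the n highest-scored classes.

    y_true: list of integer class indices (hypothetical toy data below).
    scores: list of per-sample score lists, one score per class.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    hits = 0
    for label, row in zip(y_true, scores):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda k: row[k], reverse=True)
        if label in ranked[:n]:
            hits += 1
    return hits / len(y_true)


# Toy example: 4 samples, 3 classes (illustrative values only).
y_true = [0, 1, 2, 2]
scores = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
    [0.7, 0.2, 0.1],
]

top1 = top_n_accuracy(y_true, scores, n=1)
top2 = top_n_accuracy(y_true, scores, n=2)
print(top1, top2)
```

With n=1 this reduces to ordinary accuracy. Larger n is more forgiving, which can be useful when several classes are plausible answers.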
Averaging techniques when we do OVA (or OVO):
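macro: unweighted arithmetic mean of the per-class metric, so every class counts equally regardless of its size. (In OVO: the average over all possible pairwise combinations of classes.)
weighted: mean of the per-class metric weighted by each class's support (its number of true instances). This accounts for class imbalance: classes with fewer instances have less impact on the averaged score.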
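micro: pools the true positives, false positives, and false negatives of all classes before computing the metric. In single-label multi-class problems this is the same as accuracy: the sum of the diagonal cells of the confusion matrix divided by the sum of all its cells (rarely used).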
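The three averaging schemes can be sketched from a confusion matrix as follows. This is a minimal pure-Python illustration using precision as the per-class metric. The helper names and the toy matrix are assumptions made for the example, and the matrix is assumed to have true classes as rows and predicted classes as columns.

```python
def per_class_precision(cm):
    """Precision of each class from a confusion matrix.

    Assumes cm[i][j] = number of samples with true class i predicted as class j.
    """
    k = len(cm)
    precisions = []
    for c in range(k):
        predicted_c = sum(cm[i][c] for i in range(k))  # column sum = TP + FP
        precisions.append(cm[c][c] / predicted_c if predicted_c else 0.0)
    return precisions


def averaged_precision(cm):
    """Macro, weighted, and micro averages of per-class precision."""
    k = len(cm)
    prec = per_class_precision(cm)
    support = [sum(row) for row in cm]  # true instances per class
    total = sum(support)
    macro = sum(prec) / k  # every class counts equally
    weighted = sum(p * s for p, s in zip(prec, support)) / total
    # Pooled TP / pooled (TP + FP): diagonal sum over all cells, i.e. accuracy.
    micro = sum(cm[c][c] for c in range(k)) / total
    return {"macro": macro, "weighted": weighted, "micro": micro}


# Toy 3-class confusion matrix with an under-represented third class.
cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 4],
]

result = averaged_precision(cm)
accuracy = sum(cm[c][c] for c in range(len(cm))) / sum(map(sum, cm))
print(result, accuracy)
```

In this toy matrix the smallest class also has the lowest precision, so the macro average comes out below the weighted one, while the micro average coincides with plain accuracy.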