
Bias-variance equation

Machine Learning · Easy · Seen in real interview

What is the bias-variance tradeoff? How is it expressed as an equation?

We can describe the bias-variance tradeoff through the following equation:

\[ Error = \sigma^2 + Bias^2(\hat{f}(x)) + Var(\hat{f}(x)) \]

where:

\[ \begin{align} Bias^2(\hat{f}(x)) &= \big( E[\hat{f}(x)] - f(x) \big)^2 \\ Var(\hat{f}(x)) &= E \Big[ \big( \hat{f}(x) - E[\hat{f}(x)] \big)^2 \Big] \\ \sigma^2 &= E[(y - f(x))^2] \end{align} \]
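
One standard way to derive this decomposition (assuming \(y = f(x) + \epsilon\) with \(E[\epsilon] = 0\), \(Var(\epsilon) = \sigma^2\), and \(\epsilon\) independent of \(\hat{f}\)) is to expand the expected squared error at \(x\) and add and subtract \(E[\hat{f}(x)]\); the cross terms vanish because \(E[\epsilon] = 0\) and \(E[\hat{f}(x) - E[\hat{f}(x)]] = 0\):

\[ \begin{align} E[(y - \hat{f}(x))^2] &= E[(f(x) + \epsilon - \hat{f}(x))^2] \\ &= \sigma^2 + E[(f(x) - \hat{f}(x))^2] \\ &= \sigma^2 + \big( f(x) - E[\hat{f}(x)] \big)^2 + E\Big[ \big( E[\hat{f}(x)] - \hat{f}(x) \big)^2 \Big] \\ &= \sigma^2 + Bias^2(\hat{f}(x)) + Var(\hat{f}(x)) \end{align} \]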

Here \(\sigma^2\) is the variance of the unobservable error (i.e., the noise), which no model can reduce. High-bias models tend to underfit; high-variance models tend to overfit. Lowering the bias typically increases the variance and vice versa, hence the tradeoff.
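
The decomposition can also be checked numerically. Below is a minimal Monte Carlo sketch (a hypothetical setup, not from the text: degree-3 polynomial fits to noisy samples of a sine function), which estimates the bias, variance, and noise terms at a single test point and compares their sum to the empirical squared error.

```python
# Monte Carlo check of the bias-variance decomposition at a single point x0.
# Hypothetical setup: fit a degree-3 polynomial to noisy samples of sin(x).
import numpy as np

rng = np.random.default_rng(0)
f = np.sin              # true function f(x)
sigma = 0.3             # noise standard deviation
x0 = 1.0                # test point at which we decompose the error
n_train, n_trials, degree = 30, 2000, 3

preds = np.empty(n_trials)
for t in range(n_trials):
    X = rng.uniform(0.0, 2.0 * np.pi, n_train)
    y = f(X) + rng.normal(0.0, sigma, n_train)
    coefs = np.polyfit(X, y, degree)        # fit f_hat on this training set
    preds[t] = np.polyval(coefs, x0)        # f_hat(x0)

bias_sq = (preds.mean() - f(x0)) ** 2       # (E[f_hat(x0)] - f(x0))^2
variance = preds.var()                      # E[(f_hat(x0) - E[f_hat(x0)])^2]
noise = sigma ** 2                          # irreducible error sigma^2

# Expected squared error against fresh noisy labels drawn at x0
y_new = f(x0) + rng.normal(0.0, sigma, n_trials)
mse = np.mean((y_new - preds) ** 2)

print(f"bias^2 + var + sigma^2 = {bias_sq + variance + noise:.4f}")
print(f"empirical error        = {mse:.4f}")  # the two should roughly agree
```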


Note that we can choose a biased estimator as long as the reduction in variance outweighs the squared bias it introduces (and vice versa)! For instance, L2 regularization (ridge regression) is a biased estimator that, when tuned correctly, can reduce the overall error; a sketch follows below.
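
The sketch below illustrates this on hypothetical synthetic data (not from the text) using scikit-learn's Ridge: sweeping the regularization strength and comparing held-out error, a well-chosen alpha often beats the unbiased least-squares fit (alpha = 0) when there are few samples and many features.

```python
# Hypothetical illustration: L2-regularized (ridge) regression can lower test
# error relative to unbiased least squares by trading a little bias for a
# larger reduction in variance.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 60, 40                               # few samples, many features
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = X @ true_w + rng.normal(0.0, 2.0, n)    # noisy linear targets

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for alpha in [0.0, 0.1, 1.0, 10.0, 100.0]:  # alpha = 0 is ordinary least squares
    model = Ridge(alpha=alpha).fit(X_tr, y_tr)
    test_mse = np.mean((model.predict(X_te) - y_te) ** 2)
    print(f"alpha={alpha:>6}: test MSE = {test_mse:.2f}")
```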


Bias-variance tradeoff for classification: if we use 0-1 loss instead of MSE, the additive tradeoff no longer holds. In that case, bias and variance combine multiplicatively (see The Elements of Statistical Learning, exercise 7.2). If the estimate is on the correct side of the decision boundary, the bias is negative, and decreasing the variance decreases the misclassification rate. But if the estimate is on the wrong side of the decision boundary, the bias is positive, so it pays to increase the variance.
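
A small simulation makes the second case concrete. The setup below is hypothetical (not from the text): at a point x with P(y = 1 | x) = 0.8 the Bayes prediction is class 1, but the estimator's mean probability sits on the wrong side of the 0.5 threshold; under 0-1 loss, adding variance to this biased estimator lowers the misclassification rate because it lets the estimate cross to the correct side.

```python
# Hypothetical 0-1 loss illustration: when the estimator's mean is on the
# wrong side of the decision boundary, increasing its variance reduces the
# expected misclassification rate.
import numpy as np

rng = np.random.default_rng(0)
p_true = 0.8            # P(y = 1 | x); the Bayes-optimal prediction is class 1
mean_hat = 0.4          # E[p_hat(x)] lies on the wrong side of the 0.5 boundary
n_trials = 200_000

for sd in [0.01, 0.05, 0.1, 0.2, 0.4]:
    p_hat = rng.normal(mean_hat, sd, n_trials)   # draws of the estimator
    pred_one = p_hat > 0.5                       # predicted class per draw
    # Predicting 1 is wrong with prob 0.2; predicting 0 is wrong with prob 0.8
    err = np.where(pred_one, 1.0 - p_true, p_true).mean()
    print(f"sd={sd:4}: misclassification rate = {err:.3f}")
```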

 

Topics

Bias-variance tradeoff, Formula derivation