
What is the bias-variance tradeoff? How is it expressed as an equation?

The bias-variance tradeoff can be described by decomposing the expected squared prediction error at a point \(x\):

\[ Error = E[(y - \hat{f}(x))^2] = \sigma^2 + Bias^2(\hat{f}(x)) + Var(\hat{f}(x)) \]

where:

\[ \begin{align} Bias^2(\hat{f}(x)) &= \big( E[\hat{f}(x)] - f(x) \big)^2 \\ Var(\hat{f}(x)) &= E \bigg( \big( E[\hat{f}(x)] - \hat{f}(x) \big)^2 \bigg) \\ \sigma^2 &= E[(y - f(x))^2] \end{align} \]

Here \(\sigma^2\) is the variance of the unobserved error, i.e. the irreducible noise. High-bias models tend to underfit; high-variance models tend to overfit. Lowering the bias typically increases the variance and vice versa, hence the tradeoff.
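The decomposition can be checked numerically. The following is a minimal sketch, not a definitive experiment: the true function `f`, the noise level, the fixed training grid, and the query point `x0` are all assumptions chosen for illustration. It refits a polynomial on many noisy training sets and estimates each term at `x0`.

```python
import numpy as np

def f(x):
    # Hypothetical "true" function, chosen only for illustration.
    return np.sin(2 * np.pi * x)

def decompose(degree, x0=0.3, n_train=30, sigma=0.3, n_trials=2000, seed=0):
    """Monte Carlo estimate of bias^2, variance, and error at the point x0."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_train)   # fixed training inputs (assumption)
    preds = np.empty(n_trials)
    sq_errs = np.empty(n_trials)
    for t in range(n_trials):
        # Draw a fresh noisy training set and refit the model.
        y = f(x) + rng.normal(0.0, sigma, n_train)
        coefs = np.polyfit(x, y, degree)
        preds[t] = np.polyval(coefs, x0)
        # Independent noisy observation at x0 to measure prediction error.
        y0 = f(x0) + rng.normal(0.0, sigma)
        sq_errs[t] = (y0 - preds[t]) ** 2
    return {
        "bias_sq": (preds.mean() - f(x0)) ** 2,
        "variance": preds.var(),
        "noise": sigma ** 2,
        "error": sq_errs.mean(),
    }

for degree in (1, 5):
    r = decompose(degree)
    total = r["noise"] + r["bias_sq"] + r["variance"]
    print(f"degree={degree}: bias^2={r['bias_sq']:.4f} "
          f"var={r['variance']:.4f} sum={total:.4f} error={r['error']:.4f}")
```

Under these assumptions the straight line (degree 1) should show the larger squared bias, the degree-5 polynomial the larger variance, and in both cases the measured error should sit close to the sum of the three terms.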

Note that a biased estimator can be the better choice as long as it reduces the variance by more than the squared bias it introduces (and vice versa). For instance, L2 regularization yields a biased estimator that, if tuned correctly, can reduce the overall error.
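To make the L2 example concrete, here is a small sketch for ridge regression with a fixed design matrix. The helper `ridge_mse_terms`, the random design, the coefficient vector, and the penalty `lam = 2.0` are illustrative assumptions. It uses the closed-form ridge estimate \(\hat{\beta} = (X^T X + \lambda I)^{-1} X^T y\) and computes the exact squared bias and total variance of the coefficient estimates.

```python
import numpy as np

def ridge_mse_terms(X, beta, sigma, lam):
    """Exact squared bias and total variance of the ridge coefficients.

    Assumes y = X @ beta + noise, with noise variance sigma**2 and fixed X.
    """
    p = X.shape[1]
    gram = X.T @ X
    inv = np.linalg.inv(gram + lam * np.eye(p))
    bias = inv @ gram @ beta - beta          # E[beta_hat] - beta
    cov = sigma ** 2 * inv @ gram @ inv      # Cov(beta_hat)
    return float(bias @ bias), float(np.trace(cov))

rng = np.random.default_rng(0)
X = rng.normal(size=(15, 10))   # hypothetical design: 15 samples, 10 features
beta = np.full(10, 0.3)         # hypothetical true coefficients

for lam in (0.0, 2.0):          # lam = 0 is ordinary least squares
    bias_sq, variance = ridge_mse_terms(X, beta, sigma=1.0, lam=lam)
    print(f"lambda={lam}: bias^2={bias_sq:.4f} variance={variance:.4f} "
          f"mse={bias_sq + variance:.4f}")
```

With `lam = 0` the estimator is unbiased and all of the error is variance. With the chosen penalty the squared bias becomes positive, but the variance drops by more, so the total mean squared error of the coefficients is lower. A penalty that is too large would eventually reverse this, which is why the tuning matters.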

Bias-variance tradeoff for classification: If we use 0-1 loss instead of MSE, the additive bias-variance decomposition no longer holds. In fact, in that case, bias and variance combine multiplicatively (see The Elements of Statistical Learning, exercise 7.2). If the estimate is on the correct side of the decision boundary, then the bias is negative, and decreasing the variance will decrease the misclassification rate. But if the estimate is on the wrong side of the decision boundary, then the bias is positive, so it pays to increase the variance.
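The effect can be illustrated with a short sketch. It assumes a normal approximation for the estimate, \(\hat{f}(x) \sim N(E[\hat{f}(x)], Var(\hat{f}(x)))\), under which the probability that the classifier disagrees with the Bayes rule is approximately \(\Phi\big(\mathrm{sign}(\tfrac{1}{2} - f(x)) \, (E[\hat{f}(x)] - \tfrac{1}{2}) / \sqrt{Var(\hat{f}(x))}\big)\). The function names and the numbers are hypothetical.

```python
import math

def normal_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_disagree_with_bayes(f_x, mean_fhat, var_fhat):
    """Approximate probability that the thresholded estimate disagrees
    with the Bayes classifier, assuming f_hat ~ N(mean_fhat, var_fhat)
    and f_x != 0.5."""
    sign = 1.0 if f_x < 0.5 else -1.0
    z = sign * (mean_fhat - 0.5) / math.sqrt(var_fhat)
    return normal_cdf(z)

f_x = 0.7  # true P(y = 1 | x), so the Bayes rule predicts class 1
for mean_fhat in (0.6, 0.4):          # correct side, then wrong side
    for var_fhat in (0.01, 0.04):
        p = prob_disagree_with_bayes(f_x, mean_fhat, var_fhat)
        print(f"E[f_hat]={mean_fhat} Var={var_fhat}: P(disagree)={p:.3f}")
```

When the mean estimate is on the correct side of 0.5, the smaller variance gives the lower disagreement probability. When it is on the wrong side, the larger variance gives the lower one, since more spread makes it more likely that the estimate lands on the correct side.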
