
Derive the likelihood of a linear regression model \(y = WX + \epsilon\) when the errors are normally distributed with zero mean (\(\epsilon \sim N(0,\sigma^2)\)).

The likelihood function will be (assume \(N\) observations):

\[ \begin{align} L &= \prod_{i=1}^N p(y_i \mid x_i; W) = \prod_{i=1}^N \mathcal{N}(y_i \mid W x_i, \sigma^2) = \prod_{i=1}^N \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(y_i-\mu_i)^2\right) \\ \log L = LL &= N \log\left(\frac{1}{\sqrt{2\pi \sigma^2}}\right) - \frac{1}{2\sigma^2} \sum_{i=1}^N (y_i-\mu_i)^2 \end{align} \]
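As a sanity check, the closed-form log-likelihood above should agree with directly summing the log of each Gaussian density. A minimal sketch with NumPy on hypothetical toy data (the variable names `w_true`, `sigma`, etc. are illustrative, not from the question):

```python
import numpy as np

# Hypothetical toy data: N observations from y = w*x + Gaussian noise
rng = np.random.default_rng(0)
N, w_true, sigma = 100, 2.0, 0.5
x = rng.normal(size=N)
y = w_true * x + rng.normal(scale=sigma, size=N)

mu = w_true * x  # model mean for each observation

# Direct sum of per-observation log Gaussian densities
ll_direct = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                   - (y - mu) ** 2 / (2 * sigma**2))

# Closed form from the derivation above
ll_formula = (N * np.log(1 / np.sqrt(2 * np.pi * sigma**2))
              - np.sum((y - mu) ** 2) / (2 * sigma**2))

assert np.isclose(ll_direct, ll_formula)
```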

We can substitute \(\mu_i = W x_i\) since the error has zero mean. Typically we want to minimize the negative log-likelihood, hence:

\[ -LL = NLL = \frac{1}{2\sigma^2} \sum_{i=1}^N (y_i - W x_i)^2 + \text{constant} \propto \sum_{i=1}^N (y_i - W x_i)^2 \]
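Since the NLL is proportional to the sum of squared residuals, its minimizer coincides with the least-squares estimate. A minimal sketch, assuming a single-feature model with known \(\sigma\) (the grid search and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
x = rng.normal(size=N)
y = 3.0 * x + rng.normal(scale=0.4, size=N)

# Least-squares estimate: closed-form minimizer of sum of squared residuals
w_ls = np.sum(x * y) / np.sum(x * x)

# Brute-force NLL minimization over a grid of candidate weights
sigma = 0.4
ws = np.linspace(2.0, 4.0, 2001)
nll = np.array([np.sum((y - w * x) ** 2) / (2 * sigma**2) for w in ws])
w_nll = ws[np.argmin(nll)]

# Both approaches recover (essentially) the same weight
assert abs(w_ls - w_nll) < 1e-3
```

Note that the constant term and the \(1/2\sigma^2\) factor do not affect which \(W\) minimizes the NLL, which is why maximum likelihood under Gaussian noise reduces to ordinary least squares.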

Linear regression, Likelihood function, Formula derivation

- Bias-variance equation Easy (Bias-variance tradeoff, Formula derivation)
- Range of R2 when combining regressions Medium (Linear regression, Goodness of fit, R-squared, Correlation)
- Linear regression assumptions Easy (Linear regression)
- Normalization vs. Standardization Easy (Linear regression, Standardization, Normalization)
- Hypothesis testing in regression coefficients Medium (Linear regression, Hypothesis testing)