\[ \begin{align} \bar{Y} &= \frac{1}{n} \sum_i Y_i \\ var(Y) &= \hat\sigma^2 = \frac{1}{n-1} \sum_i (Y_i - \bar{Y})^2 \\ var(\bar{Y}) &= var\left(\frac{1}{n} \sum_i Y_i\right) = \frac{1}{n^2} \cdot n \cdot var(Y) = \frac{\hat\sigma^2}{n} \end{align} \] where the last line assumes the \(Y_i\) are i.i.d. draws.
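A minimal Python sketch of these three estimators (the simulated metric and its parameters are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
Y = rng.normal(loc=10.0, scale=2.0, size=1_000)  # hypothetical metric observations

y_bar = Y.mean()                 # sample mean: (1/n) * sum(Y_i)
sigma2_hat = Y.var(ddof=1)       # unbiased sample variance: 1/(n-1) * sum((Y_i - y_bar)^2)
var_y_bar = sigma2_hat / len(Y)  # variance of the sample mean: sigma2_hat / n

print(f"mean={y_bar:.3f}, var(Y)={sigma2_hat:.3f}, var(mean)={var_y_bar:.5f}")
```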
If we overestimate the variance, we are more likely to get false negatives; if we underestimate it, we are more likely to get false positives. To see why, consider the following two-sample test statistic:
\[ T = \frac{\Delta}{\sqrt{var(\Delta)}} \] Overestimating the variance inflates the denominator and shrinks the estimated T score, which leads to false negatives: we mistakenly fail to reject the null. Conversely, if we underestimate the variance, the denominator shrinks, T grows, and we reject the null more often than we should, producing false positives.
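A small simulation sketch (not from the source) makes this concrete: we run repeated A/A tests, in which the null is true by construction, and scale the variance estimate by a factor before forming \(T\). A scale below 1 mimics underestimation; above 1, overestimation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 500, 5_000

def rejection_rate(var_scale: float) -> float:
    """Fraction of A/A tests rejected at the 5% level when the
    variance estimate is multiplied by `var_scale`."""
    rejections = 0
    for _ in range(trials):
        a = rng.normal(size=n)
        b = rng.normal(size=n)  # same distribution, so the null is true
        delta = b.mean() - a.mean()
        var_delta = a.var(ddof=1) / n + b.var(ddof=1) / n
        t = delta / np.sqrt(var_scale * var_delta)
        rejections += abs(t) >= 1.96
    return rejections / trials

print(rejection_rate(1.0))  # ~0.05, the nominal false-positive rate
print(rejection_rate(0.5))  # underestimated variance: inflated false positives
print(rejection_rate(2.0))  # overestimated variance: overly conservative
```

Under a true nonzero effect, the same overestimation shows up as lost power, i.e. false negatives.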
High variance is an issue in particular because it drives the power analysis and inflates the sample size the experiment needs. Assuming a significance level of \(\alpha=0.05\), the power at \(\delta\), the minimum difference of practical significance, can be defined as:
\[ \text{Power}_{\delta} = P(|T| \geq 1.96 \mid \text{true diff is } \delta) \] where \(T\) is the t-statistic. Then, assuming the treatment and control groups are of equal size, the total number of samples needed to achieve 80% power is approximately (Kohavi, Tang, and Xu (2020), p. 189):
\[ n \approx \frac{16 \sigma^2} {\delta^2} \] As a result, the required sample size grows linearly with the variance and shrinks with the square of \(\delta\).
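As a quick sanity check on this rule of thumb, here is a minimal sketch (the function name and the numbers plugged in are ours, purely for illustration):

```python
def required_samples(sigma2: float, delta: float) -> float:
    """Approximate total sample size for 80% power at alpha = 0.05,
    using the rule of thumb n ~ 16 * sigma^2 / delta^2."""
    return 16 * sigma2 / delta**2

print(required_samples(sigma2=1.0, delta=0.1))   # 1600.0
print(required_samples(sigma2=4.0, delta=0.1))   # 6400.0: 4x the variance, 4x the samples
print(required_samples(sigma2=1.0, delta=0.05))  # 6400.0: halving delta quadruples the samples
```

Because variance inflates the required sample size in this way, we often try to artificially reduce it, using some of the following techniques: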