
Documents edited by AI

Probability theory (Easy)

Assume that 1 in every 100 documents is written by AI. We have a predictive model with a 96% true positive rate: it correctly flags 96% of AI-written documents. The model makes a false positive prediction on 10% of human-written documents. If a document is labeled as written by AI, what is the likelihood that it is indeed written by AI?

To answer this question, we apply Bayes' rule. From the description, we know that:

\[\begin{align} \Pr(\text{Written by AI}) &= 1/100 \\ \Pr(\text{Predicted Positive | Written by AI}) &= 96/100 \\ \Pr(\text{Predicted Positive | Written by humans}) &= 10/100 \end{align}\]

We are looking for \(\Pr(\text{Written by AI | Predicted Positive})\). By Bayes' rule:

\[ \Pr(\text{Written by AI | Predicted Positive}) = \frac{\Pr(\text{Predicted Positive | Written by AI}) \Pr(\text{Written by AI})}{\Pr(\text{Predicted Positive})} \]

From the above, we only need to compute the denominator, which follows from the law of total probability:

\[ \begin{align} \Pr(\text{Predicted Positive}) &= \Pr(\text{Predicted Positive | Written by AI}) * \Pr(\text{Written by AI}) \\ &+ \Pr(\text{Predicted Positive | Written by humans}) * \Pr(\text{Written by humans}) \\ &= 0.96 * 0.01 + 0.1 * 0.99 = 0.1086 \end{align} \]

Plugging in the numbers, we get:

\[ \Pr(\text{Written by AI | Predicted Positive}) = \frac{(0.96 * 0.01)}{0.1086} = 0.088 \]
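As a sanity check, the calculation above can be reproduced in a few lines of Python (the variable names are ours, not part of the problem statement):

```python
# Priors and conditional probabilities from the problem statement.
p_ai = 0.01               # Pr(Written by AI)
p_pos_given_ai = 0.96     # Pr(Predicted Positive | Written by AI), the true positive rate
p_pos_given_human = 0.10  # Pr(Predicted Positive | Written by humans), the false positive rate

# Law of total probability: Pr(Predicted Positive).
p_pos = p_pos_given_ai * p_ai + p_pos_given_human * (1 - p_ai)

# Bayes' rule: Pr(Written by AI | Predicted Positive).
p_ai_given_pos = p_pos_given_ai * p_ai / p_pos

print(round(p_pos, 4))          # ~0.1086
print(round(p_ai_given_pos, 3)) # ~0.088
```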


Bayes' rule gives counter-intuitive results when the class distribution is imbalanced. In this case, because the prevalence of AI-written documents is so low (1%), the posterior probability is very low despite the model's high recall (true positive rate) of 96%: most documents flagged as AI-written are in fact human-written.
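This imbalance is easiest to see in natural frequencies. A short sketch, imagining a hypothetical pool of 10,000 documents with the stated rates:

```python
# Natural-frequency framing of the same posterior.
n = 10_000
ai_docs = n * 0.01        # 100 AI-written documents (1% prevalence)
human_docs = n - ai_docs  # 9,900 human-written documents

flagged_ai = ai_docs * 0.96        # 96 true positives
flagged_human = human_docs * 0.10  # 990 false positives

# Of all flagged documents, what fraction is actually AI-written?
posterior = flagged_ai / (flagged_ai + flagged_human)
print(posterior)  # ~0.088: roughly 96 of the ~1086 flagged documents
```

The false positives (990) swamp the true positives (96) simply because human-written documents are 99 times more common, which is exactly what the low posterior reflects.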


Topics

Bayes' rule, Conditional probability