We are in Beta and we are offering 50% off! Use code BETATESTER at checkout.
You can access a significantly larger sample of the platform's content for free by logging in with your Gmail account. Sign in now to explore.

Sample from random generator

Machine Learning Coding Medium Seen in real interview

You are given a random uniform generator that draws integers from the set {1,2,3,4,5,6,7}. How would you generate random uniform numbers from 1 to 10?

1-10 is a larger set than the given 1-7. Hence, it is clear that we will need to combine multiple draws from the available generator to get the desired outcome. For instance, we can assume the following mapping of two-length sequences:

\[ \begin{align} 11 &\rightarrow 1 \\ 12 &\rightarrow 2 \\ 13 &\rightarrow 3 \\ 14 &\rightarrow 4 \\ 15 &\rightarrow 5 \\ 16 &\rightarrow 6 \\ 17 &\rightarrow 7 \\ 21 &\rightarrow 8 \\ 22 &\rightarrow 9 \\ 23 &\rightarrow 10 \\ \end{align} \]

Then we can ignore every other combination and we have our random generator for 1-10. Any problem with this approach?

Obviously, this approach wastes the majority of draws. Can we do better?

There are 49 combinations (\(7 * 7\)) that we can get from two consecutive draws. We combine them as follows:

\[ f(\text{draw1, draw2}) = (\text{draw1}-1) * 7 + \text{draw2} - 1 \]

where we shifted the minimum combination to be zero:

\[ 0 * 7 + 0 = 0\\ 0 * 7 + 1 = 1\\ ...\\ 6 * 7 + 6 = 48 \] Now we need to map the numbers 0-48 to 1-10 such that we can sample uniformly. The maximum numbers (0-48) that we can use for each mapped number 1-10 without biasing the results is:

\[ 48 // 10 = 4 \]

This means that number 1 can be mapped by at most 4 unique numbers between 0,48, number 2 can be mapped by at most 4 unique numbers between 0,48, etc. This also suggests that we will have to sacrifice 9 numbers (i.e, 49 unique numbers - 4 * 10 = 9 numbers that will have no mapping).

We can achieve this goal by using modulo arithmetic as follows:

\[ map(draw1, draw2) =f(\text{draw1, draw2}\big) \% 10 + 1 \]

which will give us the following mapping:

\[ \begin{align} 0\%10 + 1 &\rightarrow 1 \\ 1\%10 + 1 &\rightarrow 2 \\ ...\\ 9\%10 + 1 &\rightarrow 10 \\ ...\\ 39\%10 + 1 &\rightarrow 10 \\ 40-48 &\rightarrow \text{Ignore} \end{align} \]

One way to code this up is as follows:

def random_1_to_7():
    return np.random.randint(1, 7)

def random_1_to_10():
    while True:
        a = random_1_to_7()
        b = random_1_to_7()
        combined = (a - 1) * 7 + b - 1
        if combined < 40:
            return (combined) % 10 + 1

To run it:

import numpy as np
np.random.seed(1234)
from collections import Counter
# Test the function
samples = 1000
results = Counter([random_1_to_10() for _ in range(samples)])
print([f"({i}:{results[i]/samples:.3f})" for i in range(1,11)])
## ['(1:0.091)', '(2:0.121)', '(3:0.104)', '(4:0.092)', '(5:0.085)', '(6:0.105)', '(7:0.089)', '(8:0.086)', '(9:0.124)', '(10:0.103)']
## True
## Counter({10: 132, 6: 118, 2: 114, 9: 109, 3: 105, 5: 92, 4: 87, 7: 83, 8: 82, 1: 78})

And as a result in this case we only wast 8/49 draws.


There are simpler variations of this problem, with the most frequently asked being to unbiase a biased coin. The main logic of how you can solve these questions is the same. You will need to sacrfice some draws and combine the rest.


Topics

Sample, Uniform, Random number generator
Similar questions

Provide feedback