Sample Ratio Mismatch
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Sample Ratio Mismatch |
Medium |
Yes |
|
|
Minimum remove to make valid parentheses (LeetCode 1249)
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Data Structures and Algorithms |
Stack |
Medium |
Yes |
|
|
Remove duplicates in place (LeetCode 26)
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Data Structures and Algorithms |
Array, Two pointers |
Easy |
Yes |
|
|
Linear regression with gradient descent
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning Coding |
Gradient descent, Linear regression |
Medium |
Yes |
|
|
Imbalanced dataset
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Class imbalance |
Medium |
Yes |
|
|
Trailing by two: should we go for two or three?
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Independence, Decision making |
Easy |
Yes |
|
|
Artist maxranks
|
Product Data Scientist, Data Scientist |
SQL |
Join |
Easy |
No |
|
|
Songs that stay in the charts for a while
|
Product Data Scientist, Data Scientist |
SQL |
Subquery, CTE, Join, ALL, Window functions |
Medium |
No |
|
|
Measuring sticks
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Variance |
Medium |
Yes |
|
|
Simpson’s paradox
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Simpson’s paradox |
Medium |
Yes |
|
|
Interview success probability
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep |
Easy |
No |
|
|
Offer validity and cooling-off periods
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep |
Easy |
No |
|
|
Questions worksheet (recruiter-screen)
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Discussion prep, Application prep |
Medium |
Yes |
|
|
Referrals vs. online applications
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep |
Easy |
No |
|
|
Response time
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep |
Easy |
No |
|
|
Resume review
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep, Resume |
Easy |
No |
|
|
Timeline: from application to offer
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep |
Easy |
No |
|
|
Up-level during interview
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Application process |
Application prep |
Easy |
No |
|
|
Questions worksheet (behavioral)
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Behavioral |
Discussion prep |
Hard |
Yes |
|
|
Sample from random generator
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning Coding |
Sample, Uniform, Random number generator |
Medium |
Yes |
|
|
Simulate dynamic coin flips
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning Coding |
Simulation |
Easy |
Yes |
|
|
Encoding categorical features
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Categorical features, Embeddings, One-hot encoding, Hashing |
Easy |
Yes |
|
|
k-means
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
k-means |
Medium |
Yes |
|
|
Range of R-squared when combining regressions
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Linear regression, Goodness of fit, R-squared, Correlation |
Medium |
Yes |
|
|
Measuring counterfactual impact
|
Product Data Scientist, Data Scientist |
Metrics |
Problem solving, Counterfactual |
Hard |
Yes |
|
|
Sudden drop in user engagement
|
Product Data Scientist, Data Scientist |
Metrics |
Problem solving, Root-cause analysis |
Medium |
Yes |
|
|
Games between two players
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Recursive relationship |
Medium |
Yes |
|
|
Red and blue balls
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Counting, Combinations, Repetition, Binomial |
Easy |
Yes |
|
|
Sum of normally distributed random variables
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Normal, PDF, CDF |
Medium |
Yes |
|
|
Choose house or techno
|
Product Data Scientist, Data Scientist |
SQL |
Logical OR |
Easy |
No |
|
|
Expensive house songs I
|
Product Data Scientist, Data Scientist |
SQL |
Subquery, CTE, Join |
Medium |
No |
|
|
Expensive house songs II
|
Product Data Scientist, Data Scientist |
SQL |
Subquery, CTE, Join, Window functions |
Hard |
No |
|
|
Songs that ranked 1 to 50
|
Product Data Scientist, Data Scientist |
SQL |
Between |
Easy |
No |
|
|
Biased coin
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Expectation, CLT, Binomial, Normal, Bernoulli, Hypothesis testing, CDF |
Medium |
Yes |
|
|
Prussian horses
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Poisson, Hypothesis testing, CDF |
Medium |
Yes |
|
|
Relationship between p-value and confidence interval
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Confidence interval, P-value, Hypothesis testing |
Easy |
Yes |
|
|
Two-sample t-test
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Hypothesis testing |
Easy |
Yes |
|
|
Choose a project
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Technical deep dive |
Discussion prep |
Hard |
Yes |
|
|
Questions worksheet (deep dive)
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Technical deep dive |
Discussion prep |
Hard |
Yes |
|
|
A/A tests
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Variance |
Easy |
Yes |
|
|
Counterfactual definition
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Counterfactual |
Easy |
Yes |
|
|
Equal-sized treatment and control groups
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Power, Variance, Sample size |
Medium |
Yes |
|
|
False discovery control
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
False discovery rate, Multiple hypotheses testing, Benjamini & Hochberg, Bonferroni |
Easy |
Yes |
|
|
Interference
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Interference |
Easy |
Yes |
|
|
Multi-armed and contextual bandits in A/B testing
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Contextual bandits, Multi-armed bandits |
Medium |
Yes |
|
|
Normality assumption
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Normality |
Medium |
Yes |
|
|
Novelty and primacy effects
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Novelty effects, Primacy effects |
Easy |
Yes |
|
|
Randomization checks
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Randomization |
Easy |
Yes |
|
|
Randomization level
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Randomization, Variance |
Medium |
Yes |
|
|
Reducing variance in A/B testing
|
Product Data Scientist, Data Scientist, Applied Scientist |
A/B Testing |
Variance |
Medium |
Yes |
|
|
Bootstrap
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Bootstrap |
Easy |
Yes |
|
|
Linear regression with stochastic gradient descent
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning Coding |
Stochastic gradient descent, Linear regression |
Medium |
Yes |
|
|
Logistic regression with gradient descent
|
Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning Coding |
Gradient descent, Logistic regression |
Medium |
Yes |
|
|
Cross validation
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Cross validation, Offline evaluation |
Easy |
Yes |
|
|
Intercept
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Linear regression, Intercept |
Easy |
No |
|
|
Interpretability
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
ML interpretability |
Easy |
Yes |
|
|
Linear regression assumptions
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Linear regression |
Easy |
Yes |
|
|
Logistic regression assumptions
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Logistic regression |
Easy |
Yes |
|
|
Missing data
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Missing data |
Easy |
Yes |
|
|
MSE vs. MAE
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
MSE, MAE |
Easy |
Yes |
|
|
Multicollinearity
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Multicollinearity, Linear regression |
Medium |
Yes |
|
|
Normalization vs. Standardization
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Linear regression, Standardization, Normalization |
Easy |
Yes |
|
|
Outliers
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Outliers, Cook’s distance, Regularization |
Easy |
Yes |
|
|
Principal Component Analysis (PCA)
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
PCA |
Easy |
Yes |
|
|
Random vs. stratified sampling
|
Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer |
Machine Learning |
Sampling, Stratified sampling |
Easy |
Yes |
|
|
Characteristics of metrics
|
Product Data Scientist, Data Scientist |
Metrics |
Characteristics of metrics |
Easy |
Yes |
|
|
Types of metrics
|
Product Data Scientist, Data Scientist |
Metrics |
Types of metrics |
Easy |
Yes |
|
|
Consecutive tails
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Permutations, Repetition |
Easy |
No |
|
|
Documents edited by AI
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Bayes rule, Conditional probability |
Easy |
No |
|
|
Largest number rolled
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Counting, Permutations, Repetition |
Medium |
Yes |
|
|
Median probability
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Binomial, Uniform, CDF |
Medium |
Yes |
|
|
Monty Hall
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Bayes rule, Conditional independence, Prior evidence |
Medium |
Yes |
|
|
Number of emails
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Poisson distribution |
Easy |
No |
|
|
Paths to destination
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Counting, Combinations |
Easy |
No |
|
|
Repeated rolls until 4
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Geometric distribution |
Easy |
Yes |
|
|
Sample digits 1-10
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Sample from samples, Uniform |
Medium |
Yes |
|
|
Two children I
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Prior evidence |
Easy |
Yes |
|
|
Two fair die rolls
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Independence, CDF, PMF |
Easy |
No |
|
|
Unfair coin probability
|
Product Data Scientist, Data Scientist, Applied Scientist |
Probability theory |
Bayes rule, Conditional probability |
Easy |
No |
|
|
Artists with more songs than others
|
Product Data Scientist, Data Scientist |
SQL |
Subquery, CTE, Join, Window functions |
Hard |
No |
|
|
Concat columns
|
Product Data Scientist, Data Scientist |
SQL |
Concat |
Easy |
No |
|
|
Histogram of songs
|
Product Data Scientist, Data Scientist |
SQL |
Recursive CTE, Join |
Hard |
No |
|
|
Label recent songs
|
Product Data Scientist, Data Scientist |
SQL |
Case |
Easy |
No |
|
|
Median songs per artist
|
Product Data Scientist, Data Scientist |
SQL |
CTE, Window functions |
Hard |
No |
|
|
Songs in charts with greater durations
|
Product Data Scientist, Data Scientist |
SQL |
Subquery, CTE, Join, Window functions |
Hard |
No |
|
|
Songs per genre
|
Product Data Scientist, Data Scientist |
SQL |
Group by |
Easy |
No |
|
|
Songs that did not enter the charts or entered high
|
Product Data Scientist, Data Scientist |
SQL |
Subquery, Join |
Medium |
No |
|
|
Songs with letters
|
Product Data Scientist, Data Scientist |
SQL |
Regexp |
Easy |
No |
|
|
An intuitive way to write power
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Power, Hypothesis testing |
Easy |
Yes |
|
|
Buy and sell stocks
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Gambler’s ruin, Expectation, Recursion, Random walk |
Medium |
Yes |
|
|
CI of flipping heads
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Confidence interval, CLT, Bernoulli trials |
Medium |
Yes |
|
|
Confidence interval definition
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Confidence interval, Hypothesis testing |
Easy |
Yes |
|
|
Confidence intervals that overlap
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Confidence interval, Hypothesis testing |
Medium |
No |
|
|
Covariance of dependent variables
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Variance, Uniform, Covariance, Expectation |
Medium |
Yes |
|
|
Distribution of a CDF
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
CDF, Inverse transform |
Medium |
No |
|
|
Dynamic coin flips
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Expectation, Simulation |
Hard |
Yes |
|
|
Expected number of consecutive heads
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Expectation |
Medium |
Yes |
|
|
Gambler’s ruin win probability
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Gambler’s ruin, Random walk, Expectation |
Medium |
Yes |
|
|
Manual estimation of flips
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Normal, CDF, Binomial, CLT |
Medium |
No |
|
|
Monotonic draws
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Expectation |
Hard |
No |
|
|
Number of draws to get greater than 1
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Normal, Geometric, CDF, Expectation |
Medium |
Yes |
|
|
P-value definition
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
P-value, Hypothesis testing |
Easy |
Yes |
|
|
Tests for normality
|
Product Data Scientist, Data Scientist, Applied Scientist |
Statistics |
Hypothesis testing, Normality |
Easy |
Yes |
|
|