
Short Description | Role | Area | Topic | Difficulty | Seen in interview | Solved | Overall rating |
---|---|---|---|---|---|---|---|

Normality assumption | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Normality | Medium | Yes | ||

Reducing variance in AB testing | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Variance | Medium | Yes | ||

Sample Ratio Mismatch | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Sample Ratio Mismatch | Medium | Yes | ||

Simpson’s paradox | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Simpson’s paradox | Medium | Yes | ||

Questions worksheet (recruiter-screen) | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Discussion prep, Application prep | Medium | Yes | ||

Referrals vs. online applications | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep | Easy | No | ||

Response time | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep | Easy | No | ||

Timeline: from application to offer | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep | Easy | No | ||

Questions worksheet (behavioral) | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Behavioral | Discussion prep | Hard | Yes | ||

Minimum remove to make valid parentheses (LeetCode 1249) | Data Scientist, Applied Scientist, Machine Learning Engineer | Data Structures and Algorithms | Stack | Medium | Yes | ||

Remove duplicates in place (LeetCode 26) | Data Scientist, Applied Scientist, Machine Learning Engineer | Data Structures and Algorithms | Array, Two pointers | Easy | Yes | ||

Analyze prediction error | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Prediction error, Bias-variance tradeoff, Diagnostics, Learning curves | Medium | Yes | ||

Bias-variance equation | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Bias-variance tradeoff, Formula derivation | Easy | Yes | ||

Linear regression with gradient descent | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Gradient descent, Linear regression | Medium | Yes | ||

Sample from random generator | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Sample, Uniform, Random number generator | Medium | Yes | ||

Gradient descent vs. stochastic gradient descent | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Gradient descent, Stochastic gradient descent, Minibatch | Medium | Yes | ||

Imbalanced dataset | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Class imbalance | Medium | Yes | ||

k-means | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | k-means | Medium | Yes | ||

L1 (Lasso) vs. L2 (Ridge) regularization | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | L1, L2, Regularization, Lasso, Ridge | Medium | Yes | ||

Linear regression likelihood function | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression, Likelihood function, Formula derivation | Medium | Yes | ||

Model interpretability | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Interpretability | Easy | Yes | ||

Multiclass evaluation metrics | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Multiclass, Diagnostics | Medium | Yes | ||

Measuring counterfactual impact | Product Data Scientist, Data Scientist | Metrics | Problem solving, Counterfactual | Hard | Yes | ||

Sudden drop in user engagement | Product Data Scientist, Data Scientist | Metrics | Problem solving, Root-cause analysis | Medium | Yes | ||

Documents edited by AI | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Bayes rule, Conditional probability | Easy | No | ||

Monty Hall | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Bayes rule, Conditional independence, Prior evidence | Medium | Yes | ||

Red and blue balls | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Counting, Combinations, Repetition, Binomial | Easy | Yes | ||

Trailing by two: should we go for two or three? | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Independence, Decision making | Easy | Yes | ||

Two children I | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Prior evidence | Easy | Yes | ||

Artist maxranks | Product Data Scientist, Data Scientist | SQL | Join | Easy | No | ||

Histogram of songs | Product Data Scientist, Data Scientist | SQL | Recursive CTE, Join | Hard | No | ||

Songs that did not enter the charts or entered high | Product Data Scientist, Data Scientist | SQL | Subquery, Join | Medium | No | ||

Songs that stay in the charts for a while | Product Data Scientist, Data Scientist | SQL | Subquery, CTE, Join, ALL, Window functions | Medium | No | ||

Gambler’s ruin win probability | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Gambler ruin, Random walk, Expectation | Medium | Yes | ||

Manual estimation of flips | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Normal, CDF, Binomial, CLT | Medium | No | ||

Measuring sticks | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Variance | Medium | Yes | ||

Monotonic draws | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Expectation | Hard | No | ||

Relationship between p-value and confidence interval | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Confidence interval, P-value, Hypothesis testing | Easy | Yes | ||

Two-sample t-test | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Hypothesis testing | Easy | Yes | ||

Choose a project | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Technical deep dive | Discussion prep | Hard | Yes | ||

Questions worksheet (deep dive) | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Technical deep dive | Discussion prep | Hard | Yes | ||

Interview success probability | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep | Easy | No | ||

Offer validity and cooling-off periods | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep | Easy | No | ||

Resume review | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep, Resume | Easy | No | ||

Up-level during interview | Product Data Scientist, Data Scientist, Applied Scientist, Machine Learning Engineer | Application process | Application prep | Easy | No | ||

AUC ROC and predicted output transformations | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | AUC ROC | Easy | Yes | ||

Simulate dynamic coin flips | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Simulation | Easy | Yes | ||

Encoding categorical features | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Categorical features, Embeddings, One-hot encoding, Hashing | Easy | Yes | ||

Example of high bias and high variance | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Bias-variance tradeoff | Medium | Yes | ||

Range of R2 when combining regressions | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression, Goodness of fit, R-squared, Correlation | Medium | Yes | ||

Games between two players | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Recursive relationship | Medium | Yes | ||

Sum of normally distributed random variables | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Normal, PDF, CDF | Medium | Yes | ||

Unfair coin probability | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Bayes rule, Conditional probability | Easy | No | ||

Choose house or techno | Product Data Scientist, Data Scientist | SQL | Logical OR | Easy | No | ||

Expensive house songs I | Product Data Scientist, Data Scientist | SQL | Subquery, CTE, Join | Medium | No | ||

Expensive house songs II | Product Data Scientist, Data Scientist | SQL | Subquery, CTE, Join, Window functions | Hard | No | ||

Songs that ranked 1 to 50 | Product Data Scientist, Data Scientist | SQL | Between | Easy | No | ||

Biased coin | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Expectation, CLT, Binomial, Normal, Bernoulli, Hypothesis testing, CDF | Medium | Yes | ||

Prussian horses | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Poisson, Hypothesis testing, CDF | Medium | Yes | ||

AA tests | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Variance | Easy | Yes | ||

Counterfactual definition | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Counterfactual | Easy | Yes | ||

Equal-sized treatment and control groups | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Power, Variance, Sample size | Medium | Yes | ||

False discovery control | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | False discovery rate, Multiple hypotheses testing, Benjamini & Hochberg, Bonferroni | Easy | Yes | ||

Interference | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Interference | Easy | Yes | ||

Multi-armed and contextual bandits in AB testing | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Contextual bandits, Multi-armed bandits | Medium | Yes | ||

Novelty and primacy effects | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Novelty effects, Primacy effects | Easy | Yes | ||

Randomization checks | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Randomization | Easy | Yes | ||

Randomization level | Product Data Scientist, Data Scientist, Applied Scientist | A/B Testing | Randomization, Variance | Medium | Yes | ||

Bias-variance biased estimator | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Bias-variance tradeoff | Medium | Yes | ||

Bootstrap | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Bootstrap | Easy | Yes | ||

K-means from scratch | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | k-means | Medium | Yes | ||

Linear regression with stochastic gradient descent | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Stochastic gradient descent, Linear regression | Medium | Yes | ||

Logistic regression with gradient descent | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Gradient descent, Logistic regression | Medium | Yes | ||

Naive Bayes from scratch | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Gaussian Naive Bayes | Medium | Yes | ||

Principal Component Analysis (PCA) from scratch | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning Coding | Principal Component Analysis (PCA) | Medium | Yes | ||

Common causes of data leakage | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Data leakage | Medium | Yes | ||

Comparing decision trees with random forests | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Decision trees, Random forests | Easy | Yes | ||

Correlation with binary variables | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Correlation, Hypothesis testing, Point-biserial correlation coefficient | Easy | No | ||

Cross validation | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Cross validation, Offline evaluation | Easy | Yes | ||

Discretization drawbacks | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Categorical variables, Discretization | Easy | Yes | ||

Feature engineering in the era of deep learning | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Feature engineering | Easy | Yes | ||

Gradient boosting vs. random forests | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Gradient boosting, Random forests, Bagging, Boosting | Medium | Yes | ||

Hypothesis testing in regression coefficients | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression, Hypothesis testing | Medium | No | ||

Intercept | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression, Intercept | Easy | No | ||

Interpretability | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | ML interpretability | Easy | Yes | ||

Linear regression assumptions | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression | Easy | Yes | ||

Linear regression with duplicated rows | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression, Statistical significance | Easy | Yes | ||

Logistic regression and standardization | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Logistic regression, Standardization | Easy | Yes | ||

Logistic regression assumptions | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Logistic regression | Easy | Yes | ||

Missing data | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Missing data | Easy | Yes | ||

MSE vs. MAE | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | MSE, MAE | Easy | Yes | ||

Multicollinearity | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Multicollinearity, Linear regression | Medium | Yes | ||

Non-probability sampling | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Sampling, Non-probability | Easy | Yes | ||

Normalization vs. Standardization | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Linear regression, Standardization, Normalization | Easy | Yes | ||

Not enough data to train a model | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Data limitations | Easy | Yes | ||

Outliers | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Outliers, Cook’s distance, Regularization | Easy | Yes | ||

Principal Component Analysis (PCA) | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | PCA | Easy | Yes | ||

Random vs. stratified sampling | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Sampling, Random sampling, Stratified sampling | Easy | Yes | ||

Weighted and importance sampling | Data Scientist, Applied Scientist, Machine Learning Engineer | Machine Learning | Sampling, Weighted sampling, Importance sampling | Easy | No | ||

Characteristics of metrics | Product Data Scientist, Data Scientist | Metrics | Characteristics of metrics | Easy | Yes | ||

Types of metrics | Product Data Scientist, Data Scientist | Metrics | Types of metrics | Easy | Yes | ||

Consecutive tails | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Permutations, Repetition | Easy | No | ||

Largest number rolled | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Counting, Permutations, Repetition | Medium | Yes | ||

Median probability | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Binomial, Uniform, CDF | Medium | Yes | ||

Number of emails | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Poisson distribution | Easy | No | ||

Paths to destination | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Counting, Combinations | Easy | No | ||

Repeated rolls until 4 | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Geometric distribution | Easy | Yes | ||

Sample digits 1-10 | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Sample from samples, Uniform | Medium | Yes | ||

Two fair die rolls | Product Data Scientist, Data Scientist, Applied Scientist | Probability theory | Independence, CDF, PMF | Easy | No | ||

Artists with more songs than others | Product Data Scientist, Data Scientist | SQL | Subquery, CTE, Join, Window functions | Hard | No | ||

Concat columns | Product Data Scientist, Data Scientist | SQL | Concat | Easy | No | ||

Label recent songs | Product Data Scientist, Data Scientist | SQL | Case | Easy | No | ||

Median songs per artist | Product Data Scientist, Data Scientist | SQL | CTE, Window functions | Hard | No | ||

Songs in charts with greater durations | Product Data Scientist, Data Scientist | SQL | Subquery, CTE, Join, Window functions | Hard | No | ||

Songs per genre | Product Data Scientist, Data Scientist | SQL | Group by | Easy | No | ||

Songs with letters | Product Data Scientist, Data Scientist | SQL | Regexp | Easy | No | ||

An intuitive way to write power | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Power, Hypothesis testing | Easy | Yes | ||

Buy and sell stocks | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Gambler ruin, Expectation, Recursion, Random walk | Medium | Yes | ||

CI of flipping heads | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Confidence Interval, CLT, Bernoulli trials | Medium | Yes | ||

Confidence interval definition | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Confidence interval, Hypothesis testing | Easy | Yes | ||

Confidence intervals that overlap | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Confidence interval, Hypothesis testing | Medium | No | ||

Covariance of dependent variables | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Variance, Uniform, Covariance, Expectation | Medium | Yes | ||

Distribution of a CDF | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | CDF, Inverse transform | Medium | No | ||

Dynamic coin flips | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Expectation, Simulation | Hard | Yes | ||

Expected number of consecutive heads | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Expectation | Medium | Yes | ||

Number of draws to get greater than 1 | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Normal, Geometric, CDF, Expectation | Medium | Yes | ||

P-value definition | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | P-value, Hypothesis testing | Easy | Yes | ||

Tests for normality | Product Data Scientist, Data Scientist, Applied Scientist | Statistics | Hypothesis testing, Normality | Easy | Yes | ||

2024 MADInterview.com. All rights reserved.