with a Gmail account to unlock — no forms, no spam.

Role-specific and topic-specific questions with answers

Easy, medium, and hard questions that cover most topics in machine learning and data science interviews. Solutions dive deep into complicated concepts, with references and simulations where needed.

Check out our guides if you are interested in company-specific and role-specific interview prep!

Problem Role Area Topics Difficulty Company (login required) Status
Sample Ratio Mismatch DS PDS AS A/B Testing Sample Ratio Mismatch Medium      
Questions worksheet (recruiter-screen) DS AS MLE PDS Application process Discussion prep, Application prep Medium      
Questions worksheet (behavioral) DS AS MLE PDS Behavioral Discussion prep Hard      
Minimum remove to make valid parentheses (Leetcode 1249) DS AS MLE Data Structures and Algorithms Stack Medium  
Remove duplicates in place (Leetcode 26) DS AS MLE Data Structures and Algorithms Array, Two pointers Easy
Bias-variance equation DS AS MLE Machine Learning Bias-variance tradeoff, Formula derivation Easy        
Linear regression with gradient descent DS AS MLE Machine Learning Coding Gradient descent, Linear regression Medium      
Gradient descent vs. stochastic gradient descent DS AS MLE Machine Learning Gradient descent, Stochastic gradient descent, Minibatch Medium  
Imbalanced dataset PDS DS AS MLE Machine Learning Class imbalance Medium      
Sudden drop in user engagement DS PDS Metrics Problem solving, Root-cause analysis Medium  
Documents edited by AI DS PDS AS Probability theory Bayes rule, Conditional probability Easy        
Trailing by two: should we go for two or three? DS PDS AS Probability theory Independence, Decision making Easy  
Artist maxranks DS AS PDS SQL Join Easy  
Two-sample t-test DS PDS AS Statistics Hypothesis testing Easy  
Choose a project DS AS MLE PDS Technical deep dive Discussion prep Hard
() Normality assumption DS PDS AS A/B Testing Normality Medium
() Reducing variance in AB testing DS PDS AS A/B Testing Variance Medium  
() Simpson’s paradox DS PDS AS A/B Testing Simpson’s paradox Medium    
() Interview success probability DS AS MLE PDS Application process Application prep Easy
() Offer validity and cooling-off periods DS AS MLE PDS Application process Application prep Easy
() Referrals vs. online applications DS AS MLE PDS Application process Application prep Easy
() Response time DS AS MLE PDS Application process Application prep Easy
() Resume review DS AS MLE PDS Application process Application prep, Resume Easy
() Timeline: from application to offer DS AS MLE PDS Application process Application prep Easy
() Up-level during interview DS AS MLE PDS Application process Application prep Easy
() K Closest Points to Origin (Leetcode 973) DS AS MLE Data Structures and Algorithms Heap Easy  
() Longest substring without repeating characters (Leetcode 3) DS AS MLE Data Structures and Algorithms Hash, Sliding Window Medium  
() Number of recent calls (Leetcode 933) DS AS MLE Data Structures and Algorithms Queue, Deque Easy
() Analyze prediction error DS AS MLE Machine Learning Prediction error, Bias-variance tradeoff, Diagnostics, Learning curves Medium        
() AUC ROC and predicted output transformations DS AS MLE Machine Learning AUC ROC Easy
() Sample from random generator DS AS MLE Machine Learning Coding Sample, Uniform, Random number generator Medium    
() Simulate dynamic coin flips DS AS MLE Machine Learning Coding Simulation Easy  
() Encoding categorical features PDS DS AS MLE Machine Learning Categorical features, Embeddings, One-hot encoding, Hashing Easy      
() Example of high-bias and high variance DS AS MLE Machine Learning Bias-variance tradeoff Medium
() k-means PDS DS AS MLE Machine Learning k-means Medium
() L1 (Lasso) vs. L2 (Ridge) regularization DS AS MLE Machine Learning L1, L2, Regularization, Lasso, Ridge Medium
() Linear regression likelihood function DS AS MLE Machine Learning Linear regression, Likelihood function, Formula derivation Medium  
() Model interpretability DS AS MLE Machine Learning Interpretability Easy  
() Multiclass evaluation metrics DS AS MLE Machine Learning Multiclass, Diagnostics Medium    
() NDCG vs. Mean Average Precision (MAP) vs. Mean Reciprocal Rank (MRR) AS MLE Machine Learning Ranking metrics, NDCG, Mean average precision (MAP), Mean Reciprocal Rank (MRR), Recommender Systems Medium        
() Range of R2 when combining regressions PDS DS AS MLE Machine Learning Linear regression, Goodness of fit, R-squared, Correlation Medium  
() ROC vs. PR curve AS MLE Machine Learning AUC, ROC, AUPR, Precision, Recall, Evaluation metrics Medium        
() Measuring counterfactual impact DS PDS Metrics Problem solving, Counterfactual Hard    
() Games between two players DS PDS AS Probability theory Recursive relationship Medium  
() Monty Hall DS PDS AS Probability theory Bayes rule, Conditional independence, Prior evidence Medium
() Red and blue balls DS PDS AS Probability theory Counting, Combinations, Repetition, Binomial Easy  
() Sum of normally distributed random variables DS PDS AS Probability theory Normal, PDF, CDF Medium  
() Two children I DS PDS AS Probability theory Prior evidence Easy  
() Unfair coin probability DS PDS AS Probability theory Bayes rule, Conditional probability Easy
() Choose house or techno DS AS PDS SQL Logical OR Easy
() Expensive house songs I DS AS PDS SQL Subquery, CTE, Join Medium
() Expensive house songs II DS AS PDS SQL Subquery, CTE, Join, Window functions Hard      
() Histogram of songs DS AS PDS SQL Recursive CTE, Join Hard
() Monthly Active Users (MAU) DS AS PDS SQL Aggregation Easy
() Songs that did not enter the charts or entered high DS AS PDS SQL Subquery, Join Medium
() Songs that ranked 1 to 50 DS AS PDS SQL Between Easy
() Songs that stay in the charts for a while DS AS PDS SQL Subquery, CTE, Join, ALL, Window functions Medium
() Biased coin DS PDS AS Statistics Expectation, CLT, Binomial, Normal, Bernoulli, Hypothesis testing, CDF Medium  
() Gambler’s ruin win probability DS PDS AS Statistics Gambler ruin, Random walk, Expectation Medium  
() Manual estimation of flips DS PDS AS Statistics Normal, CDF, Binomial, CLT Medium
() Measuring sticks DS PDS AS Statistics Variance Medium  
() Monotonic draws DS PDS AS Statistics Expectation Hard
() Prussian horses DS PDS AS Statistics Poisson, Hypothesis testing, CDF Medium  
() Relationship between p-value and confidence interval DS PDS AS Statistics Confidence interval, P-value, Hypothesis testing Easy
() Questions worksheet (deep dive) DS AS MLE PDS Technical deep dive Discussion prep Hard
( Subscription required ) AA tests DS PDS AS A/B Testing Variance Easy      
( Subscription required ) Counterfactual definition DS PDS AS A/B Testing Counterfactual Easy  
( Subscription required ) Equal-sized treatment and control groups DS PDS AS A/B Testing Power, Variance, Sample size Medium  
( Subscription required ) False discovery control DS PDS AS A/B Testing False discovery rate, Multiple hypotheses testing, Benjamini & Hochberg, Bonferroni Easy    
( Subscription required ) Interference DS PDS AS A/B Testing Interference Easy      
( Subscription required ) Multi-armed and contextual bandits in AB testing DS PDS AS A/B Testing Contextual bandits, Multi-armed bandits Medium
( Subscription required ) Novelty and primacy effects DS PDS AS A/B Testing Novelty effects, Primacy effects Easy  
( Subscription required ) Randomization checks DS PDS AS A/B Testing Randomization Easy
( Subscription required ) Randomization level DS PDS AS A/B Testing Randomization, Variance Medium
( Subscription required ) Climbing stairs (Leetcode 70) DS AS MLE Data Structures and Algorithms Recursion, Dynamic programming Easy
( Subscription required ) Find if Path Exists in Graph (Leetcode 1971) DS AS MLE Data Structures and Algorithms DFS, BFS Medium  
( Subscription required ) Search in Binary Search Tree (Leetcode 700) DS AS MLE Data Structures and Algorithms Binary Search, Binary Search Tree Easy
( Subscription required ) Sort an Array (Leetcode 912) DS AS MLE Data Structures and Algorithms Recursion, Sorting Medium
( Subscription required ) Activation functions AS MLE Machine Learning Neural networks, Deep learning Easy  
( Subscription required ) Active learning AS MLE Machine Learning Labels, Label sampling Medium  
( Subscription required ) Adagrad vs. RMSProp vs. Adam AS MLE Machine Learning Neural networks, Deep learning, Optimization Hard      
( Subscription required ) Adaptive learning rate AS MLE Machine Learning Neural networks, Deep learning, Optimization Hard      
( Subscription required ) Approximate nearest neighbors AS MLE Machine Learning ANN, ANNOY, Nearest neighbors Medium    
( Subscription required ) Attention (intuition) AS MLE Machine Learning Neural networks, Deep learning, Transformers, LLMs Easy      
( Subscription required ) Baselines AS MLE DS Machine Learning Model evaluation Easy  
( Subscription required ) Bayesian frequentist statistics AS MLE Machine Learning Bayesian, Frequentist Easy
( Subscription required ) Bias-variance biased estimator DS AS MLE Machine Learning Bias-variance tradeoff Medium
( Subscription required ) Bootstrap PDS DS AS MLE Machine Learning Bootstrap Easy    
( Subscription required ) K-means from scratch DS AS MLE Machine Learning Coding k-means Medium      
( Subscription required ) Linear regression with stochastic gradient descent DS AS MLE Machine Learning Coding Stochastic Gradient descent, Linear regression Medium
( Subscription required ) Logistic regression with gradient descent DS AS MLE Machine Learning Coding Gradient descent, Logistic regression Medium
( Subscription required ) Naive Bayes from scratch DS AS MLE Machine Learning Coding Gaussian Naive Bayes Medium  
( Subscription required ) Neural network implementation AS MLE Machine Learning Coding Gradient descent, Neural networks, Neuron Hard  
( Subscription required ) Principal Component Analysis (PCA) from scratch DS AS MLE Machine Learning Coding Principal Component Analysis (PCA) Medium
( Subscription required ) Common causes of data leakage DS AS MLE Machine Learning Data leakage Medium
( Subscription required ) Comparing decision trees with random forests DS AS MLE Machine Learning Decision trees, Random forests Easy  
( Subscription required ) Correlation with binary variables DS AS MLE Machine Learning Correlation, Hypothesis testing, Point-biserial correlation coefficient Easy  
( Subscription required ) Cross validation PDS DS AS MLE Machine Learning Cross validation, Offline evaluation Easy      
( Subscription required ) Decide between a multinomial vs. a binary modeling approach AS MLE Machine Learning Modeling, Multinomial, Binary Easy
( Subscription required ) Discretization drawbacks DS AS MLE Machine Learning Categorical variables, Discretization Easy
( Subscription required ) Ensembles AS MLE DS Machine Learning Ensembles, Numerical example Medium
( Subscription required ) Examples of encoder and decoder models AS MLE Machine Learning LLMs, Transformers, Encoder, Decoder Easy    
( Subscription required ) Exponentially weighted moving average AS MLE Machine Learning Exponentially weighted moving average, Formula derivation, Proof Medium
( Subscription required ) Feature crossing AS MLE Machine Learning Feature engineering, Deep learning Easy  
( Subscription required ) Feature engineering in the era of deep learning DS AS MLE Machine Learning Feature engineering Easy
( Subscription required ) Gini impurity vs. information gain AS MLE Machine Learning Decision tree, Information Gain, Gini impurity Medium
( Subscription required ) Gradient boosting vs. random forests DS AS MLE Machine Learning Gradient boosting, Random forests, Bagging, Boosting Medium        
( Subscription required ) Gradient descent vs. Stochastic Gradient descent and local minima AS MLE Machine Learning Gradient descent (GD), Stochastic gradient descent (SGD), Local minima, Optimization, Deep learning Medium        
( Subscription required ) Gradient descent vs. Stochastic Gradient descent and learning rate AS MLE Machine Learning Gradient descent (GD), Stochastic gradient descent (SGD), Learning rate, Optimization, Deep learning Medium  
( Subscription required ) Predict whether a movie will receive good reviews AS MLE Machine Learning Hands on Feature engineering, Data exploration, ML modeling, Logistic regression, One hot encoding Hard
( Subscription required ) How to get more labels AS MLE Machine Learning Modeling, Label encoding Medium
( Subscription required ) Hypothesis testing in regression coefficients DS AS MLE Machine Learning Linear regression, Hypothesis testing Medium
( Subscription required ) Information gain in decision trees AS MLE Machine Learning Decision tree, Entropy, Information Gain, Formula derivation Medium
( Subscription required ) Intercept PDS DS AS MLE Machine Learning Linear regression, Intercept Easy
( Subscription required ) Interpretability PDS DS AS MLE Machine Learning ML interpretability Easy  
( Subscription required ) L2 regularization vs. weight decay AS MLE Machine Learning Neural networks, Deep learning, Regularization Hard        
( Subscription required ) Linear regression assumptions PDS DS AS MLE Machine Learning Linear regression Easy  
( Subscription required ) Linear regression with duplicated rows DS AS MLE Machine Learning Linear regression, Statistical significance Easy
( Subscription required ) Linear regression with stochastic gradient descent (formula derivation) AS MLE Machine Learning Linear regression, Stochastic gradient descent, Formula derivation Medium
( Subscription required ) Logistic regression and standardization DS AS MLE Machine Learning Logistic regression, Standardization Easy    
( Subscription required ) Logistic regression assumptions PDS DS AS MLE Machine Learning Logistic regression Easy      
( Subscription required ) Minimization of loss function intuition AS MLE Machine Learning Neural networks, Deep learning, Optimization Easy    
( Subscription required ) Missing data PDS DS AS MLE Machine Learning Missing data Easy    
( Subscription required ) Momentum AS MLE Machine Learning Neural networks, Deep learning, Optimization Hard        
( Subscription required ) MSE vs. MAE PDS DS AS MLE Machine Learning MSE, MAE Easy
( Subscription required ) Multi-headed attention and self attention AS MLE Machine Learning Neural networks, Deep learning, Transformers, LLMs Medium    
( Subscription required ) Multicollinearity PDS DS AS MLE Machine Learning Multicollinearity, Linear regression Medium  
( Subscription required ) Negative sampling AS MLE Machine Learning Neural networks, Deep learning, Negative sampling Medium    
( Subscription required ) Neural networks in layman's terms AS MLE Machine Learning Neural networks, Deep learning Easy
( Subscription required ) Non-probability sampling DS AS MLE Machine Learning Sampling, Non-probability Easy
( Subscription required ) Normalization in neural networks AS MLE Machine Learning Neural networks, Deep learning, Batch normalization, Layer normalization Medium      
( Subscription required ) Normalization vs. Standardization PDS DS AS MLE Machine Learning Linear regression, Standardization, Normalization Easy
( Subscription required ) Not enough data to train a model DS AS MLE Machine Learning Data limitations Easy
( Subscription required ) Optimize multiple objectives AS MLE Machine Learning Modeling, Multiple objectives Easy
( Subscription required ) Outliers PDS DS AS MLE Machine Learning Outliers, Cook’s distance, Regularization Easy
( Subscription required ) Overfitting in neural networks AS MLE Machine Learning Neural networks, Deep learning, Overfitting Medium        
( Subscription required ) Positional embeddings AS MLE Machine Learning Feature engineering, Deep learning, Transformers, LLMs, Positional embeddings, Positional encodings Hard  
( Subscription required ) Principal Component Analysis (PCA) PDS DS AS MLE Machine Learning PCA Easy
( Subscription required ) Prove that a median minimizes MAE AS MLE Machine Learning MAE, Median, Formula derivation, Proof Hard
( Subscription required ) Random forest feature importance AS MLE Machine Learning Feature importance, Explainability, Gini importance, Permutation importance Medium      
( Subscription required ) Random vs. stratified sampling PDS DS AS MLE Machine Learning Sampling, Stratified sampling Easy  
( Subscription required ) Self-supervised learning AS MLE Machine Learning Neural networks, Deep learning, Contrastive learning Easy    
( Subscription required ) SMOTE AS MLE Machine Learning Imbalanced classification, SMOTE, Data augmentation Easy        
( Subscription required ) API patterns MLE Machine Learning System Design APIs, GraphQL, REST Easy  
( Subscription required ) Build an ML system to predict Ad clicks AS MLE Machine Learning System Design ML system design, Feature engineering, Data exploration, ML modeling, Monitoring, Deployment, Business metrics Hard  
( Subscription required ) Cloud vs. on-device deployment MLE Machine Learning System Design Deployment, Cloud, Edge Medium  
( Subscription required ) Complex vs. simple deployment MLE Machine Learning System Design Deployment Easy
( Subscription required ) Crons, schedulers, orchestrators MLE Machine Learning System Design ML infra Medium
( Subscription required ) Data, model, and pipeline parallelism MLE Machine Learning System Design Parallelism Medium
( Subscription required ) Debug an ML model MLE Machine Learning System Design Best practices Medium
( Subscription required ) How to speed up inference MLE Machine Learning System Design Inference Easy        
( Subscription required ) ML system design tools and use cases MLE Machine Learning System Design ML infra, CDN, Kafka, Redis, Dynamo, Cassandra, Chubby, PGVector, DBT, Feast, MLFlow, Statsig, Airflow, Fiddler Hard
( Subscription required ) Online prediction vs. batch prediction MLE Machine Learning System Design Inference Medium
( Subscription required ) Simple model deployment process MLE Machine Learning System Design Deployment, Docker Easy
( Subscription required ) Training tracking MLE Machine Learning System Design Best practices Medium
( Subscription required ) Types of data distribution shifts MLE Machine Learning System Design Train-serving skew, Covariate shift, Label shift, Concept shift Medium    
( Subscription required ) Transfer learning AS MLE Machine Learning Neural networks, Deep learning, Transformers, LLMs, Catastrophic forgetting Easy
( Subscription required ) Transfer learning vs. knowledge distillation AS MLE Machine Learning LLM, Deep learning, Transfer learning, Knowledge distillation Medium
( Subscription required ) Vanishing and exploding gradients (mathematical explanation) AS MLE Machine Learning Neural networks, Deep learning, Mathematical explanation Hard
( Subscription required ) Weight initialization AS MLE Machine Learning Neural networks, Deep learning Medium  
( Subscription required ) Weighted and importance sampling DS AS MLE Machine Learning Sampling, Weighted sampling, Importance sampling Easy
( Subscription required ) Characteristics of metrics DS PDS Metrics Characteristics of metrics Easy    
( Subscription required ) Types of metrics DS PDS Metrics Types of metrics Easy
( Subscription required ) Consecutive tails DS PDS AS Probability theory Permutations, Repetition Easy
( Subscription required ) Largest number rolled DS PDS AS Probability theory Counting, Permutations, Repetition Medium
( Subscription required ) Median probability DS PDS AS Probability theory Binomial, Uniform, CDF Medium  
( Subscription required ) Number of emails DS PDS AS Probability theory Poisson distribution Easy  
( Subscription required ) Paths to destination DS PDS AS Probability theory Counting, Combinations Easy
( Subscription required ) Repeated rolls until 4 DS PDS AS Probability theory Geometric distribution Easy
( Subscription required ) Sample digits 1-10 DS PDS AS Probability theory Sample from samples, Uniform Medium      
( Subscription required ) Two fair die rolls DS PDS AS Probability theory Independence, CDF, PMF Easy
( Subscription required ) Artists with more songs than others DS AS PDS SQL Subquery, CTE, Join, Window functions Hard      
( Subscription required ) Concat columns DS AS PDS SQL Concat Easy
( Subscription required ) Engagement Score by User DS AS PDS SQL CTEs, WINDOW FUNCTION Hard
( Subscription required ) Follower-Following Ratios DS AS PDS SQL CTEs, Aggregation Medium
( Subscription required ) Label recent songs DS AS PDS SQL Case Easy
( Subscription required ) Median songs per artist DS AS PDS SQL CTE, Window functions Hard
( Subscription required ) Most Liked Content Per User DS AS PDS SQL Join, Window Function, CTE Medium
( Subscription required ) Songs in charts with greater durations DS AS PDS SQL Subquery, CTE, Join, Window functions Hard
( Subscription required ) Songs per genre DS AS PDS SQL Group by Easy
( Subscription required ) Songs with letters DS AS PDS SQL Regexp Easy
( Subscription required ) An intuitive way to write power DS PDS AS Statistics Power, Hypothesis testing Easy
( Subscription required ) Buy and sell stocks DS PDS AS Statistics Gambler ruin, Expectation, Recursion, Random walk Medium  
( Subscription required ) CI of flipping heads DS PDS AS Statistics Confidence Interval, CLT, Bernoulli trials Medium
( Subscription required ) Confidence interval definition DS PDS AS Statistics Confidence interval, Hypothesis testing Easy  
( Subscription required ) Confidence intervals that overlap DS PDS AS Statistics Confidence interval, Hypothesis testing Medium
( Subscription required ) Covariance of dependent variables DS PDS AS Statistics Variance, Uniform, Covariance, Expectation Medium
( Subscription required ) Distribution of a CDF DS PDS AS Statistics CDF, Inverse transform Medium
( Subscription required ) Dynamic coin flips DS PDS AS Statistics Expectation, Simulation Hard  
( Subscription required ) Expected number of consecutive heads DS PDS AS Statistics Expectation Medium
( Subscription required ) Number of draws to get greater than 1 DS PDS AS Statistics Normal, Geometric, CDF, Expectation Medium
( Subscription required ) P-value definition DS PDS AS Statistics P-value, Hypothesis testing Easy        
( Subscription required ) Tests for normality DS PDS AS Statistics Hypothesis testing, Normality Easy