Role-specific and topic-specific questions with answers

Easy, medium, and hard questions that cover most topics in machine learning and data science interviews. Solutions that deep dive into explaining complicated concepts, with necessary references and simulations when needed.

Problem Role Area Topics Difficulty Company (login required) Status
Sample Ratio Mismatch DS PDS AS A/B Testing Sample Ratio Mismatch Medium      
Questions worksheet (recruiter-screen) DS AS MLE PDS Application process Discussion prep, Application prep Medium      
Questions worksheet (behavioral) DS AS MLE PDS Behavioral Discussion prep Hard      
Minimum remove to make valid parentheses (Leetcode 1249) DS AS MLE Data Structures and Algorithms Stack Medium  
Remove duplicates in place (Leetcode 26) DS AS MLE Data Structures and Algorithms Array, Two pointers Easy
Bias-variance equation DS AS MLE Machine Learning Bias-variance tradeoff, Formula derivation Easy        
Linear regression with gradient descent DS AS MLE Machine Learning Coding Gradient descent, Linear regression Medium      
Gradient descent vs. stochastic gradient descent DS AS MLE Machine Learning Gradient descent, Stochastic gradient descent, Minibatch Medium  
Imbalanced dataset PDS DS AS MLE Machine Learning Class imbalance Medium      
Sudden drop in user engagement DS PDS Metrics Problem solving, Root-cause analysis Medium  
Documents edited by AI DS PDS AS Probability theory Bayes rule, Conditional probability Easy        
Trailing by two: should we go for two or three? DS PDS AS Probability theory Independence, Decision making Easy  
Artist maxranks DS AS PDS SQL Join Easy  
Two-sample t-test DS PDS AS Statistics Hypothesis testing Easy  
Choose a project DS AS MLE PDS Technical deep dive Discussion prep Hard
   Normality assumption DS PDS AS A/B Testing Normality Medium
   Reducing variance in AB testing DS PDS AS A/B Testing Variance Medium  
   Simpson’s paradox DS PDS AS A/B Testing Simpson’s paradox Medium    
   Interview success probability DS AS MLE PDS Application process Application prep Easy
   Offer validity and cooling-off periods DS AS MLE PDS Application process Application prep Easy
   Referrals vs. online applications DS AS MLE PDS Application process Application prep Easy
   Response time DS AS MLE PDS Application process Application prep Easy
   Resume review DS AS MLE PDS Application process Application prep, Resume Easy
   Timeline: from application to offer DS AS MLE PDS Application process Application prep Easy
   Up-level during interview DS AS MLE PDS Application process Application prep Easy
   K Closest Points to Origin (Leetcode 973) DS AS MLE Data Structures and Algorithms Heap Easy  
   Longest substring without repeating characters (Leetcode 3) DS AS MLE Data Structures and Algorithms Hash, Sliding Window Medium  
   Number of recent calls (Leetcode 933) DS AS MLE Data Structures and Algorithms Queue, Deque Easy
   Analyze prediction error DS AS MLE Machine Learning Prediction error, Bias-variance tradeoff, Diagnostics, Learning curves Medium        
   AUC ROC and predicted output transformations DS AS MLE Machine Learning AUC ROC Easy
   Sample from random generator DS AS MLE Machine Learning Coding Sample, Uniform, Random number generator Medium    
   Simulate dynamic coin flips DS AS MLE Machine Learning Coding Simulation Easy  
   Encoding categorical features PDS DS AS MLE Machine Learning Categorical features, Embeddings, One-hot encoding, Hashing Easy      
   Example of high bias and high variance DS AS MLE Machine Learning Bias-variance tradeoff Medium
   k-means PDS DS AS MLE Machine Learning k-means Medium
   L1 (Lasso) vs. L2 (Ridge) regularization DS AS MLE Machine Learning L1, L2, Regularization, Lasso, Ridge Medium
   Linear regression likelihood function DS AS MLE Machine Learning Linear regression, Likelihood function, Formula derivation Medium  
   Model interpretability DS AS MLE Machine Learning Interpretability Easy  
   Multiclass evaluation metrics DS AS MLE Machine Learning Multiclass, Diagnostics Medium    
   NDCG vs. Mean Average Precision (MAP) vs. Mean Reciprocal Rank (MRR) AS MLE Machine Learning Ranking metrics, NDCG, Mean average precision (MAP), Mean Reciprocal Rank (MRR), Recommender Systems Medium        
   Range of R2 when combining regressions PDS DS AS MLE Machine Learning Linear regression, Goodness of fit, R-squared, Correlation Medium  
   ROC vs. PR curve AS MLE Machine Learning AUC, ROC, AUPR, Precision, Recall, Evaluation metrics Medium        
   Measuring counterfactual impact DS PDS Metrics Problem solving, Counterfactual Hard    
   Games between two players DS PDS AS Probability theory Recursive relationship Medium  
   Monty Hall DS PDS AS Probability theory Bayes rule, Conditional independence, Prior evidence Medium
   Red and blue balls DS PDS AS Probability theory Counting, Combinations, Repetition, Binomial Easy  
   Sum of normally distributed random variables DS PDS AS Probability theory Normal, PDF, CDF Medium  
   Two children I DS PDS AS Probability theory Prior evidence Easy  
   Unfair coin probability DS PDS AS Probability theory Bayes rule, Conditional probability Easy
   Choose house or techno DS AS PDS SQL Logical OR Easy
   Expensive house songs I DS AS PDS SQL Subquery, CTE, Join Medium
   Expensive house songs II DS AS PDS SQL Subquery, CTE, Join, Window functions Hard      
   Histogram of songs DS AS PDS SQL Recursive CTE, Join Hard
   Songs that did not enter the charts or entered high DS AS PDS SQL Subquery, Join Medium
   Songs that ranked 1 to 50 DS AS PDS SQL Between Easy
   Songs that stay in the charts for a while DS AS PDS SQL Subquery, CTE, Join, ALL, Window functions Medium
   Biased coin DS PDS AS Statistics Expectation, CLT, Binomial, Normal, Bernoulli, Hypothesis testing, CDF Medium  
   Gambler’s ruin win probability DS PDS AS Statistics Gambler ruin, Random walk, Expectation Medium  
   Manual estimation of flips DS PDS AS Statistics Normal, CDF, Binomial, CLT Medium
   Measuring sticks DS PDS AS Statistics Variance Medium  
   Monotonic draws DS PDS AS Statistics Expectation Hard
   Prussian horses DS PDS AS Statistics Poisson, Hypothesis testing, CDF Medium  
   Relationship between p-value and confidence interval DS PDS AS Statistics Confidence interval, P-value, Hypothesis testing Easy
   Questions worksheet (deep dive) DS AS MLE PDS Technical deep dive Discussion prep Hard
   AA tests DS PDS AS A/B Testing Variance Easy      
   Counterfactual definition DS PDS AS A/B Testing Counterfactual Easy  
   Equal-sized treatment and control groups DS PDS AS A/B Testing Power, Variance, Sample size Medium  
   False discovery control DS PDS AS A/B Testing False discovery rate, Multiple hypotheses testing, Benjamini & Hochberg, Bonferroni Easy    
   Interference DS PDS AS A/B Testing Interference Easy      
   Multi-armed and contextual bandits in AB testing DS PDS AS A/B Testing Contextual bandits, Multi-armed bandits Medium
   Novelty and primacy effects DS PDS AS A/B Testing Novelty effects, Primacy effects Easy  
   Randomization checks DS PDS AS A/B Testing Randomization Easy
   Randomization level DS PDS AS A/B Testing Randomization, Variance Medium
   Climbing stairs (Leetcode 70) DS AS MLE Data Structures and Algorithms Recursion, Dynamic programming Easy
   Find if Path Exists in Graph (Leetcode 1971) DS AS MLE Data Structures and Algorithms DFS, BFS Medium  
   Search in Binary Search Tree (Leetcode 700) DS AS MLE Data Structures and Algorithms Binary Search, Binary Search Tree Easy
   Sort an Array (Leetcode 912) DS AS MLE Data Structures and Algorithms Recursion, Sorting Medium
   Activation functions AS MLE Machine Learning Neural networks, Deep learning Easy  
   Active learning AS MLE Machine Learning Labels, Label sampling Medium  
   Adagrad vs. RMSProp vs. Adam AS MLE Machine Learning Neural networks, Deep learning, Optimization Hard      
   Adaptive learning rate AS MLE Machine Learning Neural networks, Deep learning, Optimization Hard      
   Approximate nearest neighbors AS MLE Machine Learning ANN, ANNOY, Nearest neighbors Medium    
   Attention (intuition) AS MLE Machine Learning Neural networks, Deep learning, Transformers, LLMs Easy      
   Baselines AS MLE DS Machine Learning Model evaluation Easy  
   Bayesian frequentist statistics AS MLE Machine Learning Bayesian, Frequentist Easy
   Bias-variance biased estimator DS AS MLE Machine Learning Bias-variance tradeoff Medium
   Bootstrap PDS DS AS MLE Machine Learning Bootstrap Easy    
   K-means from scratch DS AS MLE Machine Learning Coding k-means Medium      
   Linear regression with stochastic gradient descent DS AS MLE Machine Learning Coding Stochastic Gradient descent, Linear regression Medium
   Logistic regression with gradient descent DS AS MLE Machine Learning Coding Gradient descent, Logistic regression Medium
   Naive Bayes from scratch DS AS MLE Machine Learning Coding Gaussian Naive Bayes Medium  
   Neural network implementation AS MLE Machine Learning Coding Gradient descent, Neural networks, Neuron Hard  
   Principal Component Analysis (PCA) from scratch DS AS MLE Machine Learning Coding Principal Component Analysis (PCA) Medium
   Common causes of data leakage DS AS MLE Machine Learning Data leakage Medium
   Comparing decision trees with random forests DS AS MLE Machine Learning Decision trees, Random forests Easy  
   Correlation with binary variables DS AS MLE Machine Learning Correlation, Hypothesis testing, Point-biserial correlation coefficient Easy  
   Cross validation PDS DS AS MLE Machine Learning Cross validation, Offline evaluation Easy      
   Decide between a multinomial vs. a binary modeling approach AS MLE Machine Learning Modeling, Multinomial, Binary Easy
   Discretization drawbacks DS AS MLE Machine Learning Categorical variables, Discretization Easy
   Ensembles AS MLE DS Machine Learning Ensembles, Numerical example Medium
   Examples of encoder and decoder models AS MLE Machine Learning LLMs, Transformers, Encoder, Decoder Easy    
   Exponentially weighted moving average AS MLE Machine Learning Exponentially weighted moving average, Formula derivation, Proof Medium
   Feature crossing AS MLE Machine Learning Feature engineering, Deep learning Easy  
   Feature engineering in the era of deep learning DS AS MLE Machine Learning Feature engineering Easy
   Gini impurity vs. information gain AS MLE Machine Learning Decision tree, Information Gain, Gini impurity Medium
   Gradient boosting vs. random forests DS AS MLE Machine Learning Gradient boosting, Random forests, Bagging, Boosting Medium        
   Gradient descent vs. Stochastic Gradient descent and local minima AS MLE Machine Learning Gradient descent (GD), Stochastic gradient descent (SGD), Local minima, Optimization, Deep learning Medium        
   Gradient descent vs. Stochastic Gradient descent and learning rate AS MLE Machine Learning Gradient descent (GD), Stochastic gradient descent (SGD), Learning rate, Optimization, Deep learning Medium  
   Predict whether a movie will receive good reviews AS MLE Machine Learning Hands on Feature engineering, Data exploration, ML modeling, Logistic regression, One hot encoding Hard
   How to get more labels AS MLE Machine Learning Modeling, Label encoding Medium
   Hypothesis testing in regression coefficients DS AS MLE Machine Learning Linear regression, Hypothesis testing Medium
   Information gain in decision trees AS MLE Machine Learning Decision tree, Entropy, Information Gain, Formula derivation Medium
   Intercept PDS DS AS MLE Machine Learning Linear regression, Intercept Easy
   Interpretability PDS DS AS MLE Machine Learning ML interpretability Easy  
   L2 regularization vs. weight decay AS MLE Machine Learning Neural networks, Deep learning, Regularization Hard        
   Linear regression assumptions PDS DS AS MLE Machine Learning Linear regression Easy  
   Linear regression with duplicated rows DS AS MLE Machine Learning Linear regression, Statistical significance Easy
   Linear regression with stochastic gradient descent (formula derivation) AS MLE Machine Learning Linear regression, Stochastic gradient descent, Formula derivation Medium
   Logistic regression and standardization DS AS MLE Machine Learning Logistic regression, Standardization Easy    
   Logistic regression assumptions PDS DS AS MLE Machine Learning Logistic regression Easy      
   Minimization of loss function intuition AS MLE Machine Learning Neural networks, Deep learning, Optimization Easy    
   Missing data PDS DS AS MLE Machine Learning Missing data Easy    
   Momentum AS MLE Machine Learning Neural networks, Deep learning, Optimization Hard        
   MSE vs. MAE PDS DS AS MLE Machine Learning MSE, MAE Easy
   Multi-headed attention and self attention AS MLE Machine Learning Neural networks, Deep learning, Transformers, LLMs Medium    
   Multicollinearity PDS DS AS MLE Machine Learning Multicollinearity, Linear regression Medium  
   Negative sampling AS MLE Machine Learning Neural networks, Deep learning, Negative sampling Medium    
   Neural networks in layman's terms AS MLE Machine Learning Neural networks, Deep learning Easy
   Non-probability sampling DS AS MLE Machine Learning Sampling, Non-probability Easy
   Normalization in neural networks AS MLE Machine Learning Neural networks, Deep learning, Batch normalization, Layer normalization Medium      
   Normalization vs. Standardization PDS DS AS MLE Machine Learning Linear regression, Standardization, Normalization Easy
   Not enough data to train a model DS AS MLE Machine Learning Data limitations Easy
   Optimize multiple objectives AS MLE Machine Learning Modeling, Multiple objectives Easy
   Outliers PDS DS AS MLE Machine Learning Outliers, Cook’s distance, Regularization Easy
   Overfitting in neural networks AS MLE Machine Learning Neural networks, Deep learning, Overfitting Medium        
   Positional embeddings AS MLE Machine Learning Feature engineering, Deep learning, Transformers, LLMs, Positional embeddings, Positional encodings Hard  
   Principal Component Analysis (PCA) PDS DS AS MLE Machine Learning PCA Easy
   Prove that a median minimizes MAE AS MLE Machine Learning MAE, Median, Formula derivation, Proof Hard
   Random forest feature importance AS MLE Machine Learning Feature importance, Explainability, Gini importance, Permutation importance Medium      
   Random vs. stratified sampling PDS DS AS MLE Machine Learning Sampling, Stratified sampling Easy  
   Self-supervised learning AS MLE Machine Learning Neural networks, Deep learning, Contrastive learning Easy    
   SMOTE AS MLE Machine Learning Imbalanced classification, SMOTE, Data augmentation Easy        
   API patterns MLE Machine Learning System Design APIs, GraphQL, REST Easy  
   Build an ML system to predict Ad clicks AS MLE Machine Learning System Design ML system design, Feature engineering, Data exploration, ML modeling, Monitoring, Deployment, Business metrics Hard  
   Cloud vs. on-device deployment MLE Machine Learning System Design Deployment, Cloud, Edge Medium  
   Complex vs. simple deployment MLE Machine Learning System Design Deployment Easy
   Crons, schedulers, orchestrators MLE Machine Learning System Design ML infra Medium
   Data, model, and pipeline parallelism MLE Machine Learning System Design Parallelism Medium
   Debug an ML model MLE Machine Learning System Design Best practices Medium
   How to speed up inference MLE Machine Learning System Design Inference Easy        
   ML system design tools and use cases MLE Machine Learning System Design ML infra, CDN, Kafka, Redis, Dynamo, Cassandra, Chubby, PGVector, DBT, Feast, MLflow, Statsig, Airflow, Fiddler Hard
   Online prediction vs. batch prediction MLE Machine Learning System Design Inference Medium
   Simple model deployment process MLE Machine Learning System Design Deployment, Docker Easy
   Training tracking MLE Machine Learning System Design Best practices Medium
   Types of data distribution shifts MLE Machine Learning System Design Train-serving skew, Covariate shift, Label shift, Concept shift Medium    
   Transfer learning AS MLE Machine Learning Neural networks, Deep learning, Transformers, LLMs, Catastrophic forgetting Easy
   Transfer learning vs. knowledge distillation AS MLE Machine Learning LLM, Deep learning, Transfer learning, Knowledge distillation Medium
   Vanishing and exploding gradients (mathematical explanation) AS MLE Machine Learning Neural networks, Deep learning, Mathematical explanation Hard
   Weight initialization AS MLE Machine Learning Neural networks, Deep learning Medium  
   Weighted and importance sampling DS AS MLE Machine Learning Sampling, Weighted sampling, Importance sampling Easy
   Characteristics of metrics DS PDS Metrics Characteristics of metrics Easy    
   Types of metrics DS PDS Metrics Types of metrics Easy
   Consecutive tails DS PDS AS Probability theory Permutations, Repetition Easy
   Largest number rolled DS PDS AS Probability theory Counting, Permutations, Repetition Medium
   Median probability DS PDS AS Probability theory Binomial, Uniform, CDF Medium  
   Number of emails DS PDS AS Probability theory Poisson distribution Easy  
   Paths to destination DS PDS AS Probability theory Counting, Combinations Easy
   Repeated rolls until 4 DS PDS AS Probability theory Geometric distribution Easy
   Sample digits 1-10 DS PDS AS Probability theory Sample from samples, Uniform Medium      
   Two fair die rolls DS PDS AS Probability theory Independence, CDF, PMF Easy
   Artists with more songs than others DS AS PDS SQL Subquery, CTE, Join, Window functions Hard      
   Concat columns DS AS PDS SQL Concat Easy
   Label recent songs DS AS PDS SQL Case Easy
   Median songs per artist DS AS PDS SQL CTE, Window functions Hard
   Songs in charts with greater durations DS AS PDS SQL Subquery, CTE, Join, Window functions Hard
   Songs per genre DS AS PDS SQL Group by Easy
   Songs with letters DS AS PDS SQL Regexp Easy
   An intuitive way to write power DS PDS AS Statistics Power, Hypothesis testing Easy
   Buy and sell stocks DS PDS AS Statistics Gambler ruin, Expectation, Recursion, Random walk Medium  
   CI of flipping heads DS PDS AS Statistics Confidence Interval, CLT, Bernoulli trials Medium
   Confidence interval definition DS PDS AS Statistics Confidence interval, Hypothesis testing Easy  
   Confidence intervals that overlap DS PDS AS Statistics Confidence interval, Hypothesis testing Medium
   Covariance of dependent variables DS PDS AS Statistics Variance, Uniform, Covariance, Expectation Medium
   Distribution of a CDF DS PDS AS Statistics CDF, Inverse transform Medium
   Dynamic coin flips DS PDS AS Statistics Expectation, Simulation Hard  
   Expected number of consecutive heads DS PDS AS Statistics Expectation Medium
   Number of draws to get greater than 1 DS PDS AS Statistics Normal, Geometric, CDF, Expectation Medium
   P-value definition DS PDS AS Statistics P-value, Hypothesis testing Easy        
   Tests for normality DS PDS AS Statistics Hypothesis testing, Normality Easy