Top Interview Questions
Linear Regression is one of the most fundamental and widely used algorithms in statistics and machine learning. It is primarily employed for predictive modeling, helping to understand the relationship between a dependent variable (target) and one or more independent variables (predictors). Despite its simplicity, linear regression serves as the foundation for more complex techniques and is essential for data analysis across industries such as finance, healthcare, marketing, and engineering.
At its core, linear regression assumes a linear relationship between the dependent variable $Y$ and independent variable(s) $X$. Mathematically, the simplest form is expressed as:
$$Y = \beta_0 + \beta_1 X + \epsilon$$
Where:
$Y$ = Dependent variable (the outcome we want to predict)
$X$ = Independent variable (predictor or feature)
$\beta_0$ = Intercept (value of $Y$ when $X = 0$)
$\beta_1$ = Slope (amount by which $Y$ changes for a unit change in $X$)
$\epsilon$ = Error term (difference between actual and predicted values)
When there is more than one independent variable, it becomes Multiple Linear Regression:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \epsilon$$
This allows modeling of more complex real-world scenarios.
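To make this concrete, here is a minimal scikit-learn sketch of fitting a multiple linear regression. The data is synthetic and the feature meanings (size, bedrooms) are illustrative assumptions, not taken from any real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic illustrative data: house size (sq ft) and bedrooms -> price
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(500, 3500, 200),   # size
    rng.integers(1, 6, 200),       # bedrooms
])
y = 50_000 + 120 * X[:, 0] + 15_000 * X[:, 1] + rng.normal(0, 20_000, 200)

# Fits beta_0 (intercept) and beta_1..beta_n (coefficients) by least squares
model = LinearRegression()
model.fit(X, y)

print("Intercept (beta_0):", model.intercept_)
print("Coefficients (beta_1, beta_2):", model.coef_)
print("Prediction for a 2000 sq ft, 3-bedroom house:", model.predict([[2000, 3]]))
```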
Linear regression works well under certain assumptions. Understanding these assumptions is critical because violation can lead to misleading results:
Linearity: The relationship between the independent and dependent variables should be linear.
Independence: Observations must be independent of each other.
Homoscedasticity: Constant variance of the errors ($\epsilon$) across all levels of the independent variables.
Normality of Errors: The residuals (differences between actual and predicted values) should be normally distributed.
No multicollinearity (for multiple regression): Independent variables should not be highly correlated with each other.
If these assumptions hold, linear regression produces unbiased, consistent, and efficient estimates of the coefficients.
Linear regression can be broadly categorized into:
Simple Linear Regression:
Involves one independent variable predicting a dependent variable. Example: Predicting a person’s weight based on their height.
Multiple Linear Regression:
Involves multiple independent variables. Example: Predicting house prices based on size, location, and number of bedrooms.
Polynomial Regression (technically an extension):
Models nonlinear relationships by introducing polynomial terms of the predictors. Example: $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$
Ridge and Lasso Regression (Regularized Linear Regression):
These methods add penalties to prevent overfitting in models with many variables. Ridge uses an $L_2$ penalty; Lasso uses an $L_1$ penalty.
The goal of linear regression is to fit the best line that minimizes the error between predicted and actual values. The most common approach is Ordinary Least Squares (OLS), which minimizes the sum of squared errors (SSE):
$$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
Where:
$Y_i$ = Actual value
$\hat{Y}_i$ = Predicted value
Minimizing SSE ensures that predictions are as close as possible to actual outcomes. Gradient descent can also be used to iteratively adjust coefficients in large datasets.
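As an illustration of how OLS works, the sketch below solves the normal equations $\beta = (X^TX)^{-1}X^Ty$ on synthetic data with NumPy and cross-checks the answer with `numpy.linalg.lstsq`; the data and variable names are assumptions made purely for the example.

```python
import numpy as np

# Synthetic data: y = 2 + 3x + noise
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 100)
y = 2.0 + 3.0 * x + rng.normal(0, 1.0, 100)

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])

# Normal equations: beta = (X^T X)^{-1} X^T y, which minimizes the SSE
beta = np.linalg.solve(X.T @ X, X.T @ y)
print("Intercept, slope:", beta)

# SSE at the fitted coefficients
sse = np.sum((y - X @ beta) ** 2)
print("SSE:", sse)

# Same answer via NumPy's built-in least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print("lstsq check:", beta_lstsq)
```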
Evaluating a linear regression model requires assessing how well the model predicts the dependent variable. Common metrics include:
R-squared ($R^2$):
Measures the proportion of variance in the dependent variable explained by independent variables. Ranges from 0 to 1.
$$R^2 = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum_i (Y_i - \bar{Y})^2}$$
Adjusted R-squared:
Adjusts R-squared for the number of predictors in the model, preventing overestimation of model performance in multiple regression.
Mean Absolute Error (MAE):
Average of absolute differences between predicted and actual values.
$$MAE = \frac{1}{n} \sum_{i=1}^{n} |Y_i - \hat{Y}_i|$$
Mean Squared Error (MSE) & Root Mean Squared Error (RMSE):
Measures the average squared difference. RMSE is the square root of MSE and is in the same unit as the dependent variable.
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
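A minimal sketch of computing these metrics with scikit-learn, using made-up actual and predicted values purely for illustration:

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Made-up actual vs. predicted values, for illustration only
y_true = np.array([3.0, 5.0, 7.5, 9.0, 11.0])
y_pred = np.array([2.8, 5.4, 7.0, 9.5, 10.6])

r2 = r2_score(y_true, y_pred)               # proportion of variance explained
mae = mean_absolute_error(y_true, y_pred)   # average absolute error
mse = mean_squared_error(y_true, y_pred)    # average squared error
rmse = np.sqrt(mse)                         # same units as the target

print(f"R^2 = {r2:.3f}, MAE = {mae:.3f}, MSE = {mse:.3f}, RMSE = {rmse:.3f}")
```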
Building a linear regression model typically follows these steps:
Data Collection: Gather relevant data for the predictors and target variable.
Data Preprocessing: Handle missing values, outliers, and categorical variables (encoding).
Exploratory Data Analysis (EDA): Visualize relationships using scatter plots, correlation matrices, etc.
Splitting Data: Divide dataset into training and testing sets to evaluate performance.
Model Training: Fit a linear regression model using OLS or gradient descent.
Model Evaluation: Assess performance using metrics like R-squared, MAE, RMSE.
Model Interpretation: Examine coefficients to understand the impact of each variable.
Prediction: Use the model to predict new observations.
Key advantages of linear regression:
Simplicity and Interpretability: Easy to understand and explain.
Computational Efficiency: Requires less computation than complex models.
Basis for Other Models: Foundation for logistic regression, generalized linear models, and regularized regression.
Predictive Power: Works well for linear relationships and moderate datasets.
Key limitations:
Assumes Linearity: Fails if relationships are nonlinear.
Sensitive to Outliers: Extreme values can significantly impact the model.
Multicollinearity: Highly correlated predictors distort coefficient estimates.
Overfitting/Underfitting: May overfit if too many predictors or underfit if relationships are complex.
Cannot Model Complex Patterns: Limited compared to tree-based or neural network models.
Common applications by industry:
Finance: Predicting stock prices, loan defaults, or credit scores.
Healthcare: Estimating disease progression, patient outcomes, or drug effectiveness.
Marketing: Forecasting sales, customer behavior, or advertising effectiveness.
Economics: Modeling economic indicators like GDP growth, inflation, or unemployment.
Engineering: Predicting material strength, energy consumption, or equipment failure.
Linear regression’s versatility makes it applicable in almost any field where quantitative prediction is needed.
Answer:
Linear Regression is a supervised machine learning algorithm used to predict a continuous numerical value based on one or more input variables.
It finds the best-fit straight line that represents the relationship between:
Independent variable(s) (X)
Dependent variable (Y)
Example:
Predicting salary based on years of experience.
Answer:
The equation of a straight line:
$$Y = mX + c$$
Where:
Y = dependent variable (output)
X = independent variable (input)
m = slope (how much Y changes when X changes)
c = intercept (value of Y when X = 0)
Answer:
Simple Linear Regression involves:
One independent variable
One dependent variable
Example:
Salary = f(Experience)
$$\text{Salary} = m \times \text{Experience} + c$$
Answer:
Multiple Linear Regression uses more than one independent variable.
Equation:
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$$
Example:
House price based on:
Area
Location
Number of rooms
Answer:
Linear Regression works best when the following assumptions are met:
Linearity – Relationship between X and Y is linear
Independence – Observations are independent
Homoscedasticity – Constant variance of errors
Normality – Residuals are normally distributed
No multicollinearity – Independent variables are not highly correlated
Answer:
The cost function measures how well the model fits the data.
Most commonly used:
$$MSE = \frac{1}{n} \sum (Y_{\text{actual}} - Y_{\text{predicted}})^2$$
The goal is to minimize this error.
Answer:
Gradient Descent is an optimization algorithm used to find the best values of m and c by minimizing the cost function.
Steps:
Start with random values of m and c
Calculate error
Update m and c in the direction of minimum error
Repeat until convergence
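The steps above can be sketched in a few lines of NumPy for simple linear regression; the data is synthetic and the learning rate and iteration count are arbitrary illustrative choices.

```python
import numpy as np

# Synthetic data: y = 4 + 2.5x + noise
rng = np.random.default_rng(1)
x = rng.uniform(0, 5, 200)
y = 4.0 + 2.5 * x + rng.normal(0, 0.5, 200)

m, c = 0.0, 0.0          # start with arbitrary slope and intercept
alpha = 0.05             # learning rate (illustrative choice)
n = len(x)

for _ in range(2000):
    y_pred = m * x + c
    error = y_pred - y
    # Gradients of the MSE with respect to m and c
    dm = (2 / n) * np.sum(error * x)
    dc = (2 / n) * np.sum(error)
    # Update in the direction that reduces the error
    m -= alpha * dm
    c -= alpha * dc

print(f"Estimated slope m = {m:.3f}, intercept c = {c:.3f}")
```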
Answer:
The learning rate (α) controls the size of each step taken during gradient descent.
Too large → overshoots minimum
Too small → slow convergence
Answer:
Residual = Actual value − Predicted value
$$\text{Residual} = Y_{\text{actual}} - Y_{\text{predicted}}$$
Residuals help evaluate model accuracy.
Answer:
R² shows how much variance in the dependent variable is explained by the model.
R² = 1 → perfect fit
R² = 0 → no explanatory power
Example:
R² = 0.85 means 85% of the variation is explained.
Answer:
Adjusted R² adjusts R² based on the number of independent variables.
It penalizes adding irrelevant features.
| R² | Adjusted R² |
|---|---|
| Increases when variables are added | Increases only if variables are useful |
| Can be misleading | More reliable |
Answer:
Multicollinearity occurs when independent variables are highly correlated with each other.
Problems:
Unstable coefficients
Difficult interpretation
Solution:
Remove correlated variables
Use VIF (Variance Inflation Factor)
Answer:
VIF measures how much the variance of a regression coefficient is inflated by multicollinearity.
VIF < 5 → acceptable
VIF > 10 → serious multicollinearity
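A sketch of computing VIF with statsmodels on a made-up DataFrame; the column names and correlation structure are assumptions chosen to make the effect visible.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Made-up predictors; x2 is deliberately correlated with x1
rng = np.random.default_rng(0)
x1 = rng.normal(0, 1, 300)
x2 = 0.9 * x1 + rng.normal(0, 0.3, 300)   # highly correlated with x1
x3 = rng.normal(0, 1, 300)                # independent
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Add a constant so the VIFs correspond to a model with an intercept
X_const = sm.add_constant(X)
vif = pd.Series(
    [variance_inflation_factor(X_const.values, i) for i in range(1, X_const.shape[1])],
    index=X_const.columns[1:],
)
print(vif)   # expect inflated VIFs for x1 and x2, and ~1 for x3
```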
Answer:
Overfitting occurs when the model:
Fits training data very well
Performs poorly on unseen data
Causes:
Too many features
Small dataset
Answer:
Underfitting happens when:
Model is too simple
Cannot capture patterns in data
Answer:
Use regularization
Reduce features
Increase data
Cross-validation
Answer:
Regularization adds a penalty term to the cost function to reduce overfitting.
Answer:
Ridge Regression adds L2 penalty.
$$\text{Cost} = MSE + \lambda \sum w^2$$
Reduces large coefficients
Does not make them zero
Answer:
Lasso adds L1 penalty.
$$\text{Cost} = MSE + \lambda \sum |w|$$
Can shrink coefficients to zero
Performs feature selection
| Ridge | Lasso |
|---|---|
| L2 penalty | L1 penalty |
| No feature elimination | Feature elimination |
| Handles multicollinearity | Sparse model |
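A minimal scikit-learn sketch contrasting the two penalties on synthetic data; the `alpha` values are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.datasets import make_regression

# Synthetic regression problem where only a few features truly matter
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks coefficients toward zero
lasso = Lasso(alpha=1.0).fit(X, y)   # L1 penalty: can set coefficients exactly to zero

print("Ridge coefficients:", np.round(ridge.coef_, 2))
print("Lasso coefficients:", np.round(lasso.coef_, 2))
print("Coefficients set to zero by Lasso:", int(np.sum(lasso.coef_ == 0)))
```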
Answer:
Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
R² Score
Answer:
RMSE is the square root of MSE.
$$RMSE = \sqrt{MSE}$$
It is in the same unit as the target variable.
Answer:
No, but we can:
Use polynomial features
Apply transformations (log, square)
Answer:
When relationship is highly non-linear
Presence of strong outliers
Categorical target variable
Answer:
Salary prediction
House price prediction
Sales forecasting
Demand prediction
Answer:
scikit-learn
statsmodels
numpy
pandas
Answer:
Linear Regression predicts continuous values by finding the best-fit straight line between input and output variables.
Answer:
Simple and easy to implement
Interpretable results
Fast training
Works well with linearly related data
Answer:
Sensitive to outliers
Assumes linearity
Poor performance with complex data
Answer:
Model coefficients represent the relationship strength between independent variables and the dependent variable.
Positive coefficient → Y increases as X increases
Negative coefficient → Y decreases as X increases
Example:
If coefficient = 5 → 1 unit increase in X increases Y by 5 units.
Answer:
The intercept is the value of Y when all X values are zero.
It helps position the regression line correctly.
Answer:
Box plots
Z-score
IQR (Interquartile Range)
Scatter plots
Outliers can skew the regression line.
Answer:
Change slope significantly
Increase error values
Reduce model accuracy
Linear Regression is sensitive to outliers.
Answer:
Remove extreme outliers
Transform data (log, square root)
Use robust regression techniques
Answer:
Homoscedasticity means the variance of residuals is constant across all values of X.
This ensures reliable predictions.
Answer:
Heteroscedasticity occurs when residual variance changes with X.
Impact:
Unreliable coefficients
Invalid statistical tests
Answer:
Residual vs fitted value plot
Breusch–Pagan test
White test
Answer:
Log transformation
Weighted Least Squares
Remove outliers
Answer:
Residuals should follow a normal distribution.
This helps in:
Accurate confidence intervals
Hypothesis testing
Answer:
| Correlation | Regression |
|---|---|
| Measures relationship strength | Predicts values |
| No causation | Assumes dependency |
| Symmetric | Directional |
Answer:
No. Linear Regression predicts continuous values.
For classification, use:
Logistic Regression
Answer:
OLS minimizes the sum of squared residuals to find the best-fit line.
It is the most common method used in Linear Regression.
Answer:
Feature scaling brings all features to a similar range.
Required for:
Faster convergence
Gradient descent efficiency
Methods:
Standardization
Normalization
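A brief sketch of both methods using scikit-learn preprocessing on made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Made-up features on very different scales (e.g. income vs. age)
X = np.array([[45_000.0, 23.0],
              [72_000.0, 41.0],
              [120_000.0, 35.0],
              [58_000.0, 29.0]])

standardized = StandardScaler().fit_transform(X)   # zero mean, unit variance (Z-score)
normalized = MinMaxScaler().fit_transform(X)       # rescaled to the [0, 1] range

print("Standardized:\n", standardized.round(2))
print("Normalized:\n", normalized.round(2))
```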
Answer:
Not mandatory for OLS, but important for gradient descent and regularization.
Answer:
Polynomial Regression models non-linear relationships by adding polynomial terms.
Example:
$$Y = b_0 + b_1 X + b_2 X^2$$
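A sketch of polynomial regression with scikit-learn's `PolynomialFeatures` in a pipeline; the degree and the synthetic data are illustrative assumptions.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Synthetic quadratic relationship: y = 1 + 2x + 3x^2 + noise
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 150).reshape(-1, 1)
y = 1 + 2 * x[:, 0] + 3 * x[:, 0] ** 2 + rng.normal(0, 1, 150)

# Degree-2 polynomial terms, then an ordinary linear fit on those terms
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(x, y)

print("Coefficients (b1, b2):", model.named_steps["linearregression"].coef_)
print("Intercept (b0):", model.named_steps["linearregression"].intercept_)
```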
Answer:
Bias is error due to oversimplified assumptions.
High bias leads to underfitting.
Answer:
Variance measures how much the model changes with different datasets.
High variance leads to overfitting.
Answer:
Low bias + high variance → overfitting
High bias + low variance → underfitting
Goal: Balance both.
Answer:
Cross-validation tests model performance on multiple data splits.
Most common:
K-Fold Cross Validation
Answer:
Data is divided into K equal parts.
Train on K-1 folds
Test on remaining fold
Repeat K times
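A short scikit-learn sketch of the procedure above with K = 5 (an arbitrary but common choice) on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, KFold

# Synthetic data for illustration
X, y = make_regression(n_samples=300, n_features=5, noise=15.0, random_state=0)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # 5 folds: train on 4, test on 1, repeat
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print("R^2 per fold:", scores.round(3))
print("Mean R^2:", scores.mean().round(3))
```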
Answer:
p-value measures statistical significance of a feature.
p < 0.05 → significant feature
p > 0.05 → insignificant feature
Answer:
A confidence interval gives a range that is likely to contain the true coefficient at a given confidence level.
Common: 95% confidence interval.
Answer:
Occurs when dummy variables are perfectly correlated.
Solution:
Drop one dummy variable
Answer:
One-Hot Encoding
Label Encoding (carefully)
Answer:
Data leakage occurs when test data influences training.
It results in overly optimistic performance.
Answer:
Data is split into:
Training set
Testing set
Common ratio:
70:30 or 80:20
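A minimal scikit-learn sketch using an 80:20 split on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)

# 80% of the rows go to training, 20% to testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))   # evaluated only on unseen rows
```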
Answer:
Leverage points are data points with extreme X values.
They strongly influence the regression line.
Answer:
Measures the influence of each observation on the regression model.
High value → influential point.
Answer:
Collect data
Clean data
Handle missing values
Feature scaling
Train model
Evaluate model
Tune model
Answer:
It finds a straight line that best predicts future values based on past data.
Answer:
Ignoring assumptions
Not handling outliers
Using too many features
Not validating model
Answer:
Reduced model performance
Overfitting
Lower Adjusted R²
Answer:
No. Missing values must be:
Removed
Imputed (mean/median)
Answer:
Use Linear Regression when:
Relationship is linear
Target variable is continuous
Interpretability is important
Answer:
Linear Regression is a supervised learning algorithm used to model the relationship between a dependent variable (target) and one or more independent variables (features) by fitting a linear equation.
Simple Linear Regression:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
Multiple Linear Regression:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon$$
Where:
$y$ → dependent variable
$x$ → independent variable(s)
$\beta_0$ → intercept
$\beta_i$ → coefficients
$\varepsilon$ → error term
Answer:
Linear Regression relies on the following assumptions:
Linearity – Relationship between features and target is linear
Independence – Observations are independent
Homoscedasticity – Constant variance of residuals
Normality of Errors – Residuals are normally distributed
No Multicollinearity – Independent variables are not highly correlated
Violation of these assumptions can lead to biased or inefficient estimates.
Answer:
OLS is a method used to estimate regression coefficients by minimizing the sum of squared residuals.
$$\text{Minimize } \sum_i (y_i - \hat{y}_i)^2$$
OLS provides:
Best Linear Unbiased Estimator (BLUE)
Efficient estimates when assumptions hold
Answer:
Slope ($\beta_i$): Change in the target variable for a one-unit change in the predictor, keeping other variables constant
Intercept ($\beta_0$): Expected value of the target when all predictors are zero
Example:
If salary = 30,000 + 2,000 × years_of_experience
→ Each additional year of experience increases salary by ₹2,000.
Answer:
R-squared measures the proportion of variance explained by the model.
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
Ranges from 0 to 1
Increases with more features (even irrelevant ones)
Adjusted R-squared penalizes unnecessary features:
$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$
Preferred for multiple regression models.
Answer:
Multicollinearity occurs when independent variables are highly correlated, leading to unstable coefficients.
Detection methods:
Correlation matrix
Variance Inflation Factor (VIF)
$$VIF_i = \frac{1}{1 - R_i^2}$$
where $R_i^2$ comes from regressing predictor $i$ on the other predictors.
VIF > 5 → moderate issue
VIF > 10 → severe multicollinearity
Solutions:
Remove correlated features
Use PCA
Apply Ridge Regression
Answer:
Residuals are differences between actual and predicted values:
$$\text{Residual} = y - \hat{y}$$
Residual analysis includes:
Residual vs fitted plot (check homoscedasticity)
Q-Q plot (check normality)
Autocorrelation plot (Durbin–Watson test)
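A statsmodels sketch that fits OLS on synthetic data and runs the diagnostics listed above; the data and thresholds are illustrative.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Synthetic data and an OLS fit (assumptions hold by construction)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 5 + 2 * x + rng.normal(0, 1, 200)
results = sm.OLS(y, sm.add_constant(x)).fit()

residuals = results.resid
fitted = results.fittedvalues

# Residual vs. fitted values: should show no systematic pattern if homoscedastic
print("Corr(residuals, fitted):", round(float(np.corrcoef(residuals, fitted)[0, 1]), 4))

# Durbin-Watson statistic: a value near 2 suggests no autocorrelation
print("Durbin-Watson:", round(float(durbin_watson(residuals)), 3))

# Q-Q plot for normality of residuals (uncomment if matplotlib is available)
# sm.qqplot(residuals, line="45")
```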
| Linear Regression | Logistic Regression |
|---|---|
| Predicts continuous values | Predicts probabilities |
| Uses OLS | Uses Maximum Likelihood |
| Output is unbounded | Output between 0 and 1 |
| Uses MSE loss | Uses Log loss |
Answer:
Gradient Descent is an iterative optimization algorithm used when datasets are large.
Update rule:
$$\beta := \beta - \alpha \frac{\partial J(\beta)}{\partial \beta}$$
Types:
Batch Gradient Descent
Stochastic Gradient Descent (SGD)
Mini-batch Gradient Descent
Answer:
Overfitting occurs when the model captures noise instead of patterns, performing well on training data but poorly on test data.
Causes:
Too many features
Multicollinearity
Small dataset
Prevention:
Regularization
Feature selection
Cross-validation
Answer:
Regularization adds a penalty term to reduce model complexity.
Ridge (L2 penalty):
$$\text{Loss} = MSE + \lambda \sum \beta^2$$
Shrinks coefficients
Handles multicollinearity
Lasso (L1 penalty):
$$\text{Loss} = MSE + \lambda \sum |\beta|$$
Performs feature selection
Elastic Net: a combination of the L1 and L2 penalties.
Answer:
Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
R-squared
Adjusted R-squared
Answer:
Detect using boxplots, Cook’s distance
Transform data (log, sqrt)
Remove or cap outliers
Use robust regression
Answer:
High bias → underfitting
High variance → overfitting
Goal is to balance both using regularization and proper model complexity.
Answer:
Non-linear relationships
High outliers
Heteroscedastic data
Categorical variables not encoded
Time-series data with autocorrelation
Answer:
Scatter plots for linearity
Correlation analysis
Residual diagnostics
Compare with non-linear models
Answer:
Feature scaling improves convergence in gradient descent.
Methods:
Standardization (Z-score)
Min-Max Scaling
Important when using regularization.
Answer:
Measures influence of each data point on regression coefficients.
Large Cook’s Distance → influential outlier
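A minimal statsmodels sketch of Cook's Distance via `get_influence()`, with one artificial influential point appended for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data with one deliberately influential point appended
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 3 + 1.5 * x + rng.normal(0, 1, 50)
x = np.append(x, 25.0)    # extreme X value (high leverage)
y = np.append(y, 0.0)     # far from the trend line

results = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d = results.get_influence().cooks_distance[0]   # one value per observation

print("Largest Cook's Distance:", round(float(cooks_d.max()), 3))
print("Index of most influential point:", int(cooks_d.argmax()))   # expect the appended point
```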
Answer:
Linear Regression → parametric (assumes fixed form)
Decision Trees → non-parametric
Answer:
Train and validate model
Serialize model (pickle/joblib)
Build API (Flask/FastAPI)
Monitor performance and drift
Answer:
Correlation measures the strength and direction of a relationship between two variables.
Regression models the relationship and predicts the dependent variable.
Correlation does not imply causation, while regression attempts to quantify impact.
Answer:
Linearity violated → Biased predictions
Homoscedasticity violated → Inefficient estimates
Normality violated → Invalid hypothesis tests
Multicollinearity → Unstable coefficients
Autocorrelation → Incorrect confidence intervals
Answer:
Heteroscedasticity occurs when variance of residuals changes with predictors.
Detection:
Residual vs fitted plot
Breusch–Pagan test
Handling:
Log or Box-Cox transformation
Weighted Least Squares
Robust standard errors
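A sketch of detecting heteroscedasticity with the Breusch–Pagan test in statsmodels, on synthetic data whose noise deliberately grows with the predictor:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Synthetic data where the error variance grows with x (heteroscedastic by construction)
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 300)
y = 2 + 3 * x + rng.normal(0, x)      # noise standard deviation proportional to x

X = sm.add_constant(x)
results = sm.OLS(y, X).fit()

# Breusch-Pagan: the null hypothesis is homoscedasticity
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, X)
print("Breusch-Pagan p-value:", round(lm_pvalue, 4))   # small p-value -> heteroscedasticity
```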
Answer:
Autocorrelation means residuals are correlated across observations, common in time-series data.
Detection:
Durbin–Watson test (value ~2 is ideal)
Solutions:
Add lag variables
Use time-series models
Generalized Least Squares
Answer:
Used to test statistical significance of coefficients.
Null Hypothesis (H₀): β = 0
Alternative Hypothesis (H₁): β ≠ 0
Tests used:
t-test → individual coefficients
F-test → overall model significance
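A statsmodels sketch showing where these tests appear in practice; the data is synthetic and one predictor is deliberately irrelevant.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data: x1 matters, x2 is pure noise
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(0, 1, n)
x2 = rng.normal(0, 1, n)          # unrelated to y
y = 4 + 2.5 * x1 + rng.normal(0, 1, n)

X = sm.add_constant(np.column_stack([x1, x2]))
results = sm.OLS(y, X).fit()

print(results.summary())          # per-coefficient t-statistics/p-values and the overall F-test
print("p-values:", results.pvalues.round(4))   # expect x2's p-value to be large
print("F-statistic p-value:", float(results.f_pvalue))
```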
Answer:
p-value measures probability of observing results assuming null hypothesis is true.
p < 0.05 → statistically significant
p ≥ 0.05 → not significant
Lower p-value → stronger evidence against null hypothesis.
Answer:
Adjusted R² penalizes irrelevant features.
R² always increases
Adjusted R² increases only if feature improves the model
Best metric for multiple linear regression.
Answer:
Bias: Error from overly simple model
Variance: Error from overly complex model
Linear regression is typically a low-variance, relatively high-bias model, so it tends to underfit when the true relationship is complex.
Answer:
Small dataset → unreliable coefficients
Large dataset → stable estimates, better generalization
Rule of thumb: 10–15 observations per predictor.
Answer:
Feature selection reduces noise and improves interpretability.
Methods:
Forward selection
Backward elimination
Recursive Feature Elimination (RFE)
Lasso regression
| Ridge | Lasso |
|---|---|
| L2 penalty | L1 penalty |
| Shrinks coefficients | Can make coefficients zero |
| No feature selection | Performs feature selection |
| Handles multicollinearity | Sparse model |
Answer:
Elastic Net combines Ridge and Lasso penalties.
Used when:
There are many correlated features
Plain Lasso would otherwise keep only one variable from each correlated group
Answer:
Polynomial regression models non-linear relationships by adding polynomial terms.
Example:
$$y = \beta_0 + \beta_1 x + \beta_2 x^2$$
Still considered linear in coefficients.
Answer:
Categorical variables are converted using dummy encoding.
Avoid dummy variable trap by dropping one category
Use One-Hot Encoding
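A pandas sketch of one-hot encoding with `drop_first=True` to avoid the dummy variable trap; the column values are made up.

```python
import pandas as pd

# Made-up data with one categorical predictor
df = pd.DataFrame({
    "area": [1200, 1500, 900, 2000],
    "city": ["Delhi", "Mumbai", "Delhi", "Chennai"],
})

# drop_first=True drops one dummy column per category to avoid perfect collinearity
encoded = pd.get_dummies(df, columns=["city"], drop_first=True)
print(encoded)
```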
Answer:
Occurs when dummy variables are perfectly correlated.
Solution: Drop one category to avoid multicollinearity.
Answer:
Interaction terms capture combined effect of features.
Example:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$$
Used when one variable’s effect depends on another.
Answer:
Leverage measures how far a data point’s predictor values are from the mean.
High leverage points can strongly influence the model.
Answer:
Confidence interval → Mean prediction
Prediction interval → Individual prediction (wider)
Answer:
Cross-validation evaluates model stability.
k-Fold CV
Leave-One-Out CV
Reduces overfitting and bias.
Answer:
Linear regression is highly interpretable:
Coefficients show feature impact
Sign indicates direction
Magnitude shows strength
Preferred in finance, healthcare, policy models.
Answer:
Assigns weights to observations.
Used when:
Data reliability varies
Heteroscedasticity exists
Answer:
Scaling changes coefficient magnitude but not predictions.
Important for:
Regularization
Gradient descent convergence
Answer:
Occurs when data distribution changes over time.
Handled by:
Monitoring metrics
Retraining models
Data validation
Answer:
Non-linear patterns
High outliers
Categorical target
Complex interactions
Answer:
Check data drift
Re-evaluate assumptions
Inspect residuals
Remove outliers
Add interaction or polynomial terms
Retrain model
Answer:
Measures influence of data points.
High Cook’s Distance → model sensitive to that point.
Answer:
OLS minimizes squared errors, giving more weight to extreme values.
Answer:
Model performs worse than predicting the mean.
Indicates poor fit or incorrect assumptions.
Answer:
MLE estimates parameters assuming normally distributed errors.
Equivalent to OLS under Gaussian noise.
Answer:
“It shows how much each factor impacts the outcome and helps forecast future values using historical trends.”
Answer:
Training error → fit on known data
Test error → generalization ability
Answer:
No. Missing values must be:
Imputed
Removed
Answer:
Shows relationship between target and one predictor, controlling for others.
Answer:
Simple, fast, interpretable, and sets performance benchmark.
Answer:
Use Grid Search with Cross-Validation to tune λ.
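A closing sketch of tuning the regularization strength with scikit-learn's `GridSearchCV` (λ is called `alpha` there); the candidate grid and data are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=20.0, random_state=0)

# Candidate values of the regularization strength (lambda, named alpha in sklearn)
param_grid = {"alpha": np.logspace(-3, 3, 13)}

search = GridSearchCV(Ridge(), param_grid, cv=5, scoring="r2")
search.fit(X, y)

print("Best alpha:", search.best_params_["alpha"])
print("Best cross-validated R^2:", round(search.best_score_, 3))
```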