What is the purpose of regression diagnostics?
Regression diagnostics are techniques for identifying and addressing problems with a fitted model, such as influential data points and collinearity among predictors.
What is an influential data point?
An influential data point is an observation that has a disproportionate impact on the model's results: removing it would noticeably change the estimated coefficients or the predictions.
How can I identify influential data points?
You can use techniques such as Cook's Distance, DFBETAS, and leverage plots to identify influential data points.
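As a rough illustration, leverage and DFBETAS can be computed directly with NumPy. This is a minimal sketch, not a reference implementation: the function name `leverage_and_dfbetas` and the toy data are made up for the example, and the design matrix is assumed to include an intercept column.

```python
import numpy as np

def leverage_and_dfbetas(X, y):
    """Return (leverage, dfbetas) for an OLS fit of y on X.

    X is assumed to already contain an intercept column.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    # Leverage: diagonal of the hat matrix X (X'X)^-1 X'.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    coef_scale = np.sqrt(np.diag(XtX_inv))
    dfbetas = np.empty((n, p))
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        # Refit without observation i.
        beta_i, *_ = np.linalg.lstsq(Xi, yi, rcond=None)
        resid_i = yi - Xi @ beta_i
        s_i = np.sqrt(resid_i @ resid_i / (n - 1 - p))
        # Scaled change in each coefficient when observation i is dropped.
        dfbetas[i] = (beta - beta_i) / (s_i * coef_scale)
    return h, dfbetas

# Hypothetical toy data: the last point sits far from the others in x
# and well off the trend followed by the first seven points.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.9, 25.0])
X = np.column_stack([np.ones_like(x), x])

h, dfbetas = leverage_and_dfbetas(X, y)
print("leverage:", np.round(h, 3))
print("DFBETAS (slope):", np.round(dfbetas[:, 1], 3))
```

The leverage values sum to the number of coefficients, so a point whose leverage is far above the average stands out. A commonly quoted rule of thumb flags observations whose absolute DFBETAS exceeds 2/sqrt(n), but the cutoff is a convention rather than a test.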
What is collinearity?
Collinearity is a situation where two or more predictors are highly correlated with each other, leading to unstable estimates of the model's coefficients.
How can I detect collinearity?
You can use techniques such as correlation matrices, variance inflation factors (VIF), and condition indices to detect collinearity.
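A correlation matrix is the quickest of these checks. The sketch below uses simulated predictors (the variables `x1`, `x2`, `x3` and the 0.8 cutoff are assumptions chosen for illustration) and lists the pairs whose absolute correlation exceeds the cutoff.

```python
import numpy as np

# Simulated predictors: x2 is almost a linear function of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

# Pairwise Pearson correlations between the columns of X.
corr = np.corrcoef(X, rowvar=False)

threshold = 0.8  # illustrative cutoff, not a standard value
k = X.shape[1]
flagged = [
    (i, j, corr[i, j])
    for i in range(k)
    for j in range(i + 1, k)
    if abs(corr[i, j]) > threshold
]
for i, j, r in flagged:
    print(f"columns {i} and {j}: r = {r:.3f}")
```

Pairwise correlations only reveal collinearity between two predictors at a time; a predictor that is a combination of several others can go unnoticed, which is why VIF and condition indices are also used.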
What is a variance inflation factor (VIF)?
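A VIF measures how much the variance of a coefficient estimate is inflated because its predictor is correlated with the other predictors. For predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on all the other predictors. Higher values indicate greater collinearity.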
How do I interpret a VIF value?
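A VIF of 1 means the predictor is uncorrelated with the others. As a rule of thumb, a value between 2 and 5 indicates moderate collinearity, and a value above 5 (or above 10, under a more lenient cutoff) indicates significant collinearity.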
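To make the calculation concrete, here is a minimal NumPy sketch that computes each VIF as 1 / (1 - R^2), where R^2 comes from regressing one predictor on the rest. The function names `vif` and `label`, the simulated data, and the exact label boundaries are assumptions for the example; the labels simply follow the rule-of-thumb ranges mentioned above.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (predictors only, no intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress predictor j on an intercept plus all other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

def label(value):
    """Map a VIF to the rule-of-thumb ranges (illustrative boundaries)."""
    if value > 5:
        return "significant"
    if value >= 2:
        return "moderate"
    return "low"

# Simulated predictors: x2 is almost a linear function of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {value:.2f} ({label(value)})")
```

The same numbers can be read off the diagonal of the inverse of the predictors' correlation matrix, which is a handy cross-check on small problems.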
Can I remove a variable with high VIF?
Yes. Removing a variable with a high VIF can reduce collinearity and make the remaining coefficient estimates more stable.
What is a condition index?
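A condition index is the square root of the ratio of the largest eigenvalue of the predictors' correlation matrix to one of its other eigenvalues, so there is one index per eigenvalue. The largest index, which uses the smallest eigenvalue, is called the condition number. Higher values indicate greater collinearity.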
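The eigenvalue calculation can be sketched in a few lines of NumPy. This example assumes the common definition sqrt(largest eigenvalue / eigenvalue j) applied to the correlation matrix of the predictors; the function name `condition_indices` and the simulated data are made up for illustration.

```python
import numpy as np

def condition_indices(X):
    """Condition indices from the correlation matrix of the columns of X."""
    corr = np.corrcoef(X, rowvar=False)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(corr)
    # One index per eigenvalue; the largest index is the condition number.
    return np.sqrt(eigvals.max() / eigvals)

# Simulated predictors: x2 is almost a linear function of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

indices = condition_indices(X)
print("condition indices:", np.round(indices, 2))
print("condition number:", round(float(indices.max()), 2))
```

Because the eigenvalues come back in ascending order, the indices are listed from largest to smallest, and the last one is always 1.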
How do I use Cook's Distance to identify influential data points?
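Cook's Distance measures how much the model's fitted values change when a single observation is left out. Compute it for every observation and look at the largest values: a value greater than 1 is a common rule of thumb for an influential data point, and some analysts use a stricter cutoff of 4/n, where n is the number of observations.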
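A minimal sketch of the calculation with NumPy follows, using the closed form D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, where e_i is the residual, h_i the leverage, p the number of coefficients, and s^2 the residual variance. The function name `cooks_distance` and the toy data are assumptions for the example, and the design matrix is assumed to include an intercept column.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's Distance for each observation of an OLS fit of y on X."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage: diagonal of the hat matrix X (X'X)^-1 X'.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    s2 = resid @ resid / (n - p)  # residual variance estimate
    return resid**2 / (p * s2) * h / (1.0 - h) ** 2

# Hypothetical toy data: the last point sits far from the others in x
# and well off the trend followed by the first seven points.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.9, 25.0])
X = np.column_stack([np.ones_like(x), x])

d = cooks_distance(X, y)
flagged = np.flatnonzero(d > 1.0)  # rule-of-thumb cutoff of 1
print("Cook's Distance:", np.round(d, 3))
print("flagged observations (0-based):", flagged.tolist())
```

In this toy data only the last observation exceeds the cutoff. Flagged points are candidates for a closer look, not automatic deletions.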