We will use the mtcars dataset for demonstration. Let’s start by fitting a full model that includes every predictor.
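The fitting code is not shown here, so the following is only a minimal sketch of what such a fit might look like. It assumes mpg is the response and that the columns are used as they ship with R; the name base_model is borrowed from later in this post. With different preprocessing (for example, converting some columns to factors) the numbers it prints will differ from the ones quoted in the text.

```r
# mtcars ships with base R (package 'datasets'), so nothing needs installing.
data(mtcars)

# Assumption: mpg is the response. The "." on the right-hand side means
# "every other column in the data frame", i.e. the full model.
base_model <- lm(mpg ~ ., data = mtcars)

# Coefficient table with p-values, plus R-squared and adjusted R-squared.
summary(base_model)
```

The adjusted R-squared on its own can be pulled out with `summary(base_model)$adj.r.squared`.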

The summary shows that the predictor disp is significant at \(\alpha = 0.01\), and the adjusted \(R^2\) is 0.8565. Next, let’s create a nested model that excludes disp.
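One way to build the nested model is to take the full model and remove a single term with update(). This is a sketch under the same assumptions as before (mpg as the response, untransformed columns); nested_model is a name chosen here for illustration, not one taken from the original code.

```r
data(mtcars)

# Full model, refitted here so the snippet runs on its own.
base_model <- lm(mpg ~ ., data = mtcars)

# ". ~ . - disp" keeps the response and all terms, then drops disp.
# The result is nested in base_model: same data, one predictor fewer.
nested_model <- update(base_model, . ~ . - disp)

# Compare the adjusted R-squared of the two fits side by side.
c(full   = summary(base_model)$adj.r.squared,
  nested = summary(nested_model)$adj.r.squared)
```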

The adjusted \(R^2\) of the second model is lower than that of the first. That is not surprising, since disp is a significant predictor.

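But how can we be certain that the first model is better than the second? We can use anova() to compare them: for nested linear models, it runs an F-test of whether the extra predictor significantly reduces the residual sum of squares.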
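A sketch of the comparison, again assuming mpg as the response and using the hypothetical name nested_model for the model without disp. Passing the smaller model first keeps the degrees of freedom and sum of squares in the second row positive.

```r
data(mtcars)

base_model   <- lm(mpg ~ ., data = mtcars)          # full model
nested_model <- update(base_model, . ~ . - disp)    # same model without disp

# One row per model: residual df and RSS for each, then the F statistic
# and its p-value for the term that separates them.
comparison <- anova(nested_model, base_model)
comparison
```

The columns to read are RSS (one value per model) and Pr(>F) (the p-value of the F-test, shown in the second row).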

The RSS (residual sum of squares) of the base model is much lower than that of the second model. The F-test also shows that the difference between the two models is significant at \(\alpha = 0.05\), so we reject the hypothesis that the coefficient of disp is zero. So, yep, base_model is better.
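To see where that F statistic comes from, it can be reproduced by hand from the two RSS values and their residual degrees of freedom. The sketch below makes the same assumptions as the rest of this post's examples (mpg as the response; nested_model is an illustrative name) and checks the manual result against anova().

```r
data(mtcars)

base_model   <- lm(mpg ~ ., data = mtcars)
nested_model <- update(base_model, . ~ . - disp)

# For an lm fit, deviance() returns the residual sum of squares.
rss_full   <- deviance(base_model)
rss_nested <- deviance(nested_model)
df_full    <- df.residual(base_model)
df_nested  <- df.residual(nested_model)

# F = (drop in RSS per extra parameter) / (residual variance of full model)
f_stat  <- ((rss_nested - rss_full) / (df_nested - df_full)) /
           (rss_full / df_full)
p_value <- pf(f_stat, df_nested - df_full, df_full, lower.tail = FALSE)

# The manual values should agree with the second row of the anova table.
comparison <- anova(nested_model, base_model)
all.equal(f_stat, comparison$F[2])
```

Because the two models differ by a single coefficient, this F-test is equivalent to the t-test for disp in the full model's summary: the F statistic is the square of that t value, and the p-values coincide.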