Lasso Regression is another method for improving a model by shrinking its coefficients. Unlike Ridge Regression, which keeps every variable in the model, Lasso Regression can shrink some coefficients to exactly zero, producing a model with only a subset of the variables.

The Lasso penalty term is as follows:

$$\lambda\sum_{j=1}^{p}|\beta_{j}|$$
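For context, this penalty is added to the residual sum of squares, so the full quantity that Lasso minimizes can be written in the standard textbook form below. The symbols $$y_i$$, $$x_{ij}$$, $$\beta_0$$, and $$n$$ are notation introduced here for illustration and do not appear elsewhere in this post. Note that glmnet() minimizes a rescaled version in which the squared-error term is divided by $$2n$$, so its $$\lambda$$ values are on a different scale from this form.

$$\sum_{i=1}^{n}\left(y_i-\beta_0-\sum_{j=1}^{p}\beta_j x_{ij}\right)^2+\lambda\sum_{j=1}^{p}|\beta_{j}|$$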

As in Ridge Regression, we need to choose a value for $$\lambda$$. So, let’s get started.

We will use the glmnet library and the mtcars dataset. Because glmnet() does not accept a data frame, we first need to create a matrix of predictors.
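A minimal sketch of this step. It assumes mpg is the dependent variable, which this post does not state explicitly, so mpg is left out of the predictor matrix.

```r
library(glmnet)  # loaded here because the later steps rely on it

# Build a numeric predictor matrix from the built-in mtcars data.
# Assumption: mpg is the response, so it is excluded from the predictors.
x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # drop the intercept column

dim(x)  # 32 rows, 10 predictor columns
```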

Next, we create a vector for the dependent variable.
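Something along these lines would do it, again assuming mpg is the dependent variable:

```r
# Assumption: mpg (miles per gallon) is the dependent variable.
y <- mtcars$mpg

length(y)  # one value per car, 32 in total
```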

Then we split the data, putting 80% into a training set and 20% into a test set.
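One possible way to do the split is sketched below. The seed is arbitrary and the variable names are only illustrative, so the rows that land in each set need not match the ones behind the numbers quoted in this post.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg

# Randomly pick 80% of the rows for training (25 of the 32 cars).
n <- nrow(x)
train_idx <- sample(n, size = floor(0.8 * n))

x_train <- x[train_idx, ]
y_train <- y[train_idx]
x_test  <- x[-train_idx, ]  # the remaining 7 cars
y_test  <- y[-train_idx]
```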

Now we are ready to fit the Lasso Regression. First, we will let glmnet() generate its sequence of $$\lambda$$ values automatically.
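A sketch of the fit, under the same assumptions as before (mpg as the response, an arbitrary seed for the split). In glmnet(), alpha = 1 selects the Lasso penalty, and leaving out the lambda argument makes the function build its own sequence.

```r
library(glmnet)
set.seed(1)  # arbitrary seed for the illustrative split

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))

# alpha = 1 is the Lasso penalty; no lambda is supplied, so glmnet()
# generates a decreasing sequence of lambda values on its own.
lasso_fit <- glmnet(x[train_idx, ], y[train_idx], alpha = 1)

length(lasso_fit$lambda)  # how many lambda values were tried
```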

Then we use a for loop to calculate the MSE for each $$\lambda$$.
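The loop might look roughly like this. It assumes the MSE is measured on the held-out test set, and it reuses the illustrative split and names from the sketches above.

```r
library(glmnet)
set.seed(1)  # arbitrary seed for the illustrative split

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))
x_test <- x[-train_idx, ]
y_test <- y[-train_idx]

lasso_fit <- glmnet(x[train_idx, ], y[train_idx], alpha = 1)

# For every lambda in the path, predict on the test set and store the MSE.
lambdas <- lasso_fit$lambda
mse <- numeric(length(lambdas))
for (i in seq_along(lambdas)) {
  pred <- predict(lasso_fit, newx = x_test, s = lambdas[i])
  mse[i] <- mean((y_test - pred)^2)
}

head(mse)
```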

Here is the lowest MSE.
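As a rough sketch, the lowest MSE and the $$\lambda$$ that produces it can be pulled out with which.min(). This version swaps the explicit loop for a vectorized predict() call over the whole path, which should give the same values. With an arbitrary seed, the numbers will generally differ from those reported in this post.

```r
library(glmnet)
set.seed(1)  # arbitrary seed for the illustrative split

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))
x_test <- x[-train_idx, ]
y_test <- y[-train_idx]

lasso_fit <- glmnet(x[train_idx, ], y[train_idx], alpha = 1)

# predict() without s returns one column of predictions per lambda.
pred_all <- predict(lasso_fit, newx = x_test)
mse <- unname(colMeans((y_test - pred_all)^2))

best <- which.min(mse)
c(lambda = lasso_fit$lambda[best], mse = mse[best])
```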

A ggplot2 visualization:
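One way such a plot could be drawn is sketched here. Plotting the MSE against log($$\lambda$$) is a choice made for this sketch, since glmnet's $$\lambda$$ sequence is spaced on a log scale. The original figure may have used different axes.

```r
library(glmnet)
library(ggplot2)
set.seed(1)  # arbitrary seed for the illustrative split

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))

lasso_fit <- glmnet(x[train_idx, ], y[train_idx], alpha = 1)
pred_all  <- predict(lasso_fit, newx = x[-train_idx, ])

# One row per lambda, with its test-set MSE.
mse_df <- data.frame(
  lambda = lasso_fit$lambda,
  mse    = unname(colMeans((y[-train_idx] - pred_all)^2))
)

p <- ggplot(mse_df, aes(x = log(lambda), y = mse)) +
  geom_line() +
  geom_point() +
  labs(x = "log(lambda)", y = "Test MSE")
print(p)
```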

Let’s see if Lasso Regression only includes a subset of predictors.
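A sketch of how to check this: extract the coefficients at the best $$\lambda$$ and list the ones that are not zero. Which predictors survive depends on the train/test split, so with the arbitrary seed used here the list need not match the one reported in this post.

```r
library(glmnet)
set.seed(1)  # arbitrary seed for the illustrative split

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))

lasso_fit <- glmnet(x[train_idx, ], y[train_idx], alpha = 1)
pred_all  <- predict(lasso_fit, newx = x[-train_idx, ])
mse       <- colMeans((y[-train_idx] - pred_all)^2)
best_lambda <- lasso_fit$lambda[which.min(mse)]

# coef() returns a sparse one-column matrix; convert it to a dense one.
coefs <- as.matrix(coef(lasso_fit, s = best_lambda))

# Names of the predictors whose coefficients were not shrunk to zero.
selected <- setdiff(rownames(coefs)[coefs[, 1] != 0], "(Intercept)")
selected
```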

Yep, only cyl, drat, and wt are included in the model.

As in Ridge Regression, let’s try the built-in k-fold cross-validation functionality.
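In glmnet, cross-validation is provided by cv.glmnet(). The sketch below uses nfolds = 5, which is an assumption made here because the training set has only 25 rows. The function's default is 10 folds, and the fold count used for the result quoted in this post is not known.

```r
library(glmnet)
set.seed(1)  # arbitrary seed; it affects both the split and the CV folds

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))

# Cross-validate the Lasso (alpha = 1) on the training set only.
cv_fit <- cv.glmnet(x[train_idx, ], y[train_idx], alpha = 1, nfolds = 5)

cv_fit$lambda.min  # lambda with the lowest cross-validated error
```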

The best $$\lambda$$ is 0.1138016. We will then use this $$\lambda$$ to make predictions and calculate the MSE.
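A sketch of the prediction step, assuming the MSE is computed on the held-out test set. Because the seed and fold count here are arbitrary, the selected $$\lambda$$ will generally not equal the value quoted above.

```r
library(glmnet)
set.seed(1)  # arbitrary seed; it affects both the split and the CV folds

x <- model.matrix(mpg ~ ., data = mtcars)[, -1]  # assumes mpg is the response
y <- mtcars$mpg
train_idx <- sample(nrow(x), size = floor(0.8 * nrow(x)))

cv_fit <- cv.glmnet(x[train_idx, ], y[train_idx], alpha = 1, nfolds = 5)

# Predict on the test rows at the cross-validated best lambda.
pred_cv <- predict(cv_fit, newx = x[-train_idx, ], s = "lambda.min")
mse_cv  <- mean((y[-train_idx] - pred_cv)^2)
mse_cv
```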

Now, let’s compare this with plain vanilla lm().
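For the baseline, an ordinary least squares fit on the same training rows could look like this. It again assumes mpg as the response and uses the arbitrary split from the sketches above, so its MSE is only comparable to Lasso results computed on that same split.

```r
set.seed(1)  # same arbitrary seed, so the split matches the Lasso sketches

train_idx <- sample(nrow(mtcars), size = floor(0.8 * nrow(mtcars)))
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Ordinary least squares with all predictors, no penalty.
lm_fit  <- lm(mpg ~ ., data = train)
pred_lm <- predict(lm_fit, newdata = test)

mse_lm <- mean((test$mpg - pred_lm)^2)
mse_lm
```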

Whichever way we choose $$\lambda$$, the resulting MSE is lower than that of plain vanilla lm().

TL;DR: Lasso Regression can improve not only a model's predictive accuracy but also its interpretability, since only a subset of the variables may be included in the final model.