Predicting Compressive Strength of Concrete: ML Regression

This project explores the application of machine learning techniques to predict the compressive strength of high-performance concrete. The compressive strength is a crucial property that impacts the structural capabilities of concrete in construction. The project leverages various regression models to predict this strength based on the concrete mixture’s composition and age.

This task Regression Assignment was for the Machine Learning Course – MAI

GitHub Repo: https://github.com/dev-nanthan/regression-for-prediction

R2 and RSE Score for the Models Compared

  • Lasso – Residual Standard Rrror (RSE): 9.541807510700313
  • Lasso – R2 score: 0.5952917560883537
  • Ridge – Residual Standard Error [RSE]: 9.5414
  • Ridge – R2 statistic: 0.5953
  • Simple Linear – Residual Standard Error [RSE]: 10.3412
  • Simple Linear – R2 statistic: 0.6202

Linear Regression Model

The foundation of predictive modeling in this project begins with a simple linear regression model. Linear regression attempts to model the relationship between two or more features in a linear approach. We use multivariate ordinary least squares regression, considering multiple inputs like the composition of the concrete mixture and its age to predict a single output – the compressive strength.

Ridge Regression Model

Ridge regression introduces regularization to the linear regression model to prevent overfitting. This technique adds a penalty to the size of coefficients, shrinking them to avoid overly complex models. The regularization strength, controlled by the hyperparameter alpha (α), is optimized through cross-validation to find the balance between bias and variance.

Lasso Regression Model

Lasso regression, similar to Ridge, applies regularization but with a mechanism that can completely eliminate the weight of less important features. It is particularly useful for feature selection in models with high dimensionality. The process involves tuning the alpha (α) parameter to find the optimal model complexity.

Model Optimization and Selection

The project culminates in the exploration and selection of the best regression model to predict concrete strength. This involves experimenting with different models available in scikit-learn, fine-tuning their parameters, and evaluating their performance based on the residual standard error (RSE) and the R^2 statistic. The aim is to design a model that accurately predicts concrete compressive strength using the provided datasets.