Day 4 - Performance metrics in Machine Learning - Regression

 

  By Jerin Lalichan 

Performance metrics in ML

    Evaluating the performance of a model is important. Performance metrics are measures that quantify how well a model performs during the training and testing phases. In machine learning, two kinds of performance metrics are generally in use: metrics for regression models and metrics for classification models. Below are the most popular regression metrics:

Regression Metrics

  1. Mean Squared Error (MSE)


It is simply the average of the squared differences between the actual and predicted values. Because each error is squared, large errors are amplified, which makes MSE very sensitive to outliers.
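As a rough sketch of the calculation, MSE can be computed with NumPy; the y_true and y_pred arrays below are made-up values just for illustration:

import numpy as np

# illustrative actual and predicted values (made-up numbers)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

# MSE: average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
print("MSE:", mse)

scikit-learn's sklearn.metrics.mean_squared_error(y_true, y_pred) gives the same result.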


2. Mean Absolute Error (MAE)




Mean Absolute Error is the average of the absolute differences between the ground truth (actual values) and the predicted values. Since there is no squaring, the errors are not exaggerated or overestimated, and MAE handles outliers better than MSE. However, while it gives the magnitude of the difference between the actual and predicted values, it does not give the direction of the error.
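A minimal sketch of MAE, again with made-up y_true and y_pred values:

import numpy as np

# illustrative actual and predicted values (made-up numbers)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

# MAE: average of the absolute differences
mae = np.mean(np.abs(y_true - y_pred))
print("MAE:", mae)

The equivalent in scikit-learn is sklearn.metrics.mean_absolute_error(y_true, y_pred).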


3. Root Mean Squared Error (RMSE)




This is basically the square root of MSE. Taking the square root brings the error back to the same units as the target variable, which removes the scale exaggeration introduced by squaring in MSE and makes the value easier to interpret. It remains more sensitive to outliers than MAE, though less extreme than MSE.
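As a sketch, RMSE is just the square root of the MSE computed above (illustrative values again):

import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

# RMSE: square root of the mean squared error
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print("RMSE:", rmse)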


4. R² - Coefficient of determination


R² = 1 − (SSres / SStot)

Where:
    SSres = residual sum of squares
    SStot = total sum of squares


This is a metric calculated from the residual and total sums of squares. A higher value of R² indicates that the model was able to capture the variance in the target variable well. However, its value will keep increasing with the addition of more features, even if a feature is not significant (i.e., has little correlation with the target). This is taken care of by Adjusted R².
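A minimal sketch of R² from the two sums of squares defined above (illustrative values):

import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

# SSres: residual sum of squares; SStot: total sum of squares
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)

r2 = 1 - ss_res / ss_tot
print("R²:", r2)

scikit-learn's sklearn.metrics.r2_score(y_true, y_pred) computes the same quantity.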


5. Adjusted R²



Ra² = 1 − [(1 − R²)(n − 1) / (n − k − 1)]

            Where:
                    n = number of observations
                    k = number of independent variables
                    Ra² = adjusted R²

It will always be lower than R², as it adjusts for the number of predictors and only increases when a new feature brings a real improvement.
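A small sketch of the adjustment, where k is a hypothetical feature count chosen only for illustration:

import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

# ordinary R² from the residual and total sums of squares
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
r2 = 1 - ss_res / ss_tot

n = len(y_true)  # number of observations
k = 2            # hypothetical number of independent variables (assumption)
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print("Adjusted R²:", r2_adj)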



   
  I am doing a challenge - #66DaysofData - in which I will be learning something new from the Data Science field for 66 days, and I will be posting the daily topics on my LinkedIn, on my GitHub repository, and on my blog as well.


Stay Curious!  




