Day 6 - K-Nearest Neighbor (KNN) Algorithm

    By Jerin Lalichan 


KNN Algorithm

It is a supervised machine learning algorithm that can be used for both regression and classification problems. The algorithm works as follows (a minimal code sketch follows the list): 
  1. The distance between the test data point and all of the training data points is calculated.

  2. The K nearest points are selected.

  3. In the case of Regression, the average of those points' values is taken as the predicted value.

  4. In the case of Classification, the probability of each class among those points is calculated, and the output is the class with the highest probability.
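
As a concrete illustration, here is a minimal from-scratch sketch of these four steps in Python (assuming NumPy arrays and Euclidean distance; the function name knn_predict is my own, chosen for illustration):

    import numpy as np

    def knn_predict(X_train, y_train, x_test, k=3, task="classification"):
        # Step 1: distance from the test point to every training point
        distances = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
        # Step 2: indices of the K nearest training points
        nearest = np.argsort(distances)[:k]
        if task == "regression":
            # Step 3: the average of the neighbors' values is the prediction
            return y_train[nearest].mean()
        # Step 4: the class with the highest probability (relative frequency)
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]

For example, with X_train = np.array([[1, 2], [2, 3], [8, 8]]) and y_train = np.array([0, 0, 1]), calling knn_predict(X_train, y_train, np.array([2, 2]), k=3) finds two neighbors of class 0 against one of class 1 and returns 0.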


The distance can be measured in different ways; two common metrics are:

  •  Euclidean distance: d(x, y) = sqrt( Σᵢ (xᵢ - yᵢ)² )

  •  Manhattan distance: d(x, y) = Σᵢ |xᵢ - yᵢ|
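
Both metrics are a single line in Python; a small illustrative snippet (the example vectors are made up):

    import numpy as np

    a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 6.0, 3.0])
    euclidean = np.sqrt(((a - b) ** 2).sum())  # sqrt(9 + 16 + 0) = 5.0
    manhattan = np.abs(a - b).sum()            # 3 + 4 + 0 = 7.0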

Why is KNN a Lazy Learner?

Generally, algorithms can be of two types: lazy learners and eager learners. An eager learner builds a generalization (a model) from the training data set before receiving any test data, and uses that model to predict the output for the test data. A lazy learner such as KNN, by contrast, makes no generalization; that is, no model is created from the training data. Instead, it waits until the test data arrives to do the math. So, basically, eager learners do the work during training and less during testing; for KNN it is just the reverse.
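
This behavior is easy to observe with scikit-learn's KNeighborsClassifier (a quick sketch, assuming scikit-learn is installed): fit() essentially just stores the training set, and the expensive distance search only happens inside predict().

    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(X, y)              # "training": mostly just stores (and indexes) X and y
    print(clf.predict(X[:3]))  # the real work, the neighbor search, happens here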

Advantages of KNN 

  • It is very simple and easy to implement

  • The math behind the algorithm is easy to understand

  • No explicit training is needed

  • Very little hyperparameter tuning is required (essentially just K and the distance metric)

  • It doesn't make any assumptions about the distribution of the data



Disadvantages of KNN 

  • Since it doesn't create a model, it requires a lot of storage space to keep all of the training data

  • Prediction is slow, since the distance from each test point to all of the training data points must be calculated

  • Finding the optimum value of K can be difficult (see the sketch below)

  • It is not suitable for high-dimensional data, because distances become less meaningful as the number of dimensions grows
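
In practice, a good K is usually found empirically. One common approach (a sketch assuming scikit-learn; the candidate range 1-15 is arbitrary) is to score several odd values of K with cross-validation and keep the best:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    # Mean 5-fold accuracy for each candidate K (odd values help avoid ties)
    scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
              for k in range(1, 16, 2)}
    best_k = max(scores, key=scores.get)
    print(best_k, round(scores[best_k], 3))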



   
  I am doing the #66DaysofData challenge, in which I will be learning something new from the Data Science field for 66 days, and I will be posting daily topics on my LinkedIn, on my GitHub repository, and on my blog as well.


Stay Curious!  




