COMPUTER SCIENCE AND ENGINEERING
MACHINE LEARNING
Question: Which of the following distance measures do we use for categorical variables in kNN?


Euclidean distance


Manhattan distance


Minkowski distance


Hamming distance

Detailed explanation1: Which of the following distance measures do we use for categorical variables in kNN? Euclidean and Manhattan distances are both used for continuous variables, whereas Hamming distance is used for categorical variables.
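To make the categorical case concrete, here is a minimal sketch of Hamming distance in plain Python. The function name `hamming_distance` and the sample records (color, size, shape) are illustrative assumptions, not part of the original question.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of features")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical categorical records: (color, size, shape)
r1 = ("red", "small", "round")
r2 = ("red", "large", "round")
r3 = ("blue", "large", "square")

print(hamming_distance(r1, r2))  # 1 (only the size differs)
print(hamming_distance(r1, r3))  # 3 (every feature differs)
```

Because the measure only asks whether two values match, it needs no numeric ordering of the categories, which is why it suits categorical features.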
Detailed explanation2: What distance metrics are used in kNN? Euclidean distance, cosine similarity, Minkowski distance, correlation, and chi-square distance are all used with the kNN classifier.
Detailed explanation3: Why use kNN? kNN is an algorithm that matches a point with its k closest neighbors in a multidimensional space. It can be applied to continuous, discrete, ordinal, and categorical data, which makes it particularly useful for handling missing data of all these kinds.
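The neighbor-matching idea can be sketched in a few lines of plain Python. This is a simplified illustration under assumed conditions (numeric features, Euclidean distance, majority vote), not a production implementation; `knn_predict` and the toy training set are hypothetical names and data.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest points.

    `train` is a list of (features, label) pairs; `query` is a feature tuple.
    """
    # Sort training points by Euclidean distance to the query, keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical two-class training data in two dimensions.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

print(knn_predict(train, (1.2, 1.4)))  # A
print(knn_predict(train, (6.4, 6.1)))  # B
```

Swapping `math.dist` for another distance function (for example a Hamming-style mismatch count) is what lets the same procedure work on categorical features.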
Detailed explanation4: Euclidean distance: This is the most widely used metric, and it is effectively the default in scikit-learn's k-nearest-neighbors estimators (which default to Minkowski distance with p = 2, equivalent to Euclidean distance). It measures the true straight-line distance between two points in Euclidean space.
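The relationship between the continuous-variable metrics listed in the choices can be shown with a short sketch: Minkowski distance with p = 1 gives Manhattan distance, and with p = 2 gives Euclidean distance. The helper name `minkowski_distance` and the sample points are assumptions for illustration.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length numeric points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


a = (0.0, 0.0)
b = (3.0, 4.0)

print(minkowski_distance(a, b, 1))  # 7.0 (Manhattan: 3 + 4)
print(minkowski_distance(a, b, 2))  # 5.0 (Euclidean: sqrt(9 + 16))
```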
Detailed explanation5: Hamming distance measures the distance between categorical variables by counting the features on which two records differ, while cosine similarity measures how closely two data points align in direction, regardless of their magnitude.
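As a rough illustration of the cosine measure, the sketch below computes cosine similarity for small numeric vectors in plain Python. The function name `cosine_similarity` and the vectors `u`, `v`, `w` are hypothetical; cosine distance is commonly taken as one minus this value.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


u = (1.0, 2.0)
v = (2.0, 4.0)   # same direction as u, twice the length
w = (-2.0, 1.0)  # perpendicular to u

print(round(cosine_similarity(u, v), 6))  # 1.0 (same direction)
print(round(cosine_similarity(u, w), 6))  # 0.0 (orthogonal)
```

Note that `u` and `v` score as maximally similar even though their lengths differ, which is the sense in which the measure ignores magnitude.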