COMPUTER SCIENCE AND ENGINEERING
MACHINE LEARNING
Question: Which of the following distance measures is used for categorical variables in k-NN?
Euclidean distance
Manhattan distance
Minkowski distance
Hamming distance
Detailed explanation-1: -Which of the following distance measures do we use for categorical variables in k-NN? Euclidean and Manhattan distances are both used for continuous variables, whereas Hamming distance is used for categorical variables.
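As a minimal sketch of why Hamming distance suits categorical data, the snippet below counts the features on which two records disagree. The function name `hamming_distance` and the sample records are illustrative assumptions, not part of the original explanation.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of features")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical categorical records: (colour, size, shape)
record_1 = ("red", "small", "round")
record_2 = ("red", "large", "square")

# The records agree on colour and differ on size and shape.
print(hamming_distance(record_1, record_2))  # 2
```

No ordering or arithmetic on the category values is needed, only an equality test, which is why this measure works where Euclidean or Manhattan distance would not.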
Detailed explanation-2: -What distance metrics are used in KNN? Euclidean distance, cosine similarity, Minkowski distance, correlation, and chi-square are all used in the k-NN classifier.
Detailed explanation-3: -Why use KNN? KNN is an algorithm that matches a point with its k closest neighbors in a multi-dimensional space. It can be used for data that are continuous, discrete, ordinal, or categorical, which makes it useful for dealing with many kinds of missing data.
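The matching step described above can be sketched as a small majority-vote classifier. This is an illustrative sketch under stated assumptions, not a reference implementation: the names `knn_predict` and `mismatch_count`, the toy training set, and the tie-breaking behaviour (ties fall back to training order) are all hypothetical choices.

```python
from collections import Counter


def mismatch_count(a, b):
    """Number of features on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k, distance):
    """Return the majority label among the k training records nearest to query.

    train is a list of (features, label) pairs. sorted() is stable, so
    records at equal distance keep their original order.
    """
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (colour, size) -> fruit
train = [
    (("red", "small"), "apple"),
    (("red", "large"), "apple"),
    (("yellow", "large"), "banana"),
    (("yellow", "small"), "banana"),
    (("green", "small"), "apple"),
]

print(knn_predict(train, ("red", "small"), k=3, distance=mismatch_count))  # apple
```

Because the distance function is passed in as an argument, the same sketch works with a numeric distance for continuous features.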
Detailed explanation-4: -Euclidean distance: This is the most widely used distance, and it is effectively the default for K-Nearest Neighbours in Python's scikit-learn library (whose default is the Minkowski metric with p = 2, which is equivalent to Euclidean distance). It measures the true straight-line distance between two points in Euclidean space.
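To illustrate how the continuous-variable distances relate, here is a short sketch of the Minkowski distance, which reduces to Manhattan distance at p = 1 and Euclidean distance at p = 2. The function name `minkowski_distance` and the sample points are assumptions made for illustration.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length numeric points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


point_a = (0.0, 0.0)
point_b = (3.0, 4.0)

manhattan = minkowski_distance(point_a, point_b, 1)  # |3| + |4| = 7.0
euclidean = minkowski_distance(point_a, point_b, 2)  # sqrt(9 + 16) = 5.0

print(manhattan, euclidean)
```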
Detailed explanation-5: -Hamming distance is used to measure the distance between categorical variables, whereas cosine distance is mainly used to measure how similar two data points are in direction, that is, the angle between their vectors.
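A minimal sketch of cosine distance, taken here as one minus the cosine similarity of two vectors, is shown below. The function name `cosine_distance` is an assumption, and the sketch raises an error for zero-length vectors, for which the angle is undefined.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Perpendicular vectors are maximally dissimilar in direction.
print(cosine_distance((1.0, 0.0), (0.0, 1.0)))  # 1.0

# Vectors pointing the same way have a distance close to 0,
# regardless of their lengths.
print(cosine_distance((1.0, 2.0), (2.0, 4.0)))
```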